Make ExchangeService more or less thread safe#428
Make ExchangeService more or less thread safe#428candrews wants to merge 2 commits intoOfficeDev:masterfrom
Conversation
HttpClientContext isn't thread safe, and isn't meant for reuse, so create a new one each time it's needed Use a shared cookied store per ExchangeService instance, clearing when necessary ExchangeService is now thread safe, as long as certain properties (meant to be only set once at initialization time) are not changed oftentime while ExchangeService is being used Safely manage HttpClient instances so they're lazy initialized used double checked locking Made fields that can be final final, which improve understability and encourages thread safe design Resolves OfficeDevgh-371
|
Hi @candrews, I'm your friendly neighborhood Azure Pull Request Bot (You can call me AZPRBOT). Thanks for your contribution! TTYL, AZPRBOT; |
There was a problem hiding this comment.
Because synchronized keyword locking is reentrant. If we wanted to avoid use of the synchronized keyword and implement the locking manually then ReentrantLock, as you said, would be the way to go... But I don't see the need for that complexity (the keyword approach is more readable).
|
@candrews thanks for your contribution. I think this needs a little more review. Any chance of adding some tests to prove thread-safety? |
|
What do you envision such a test to look like / how would you like it to work? (I don't think there is a way to prove thread safety in unit or integration tests - we can do some concurrency testing but it wouldn't be a proof). BTW, the double check locking implementation is textbook - there's no creativity/originality there so I don't believe it to be risky or flawed in any way. For the code that this essentially is a copy of, see https://en.m.wikipedia.org/wiki/Double-checked_locking#Usage_in_Java I'm very eager to get this fix in as the library is more or less unusable without it in any meaningful way, so please help me in getting this merged :-) Also, it's unfortunate that this locking in necessary at all. If ExchangeService was refactored to be immutable and have its instances created using a builder pattern, then no locking would be necessary improving runtime performance and simplicity beyond what the current design allows. Would you be open to such an API change? |
|
As this api is a simple portation form c# to java I am completely open to anything which improves the design and usability of it. We also had lots of PR that where already targeting these kind of issues. Nevertheless I know that there are plenty of places that are far away from good design. |
There was a problem hiding this comment.
while refactoring please avoid magic numbers.
There was a problem hiding this comment.
As you noted, I only refactored this magic number containing code; I don't know the significance of the magic numbers (256 and 8). Do you know or do you know how I can find out so I make them into appropriately named constants?
I see these lines were added in https://github.com/OfficeDev/ews-java-api/blame/45e3fe6301e5824a1186e9e279200df5d57d73fe/src/main/java/microsoft/exchange/webservices/data/ExchangeServiceBase.java but the commit comment ("Initial Commit") isn't helpful, and I don't know where else to look.
There was a problem hiding this comment.
Since this is an existing issue and is unrelated, we should keep it as is. If you desire to fix it, then do it as a separate PR. Let's keep PRs focused on one issue.
There was a problem hiding this comment.
Okay, we'll handle the changing of "256" and "8" into constants in another issue/PR. Sounds good!
|
I'll look into implementing a builder style ExchangeService - thanks for the info. In the mean time, is this something you could merge? |
There was a problem hiding this comment.
Shouldn't we have space after the "if" and around the != and after the ")"? "if (httpPoolingClient != null) {". This applies to several places in the modified code.
There was a problem hiding this comment.
I can certainly do that.
If you feel this strongly about such details as spaces, can you please document what your requirements are in the README.md and also add an automated tool to enforce your requirements and fail the build if they're violated, such as Check Style? Otherwise, it's just frustrating for contributors as it's virtually impossible to find formatting violations without a tool indicating those violations.
There was a problem hiding this comment.
requirements are defined in Googles Java Guide which has been discussed to be our project reference. This is also as well mentioned in the CONTRIBUTING.md. Using checkstyle has as well been discussed but comes along with several problems:
- usability of checkstyle-plugins (for different IDEs) differ from the checkstyle-google-styleguide
- project contains multiple formattings that where not touched by IntelliJ for some reasons these would have to be handled manually, maybe there are also bugs in the
codeStyleSettings.xmldidnt had that much time to step in.
If you have any ideas on how to do this "at low cost" feel free to discuss at Coding standards #4. You are pushing at open doors with this. But whether or not we have these automated checks every PR has to match these criterias.
|
Just a few thoughts:
|
|
@vbauer I don't think it's guessing - to prove thread safety, you have to do mathematical proofs. Thread safety isn't provable using unit tests. If you search the web for this thread safety testing, you'll find that this is the case - for example, http://programmers.stackexchange.com/questions/149374/testing-thread-safety explains the situation a bit. I could add some tests, but I really don't want to get bogged down adding tests that really don't prove anything and IMHO don't add any value. I think that most people who are familiar with multithreaded programming would agree that this code is thread safe - perhaps you could ask someone familiar with multithreaded programming whom you trust to review this code instead? Furthermore, I'm trying to work with you and get my contribution accepted for my own benefit and everyone else's - if there's some significant amount of other work you want done, please feel free to help as well - I'll do whatever I reasonably can, but I'm not going to spend an exorbitant amount of time on this issue. I already have deadlines that I'm in threat of missing so I will have to abandon this PR and fork the library at some point, I'd just very much rather not do that. So please help me help all of us by providing some clarity and help. Regarding synchronized blocks, I completely agree with you that it's not a good idea to design code around unnecessary synchronized blocks. However, there are no such instances of unnecessary synchronized blocks here. There are 4 uses of synchronized: 2 to get the http clients (one each for pooling and non-pooling) and 2 to close the http clients (again, one each for pooling and non-pooling). In both cases, double checked locking is used so that the synchronized keyword is only hit (and therefore the lock is only acquired) only when absolutely necessary - under normal operation, there is zero locking. If you have a suggestion as to how to improve this design further (I already suggested a builder/immutable pattern, but such an API change is beyond the scope of this work), please share and if it's a reasonable effort I'll definitely implement it. This thread safety issue is HUGE - the library is currently useless. Can you please clearly tell me what I need to do to get this PR accepted? So far, my understanding of the status is:
|
|
Any news? Thanks! I hope you enjoy the long weekend |
|
to answer your questions:
Some sidenodes:
There is no doubt that you will have tested your code. In general the reason to add unit-tests is to guarantee quality also over the next changes that will possibly be done by others. Regarding your point of a 'trusted person' we so far put our PRs under review by the community so everyone is able to comment. And as you might have noticed there is some related feedback improving the quality of this PR. To solve this I would suggest to include your fix in the next version. So we will have this merged after 2.0 has been released. So maybe we will be able to also include some follow up PR on this one. Any comments? |
|
I'm still not sold on the need for this PR, so I added comment with my question and explaining my point. I may be convinced though. |
I also replied to @vboctor's comment/concerns at #428 (comment) Thanks again |
|
Are there any updates for this PR? We're actively blocked by this issue at the moment. |
|
I cloned off of candrews fix and tested, and this solution is necessary but not sufficient to fix the concurrency issue I'm seeing. There appears to be a bug in Apache's CloseableHttpClient, namely that, although it's marked as thread safe, it uses the not threadsafe NTLMScheme class. Concurrent access can cause out of bounds exceptions when writing to internal buffers due to static objects in NTLMEngineImpl. I've opened this issue with apache: https://issues.apache.org/jira/browse/HTTPCLIENT-1686 |
|
The issue I reported about HttpClient has been closed as invalid, although I think the maintainer of that code doesn't quite see my point, unless I'm not following what this PR is attempting to do. Fundamentally, this PR does the right thing by creating a separate HttpClientContext for each concurrent access. That should be sufficient according to the comments in HTTPCLIENT-1686, but it isn't in my testing because of the static field in NTLMEngineImpl.java. I'd appreciate any other insight into testing or use of this code in this PR that others are seeing. |
|
@candrews Thank you for this pull request, but it doesn't solve all concurrency problems: I am stumbling upon the exactly same problem @codewheeney described. In our case we have two threads (one HTTP worker thread, one background thread) which are accessing the Exchange web service through multiple instances of ExchangeService. occurs. It doesn't matter if I use the same instance of ExchangeService, create multiple instances for both threads or set the maximumPoolingConnections to 1 in any of those combinations. At the moment, @codewheeney suggestion to modify the HttpClient seems to be the only working solution - which I do want to avoid b/c of obvious reasons. |
|
I have not re-opened https://issues.apache.org/jira/browse/HTTPCLIENT-1686 yet. At the moment, I've forked HTTP Client and EWS into our repository and I'm working from patched versions. Fixing HTTP client is the right thing to do, so I've stuck with that. It'd be great if @schakko commented on https://issues.apache.org/jira/browse/HTTPCLIENT-1686 and re-opened it, pointing out that the static data member in NTLMEngineImpl will never work properly in a multithreaded environment. I failed to get that point across and haven't had cycles to get in there and re-argue the case. |
|
@codewheeney I left a comment on issues.apache.org but it's hard to argue if they don't want to make it thread-safe (and more reliable). |
|
Hello @schakko I see the changes you have made to make the API more threadsafe. However, I am confused about your comment that you have done so without touching the HTTPClient when the fork you are on, had the commit #5541396 from lamielle which indicates that it's being built with the forked httpclient (httpclient-netc, upgrade to 4.5.1) and it's still present in pom.xml in your commit. Do the changes in your fork require everything done previously by candrews and lamielle - including the httpclient update? Thanks for your help. |
HttpClientContext isn't thread safe, and isn't meant for reuse, so create a new one each time it's needed
Use a shared cookied store per ExchangeService instance, clearing when necessary
ExchangeService is now thread safe, as long as certain properties (meant to be only set once at initialization time) are not changed oftentime while ExchangeService is being used
Safely manage HttpClient instances so they're lazy initialized used double checked locking
Made fields that can be final final, which improve understability and encourages thread safe design
Resolves gh-371