SOLR-15635: don't repeat close hooks if SRI cleared twice #376
Hi, @dsmiley @NazerkeBS! |
Thanks
I don't like |
When I was working on SOLR-15555, the current behavior of closing the SRI helped me find a bug, so I'm reasonably confident that we don't want to change things to strict. If we do, then we should treat it like SolrCore close, where if there are too many calls to it then we log it and ask the person to create an issue - it more than likely means a bug in our code already.
I'm still speaking in generalities here, but I'm getting more and more convinced that the problem isn't with SRI. |
@madrob, honestly, I hardly follow.
We log when refcount==-1, but refcount==0 is a regular close, and the core should be closed without any WARN log. And that's what happens here when |
I thought the problem here was that we end up calling close too many times, right? So, like on SolrCore, if we call close on the hook multiple times we could log about it instead of silently ignoring the other times. Cleanly closing the SRI is fine because it's supposed to be mirrored by creating a new entry on the SRI stack during |
Exactly
That's almost what's done in the wrapper for the closeable. PR: mkhludnev@36fd4aa#diff-7b4f31855280c776cee1dde2117f04a2fa9fcaf971d2a1b15cfd3f9978896394R210 OK. I can move
Not really. We have a single SRI instance; it gets onto the request thread's stack and onto the pool thread's stack. That's how a single SRI has |
rolled back
Opinions, @madrob, @dsmiley , @NazerkeBS ? |
Thanks for pushing forward on this Mikhail! I ran tests and found that |
Sigh.. |
We have (at least) two thread pool usages: https://github.com/mkhludnev/solr/blob/SOLR-15635-clear_SRI_twice/solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/DaemonStream.java#L208
In both occurrences there is just shutdown()/shutdownNow() without awaitTermination(), which introduces races in closing the SRI. @joel-bernstein, in your opinion, can we call awaitTermination() in both places? In that case the request thread could close the SRI synchronously. As an alternative, we could nest an SRI before launching those pools, but it would have to be a special kind of SRI that is closed by the pool. Is that a good band-aid?
In at least daemon() -- no; it misses the point. Thinking about this more... I see a way to fix the race I described in the scenario I described last. In org.apache.solr.common.util.ExecutorUtil.InheritableThreadLocalProvider#store (which is called by the thread submitting to the thread pool), increase a refCount there. Then it doesn't matter whether the Solr request thread finishes first or the pool's Runnable completes first. Whichever happens last will do the actual closing. Make sense?
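To illustrate the ref-counting idea proposed above, here is a minimal sketch. It is not Solr's code: `RefCountedResource`, `retain()`, `release()` and `onLastRelease` are hypothetical names, and the class only stands in for whatever object is shared between the request thread and the pool thread. The submitting thread takes an extra reference before handing work to the pool, and whichever holder releases last performs the close.

```java
/** Hypothetical stand-in for a shared request-info object; not Solr's actual SolrRequestInfo. */
class RefCountedResource {
  private final Runnable onLastRelease; // hypothetical: whatever "the actual close" does
  private int refCount = 1;             // the creating (request) thread holds one reference

  RefCountedResource(Runnable onLastRelease) {
    this.onLastRelease = onLastRelease;
  }

  /** Called by the submitting thread before it hands work to a pool. */
  synchronized void retain() {
    if (refCount <= 0) {
      throw new IllegalStateException("already closed");
    }
    refCount++;
  }

  /** Called once by each holder when done; returns true only for the call that closed. */
  synchronized boolean release() {
    if (refCount <= 0) {
      return false; // surplus call; real code might log this as a likely bug
    }
    refCount--;
    if (refCount == 0) {
      onLastRelease.run();
      return true;
    }
    return false;
  }
}
```

With this shape, the order in which the request thread and the pool's Runnable finish does not matter: the first `release()` only decrements, and the second one runs the close action exactly once.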
Yep. I'm missing the point. 1) Why is that daemon shut down right there? Why is it done asynchronously without awaitTermination()?
Well... not really. In InheritableThreadLocalProvider.store() there's no incremented refcount yet. The pool can just initiate core closing via the hook. I can try to stack a new SRI right before spinning up the pools, delegating closing to those pools.
By definition, a daemon lives beyond the request thread, and that's what this particular streaming expression is expressly for (by-design). Reminder: ExecutorService.shutdown() doesn't cancel the already running tasks, but it ensures it closes itself once the task(s) are done.
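The reminder about `ExecutorService.shutdown()` can be checked with a small standalone sketch. `ShutdownDemo` and its method name are made up for illustration; the `awaitTermination` call is there only to make the demo deterministic and is not a suggestion for the daemon code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownDemo {
  /** Returns true if a task that was already running still completed after shutdown(). */
  static boolean runningTaskSurvivesShutdown() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    AtomicBoolean finished = new AtomicBoolean(false);

    pool.execute(() -> {
      started.countDown();
      try {
        release.await();      // keep the task running until the caller has shut the pool down
        finished.set(true);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // would only happen with shutdownNow()
      }
    });

    try {
      started.await();        // make sure the task is running before shutdown()
      pool.shutdown();        // rejects new tasks, does not interrupt the running one
      release.countDown();    // let the task finish
      return pool.awaitTermination(5, TimeUnit.SECONDS) && finished.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      pool.shutdownNow();
      return false;
    }
  }
}
```

By contrast, `shutdownNow()` attempts to stop running tasks, typically by interrupting their threads, which is why the two calls behave differently with respect to in-flight work.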
I'm making a proposal; there are no refCounts yet -- indeed. I have some WIP that I could commit here. Let me know if you'd like to see.
I don't really follow, but maybe you can show or explain further? If you're saying this pool will create a new SolrRequestInfo (special in some way)... I'm not sure I like it, but we'll see I guess.
This reverts commit 8064ad8.
Anyway, I tried. It didn't work out. I'd be happy to have a look at your approach.
Also: * trace logging * protected -> private
This reverts commit 3d2ded5.
…dnev/solr into SOLR-15635-clear_SRI_twice
Well, cool! You've got it, @dsmiley. I slightly improved the test. When is it time to commit it?
Any concerns @madrob ? Mikhail and I are good with it.
private static void closeHooks(SolrRequestInfo info) {
if (info.closeHooks != null) {
for (Closeable hook : info.closeHooks) {
private synchronized void close() {
does the whole method need to be sync?
Reducing the synchronized scope to just the condition seems to work. Is it worth pushing to the PR's branch?
I think it should be the whole method -- for the refCount & closeHook manipulation. Besides, it shouldn't be contended.
Co-authored-by: David Smiley <dsmiley@apache.org>
https://issues.apache.org/jira/browse/SOLR-15635
Description
User impact:
/export?q={!join NOscore_param fromIndex=is_closed_by_this_req ...}...
Solution
Clear SRI.closeHooks when they are invoked.
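A minimal sketch of the "clear the hooks when they are invoked" idea follows. It is an illustration only: `HookHolder` is a hypothetical class, not the actual SolrRequestInfo, and the real change may differ in details such as logging and exception handling.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder of close hooks; a simplified stand-in, not Solr's class. */
class HookHolder {
  private List<Closeable> closeHooks; // null means no hooks are pending

  synchronized void addCloseHook(Closeable hook) {
    if (closeHooks == null) {
      closeHooks = new ArrayList<>();
    }
    closeHooks.add(hook);
  }

  /** Runs pending hooks once; a repeated call finds nothing left to run. */
  synchronized void close() {
    if (closeHooks == null) {
      return;
    }
    List<Closeable> hooks = closeHooks;
    closeHooks = null; // clear before invoking so a second close() is a no-op
    for (Closeable hook : hooks) {
      try {
        hook.close();
      } catch (IOException e) {
        // assumption: real code would log this rather than swallow it
      }
    }
  }
}
```

Calling `close()` twice on the same holder runs each hook only once, which is the behavior the PR title describes.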
Tests
Added a test for /export?q={!join}, and a unit test for SRI in particular.
Checklist
Please review the following and check all that apply:
main branch
./gradlew check