Allow rebalancing primary shards on shared filesystems #10585
@s1monw this PR is still missing a test where replication failure is simulated (like you asked for), but I wanted to vet the idea of a "soft-close" of the Engine first. I had to make some changes after #10452 was merged to allow an Engine to be closed without closing the Translog, because closing the translog caused all future recovery to hang when [...]. Let me know what you think and I will work on adding more tests tomorrow.
@s1monw wouldn't that leave the translog in a never-closed state then? Or is the translog closed somewhere else? Does that just rely on the injector being closed to close it?
@dakrone the translog is closed later. The sync is the important part.
a76d62c to 3fc625a
Relates to elastic#10585
@dakrone I think this is good functionality-wise, but the implementation needs to be less intrusive, i.e. I think we should implement this as a subclass of the engine instead of adding all these settings, changing how the recovery handler works, and calling back into it. I took a quick stab at it, copying your test and adding this quick and dirty to the engine factory, and I think it's cleaner. What do you think about this: s1monw@5fd56da
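For readers following the thread, here is a rough sketch of the subclass-plus-factory shape suggested above. Every name in it (`BasicEngine`, `SharedFsEngineSketch`, `onPrimaryHandoff`, `newEngine`) is hypothetical and made up for illustration; it is not the Elasticsearch engine API and may not match what s1monw@5fd56da actually does.

```java
// Hypothetical names throughout; this is not the Elasticsearch engine API.
final class EngineSubclassSketch {

    /** Base engine: handing off the primary releases the writer and fails the engine. */
    static class BasicEngine {
        boolean writerOpen = true;
        boolean failed = false;

        void onPrimaryHandoff() {
            writerOpen = false;
            failed = true;
        }
    }

    /** Shared-filesystem variant: overrides only the handoff behaviour. */
    static final class SharedFsEngineSketch extends BasicEngine {
        @Override
        void onPrimaryHandoff() {
            // Release the writer, but do not mark the engine as failed.
            writerOpen = false;
        }
    }

    /** Factory that picks the engine type once, at creation time. */
    static BasicEngine newEngine(boolean sharedFilesystem) {
        return sharedFilesystem ? new SharedFsEngineSketch() : new BasicEngine();
    }
}
```

The point of this shape is that the shared-filesystem decision is made in one place (the factory), so the recovery code can call the same method on any engine without consulting extra settings.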
@s1monw I updated this to use the method that you came up with.
@@ -310,6 +322,176 @@ public void testPrimaryRelocation() throws Exception {
    }

    @Test
    @TestLogging("_root:DEBUG,index:TRACE")
do we still need that?
Left minor comments, LGTM otherwise.
@dakrone this is the one that only goes to 1.x, right? Asking because it still has the 2.0.0 label on it...
Yeah, makes total sense, just checking because of the label. Thx,
/**
 * TODO: document me!
 */
public class MockSharedFSEngine extends MockInternalEngine {
I think this should be a subclass of SharedFSEngine and use the new MockEngineSupport from #10700
5fc9fd3 to 24bf3de
sweet! LGTM
84e8a70 to cd57ed7
Instead of failing the Engine for a shared filesystem, this change allows a "soft close" of the Engine, where only the IndexWriter is closed so that the replica can open an IndexWriter using the same filesystem directory/mount. Fixes elastic#10469
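A minimal sketch of the "soft close" idea, under stated assumptions: the classes below (`SharedDirectory`, `ToyEngine`) are toy stand-ins invented for illustration, not Elasticsearch or Lucene classes. The sketch assumes one writer at a time per directory, so a second engine cannot open a writer on the same directory until the first one releases its own.

```java
// Hypothetical stand-ins; not Elasticsearch or Lucene classes.
final class SoftCloseSketch {

    /** Toy shared directory that allows a single writer at a time. */
    static final class SharedDirectory {
        private Object lockHolder; // null when no writer holds the lock

        synchronized boolean tryLock(Object owner) {
            if (lockHolder != null) {
                return false;
            }
            lockHolder = owner;
            return true;
        }

        synchronized void unlock(Object owner) {
            if (lockHolder == owner) {
                lockHolder = null;
            }
        }
    }

    /** Toy engine owning a writer (guarded by the directory lock) and a translog. */
    static final class ToyEngine {
        private final SharedDirectory dir;
        private boolean writerOpen;
        private boolean translogOpen = true;

        ToyEngine(SharedDirectory dir) {
            if (!dir.tryLock(this)) {
                throw new IllegalStateException("directory already has an open writer");
            }
            this.dir = dir;
            this.writerOpen = true;
        }

        boolean writerOpen() { return writerOpen; }
        boolean translogOpen() { return translogOpen; }

        /** Soft close: release only the writer; the translog stays open. */
        void softClose() {
            if (writerOpen) {
                writerOpen = false;
                dir.unlock(this);
            }
        }

        /** Full close: release the writer and the translog. */
        void close() {
            softClose();
            translogOpen = false;
        }
    }

    public static void main(String[] args) {
        SharedDirectory dir = new SharedDirectory();
        ToyEngine primary = new ToyEngine(dir);
        primary.softClose();                    // writer released, translog untouched
        ToyEngine replica = new ToyEngine(dir); // now allowed to open a writer
        System.out.println(primary.translogOpen() && replica.writerOpen());
    }
}
```

In this toy model, constructing the second `ToyEngine` before `primary.softClose()` would throw, which is the situation the soft close is meant to avoid on a shared mount.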
This has been merged to 1.x only. I will rewrite it and open a new PR once #10624 is merged into master, since that PR refactors much of the recovery process.