Make a hybrid directory default using mmapfs / niofs #6636
FYI - I think the name here can be improved... I just wanted to get the code out asap.
The name |
One idea is just |
I like |
I don't think "what directory type are you using?" I like |
@dakrone but that situation already exists today, no? Even if we name it |
@rmuir yep, it totally does, I do think that naming it |
but if our default is always |
Yep, there is none, so |
Okay, after more discussion we agreed on |
`niofs` for the rest.
operating environment will be automatically chosen: `mmap` on
Windows 64bit, `simplefs` on Windows 32bit, and `default`
(hybrid `niofs` and `mmap`) for the rest.
This should still be `mmapfs` (in both places), since that's the name of the setting; otherwise someone could set it to `mmap` and it wouldn't take effect.
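For readers following along, the selection rule in the documentation diff above can be sketched as a pure function. This is only an illustration, not the code in this PR: the class name, method name, and boolean parameters are hypothetical, and the returned strings use the `mmapfs` spelling requested in the review comment.

```java
// Hypothetical sketch of the documented default-store selection; not the PR's code.
final class StoreTypeChooser {
    // Both flags are assumed to be detected elsewhere (for example from system properties).
    static String chooseStoreType(boolean isWindows, boolean is64Bit) {
        if (isWindows && is64Bit) {
            return "mmapfs";   // Windows 64bit: memory-map everything
        }
        if (isWindows) {
            return "simplefs"; // Windows 32bit
        }
        return "default";      // everything else: hybrid niofs + mmapfs
    }
}
```

For example, `StoreTypeChooser.chooseStoreType(false, true)` yields `"default"`, the hybrid store.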
LGTM, some comments were already made :)
@@ -46,26 +92,32 @@ public IndexStoreModule(Settings settings) {
    @Override
    public Iterable<? extends Module> spawnModules() {
        Class<? extends Module> indexStoreModule = NioFsIndexStoreModule.class;
        // Same logic as FSDirectory#open ...
Just FYI, this is a bit out of date with respect to FSDirectory.open. We actually default to mmap on all 64-bit systems now, as it's faster on Mac OS X too.
I pushed several new commits
I think it's ready
looks good.
+1, LGTM
`mmapfs` is really good for random access but can have side effects if memory maps are large, depending on the operating system etc. A hybrid solution, where only selected files are memory mapped and the others, which are mostly consumed sequentially, are not, brings the best of both worlds and minimizes the memory map impact. This commit mmaps only the `dvd` and `tim` files for fast random access on doc values and term dictionaries. Closes elastic#6636
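The routing idea in this commit message can be sketched in a few lines of plain Java. This is a minimal illustration, not the PR's implementation: `HybridRouting`, `extensionOf`, and `backendFor` are hypothetical names, and the backends are represented as strings rather than real directory objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of switching by file extension; not the code from this PR.
final class HybridRouting {
    // Extensions named in the commit message: doc values (dvd) and term dictionary (tim).
    static final Set<String> MMAP_EXTENSIONS = new HashSet<>(Arrays.asList("dvd", "tim"));

    // Returns the text after the last dot, or "" when the name has no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // Files with a listed extension go to mmapfs; everything else goes to niofs.
    static String backendFor(String fileName) {
        return MMAP_EXTENSIONS.contains(extensionOf(fileName)) ? "mmapfs" : "niofs";
    }
}
```

With this rule a compound file such as `_0.cfs` stays on `niofs`, because `cfs` is not in the extension set.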
There is already a similar issue in Lucene: https://issues.apache.org/jira/browse/LUCENE-1743 That one was not about random access (which did not exist at that time), but the idea is the same. A file switch by file name is more natural to me.
@s1monw Maybe add "cfs" to the list of extensions?
We should not add .cfs, in my opinion. If such cfs options are enabled, it means we are mmapping the whole index again, which is what we want to avoid (the purpose of this issue). By default, the only things using .cfs are tiny segments flushed from IndexWriter. Because they are small, performance is not really critical there.
@s1monw @rmuir @kimchy One cool thing for the WeakRef haters: this also reduces load on the GC, because we don't create so many weak refs when cloning MMapIndexInputs. Random access inputs don't really need to be cloned all the time and on every request (no state is involved, as the position is not needed). Things like posting lists are cloned all the time, but those are no longer weak ref'ed because they are read using NIO.
In elastic#6636 we switched to a default FileSwitchDirectory that made .listAll run twice on the same underlying file system directory. This fixes listAll to do a single directory listing again. Closes elastic#9666
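To make the double listing concrete, here is a small sketch under stated assumptions. It does not use Lucene; `Backend`, `listBoth`, and `listOnce` are hypothetical stand-ins for a directory and for the two listing strategies, and the same `Backend` instance is passed twice to model two delegates that wrap one on-disk directory.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; not the fix that was committed.
final class ListAllSketch {
    // Stand-in for a directory; counts how often the file system would be listed.
    static final class Backend {
        final String[] files;
        int listCalls = 0;

        Backend(String... files) {
            this.files = files;
        }

        String[] listAll() {
            listCalls++;
            return files.clone();
        }
    }

    // Generic switch behaviour: list both delegates and merge, dropping duplicates.
    static String[] listBoth(Backend primary, Backend secondary) {
        Set<String> merged = new LinkedHashSet<>(Arrays.asList(primary.listAll()));
        merged.addAll(Arrays.asList(secondary.listAll()));
        return merged.toArray(new String[0]);
    }

    // When both delegates share one underlying directory, a single listing is enough.
    static String[] listOnce(Backend primary) {
        return primary.listAll();
    }
}
```

Both strategies return the same names when the delegates share a directory; the only difference is the number of directory listings performed.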