Bootstrap: Implement VirtualLock on Windows #9186
Conversation
🎉 🎈 ☀️ 🌟
```java
@@ -59,7 +60,11 @@
    private void setup(boolean addShutdownHook, Tuple<Settings, Environment> tuple) throws Exception {
        if (tuple.v1().getAsBoolean("bootstrap.mlockall", false)) {
            Natives.tryMlockall();
            if (Platform.isWindows()) {
```
I would use org.apache.lucene.util.Constants.WINDOWS
Good call @kimchy, didn't know that existed.
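For reference, Lucene's `Constants.WINDOWS` boils down to a one-time check of the `os.name` system property. A rough sketch of the equivalent check (the field name mirrors Lucene's, but treat the exact semantics as an assumption):

```java
// Illustrative sketch of what org.apache.lucene.util.Constants.WINDOWS
// computes once at class-load time, instead of calling JNA's
// Platform.isWindows() on every check.
class PlatformCheck {
    /** True when running on Windows, derived from the "os.name" property. */
    static final boolean WINDOWS = System.getProperty("os.name").startsWith("Windows");

    public static void main(String[] args) {
        System.out.println("windows=" + WINDOWS);
    }
}
```

Using a cached constant also avoids pulling a JNA dependency into the platform check itself.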
I can observe the same behavior on Windows 2012 R2 / 64 bit. Good job :)
@tlrx great news, thanks for testing and for the review :). When did you notice the page faults? Initially during startup, or afterwards while ES had been running for a while? Did you use any tools like Testlimit, mentioned above?
@gmarz I tested it again with more RAM (8gb) allocated to the virtual machine and 4gb for ES. Page faults appear at startup time - up to 150 pf/sec - then decrease slowly to 0. I don't know how to use Testlimit, but if you want me to check page faults with this tool I'll be happy to have some example commands to run :)
@tlrx I think the high number of page faults initially is normal since we are accessing pages. The important thing is that they eventually decrease to 0 and stay there. Simplest way to use testlimit is to run it with the -r flag, which will reserve memory 1MB at a time:
Once you reach > 90% memory usage, send a bunch of index and query requests to ES. My results with mlockall disabled: [screenshot] mlockall enabled: [screenshot]
Just tested with mlockall disabled: [screenshot]

Everything looks OK to me 👍
Thanks @tlrx!
nice work @gmarz!
Passing on some feedback: For mlockall(MCL_FUTURE) the closest thing on Windows is SetProcessWorkingSetSizeEx with the QUOTA_LIMITS_HARDWS_MIN_ENABLE flag. This seems like a more reasonable option to me. Note that the minimum working set size works as a sort of memory reservation, so for example on a 64 GB system you can’t have two processes asking for 32 GB each. The combined size of all working set minimums has to be smaller than total RAM, and it can’t be very close to that limit, otherwise unrelated reservations or non-pageable allocations can start failing. The exact threshold depends on what else is running on the system, but it’s probably a good idea to leave at least 5-10% of RAM available to other reservations/allocations.
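The reservation constraint described above can be sketched as a small model. The class, method, and headroom value below are hypothetical, purely for illustration of the arithmetic:

```java
// Hypothetical model of the constraint above: the combined working-set
// minimums of all processes must stay below total RAM, with some headroom
// (5-10% per the feedback; 10% assumed here) left for other reservations
// and non-pageable allocations.
class WorkingSetBudget {
    static final double HEADROOM = 0.10; // assumed headroom fraction

    /** True if adding `requestedBytes` keeps total reservations within budget. */
    static boolean fits(long totalRamBytes, long alreadyReservedBytes, long requestedBytes) {
        long budget = (long) (totalRamBytes * (1.0 - HEADROOM));
        return alreadyReservedBytes + requestedBytes <= budget;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // On a 64 GB box, two processes each asking for a 32 GB minimum cannot both fit:
        System.out.println(fits(64 * gb, 32 * gb, 32 * gb)); // false
        // A 32 GB minimum plus a 24 GB minimum leaves headroom, so it fits:
        System.out.println(fits(64 * gb, 32 * gb, 24 * gb)); // true
    }
}
```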
@henakamaMSFT thanks for the feedback! I have a few questions/comments. Any further clarification would be really appreciated.
The current implementation of mlockall in elasticsearch for *nix is mlockall(MCL_CURRENT). Since we recommend setting ES_HEAP_SIZE and initialize the JVM with a fixed amount of memory, I don't believe there's a need to lock future allocations - only the currently mapped pages that are initialized by the JVM. That said, where does SetProcessWorkingSetSize fit in terms of emulating mlockall(MCL_CURRENT)? Are we correct in increasing the working set size by ES_HEAP_SIZE before attempting to lock pages in the working set?
That makes sense. Since it's recommended that ES_HEAP_SIZE doesn't exceed 50% of the total RAM, I don't think this is an issue.
Are we saying that VirtualQuery+VirtualLock is not viable? If so, is there another way, or a workaround, to avoid materializing such pages?
If the total amount of memory you want to VirtualLock is X then you need to call SetProcessWorkingSetSize and increase the minimum working set size to X plus a small overhead. X + 1 MB should work.
You can measure how much extra IO and memory usage VirtualLock is causing in your case (compared to simply accessing most of the data and code you think you’re going to need), and decide whether it’s acceptable.
If you can somehow find all the allocations you care about, you can just VirtualLock those, instead of locking every committed region in the process. Alternatively, instead of VirtualLock, you can use SetProcessWorkingSetSizeEx with the hard minimum flag, as I mentioned previously. This way only pages that are actually accessed by your app will be materialized/read from disk. But the OS will guarantee that it will not trim your process as long as its working set size stays below the limit. The end result is similar to what you’re doing now, but with less overhead.
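The sizing rule quoted above (to VirtualLock X bytes, raise the minimum working set to X plus roughly 1 MB) can be sketched as follows. The names are illustrative, not an actual Win32 binding:

```java
// Sketch of the sizing rule from the feedback: the minimum working set
// passed to SetProcessWorkingSetSize(Ex) must cover the bytes to be locked
// plus a small cushion ("X + 1 MB should work").
class WorkingSetSizing {
    static final long OVERHEAD = 1024L * 1024; // 1 MB cushion, per the thread

    /** Minimum working set size needed before VirtualLock'ing this many bytes. */
    static long minimumWorkingSetFor(long bytesToLock) {
        return bytesToLock + OVERHEAD;
    }

    public static void main(String[] args) {
        long fourGb = 4L * 1024 * 1024 * 1024; // e.g. started with -Xmx4g -Xms4g
        System.out.println(minimumWorkingSetFor(fourGb)); // 4 GB + 1 MB in bytes
    }
}
```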
```java
public static final int MEM_COMMIT = 0x1000;

public static class MEMORY_BASIC_INFORMATION extends Structure {
```
does this class name need to be all upper and not camel cased?
if so please add some documentation :)
No, doesn't need to be. It was just a convention to emphasize that it represents a native structure, but it can and probably should be camel cased instead.
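A camel-cased, documented version might look like the plain-Java sketch below. The real class would extend JNA's `com.sun.jna.Structure`; the fields shown here are a subset of the Win32 layout, and `isCommitted` is a hypothetical helper:

```java
// Camel-cased model of the Win32 MEMORY_BASIC_INFORMATION structure, as
// discussed in the review. The actual JNA binding would extend
// com.sun.jna.Structure and declare the full field set/order.
class MemoryBasicInformation {
    /** Committed-pages state flag, from the Win32 memory constants. */
    static final int MEM_COMMIT = 0x1000;

    /** Base address of the queried region. */
    long baseAddress;
    /** Size of the region, in bytes. */
    long regionSize;
    /** Region state, e.g. MEM_COMMIT for committed pages. */
    int state;

    /** Hypothetical helper: true when the region is backed by committed pages. */
    boolean isCommitted() {
        return state == MEM_COMMIT;
    }

    public static void main(String[] args) {
        MemoryBasicInformation info = new MemoryBasicInformation();
        info.state = MEM_COMMIT;
        System.out.println(info.isCommitted()); // true
    }
}
```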
Closing in favor of #10887. Thank you @henakamaMSFT and team for the helpful feedback!
This PR implements `mlockall`-like functionality on Windows by leveraging the native VirtualLock function. As explained in #8480, unlike `mlockall` on *nix, `VirtualLock` requires a base memory address and the size of a region to lock. The only sane approach, to my knowledge, was to use VirtualQueryEx to iterate the address space of the JVM and lock each page individually.

To test this, I used a combination of a few tools:
Here's what the results look like in resource monitor when starting Elasticsearch with `-Xmx4g -Xms4g`:

with `bootstrap.mlockall=false`, the JVM is initialized with ~4GB of virtual memory (Commit), but only ~200MB is actual physical memory (Working Set).

with `bootstrap.mlockall=true`, the working set is now also ~4GB upon start up of elasticsearch.
Additionally, I've stressed my system using Testlimit and observed up to ~100 page faults/s with mlockall disabled, and 0 page faults/s with mlockall enabled.
These results indicate to me that this is working, but it would be great to get some additional eyes/testing on this.
Closes #8480
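The iterate-and-lock approach described above can be modeled in plain Java. This is an illustrative simulation, not the actual JNA code: `Region` and `lockCommittedRegions` are hypothetical stand-ins for `VirtualQueryEx` results and `VirtualLock` calls:

```java
import java.util.List;

// Model of the PR's approach: walk the address space region by region,
// as VirtualQueryEx would report it, and lock only the committed regions,
// since VirtualLock needs an explicit base address and size for each call.
class LockLoopModel {
    static final int MEM_COMMIT = 0x1000;  // committed pages
    static final int MEM_FREE = 0x10000;   // free address range

    /** Stand-in for one MEMORY_BASIC_INFORMATION result from VirtualQueryEx. */
    record Region(long base, long size, int state) {}

    /** Returns the total number of bytes that would be VirtualLock'ed. */
    static long lockCommittedRegions(List<Region> addressSpace) {
        long locked = 0;
        for (Region r : addressSpace) {
            if (r.state() == MEM_COMMIT) {
                // Real code would call kernel32.VirtualLock(r.base(), r.size()) here.
                locked += r.size();
            }
        }
        return locked;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region(0x10000L, 4096, MEM_COMMIT),
                new Region(0x11000L, 8192, MEM_FREE),
                new Region(0x13000L, 4096, MEM_COMMIT));
        System.out.println(lockCommittedRegions(regions)); // 8192
    }
}
```

Free and reserved-but-uncommitted ranges are skipped because locking them would fail (or needlessly materialize pages), which matches the concern raised earlier in the thread.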