Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootstrap.mlockall for Windows (VirtualLock) #10887

Merged
merged 1 commit into from May 8, 2015

Conversation

gmarz
Copy link
Contributor

@gmarz gmarz commented Apr 29, 2015

This PR supersedes #9186. The original PR became a bit stale and was also working off my master branch.

I've incorporated all the feedback from #9186 as best as I could. The biggest difference from the original PR are the changes that had to be made in order to work properly with the refactoring that was done in #9923.

I believe this is now good to go, but would love to get some final eyes/testing before merging it in.

/cc @clintongormley @pickypg @ppf2 @Mpdreamz @tlrx @kimchy

Closes #8480

// By default, Windows limits the number of pages that can be locked.
// Thus, we need to first increase the working set size of the JVM by
// the amount of memory we wish to lock (Xmx), plus a small overhead (1MB).
SizeT size = new SizeT(JvmInfo.jvmInfo().getMem().getHeapInit().getBytes() + (1024 * 1024));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to use Xmx here? getHeapInit is Xms, I mean, typically, this will be the same, but by default, they are different in our config

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they are different then mlockall will not really work on unix either. That is because it may map additional stuff later!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the comment is wrong...using Xms/getHeapInit here is intentional. The assumption is that they're the same if mlockall is enabled. I think I chose Xms to be on the safe side in case they aren't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the comment" as being the one in code, not yours @kimchy :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rmuir @kimchy come to think of it, probably makes more sense to log an error and abort locking if Xmx != Xms

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fine if we only lock HeapInit, thats why I was confused, either by the comment in the code, or the code :). We should assume that HeapInit==HeapMax in prod deployment since we ask people to set the heap which automatically set both of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just add max(HeapInit, HeapMax) feels like a quick win here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mpdreamz That feels a little wrong to me since it implies that it's okay for them to be different.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmarz @kimchy but it is okay for them to be different simply because we allow it :)

The worse case is that we kill starting with MinHeap (growing to max immediately) instead of otherwise the worse case being mlockall'ing too little.

@kimchy
Copy link
Member

kimchy commented Apr 29, 2015

left one comment, hard to tell if it works on my end :), I assume we tested it on windows, so I am good

@@ -20,12 +20,14 @@
package org.elasticsearch.common.jna;

import com.google.common.collect.ImmutableList;
import com.sun.jna.Native;
import com.sun.glass.ui.Size;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

com.sun.glass.ui.Size appears to be unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks @pickypg

@pickypg
Copy link
Member

pickypg commented Apr 29, 2015

Really minor stuff. Excited to see this get in! LGTM

@Mpdreamz
Copy link
Member

Mpdreamz commented May 1, 2015

I'm being bit by the following:

[2015-05-01 17:42:00,360][INFO ][node                     ] [Troll] version[2.0.0-SNAPSHOT], pid[5676], build[e0560cf/2015-05-01T15:25:35Z]
[2015-05-01 17:42:00,361][INFO ][node                     ] [Troll] initializing ...
[2015-05-01 17:42:00,367][INFO ][plugins                  ] [Troll] loaded [], sites []
[2015-05-01 17:42:00,442][INFO ][env                      ] [Troll] using [1] data paths, mounts [[(c:)]], net usable_space [31.7gb], net total_space [80.2gb], spins? [unknown], types [NTFS]
[2015-05-01 17:42:03,192][INFO ][node                     ] [Troll] initialized
{2.0.0-SNAPSHOT}: Initialization Failed ...
- ExceptionInInitializerError
        AccessControlException[access denied ("java.io.FilePermission" "C:\Users\mpdreamz\AppData\Local\Temp\jna-280264762" "read")]

Both whether mlockall is enabled or not and whether I run as Admin or not.

@dakrone
Copy link
Member

dakrone commented May 1, 2015

@Mpdreamz that's the security manager, if you are running from an IDE you will need to add -Des.security.manager.enabled=false to the arguments

@rmuir
Copy link
Contributor

rmuir commented May 1, 2015

@Mpdreamz can you confirm whether this happens when you run bin/elasticsearch? Our policy file grants access to java.io.tmpdir (which it looks like is where you hit the issue), so if it happens then we need to open an issue, something is not right on windows maybe.

@Mpdreamz
Copy link
Member

Mpdreamz commented May 1, 2015

It happens on the command line (bin/elasticsearch) not tested inside IntelliJ

-Des.security.manager.enabled=false does fix it now wondering wether its something in this PR that causing this or current master.

@rmuir
Copy link
Contributor

rmuir commented May 1, 2015

I suspect its caused by this issue, because @rjernst tested security manager changes on a windows VM before they went in. So we need to do some investigation.

@Mpdreamz
Copy link
Member

Mpdreamz commented May 1, 2015

Yeah current HEAD in master runs without a problem, seems like something introduced in this branch.

@gmarz
Copy link
Contributor Author

gmarz commented May 1, 2015

My fork isn't up to date with master and is missing f599c23. I wonder if that could be the issue?

Strangely though, I'm not running into this issue running from the cmd line, and same snapshot build as @Mpdreamz.

@rmuir
Copy link
Contributor

rmuir commented May 1, 2015

It makes some sense: as described in that commit, mlockall tests don't run with security manager enabled, so that is why we mlockall first (so we know it works, since we dont test it).

What is maybe not explained is why you dont have access to what appears to be a temp dir. To satisfy curiosity, can you run ES with an additional VM argument, the windows equivalent of this?:

JAVA_OPTS=-Djava.security.debug=access:failure bin/elasticsearch

This guarantees full stacktrace on failure, which we don't have above.

@Mpdreamz
Copy link
Member

Mpdreamz commented May 1, 2015

Output listed here: https://gist.github.com/anonymous/be30448ed6b0e0fa58e8

@rmuir
Copy link
Contributor

rmuir commented May 1, 2015

Thank you, so its as i feared, something is wrong with temporary directory permissions in your case. Can you run the same way with current master, I want to see if we are still unable to delete our own temporary file there too (ES will not fail here, but something is not right)

@Mpdreamz
Copy link
Member

Mpdreamz commented May 1, 2015

https://gist.github.com/Mpdreamz/9da3e21ab98bb413a54d

Does go into started state:

[2015-05-01 19:09:13,288][INFO ][node ] [Taurus] started

But still getting an access error:

access: access denied ("java.io.FilePermission" "C:\Users\mpdreamz\AppData\Local\Temp\5508702577511658466.tmp" "delete") java.lang.Exception: Stack trace

@rmuir
Copy link
Contributor

rmuir commented May 1, 2015

Thanks @Mpdreamz I created #10925 to debug this, its important but unrelated to this issue, so we can figure it out over there.

@Mpdreamz
Copy link
Member

Mpdreamz commented May 1, 2015

LGTM 👍 sInce a rebased build does not throw the exception and goes into started mode where it locks the memory as expected

@rmuir moved my issue over to #10925

@clintongormley
Copy link

@gmarz is this ready to go in?

@gmarz
Copy link
Contributor Author

gmarz commented May 8, 2015

@clintongormley Yes :) Merging shortly.

gmarz added a commit that referenced this pull request May 8, 2015
mlockall for Windows (VirtualLock)
@gmarz gmarz merged commit dbba761 into elastic:master May 8, 2015
@clintongormley clintongormley changed the title mlockall for Windows (VirtualLock) bootstrap.mlockall for Windows (VirtualLock) May 29, 2015
@clintongormley clintongormley added :Core/Infra/Settings Settings infrastructure and APIs >feature and removed >enhancement labels May 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support VirtualLock on Windows
7 participants