Allow low level paging in LeafDocLookup #93711
Accessing a field from a script may require many low-level operations that scripts do not have permission to perform. Initial loading of doc values allows this access with appropriate doPrivileged calls, but advancing to the current document does not. This commit also protects advancing docs in the same way.
Pinging @elastic/es-core-infra (Team:Core/Infra)
Hi @rjernst, I've created a changelog YAML for you.
It would be nice if we could somehow test this, but that may have to be an integ test.
@@ -18,6 +18,7 @@
import org.elasticsearch.script.field.Field;

import java.io.IOException;
import java.lang.reflect.AccessibleObject;
Is this AccessibleObject import necessary?
Nope, accidentally imported, removed now
I wonder if you could add a script field to a |
lgtm
I think this change has introduced an important performance regression. Here is what the flamegraph of a vector tiles API call looks like after this change:
The API is executing a search and sorting the data using a script. It is running over ~13 million documents. Could we do something to mitigate the performance penalty here?
@ChrisHegarty do you have thoughts on how we could do these doPriv calls differently? I had thought doPriv should not have a performance penalty.
I assume that the Additionally, I would try making the |
I run Elasticsearch locally and pass the following JVM argument:
Here is the full command:
JDK assertions and user-level assertions are handled differently, e.g. the JDK assertions are disabled by:
Right. I'm not drawing any conclusions yet, just trying to understand the stacks here (which still don't make full sense to me).
TIL! That explains what I am seeing.
I opened #93971 to update the testing instructions.
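As background to the assertion discussion above: the flags used in those comments are not preserved here. As a general fact about the standard `java` launcher, `-ea` / `-da` control assertions in application classes, while `-esa` / `-dsa` control assertions in system (JDK) classes. A minimal sketch for checking both at runtime follows. The class name `AssertionProbe` and its methods are hypothetical and not part of Elasticsearch.

```java
/** Hypothetical helper (not part of Elasticsearch) that reports assertion status. */
final class AssertionProbe {

    private AssertionProbe() {}

    /** Returns true when assertions are enabled for this (application-level) class. */
    static boolean userAssertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert statement only executes when
        // assertions are enabled for this class (for example via -ea).
        assert enabled = true;
        return enabled;
    }

    /**
     * Returns the assertion status that applies to a bootstrap-loaded JDK class.
     * java.lang.String is used as a representative system class; this is
     * expected to reflect -esa / -dsa rather than -ea / -da.
     */
    static boolean systemAssertionsRequested() {
        return String.class.desiredAssertionStatus();
    }

    public static void main(String[] args) {
        System.out.println("user-level assertions: " + userAssertionsEnabled());
        System.out.println("system (JDK) assertions: " + systemAssertionsRequested());
    }
}
```

Running this with different combinations of `-ea` and `-esa` shows which kind of assertion a given launch configuration turns on. This matters when reading a flamegraph, because JDK-internal assertion checks only appear when system assertions are enabled.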
My guess is that the issue here is that docId is not final, so the virtual method does not get inlined. Would copying the docId to a final variable fix the issue?

    private void advanceToDoc(DocValuesScriptFieldFactory factory) {
        // advancing to the current doc could trigger low level paging, network loading, etc, so do so outside scripting privileges
        // copy the docId to a final variable so the virtual method below gets inlined.
        final int tempDocId = this.docId;
        AccessController.doPrivileged((PrivilegedAction<Void>) () -> {
            try {
                factory.setNextDocId(tempDocId);
            } catch (IOException ioe) {
                throw ExceptionsHelper.convertToElastic(ioe);
            }
            return null;
        });
    }
Yeah. Not quite inlining, but there is a less-than-ideal situation. The implementation results in a capturing lambda, which captures (among other things) the LeafDocLookup, so a lambda instance is created for each and every LeafDocLookup lookup. This can be seen in the below debug trace (when stopped at a breakpoint). Notice:
This is better, but take a look at the lambda after this change. I suspect that the constructor params have just changed to factory and int docId. (Still many instances? One instance per LeafDocLookup instance?)
@iverase your suggestion, to store the docId in a local, is better than the current code (since the whole LeafDocLookup is not captured), but may not prevent the lambda instance creation. Alternatively, the advanceToDoc method could be made static, with the docId passed as an arg. The lambda instances will still be created :-(
@iverase Still worth experimenting with though. It is possible that some simplifications, like the ones above, will help with the JIT inlining and/or escape analysis.
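To make the three variants discussed above concrete, here is a minimal, self-contained sketch of what each lambda captures. It is an illustration only: `LambdaCaptureSketch`, `DocAdvancer`, `RecordingAdvancer` and `runPrivileged` are hypothetical stand-ins (for LeafDocLookup, the doc values factory, and AccessController.doPrivileged respectively), not Elasticsearch or JDK APIs.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for LeafDocLookup, used only to illustrate lambda capture. */
final class LambdaCaptureSketch {

    /** Hypothetical stand-in for the doc values factory. */
    interface DocAdvancer {
        void setNextDocId(int docId);
    }

    /** Records the last doc id it was advanced to, so behaviour can be checked. */
    static final class RecordingAdvancer implements DocAdvancer {
        int lastDocId = -1;

        @Override
        public void setNextDocId(int docId) {
            lastDocId = docId;
        }
    }

    /** Hypothetical stand-in for a privileged call: it simply runs the action. */
    static <T> T runPrivileged(Supplier<T> action) {
        return action.get();
    }

    private int docId;

    void setDocument(int docId) {
        this.docId = docId;
    }

    /** Variant A: reads this.docId inside the lambda, so it captures `this` and `advancer`. */
    void advanceCapturingThis(DocAdvancer advancer) {
        runPrivileged(() -> {
            advancer.setNextDocId(this.docId);
            return null;
        });
    }

    /** Variant B: copies docId to a local first, so it captures `advancer` and an int. */
    void advanceCapturingLocal(DocAdvancer advancer) {
        final int tempDocId = this.docId;
        runPrivileged(() -> {
            advancer.setNextDocId(tempDocId);
            return null;
        });
    }

    /** Variant C: static method with docId as an argument; still captures both parameters. */
    static void advanceStatic(DocAdvancer advancer, int docId) {
        runPrivileged(() -> {
            advancer.setNextDocId(docId);
            return null;
        });
    }
}
```

All three variants behave identically and all three are capturing lambdas, which is the point made above: B and C shrink what is captured but do not by themselves remove the need for a lambda object per call. Whether the JIT then eliminates that allocation through inlining and escape analysis is something to measure rather than assume.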
I guess we can always wrap the