CollectionUtils#ensureNoSelfReferences creates too much allocation. #49277
Comments
Pinging @elastic/es-core-infra (:Core/Infra/Scripting) |
Can you describe whether this has a measurable impact for you in a real-world scenario? Coming up with good defaults here depends on both the size and the depth of the iterable, and a value that's too small will lead to a higher cost for resizing later. |
The self reference check is on the objects produced by each script, not the params to the scripts. In the scripted metric agg case, in the past the returned state was conflated into the params map, but was changed to the |
@rjernst I named the wrong variable. I meant that I understand this method needs to be called to validate against self-referencing collections/objects, but does it need to be revalidated this many times, at each leaf bucket? Thanks |
@rjernst any update on this? |
@wenhoujx |
We're calling How many segments are you working with? |
@stu-elastic As this has been open >1 month pending feedback, do you think it's still worth keeping it open? |
@stu-elastic I am on ES 6.8.x, and from reading the code it seems the method is called for every leaf aggregator. As in, you have to check the state in each bucket for self-references. Am I misunderstanding it somehow? |
Per @rjernst's comment above #49277 (comment) we need to check whenever an object is returned or modified from a script to avoid a stack overflow.
If you're referring to the check after executing the Next steps We can keep a thread local |
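For readers following the thread, here is a minimal sketch of the kind of check being discussed: walk the nested maps and iterables a script returns, and track the objects on the current path in an identity-based set. This is an illustration only, not the Elasticsearch implementation; the class name `SelfReferenceCheck` and the helper `ensureNoCycle` are hypothetical, and the sketch only inspects map values, not keys.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not the code in CollectionUtils.
class SelfReferenceCheck {

    // Throws if the value contains itself somewhere along a nested path.
    static void ensureNoCycle(Object value) {
        // Identity semantics: compare by reference (==), never by equals()/hashCode(),
        // which could themselves recurse forever on a cyclic structure.
        Set<Object> ancestors = Collections.newSetFromMap(new IdentityHashMap<>());
        walk(value, ancestors);
    }

    private static void walk(Object value, Set<Object> ancestors) {
        Iterable<?> children;
        if (value instanceof Map) {
            children = ((Map<?, ?>) value).values();
        } else if (value instanceof Iterable) {
            children = (Iterable<?>) value;
        } else {
            return; // leaf value, nothing to descend into
        }
        if (!ancestors.add(value)) {
            throw new IllegalArgumentException("object contains a reference to itself");
        }
        for (Object child : children) {
            walk(child, ancestors);
        }
        // Remove on the way out so the same object may appear in sibling branches.
        ancestors.remove(value);
    }

    public static void main(String[] args) {
        Map<String, Object> state = new HashMap<>();
        state.put("count", 1);
        ensureNoCycle(state); // no cycle, returns normally

        state.put("self", state); // the map now contains itself
        try {
            ensureNoCycle(state);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without a check like this, any later code that recursively serializes or copies the returned object would recurse until the stack overflows, which is the failure the maintainers want to rule out.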
@stu-elastic kk, i think i understand the problem and why it's necessary to allocate this |
This has been open for quite a while, and we still believe the cost of ensureNoSelfReferences is worth it to ensure a stack overflow does not occur. For now I'm going to close this as something we aren't planning on implementing. We can re-open it later if needed. |
Describe the feature:
Elasticsearch version (bin/elasticsearch --version): 6.8.3
Plugins installed: []
JVM version (java -version): java 11
OS version (uname -a if on a Unix-like system): Linux il-pg-alpha-4421120.use1.palantir.global 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
While using a scripted metric aggregation, I noticed this method creates a lot of allocations. My script has only one parameter, but this method allocates a default-sized IdentityHashMap(), which creates a new Object[64] regardless. This multiplies quickly with the number of parent buckets.
I wonder if we can call new IdentityHashMap(4) or come up with a better way to check for self-references.
Steps to reproduce:
I don't think a repro is needed, it's right there in the code.
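The sizing suggestion in the description can be sketched as follows. The JDK documents the argument of `IdentityHashMap(int expectedMaxSize)` as an expected maximum size rather than a hard limit, so a small value only shrinks the initial table and the map still grows when more entries arrive. The class name `SizedAncestorSet` and the factory `newAncestorSet` are hypothetical names for illustration, not part of Elasticsearch.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical illustration of the proposal; not the code in CollectionUtils.
class SizedAncestorSet {

    // Builds an identity-based set whose backing map starts small.
    static Set<Object> newAncestorSet(int expectedMaxSize) {
        return Collections.newSetFromMap(new IdentityHashMap<>(expectedMaxSize));
    }

    public static void main(String[] args) {
        // Shallow script results need only a few slots on the ancestor path.
        Set<Object> ancestors = newAncestorSet(4);

        // Exceeding the hint is allowed; the map resizes, at the cost of
        // re-allocating and rehashing its table.
        for (int i = 0; i < 100; i++) {
            ancestors.add(new Object());
        }
        System.out.println(ancestors.size());
    }
}
```

The trade-off is the one raised in the comments: a small hint saves memory for shallow results, while deeply nested results pay for the resizes instead.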
Provide logs (if relevant):