
Ingest Node and number of script compilation #49763

Closed
tsg opened this issue Dec 2, 2019 · 11 comments
Labels: :Data Management/Ingest Node, Team:Data Management

Comments

@tsg commented Dec 2, 2019

The Beats Cisco ASA module has a fairly complex Ingest Node pipeline. The pipeline is generated by the module from YAML, but the resulting pipeline can be found in this gist. Due to the large number of if conditions, and because each is an individual script, we run into "Too many dynamic script compilations" errors.

As explained by @adriansr, Elasticsearch has a cache of 100 compiled scripts and a default limit of 75 script compilations per 5 minutes. That pipeline alone has close to 100 scripts, so the cache fills up quickly.

We have written docs describing the workaround, which is mostly to increase the compilation limit and the cache size.
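
Roughly, that means something like the following (the values are purely illustrative, not recommendations; `script.cache.max_size` is, if I remember right, a node-level setting that goes in elasticsearch.yml, e.g. `script.cache.max_size: 200`, rather than through the cluster settings API):

```
PUT _cluster/settings
{
  "transient": {
    "script.max_compilations_rate": "300/5m"
  }
}
```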

While we hit this limit with our own module, based on the huge Logstash configs that I've seen, I suspect users and customers are going to hit similar problems as they adopt Ingest Node more.

Questions: should we adjust the cache and limit on the Elasticsearch side? Or should we consider refactoring the pipeline to use a large Painless script? Are there other options?

@tsg added the :Data Management/Ingest Node label on Dec 2, 2019
@elasticmachine (Collaborator)

Pinging @elastic/es-core-features (:Core/Features/Ingest)

@clintongormley commented Dec 5, 2019

> Elasticsearch has a cache of 100 compiled scripts and a default limit of 75 script compilations per 5 minutes. That pipeline alone has close to 100 scripts, so the cache fills up quickly.

A cache limit of 100 compiled scripts is going to add a lot of overhead to ingest, where we use a lot of very short scripts. Can we not base the cache on memory usage instead? Or, alternatively, change the if statements to a JSON structure with some standard operators: ==, !=, >, >=, <, <=, is_null, is_not_null?
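
For illustration, such a declarative condition might look something like the sketch below; the `condition` key, the operator names, and the field values are purely hypothetical, and nothing like this exists in Ingest Node today. The appeal is that a fixed set of operators could be evaluated without compiling any script, so it would never touch the script cache or the compilation throttle:

```
{
  "set": {
    "field": "event.outcome",
    "value": "deny",
    "condition": {
      "field": "cisco.asa.message_id",
      "operator": "==",
      "value": "106023"
    }
  }
}
```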

@jakelandis (Contributor) commented Dec 12, 2019

We discussed this today and agree that we need to address this issue. How we do it, though, is still an open question.

While it would be possible to change the if statement from a Painless script to something simpler, there is concern that a simpler form might be too limiting to express the necessary decisions, and adding something like and/or support could be a rabbit hole we're not sure we want to go down. However, adding something like a "simple_if" is a possibility.

On the topic of max size and the compilation throttle: the cache currently has a configurable max size, but it does not offer memory-size-based eviction. The throttle is configurable, but as seen here, one large pipeline can easily exceed it.

We wonder whether these if scripts warrant special handling:

a) How much actual memory is used by these normally small if scripts? Does it make sense to provide special handling w.r.t. the maximum number of them in the cache if their footprint is very small?

b) Similarly, should these small if scripts from pipelines be exempt from the script compilation throttle?

@jdconrad @stu-elastic - thoughts?

@jdconrad (Contributor)

One initial thought to consider is if an ingest script processor didn't allow the use of stored scripts, the script could be compiled when the ingest processor is parsed, so it could then be cached locally and compilation would only occur when the processor changes.

However, to address the questions posed by @jakelandis: the memory footprint of a compiled script should be relatively small, and the difference between a large script and a small script shouldn't be much, so it probably doesn't make sense to have special handling for small conditional ingest scripts specifically. As to the second point, we do agree that the throttle limit should be removed for conditional ingest scripts, because they are not changed on a per-request basis.

After discussion with @stu-elastic and @rjernst, we would propose making the cache size and compilation limits per-context. This way ingest could be given a much larger cache and no compilation limit by default for this type of use case.
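
Hypothetically, the kind of per-context knobs this would give us might look something like the following; the setting names and values here are only a sketch of the proposal, not an existing API:

```
PUT _cluster/settings
{
  "transient": {
    "script.context.ingest.cache_max_size": 400,
    "script.context.ingest.max_compilations_rate": "375/5m"
  }
}
```

plus a way to switch the rate limit off entirely for the ingest context.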

@martijnvg (Member)

> One initial thought to consider is if an ingest script processor didn't allow the use of stored scripts, the script could be compiled when the ingest processor is parsed, so it could then be cached locally and compilation would only occur when the processor changes.

Today, conditional scripts and scripts in the script processor are compiled lazily, when a pipeline is actually used. If we change this to compile at parse time, then we need to make sure that we don't run into any limit; otherwise a pipeline with scripts may not be usable, because it failed to load due to a limit error on one or more nodes. Increasing the compilation limit would help, but wouldn't it be better to exclude scripts in the ingest context from this compilation-per-time-window limit? Essentially these scripts are stored scripts, just stored in a pipeline instead.

@rjernst (Member) commented Dec 13, 2019

> wouldn't it be better to exclude scripts in the ingest context from this compilation-per-time-window limit?

That's exactly part of the purpose of #50152. By splitting the cache per context, we can control the rate limit separately per cache, and have the ability to disable the rate limit completely. I don't think this is different whether they are compiled at parse or pipeline runtime.

Additionally, while I do think we should move to parse time compilation for ingest condition scripts, I'm not sure we can until we limit the existing API. These can currently be stored scripts (just referenced by name), which means they might change after the pipeline is parsed. If we want to compile at parse time, I think we need to limit these scripts to inline only, which makes sense to me because they are essentially already stored scripts, just stored in a different place than the global stored scripts.
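
To make the distinction concrete, a processor conditional can currently be written either way; the processor, field, and stored-script id below are made up for illustration:

```
PUT _ingest/pipeline/example
{
  "processors": [
    {
      "set": {
        "field": "handled",
        "value": true,
        "if": { "id": "my_stored_condition" }
      }
    },
    {
      "set": {
        "field": "handled",
        "value": true,
        "if": "ctx.error == null"
      }
    }
  ]
}
```

The first form references a stored script by name, so its source can change after the pipeline has been parsed; the second is inline and fixed once the pipeline is stored, which is what would make parse-time compilation safe.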

@martijnvg (Member)

> and have the ability to disable the rate limit completely.

👍

> I don't think this is different whether they are compiled at parse or pipeline runtime.

Agreed; for the ingest context this shouldn't be different, and disabling the rate limit for the ingest context shouldn't depend on moving compilation from runtime to parse time.

> I think we need to limit these scripts to inline only, which makes sense to me because they are essentially already stored scripts, just stored in a different place than the global stored scripts.

Yes, only allowing inline scripts is required in order to move compilation from runtime to parse time. This would be a breaking change, since both the script processor and the code that parses the conditional invoke the Script.parse(...) method, which allows the use of stored scripts.

@martijnvg (Member)

I've removed the team-discuss label. When #50152 is implemented, we should disable the rate limit for the ingest context.

@leehinman

For 7.x, would it be possible to improve the error message when the cache is too small? If we see more evictions per minute than max_compilations_per_minute, we could recommend increasing cache.max_size. A real-world example where that would have helped is elastic/beats#16293.

@stu-elastic (Contributor)

That's a reasonable request @leehinman; let's continue that discussion in #52497.

@stu-elastic (Contributor)

Closed by #59267
