Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up ingest geoip processors #92372

Merged
merged 7 commits into from
Dec 14, 2022

Conversation

joegallo
Copy link
Contributor

Previously, we were running the relatively expensive SpecialPermission.check() / AccessController.doPrivileged(...) pair on every call to the cache, whether it was a cache hit or a cache miss. Inverting that code in my second commit made the expensive parts only run where absolutely needed (on cache misses).

However, going still further, it turns out that in version 2.15.0 of the geoip library it stopped interacting with jackson-databind in a way that required us to have special permissions to begin with -- so we can drop the additional permissions from our policy file and we no longer need the SpecialPermission.check() / AccessController.doPrivileged(...) pair for cache misses, either.

In one of our nightly benchmarks, this speeds up geoip processors by ~50%, but AFAICT those benchmarks use a less-cacheable random distribution of ip addresses -- I would expect the performance bump for real world use of the geoip processor to be even better than that.

A cache hit is just a map lookup, there's nothing interesting about
that SecurityManager-wise, so if we fold the check and doPrivileged
calls into the inside of putIfAbsent then we only need to pay those
costs on cache misses when we actually are going to do geoip database
lookup work.
Since the release of com.maxmind.geoip2/geoip2 2.15.0, it doesn't seem
that it uses jackson-databind anymore for these operations, so we no
longer need additional permissions to run this code.
@joegallo joegallo added >feature :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v8.7.0 labels Dec 14, 2022
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Dec 14, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Collaborator

Hi @joegallo, I've created a changelog YAML for you.

@joegallo
Copy link
Contributor Author

For increased confidence with the sanity of this change, I've taken the trouble of downgrading geoip2 to 2.14.0 locally, and indeed everything blows up and gnashes its teeth and wails horribly about the lack of permissions and 17 tests fail. Re-upgrading from 2.14.0 to 2.15.0 results in all tests passing and no permissions issues.

@elasticsearchmachine
Copy link
Collaborator

Hi @joegallo, I've updated the changelog YAML for you. Note that since this PR is labelled release highlight, you need to update the changelog YAML to fill out the extended information sections.

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@joegallo joegallo merged commit 1c06ed8 into elastic:main Dec 14, 2022
@joegallo joegallo deleted the ingest-geoip-security-manager branch December 14, 2022 20:43
joegallo added a commit to joegallo/elasticsearch that referenced this pull request Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP release highlight Team:Data Management Meta label for data/management team v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants