fix database manager with multiple pipelines #12862

Merged
merged 35 commits into from
Jun 16, 2021

Conversation

kaisecheng
Contributor

@kaisecheng kaisecheng commented Apr 29, 2021

This PR goes together with plugin PR logstash-plugins/logstash-filter-geoip#181
To the reviewer: I rewrote the whole flow of the geoip database service, so it is easier to review the raw file than the diff.
Please check here

Background

Currently, each geoip pipeline creates its own DatabaseManager to download the database. To prevent duplicate downloads, this PR makes DatabaseManager a singleton, so that only one instance manages the database and only one scheduler downloads it.
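The single-instance idea can be sketched with Ruby's standard `Singleton` module. This is a minimal illustration, not the code in this PR: `SketchDatabaseManager`, `ensure_database`, and `download_count` are hypothetical names, and the download itself is replaced by a counter.

```ruby
require 'singleton'

# Hypothetical stand-in for the real DatabaseManager; only the
# single-instance behavior is illustrated here.
class SketchDatabaseManager
  include Singleton

  attr_reader :download_count

  def initialize
    @download_count = 0
    @downloaded = false
    @mutex = Mutex.new
  end

  # Download at most once, no matter how many pipelines ask.
  def ensure_database
    @mutex.synchronize do
      unless @downloaded
        @download_count += 1 # a real manager would fetch the database here
        @downloaded = true
      end
    end
  end
end

# Three "pipelines" all receive the same manager instance.
managers = Array.new(3) { SketchDatabaseManager.instance }
managers.each(&:ensure_database)
puts managers.uniq.size                            # => 1
puts SketchDatabaseManager.instance.download_count # => 1
```

Because every caller goes through `instance`, the download guard lives in one place instead of being repeated per pipeline.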

Acceptance criteria

  1. with multiple pipelines, only one set of databases is downloaded
  2. the scheduler fetches the new database when there is a new release, and checks the database age every day
  3. when there is a new release, the database manager should push the new database path to all geoip plugins
  4. when Logstash fails to reach the infra endpoint for 30 days and the user is on the EULA version, Logstash treats the database as expired and triggers the expiry action of the geoip plugin
  5. when the user hasn't set the database path (online mode) and Logstash can't download the database, Logstash falls back to the CC version indefinitely until it can download the database again, and should not experience any interruption
  6. Logstash keeps only two versions of the database, CC and the latest EULA. The old EULA database should be removed once a new version is available
  7. when the user provides a database path (offline mode), Logstash should not access the endpoint or download the database
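Criterion 6 can be illustrated with a small cleanup routine. This is a sketch under stated assumptions: EULA database directories are assumed to be named by Unix timestamp (as the metadata and log paths in this PR suggest), any non-numeric directory such as one named `CC` is left untouched, and `prune_databases` is a hypothetical helper, not a method from the PR.

```ruby
require 'tmpdir'
require 'fileutils'

# Remove every timestamp-named directory except the newest one.
# Non-numeric names (for example a "CC" directory) are never touched.
# The naming convention is an assumption of this sketch.
def prune_databases(root)
  eula_dirs = Dir.children(root)
                 .select { |name| name.match?(/\A\d+\z/) && File.directory?(File.join(root, name)) }
                 .sort_by(&:to_i)
  stale = eula_dirs[0...-1] # everything but the newest; empty for 0 or 1 dirs
  stale.each { |name| FileUtils.rm_rf(File.join(root, name)) }
  stale
end

Dir.mktmpdir do |root|
  %w[CC 1620245143 1623237108].each { |name| FileUtils.mkdir_p(File.join(root, name)) }
  puts prune_databases(root).inspect
  puts Dir.children(root).sort.inspect
end
```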

metadata example

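| database type | update at  | gz md5                           | directory name | is eula |
| ------------- | ---------- | -------------------------------- | -------------- | ------- |
| ASN           | 1620246514 | 2f09b8db7a7f993d642e450d3ebcf339 | 1620245143     | true    |
| City          | 1620246514 | 678ec92e897050fdb32ad56ffb91e487 | 1620245143     | true    |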
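A reader of such metadata could be sketched as follows. The comma-separated layout, the `DatabaseRecord` struct, and `parse_metadata` are assumptions made for illustration; only the five fields and the sample values come from the example above.

```ruby
# One metadata row; field names mirror the column headings in the example.
DatabaseRecord = Struct.new(:database_type, :update_at, :gz_md5, :dirname, :is_eula)

# Parse comma-separated rows (the delimiter is an assumption of this sketch).
def parse_metadata(text)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    type, update_at, gz_md5, dirname, is_eula = line.split(',')
    DatabaseRecord.new(type, Integer(update_at), gz_md5, dirname, is_eula == 'true')
  end
end

sample = <<~ROWS
  ASN,1620246514,2f09b8db7a7f993d642e450d3ebcf339,1620245143,true
  City,1620246514,678ec92e897050fdb32ad56ffb91e487,1620245143,true
ROWS

records = parse_metadata(sample)
city = records.find { |record| record.database_type == 'City' }
puts "#{city.database_type} database lives in directory #{city.dirname}, EULA: #{city.is_eula}"
```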

Fixed: #12856

@kaisecheng kaisecheng marked this pull request as ready for review May 5, 2021 22:09

case
when days_without_update >= 30
if @states[database_type].is_eula && @states[database_type].plugins.size > 0
Contributor Author

(5) This check ensures Logstash can use the CC database indefinitely. We only notify plugins to run the expiry action when the plugin is using the EULA database and its age is >= 30 days.

"Please check the network settings and allow Logstash accesses the internet to download the latest database, "\
"or switch to offline mode (:database => PATH_TO_YOUR_DATABASE) to use a self-managed database "\
"which you can download from https://dev.maxmind.com/geoip/geoip2/geolite2/ ")
@states[database_type].plugins.dup.each { |plugin| plugin.expire_action if plugin }
Contributor Author

(4) the actual expiry logic is in plugin
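The condition discussed in these two inline comments can be isolated into a small predicate. This is a sketch, not the PR's code: `DatabaseState`, `expire?`, and the `now:` parameter are hypothetical, while the 30-day threshold, `is_eula`, and the `plugins.size > 0` check are taken from the snippet under review.

```ruby
# Minimal state record; the field names follow the reviewed snippet,
# but the struct itself is a hypothetical stand-in.
DatabaseState = Struct.new(:is_eula, :plugins)

SECONDS_PER_DAY = 24 * 60 * 60

# True only when an EULA database with at least one subscribed plugin has
# gone 30 or more days without a successful update check.
def expire?(state, last_update_epoch, now: Time.now)
  days_without_update = (now.to_i - last_update_epoch) / SECONDS_PER_DAY
  days_without_update >= 30 && state.is_eula && state.plugins.size > 0
end

now = Time.at(1_700_000_000)
stale = now.to_i - 31 * SECONDS_PER_DAY

puts expire?(DatabaseState.new(true, [:plugin]), stale, now: now)  # => true
puts expire?(DatabaseState.new(false, [:plugin]), stale, now: now) # => false
```

The second call shows why the CC database never expires: the age check alone is not enough to trigger the expiry action.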

@jsvd jsvd self-requested a review May 6, 2021 12:51
…ple_pipelines

# Conflicts:
#	x-pack/lib/filters/geoip/database_manager.rb
#	x-pack/lib/filters/geoip/database_metadata.rb
#	x-pack/lib/filters/geoip/util.rb
#	x-pack/spec/filters/geoip/database_manager_spec.rb
#	x-pack/spec/filters/geoip/database_metadata_spec.rb
…ple_pipelines

# Conflicts:
#	x-pack/lib/filters/geoip/database_manager.rb
#	x-pack/spec/filters/geoip/database_manager_spec.rb
Member

@jsvd jsvd left a comment

This review suggests some changes to make the implementation more idiomatic Ruby, namely adopting the Observer and Singleton patterns.
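For readers unfamiliar with the Observer pattern in Ruby, a minimal sketch using the standard `observer` library might look like this. `SketchManager`, `SketchPlugin`, `publish`, and the sample path are hypothetical; only `Observable`, `add_observer`, `changed`, and `notify_observers` are real library API.

```ruby
require 'observer'
require 'singleton'

# Hypothetical manager: one shared instance that plugins subscribe to.
class SketchManager
  include Singleton
  include Observable

  # Announce a new database path to every registered plugin.
  def publish(database_type, path)
    changed # Observable only notifies after the state is marked as changed
    notify_observers(:update, database_type, path)
  end
end

# Hypothetical plugin that only cares about City databases.
class SketchPlugin
  attr_reader :database_path

  # Observable calls #update on each observer by default.
  def update(action, database_type, path)
    @database_path = path if action == :update && database_type == 'City'
  end
end

plugins = Array.new(3) { SketchPlugin.new }
plugins.each { |plugin| SketchManager.instance.add_observer(plugin) }
SketchManager.instance.publish('City', '/tmp/geoip/1623237108/GeoLite2-City.mmdb')
puts plugins.map(&:database_path).uniq.inspect
```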

@jsvd
Member

jsvd commented Jun 9, 2021

A couple of notes of things we should address but that don't block this PR:

  1. The periodic check will always print the same message once the Database is deleted:
[2021-06-09T12:10:17,893][WARN ][logstash.filters.geoip   ][pipeline_17] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}

This message gets printed every day, and in that context it's confusing to say "geoip plugin will stop / will tag", as the plugin has already been doing so for perhaps days or weeks.

  2. with multiple pipelines and a single database manager, the log entries are very confusing, as only 1 pipeline is identified (likely the last to start), but we print N messages for N pipelines using the geoip filter. A few examples:
[2021-06-09T12:12:02,296][INFO ][logstash.filters.geoip   ][pipeline_17] Using geoip database {:path=>"/Users/joaoduarte/elastic/logstash/.bare/data/plugins/filters/geoip/1623237108/GeoLite2-City.mmdb", :healthy_database=>true}
[2021-06-09T12:12:02,297][INFO ][logstash.filters.geoip   ][pipeline_17] Using geoip database {:path=>"/Users/joaoduarte/elastic/logstash/.bare/data/plugins/filters/geoip/1623237108/GeoLite2-City.mmdb", :healthy_database=>true}
[2021-06-09T12:12:02,297][INFO ][logstash.filters.geoip   ][pipeline_17] Using geoip database {:path=>"/Users/joaoduarte/elastic/logstash/.bare/data/plugins/filters/geoip/1623237108/GeoLite2-City.mmdb", :healthy_database=>true}
[2021-06-09T12:12:02,297][INFO ][logstash.filters.geoip   ][pipeline_17] Using geoip database {:path=>"/Users/joaoduarte/elastic/logstash/.bare/data/plugins/filters/geoip/1623237108/GeoLite2-City.mmdb", :healthy_database=>true}
[2021-06-09T12:12:02,297][INFO ][logstash.filters.geoip   ][pipeline_17] Using geoip database {:path=>"/Users/joaoduarte/elastic/logstash/.bare/data/plugins/filters/geoip/1623237108/GeoLite2-City.mmdb", :healthy_database=>true}

And:

[2021-06-09T12:21:26,613][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}
[2021-06-09T12:21:26,613][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}
[2021-06-09T12:21:26,614][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}
[2021-06-09T12:21:26,614][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}
[2021-06-09T12:21:26,614][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}
[2021-06-09T12:21:26,614][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}
[2021-06-09T12:21:26,614][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}
[2021-06-09T12:21:26,614][WARN ][logstash.filters.geoip   ][pipeline_3] geoip plugin will stop filtering and will tag all events with the '_geoip_expired_database' tag. {:healthy_database=>false}

In some cases there's only 1 message, but it still gets attributed to a single pipeline (rufus was changed to execute every 30 seconds):

[2021-06-09T12:28:25,285][ERROR][logstash.filters.geoip.databasemanager][pipeline_2] Failed to open TCP connection to geoip.elastic.co:443 (initialize: name or service not known) {:cause=>#<SocketError: Failed to open TCP connection to geoip.elastic.co:443 (initialize: name or service not known)>}
[2021-06-09T12:28:55,274][ERROR][logstash.filters.geoip.databasemanager][pipeline_2] Failed to open TCP connection to geoip.elastic.co:443 (initialize: name or service not known) {:cause=>#<SocketError: Failed to open TCP connection to geoip.elastic.co:443 (initialize: name or service not known)>}
[2021-06-09T12:29:25,559][ERROR][logstash.filters.geoip.databasemanager][pipeline_2] Failed to open TCP connection to geoip.elastic.co:443 (initialize: name or service not known) {:cause=>#<SocketError: Failed to open TCP connection to geoip.elastic.co:443 (initialize: name or service not known)>}

Member

@jsvd jsvd left a comment

LGTM

Member

@jsvd jsvd left a comment

LGTM
Manually tested multiple scenarios, online and offline, with > 20 pipelines.
Logstash and the plugin were always able to either download or delete the database correctly.

@kaisecheng
Contributor Author

  1. The periodic check will always print the same message once the Database is deleted

I will update the plugin's fail_filter to print the message only when @healthy_database turns from true to false
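The proposed fix, logging only on the healthy-to-unhealthy transition, could look roughly like the sketch below. `SketchGeoipFilter` and its `messages` array are hypothetical; the method and variable names `fail_filter` and `@healthy_database` come from the comment above, but the body is an assumption about how the change might be made.

```ruby
# Hypothetical plugin-side sketch: warn once when the database becomes unhealthy.
class SketchGeoipFilter
  attr_reader :messages

  def initialize
    @healthy_database = true
    @messages = [] # stands in for the logger so the effect is easy to inspect
  end

  # Called on every periodic check that finds the database expired.
  def fail_filter
    was_healthy = @healthy_database
    @healthy_database = false
    @messages << 'geoip database expired; events will be tagged' if was_healthy
  end
end

filter = SketchGeoipFilter.new
3.times { filter.fail_filter }
puts filter.messages.size # => 1
```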

  1. with multiple pipelines and a single database manager, the log entries are very confusing, as only 1 pipeline is identified (likely the last to start)

Can you explain your setup? I have tested with 10 pipelines, and each pipeline prints its own message. How did you configure N pipelines so that only one pipeline is identified?

@kaisecheng
Contributor Author

I found two issues.

  1. if the database has expired before Logstash starts, the fail path doesn't run, because the fail action is taken before the observable adds the plugin
  2. observable is not thread-safe. add_observer and delete_observer should be guarded by a lock
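The second issue can be illustrated with a registry that guards its observer list with a `Mutex`. This is a hand-rolled sketch rather than a patch to the standard `Observable` module; `SafeObserverRegistry` and `SketchListener` are hypothetical names.

```ruby
# Hypothetical registry guarding observer changes with a Mutex.
class SafeObserverRegistry
  def initialize
    @observers = []
    @lock = Mutex.new
  end

  def add_observer(observer)
    @lock.synchronize { @observers << observer unless @observers.include?(observer) }
  end

  def delete_observer(observer)
    @lock.synchronize { @observers.delete(observer) }
  end

  # Notify a snapshot, so observers may unregister while being notified.
  def notify(*args)
    snapshot = @lock.synchronize { @observers.dup }
    snapshot.each { |observer| observer.update(*args) }
  end

  def count
    @lock.synchronize { @observers.size }
  end
end

# Hypothetical observer that remembers the last path it was given.
class SketchListener
  attr_reader :path

  def update(new_path)
    @path = new_path
  end
end

registry = SafeObserverRegistry.new
listeners = Array.new(20) { SketchListener.new }

# Register from many threads at once, as concurrently starting pipelines would.
listeners.map { |listener| Thread.new { registry.add_observer(listener) } }.each(&:join)
registry.notify('/tmp/geoip/GeoLite2-City.mmdb')
puts registry.count # => 20
```

Taking a copy of the list before notifying mirrors the `plugins.dup.each` call in the reviewed snippet.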

@jsvd
Member

jsvd commented Jun 15, 2021

One final remark: when Logstash restarts offline after having deleted an expired database, nothing in the logs says that the plugins will not enrich data. If you're familiar with the implementation, you can infer it from the following line:

[2021-06-15T13:57:36,392][INFO ][logstash.filters.geoip   ][pipeline_12] Using geoip database {:path=>nil}

Can we also log a warning in this situation? The user should see that they're starting a new pipeline that won't enrich data.
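A sketch of the requested behavior, assuming a hypothetical helper `log_database_choice` and using Ruby's standard `Logger` instead of the Logstash logger. The message text is illustrative only.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: log at WARN instead of INFO when no database is available.
def log_database_choice(logger, path)
  if path.nil?
    logger.warn('No geoip database available; events will not be enriched')
  else
    logger.info("Using geoip database #{path}")
  end
end

output = StringIO.new
log_database_choice(Logger.new(output), nil)
puts output.string
```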

@kaisecheng kaisecheng requested a review from jsvd June 15, 2021 14:46
Member

@jsvd jsvd left a comment

LGTM 🚀

@jsvd
Member

jsvd commented Jun 15, 2021

@kaisecheng I don't see master failing on the tests that this PR is failing on. Can you check whether these changes are causing an integration test to fail?

@kaisecheng
Contributor Author

kaisecheng commented Jun 15, 2021

> @kaisecheng I don't see master failing on the tests that this PR is failing on. Can you check whether these changes are causing an integration test to fail?

The integration test failed because there is no matching geoip plugin release yet.
We need to first merge master, then release the plugin gem, and then trigger the tests in this branch.

@kaisecheng
Contributor Author

jenkins test this please

@kaisecheng kaisecheng merged commit 42c4bba into elastic:master Jun 16, 2021
kaisecheng added a commit to kaisecheng/logstash that referenced this pull request Jun 16, 2021
This PR adds support to geoip multiple pipelines which have a single instance
to manage database download to avoid individual download per pipeline
kaisecheng added a commit that referenced this pull request Jun 17, 2021
This PR adds support to geoip multiple pipelines which have a single instance
to manage database download to avoid individual download per pipeline
kares added a commit to kares/logstash that referenced this pull request Jul 1, 2021
* master: (41 commits)
  Test: resolve integration failure due ECS mode (elastic#13044)
  Feat: event factory support (elastic#13017)
  Doc: Add geoip database API to node stats (elastic#13019)
  Add geoip database metrics to /node/stats API (elastic#13004)
  ecs: on-by-default plus docs (elastic#12830)
  ispec: fix cross-spec leak from fatal error integration specs (elastic#13002)
  Fix UBI source URL (elastic#13008)
  update fpm to allow pkg creation on jdk11+jruby 9.2 (elastic#13005)
  Add unit test to grant that production aliases correspond to a published RubyGem (elastic#12993)
  Fix logstash.bat not setting exit code (elastic#12948)
  Use the OS separator to invoke gradlew from Rake script (elastic#13000)
  Allow per-pipeline config of ECS Compatibility mode via Central Management (elastic#12861)
  Update jinja2 dependency in docker build (elastic#12994)
  fix database manager with multiple pipelines (elastic#12862)
  Fix Reflections stack traces when process yml files in classpath and debug is enabled (elastic#12991)
  Fix/log4j routing to avoid create spurious file (elastic#12965)
  Deps: update JRuby to 9.2.19.0 (elastic#12989)
  Doc: Add tip for checking for existing field (elastic#12899)
  Added test to cover the installation of aliased plugins (elastic#12967)
  CI: Update logstash_release.json after 7.3.12 (elastic#12986)
  ...
Successfully merging this pull request may close these issues.

GeoIP database manager could fail in multiple pipelines
4 participants