Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add setting to disable the GeoIP database downloader #14823

Merged

Conversation

edmocosta
Copy link
Contributor

@edmocosta edmocosta commented Dec 23, 2022

Release notes

Added a logstash.yml setting xpack.geoip.downloader.enabled to disable the GeoIP databases auto-update feature.

What does this PR do?

This PR adds a config xpack.geoip.downloader.enabled to globally disable the database auto-update, applying the following rules:

auto_update: true, database: nil, same as current behavior trying to update the database.
auto_update: true, database: path, same as current behavior using the user-provided database.
auto_update: false, database: nil, use the CC database, and delete all EULA databases.
auto_update: false, database: path, same as current behavior using the user-provided database.

Why is it important/What is the impact to the user?

If users want to disable the GeoIP database auto-update, they need to set the geoip database => "/PATH/TO/DB" option to a file path. As this method is not very intuitive, providing a specific configuration to globally disable the feature, would improve the user experience. It also allows users to stick with the CC databases.

Checklist

  • My code follows the style guidelines of this project
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • Update plugin documentation #210

How to test this PR locally

Pipeline config for testing:

input { 
  generator {
    count => 1
    message => '{ "ip": "8.8.8.8" }'
    codec => json
  } 
}

filter {
  geoip {
    source => "ip"
    target => "geoip"
  }
}

output {
  stdout {
  }
}

Test 1:

  • Start Logstash with the test pipeline.
  • The plugin should work the same way it does today, downloading the EULA databases and using them.

Test 2:

  • Change the logstash.yml file and set the config xpack.geoip.downloader.enabled to false.
  • Start Logstash with the test pipeline.
  • Logstash should display an info message about deleting EULA database copies, removing them, and using the CC databases instead.
  • Repeat the same test setting the xpack.geoip.downloader.enabled config to true. It should download the EULA databases and work as it does today.

Test 3:

  • Change the logstash.yml file and set the config xpack.geoip.downloader.enabled to false.
  • Change the test pipeline by adding the database => "path/to/db" option on the geoip filter.
  • Start Logstash, it should use the user-provided database.

Related issues

Closes #14724

Copy link
Contributor

@kaisecheng kaisecheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for improving geoip database auto-update experience. The implementation is correct according to the description of the issue.

I have thought again if xpack.geoip.db.auto_update really helps the user. auto_update: false means user needs to set a valid database path to disable update, while auto_update: true and database path is set does not mean the update is enabled. This is quite confusing to the user whether the new config is useful.

I try to find another way to make it useful. For example, when auto_update: false, database_manager falls back to CC database and disables the online checking. In this way, I have doubts about compliance because currently, once users have used EULA database, they cannot switch back to CC database. We haven't discussed the possibility of switching on/off to CC database.

From time to time, users ask how to disable auto-update, especially in air-gapped environment. As of now, they can run Logstash without config anything because, by default, Logstash uses CC database indefinitely if Logstash is never able to download an update. The downside of now is that Logstash logs exception when it fails to touch the geoip service endpoint.

To move forward, there are two ways to go.

  1. Instead of adding a new config, we can just add documentation to say by configuring database path, auto-update is disabled.
  2. If users are compliant to fall back to CC database, set autp_update: false to disable online checking.

I think (1) is enough. Even having a new config, a similar documentation is required. (2) would be awesome if it is feasible.

config/logstash.yml Outdated Show resolved Hide resolved
x-pack/lib/filters/geoip/database_manager.rb Outdated Show resolved Hide resolved
@edmocosta
Copy link
Contributor Author

edmocosta commented Dec 29, 2022

Hello @kaisecheng, thanks for reviewing and for all the detailed information!

I was wondering if allowing users to stick to the CC database, even if connected to the internet, would be somehow more useful for those cases. Personally, I find any implying/switch mechanism, based on the auto_update or in the fact that the DB was updated, very confusing.

With this configuration, users running in an air-gapped environment could just stick to the CC database (by setting some xpack.*.hold_to_cc_database: true config) or manage it by themselves. In my opinion, it's easier to understand compared to the auto_update config, which is more ambiguous.

In summary, instead of adding a configuration to control the auto_update, we could add a configuration to allow users to stick to the CC database. In that case, the behavior would be:

hold_to_cc_database = true: Uses the CC database indefinitely, with no auto-update or error logs as we wouldn't try to download the eula version.
hold_to_cc_database = false: Same as current behavior
hold_to_cc_database & database =>: Uses the DB provided on the database => option, the license compliment is the user's responsibility.

What do you think?

@kaisecheng
Copy link
Contributor

After revisiting EULA, I believe we can safely switch back to CC databases if the EULA databases are removed within 30 days. So, agree that we can offer an option to allow using CC databases indefinitely regardless of network connectivity. Users may not be familiar with the database version (CC/ EULA). On the other hand, they notice the DB auto-update feature in the plugin, so xpack.geoip.db.auto_update should be widely accepted.

auto_update: true, database: nil, same as current behavior trying to update db
auto_update: true, database: path, same as current behavior using user-provided db
auto_update: false, database: nil, use CC db and delete all EULA db
auto_update: false, database: path, same as current behavior using user-provided db

@edmocosta
Copy link
Contributor Author

That's great! thanks for checking it out! I'm gonna implement the changes and update the PR.

@edmocosta edmocosta changed the title Add setting to disable the GeoIP database auto-update Add setting to disable the GeoIP database downloader Jan 4, 2023
Copy link
Contributor

@kaisecheng kaisecheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in my first review! The implementation is clean and the messages to users are very clear. I will do more testing in the second round. You can update the doc for xpack.geoip.downloader.enabled in geoip plugin.
Good work!

rescue => e
details = error_details(e, logger)
details[:databases_path] = get_data_dir_path
logger.error "Failed to delete existing MaxMind EULA databases. To be compliant with the MaxMind EULA, you must "\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point!

Copy link
Contributor

@kaisecheng kaisecheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally. The offline feature will benefit many users and hopefully, with clear docs, fewer support cases come to us. Just one suggestion, it would be great if you can update the Release Note section, so the release manager can copy your words to the release note.

@edmocosta edmocosta merged commit e4dc82a into elastic:main Jan 5, 2023
@edmocosta edmocosta deleted the add-setting-disable-geoip-db-auto-update branch January 5, 2023 14:46
edmocosta added a commit to logstash-plugins/logstash-filter-geoip that referenced this pull request Feb 7, 2023
This commit added the documentation for the `xpack.geoip.downloader.enabled setting`, implemented on the Logstash issue elastic/logstash#14823.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a config option to disable geoip auto update
3 participants