Add a GeoIP processor. #253

Closed
2 tasks done
laneholloway opened this issue Sep 6, 2021 · 4 comments
Labels: plugin - processor (A plugin to manipulate data in the data prepper pipeline.)


@laneholloway
Contributor

laneholloway commented Sep 6, 2021

Provide a new processor which can enrich Data Prepper events with location information using a provided IP address.

The minimal configuration is to provide a source_key with the JSON Pointer key path.

processor:
  - geoip:
      source_key: "peer/ip"

Additionally, this plugin should be able to use either the free MaxMind GeoLite2 database or the commercially licensed GeoIP2 database. The pipeline author must provide the configuration needed to use the commercial license.

The pipeline author can also set an optional target_key property to control where the location fields are written. By default, they are written to the root of the event.

Example 1 - Minimal Configuration

processor:
  - geoip:
      source_key: "peer/ip"

Input Event:

{
  "peer" : {
    "ip" : "1.2.3.4",
    "host" : "example.org"
  },
  "status" : "success"
}

Output Event:

{
  "peer" : {
    "ip" : "1.2.3.4",
    "host" : "example.org"
  },
  "status" : "success",
  "country" : "United States",
  "city_name" : "Seattle",
  "latitude" : 47.64097,
  "longitude" : -122.25894,
  "zip_code" : "98115"
}

Example 2 - Target Key

processor:
  - geoip:
      source_key: "peer/ip"
      target_key: "location"

Input Event:

{
  "peer" : {
    "ip" : "1.2.3.4",
    "host" : "example.org"
  },
  "status" : "success"
}

Output Event:

{
  "peer" : {
    "ip" : "1.2.3.4",
    "host" : "example.org"
  },
  "status" : "success",
  "location" : {
    "country" : "United States",
    "city_name" : "Seattle",
    "latitude" : 47.64097,
    "longitude" : -122.25894,
    "zip_code" : "98115"
  }
}
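
To make the two examples concrete, here is a minimal Python sketch of the intended behavior. It is not the plugin implementation: `fake_lookup`, `resolve`, and `enrich` are hypothetical names, the lookup returns hard-coded sample values instead of querying a real GeoIP database, and the key path is treated as a simple '/'-separated path.

```python
from typing import Any, Callable, Dict, Optional


def fake_lookup(ip: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a GeoIP database query (hard-coded sample values)."""
    return {
        "country": "United States",
        "city_name": "Seattle",
        "latitude": 47.64097,
        "longitude": -122.25894,
        "zip_code": "98115",
    }


def resolve(event: Dict[str, Any], key_path: str) -> Any:
    """Follow a '/'-separated key path such as 'peer/ip' into a nested dict."""
    node: Any = event
    for part in key_path.strip("/").split("/"):
        node = node[part]
    return node


def enrich(
    event: Dict[str, Any],
    source_key: str,
    lookup: Callable[[str], Dict[str, Any]],
    target_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of the event with location fields added.

    Without target_key the fields go to the event root (Example 1);
    with target_key they are nested under that key (Example 2).
    """
    fields = lookup(resolve(event, source_key))
    result = dict(event)  # shallow copy so the input event is left unchanged
    if target_key is None:
        result.update(fields)
    else:
        result[target_key] = fields
    return result


event = {"peer": {"ip": "1.2.3.4", "host": "example.org"}, "status": "success"}
at_root = enrich(event, "peer/ip", fake_lookup)
nested = enrich(event, "peer/ip", fake_lookup, target_key="location")
```

In both cases the existing fields (`peer`, `status`) stay where they were; only the placement of the new location fields differs.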
@laneholloway laneholloway created this issue from a note in Data Prepper Project Roadmap (Release: Expanded Log Analytics Features) Sep 6, 2021
@laneholloway laneholloway added the plugin - processor A plugin to manipulate data in the data prepper pipeline. label Sep 6, 2021
@dlvenable dlvenable moved this from 1.3 Release - March 15, 2022: Expanded Log Analytics Features to Backlog in Data Prepper Project Roadmap Oct 12, 2021
@dlvenable dlvenable mentioned this issue Apr 4, 2023
@ashoktelukuntla
Contributor

ashoktelukuntla commented May 12, 2023

Is your feature request related to a problem? Please describe.

Pipeline users want to add geographical location details based on IP address to enrich data for analytical purposes.

Describe the solution you'd like

Create a GeoIP plugin that enriches traces with a geolocation field derived from an IP address. This gives customers more insight by letting them visualize where traces originate.

The plugin should be able to use geo data from a MaxMind database, Amazon Location Service, or a user-provided path to geo data.

- geoip:
    targets:
      - source_key: ip
        target_key: target
        attributes: ["location", "city_name", "country_name"]
      - source_key: ip2
        target_key: target2
        attributes: ["location", "city_name"]
    service_type:
      maxmind:
        database_path: /usr/share/local/maxmind  # optional
        cache_refresh_schedule: P15D  # optional, ISO 8601 duration
      maxmind_webservice:  # optional
        account_id: ""  # optional
        license_key: ""  # optional
        endpoint: "https://geoip.maxmind.com/geoip/v2.1/country/"  # optional, sample value

The GeoIP processor should support many optional attributes, such as ip, city_name, country_name, continent_code, country_iso_code, postal_code, region_name, region_code, timezone, location, latitude, and longitude. By default, all attributes are included. The location attribute refers to the latitude and longitude pair.
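
As a rough illustration of the attribute selection described above, the following Python sketch filters a lookup result down to the requested attributes and falls back to all attributes when none are given. The function name `select_attributes` and the `{"lat": ..., "lon": ...}` shape for `location` are assumptions made for this sketch; the proposal only says that location refers to latitude and longitude.

```python
from typing import Any, Dict, List, Optional

# Attribute names taken from the proposal above.
ALL_ATTRIBUTES: List[str] = [
    "ip", "city_name", "country_name", "continent_code", "country_iso_code",
    "postal_code", "region_name", "region_code", "timezone",
    "location", "latitude", "longitude",
]


def select_attributes(
    record: Dict[str, Any],
    attributes: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Keep only the requested attributes; default to every known attribute."""
    wanted = attributes if attributes else ALL_ATTRIBUTES
    selected: Dict[str, Any] = {}
    for name in wanted:
        if name == "location":
            # Assumed shape: "location" is derived from latitude and longitude.
            if "latitude" in record and "longitude" in record:
                selected["location"] = {
                    "lat": record["latitude"],
                    "lon": record["longitude"],
                }
        elif name in record:
            selected[name] = record[name]
    return selected


record = {
    "ip": "1.2.3.4",
    "city_name": "Seattle",
    "country_name": "United States",
    "latitude": 47.64097,
    "longitude": -122.25894,
}
subset = select_attributes(record, ["location", "city_name"])
everything = select_attributes(record)
```

Attributes that the lookup did not return are skipped rather than written as empty values; whether the real processor should do the same is an open design choice.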

Design considerations: whichever geo data source is used, the consumer needs to periodically update the plugin with the latest data.
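
One possible way to handle the periodic update is to reload the data lazily once it is older than a configured interval. The Python sketch below shows that idea only; `RefreshingDatabase`, its `loader` callback, and the injectable `clock` are hypothetical names for illustration, and a real implementation could just as well use a background scheduler.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Optional


class RefreshingDatabase:
    """Reload geo data through `loader` once it is older than `refresh_interval`."""

    def __init__(
        self,
        loader: Callable[[], Any],
        refresh_interval: timedelta,
        clock: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    ) -> None:
        self._loader = loader
        self._interval = refresh_interval
        self._clock = clock  # injectable so the behavior can be tested
        self._data: Any = None
        self._loaded_at: Optional[datetime] = None

    def get(self) -> Any:
        now = self._clock()
        stale = self._loaded_at is None or now - self._loaded_at >= self._interval
        if stale:
            self._data = self._loader()
            self._loaded_at = now
        return self._data


# Usage with a fake clock and a loader that counts how often it is called.
current_time = [datetime(2024, 1, 1, tzinfo=timezone.utc)]
load_calls = []


def load_database() -> dict:
    load_calls.append(current_time[0])
    return {"version": len(load_calls)}


db = RefreshingDatabase(load_database, timedelta(days=15), clock=lambda: current_time[0])
first = db.get()
current_time[0] += timedelta(days=1)
second = db.get()   # still fresh, no reload
current_time[0] += timedelta(days=15)
third = db.get()    # older than 15 days, reloaded
```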

Additional context

Resources:

MaxMind data: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data?lang=en
Amazon Location Service: https://aws.amazon.com/location/, https://docs.aws.amazon.com/location/latest/developerguide/search-place-index-geocoding.html
Java implementation: https://maxmind.github.io/GeoIP2-java/

@dlvenable
Member

@ashoktelukuntla, I'd like to propose that we decouple the service configuration from the pipeline.

In data-prepper-config.yaml:

plugins:
  geoip:
    location_service:
      maxmind:
        database_path: /usr/share/local/maxmind

The pipelines need only have:

pipeline1:
  processor:
    - geoip:
        source_key: "peer/ip"
pipeline2:
  processor:
    - geoip:
        source_key: "attributes/ip_address"

The work being done for #2588 can help support this type of configuration.
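
To illustrate the decoupling, here is a small Python sketch in which the service settings are defined once and combined with each pipeline's processor settings at load time. This is only a model of the idea using plain dictionaries; `effective_geoip_settings` is a hypothetical helper and does not reflect how Data Prepper actually loads its configuration.

```python
from typing import Any, Dict, List

# Mirrors the data-prepper-config.yaml fragment above.
app_config: Dict[str, Any] = {
    "plugins": {
        "geoip": {
            "location_service": {
                "maxmind": {"database_path": "/usr/share/local/maxmind"}
            }
        }
    }
}

# Mirrors the two pipeline definitions above.
pipelines: Dict[str, Any] = {
    "pipeline1": {"processor": [{"geoip": {"source_key": "peer/ip"}}]},
    "pipeline2": {"processor": [{"geoip": {"source_key": "attributes/ip_address"}}]},
}


def effective_geoip_settings(
    app_config: Dict[str, Any], pipelines: Dict[str, Any]
) -> Dict[str, List[Dict[str, Any]]]:
    """Attach the shared location service settings to every geoip processor."""
    shared_service = app_config["plugins"]["geoip"]["location_service"]
    resolved: Dict[str, List[Dict[str, Any]]] = {}
    for name, pipeline in pipelines.items():
        for processor in pipeline.get("processor", []):
            if "geoip" in processor:
                settings = {**processor["geoip"], "location_service": shared_service}
                resolved.setdefault(name, []).append(settings)
    return resolved


resolved = effective_geoip_settings(app_config, pipelines)
```

Both pipelines end up referring to the same service settings, so a change to the database path needs to be made in only one place.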

@jimishs

jimishs commented Jun 6, 2023

Hi team, this feature would also be tremendously useful for security analytics use cases. Can you share a timeline for this feature? Thanks.

@dlvenable dlvenable moved this from Backlog to 2.5 (Jul/Aug 2023) in Data Prepper Project Roadmap Aug 12, 2023
@dlvenable dlvenable added this to the v2.5 milestone Aug 12, 2023
@dlvenable dlvenable moved this from 2.5 (mid Oct 2023) to 2.6 (Nov/Dec 2023) in Data Prepper Project Roadmap Sep 19, 2023
@dlvenable dlvenable modified the milestones: v2.5, v2.6 Sep 19, 2023
@dlvenable dlvenable removed this from the v2.6 milestone Oct 6, 2023
@dlvenable dlvenable moved this from 2.6 (mid/late Nov 2023) to 2.7 (early 2024) in Data Prepper Project Roadmap Oct 6, 2023
@dlvenable dlvenable added this to the v2.7 milestone Oct 6, 2023
@dlvenable dlvenable moved this from 2.7 (Dec 2023) to 2.8 (early 2024) in Data Prepper Project Roadmap Nov 1, 2023
@dlvenable dlvenable modified the milestones: v2.7, v2.8 Nov 1, 2023
@dlvenable dlvenable moved this from 2.8 (early 2024) to 2.7 (early 2024) in Data Prepper Project Roadmap Nov 16, 2023
@dlvenable dlvenable moved this from 2.7 (early 2024) to 2.8 (early 2024) in Data Prepper Project Roadmap Nov 28, 2023
@dlvenable dlvenable modified the milestones: v2.8, v2.7 Dec 13, 2023
@dlvenable dlvenable moved this from 2.8 (early 2024) to 2.7 (early 2024) in Data Prepper Project Roadmap Dec 13, 2023
@dlvenable dlvenable removed the backlog label Dec 14, 2023
@dlvenable dlvenable self-assigned this Mar 21, 2024
@dlvenable
Member

Last PR to complete this implementation for 2.7: #4307

5 participants