This repository has been archived by the owner on May 12, 2021. It is now read-only.

METRON-733: Remove Geo database from ParserBolt #461

Closed
wants to merge 4 commits into from


justinleet
Contributor

To reproduce the original problem, start a parser and make sure there is no geo data on HDFS (by default in /apps/metron/geo).

This PR removes geo from metron-parsers entirely, since it shouldn't be necessary there at all; it is only pulled in when Stellar cares about it.

Testing

To validate this, I ran through the squid demo with the geo file missing, which works as expected (no exception is thrown). In addition, ParserBoltTest is updated so that it no longer references the test Geo DB data (and it runs fine without it).

To ensure that Stellar GEO_GET works as expected in a parser, I spun up quick-dev and followed the steps for squid, but with a custom parser config:

{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "parserConfig": {
    "grokPath": "/patterns/squid",
    "patternLabel": "SQUID_DELIMITED",
    "timestampField": "timestamp"
  },
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "output": [ "geo_test" ],
      "config": {
        "geo_test": "GEO_GET(ip_dst_addr)"
      }
    }
  ]
}
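For anyone who wants to vary the Stellar expression while testing, here is a minimal Python sketch that generates the same config and sanity-checks it before pushing. The helper names `build_parser_config` and `missing_expressions` are hypothetical, and the check (every field listed in "output" has an entry in "config") is an assumption about what makes a config sensible, not something taken from Metron's own validation.

```python
import json


def build_parser_config(output_field, stellar_expression):
    """Return a squid parser config with one STELLAR field transformation.

    Hypothetical helper: the values mirror the config used in this PR.
    """
    return {
        "parserClassName": "org.apache.metron.parsers.GrokParser",
        "sensorTopic": "squid",
        "parserConfig": {
            "grokPath": "/patterns/squid",
            "patternLabel": "SQUID_DELIMITED",
            "timestampField": "timestamp",
        },
        "fieldTransformations": [
            {
                "transformation": "STELLAR",
                "output": [output_field],
                "config": {output_field: stellar_expression},
            }
        ],
    }


def missing_expressions(config):
    """List output fields that have no matching entry in their 'config' map.

    Assumed sanity check, not Metron's own validation logic.
    """
    missing = []
    for transformation in config.get("fieldTransformations", []):
        expressions = transformation.get("config", {})
        for field in transformation.get("output", []):
            if field not in expressions:
                missing.append(field)
    return missing


config = build_parser_config("geo_test", "GEO_GET(ip_dst_addr)")
config_json = json.dumps(config, indent=2)  # text to save as the squid parser config
```

An empty list from `missing_expressions(config)` means every output field has an expression behind it.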

Either update global.json with a valid geo.hdfs.file or run /usr/metron/0.3.1/bin/geo_enrichment_load.sh -z node1:2181 -r /apps/metron/geo/default/ to place the file in the default spot (instead of a timestamped spot). This is necessary to ensure that the push doesn't clobber geo configs.

The resulting data in the index includes:

{
...
  "geo_test": {
    "country": "US",
    "dmaCode": "807",
    "city": "San Francisco",
    "postalCode": "94107",
    "latitude": "37.7697",
    "location_point": "37.7697,-122.3933",
    "locID": "5391959",
    "longitude": "-122.3933"
  },
...
  "ip_dst_addr": "151.101.192.73",
...
}
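A rough way to check a document like the one above when repeating this test is sketched below in Python. The function `check_geo_result` is hypothetical, and the list of required keys is an assumption drawn only from the sample output here, not from any documented GEO_GET contract. The sample document contains just the fields shown above.

```python
def check_geo_result(doc, field="geo_test"):
    """Return a list of problems found in the geo map stored under `field`.

    Hypothetical check: the required keys are assumed from the sample output.
    """
    geo = doc.get(field)
    if not isinstance(geo, dict):
        return [field + " is missing or is not a map"]

    problems = []
    for key in ("country", "city", "latitude", "longitude", "location_point"):
        if key not in geo:
            problems.append("missing key: " + key)

    if not problems:
        # In the sample, location_point is "latitude,longitude".
        expected = geo["latitude"] + "," + geo["longitude"]
        if geo["location_point"] != expected:
            problems.append("location_point does not match latitude,longitude")
    return problems


# Only the fields visible in the indexed document above.
sample = {
    "geo_test": {
        "country": "US",
        "dmaCode": "807",
        "city": "San Francisco",
        "postalCode": "94107",
        "latitude": "37.7697",
        "location_point": "37.7697,-122.3933",
        "locID": "5391959",
        "longitude": "-122.3933",
    },
    "ip_dst_addr": "151.101.192.73",
}

problems = check_geo_result(sample)
```

An empty `problems` list indicates the Stellar transformation produced a populated geo map for the destination address.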

@ottobackwards
Contributor

Does this mean you can get rid of the local geo database in the parsers resource dir?

@justinleet
Contributor Author

Updated to drop the local geo database in parsers.

@nickwallen
Contributor

The build error does not seem related to your PR. I saw the same failure on #450.

The resource com.fasterxml:oss-parent:pom:28 really does not exist in Maven Central, AFAIK. Hmm.

[ERROR] Failed to execute goal on project metron-enrichment: Could not resolve dependencies 
for project org.apache.metron:metron-enrichment:jar:0.3.1: 
Failed to collect dependencies at com.maxmind.geoip2:geoip2:jar:2.8.0 ->
 com.maxmind.db:maxmind-db:jar:1.2.1 -> 
com.fasterxml.jackson.core:jackson-databind:jar:2.9.0-SNAPSHOT: 
Failed to read artifact descriptor for com.fasterxml.jackson.core:jackson-databind:jar:2.9.0-SNAPSHOT: 
Could not find artifact com.fasterxml:oss-parent:pom:28 in central 
(http://repo.maven.apache.org/maven2) -> [Help 1]
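To make the transitive path in that error easier to read, here is a small Python sketch that pulls the dependency chain out of the message. The function name `dependency_chain` and the regular expression are my own assumptions about the message layout shown above; this is not a Maven feature.

```python
import re

# Matches Maven coordinates of the form group:artifact:jar:version.
COORDINATE = re.compile(r"[\w.\-]+:[\w.\-]+:jar:[\w.\-]+")


def dependency_chain(message):
    """Return the coordinates listed after 'Failed to collect dependencies at'.

    Hypothetical helper: it assumes the message layout seen in this build log.
    """
    marker = "Failed to collect dependencies at"
    start = message.find(marker)
    if start < 0:
        return []
    # Stop before the follow-up sentence so each coordinate is listed once.
    end = message.find("Failed to read artifact descriptor", start)
    section = message[start:end] if end >= 0 else message[start:]
    return COORDINATE.findall(section)


message = (
    "[ERROR] Failed to execute goal on project metron-enrichment: Could not resolve dependencies\n"
    "for project org.apache.metron:metron-enrichment:jar:0.3.1:\n"
    "Failed to collect dependencies at com.maxmind.geoip2:geoip2:jar:2.8.0 ->\n"
    " com.maxmind.db:maxmind-db:jar:1.2.1 ->\n"
    "com.fasterxml.jackson.core:jackson-databind:jar:2.9.0-SNAPSHOT:\n"
    "Failed to read artifact descriptor for com.fasterxml.jackson.core:jackson-databind:jar:2.9.0-SNAPSHOT:\n"
    "Could not find artifact com.fasterxml:oss-parent:pom:28 in central\n"
)

chain = dependency_chain(message)
```

Read left to right, `chain` shows geoip2 pulling in maxmind-db, which in turn pulls in the jackson-databind snapshot whose parent POM cannot be found.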

@justinleet
Contributor Author

@nickwallen It is a separate issue. https://issues.apache.org/jira/browse/METRON-734 exists to track and fix it.

@ottobackwards
Contributor

ottobackwards commented Feb 22, 2017 via email

@cestella
Member

+1 by inspection, looks good! Thanks @justinleet

@asfgit asfgit closed this in 898d236 Feb 24, 2017
@justinleet justinleet deleted the geo_profiler_fix branch April 4, 2017 12:14
4 participants