
METRON-283 Migrate Geo Enrichment outside of MySQL #421

Closed

Conversation

@justinleet (Contributor) commented Jan 20, 2017

MySQL Removed

Drops MySQL entirely from the project. This is done for a few reasons outlined in a discussion thread on the dev lists: they boil down to licensing, eliminating a single point of failure, and the ability to use MaxMind's libraries for handling GeoLite2 data, along with a couple of other concerns.

This PR covers dependencies, installation paths, READMEs, etc. The only pieces left are MySQLConfig and MySQLConfigTest, in case anyone wants to use them. The vast majority of the removed files / code come from simply stripping this out. If any traces of MySQL outside of these are found in review, they should almost certainly be removed.

This moves to a system that uses MaxMind's binary DB directly. Per their page:

The GeoLite2 databases are distributed under the Creative Commons Attribution-ShareAlike 4.0 International License.

Our LICENSE file has been updated with the notification that we use their database (and some of their test data, which is Creative Commons Attribution-ShareAlike 3.0). Both of these licenses are acceptable for us, as stated on Apache Legal.

Let me know if that notification should be in a different spot or spots, and I can adjust it appropriately.

GeoLite2 Database

The main portion of the PR is in GeoLiteDatabase.java, which manages access to the GeoLite2 database.

Raw Database

The raw database is stored on HDFS, under /apps/metron/geo by default. If no explicit location is given, /apps/metron/geo/default/<db_filename> will be used. Otherwise, updates will use /apps/metron/geo/<millis>/<db_filename>. Given the low rate of churn on the DB (it's updated once per week) and the potential for replay use cases, I haven't implemented any pruning or anything fancy on top of this.
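For illustration only, here's a minimal sketch of the path scheme described above (the class and method names are hypothetical, not the PR's actual code):

public class GeoPathSketch {
  private static final String BASE_PATH = "/apps/metron/geo";

  // Default installs land in a stable "default" directory; updates land in a
  // millisecond-timestamped directory so readers of the old file aren't clobbered.
  static String resolveTargetPath(String dbFilename, boolean isDefault) {
    if (isDefault) {
      return BASE_PATH + "/default/" + dbFilename;
    }
    return BASE_PATH + "/" + System.currentTimeMillis() + "/" + dbFilename;
  }
}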

Updating DB

A script for updates, geo_enrichment_load.sh, is provided in metron-data-management. Usage details are provided in metron-data-management/README.md. Note that the original implementation didn't appear to have update capabilities at all.

The script will pull down a new copy of the GeoLite2 database. The source can be MaxMind's standard web address, some other hosted location, or even a file:// URL. Once the DB file is pulled down, the script pushes it to the appropriate HDFS location. Finally, it pulls down the global config and updates it with the new location. This does not require a topology restart.
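As a rough sketch of what the script does under the hood, assuming Hadoop's FileSystem API and Curator for the ZK write (the ZK path and all class/method names here are assumptions for illustration, not the script's actual code):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GeoLoadSketch {
  // Push a locally downloaded GeoLite2 file to HDFS, then point the global
  // config at the new location. Merging the new geo.hdfs.file entry into the
  // existing global config JSON is elided here.
  public static void loadAndUpdate(String localDb, String hdfsDest,
                                   String zkQuorum, byte[] updatedGlobalConfig)
      throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    fs.copyFromLocalFile(new Path(localDb), new Path(hdfsDest));

    CuratorFramework client = CuratorFrameworkFactory.newClient(
        zkQuorum, new ExponentialBackoffRetry(1000, 3));
    try {
      client.start();
      // The path "/metron/topology/global" is assumed here. Note this updates
      // ZK only; Ambari's copy of global.json is untouched.
      client.setData().forPath("/metron/topology/global", updatedGlobalConfig);
    } finally {
      client.close();
    }
  }
}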

Note that there have been conversations about how we manage config updates (specifically leaning towards Ambari). That hasn't been finalized, and we have two non-Ambari testing environments (quickdev and docker-metron), so this just hits ZK. Ambari is not updated by this script; it is the user's responsibility to update global.json.

This leads to a question people may have preferences on:

  • Do we want the script to always update? Should there be a flag to stage the file, but not update configs?

Code

GeoLiteDatabase is a singleton that allows the database to be updated when the global config is updated. It is (hopefully!) correctly locked to avoid threading issues when updating or reading from the DB (and I've been able to update without issues).
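The shape of that pattern, as a minimal sketch (illustrative, not the PR's exact code; it assumes MaxMind's DatabaseReader and a ReentrantReadWriteLock):

import com.maxmind.db.CHMCache;
import com.maxmind.geoip2.DatabaseReader;
import com.maxmind.geoip2.model.CityResponse;

import java.io.InputStream;
import java.net.InetAddress;
import java.util.Optional;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public enum GeoDatabaseSketch {
  INSTANCE;

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Lock readLock = lock.readLock();
  private final Lock writeLock = lock.writeLock();
  private DatabaseReader reader = null;

  // Swap in a new reader under the write lock, so in-flight reads drain first.
  public void update(InputStream newDb) throws Exception {
    writeLock.lock();
    try {
      DatabaseReader old = reader;
      reader = new DatabaseReader.Builder(newDb).withCache(new CHMCache()).build();
      if (old != null) {
        old.close();
      }
    } finally {
      writeLock.unlock();
    }
  }

  // Lookups share the read lock, so they can proceed concurrently between updates.
  public Optional<CityResponse> city(String ip) {
    readLock.lock();
    try {
      return Optional.ofNullable(
          reader == null ? null : reader.city(InetAddress.getByName(ip)));
    } catch (Exception e) {
      return Optional.empty();
    } finally {
      readLock.unlock();
    }
  }
}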

The various bolts have been updated to initialize the adapter so that it grabs the current data appropriately.

In addition, a Stellar function, GEO_GET(), has been provided, which takes an IPv4 address. It probably works with an IPv6 address, but I didn't really dig into it, given that the initial goal was to match parity.

Given the somewhat core nature of this, and my relative unfamiliarity going in with how all these pieces tie together, I'm definitely looking for feedback on how things are implemented, or if I missed conventions we've used in the code.

Testing

Unit testing is added for the database and Stellar portions of the code as needed. The DB testing uses one of MaxMind's published test DBs, because we can't create the binary format correctly ourselves. It does not use the full (20+ MB) version of the data, but rather a stripped-down version (on the order of several KB).

Three environments were tested during this. Having these three disparate environments makes cross-cutting features like this more complicated to test, so additional scrutiny would be merited (I would definitely like at least one person to run through one of these themselves and make sure it's transparent). Notably, quickdev requires the Ansible setup scripts to align; the mpack requires layout, internal configuration, and handling of additional files and ownership of scripts to work properly; and docker-metron requires essentially cheating the scripts and just running a wget on the file, because things aren't actually set up.

  • quickdev
  • Ambari Management Pack
  • docker-metron

QuickDev

Ansible scripts are updated. Running data through the topologies produced the expected enriched output.

Ambari Management Pack

RPMs were updated where needed. The config screen layout changed, and updates were made to properly handle configs and ownership. Ran Stellar on this install. Again, ran data through the topology.

Docker

Essentially this just involved cheating the scripts and running a wget on the GeoLite2 DB file, because there's no Hadoop. Ran through the instructions to run the topologies (which are a little different than the others because of Docker) and again was able to get data back out.

Additional Notes

  • As noted above, do we want the DB script to always update? Should there be a flag to stage the file, but not update configs? I primarily see this affecting the mpack because of the Ambari management behind it.
  • Increased withMaxTimeMS in the indexing integration test. This seems unrelated to my changes (and I believe it had been seen elsewhere), so if anybody has found the root cause, I can adjust my code appropriately. Outside issue (and the code is reverted in this PR); see: https://issues.apache.org/jira/browse/METRON-672.
  • LocID doesn't technically exist in the new data, and I suspect it was never meant to be relied upon anyway, outside of being a join key. The same applies to the new field that replaces it in this context. It seems like we were mostly just passing that field along because it was available, and it should probably be refactored to be more useful. I didn't take on that analysis here; this is a slightly more validated version of a gut feeling.
  • The newer form of the MaxMind info has more data available than the old source we were using. We should consider passing at least some of this data along. See MaxMind's What's New in GeoIP2. One field that leapt out at me as potentially interesting contains where an IP was registered, rather than just where the IP actually is. Another is the is_anonymous_proxy field, etc. I didn't validate whether everything new is in the free version of the dataset.

@justinleet (Contributor Author)

Expanding the Stellar GEO_GET function to allow for just returning a map of specified fields (if more than one), or just the direct value (if just one field requested) would be very nice.

e.g. GEO_GET(ip, ['country', 'city']) returns {"country": "US", "city": "Chicago"}, and GEO_GET(ip, ['country']) just returns 'US'
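A minimal sketch of that return-shaping logic (names are hypothetical, for illustration only):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GeoGetResultSketch {
  // One requested field yields the bare value; several yield a map of
  // field -> value, per the examples above.
  static Object shapeResult(Map<String, String> allFields, List<String> requested) {
    if (requested.size() == 1) {
      return allFields.get(requested.get(0));
    }
    Map<String, String> subset = new HashMap<>();
    for (String field : requested) {
      subset.put(field, allFields.get(field));
    }
    return subset;
  }
}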

@cestella (Member) left a comment

This looks great...first pass through and a few comments, but I love it so far!

@@ -210,6 +210,12 @@ This product bundles some test examples from the Stix project (metron-platform/m

This product bundles wait-for-it.sh, which is available under a "MIT Software License" license. For details, see https://github.com/vishnubob/wait-for-it

------------------------------------------------------------------------------------

Member

Could you make the license explicit here?

Contributor Author

Added header with license similar to the MIT one above

Contributor Author

Updated with another license. The testing DB used is under a ShareAlike 3.0 (not 4.0) license, so I explicitly called it out and added a link to the GitHub repo.

@@ -225,6 +163,8 @@ indexing.executors=0
kafka.zk={{ zookeeper_quorum }}
kafka.broker={{ kafka_brokers }}
kafka.start=WHERE_I_LEFT_OFF
##### GEO #####
geo.hdfs.file={{ geoip_hdfs_file }}
Member

Why is the geo IP data file in HDFS configured in flux rather than zookeeper?

Contributor Author

Shouldn't be there at all, deleted it.

@@ -81,7 +81,6 @@ public JSONObject enrich(CacheKey value) {
Map<String, Object> globalConfig = value.getConfig().getConfiguration();
Map<String, Object> sensorConfig = value.getConfig().getEnrichment().getConfig();
if(handler == null) {
  _LOG.trace("Stellar ConfigHandler is null.");
Member

Didn't like the trace statements there?

Contributor Author

Added back in. Didn't intend to drop them at all.

@@ -161,6 +167,7 @@ protected void initializeStellar() {
stellarContext = new Context.Builder()
    .with(Context.Capabilities.ZOOKEEPER_CLIENT, () -> client)
    .with(Context.Capabilities.GLOBAL_CONFIG, () -> getConfigurations().getGlobalConfig())
    .with(Context.Capabilities.GEO_IP, () -> GeoLiteDatabase.INSTANCE)
Member

Since the INSTANCE is static, is there a reason to not just refer to it directly in the Stellar function rather than adding a capability? I always envisioned capabilities to be for configs which were changing at runtime.

Contributor

Also, why is this in the GenericEnrichment bolt? Am I mistaken, or wouldn't this be better located in the GEO bolt?

Contributor Author

@cestella Capability was left over from earlier testing. Everything referred to the static instance anyway for precisely that reason. Dropped it entirely and removed from the enum.

@dlyle65535 There's no geo-specific bolt; it's an instance of GenericEnrichmentBolt. To the best of my knowledge (and correct me if I'm wrong), you can't grab global configs in the adapters. This specific one gets dropped, but the prepare() method references the static INSTANCE Casey refers to. Is there somewhere else that could live?

@@ -63,12 +65,15 @@ public ThreatIntelJoinBolt(String zookeeperUrl) {
@Override
public void prepare(Map map, TopologyContext topologyContext) {
  super.prepare(map, topologyContext);
  GeoLiteDatabase.INSTANCE.update((String)getConfigurations().getGlobalConfig().get(GeoLiteDatabase.GEO_HDFS_FILE));
Member

This can certainly stand now, but it opens the question as to whether we want to register some easier way to do this than remembering to call the right set of functions in all of the bolts that use stellar as a follow-on JIRA. Just a thought.

Contributor Author

I completely agree with this, particularly because this pattern shows up a couple of times, but I didn't see a particularly good way to avoid it.

I'm definitely open to suggestions if there's anything we want to do short term to make this a little better or clearer.

@justinleet (Contributor Author)

As a note to anyone coming in late: at least one comment (David's) is still relevant but hidden behind a collapsed, outdated diff. The line he commented on needed to be deleted, but the point he brought up is still under active discussion.

@justinleet (Contributor Author)

Forgot to comment on it, but the GEO_GET call was updated a bit ago per my comment about lists and fields. That change should also be reviewed and given any feedback.

@nickwallen (Contributor) left a comment

This is a really nice contribution. Love that we don't need MySQL any longer. This was a ton of work. Thanks, Justin.

@@ -317,6 +316,8 @@ This package installs the Metron Profiler %{metron_home}
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

%changelog
* Thu Jan 19 2017 Justin Leet <justinjleet@gmail.com> - 0.3.0
Contributor

Unrelated to this PR, but I thought it was un-Apache to put individuals' names and emails in the source code?

Contributor Author

I'm not even sure who we should talk to about how it should be handled. I'm more than happy to adjust the changelog as needed, but I need to know how to change it first.

@cestella (Member) commented Jan 25, 2017

It is true that Apache strongly recommends against (or maybe outright forbids) things like author tags in source code (see here for a discussion); the reasoning was mostly that author tags end up not being accurately representative.

For this situation, however, this is a changelog, so it doesn't have the problem of accurate representation that author tags do. That being said, it is redundant information because such information is stored in git.

I'd recommend doing a dev list discussion with a subject that starts with [MENTORS] on this to see what the ASF wants.

Contributor

I'd just yank that section or leave it blank if it's required. No point in figuring out how to keep it if we're agreed we don't need it.

Member

+1 to that @dlyle65535

Contributor Author

I sent out the email before I saw this, but I'm +1 on just dropping it. I'll pull it out in this PR, and whatever answer we get from the mentors will just be for our info.

Contributor

But so we're clear: with the changelog in place you can interrogate the RPM directly to find out what's changed ("rpm -q --changelog"). We'd lose that capability. I'm still good with omitting it, but I wanted to make sure everyone was aware of that consequence.

Member

Hmm, you're right @dlyle65535, we do lose some capability by not having the changelog in there. You know what would be cool: if we could generate the spec files as part of Maven's generate-sources phase and fill in the changelog with the information from git. Anyway, it's a best-of-both-worlds kind of thing, but not effort that should be done as part of this PR.

Contributor

I think that's where I'm at with it. How about we let @justinleet's mentor thread confirm/deny that it's okay to keep it and then handle it as a directed effort after?

Member

sounds good :)

@@ -586,7 +586,6 @@ include $RULE_PATH/community.rules
# include $RULE_PATH/malware-tools.rules
# include $RULE_PATH/misc.rules
# include $RULE_PATH/multimedia.rules
# include $RULE_PATH/mysql.rules
Contributor

I don't think that we actually want to remove the MySQL rules from Snort. These rules help Snort create alerts based on MySQL databases that may be running in the protected network.

Contributor Author

Good catch. I'll add it back in.

The shell script `$METRON_HOME/bin/geo_enrichment_load.sh` will retrieve MaxMind GeoLite2 data, load it into HDFS, and update the configuration.

THIS SCRIPT WILL NOT UPDATE AMBARI'S GLOBAL.JSON, JUST THE ZK CONFIGS. CHANGES WILL GO INTO EFFECT, BUT WILL NOT PERSIST PAST AN AMBARI RESTART UNTIL UPDATED THERE.

Contributor

This seems like something that could cause a really annoying problem for a user. Is there no simple action we can take in this PR to stop Ambari from clobbering things?

Obviously, deploying/updating the geo database in an Ambari "service action" would be ideal. Would be nice to have one way to do it that does not break on restart.

Or maybe we need to accept this for now and once we have Ansible deployments using the MPack, we can implement an Ambari "service action" for this?

Member

It's annoying, for sure. It's a persistent problem and one of the reasons for that discuss thread about how to handle configurations. I believe the conclusion arrived at there is that, in the future state, we should push changes through Ambari, which will then update ZooKeeper, thereby avoiding this unfortunate state.

Contributor Author

This gets into the whole "How do we manage configs?" discussion. Unfortunately, it's in a really awkward spot. I might be able to add a service action to sort of take care of it, but it still probably does the same end run around Ambari's management, except it goes through the UI. I don't know that there's a good solution to this until we unify our config management, which is the real answer here.

Given that I don't think there was originally support for updating the DB, I'm inclined to clean it up as part of unifying config management. It's ugly and I don't like it, but I can't come up with a good short-term alternative that isn't ugly for other reasons anyway.

  return null;
}
if(args.size() > 2) {
  return null;
Contributor

Should we throw an exception? If a user misuses the function, then let's tell them.

Contributor Author

Is there a well-defined expectation for what happens in cases like this? I don't think I've seen anything, and on reflection I'm pretty sure I just based that on what other functions did.

@cestella Can you shed some light on what perceptions/expectations exist for this case? Or shame my reading comprehension by pointing me to where we have them documented, if we do?

Member

So it is true that we do not tend to throw exceptions on some errors in Stellar functions for reasons that may or may not be valid. The original intent was to not stop topologies because an error has happened in a specific function. I happen to think that this isn't correct reasoning anymore, but that's a discussion for another thread.

In this specific case, that the user has passed in too many arguments, I think you could make a reasonable argument for a couple of different semantics:

  • do nothing, passing in too many arguments does no harm
  • throw an exception as Nick suggests
  • return null

I actually think I agree with Nick in this case; I'd rather err on the side of caution and throw an exception here, so we don't have people using functions incorrectly without realizing it.

Contributor Author

Now throwing an IllegalArgumentException, and I've adjusted the test accordingly.
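For illustration, a minimal sketch of that kind of fail-fast arity check (names are hypothetical, not the PR's exact code):

import java.util.List;

public class GeoGetArgCheckSketch {
  // Wrong arity is a "static" misuse, not a data-quality problem,
  // so fail fast instead of quietly returning null.
  static void validateArgs(List<Object> args) {
    if (args == null || args.isEmpty() || args.size() > 2) {
      throw new IllegalArgumentException(
          "GEO_GET expects 1 or 2 arguments: an IP and an optional list of fields");
    }
  }
}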

Contributor

It is a little tricky because we don't want a data quality issue to impact the system negatively, but at the same time failing fast can make the user experience much better.

I have been thinking about it this way. If it is a condition that can be triggered by a data quality issue, then try to be as forgiving as possible. If it is clearly a 'static' condition, like I passed the wrong number of arguments, then it is appropriate to fail-fast.

return addr.isAnyLocalAddress() || addr.isLoopbackAddress()
    || addr.isSiteLocalAddress() || addr.isMulticastAddress()
    || !ipvalidator.isValidInet4Address(ipStr);
}
Contributor

Method name may be a little misleading. It is more answering the question of whether an IP is eligible for geo-enrichment; not whether it is a local address.

Contributor Author

I agree, I'll go ahead and change the method name.
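For example, something along these lines (the new method name is hypothetical; it just illustrates naming the check after geo-enrichment eligibility, using the same commons-validator check as the diff above):

import org.apache.commons.validator.routines.InetAddressValidator;

import java.net.InetAddress;
import java.net.UnknownHostException;

public class GeoEligibilitySketch {
  private static final InetAddressValidator ipvalidator =
      InetAddressValidator.getInstance();

  // Hypothetical rename: the check answers "can this IP be geo-enriched?",
  // not merely "is it a local address?".
  static boolean isIneligibleForGeoEnrichment(String ipStr) throws UnknownHostException {
    InetAddress addr = InetAddress.getByName(ipStr);
    return addr.isAnyLocalAddress() || addr.isLoopbackAddress()
        || addr.isSiteLocalAddress() || addr.isMulticastAddress()
        || !ipvalidator.isValidInet4Address(ipStr);
  }
}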

"fields - Optional list of GeoIP fields to grab. Options are locID, country, city, postalCode, dmaCode, latitude, longitude, location_point"
}
,returns = "If a Single field is requested a string of the field, If multiple fields a map of string of the fields, and null otherwise"
)
Contributor

Seems useful. Would we ever deprecate the current way of doing enrichment in favor of just using Stellar and a function like this? May take some more work, but that might simplify things.

(Comment for future state; obviously outside the scope of your PR)

Member

Outside the scope of this PR, I agree, but I just wanted to chime in and agree with the sentiment. Now that we have geo IP enrichment, honestly, Stellar enrichments aren't missing any features that the other enrichments have. Worthy of a discussion, for sure.

Contributor Author

Do we want to kick off a discuss thread for this? Seems like we might as well start the discussion rather than potentially having it buried here. And the discussion is useful even before everyone's settled on this PR.

Contributor

Sure. Will do.

@@ -149,9 +154,10 @@ public JSONObject load(CacheKey key) throws Exception {
cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
    .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
    .build(loader);
GeoLiteDatabase.INSTANCE.update((String)getConfigurations().getGlobalConfig().get(GeoLiteDatabase.GEO_HDFS_FILE));
boolean success = adapter.initializeAdapter();
Contributor

Does this mean that we will attempt to load the geo database into memory in every GenericEnrichmentBolt, even ones not doing geo-enrichment? Maybe that is one reason we have a singleton for the GeoLiteDatabase: to avoid repeated initialization?

Along the same lines, if I choose not to do geo-enrichment, do I still need to have a valid MaxMind file in HDFS? Or would that cause all GenericEnrichmentBolts to fail to initialize?

Would it make sense to update the EnrichmentAdapter interface to accept configuration values, so that this initialization occurs in the GeoAdapter where it makes more sense? Then only the bolts doing geo-enrichment attempt to initialize the geo data?

Contributor Author

You're right that if we adjust the enrichment adapter to accept configuration values, this can be pushed to the GeoAdapter, where I agree it makes more sense.

Are there concerns over changing the interface method directly, or would we prefer to give EnrichmentAdapter an initializeAdapter(Map<String, Object> config) with a default impl that calls initializeAdapter() with no args? The second option makes it a bit ugly that GeoAdapter still has the original initializeAdapter(), but it has the benefit of keeping the interface backwards compatible for anybody with custom adapters (is that a thing people have?).

And you are correct that there is an issue with initializing a missing DB (great catch!). If we make this change, I believe the only time you'd have an issue is if you set up a geo enrichment but didn't actually set up the DB, which is an entirely reasonable situation in which to raise an error.

@dlyle65535 Would this address your concerns about having the init in the GenericEnrichmentBolt in a satisfactory manner?

Contributor

@justinleet - I think that's a great solution, thanks!

import java.util.zip.GZIPInputStream;

public enum GeoLiteDatabase {
  INSTANCE;
Contributor

Why did you go with a singleton? I'm sure you thought through this and there are good reasons, so just want to understand. I just don't like the extra burden that using a singleton brings on (like the extra locking).

Member

So, I'll defend the singleton use:

  • We can reuse the same database across multiple storm tasks on the same storm worker
  • Even without a singleton, you'd have to deal with the locking in the Stellar function since those are effectively static; i.e., you're reading a database that could be updated or replaced out from under you.

I'll let Justin respond with his reasons, but those are the ones that came to my head.

Contributor Author

It's primarily because the database file could be used and accessed by multiple threads of execution, depending on Storm's parallelism. If multiple tasks are running, they could potentially start stomping on each other's data.

The burden shouldn't be bad, because usage primarily consists of reading. The alternative is having potentially every thread pull down its own version of the file and load it, while avoiding tripping up other threads.

Having said that, I am definitely open to other ideas if we have good alternatives.

@nickwallen (Contributor) left a comment

I wanted to look over the locking code one more time.

try {
  readLock.lock();
  addr = InetAddress.getByName(ip);
  CityResponse cityResponse = reader.city(addr);
Contributor

Would it make sense to constrain the scope of the read lock? Once we have a CityResponse object we don't need the lock any longer.
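A sketch of what that narrower critical section might look like, assuming the readLock and reader fields from the class under review (the surrounding method is hypothetical):

// Sketch only: hold the read lock just for the lookup itself.
public Map<String, String> lookupSketch(String ip) throws Exception {
  InetAddress addr = InetAddress.getByName(ip);   // no lock needed here
  CityResponse cityResponse;
  readLock.lock();
  try {
    cityResponse = reader.city(addr);             // only this touches shared state
  } finally {
    readLock.unlock();
  }
  // CityResponse is a plain value object, so building the result map
  // no longer blocks a pending database swap.
  Map<String, String> result = new HashMap<>();
  result.put("country", cityResponse.getCountry().getIsoCode());
  result.put("city", cityResponse.getCity().getName());
  return result;
}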

}

// Always update if we don't have a DatabaseReader
if (reader == null || !hdfsLoc.equals(hdfsFile)) {
Contributor

Does this null check need to be protected by a lock? I am really not sure.

Contributor Author

updateIfNecessary is actually synchronized, so it's already locked. Given that it's only run when the global config is updated, which is rare, the impact will be very low.
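Roughly, based on the fragment above (a sketch, not the complete method):

// Synchronizing the whole method makes the null/path check and the
// update atomic; it only runs on (rare) global config updates.
public synchronized void updateIfNecessary(Map<String, Object> globalConfig) {
  String hdfsFile = (String) globalConfig.get(GEO_HDFS_FILE);
  // Always update if we don't have a DatabaseReader
  if (reader == null || !hdfsLoc.equals(hdfsFile)) {
    update(hdfsFile);  // pulls the new file and swaps the reader
  }
}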

@justinleet (Contributor Author)

Updated to replace EnrichmentAdapter.initializeAdapter() with EnrichmentAdapter.initializeAdapter(Map<String, Object> config). Also added updateAdapter(Map<String, Object> config). Right now the configs only get used by GeoAdapter, but any adapter can use the config it is passed as needed. All implementing classes have the method updated, but simply ignore the config param. (The reworked contract is sketched after this comment.)

This removes GenericEnrichmentBolt's direct dependency on GeoLiteDatabase. The bolt simply calls init and update, delegating to each adapter what to do (if anything). Update is called during reloadCallback when the global config updates.

I'll want to spin it up again to further validate, but this addresses @nickwallen's catch about topologies that don't use geo needing the geo data to exist, and @dlyle65535's concern about GenericEnrichmentBolt needing to know about geo for everything. Now errors should only occur if a topology that uses geo data can't find it.
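As a sketch, the reworked adapter contract described in the comment above looks roughly like this (illustrative only; the real interface has more members):

import java.util.Map;

// Illustrative only: the actual EnrichmentAdapter interface has more methods.
public interface EnrichmentAdapterSketch<T> {
  // Adapters now receive config at init time, so geo-specific setup can
  // live in GeoAdapter instead of GenericEnrichmentBolt.
  boolean initializeAdapter(Map<String, Object> config);

  // Called from reloadCallback when the global config changes.
  void updateAdapter(Map<String, Object> config);
}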

@nickwallen (Contributor) left a comment

Looks great. +1

I ran it up on "Quick Dev" and geo-enrichment was working as expected.

@cestella (Member)

+1 as well, great job, Justin.

@dlyle65535 (Contributor)

Ditto +1. Thanks for all the hard work!

@justinleet (Contributor Author)

@cestella @dlyle65535 @nickwallen I added one more change to EnrichmentIntegrationTest so it doesn't create and destroy some tmp stuff that isn't used in the test. Let me know if there are any concerns.

@nickwallen (Contributor)

+1 looks good

@dlyle65535 (Contributor)

Still +1 @justinleet - thanks for checking.

@cestella (Member)

reaffirmed +1

@asfgit asfgit closed this in fa6f3df Jan 26, 2017
@justinleet justinleet deleted the geo_mmdb branch April 4, 2017 12:14