Removed enrichment-specific code from EmrEtlRunner (snowplow/snowplow#811)

Replaced enrichment-specific args from Emr-EtlRunner -> Hadoop Enrich, replaced with enrichments JSON (snowplow/snowplow#808)
fblundun authored and peel committed May 25, 2020
1 parent 286fa7d commit dc20196
Showing 1 changed file with 2 additions and 12 deletions.
14 changes: 2 additions & 12 deletions lib/snowplow-emr-etl-runner/emr_job.rb
@@ -162,8 +162,7 @@ def initialize(debug, shred, s3distcp, config)
           },
           { :input_format => config[:etl][:collector_format],
             :etl_tstamp => etl_tstamp,
-            :maxmind_file => assets[:maxmind],
-            :anon_ip_octets => config[:enrichments][:anon_ip]
+            :enrichments => Base64.strict_encode(JSON.generate(build_enrichments_json(config)))
           }
         )
         @jobflow.add_step(enrich_step)
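
The hunk above swaps two enrichment-specific arguments for a single `:enrichments` argument that carries Base64-encoded JSON. A minimal sketch of that encode/decode round trip follows. The helper names `build_enrichments_envelope`, `encode_enrichments`, and `decode_enrichments` and the sample payload are hypothetical and not taken from the repository; only the schema URI comes from the diff. Ruby's standard library names its newline-free encoder `Base64.strict_encode64`, which is what the sketch calls.

```ruby
require 'json'
require 'base64'

# Hypothetical stand-in for build_enrichments_json: wraps a list of
# enrichment configs in the self-describing envelope shown in the diff.
def build_enrichments_envelope(enrichments_data)
  {
    :schema => 'iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0',
    :data   => enrichments_data
  }
end

# Serialise to JSON, then Base64-encode without line breaks so the value
# can travel as a single job argument.
def encode_enrichments(envelope)
  Base64.strict_encode64(JSON.generate(envelope))
end

# Inverse operation, as a consumer of the argument might perform it.
def decode_enrichments(encoded)
  JSON.parse(Base64.strict_decode64(encoded))
end

# Illustrative payload only; the real enrichment entries are not shown in this diff.
sample_data = [{ 'name' => 'anon_ip', 'parameters' => { 'anonOctets' => 1 } }]
encoded = encode_enrichments(build_enrichments_envelope(sample_data))
decoded = decode_enrichments(encoded)
puts decoded['schema']
```

Packing everything into one opaque argument means the runner no longer needs a dedicated flag per enrichment, which appears to be the motivation for removing `:maxmind_file` and `:anon_ip_octets`.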
@@ -341,7 +340,6 @@ def self.build_enrichments_json(config)
         :schema => 'iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0',
         :data => enrichments_json_data
       }
-
     end
 
     Contract IgluConfigHash => String
@@ -351,15 +349,7 @@ def self.jsonify(iglu_hash)
 
     Contract String, String, String => AssetsHash
     def self.get_assets(assets_bucket, hadoop_enrich_version, hadoop_shred_version)
-
-      asset_host =
-        if assets_bucket == "s3://snowplow-hosted-assets/"
-          "http://snowplow-hosted-assets.s3.amazonaws.com/" # Use the public S3 URL
-        else
-          assets_bucket
-        end
-
-      { :maxmind => "#{asset_host}third-party/maxmind/GeoLiteCity.dat",
+      {
         :enrich => "#{assets_bucket}3-enrich/hadoop-etl/snowplow-hadoop-etl-#{hadoop_enrich_version}.jar",
         :shred => "#{assets_bucket}3-enrich/scala-hadoop-shred/snowplow-hadoop-shred-#{hadoop_shred_version}.jar",
       }
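
After this change, `get_assets` only derives the two jar locations from the bucket and version strings. The standalone sketch below mirrors that behaviour, assuming the bucket string ends with a trailing slash. The `Contract` annotation from the original is omitted so the snippet runs without extra gems, and the version numbers are placeholders rather than real releases.

```ruby
# Standalone sketch of the simplified asset lookup (no Contract annotation).
# Assumes assets_bucket already ends with "/".
def get_assets(assets_bucket, hadoop_enrich_version, hadoop_shred_version)
  {
    :enrich => "#{assets_bucket}3-enrich/hadoop-etl/snowplow-hadoop-etl-#{hadoop_enrich_version}.jar",
    :shred  => "#{assets_bucket}3-enrich/scala-hadoop-shred/snowplow-hadoop-shred-#{hadoop_shred_version}.jar"
  }
end

# Placeholder versions, for illustration only.
assets = get_assets('s3://snowplow-hosted-assets/', '0.0.0', '0.0.0')
puts assets[:enrich]
puts assets[:shred]
```

The returned hash no longer has a `:maxmind` key, so any caller still reading `assets[:maxmind]` would get `nil`.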