Removed enrichment-specific code from EmrEtlRunner (#811)
Replaced enrichment-specific args from EmrEtlRunner -> Hadoop Enrich with enrichments JSON (#808)
fblundun committed Jul 9, 2014
1 parent a0287d8 commit 5b30db3
Showing 2 changed files with 10 additions and 12 deletions.
14 changes: 2 additions & 12 deletions 3-enrich/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb
@@ -162,8 +162,7 @@ def initialize(debug, shred, s3distcp, config)
},
{ :input_format => config[:etl][:collector_format],
:etl_tstamp => etl_tstamp,
- :maxmind_file => assets[:maxmind],
- :anon_ip_octets => config[:enrichments][:anon_ip]
+ :enrichments => Base64.strict_encode64(JSON.generate(build_enrichments_json(config)))
}
)
@jobflow.add_step(enrich_step)
@@ -341,7 +340,6 @@ def self.build_enrichments_json(config)
:schema => 'iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0',
:data => enrichments_json_data
}
-
end

Contract IgluConfigHash => String
@@ -351,15 +349,7 @@ def self.jsonify(iglu_hash)

Contract String, String, String => AssetsHash
def self.get_assets(assets_bucket, hadoop_enrich_version, hadoop_shred_version)
-
- asset_host =
- if assets_bucket == "s3://snowplow-hosted-assets/"
- "http://snowplow-hosted-assets.s3.amazonaws.com/" # Use the public S3 URL
- else
- assets_bucket
- end
-
- { :maxmind => "#{asset_host}third-party/maxmind/GeoLiteCity.dat",
+ {
:enrich => "#{assets_bucket}3-enrich/hadoop-etl/snowplow-hadoop-etl-#{hadoop_enrich_version}.jar",
:shred => "#{assets_bucket}3-enrich/scala-hadoop-shred/snowplow-hadoop-shred-#{hadoop_shred_version}.jar",
}
8 changes: 8 additions & 0 deletions CHANGELOG
@@ -1,7 +1,15 @@
Version 0.9.X (2014-XX-XX)
--------------------------
Schemas: added schema for collection of all enrichments (#806)
Schemas: added campaigns schema (#805)
Schemas: added ip_anon schema (#804)
Schemas: added ip_to_geo schema (#803)
EmrEtlRunner: Removed enrichment-specific code (#811)
EmrEtlRunner: Replaced enrichment-specific args to Hadoop Enrich with enrichments JSON (#808)
Scala Common Enrich: bumped to 0.5.0 - TODO
Scala Common Enrich: stored etl_tstamp in new field in CanonicalOutput (#818)
Scala Common Enrich: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#836)
Scala Hadoop Enrich: bumped to Scala Common Enrich 0.5.0 - TODO
Scala Hadoop Enrich: passed etl_tstamp into Scala Common Enrich (#817)
Scala Hadoop Enrich: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#835)
Redshift: bumped table-def to 0.4.0
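
Note on the :enrichments argument in emr_job.rb: the change replaces two separate arguments with one self-describing JSON that is Base64-encoded so it can travel as a single command-line value. The Ruby below is a minimal sketch of that round trip, not the project's code. The helper names `example_enrichments_json` and `encode_enrichments` and the shape of the `:data` array are assumptions made for illustration, since the body of build_enrichments_json is not shown in this diff.

```ruby
require 'base64'
require 'json'

# Hypothetical stand-in for build_enrichments_json(config).
# Only the :schema URI comes from the diff; the :data contents are assumed.
def example_enrichments_json(anon_octets)
  {
    :schema => 'iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0',
    :data   => [
      { :name => 'anon_ip', :parameters => { :anon_octets => anon_octets } }
    ]
  }
end

# Serialize to JSON, then encode with strict Base64, which emits no line
# breaks, so the result can be passed as one argument.
def encode_enrichments(hash)
  Base64.strict_encode64(JSON.generate(hash))
end

encoded = encode_enrichments(example_enrichments_json(1))

# The receiving side would reverse the two steps.
decoded = JSON.parse(Base64.strict_decode64(encoded))
```

Strict encoding is used here rather than `Base64.encode64` because the latter inserts a newline every 60 characters, which would split the value when it is passed as a job step argument.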