Stream Enrich

Ben Fradet edited this page Feb 19, 2019 · 8 revisions

Overview

Stream Enrich is an app, written in Scala, which:

  1. Reads raw Snowplow events off a Kinesis stream populated by the Scala Stream Collector
  2. Validates each raw event
  3. Enriches each event (e.g. infers the location of the user from their IP address)
  4. Writes the enriched Snowplow event to another stream

It is designed to be used downstream of the Scala Stream Collector.

It works with Kinesis, Kafka, NSQ and stdin/stdout.

Stream Enrich uses the scala-common-enrich Scala project to enrich events, and SnowplowRawEvent to read the Thrift-serialized objects collected by the Scala Stream Collector.
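The four steps listed in the overview can be sketched as a small pipeline. This is only an illustration in Python, not Stream Enrich's actual Scala implementation: the function names, the dictionary-based event shape, the validity rule and the in-memory `geo_lookup` table are all hypothetical, and invalid events are simply skipped here.

```python
from typing import Dict, Iterable, Iterator


def validate(raw: Dict[str, str]) -> bool:
    """Step 2: hypothetical validity rule (the event must name its type)."""
    return bool(raw.get("event"))


def enrich(raw: Dict[str, str], geo_lookup: Dict[str, str]) -> Dict[str, str]:
    """Step 3: attach a country inferred from the user's IP address."""
    enriched = dict(raw)  # copy so the raw event is left untouched
    ip = raw.get("user_ipaddress", "")
    enriched["geo_country"] = geo_lookup.get(ip, "")
    return enriched


def run(raw_events: Iterable[Dict[str, str]],
        geo_lookup: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Steps 1 and 4: read raw events and yield the enriched ones."""
    for raw in raw_events:
        if not validate(raw):
            continue  # assumption: invalid events are dropped in this sketch
        yield enrich(raw, geo_lookup)
```

Calling `list(run(events, geo_lookup))` on an in-memory list stands in for reading from one stream and writing to another.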

Stream Enrich output

The result of the enrichment process is a TSV representation of the event. Its fields are broken down by group below. For a description of each field, refer to the Canonical Event Model.


The application (site, game, app, etc.) this event belongs to, and the tracker platform


  • app_id: String
  • platform: String

Date/time


  • etl_tstamp: String
  • collector_tstamp: String
  • dvce_created_tstamp: String

Transaction (i.e. this logging event)


  • event: String
  • event_id: String
  • txn_id: String

Versioning


  • name_tracker: String
  • v_tracker: String
  • v_collector: String
  • v_etl: String

User and visit


  • user_id: String
  • user_ipaddress: String
  • user_fingerprint: String
  • domain_userid: String
  • domain_sessionidx: Integer
  • network_userid: String

Location


  • geo_country: String
  • geo_region: String
  • geo_city: String
  • geo_zipcode: String
  • geo_latitude: Float
  • geo_longitude: Float
  • geo_region_name: String

Other IP lookups


  • ip_isp: String
  • ip_organization: String
  • ip_domain: String
  • ip_netspeed: String

Page


  • page_url: String
  • page_title: String
  • page_referrer: String

Page URL components


  • page_urlscheme: String
  • page_urlhost: String
  • page_urlport: Integer
  • page_urlpath: String
  • page_urlquery: String
  • page_urlfragment: String

Referrer URL components


  • refr_urlscheme: String
  • refr_urlhost: String
  • refr_urlport: Integer
  • refr_urlpath: String
  • refr_urlquery: String
  • refr_urlfragment: String

Referrer details


  • refr_medium: String
  • refr_source: String
  • refr_term: String

Marketing


  • mkt_medium: String
  • mkt_source: String
  • mkt_term: String
  • mkt_content: String
  • mkt_campaign: String

Custom Contexts


  • contexts: String

Structured Event


  • se_category: String
  • se_action: String
  • se_label: String
  • se_property: String
  • se_value: String

Unstructured Event


  • unstruct_event: String

Ecommerce transaction (from querystring)


  • tr_orderid: String
  • tr_affiliation: String
  • tr_total: String
  • tr_tax: String
  • tr_shipping: String
  • tr_city: String
  • tr_state: String
  • tr_country: String

Ecommerce transaction item (from querystring)


  • ti_orderid: String
  • ti_sku: String
  • ti_name: String
  • ti_category: String
  • ti_price: String
  • ti_quantity: String

Page Pings


  • pp_xoffset_min: Integer
  • pp_xoffset_max: Integer
  • pp_yoffset_min: Integer
  • pp_yoffset_max: Integer

User Agent


  • useragent: String

Browser (from user-agent)


  • br_name: String
  • br_family: String
  • br_version: String
  • br_type: String
  • br_renderengine: String

Browser (from querystring)


  • br_lang: String
  • br_features_pdf: Byte
  • br_features_flash: Byte
  • br_features_java: Byte
  • br_features_director: Byte
  • br_features_quicktime: Byte
  • br_features_realplayer: Byte
  • br_features_windowsmedia: Byte
  • br_features_gears: Byte
  • br_features_silverlight: Byte
  • br_cookies: Byte
  • br_colordepth: String
  • br_viewwidth: Integer
  • br_viewheight: Integer
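Because every TSV column arrives as text, a consumer usually has to convert values to the types declared in these lists. The sketch below does this for the "Browser (from querystring)" group only. The helper names are hypothetical, and treating an empty column as a missing value and reading `Byte` fields as small integers are assumptions rather than documented behaviour.

```python
from typing import Dict, Union

Typed = Union[str, int, float, None]

_FLAGS = ["pdf", "flash", "java", "director", "quicktime",
          "realplayer", "windowsmedia", "gears", "silverlight"]

# Declared types copied from the "Browser (from querystring)" list.
BROWSER_QS_TYPES: Dict[str, str] = {
    "br_lang": "String",
    **{f"br_features_{name}": "Byte" for name in _FLAGS},
    "br_cookies": "Byte",
    "br_colordepth": "String",
    "br_viewwidth": "Integer",
    "br_viewheight": "Integer",
}


def convert(value: str, declared: str) -> Typed:
    """Convert one TSV column to the Python type matching its declared type."""
    if value == "":
        return None  # assumption: an empty column means "no value"
    if declared in ("Integer", "Byte"):
        return int(value)
    if declared == "Float":
        return float(value)
    return value  # String fields are returned unchanged


def convert_group(raw: Dict[str, str]) -> Dict[str, Typed]:
    """Convert every field of the group; absent fields become None."""
    return {name: convert(raw.get(name, ""), declared)
            for name, declared in BROWSER_QS_TYPES.items()}
```

The same `convert` function can serve the other groups once their declared types are added to a similar mapping.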

OS (from user-agent)


  • os_name: String
  • os_family: String
  • os_manufacturer: String
  • os_timezone: String

Device/Hardware (from user-agent)


  • dvce_type: String
  • dvce_ismobile: Byte

Device (from querystring)


  • dvce_screenwidth: Integer
  • dvce_screenheight: Integer

Document


  • doc_charset: String
  • doc_width: Integer
  • doc_height: Integer

Currency


  • tr_currency: String
  • tr_total_base: String
  • tr_tax_base: String
  • tr_shipping_base: String
  • ti_currency: String
  • ti_price_base: String
  • base_currency: String

Geolocation


  • geo_timezone: String

Click ID


  • mkt_clickid: String
  • mkt_network: String

ETL tags


  • etl_tags: String

Time event was sent


  • dvce_sent_tstamp: String

Referrer


  • refr_domain_userid: String
  • refr_dvce_tstamp: String

Derived contexts


  • derived_contexts: String

Session ID


  • domain_sessionid: String

Derived timestamp


  • derived_tstamp: String

Derived event vendor/name/format/version


  • event_vendor: String
  • event_name: String
  • event_format: String
  • event_version: String

Event fingerprint


  • event_fingerprint: String

True timestamp


  • true_tstamp: String
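To make the field breakdown above easier to use, here is a minimal Python sketch that splits one enriched TSV line into a dictionary keyed by field name. It assumes that the column order of the TSV matches the order in which the fields are listed on this page and that every column is always present (empty when unset). `FIELDS` and `parse_enriched` are hypothetical names introduced for illustration.

```python
from typing import Dict

# Field names in the order they are listed on this page
# (assumed here to be the TSV column order as well).
FIELDS = """
app_id platform etl_tstamp collector_tstamp dvce_created_tstamp
event event_id txn_id name_tracker v_tracker v_collector v_etl
user_id user_ipaddress user_fingerprint domain_userid domain_sessionidx
network_userid geo_country geo_region geo_city geo_zipcode geo_latitude
geo_longitude geo_region_name ip_isp ip_organization ip_domain ip_netspeed
page_url page_title page_referrer page_urlscheme page_urlhost page_urlport
page_urlpath page_urlquery page_urlfragment refr_urlscheme refr_urlhost
refr_urlport refr_urlpath refr_urlquery refr_urlfragment refr_medium
refr_source refr_term mkt_medium mkt_source mkt_term mkt_content
mkt_campaign contexts se_category se_action se_label se_property se_value
unstruct_event tr_orderid tr_affiliation tr_total tr_tax tr_shipping
tr_city tr_state tr_country ti_orderid ti_sku ti_name ti_category ti_price
ti_quantity pp_xoffset_min pp_xoffset_max pp_yoffset_min pp_yoffset_max
useragent br_name br_family br_version br_type br_renderengine br_lang
br_features_pdf br_features_flash br_features_java br_features_director
br_features_quicktime br_features_realplayer br_features_windowsmedia
br_features_gears br_features_silverlight br_cookies br_colordepth
br_viewwidth br_viewheight os_name os_family os_manufacturer os_timezone
dvce_type dvce_ismobile dvce_screenwidth dvce_screenheight doc_charset
doc_width doc_height tr_currency tr_total_base tr_tax_base
tr_shipping_base ti_currency ti_price_base base_currency geo_timezone
mkt_clickid mkt_network etl_tags dvce_sent_tstamp refr_domain_userid
refr_dvce_tstamp derived_contexts domain_sessionid derived_tstamp
event_vendor event_name event_format event_version event_fingerprint
true_tstamp
""".split()


def parse_enriched(line: str) -> Dict[str, str]:
    """Split one enriched TSV line into a {field name: raw string} mapping."""
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))
```

All values are returned as raw strings; converting Integer, Float and Byte fields to native types is left to the caller.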
