This repository has been archived by the owner on May 18, 2021. It is now read-only.
ilyakava edited this page Nov 27, 2014 · 56 revisions

Consider this page essential Snowplow reading.

Snowplow at Artsy

A consolidated event stream across several apps.

If you are looking for information on how to consume the event stream, please see the resistance wiki.

Note the 'Artsy Snowplow' sidebar on the right below the 'Pages' sidebar for further reading. If you need the official snowplow wiki, it is here.

A note on repo organization

The official snowplow source repo contains many (>20) deployable apps, which breaks Heroku's 1-repo-1-app paradigm and can complicate source management.

Our solution is to give each deployable app its own branch, with its own make deploy task in the root of the repo.
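
The branch-per-app idea can be illustrated with a small Python sketch. It is not the actual make deploy task: the branch names and Heroku git remote names below are hypothetical placeholders, and the only assumption about Heroku is the usual git convention of pushing a local branch to the remote's master to deploy it.

```python
# Hypothetical mapping from deployable app to (branch, Heroku git remote).
# None of these branch or remote names come from the repository itself.
APP_TARGETS = {
    "scala-stream-collector": ("collector-branch", "heroku-collector"),
    "scala-kinesis-enrich": ("enricher-branch", "heroku-enricher"),
    "kinesis-redshift-sink": ("storage-branch", "heroku-storage"),
}


def deploy_command(app):
    """Build the git command that would deploy one app from its own branch."""
    branch, remote = APP_TARGETS[app]
    # "git push <remote> <branch>:master" deploys a non-master branch
    # by pushing it to the remote's master branch.
    return ["git", "push", remote, f"{branch}:master"]


if __name__ == "__main__":
    for app in APP_TARGETS:
        print(" ".join(deploy_command(app)))
```

Because each app lives on its own branch and deploys to its own remote, deploying one app never touches the code of another.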

Please see the Development and Deployment sections for more details, but keep in mind:

The branches for different apps are completely separate, and will not be merged together in this repository. The commit history of each app is unrelated to the others, and is occasionally based on a completely different point in time. The practices set up in this wiki are meant to work around the difficulties introduced by disparate apps being grouped in the same repository.

We have chosen not to split the single snowplow repo, which holds a collection of apps, into multiple repos with one app each. This keeps us in step with the snowplow community and lets us push and pull commits if necessary.

Suggestions for improvements are welcome!

The Apps we use

Confused about naming? See the page about Naming.


Note: The changes to the apps here cannot be found in the snowplow/snowplow master branch. Some of these changes are frozen in PRs to snowplow master (as of the time of this wiki's writing). Others are too custom to PR back to the snowplow org, since they make assumptions about deployment to Heroku or about our particular schema for events.


Snowplow is built around four main concepts (there is a fifth, analysis, which is mostly resistance). Here they are, with the particular apps we use for each:

  1. Tracker (embedded JS/Ruby code)
     • right now we primarily use the standard javascript-tracker in our node apps
  2. Collector (Scala web process on Heroku)
     • specifically scala-stream-collector
  3. Enricher (Scala worker on Heroku)
     • specifically scala-kinesis-enrich
  4. Storage (Java worker on Heroku)
     • specifically kinesis-redshift-sink
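
As a rough mental model of how the four concepts above hand events to one another, here is a toy Python sketch. It is not how the real apps are implemented: plain lists stand in for the stream and the storage table, and every function name and event field is a hypothetical chosen for illustration.

```python
import json


def track(event_name, properties):
    """Tracker: build a raw event in the client app."""
    return {"event": event_name, "properties": properties}


def collect(raw_event, stream):
    """Collector: accept a raw event and append it, serialized, to a stream."""
    stream.append(json.dumps(raw_event))


def enrich(record):
    """Enricher: parse a stream record and attach extra fields."""
    event = json.loads(record)
    event["enriched"] = True  # placeholder for real enrichment
    return event


def store(event, table):
    """Storage: write the enriched event to its final destination."""
    table.append(event)


def run_pipeline(event_name, properties):
    """Pass one event through all four stages and return the stored rows."""
    stream, table = [], []
    collect(track(event_name, properties), stream)
    for record in stream:
        store(enrich(record), table)
    return table


if __name__ == "__main__":
    print(run_pipeline("page_view", {"path": "/"}))
```

Each stage only depends on the output of the previous one, which is why the stages can run as separate apps.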

Wondering what these apps do? Read "About the flow".
