Snowplow Ruby tracker examples

An example of how to incorporate Snowplow trackers (SDKs) into a Rails project.

Versions used:

Ruby v3.0.0
Rails v6.1.4
Ruby tracker v0.8.0
JavaScript tracker v3.1.0

1. Quick Start

Requirements: Ruby 3.0, Yarn, Bundler, Docker, and Cypress (for the tests).

Install project dependencies:

yarn install
bundle install

Run the app:

# To run the app on port 3000
rails server

The Ruby and JavaScript trackers are configured to use Snowplow Micro as the event collector.

Start Micro:

# run from the root folder of this app
docker run \
  --mount type=bind,source=$(pwd)/snowplow-micro,destination=/config \
  -p 9090:9090 \
  snowplow/snowplow-micro:1.2.1 \
  --collector-config /config/micro.conf \
  --iglu /config/iglu.json

Interact with the site to generate events.

Before running the Cypress tests, create a .env file in the root folder and copy in the contents of .env.example. Then run the tests:

# Rails tests
rspec

# Cypress tests with UI
rails cypress:open

# Cypress tests headless
rails cypress:run

2. Tracking Design and Implementation

2.1 This demo app

This Snowplow shop sells skiing equipment. We want to understand how much traffic the website gets, and how users move through the site. In the shop, we want to track when a product is added to the shopping basket, and when products are purchased.

Both the Ruby and JavaScript Snowplow tracker SDKs are included in this app, for server-side and client-side tracking. This allows each event to be tracked in the most appropriate way. For example, Page Views are often best tracked client-side, as the client has easy access to information such as the IP address. However, server-side Page View tracking can be more accurate, as no events will be lost to adblockers. CRUD actions and activities such as purchasing, which are processed by the server, should be tracked server-side. Read more about designing tracking in the Snowplow docs.

This demo does not include any authentication or database functionality.

2.2 Ruby tracker

The Ruby tracker is imported as a gem.

# in Gemfile

gem "snowplow-tracker", "~> 0.8.0"

The tracker is written as a Singleton global object, to avoid creating new Trackers and Emitters on every page load. The tracker set-up code is found in app/lib/snowplow.rb (only files within app auto-reload, so for ease of development the app/lib folder is used here instead of lib).

# adapted from snowplow.rb

require "snowplow-tracker"
require "singleton"

class Snowplow
  include Singleton

  def tracker
    return @tracker unless @tracker.nil?

    @tracker = SnowplowTracker::Tracker.new(emitters: emitter)
  end

  private

  def emitter
    return @emitter unless @emitter.nil?

    @emitter = SnowplowTracker::AsyncEmitter.new(endpoint: "localhost:9090")
  end
end
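The memoisation pattern above can be tried out without the gem installed. The following is a minimal sketch, assuming a hypothetical FakeTracker class that stands in for SnowplowTracker::Tracker and a hypothetical TrackerHolder class that plays the role of Snowplow; neither name exists in this app.

```ruby
require "singleton"

# Hypothetical stand-in for SnowplowTracker::Tracker.
# It only counts how many times it has been constructed.
class FakeTracker
  @created = 0

  class << self
    attr_accessor :created
  end

  def initialize
    self.class.created += 1
  end
end

# Hypothetical holder class, playing the role of the Snowplow class above.
class TrackerHolder
  include Singleton

  # Build the tracker on first use, then reuse the same object.
  def tracker
    @tracker ||= FakeTracker.new
  end
end

first = TrackerHolder.instance.tracker
second = TrackerHolder.instance.tracker

puts first.equal?(second) # both calls return the same object
puts FakeTracker.created  # number of trackers constructed so far
```

Because Singleton makes `new` private, every caller goes through `instance`, so repeated requests share one tracker (and, in the real class, one emitter).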

Here, Page View tracking is defined, using the Singleton tracker:

# in application_controller.rb

def track_page_view
  Snowplow.instance.tracker.track_page_view(page_url: request.original_url,
                                            referrer: request.headers["Referer"])
end

2.3 JavaScript tracker

The tag-based JavaScript tracker comes in two parts: the sp.js file and the script tag that loads it. Here, sp.js is placed in public/snowplow so that it is hosted as part of the app, and the script tag is included in the head of the shared application.html.erb layout.

<!-- in the head of application.html.erb -->

<script async="1">
  (function (p, l, o, w, i, n, g) {
    if (!p[i]) {
      p.GlobalSnowplowNamespace = p.GlobalSnowplowNamespace || [];
      p.GlobalSnowplowNamespace.push(i);
      p[i] = function () {
        (p[i].q = p[i].q || []).push(arguments);
      };
      p[i].q = p[i].q || [];
      n = l.createElement(o);
      g = l.getElementsByTagName(o)[0];
      n.async = 1;
      n.src = w;
      g.parentNode.insertBefore(n, g);
    }
  })(
    window,
    document,
    "script",
    "<%= root_url + 'snowplow/sp.js' %>",
    "snowplow"
  );

  snowplow("newTracker", "sp", "0.0.0.0:9090");

  snowplow("enableActivityTracking", {
    minimumVisitLength: 10,
    heartbeatDelay: 10,
  });
  snowplow("trackPageView");
</script>

This code initialises a JavaScript tracker, as well as setting up Page View and Page Ping (activity) events. Read more about the JavaScript tracker SDK events here. Further examples of custom event tracking with the JS tracker can be found in the Snowplow Micro examples demo Django app.

3. Event types and context

The Ruby tracker has several specific event types available out-of-the-box. Of these, Page Views and eCommerce events are demonstrated here.

However, for all tracker SDKs we strongly recommend using custom Self-Describing events. These are defined by "self-describing" JSON schema rulesets. As the schemas are fully customisable, it's possible to track any number of metrics that are important to you. Read more about event data structures on the Snowplow blog and in the documentation.

Each Snowplow event has the option of adding contextual information, by the attachment of entities. The attached entities are called the context of the event. Like the events themselves, entities are defined by self-describing JSON schemas. For example, this app includes schemas that define a Purchase event, and a Product entity. You can see that the schemas look very similar.

// example schema for a Purchase event

{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a purchase event",
  "self": {
    "vendor": "test.example.iglu",
    "name": "purchase_event",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "order_id": {
      "description": "Order ID",
      "type": "string",
      "maxLength": 64
    },
    "total_value": {
      "description": "Sum of product prices in the order",
      "type": "number",
      "minimum": 0,
      "maximum": 100000
    },
    "price_reduced": {
      "description": "Does this order include items whose price has been reduced?",
      "type": ["boolean", "null"]
    }
  },
  "required": ["order_id", "total_value"],
  "additionalProperties": false
}

This event would have a product entity attached for each product in the order.

// example schema for a product entity

{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a product entity",
  "self": {
    "vendor": "test.example.iglu",
    "name": "product_entity",
    "format": "jsonschema",
    "version": "1-0-1"
  },
  "type": "object",
  "properties": {
    "sku": {
      "description": "Product SKU",
      "type": "string",
      "maxLength": 64
    },
    "name": {
      "description": "Product name",
      "type": ["string"],
      "maxLength": 255
    },
    "price": {
      "description": "Price of product at point of sale",
      "type": "number",
      "minimum": 1,
      "maximum": 10000
    },
    "on_sale": {
      "description": "Has the price been reduced since it was first offered?",
      "type": ["boolean", "null"]
    },
    "orig_price": {
      "description": "Original selling price",
      "type": ["number", "null"],
      "minimum": 1,
      "maximum": 10000
    },
    "quantity": {
      "description": "Quantity of this product",
      "type": "integer",
      "minimum": 1,
      "maximum": 10000
    }
  },
  "required": ["sku", "name", "price", "quantity", "on_sale", "orig_price"],
  "additionalProperties": false
}

This product entity schema is version 1-0-1, as a non-breaking change has been made since the first version. Read more about schema versioning here and here.

The self-describing JSON schemas are validated by part of the Snowplow data collection pipeline called Iglu. Read about how to lint the schemas with IgluCTL here.
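Validation itself happens in the pipeline. To make the purchase_event constraints concrete, here is a rough sketch of the same rules written as plain Ruby checks. The purchase_errors helper and its constants are hypothetical, are not part of this app or of the Ruby tracker, and are not a general JSON Schema validator.

```ruby
# Hand-written checks that mirror the purchase_event schema shown earlier.
# Illustration only: purchase_errors is a hypothetical helper.
PURCHASE_REQUIRED = %w[order_id total_value].freeze
PURCHASE_ALLOWED = %w[order_id total_value price_reduced].freeze

def purchase_errors(payload)
  data = payload.transform_keys(&:to_s)
  errors = []

  # "required": ["order_id", "total_value"]
  missing = PURCHASE_REQUIRED - data.keys
  errors << "missing: #{missing.join(', ')}" unless missing.empty?

  # "additionalProperties": false
  extra = data.keys - PURCHASE_ALLOWED
  errors << "unexpected: #{extra.join(', ')}" unless extra.empty?

  # order_id: string, maxLength 64
  order_id = data["order_id"]
  if data.key?("order_id") && !(order_id.is_a?(String) && order_id.length <= 64)
    errors << "order_id must be a string of at most 64 characters"
  end

  # total_value: number between 0 and 100000
  total = data["total_value"]
  if data.key?("total_value") && !(total.is_a?(Numeric) && total.between?(0, 100_000))
    errors << "total_value must be a number between 0 and 100000"
  end

  # price_reduced: boolean or null (an absent key reads as nil here)
  unless [true, false, nil].include?(data["price_reduced"])
    errors << "price_reduced must be true, false or null"
  end

  errors
end

puts purchase_errors({ order_id: "ABC-123", total_value: 40.99 }).inspect
puts purchase_errors({ order_id: "ABC-123", colour: "red" }).inspect
```

An event that fails the real schema is not rejected by the tracker; it ends up among the bad events in the pipeline, which is why the tests later in this README check that endpoint.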

Below is Ruby code that creates a purchase event based on these schemas. Two products have been bought:

# Based on code in app/shop_controller.rb

event_schema = "iglu:test.example.iglu/purchase_event/jsonschema/1-0-0"
entity_schema = "iglu:test.example.iglu/product_entity/jsonschema/1-0-1"

transaction = { "order_id": "ABC-123", "total_value": 40.99 }
ordered_products = [product1_hash, product2_hash]

purchase_json = SnowplowTracker::SelfDescribingJson.new(
  event_schema, transaction
)
context = ordered_products.map do |product|
  SnowplowTracker::SelfDescribingJson.new(entity_schema, product)
end
Snowplow.instance.tracker.track_self_describing_event(event_json: purchase_json,
                                                      context: context)
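The snippet above leaves product1_hash and product2_hash undefined. As a sketch of what they might contain, the hashes below follow the fields of the product_entity schema. All the values (SKUs, prices, the sale flag) are made up for illustration, and iglu_uri is a hypothetical helper that only shows how the schema URI strings are put together.

```ruby
# Made-up product data; the keys follow the product_entity schema.
product1_hash = { sku: "GS-S", name: "Green skis (size S)", price: 449.99,
                  quantity: 2, on_sale: false, orig_price: nil }
product2_hash = { sku: "WP-1", name: "Ski poles (white)", price: 59.80,
                  quantity: 1, on_sale: true, orig_price: 79.80 }
ordered_products = [product1_hash, product2_hash]

# Hypothetical helper: an Iglu schema URI has the shape
# iglu:<vendor>/<name>/<format>/<version>
def iglu_uri(vendor:, name:, version:, schema_format: "jsonschema")
  "iglu:#{vendor}/#{name}/#{schema_format}/#{version}"
end

event_schema = iglu_uri(vendor: "test.example.iglu", name: "purchase_event",
                        version: "1-0-0")
entity_schema = iglu_uri(vendor: "test.example.iglu", name: "product_entity",
                         version: "1-0-1")

# Derive the order total from the products instead of hard-coding it.
total_value = ordered_products.sum { |product| product[:price] * product[:quantity] }
                              .round(2)
transaction = { order_id: "ABC-123", total_value: total_value }

puts event_schema
puts entity_schema
puts transaction.inspect
```

Computing total_value from the products keeps the event payload and its attached entities consistent with each other.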

Every event sent by the JavaScript tracker (v3+) automatically includes a web page entity, whose sole parameter is an ID unique to that page load. This context helps data modelling by allowing the easy identification of events that occurred during the same page view. Of course, personalised custom entities can be attached to any event type in addition to the web page entity, to create richer context data.

Events from the Ruby tracker do not have any automatically included context.

4. Matching the domain_userid for both trackers

The JavaScript tracker sets and uses cookies. One of the stored identifiers is the domain_userid, a unique identifier for each user. Every event sent from the JavaScript tracker includes this information.

The Ruby tracker, by default, does not attach domain_userid information to its events. Providing the Ruby tracker with the same domain_userid set by the JavaScript tracker can be extremely helpful for modelling the data. It makes it easy to understand which client-side and server-side events were generated by the same user.

The domain_userid can be extracted from the cookie as follows. Note that only Rails Controllers can access cookies.

# in ApplicationController

def snowplow_domain_userid
  sp_cookie = cookies.find { |key, _value| key =~ /^_sp_id/ }
  sp_cookie.last.split(".").first if sp_cookie.present?
end
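The same extraction can be exercised outside a controller. This is a minimal sketch, assuming the cookies are available as a plain Hash; domain_userid_from is a hypothetical name and the cookie values are made up.

```ruby
# Returns the domain_userid from a cookie hash,
# or nil when no _sp_id cookie is present.
def domain_userid_from(cookies)
  _name, value = cookies.find { |name, _value| name.to_s.start_with?("_sp_id") }
  return nil if value.nil? || value.empty?

  # The domain_userid is the first dot-separated field of the cookie value.
  value.split(".").first
end

# Made-up cookies for illustration.
cookies = {
  "_sp_ses.1fff" => "*",
  "_sp_id.1fff" => "4b4c9d6a-7c2e-4f0b-9a55-1c2d3e4f5a6b.1626086400.1.1626086460.1626086400"
}

puts domain_userid_from(cookies)
puts domain_userid_from({}).inspect
```

Returning nil when the cookie is missing matters for the first request of a new visitor, before the JavaScript tracker has set any cookies.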

A third-party gem, snowplow_ruby_duid, is available that provides this functionality plus further configuration options.

The Ruby tracker domain_userid is set using the tracker method set_domain_user_id. Read more about this here.

# in snowplow.rb

@tracker.set_domain_user_id(domain_userid) unless domain_userid.nil?

In this app, we have linked the Ruby Page View tracking to setting the domain_userid. Since the cookies are set by the JavaScript tracker, the very first Ruby Page View event may lack the domain_userid if the JavaScript tracker has not yet finished initialising and creating the cookie.

5. Testing using Snowplow Micro

To confirm that the trackers have been configured correctly, Snowplow provides a minimal data collection pipeline called Snowplow Micro. Micro collects emitted events, and provides an API to analyse them. The Micro config files are included in the snowplow-micro folder. Snowplow pipelines use Iglu repositories for schema validation. The file iglu.json informs Iglu where to find the standard schemas. The custom schemas are placed in the iglu-client-embedded folder, to be automatically accessed by Micro's own Iglu client (a feature added in Snowplow Micro v1.2). Read more about Iglu repositories here.

Start Micro using Docker. The standard port is 9090, as configured in the micro.conf configuration file.

# run from the root folder of this app
docker run \
  --mount type=bind,source=$(pwd)/snowplow-micro,destination=/config \
  -p 9090:9090 \
  snowplow/snowplow-micro:1.2.1 \
  --collector-config /config/micro.conf \
  --iglu /config/iglu.json

Snowplow Micro provides four API endpoints: micro/all, micro/good, micro/bad, and micro/reset. Visit them in your browser at e.g. http://localhost:9090/micro/good.
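The endpoints can also be queried from a script. The following is a rough sketch using only the Ruby standard library. It assumes Micro is running on localhost:9090 as above, and that each good event is a JSON object with an "eventType" key. micro_get and count_by_event_type are hypothetical helper names, and the sample data is made up.

```ruby
require "json"
require "net/http"
require "uri"

# Assumes Micro is running locally on its standard port.
MICRO_BASE = "http://localhost:9090"

# Fetches one Micro endpoint ("all", "good", "bad" or "reset")
# and parses the JSON response body.
def micro_get(endpoint)
  response = Net::HTTP.get_response(URI("#{MICRO_BASE}/micro/#{endpoint}"))
  raise "Micro returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body)
end

# Counts good events per event type.
# Assumes each event has an "eventType" key.
def count_by_event_type(good_events)
  good_events.map { |event| event["eventType"] }.tally
end

# Made-up sample, shaped like a trimmed-down micro/good response.
sample = JSON.parse(
  '[{"eventType":"page_view"},{"eventType":"unstruct"},{"eventType":"page_view"}]'
)
puts count_by_event_type(sample).inspect

# With Micro running, the live data could be summarised like this:
# puts count_by_event_type(micro_get("good")).inspect
```

Calling micro_get("reset") between runs would clear the stored events, which keeps counts from one test leaking into the next.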

In this app, we use the e2e testing library Cypress to test event collection. We defined a set of custom Cypress Commands in spec/cypress/support/commands.js that relate to Snowplow events. For example, here is a test for a self-describing (custom) purchase event:

// in spec/cypress/integration/event_self_describing_spec.js
// with added comments

it("is emitted by Ruby tracker for purchase activity", () => {
  cy.visit("/shop/all_products");

  // Adding products to the shopping basket
  cy.get(".green_skis > #basket-add-form").click();
  cy.get(".green_skis > #basket-add-form").click();
  cy.get(".white_poles > #basket-add-form").click();

  // Wait to make sure the basket additions have finished
  cy.wait(1000);

  // "Buy" the items
  cy.get("#purchase-submit").click();

  // Allow time for the events to be collected by Micro
  cy.wait(2000);

  // The badEvents() custom command queries the "/micro/bad" API endpoint
  // and returns all the bad events.
  // The count() custom command compares the given argument
  // with the length of the given array
  cy.badEvents().count(0);

  // The goodEvents() custom command queries the "/micro/good" API endpoint
  // and returns all the good events.
  // The other custom commands check for events which match the arguments given
  cy.goodEvents()
    // Self-describing events are also called "unstruct" events
    // for legacy reasons
    .hasEventType("unstruct", "rb")
    .eventSchema("iglu:test.example.iglu/purchase_event/jsonschema/1-0-0")
    .selfDescribingEventData({ order_id: "ABC-123", total_value: 959.78 });

  cy.goodEvents()
    .hasEventType("unstruct", "rb")
    .contextSchema("iglu:test.example.iglu/product_entity/jsonschema/1-0-1")
    .selfDescribingContextData({
      name: "Green skis (size S)",
      quantity: 2,
      price: 449.99,
    })
    .selfDescribingContextData({ name: "Ski poles (white)" });
});

We recommend designing your own tests, based on your own app, tracking, and needs. These tests are provided as one example of event testing using Cypress and Snowplow Micro. See the Snowplow Micro examples repository for a more comprehensive example of testing. Other e2e/integration testing libraries can also be used.

6. Further information

Detailed information about behavioural data management, and Snowplow tracking and data collection pipelines can be found on the Snowplow website, blog, knowledge base and in the documentation.

The skiing equipment image used in this app is from Pexels, and is by Pixabay.
