Skip to content
This repository has been archived by the owner on May 12, 2022. It is now read-only.

Visit Tracking

Travis Anderson edited this page Mar 12, 2015 · 10 revisions

Concept

Visit tracking is a tool that keeps track of the contributions of each traffic source in our site including direct navigation, referral and search traffic. All the tables are normalized. Instead of one table having all of the information, it will contain foreign keys to other tables with that information. In order to find the name of the source of a given visit, for example, you’d need to join visits to attributions (using attribution_id), then attributions to sources (using source_id):

SELECT source, COUNT(*) AS number_of_visits
FROM 			visits v
JOIN			attributions a 	ON v.attribution_id = a.attribution_id
LEFT OUTER JOIN	sources s 	ON s.source_id = a.source_id
WHERE v.created_at between ‘11/1/2013’ and ‘11/30/2013’
GROUP BY source

Configuration

To ENABLE Visit Tracking, turn it on in your config like so

Landable.configure do |config|
   config.traffic_enabled = true
end

To set up paths that are never tracked by Visit Tracking, config it like so

Landable.configure do |config|
   config.untracked_paths = %w(/status) # Array of Paths you wish to not be tracked
end

Key Tables

Visits:

Who visited (join to visitors) Where they came from (join to attributions) When visit happened (created_at) Associated owner, if logged in Visits go stale if there is no activity within 30 mins (30 mins is hard coded in tracker.rb right now) If user has option to not to be tracked, we save that info too

Owners

Represents an internal key from the parent application (owner == your_apps_customer_id, NOT owner_id)

Owner collection configuration example using the proposed new user_tracker API.
This method was placed in a helper in our app - we called it landable_integration_helper.rb.
Method was then referenced from the appropriate controller to store the customer_id as owner after a login event.

def identify_visitor(my_customer_id)
  if my_customer_id.present? and user_tracker.present?
    user_tracker.identify(my_customer_id)
  end
end

User Agents

This table stores information about the client’s software or where a request was originated from such as: *user agent types (user, ping, crawl, srape, scan) *device *platform *browser *browser version

  • not yet populated but data is stored in user_agents.user_agent

Attributions

Where a visit came from Device information Search terms used Tracking params: ad_type, ad_group, bid_match_type, campaign, content, creative, device_type, click_id, experiment, keyword, match_type, medium, network, placement, position, search_term, source, target

Accesses

The last time a specific visitor accessed a specific url

Domains

The domains that traffic has originated from www.google.com, www.bing.com, www.facebook.com, etc

Page Views

Tracks information about each page viewed in a visit query_strings, mime_type, http_method, etc Also makes note of the page revision that was published at the time of the visit This makes it easy to look at changes in traffic and events across multiple versions of a page

Events

Tracks what event type (if any) for a given visit, and the time of said event

Referrers

This table holds information of who referred a visitor to us, such as: Domain: where they came from (ex. www.facebook.com) Path: where there were redirected to (ex. /about-us.html) Query string: additional query which is not part of the tracking params

Query Examples

-- Track conversions across page versions -- These assume there is an event type called ‘Conversion’ for conversions.

SELECT
  p.path,
  count(case when event_type = 'Conversion' then 1 end) AS conversions_count,
  count(case when event_type <> 'Conversion' OR event_type IS NULL then 1 end) AS non_conversions_count,
  pr.created_at AS page_version_published_on,
  count(distinct v.visit_id) AS number_of_visits
FROM
  traffic.page_views pv
  JOIN landable.page_revisions pr 		ON pv.page_revision_id = pr.page_revision_id
  JOIN traffic.paths p 				ON p.path_id = pv.path_id
  JOIN traffic.visits v 				ON v.visit_id = pv.visit_id
  JOIN traffic.visitors vs 			ON vs.visitor_id = v.visitor_id
  JOIN traffic.user_agents ua 		ON ua.user_agent_id = vs.user_agent_id
  LEFT OUTER JOIN traffic.events e 	ON e.visit_id = v.visit_id
  LEFT OUTER JOIN traffic.event_types et 	ON et.event_type_id = e.event_type_id
GROUP BY p.path, pr.created_at
ORDER BY p.path, pr.created_at;

-- Number of visits that resulted in conversion vs didn't result in conversions

SELECT
  COUNT(case when event_type = 'Conversion' THEN 1 END) AS conversion_count,
  COUNT(CASE WHEN event_type <> 'Conversion' OR event_type IS NULL THEN 1 END) AS non_conversion_count
FROM
  traffic.visits v
  LEFT OUTER JOIN traffic.events e 	ON e.visit_id = v.visit_id
  LEFT OUTER JOIN traffic.event_types et 	ON et.event_type_id = e.event_type_id;

-- Visits and conversion by source

SELECT
  COALESCE(source, 'Direct') AS source,
  COUNT(case when event_type = 'Conversion' THEN 1 END) AS conversion_count,
  COUNT(CASE WHEN event_type <> 'Conversion' OR event_type IS NULL THEN 1 END) AS non_conversion_count,
  COUNT(*) AS visits_count
FROM
  traffic.visits v
  JOIN traffic.attributions a 			ON a.attribution_id = v.attribution_id
  LEFT OUTER JOIN traffic.sources s 	ON a.source_id = s.source_id
  LEFT OUTER JOIN traffic.events e 	ON v.visit_id = e.visit_id
  LEFT OUTER JOIN traffic.event_types et 	ON et.event_type_id = e.event_type_id
GROUP BY source;

-- Track visitors per page over a specific interval (basic conversion funnel)

CREATE TYPE landable_traffic.visits_per_path_type AS (
  count_unique_visits integer,
  path text
);
CREATE FUNCTION landable_traffic.visits_per_path(p_date date, p_interval interval)
RETURNS SETOF visits_per_path_type
LANGUAGE plpgsql
AS $$
DECLARE
r_rec landable_traffic.visits_per_path_type;

BEGIN

FOR r_rec IN
SELECT
count(visitor_id) count_unique_visitors
, path
FROM landable_traffic.visitors_per_path vpp
WHERE visit_date::date <= p_date
  AND visit_date::date > p_date - p_interval
GROUP BY path_id, path
ORDER BY count(visitor_id) desc, path_id

LOOP
  RETURN next r_rec;
END LOOP;
END;
$$;