In [1]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Context:

This data, pulled from 11/1/2017 to 11/7/2017, covers various
activity by individuals who visited the Gordon Ramsay course
marketing page within that same period.

# Prompt:

Based on given web data, pull insights from user data.

The following tables tell the story of where users went after they landed on
and viewed one of our pages. pages shows which pages they viewed, while
homepage_click and course_marketing_click record clicks on those marketing
pages. Once users begin the checkout process with purchase_click, they
finalize it with purchased_class.

# Data Dictionary

* anonymous_id: an identifier given to a unique device session
* received_at: when the event or page view occurred
* location: place on the page where the event occurred
* action: descriptor given to the event
* channel_grouping: marketing bucket given to the source of traffic
    * paid: acquired via paid traffic
    * organic-social-pr: free traffic via referrals, social networks, PR stories, etc.
    * null: equivalent to organic
* traffic_source: origin of how the user came to the website
* ad_type: type of ad (e.g. video)
* acquisition_type: type of user the marketing ad was intended for
    * prospecting: advertising to users who hadn’t visited the website in at least 14 days
    * remarketing: advertising to users who had visited the website within the last 14 days
    * lifecycle: advertising to users who have made a purchase and/or enrolled
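
Before any analysis, the fields described above usually need light cleaning. The following is a minimal sketch on made-up rows shaped like pages.csv, not the cleaning actually used in this notebook. The frame name `pages_demo`, the column `channel_clean`, and the label "organic" for null channels are assumptions chosen for illustration.

```python
import pandas as pd

# Toy rows shaped like pages.csv; the ids and values are made up.
pages_demo = pd.DataFrame({
    "anonymous_id": ["a", "b", "c"],
    "received_at": [
        "11/01/2017 00:01:13",
        "11/01/2017 00:01:39",
        "11/01/2017 00:02:07",
    ],
    "channel_grouping": ["organic-social-pr", "paid", None],
})

# Parse the timestamp strings so events can be sorted and compared.
pages_demo["received_at"] = pd.to_datetime(
    pages_demo["received_at"], format="%m/%d/%Y %H:%M:%S"
)

# Per the data dictionary, a null channel_grouping is equivalent to organic.
# The literal label "organic" is an assumption made for this sketch.
pages_demo["channel_clean"] = pages_demo["channel_grouping"].fillna("organic")

print(pages_demo[["anonymous_id", "received_at", "channel_clean"]])
```

The click and purchase files write their timestamps slightly differently (for example without seconds), so the `format` string would likely need adjusting per file.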

In [2]:
# any click on the course marketing page except purchase clicks
marketing = pd.read_csv('/kaggle/input/masterclass-gordon-ramsey-class/course_marketing_click.csv')
marketing.head(3)

Unnamed: 0,anonymous_id,received_at,class,location,action,video,video_carousel_number
0,b8d1d717-f4b1-4d39-9383-f63b32b74fce,11/1/2017 0:04:32,gordon-ramsay-teaches-cooking,hero,play-trailer,trailer,
1,074b9167-b7f3-4f0d-8e13-c93dc9d2ba6a,11/1/2017 0:05:19,aaron-sorkin-teaches-screenwriting,hero,play-trailer,trailer,
2,074b9167-b7f3-4f0d-8e13-c93dc9d2ba6a,11/1/2017 0:09:35,gordon-ramsay-teaches-cooking,hero,play-trailer,trailer,


In [3]:
# any click on homepage
homepage = pd.read_csv('/kaggle/input/masterclass-gordon-ramsey-class/homepage_click.csv')
homepage.head(3)

Unnamed: 0,anonymous_id,received_at,action,class,location
0,e921f531-128f-4e71-922d-28f71d65dc93,11/1/2017 0:15:58,gordon-ramsay-teaches-cooking,,tile
1,e921f531-128f-4e71-922d-28f71d65dc93,11/1/2017 0:14:24,steve-martin-teaches-comedy,,hero
2,e921f531-128f-4e71-922d-28f71d65dc93,11/1/2017 0:14:23,samuel-l-jackson-teaches-acting,,hero


In [4]:
# major page views including home and marketing page
pages = pd.read_csv('/kaggle/input/masterclass-gordon-ramsey-class/pages.csv')
pages.head(3)

Unnamed: 0,anonymous_id,received_at,name,class,channel_grouping,traffic_source,ad_type,acquisition_type,user_agent
0,faff1903-357c-44e8-b98e-2d36d8be5832,11/01/2017 00:01:13,Course Marketing,gordon-ramsay-teaches-cooking,organic-social-pr,website,gr_mainpage,prospecting,Mozilla/5.0 (Windows NT 6.1; Win64; x64) Apple...
1,cb41781f-feb6-47ed-abe1-867716a0bc34,11/01/2017 00:01:39,Course Marketing,gordon-ramsay-teaches-cooking,paid,facebook,video,remarketing,Mozilla/5.0 (iPhone; CPU iPhone OS 11_0_1 like...
2,f48cb91d-4e6c-42ad-b32b-6e532c1b49b0,11/01/2017 00:02:07,Course Marketing,gordon-ramsay-teaches-cooking,,,,,Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6...


In [5]:
# any 'take the course/give as a gift' purchase from course marketing page
purchase_click = pd.read_csv('/kaggle/input/masterclass-gordon-ramsey-class/purchase_click.csv')
purchase_click.head(3)

Unnamed: 0,anonymous_id,received_at,class,location,action
0,9be8d642-3000-45db-970f-aedbc9d9ee3c,11/1/2017 0:24:58,gordon-ramsay-teaches-cooking,hero,primary
1,21862340-a8fb-4e6f-bca7-85f5cf1d2f68,11/1/2017 0:36:47,gordon-ramsay-teaches-cooking,video-carousel,primary
2,13d9d32f-a11b-489e-9dda-740442d60961,11/1/2017 0:37:53,gordon-ramsay-teaches-cooking,hero,primary


In [6]:
# when user purchases class/annual-pass (1 row per item purchase)
purchased_class = pd.read_csv('/kaggle/input/masterclass-gordon-ramsey-class/purchased_class.csv')
purchased_class.head(3)

Unnamed: 0,anonymous_id,received_at,product_id,total,revenue,discount,is_gift
0,13d9d32f-a11b-489e-9dda-740442d60961,11/1/2017 0:39,gordon-ramsay-teaches-cooking,90,90,0,f
1,47c79436-b6e8-4009-a5e4-b82a0a32e93b,11/1/2017 1:07,gordon-ramsay-teaches-cooking,90,90,0,f
2,83259ee8-4de6-4748-94a3-1f6646c9fd69,11/1/2017 1:45,shonda-rhimes-teaches-writing-for-television,90,90,0,f
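
With the tables loaded, one way to view the path from page view to purchase is to count distinct anonymous_id values at each stage. This is a rough sketch on made-up ids, not results from the real files. The `stages` dictionary and the choice to leave out homepage_click (treated here as a side path) are assumptions.

```python
import pandas as pd

# Toy stand-ins for four of the tables loaded above; ids are made up.
stages = {
    "pages": pd.DataFrame({"anonymous_id": ["a", "a", "b", "c", "d"]}),
    "course_marketing_click": pd.DataFrame({"anonymous_id": ["a", "b", "b", "c"]}),
    "purchase_click": pd.DataFrame({"anonymous_id": ["a", "b"]}),
    "purchased_class": pd.DataFrame({"anonymous_id": ["a"]}),
}

# Count distinct device sessions seen at each stage.
funnel = pd.Series(
    {name: df["anonymous_id"].nunique() for name, df in stages.items()},
    name="distinct_users",
)

# Share of the first stage (page views) that reached each later stage.
funnel_share = (funnel / funnel.iloc[0]).round(2)

print(pd.DataFrame({"distinct_users": funnel, "share_of_first_stage": funnel_share}))
```

On the real data, the same dictionary could hold the `pages`, `marketing`, `purchase_click`, and `purchased_class` frames defined in the cells above.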


Analysis Strategy:

The business has a subscription model in which the purchase of a class or a subscription plan adds to revenue.

1. We want to track the user cohorts who visited Gordon Ramsay's marketing page:
* those who converted (purchased a class, whether Gordon Ramsay's or a non-GR class, or an annual membership)
    * did they buy multiple classes?
    * did they buy for themselves or as a gift?
* churned
* resurrected
* retained (still on the platform, but have not made a purchase).

2. After defining the groups, create attributions (first/last touch), since there are multiple purchase clicks (some not related to a purchase), multiple homepage and marketing clicks (only marketing clicks relate to class purchase clicks), and multiple page views.

3. From there, characterize each group's behavior, including how they got to the site, their engagement, and attribution to purchase (class/annual membership).

4. It would be nice to add some correlation heatmaps to relate certain behaviors to conversion, etc.
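
The first/last-touch attribution in step 2 could be sketched as follows. This is an illustration on made-up rows, not the notebook's actual method. It assumes one purchase per user, uses channel_grouping from page views as the touch, and the names `views`, `purchases`, `purchased_at`, and `attribution` are hypothetical.

```python
import pandas as pd

# Toy page views and purchases; all values are made up for illustration.
views = pd.DataFrame({
    "anonymous_id": ["a", "a", "a", "b", "b"],
    "received_at": pd.to_datetime([
        "2017-11-01 10:00", "2017-11-02 09:00", "2017-11-03 12:00",
        "2017-11-01 08:00", "2017-11-04 08:00",
    ]),
    "channel_grouping": [
        "paid", "organic-social-pr", "paid", "organic-social-pr", "paid",
    ],
})
purchases = pd.DataFrame({
    "anonymous_id": ["a", "b"],
    "purchased_at": pd.to_datetime(["2017-11-02 18:00", "2017-11-02 12:00"]),
})

# Keep only the page views that happened at or before each user's purchase.
touches = views.merge(purchases, on="anonymous_id")
touches = touches[touches["received_at"] <= touches["purchased_at"]]
touches = touches.sort_values(["anonymous_id", "received_at"])

# First touch = earliest qualifying view, last touch = latest qualifying view.
attribution = touches.groupby("anonymous_id")["channel_grouping"].agg(
    first_touch="first", last_touch="last"
)
print(attribution)
```

The groupby `first` and `last` aggregations skip null values, so null channel_grouping rows would need to be filled beforehand if they should count as touches.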

In [7]:
# 1. Of the 476 distinct users who visited the Gordon Ramsay (GR) marketing page and bought a MasterClass (MC) product (for themselves or as a gift), the share who bought the GR class (one or more times):
print(format((323/476)*100, ".2f"), '% of distinct MC purchasers who bought the GR class.' )

67.86 % of distinct MC purchasers who bought the GR class.


In [8]:
# 2. Of the distinct users who purchased the GR class, those who purchased the GR class only:
print(format((310/323)*100, ".2f"), '% of distinct users who purchased only GR class and NO other MC offerings (other classes/annual pass).' )

95.98 % of distinct users who purchased only GR class and NO other MC offerings (other classes/annual pass).


In [9]:
# 3. Of the distinct users who purchased the GR class, those who also purchased other MC offerings:
print(format((13/323) *100, ".2f"), '% of distinct users who purchased the GR class AND other MC offerings (other classes/annual pass).' )

4.02 % of distinct users who purchased the GR class AND other MC offerings (other classes/annual pass).
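
The counts in cells 7 to 9 are typed in by hand. A hedged sketch of how such shares might be derived from a table shaped like purchased_class is shown below. The rows are made up, so the printed figures will not match 476, 323, 310, or 13. The sketch also skips the restriction to users who visited the GR marketing page, and the names `purchased_demo` and `per_user` are hypothetical.

```python
import pandas as pd

# product_id of the GR class as it appears in the sample rows above.
GR = "gordon-ramsay-teaches-cooking"

# Toy purchases shaped like purchased_class.csv; ids and rows are made up.
purchased_demo = pd.DataFrame({
    "anonymous_id": ["a", "b", "b", "c", "d"],
    "product_id": [
        GR,
        GR,
        "steve-martin-teaches-comedy",
        "shonda-rhimes-teaches-writing-for-television",
        GR,
    ],
})

# For each buyer: did they buy the GR class, and did they buy anything else?
is_gr = purchased_demo["product_id"].eq(GR)
per_user = pd.DataFrame({
    "bought_gr": is_gr.groupby(purchased_demo["anonymous_id"]).any(),
    "bought_other": (~is_gr).groupby(purchased_demo["anonymous_id"]).any(),
})

buyers = len(per_user)
gr_buyers = int(per_user["bought_gr"].sum())
gr_only = int((per_user["bought_gr"] & ~per_user["bought_other"]).sum())
gr_plus_other = gr_buyers - gr_only

print(f"{gr_buyers / buyers:.2%} of buyers purchased the GR class")
print(f"{gr_only / gr_buyers:.2%} of GR buyers purchased only the GR class")
print(f"{gr_plus_other / gr_buyers:.2%} of GR buyers also purchased another offering")
```

Deriving the counts this way keeps the percentages in step with the data if the input files change.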
