## Funnel for Cool T-Shirts Inc.

Page Visits Funnel
Cool T-Shirts Inc. has asked you to analyze data on visits to their website. Your job is to build a funnel, which is a description of how many people continue to the next step of a multi-step process.

In this case, our funnel is going to describe the following process:

- A user visits CoolTShirts.com
- A user adds a t-shirt to their cart
- A user clicks “checkout”
- A user actually purchases a t-shirt
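
To make the idea of a funnel concrete, here is a minimal sketch that turns per-step counts into "share continuing to the next step". The counts are invented for illustration only and are not taken from the Cool T-Shirts data.

```python
# Hypothetical step counts, invented purely for illustration.
funnel_steps = [
    ("visit", 1000),
    ("cart", 200),
    ("checkout", 150),
    ("purchase", 120),
]

# For each step, compute the percentage of people from the previous step
# who continued on to it.
continue_rate = {}
for (prev_name, prev_count), (name, count) in zip(funnel_steps, funnel_steps[1:]):
    continue_rate[f"{prev_name} -> {name}"] = round(count / prev_count * 100, 1)

print(continue_rate)
```

The exercises below compute the same kind of percentages from the actual DataFrames by counting null timestamps after a left merge.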

In [1]:
import pandas as pd
import numpy as np

##### Exercise 1

Inspect the DataFrames using print and head:

- `visits` lists all of the users who have visited the website
- `cart` lists all of the users who have added a t-shirt to their cart
- `checkout` lists all of the users who have started the checkout
- `purchase` lists all of the users who have purchased a t-shirt

In [3]:
visits = pd.read_csv("visits.csv")
cart = pd.read_csv("cart.csv")
checkout = pd.read_csv("checkout.csv")
purchase = pd.read_csv("purchase.csv")
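
Because every file carries a timestamp column, it can be convenient to parse the timestamps while loading. The sketch below is an optional variation, not what this notebook does: it uses an in-memory stand-in for `visits.csv` (two rows copied from the output shown in this notebook) and assumes the file has exactly the columns `user_id` and `visit_time`.

```python
import io
import pandas as pd

# Stand-in for visits.csv; the real file's exact layout is an assumption.
csv_text = io.StringIO(
    "user_id,visit_time\n"
    "943647ef-3682-4750-a2e1-918ba6f16188,2017-04-07 15:14:00\n"
    "0c3a3dd0-fb64-4eac-bf84-ba069ce409f2,2017-01-26 14:24:00\n"
)

# parse_dates converts the named column to datetime64 while reading.
visits_sample = pd.read_csv(csv_text, parse_dates=["visit_time"])
print(visits_sample.dtypes)
```

With parsed timestamps, later time differences can be computed without calling `pd.to_datetime` again.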

In [4]:
visits.head()

Unnamed: 0,user_id,visit_time
0,943647ef-3682-4750-a2e1-918ba6f16188,2017-04-07 15:14:00
1,0c3a3dd0-fb64-4eac-bf84-ba069ce409f2,2017-01-26 14:24:00
2,6e0b2d60-4027-4d9a-babd-0e7d40859fb1,2017-08-20 08:23:00
3,6879527e-c5a6-4d14-b2da-50b85212b0ab,2017-11-04 18:15:00
4,a84327ff-5daa-4ba1-b789-d5b4caf81e96,2017-02-27 11:25:00


In [5]:
cart.head()

Unnamed: 0,user_id,cart_time
0,2be90e7c-9cca-44e0-bcc5-124b945ff168,2017-11-07 20:45:00
1,4397f73f-1da3-4ab3-91af-762792e25973,2017-05-27 01:35:00
2,a9db3d4b-0a0a-4398-a55a-ebb2c7adf663,2017-03-04 10:38:00
3,b594862a-36c5-47d5-b818-6e9512b939b3,2017-09-27 08:22:00
4,a68a16e2-94f0-4ce8-8ce3-784af0bbb974,2017-07-26 15:48:00


In [6]:
checkout.head()

Unnamed: 0,user_id,checkout_time
0,d33bdc47-4afa-45bc-b4e4-dbe948e34c0d,2017-06-25 09:29:00
1,4ac186f0-9954-4fea-8a27-c081e428e34e,2017-04-07 20:11:00
2,3c9c78a7-124a-4b77-8d2e-e1926e011e7d,2017-07-13 11:38:00
3,89fe330a-8966-4756-8f7c-3bdbcd47279a,2017-04-20 16:15:00
4,3ccdaf69-2d30-40de-b083-51372881aedd,2017-01-08 20:52:00


In [7]:
purchase.head()

Unnamed: 0,user_id,purchase_time
0,4b44ace4-2721-47a0-b24b-15fbfa2abf85,2017-05-11 04:25:00
1,02e684ae-a448-408f-a9ff-dcb4a5c99aac,2017-09-05 08:45:00
2,4b4bc391-749e-4b90-ab8f-4f6e3c84d6dc,2017-11-20 20:49:00
3,a5dbb25f-3c36-4103-9030-9f7c6241cd8d,2017-01-22 15:18:00
4,46a3186d-7f5a-4ab9-87af-84d05bfd4867,2017-06-11 11:32:00


##### Exercise 2

Combine visits and cart using a left merge.

In [10]:
visits_and_cart = pd.merge(visits, cart, how="left")
visits_and_cart.head()

Unnamed: 0,user_id,visit_time,cart_time
0,943647ef-3682-4750-a2e1-918ba6f16188,2017-04-07 15:14:00,
1,0c3a3dd0-fb64-4eac-bf84-ba069ce409f2,2017-01-26 14:24:00,2017-01-26 14:44:00
2,6e0b2d60-4027-4d9a-babd-0e7d40859fb1,2017-08-20 08:23:00,2017-08-20 08:31:00
3,6879527e-c5a6-4d14-b2da-50b85212b0ab,2017-11-04 18:15:00,
4,a84327ff-5daa-4ba1-b789-d5b4caf81e96,2017-02-27 11:25:00,
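
The merge above relies on pandas picking the shared column automatically. A small sketch on toy data (the user IDs and times are made up) shows the same left merge with the key spelled out via `on="user_id"`, plus the optional `indicator=True` flag, which labels each row by where it came from.

```python
import pandas as pd

# Toy data, invented for illustration.
visits_toy = pd.DataFrame({
    "user_id": ["a", "b", "c"],
    "visit_time": ["2017-01-01 10:00:00", "2017-01-02 11:00:00", "2017-01-03 12:00:00"],
})
cart_toy = pd.DataFrame({
    "user_id": ["b"],
    "cart_time": ["2017-01-02 11:20:00"],
})

# A left merge keeps every row of visits_toy; visitors without a cart row get NaN.
merged = pd.merge(visits_toy, cart_toy, on="user_id", how="left", indicator=True)
print(merged)
```

Rows marked `left_only` in the `_merge` column are visitors with no matching cart entry, which is exactly what a null `cart_time` indicates.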


##### Exercise 3
How long is your merged DataFrame?

In [11]:
len(visits_and_cart)
# 2000

2000

##### Exercise 4

How many of the timestamps are null for the column `cart_time`?

What do these null rows mean?

In [15]:
visits_and_cart["cart_time"].isnull().value_counts()
# 1652 rows have a null cart_time: those visitors never added a t-shirt to their cart.

True     1652
False     348
Name: cart_time, dtype: int64

##### Exercise 5
What percent of users who visited Cool T-Shirts Inc. ended up not placing a t-shirt in their cart?

In [43]:
perc = visits_and_cart["cart_time"].isnull().sum() / len(visits_and_cart["cart_time"]) * 100
print(f"{perc}% of users who visited Cool T-shirts Inc. ended up not placing a t-shirt in their cart.")

82.6% of users who visited Cool T-shirts Inc. ended up not placing a t-shirt in their cart.
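
As a side note, the null percentage can also be written more compactly: the mean of a boolean Series is the fraction of `True` values. A minimal sketch on a made-up column:

```python
import pandas as pd

# Toy cart_time column: one real timestamp and three missing values.
cart_time_toy = pd.Series(["2017-01-26 14:44:00", None, None, None])

# isnull() gives True/False per row; mean() treats True as 1 and False as 0.
null_share = cart_time_toy.isnull().mean() * 100
print(f"{null_share}% of rows have no cart_time")
```

This is equivalent to dividing `isnull().sum()` by the length of the column.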


##### Exercise 6
Repeat the left merge for cart and checkout and count null values. What percentage of users put items in their cart, but did not proceed to checkout?

In [24]:
cart_and_checkout = pd.merge(cart, checkout, how="left")
cart_and_checkout.head()

Unnamed: 0,user_id,cart_time,checkout_time
0,2be90e7c-9cca-44e0-bcc5-124b945ff168,2017-11-07 20:45:00,2017-11-07 21:14:00
1,2be90e7c-9cca-44e0-bcc5-124b945ff168,2017-11-07 20:45:00,2017-11-07 20:50:00
2,2be90e7c-9cca-44e0-bcc5-124b945ff168,2017-11-07 20:45:00,2017-11-07 21:11:00
3,4397f73f-1da3-4ab3-91af-762792e25973,2017-05-27 01:35:00,
4,a9db3d4b-0a0a-4398-a55a-ebb2c7adf663,2017-03-04 10:38:00,2017-03-04 11:04:00


In [25]:
cart_and_checkout["checkout_time"].isnull().value_counts()

False    360
True     122
Name: checkout_time, dtype: int64

In [44]:
perc = round(cart_and_checkout["checkout_time"].isnull().sum() / len(cart_and_checkout) * 100, 2)
print(f"{perc}% of users put items in their cart, but did not proceed to checkout.")

25.31% of users put items in their cart, but did not proceed to checkout.
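
One caveat: the output of `cart_and_checkout.head()` shows the same `user_id` on three rows with different checkout times, so the figure above is a share of rows rather than of distinct users. The sketch below, on invented data, contrasts the row-based percentage with one possible per-user version; whether the per-user figure is the intended answer for this exercise is an assumption.

```python
import pandas as pd

# Toy data: user u1 has three checkout rows, user u2 has none.
cart_toy = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "cart_time": ["2017-11-07 20:45:00", "2017-05-27 01:35:00"],
})
checkout_toy = pd.DataFrame({
    "user_id": ["u1", "u1", "u1"],
    "checkout_time": ["2017-11-07 20:50:00", "2017-11-07 21:11:00", "2017-11-07 21:14:00"],
})

merged = pd.merge(cart_toy, checkout_toy, on="user_id", how="left")

# Row-based: 1 null row out of 4 merged rows.
row_based = merged["checkout_time"].isnull().mean() * 100

# User-based: a user reached checkout if they have at least one non-null checkout_time.
reached_checkout = merged.groupby("user_id")["checkout_time"].count() > 0
user_based = (~reached_checkout).mean() * 100

print(row_based, user_based)
```

In the toy data, one of two users never checked out, yet the row-based figure is lower because the repeated checkout rows inflate the denominator.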


##### Exercise 7
Merge all four steps of the funnel, in order, using a series of left merges. Save the results to the variable `all_data`.

Examine the result using print and head.

In [40]:
all_data = visits.merge(cart, how="left").merge(checkout, how="left").merge(purchase, how="left")

In [42]:
all_data.head()

Unnamed: 0,user_id,visit_time,cart_time,checkout_time,purchase_time
0,943647ef-3682-4750-a2e1-918ba6f16188,2017-04-07 15:14:00,,,
1,0c3a3dd0-fb64-4eac-bf84-ba069ce409f2,2017-01-26 14:24:00,2017-01-26 14:44:00,2017-01-26 14:54:00,2017-01-26 15:08:00
2,6e0b2d60-4027-4d9a-babd-0e7d40859fb1,2017-08-20 08:23:00,2017-08-20 08:31:00,,
3,6879527e-c5a6-4d14-b2da-50b85212b0ab,2017-11-04 18:15:00,,,
4,a84327ff-5daa-4ba1-b789-d5b4caf81e96,2017-02-27 11:25:00,,,


##### Exercise 8

What percentage of users proceeded to checkout, but did not purchase a t-shirt?

In [53]:
# Count rows with a non-null checkout_time (rows that reached checkout)
checkout_count = all_data["checkout_time"].notnull().sum()
checkout_count

598

In [55]:
# Count rows with a non-null purchase_time (rows that ended in a purchase)
purchase_count = all_data["purchase_time"].notnull().sum()
purchase_count

497

In [58]:
checkout_to_purchase = round(purchase_count / checkout_count * 100, 2)
checkout_to_purchase

83.11

In [60]:
checkout_to_not_purchase = 100 - checkout_to_purchase
print(f"{checkout_to_not_purchase}% of users proceeded to checkout, but did not purchase a t-shirt.")

16.89% of users proceeded to checkout, but did not purchase a t-shirt.
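
Another way to frame the same question is to first keep only the rows that reached checkout and then measure how many of them lack a purchase. This is a sketch on made-up data, with the placeholder string `"t"` standing in for a timestamp.

```python
import pandas as pd

# Toy funnel table; "t" is a placeholder for any timestamp.
funnel_toy = pd.DataFrame({
    "checkout_time": [None, "t", "t", "t", "t"],
    "purchase_time": [None, "t", "t", "t", None],
})

# Restrict to rows that reached checkout, then take the share with no purchase.
reached_checkout = funnel_toy[funnel_toy["checkout_time"].notnull()]
not_purchased = reached_checkout["purchase_time"].isnull().mean() * 100
print(f"{not_purchased}% of checkout rows did not end in a purchase")
```

Filtering first makes the denominator explicit: only rows that actually reached the previous step are counted.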


##### Exercise 9

Which step of the funnel is weakest (i.e., has the highest percentage of users not completing it)?

How might Cool T-Shirts Inc. change their website to fix this problem?

In [62]:
len(all_data)

2372

In [72]:
# Share of all rows in all_data that reached the cart step (non-null cart_time)
all_data["cart_time"].notnull().sum() / len(all_data) * 100

30.354131534569984

In [74]:
# Share of all rows in all_data that reached checkout (non-null checkout_time)
all_data["checkout_time"].notnull().sum() / len(all_data) * 100

25.21079258010118

In [76]:
# Share of all rows in all_data that ended in a purchase (non-null purchase_time)
all_data["purchase_time"].notnull().sum() / len(all_data) * 100

20.952782462057336

In [77]:
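# The cart step is the weakest: 82.6% of visitors never added a t-shirt to their cart,
# compared with 25.31% of cart rows that never reached checkout and 16.89% of checkouts
# that never ended in a purchase. Cool T-Shirts Inc. could run campaigns that encourage
# visitors to add products to their cart.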
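
To compare the steps on an equal footing, the drop-off at each step can be measured against the rows that reached the step before it. The sketch below does this in a loop on invented data (the placeholder `"t"` stands in for a timestamp); the column names mirror `all_data`, but the numbers do not.

```python
import pandas as pd

# Toy funnel with 10 visits, 4 carts, 3 checkouts, 2 purchases.
funnel_toy = pd.DataFrame({
    "visit_time": ["t"] * 10,
    "cart_time": ["t"] * 4 + [None] * 6,
    "checkout_time": ["t"] * 3 + [None] * 7,
    "purchase_time": ["t"] * 2 + [None] * 8,
})

steps = ["visit_time", "cart_time", "checkout_time", "purchase_time"]

# For each step, look only at rows that reached the previous step
# and compute the percentage that did not continue.
drop_off = {}
for prev_col, col in zip(steps, steps[1:]):
    reached_prev = funnel_toy[funnel_toy[prev_col].notnull()]
    drop_off[col] = reached_prev[col].isnull().mean() * 100

drop_off = pd.Series(drop_off)
weakest = drop_off.idxmax()
print(drop_off)
print("Weakest step:", weakest)
```

`idxmax()` returns the label of the largest drop-off, which identifies the weakest step directly.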

##### Exercise 10

Using the giant merged DataFrame `all_data` that you created, let’s calculate the average time from initial visit to final purchase. Add a column that is the difference between `purchase_time` and `visit_time`.

In [78]:
all_data.head()

Unnamed: 0,user_id,visit_time,cart_time,checkout_time,purchase_time
0,943647ef-3682-4750-a2e1-918ba6f16188,2017-04-07 15:14:00,,,
1,0c3a3dd0-fb64-4eac-bf84-ba069ce409f2,2017-01-26 14:24:00,2017-01-26 14:44:00,2017-01-26 14:54:00,2017-01-26 15:08:00
2,6e0b2d60-4027-4d9a-babd-0e7d40859fb1,2017-08-20 08:23:00,2017-08-20 08:31:00,,
3,6879527e-c5a6-4d14-b2da-50b85212b0ab,2017-11-04 18:15:00,,,
4,a84327ff-5daa-4ba1-b789-d5b4caf81e96,2017-02-27 11:25:00,,,


In [97]:
all_data["visit_to_purchase_time"] = (pd.to_datetime(all_data["purchase_time"]) - pd.to_datetime(all_data["visit_time"]))
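
The new column holds `Timedelta` values, with `NaT` wherever there was no purchase. If a plain number is easier to work with, the difference can be converted to minutes. A minimal sketch, using one visit/purchase pair from the output above plus one visit without a purchase; the column name `minutes_to_purchase` is made up for this example.

```python
import pandas as pd

toy = pd.DataFrame({
    "visit_time": ["2017-01-26 14:24:00", "2017-04-07 15:14:00"],
    "purchase_time": ["2017-01-26 15:08:00", None],
})

# Subtracting two datetime columns yields a timedelta column (NaT where purchase is missing).
toy["visit_to_purchase_time"] = (
    pd.to_datetime(toy["purchase_time"]) - pd.to_datetime(toy["visit_time"])
)

# Convert the timedelta to minutes as a float; NaT becomes NaN.
toy["minutes_to_purchase"] = toy["visit_to_purchase_time"].dt.total_seconds() / 60
print(toy[["visit_to_purchase_time", "minutes_to_purchase"]])
```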

##### Exercise 11
Examine the results by printing the new column to the screen.

In [99]:
all_data[["visit_to_purchase_time"]]

Unnamed: 0,visit_to_purchase_time
0,NaT
1,0 days 00:44:00
2,NaT
3,NaT
4,NaT
...,...
2367,NaT
2368,NaT
2369,NaT
2370,NaT


##### Exercise 12

Calculate the average time to purchase by calling the `.mean()` method on your new column.

In [114]:
all_data["visit_to_purchase_time"].mean()

Timedelta('0 days 00:43:53.360160965')
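
Note that `.mean()` skips `NaT` values by default, so the average above covers only rows with a purchase. The sketch below illustrates this on three made-up visits, one of which has no purchase.

```python
import pandas as pd

# Toy timestamps, invented for illustration; the second visit has no purchase.
visit = pd.to_datetime(pd.Series([
    "2017-01-26 14:24:00", "2017-04-07 15:14:00", "2017-08-20 08:23:00",
]))
purchase = pd.to_datetime(pd.Series([
    "2017-01-26 15:08:00", None, "2017-08-20 09:13:00",
]))

durations = purchase - visit  # 44 minutes, NaT, 50 minutes

# count() reports how many non-missing durations feed into the mean.
print(durations.count())
print(durations.mean())
```

Checking `count()` alongside the mean shows how many purchases the average is actually based on.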