# Page Visits Funnel

Cool T-Shirts Inc. has asked you to analyze data on visits to their website. Your job is to build a funnel, which is a description of how many people continue to the next step of a multi-step process.

In this case, our funnel is going to describe the following process:

* A user visits CoolTShirts.com
* A user adds a t-shirt to their cart
* A user clicks “checkout”
* A user actually purchases a t-shirt

## Instructions
***


### Funnel for Cool T-Shirts Inc.

1. Inspect the DataFrames using `print` and `head`.



2. Combine visits and cart using a left merge.


3. How long is your merged DataFrame?


4. How many of the timestamps are null for the column `cart_time`?
What do these null rows mean?



5. What percent of users who visited Cool T-Shirts Inc. ended up not placing a t-shirt in their cart?<br>

    **Note**: *To calculate percentages in Python 2, convert either the numerator or the denominator to a float with `float()`; otherwise `/` performs integer division and truncates the decimal part. In Python 3, `/` always performs true division, so the conversion is optional.*



6. Repeat the left merge for cart and checkout and count null values. What percentage of users put items in their cart, but did not proceed to checkout?



7. Merge all four steps of the funnel, in order, using a series of left merges. Save the results to the variable `all_data`.<br>
Examine the result using `print` and `head`.


8. What percentage of users proceeded to checkout, but did not purchase a t-shirt?


9. Which step of the funnel is weakest (i.e., has the highest percentage of users not completing it)?<br>
How might Cool T-Shirts Inc. change their website to fix this problem?


### Average Time to Purchase

10. Using the merged DataFrame `all_data` that you created, calculate the time from initial visit to final purchase for each row and store it in a new column, `time_to_purchase`.


11. Examine the results.


12. Calculate the average time to purchase.

## Practice
***


In [27]:
import pandas as pd

visits = pd.read_csv('visits.csv',
                     parse_dates=[1])
cart = pd.read_csv('cart.csv',
                   parse_dates=[1])
checkout = pd.read_csv('checkout.csv',
                       parse_dates=[1])
purchase = pd.read_csv('purchase.csv',
                       parse_dates=[1])

In [45]:
#1 Inspect the DataFrames
display(visits.head(2), cart.head(2), checkout.head(2), purchase.head(2))

|   | user_id | visit_time |
|---|---|---|
| 0 | 943647ef-3682-4750-a2e1-918ba6f16188 | 2017-04-07 15:14:00 |
| 1 | 0c3a3dd0-fb64-4eac-bf84-ba069ce409f2 | 2017-01-26 14:24:00 |

|   | user_id | cart_time |
|---|---|---|
| 0 | 2be90e7c-9cca-44e0-bcc5-124b945ff168 | 2017-11-07 20:45:00 |
| 1 | 4397f73f-1da3-4ab3-91af-762792e25973 | 2017-05-27 01:35:00 |

|   | user_id | checkout_time |
|---|---|---|
| 0 | d33bdc47-4afa-45bc-b4e4-dbe948e34c0d | 2017-06-25 09:29:00 |
| 1 | 4ac186f0-9954-4fea-8a27-c081e428e34e | 2017-04-07 20:11:00 |

|   | user_id | purchase_time |
|---|---|---|
| 0 | 4b44ace4-2721-47a0-b24b-15fbfa2abf85 | 2017-05-11 04:25:00 |
| 1 | 02e684ae-a448-408f-a9ff-dcb4a5c99aac | 2017-09-05 08:45:00 |


In [29]:
#2 Combine visits and cart with a left merge (joins on the shared user_id column)
visits_cart = pd.merge(visits, cart, how='left')

#3 How long is the merged DataFrame
display(len(visits_cart))

2000

In [30]:
#4 How many of the timestamps are null for cart_time?
null_rows = visits_cart[visits_cart.cart_time.isnull()]  # DataFrame of rows with no cart_time
display(len(null_rows))

1652
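The same count can also be obtained without building an intermediate DataFrame. The sketch below is illustrative only: it uses a small made-up table (`toy`, with invented user IDs and timestamps) rather than the real `visits_cart` data, and relies on the fact that summing a boolean Series counts its `True` values.

```python
import pandas as pd

# Toy stand-in for visits_cart; user IDs and timestamps are invented.
toy = pd.DataFrame({
    "user_id": ["a", "b", "c", "d"],
    "cart_time": pd.to_datetime(
        ["2017-01-26 14:44", None, None, "2017-08-20 08:31"]
    ),
})

# isna() returns a boolean Series; summing it counts the missing timestamps.
null_count = int(toy["cart_time"].isna().sum())
print(null_count)
```

Applied to the real data, the equivalent expression would be `visits_cart.cart_time.isna().sum()`.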

In [43]:
#5 Percent of users who visited and ended up not placing a t-shirt in their cart
not_placed_percentage = round((float(len(null_rows)) / visits_cart['user_id'].count()) * 100,2)
display(not_placed_percentage)

82.6
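The `float()` conversion in step 5 matters only under Python 2. As a quick check, the sketch below redoes the arithmetic with the counts reported in steps 3 and 4 (1652 null rows out of 2000) and contrasts Python 3's two division operators; the variable names are chosen here for illustration.

```python
# Counts taken from the outputs of steps 3 and 4 above.
null_rows_count = 1652
total_rows = 2000

# In Python 3, / is true division and // is floor division.
true_division = null_rows_count / total_rows
floor_division = null_rows_count // total_rows

# Percentage of visitors who never added a t-shirt to their cart.
pct = round(true_division * 100, 2)
print(true_division, floor_division, pct)
```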

In [41]:
#6 Percentage of users who put items in their cart but did not proceed to checkout
cart_checkout = pd.merge(cart, checkout, how='left')
null_checkouts = cart_checkout[cart_checkout.checkout_time.isnull()]
not_checkout_percentage = round((float(len(null_checkouts)) / cart_checkout.user_id.count()) * 100, 2)
display(not_checkout_percentage)

25.31

In [33]:
#7 Merge all four steps of the funnel, in order, using a series of left merges
all_data = (visits.merge(cart, how='left')
                  .merge(checkout, how='left')
                  .merge(purchase, how='left'))
display(all_data.head())

|   | user_id | visit_time | cart_time | checkout_time | purchase_time |
|---|---|---|---|---|---|
| 0 | 943647ef-3682-4750-a2e1-918ba6f16188 | 2017-04-07 15:14:00 | NaT | NaT | NaT |
| 1 | 0c3a3dd0-fb64-4eac-bf84-ba069ce409f2 | 2017-01-26 14:24:00 | 2017-01-26 14:44:00 | 2017-01-26 14:54:00 | 2017-01-26 15:08:00 |
| 2 | 6e0b2d60-4027-4d9a-babd-0e7d40859fb1 | 2017-08-20 08:23:00 | 2017-08-20 08:31:00 | NaT | NaT |
| 3 | 6879527e-c5a6-4d14-b2da-50b85212b0ab | 2017-11-04 18:15:00 | NaT | NaT | NaT |
| 4 | a84327ff-5daa-4ba1-b789-d5b4caf81e96 | 2017-02-27 11:25:00 | NaT | NaT | NaT |


In [40]:
#8 Percentage of users who proceeded to checkout but did not purchase a t-shirt
checkout_purchase = pd.merge(checkout, purchase, how='left')
null_purchase = len(checkout_purchase[checkout_purchase.purchase_time.isnull()])
null_purchase_percentage = round((float(null_purchase) * 100) / float(len(checkout_purchase)), 2)
display(null_purchase_percentage)

16.89
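Steps 5, 6, and 8 repeat the same pattern: left-merge two tables, then measure the share of rows whose right-hand timestamp is missing. One way to factor that out is sketched below. `drop_off_pct` is a hypothetical helper name, and the two small tables are invented for illustration; the sketch assumes, as in the cells above, that the tables share only the `user_id` column.

```python
import pandas as pd

def drop_off_pct(left, right, time_col):
    """Percent of rows in `left` with no match in `right` (hypothetical helper)."""
    merged = left.merge(right, how="left")  # joins on the shared user_id column
    # The mean of a boolean Series is the fraction of True values.
    return round(float(merged[time_col].isna().mean()) * 100, 2)

# Invented toy tables: four users reach the cart, three reach checkout.
toy_cart = pd.DataFrame({
    "user_id": ["a", "b", "c", "d"],
    "cart_time": pd.to_datetime(["2017-01-01 10:00"] * 4),
})
toy_checkout = pd.DataFrame({
    "user_id": ["a", "c", "d"],
    "checkout_time": pd.to_datetime(["2017-01-01 10:10"] * 3),
})

result = drop_off_pct(toy_cart, toy_checkout, "checkout_time")
print(result)
```

With the real data, a call such as `drop_off_pct(checkout, purchase, 'purchase_time')` would be expected to reproduce the step 8 figure, since it divides the same null count by the same merged length.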

In [35]:
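#9 Weakest step of the funnel: max() returns the highest drop-off percentage,
# which here is 82.6 (visit -> cart), so that step is the weakest
weakest_step = max([not_placed_percentage, not_checkout_percentage, null_purchase_percentage])

#10 Time from initial visit to final purchase, stored as a new column
all_data['time_to_purchase'] = all_data.purchase_time - all_data.visit_time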
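The `max` call in step 9 yields only a number, not the name of the step it belongs to. A minimal sketch of one way to recover the name follows, using the three percentages printed in steps 5, 6, and 8; the dictionary and its labels are introduced here for illustration.

```python
# Drop-off percentages as printed in steps 5, 6, and 8.
drop_off = {
    "visit -> cart": 82.6,
    "cart -> checkout": 25.31,
    "checkout -> purchase": 16.89,
}

# max over the keys, ranked by their values, returns the weakest step's label.
weakest_name = max(drop_off, key=drop_off.get)
print(weakest_name, drop_off[weakest_name])
```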

In [39]:
#11 Examine the results
display(all_data.time_to_purchase)

0                  NaT
1      0 days 00:44:00
2                  NaT
3                  NaT
4                  NaT
             ...      
2367               NaT
2368               NaT
2369               NaT
2370               NaT
2371               NaT
Name: time_to_purchase, Length: 2372, dtype: timedelta64[ns]
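The series above has 2372 rows, while the visits/cart merge in step 3 had 2000. The notebook does not investigate the difference. One plausible explanation (an assumption, not verified against the CSV files) is that a later table contains repeated `user_id` values, because a left merge emits one row per matching pair. The sketch below shows that effect on invented data.

```python
import pandas as pd

# Invented data: user "a" appears twice in the right-hand table.
left = pd.DataFrame({"user_id": ["a", "b", "c"]})
right = pd.DataFrame({
    "user_id": ["a", "a", "b"],
    "checkout_time": pd.to_datetime(
        ["2017-01-01 10:00", "2017-01-01 10:05", "2017-01-02 09:00"]
    ),
})

merged = left.merge(right, how="left")
n_rows = len(merged)  # more rows than `left`, because "a" matched twice

# Counting repeated keys in the right-hand table is a quick diagnostic.
n_dup_keys = int(right["user_id"].duplicated().sum())
print(n_rows, n_dup_keys)
```

Running `checkout.user_id.duplicated().sum()` and `purchase.user_id.duplicated().sum()` on the real tables would show whether this is what happens here.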

In [38]:
#12 Average time to purchase
display(all_data.time_to_purchase.mean())

Timedelta('0 days 00:43:53.360160965')
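Most rows in `time_to_purchase` are `NaT`, yet the mean is still a valid `Timedelta`: pandas skips missing values by default when averaging. The sketch below illustrates this on invented timestamps and also converts the result to minutes, which can be easier to read.

```python
import pandas as pd

# Invented timestamps: the second user never purchased.
visit = pd.Series(pd.to_datetime(
    ["2017-01-26 14:24", "2017-04-07 15:14", "2017-08-20 08:00"]
))
purchase = pd.Series(pd.to_datetime(
    ["2017-01-26 15:08", None, "2017-08-20 08:40"]
))

time_to_purchase = purchase - visit   # NaT where there was no purchase
mean_td = time_to_purchase.mean()     # NaT rows are skipped by default
mean_minutes = mean_td.total_seconds() / 60
print(mean_td, mean_minutes)
```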