#### Cool T-Shirts Inc. has asked me to analyze data on visits to their website. My job is to build a funnel, which is a description of how many people continue to the next step of a multi-step process.


In [3]:
"""In this case, our funnel is going to describe the following process:

A user visits CoolTShirts.com
A user adds a t-shirt to their cart
A user clicks “checkout”
A user actually purchases a t-shirt
"""
import pandas as pd

#Reading the CSV files; parse_dates=[1] parses the second column (the timestamp) as datetime
visits = pd.read_csv('visits.csv',
                     parse_dates=[1])
cart = pd.read_csv('cart.csv',
                   parse_dates=[1])
checkout = pd.read_csv('checkout.csv',
                       parse_dates=[1])
purchase = pd.read_csv('purchase.csv',
                       parse_dates=[1])

#Inspecting DataFrames
#print(visits.head(5))
#print(cart.head(5))
#print(checkout.head(5))
#print(purchase.head(5))

#First level of the funnel: what percentage of visitors did not move on to the cart

#To find it I need to combine the "visits" & "cart" dataframes (left merge keeps every visit)
visit_to_cart = pd.merge(visits,cart,how="left")

#Length of dataframe
len_visit_to_cart = len(visit_to_cart)

#number of rows with a null cart_time in the merged dataframe, i.e. visits that never reached the cart
null_cart_times = len(visit_to_cart[visit_to_cart.cart_time.isnull()])

#ratio of customers who visited but did not add a product to their cart
visited_not_cart = float(null_cart_times)/float(len(visits))


#Second level of the funnel: what percentage of users did not move from cart to checkout
#To calculate this ratio I will merge the "cart" and "checkout" dataframes
cart_checkout = cart.merge(checkout, how="left")

#number of users who did not checkout
null_checkout_times = len(cart_checkout[cart_checkout.checkout_time.isnull()])

#ratio of customers who added a product to their cart but did not checkout
cart_not_checkout = float(null_checkout_times) / float(len(cart))

#Third level of the funnel: what percentage of users did not move from checkout to purchase
checkout_purchase = checkout.merge(purchase, how="left")

#number of users who did not purchase
null_purchase_times = len(checkout_purchase[checkout_purchase.purchase_time.isnull()])

#ratio of customers who made it to checkout but did not purchase
checkout_not_purchase = float(null_purchase_times) / float(len(checkout))

print("{} percent of users who visited the page did not add a t-shirt to their cart".format(round(visited_not_cart*100, 2)))
print("{} percent of users who added a t-shirt to their cart did not checkout".format(round(cart_not_checkout*100, 2)))
print("{} percent of users who made it to checkout  did not purchase a shirt".format(round( checkout_not_purchase*100, 2)))

#What is the average time from a visit to a purchase?
all_data = visit_to_cart.merge(cart_checkout, how='left').merge(purchase, how='left')

all_data['time_to_purchase'] = all_data['purchase_time'] - all_data['visit_time']



print("Average time to purchase is {}".format(all_data.time_to_purchase.mean()) )

82.6 percent of users who visited the page did not add a t-shirt to their cart
35.06 percent of users who added a t-shirt to their cart did not checkout
36.28 percent of users who made it to checkout  did not purchase a shirt
Average time to purchase is 0 days 00:43:12.380952380
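
A note on how the average above is computed: subtracting two datetime columns yields a timedelta column, and `mean()` skips missing values by default, so visitors who never purchased do not enter the average. The sketch below uses a made-up three-row DataFrame (not the Cool T-Shirts data) and also computes the median, which is less sensitive to a few very slow purchases.

```python
import pandas as pd

# Made-up example rows, only for illustration; not taken from the CSV files.
toy = pd.DataFrame({
    "visit_time": pd.to_datetime(
        ["2017-01-01 10:00", "2017-01-01 11:00", "2017-01-01 12:00"]
    ),
    # The second visitor never purchased, so the purchase time is missing (NaT).
    "purchase_time": pd.to_datetime(
        ["2017-01-01 10:30", None, "2017-01-01 12:50"]
    ),
})

# datetime - datetime gives a timedelta; rows without a purchase become NaT.
toy["time_to_purchase"] = toy["purchase_time"] - toy["visit_time"]

# mean() and median() ignore NaT by default, so only purchasers are averaged.
mean_time = toy["time_to_purchase"].mean()
median_time = toy["time_to_purchase"].median()

print("Average time to purchase is {}".format(mean_time))
print("Median time to purchase is {}".format(median_time))
```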


The biggest pain point is that we are not able to convince visitors to add products to their carts: 82.6 percent of visitors leave without adding a t-shirt. We have to focus on that step first.
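
The drop-off calculation repeated for each funnel step could also be wrapped in a small helper. This is only a sketch on made-up data: the `user_id` column name and the `drop_off_rate` function are assumptions for illustration, not part of the analysis above. It counts each user once, so repeated rows for the same user in a later step do not affect the ratio.

```python
import pandas as pd

# Made-up data for illustration; the notebook reads the real data from CSV files.
# The "user_id" column name is an assumption about how users are identified.
visits = pd.DataFrame({
    "user_id": ["a", "b", "c", "d", "e"],
    "visit_time": pd.to_datetime([
        "2017-01-01 10:00", "2017-01-01 10:05", "2017-01-01 10:10",
        "2017-01-01 10:15", "2017-01-01 10:20",
    ]),
})
cart = pd.DataFrame({
    "user_id": ["a", "c"],
    "cart_time": pd.to_datetime(["2017-01-01 10:12", "2017-01-01 10:30"]),
})


def drop_off_rate(step, next_step, key="user_id"):
    """Hypothetical helper: share of users in `step` absent from `next_step`."""
    current_users = step[key].drop_duplicates()
    next_users = next_step[key].drop_duplicates()
    # True for every user of the current step who also reached the next step.
    stayed = current_users.isin(next_users)
    return 1.0 - stayed.mean()


rate = drop_off_rate(visits, cart)
print("{} percent of visitors did not add a t-shirt to their cart".format(
    round(rate * 100, 2)))
```

The same call could be repeated for the cart-to-checkout and checkout-to-purchase steps, which keeps the three ratios computed in exactly the same way.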