## Page Visits Funnel

Cool T-Shirts Inc. has asked you to analyze data on visits to their website. Your job is to build a funnel, which is a description of how many people continue to the next step of a multi-step process.

In this case, our funnel is going to describe the following process:

1. A user visits CoolTShirts.com
2. A user adds a t-shirt to their cart
3. A user clicks “checkout”
4. A user actually purchases a t-shirt

### Complete all the tasks below!
---

**1: Inspect the DataFrames using print and head:**
 
* visits lists all of the users who have visited the website
* cart lists all of the users who have added a t-shirt to their cart
* checkout lists all of the users who have started the checkout
* purchase lists all of the users who have purchased a t-shirt

In [1]:
import pandas as pd

visits = pd.read_csv('visits.csv',
                     parse_dates=[1])
cart = pd.read_csv('cart.csv',
                   parse_dates=[1])
checkout = pd.read_csv('checkout.csv',
                       parse_dates=[1])
purchase = pd.read_csv('purchase.csv',
                       parse_dates=[1])

# code for task1
print(visits.head(5))
print(cart.head(5))
print(checkout.head(5))
print(purchase.head(5))

                                user_id          visit_time
0  943647ef-3682-4750-a2e1-918ba6f16188 2017-04-07 15:14:00
1  0c3a3dd0-fb64-4eac-bf84-ba069ce409f2 2017-01-26 14:24:00
2  6e0b2d60-4027-4d9a-babd-0e7d40859fb1 2017-08-20 08:23:00
3  6879527e-c5a6-4d14-b2da-50b85212b0ab 2017-11-04 18:15:00
4  a84327ff-5daa-4ba1-b789-d5b4caf81e96 2017-02-27 11:25:00
                                user_id           cart_time
0  2be90e7c-9cca-44e0-bcc5-124b945ff168 2017-11-07 20:45:00
1  4397f73f-1da3-4ab3-91af-762792e25973 2017-05-27 01:35:00
2  a9db3d4b-0a0a-4398-a55a-ebb2c7adf663 2017-03-04 10:38:00
3  b594862a-36c5-47d5-b818-6e9512b939b3 2017-09-27 08:22:00
4  a68a16e2-94f0-4ce8-8ce3-784af0bbb974 2017-07-26 15:48:00
                                user_id       checkout_time
0  d33bdc47-4afa-45bc-b4e4-dbe948e34c0d 2017-06-25 09:29:00
1  4ac186f0-9954-4fea-8a27-c081e428e34e 2017-04-07 20:11:00
2  3c9c78a7-124a-4b77-8d2e-e1926e011e7d 2017-07-13 11:38:00
3  89fe330a-8966-4756-8f7c-3bdbcd47279a 
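
The `parse_dates=[1]` argument used above tells pandas to parse the column at position 1 as datetimes. As a minimal sketch, assuming a small in-memory CSV (the `sample_csv` contents and the short user IDs are made up for illustration, not taken from `visits.csv`), the effect can be checked like this:

```python
import io
import pandas as pd

# Hypothetical two-row CSV standing in for visits.csv
sample_csv = io.StringIO(
    "user_id,visit_time\n"
    "u1,2017-04-07 15:14:00\n"
    "u2,2017-01-26 14:24:00\n"
)

# parse_dates=[1] asks pandas to parse the column at position 1 as datetimes
sample_visits = pd.read_csv(sample_csv, parse_dates=[1])

print(sample_visits.dtypes)  # visit_time should show a datetime64 dtype
print(sample_visits.shape)   # (number of rows, number of columns)
```

Having real datetime columns matters later, because the time-to-purchase step subtracts one timestamp column from another.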

**2: Combine ```visits``` and ```cart``` using a left merge.**

In [2]:
# code for task2
visits_cart_left_merge = pd.merge(visits, cart, how='left')
print(visits_cart_left_merge.head(5))

                                user_id          visit_time  \
0  943647ef-3682-4750-a2e1-918ba6f16188 2017-04-07 15:14:00   
1  0c3a3dd0-fb64-4eac-bf84-ba069ce409f2 2017-01-26 14:24:00   
2  6e0b2d60-4027-4d9a-babd-0e7d40859fb1 2017-08-20 08:23:00   
3  6e0b2d60-4027-4d9a-babd-0e7d40859fb1 2017-08-20 08:23:00   
4  6879527e-c5a6-4d14-b2da-50b85212b0ab 2017-11-04 18:15:00   

            cart_time  
0                 NaT  
1 2017-01-26 14:44:00  
2 2017-08-20 08:31:00  
3 2017-08-20 08:49:00  
4                 NaT  
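
Rows 2 and 3 above share a `user_id`, which suggests that user has two cart rows. The following sketch, using made-up toy frames (`toy_visits`, `toy_cart`), shows how a left merge keeps every left-hand row, fills missing matches with `NaT`, and repeats a left-hand row once per match. When `on` is omitted, as in the cell above, `pd.merge` joins on the columns the two frames have in common; the sketch names `user_id` explicitly.

```python
import pandas as pd

# Toy data (assumed for illustration): three visitors, one of whom
# added to the cart twice
toy_visits = pd.DataFrame({
    'user_id': ['a', 'b', 'c'],
    'visit_time': pd.to_datetime(
        ['2017-01-01 10:00', '2017-01-02 11:00', '2017-01-03 12:00']),
})
toy_cart = pd.DataFrame({
    'user_id': ['b', 'b'],
    'cart_time': pd.to_datetime(['2017-01-02 11:05', '2017-01-02 11:20']),
})

# Left merge: every visit is kept; visits without a cart row get NaT
toy_merged = pd.merge(toy_visits, toy_cart, on='user_id', how='left')
print(toy_merged)

# User 'b' appears twice, so the merged frame has more rows than toy_visits
print(len(toy_visits), len(toy_merged))
```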


**3: How long is your merged DataFrame?**

In [3]:
# code for task3
print(len(visits_cart_left_merge))

2052


**4: How many of the timestamps are null for the column cart_time?**

**What do these null rows mean?**

In [4]:
# code for task4
cart_time_null = visits_cart_left_merge[visits_cart_left_merge.cart_time.isnull()]
print(len(cart_time_null))

1652
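
After a left merge, a null `cart_time` marks a visit row that found no match in `cart`, that is, a visitor who never added a t-shirt to their cart. A small sketch on made-up data (the `toy_merged` frame is an assumption, not the real dataset) shows two equivalent ways to count those rows:

```python
import pandas as pd

# Toy merged frame (assumed): users 'b' and 'c' never added to cart
toy_merged = pd.DataFrame({
    'user_id': ['a', 'b', 'c', 'd'],
    'cart_time': pd.to_datetime(
        ['2017-01-01 10:05', None, None, '2017-01-04 09:30']),
})

# Option 1: filter the rows, then count them (as in the cell above)
no_cart = toy_merged[toy_merged.cart_time.isnull()]
print(len(no_cart))

# Option 2: sum the boolean mask directly
n_null = int(toy_merged.cart_time.isnull().sum())
print(n_null)
```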


**5: What percent of users who visited Cool T-Shirts Inc. ended up not placing a t-shirt in their cart?**

**Note: In Python 2, dividing two integers with / performs integer division, which truncates the decimal part, so convert either the numerator or the denominator with float() first. In Python 3, / already returns a float, so the float() calls are harmless but not required.**

In [5]:
# code for task5
percentage_visit_no_cart = float(len(cart_time_null)) / float(len(visits_cart_left_merge))
print(percentage_visit_no_cart)

0.8050682261208577
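
The value above is a fraction, so it corresponds to roughly 80.5%. As a quick check, assuming Python 3, the same number follows from the counts printed in tasks 3 and 4; the `toy_column` series is made up to show that the mean of a null mask gives the same kind of ratio in one step:

```python
import pandas as pd

null_rows = 1652    # rows with no cart_time (output of task 4)
total_rows = 2052   # rows in the merged DataFrame (output of task 3)

share = null_rows / total_rows      # true division in Python 3
percent = round(share * 100, 1)     # expressed as a percentage
print(share, percent)

# The mean of a boolean null mask is the fraction of null entries
toy_column = pd.Series([1.0, None, None, None])
toy_share = toy_column.isnull().mean()
print(toy_share)
```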


**6: Repeat the left merge for ```cart``` and ```checkout``` and count null values. What percentage of users put items in their cart, but did not proceed to checkout?**

In [6]:
# code for task6
cart_checkout_left_merge = pd.merge(cart, checkout, how='left')
checkout_time_null = cart_checkout_left_merge[cart_checkout_left_merge.checkout_time.isnull()]
percentage_cart_no_checkout = float(len(checkout_time_null)) / float(len(cart_checkout_left_merge))
print(percentage_cart_no_checkout)

0.20930232558139536
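
Because the merge-then-count-nulls pattern repeats for each pair of steps, it could be wrapped in a small function. The helper name `drop_off_rate` and the toy frames below are hypothetical; this is only a sketch of the same calculation, with merged rows as the denominator, as in the cells above.

```python
import pandas as pd

def drop_off_rate(left, right, next_time_col):
    """Fraction of merged rows from `left` with no match in `right`.

    Hypothetical helper: left-merges on user_id and measures how often
    the next step's timestamp column is null.
    """
    merged = pd.merge(left, right, on='user_id', how='left')
    return merged[next_time_col].isnull().mean()

# Toy data (assumed): five cart rows, four of which reach checkout
toy_cart = pd.DataFrame({
    'user_id': ['a', 'b', 'c', 'd', 'e'],
    'cart_time': pd.to_datetime(['2017-01-01 10:00'] * 5),
})
toy_checkout = pd.DataFrame({
    'user_id': ['a', 'b', 'c', 'd'],
    'checkout_time': pd.to_datetime(['2017-01-01 10:10'] * 4),
})

rate = drop_off_rate(toy_cart, toy_checkout, 'checkout_time')
print(rate)  # one of five cart rows has no checkout
```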


**7: Merge all four steps of the funnel, in order, using a series of left merges. Save the results to the variable all_data.**

**Examine the result using print and head.**

In [7]:
# code for task7
all_tables_left_merge = visits.merge(cart, how='left').merge(checkout, how='left').merge(purchase, how='left')
print(all_tables_left_merge.head(5))

                                user_id          visit_time  \
0  943647ef-3682-4750-a2e1-918ba6f16188 2017-04-07 15:14:00   
1  0c3a3dd0-fb64-4eac-bf84-ba069ce409f2 2017-01-26 14:24:00   
2  6e0b2d60-4027-4d9a-babd-0e7d40859fb1 2017-08-20 08:23:00   
3  6e0b2d60-4027-4d9a-babd-0e7d40859fb1 2017-08-20 08:23:00   
4  6879527e-c5a6-4d14-b2da-50b85212b0ab 2017-11-04 18:15:00   

            cart_time       checkout_time       purchase_time  
0                 NaT                 NaT                 NaT  
1 2017-01-26 14:44:00 2017-01-26 14:54:00 2017-01-26 15:08:00  
2 2017-08-20 08:31:00                 NaT                 NaT  
3 2017-08-20 08:49:00                 NaT                 NaT  
4                 NaT                 NaT                 NaT  
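
The chained `.merge(...)` calls above can also be written as a fold over a list of steps, which may be easier to extend if the funnel gains more stages. This is an alternative sketch, not the notebook's method; the toy frames and the explicit `on='user_id'` are assumptions.

```python
from functools import reduce
import pandas as pd

# Toy funnel (assumed): three visitors, two reach the cart,
# one reaches checkout and purchases
toy_visits = pd.DataFrame({
    'user_id': ['a', 'b', 'c'],
    'visit_time': pd.to_datetime(['2017-01-01 10:00'] * 3),
})
toy_cart = pd.DataFrame({
    'user_id': ['a', 'b'],
    'cart_time': pd.to_datetime(['2017-01-01 10:05'] * 2),
})
toy_checkout = pd.DataFrame({
    'user_id': ['a'],
    'checkout_time': pd.to_datetime(['2017-01-01 10:10']),
})
toy_purchase = pd.DataFrame({
    'user_id': ['a'],
    'purchase_time': pd.to_datetime(['2017-01-01 10:20']),
})

# Left-merge each step onto the running result, in funnel order
steps = [toy_visits, toy_cart, toy_checkout, toy_purchase]
toy_all = reduce(
    lambda left, right: left.merge(right, on='user_id', how='left'),
    steps,
)
print(toy_all)
```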


**8: What percentage of users proceeded to checkout, but did not purchase a t-shirt?**

In [8]:
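# code for task8
# Numerator: rows that reached checkout (checkout_time is not null).
# Denominator: rows with no purchase (purchase_time is null).
# This ratio is not the share of checkout rows that lack a purchase; that share
# would use the rows with checkout_time set as the denominator.
percentage_checkout_no_purchase = \
    float(len(all_tables_left_merge[all_tables_left_merge.checkout_time.notnull()])) / \
    float(len(all_tables_left_merge[all_tables_left_merge.purchase_time.isnull()]))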
print(percentage_checkout_no_purchase)

0.42992623814541625
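
Task 8 asks about users who reached checkout but did not purchase, so one way to read it is as a conditional share: restrict to rows where `checkout_time` is set, then measure how many of those have a null `purchase_time`. The cell above divides by a different denominator (all rows with no purchase), so the two calculations generally give different numbers. The sketch below uses a made-up `toy_all` frame and does not reproduce the real dataset's value.

```python
import pandas as pd

# Toy merged frame (assumed): three rows reach checkout, two of them purchase
toy_all = pd.DataFrame({
    'user_id': ['a', 'b', 'c', 'd'],
    'checkout_time': pd.to_datetime(
        ['2017-01-01 10:00', '2017-01-02 10:00', '2017-01-03 10:00', None]),
    'purchase_time': pd.to_datetime(
        ['2017-01-01 10:10', None, '2017-01-03 10:20', None]),
})

# Keep only rows that reached checkout
reached_checkout = toy_all[toy_all.checkout_time.notnull()]

# Share of those rows with no purchase
checkout_no_purchase = reached_checkout.purchase_time.isnull().mean()
print(checkout_no_purchase)  # one of three checkout rows has no purchase
```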


**9: Which step of the funnel is weakest (i.e., has the highest percentage of users not completing it)?**

**How might Cool T-Shirts Inc. change their website to fix this problem?**

In [9]:
# code for task9
print(percentage_visit_no_cart)
print(percentage_cart_no_checkout)
print(percentage_checkout_no_purchase)

# Which step of the funnel should be improved?
print('The checkout -> purchase step is the best one to improve: its percentage is high, and giving users more motivation to buy could lower it.')

0.8050682261208577
0.20930232558139536
0.42992623814541625
The checkout -> purchase step is the best one to improve: its percentage is high, and giving users more motivation to buy could lower it.


**10: Use the giant merged DataFrame ```all_data``` that you created (your variable name may differ, depending on what you named it).**
**Let’s calculate the average time from initial visit to final purchase. Start by adding the following column to your DataFrame:**

> all_data['time_to_purchase'] = all_data.purchase_time - all_data.visit_time

In [10]:
# code for task10
all_tables_left_merge['time_to_purchase'] = all_tables_left_merge.purchase_time - all_tables_left_merge.visit_time
print(all_tables_left_merge.head(5))

                                user_id          visit_time  \
0  943647ef-3682-4750-a2e1-918ba6f16188 2017-04-07 15:14:00   
1  0c3a3dd0-fb64-4eac-bf84-ba069ce409f2 2017-01-26 14:24:00   
2  6e0b2d60-4027-4d9a-babd-0e7d40859fb1 2017-08-20 08:23:00   
3  6e0b2d60-4027-4d9a-babd-0e7d40859fb1 2017-08-20 08:23:00   
4  6879527e-c5a6-4d14-b2da-50b85212b0ab 2017-11-04 18:15:00   

            cart_time       checkout_time       purchase_time time_to_purchase  
0                 NaT                 NaT                 NaT              NaT  
1 2017-01-26 14:44:00 2017-01-26 14:54:00 2017-01-26 15:08:00         00:44:00  
2 2017-08-20 08:31:00                 NaT                 NaT              NaT  
3 2017-08-20 08:49:00                 NaT                 NaT              NaT  
4                 NaT                 NaT                 NaT              NaT  
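
Subtracting one datetime column from another yields timedeltas, and any row with a missing timestamp yields `NaT`. The sketch below repeats the arithmetic for row 1 of the output above using standalone `pd.Timestamp` values; the variable names are chosen for illustration.

```python
import pandas as pd

# Timestamps copied from row 1 of the output above
visit = pd.Timestamp('2017-01-26 14:24:00')
purchase = pd.Timestamp('2017-01-26 15:08:00')

# Subtracting two timestamps gives a Timedelta
delta = purchase - visit
print(delta)                       # 0 days 00:44:00
print(delta.total_seconds() / 60)  # 44.0 (minutes)

# A missing timestamp propagates: NaT minus a timestamp is NaT
missing = pd.NaT - visit
print(missing)
```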


**11: Calculate the average time to purchase.**

In [11]:
# code for task11
average_time_to_purchase = all_tables_left_merge.time_to_purchase.mean()
print(average_time_to_purchase)

0 days 00:44:02.672413
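
The result above is a little over 44 minutes. Note that `mean()` skips `NaT` entries by default, so rows without a purchase do not affect the average. A minimal sketch on made-up timedeltas (the `toy_times` values are assumptions) shows this and converts the result to minutes:

```python
import pandas as pd

# Toy timedeltas (assumed): two purchases and one row with no purchase
toy_times = pd.Series(
    [pd.Timedelta(minutes=40), pd.NaT, pd.Timedelta(minutes=48)],
    dtype='timedelta64[ns]',
)

# NaT is skipped, so this averages only the 40- and 48-minute values
toy_average = toy_times.mean()
print(toy_average)

# Convert the Timedelta to minutes for easier reporting
toy_average_minutes = toy_average.total_seconds() / 60
print(toy_average_minutes)
```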
