## Assignment

You are leading a project to analyze product performance at Stripe. We at Stripe are most interested in how the products are performing and growing as well as how to prioritize our product development efforts to maximize our growth. You have been provided an initial cut of data on a few flagship products, each of which is targeted at a specific user segment. From the data you've been given, please prepare a short presentation detailing your findings.

**Product Usage Table:**

|Label| Description|
| - | :- |  
| `merchant` | This is the unique ID of each Stripe user |
|`date` | Data is aggregated up to the month level for each Stripe user.|
|`product` | This is the Stripe product that the user is using to charge their customers.|
|`event` | This is an action within a product. For more details on how products are used, see "segment details" below|
|`count of events` | Number of times the event occurred for that merchant in that month |
|`usd_amount` |Total amount processed for those events, in USD cents|

**Segment Table:**

This is a mapping of merchant IDs to the user segmentation we have.

|Label| Description|
| - | :- |
|`saas`| These businesses serve SaaS products, which means they primarily charge their customers on a recurring basis (usually monthly). We want them to use our Subscriptions payments product to charge regularly on a time interval.|
|`ecommerce` | These businesses use Stripe's shopping cart product and primarily sell physical or digital goods online. |
|`platforms` | These users are platforms upon which other users can sign up and charge for services through the Stripe API. Examples include ridesharing and delivery services (e.g. Lyft, TaskRabbit, Instacart). |


## Segment Details

### SaaS

SaaS users have two options when they process recurring payments. The recurring payments product allows them to schedule automatically recurring payments on a fixed schedule, but the merchants can also manually create charges on Stripe for their recurring payments. Our hope with the recurring payments product is to make it easy for all users to automate their payments. The product was launched in May 2013.

### E-Commerce Store

Our shopping cart product enables online e-commerce stores to sell goods. We track details on their website around the conversion funnel and actions that customers take. We can see when an item is viewed, added to the cart, when the checkout flow is initiated, and when it is completed with a payment submitted.

### Platforms

Our Marketplace product allows platforms to charge on behalf of other users and pay out funds to each end automatically.



### Questions to guide thinking:

1. How is each of Stripe's products and segments performing, and where are they headed?

2. Are there any issues with the products that we should address?

3. Given more time and access to more data, what would you want to dig deeper on?

4. How should we prioritize development for different products, given our limited resources?




In [2]:
# Write your code here
import os
import pandas as pd
import numpy as np

In [66]:
product = pd.read_csv("/Users/rosiebai/Downloads/product_usage.csv")
product = product[['Merchant', 'Date', 'Product', 'Event', 'Count of events', 'Usd Amount']]
product = product.dropna(how = 'all')
segmentation = pd.read_csv("/Users/rosiebai/Downloads/segmentation.csv")

In [73]:
product.head(20)

Unnamed: 0,Merchant,Date,Product,Event,Count of events,Usd Amount
0,282t1vpldi,01/01/2013,Basic API,Charge,33.0,329967.0
1,282t1vpldi,01/02/2013,Basic API,Charge,17.0,169983.0
2,282t1vpldi,01/03/2013,Basic API,Charge,20.0,199980.0
3,282t1vpldi,01/04/2013,Basic API,Charge,21.0,209979.0
4,282t1vpldi,01/05/2013,Recurring,Subscription.Charge,23.0,229977.0
5,282t1vpldi,01/06/2013,Recurring,Subscription.Charge,21.0,209979.0
6,282t1vpldi,01/07/2013,Recurring,Subscription.Charge,25.0,249975.0
7,282t1vpldi,01/08/2013,Recurring,Subscription.Charge,26.0,259974.0
8,282t1vpldi,01/09/2013,Recurring,Subscription.Charge,25.0,249975.0
9,282t1vpldi,01/10/2013,Recurring,Subscription.Charge,26.0,259974.0


In [68]:
segmentation.head()

Unnamed: 0,Merchant,Segment
0,282t1vpldi,SaaS
1,2x5fpa2a9k9,SaaS
2,39rrckrzfr,SaaS
3,3r5r60f6r,Platform
4,4p36czyqfr,Platform


In [69]:
data.Event.unique()

array(['Charge', 'Subscription.Charge', 'Marketplace.Charge',
       'Cart.AddItem', 'Cart.Checkout', 'Cart.PaymentSubmit',
       'Cart.ViewItem', nan], dtype=object)

In [70]:
data['product_event'] = data['Product']+'-'+data['Event']

In [71]:
data.product_event.unique()

array(['Basic API-Charge', 'Recurring-Subscription.Charge',
       'Marketplaces-Marketplace.Charge', 'Cart-Cart.AddItem',
       'Cart-Cart.Checkout', 'Cart-Cart.PaymentSubmit',
       'Cart-Cart.ViewItem', nan], dtype=object)

In [6]:
data = product.merge(segmentation, on = "Merchant", how = 'left')

In [7]:
data.head()

Unnamed: 0,Merchant,Date,Product,Event,Count of events,Usd Amount,Segment
0,282t1vpldi,01/01/2013,Basic API,Charge,33.0,329967.0,SaaS
1,282t1vpldi,01/02/2013,Basic API,Charge,17.0,169983.0,SaaS
2,282t1vpldi,01/03/2013,Basic API,Charge,20.0,199980.0,SaaS
3,282t1vpldi,01/04/2013,Basic API,Charge,21.0,209979.0,SaaS
4,282t1vpldi,01/05/2013,Recurring,Subscription.Charge,23.0,229977.0,SaaS
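
A left merge keeps every usage row, but it does not report merchants that have no segment, nor duplicated merchants in the segment table. The sketch below shows one way to check both with `validate` and `indicator`. The toy frames and the merchant IDs `m1` to `m3` are made up for illustration; they are not taken from the real files.

```python
import pandas as pd

# Toy stand-ins for the two tables; the IDs below are hypothetical.
product_demo = pd.DataFrame({
    "Merchant": ["m1", "m1", "m2", "m3"],
    "Count of events": [10, 12, 5, 7],
})
segmentation_demo = pd.DataFrame({
    "Merchant": ["m1", "m2"],
    "Segment": ["SaaS", "Platform"],
})

# validate="many_to_one" raises if a merchant appears more than once in the
# segment table, which would otherwise duplicate usage rows in the result.
merged_demo = product_demo.merge(
    segmentation_demo,
    on="Merchant",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Rows tagged "left_only" are usage rows without a segment mapping.
unmatched = merged_demo.loc[merged_demo["_merge"] == "left_only", "Merchant"].unique()
print(list(unmatched))
```

If the list is empty on the real data, every usage row has a segment and the per-segment totals cover all events.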


In [9]:
# the performance metrics at product level and segment level
data.groupby('Product')['Count of events'].sum().reset_index(name='total events')

Unnamed: 0,Product,total events
0,Basic API,54261.0
1,Cart,11943599.0
2,Marketplaces,522102.0
3,Recurring,123992.0


In [16]:
data.groupby('Segment')['Count of events'].sum().reset_index(name='total events')

Unnamed: 0,Segment,total events
0,E-Commerce Store,11943599.0
1,Platform,522102.0
2,SaaS,178253.0


In [85]:
data.groupby(['Product','Segment'])['Count of events'].sum().reset_index(name='total events').sort_values(ascending=False, by = 'total events')

Unnamed: 0,Product,Segment,total events
1,Cart,E-Commerce Store,11943599.0
2,Marketplaces,Platform,522102.0
3,Recurring,SaaS,123992.0
0,Basic API,SaaS,54261.0


In [15]:
data.groupby('Product')['Usd Amount'].sum().reset_index(name = 'total amount')

Unnamed: 0,Product,total amount
0,Basic API,86477670.0
1,Cart,558510700.0
2,Marketplaces,1531997000.0
3,Recurring,48848470.0


In [17]:
data.groupby('Segment')['Usd Amount'].sum().reset_index(name = 'total amount')

Unnamed: 0,Segment,total amount
0,E-Commerce Store,558510700.0
1,Platform,1531997000.0
2,SaaS,135326100.0


In [84]:
data.groupby(['Product','Segment'])['Usd Amount'].sum().reset_index(name='total amount').sort_values(by = 'total amount', ascending=False)

Unnamed: 0,Product,Segment,total amount
2,Marketplaces,Platform,1531997000.0
1,Cart,E-Commerce Store,558510700.0
0,Basic API,SaaS,86477670.0
3,Recurring,SaaS,48848470.0


Comments: Cart (E-Commerce Store) had the highest number of events, but Marketplaces (Platform) had the highest processed amount.
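
To relate the two rankings, one option is to look at the average amount per event. The sketch below uses the totals printed above (which appear to be rounded in the display, so the results are approximate) and assumes `Usd Amount` is in cents, as the data dictionary states. Cart is left out because most of its events are funnel steps (views, add-to-cart) rather than charges. The column names `total_events`, `total_amount_cents`, and `avg_usd_per_event` are illustrative.

```python
import pandas as pd

# Totals copied from the grouped outputs above; amounts are in cents.
totals = pd.DataFrame({
    "Product": ["Basic API", "Marketplaces", "Recurring"],
    "total_events": [54261, 522102, 123992],
    "total_amount_cents": [86477670, 1531997000, 48848470],
})

# Average processed amount per event, converted from cents to dollars.
totals["avg_usd_per_event"] = (
    totals["total_amount_cents"] / totals["total_events"] / 100
).round(2)

print(totals.sort_values("avg_usd_per_event", ascending=False))
```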

In [36]:
data.head()

Unnamed: 0,Merchant,Date,Product,Event,Count of events,Usd Amount,Segment,Month,Year,Year-Month
0,282t1vpldi,2013-01-01,Basic API,Charge,33.0,329967.0,SaaS,1.0,2013.0,2013-01
1,282t1vpldi,2013-01-02,Basic API,Charge,17.0,169983.0,SaaS,1.0,2013.0,2013-01
2,282t1vpldi,2013-01-03,Basic API,Charge,20.0,199980.0,SaaS,1.0,2013.0,2013-01
3,282t1vpldi,2013-01-04,Basic API,Charge,21.0,209979.0,SaaS,1.0,2013.0,2013-01
4,282t1vpldi,2013-01-05,Recurring,Subscription.Charge,23.0,229977.0,SaaS,1.0,2013.0,2013-01


In [30]:
data['Date'] = pd.to_datetime(data['Date'])

In [28]:
# performance metrics at monthly level for each product and segment 
print(min(data.Date))
print(max(data.Date))


2013-01-01 00:00:00
2014-01-12 00:00:00
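
The range printed above (1 Jan 2013 to 12 Jan 2014) is worth a second look. The data dictionary says rows are aggregated by month, and the raw strings run `01/01/2013`, `01/02/2013`, `01/03/2013`, which would fit first-of-month dates written day-first. That reading is an assumption, not something the source confirms. The sketch below parses the same strings with both explicit formats so the difference is visible.

```python
import pandas as pd

# Three date strings in the same shape as the raw Date column.
raw_dates = pd.Series(["01/01/2013", "01/05/2013", "01/12/2013"])

# Month-first reading: every row lands in January.
month_first = pd.to_datetime(raw_dates, format="%m/%d/%Y")

# Day-first reading: first-of-month dates spread across the year.
day_first = pd.to_datetime(raw_dates, format="%d/%m/%Y")

print(month_first.dt.strftime("%Y-%m").tolist())  # ['2013-01', '2013-01', '2013-01']
print(day_first.dt.strftime("%Y-%m").tolist())    # ['2013-01', '2013-05', '2013-12']
```

If the day-first reading is the right one, passing `format="%d/%m/%Y"` when converting `Date` would yield one bucket per month instead of only two.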


In [34]:
data['Year-Month'] = data.Date.dt.strftime("%Y-%m")

In [38]:
data.groupby(['Product','Year-Month'])['Count of events'].sum().reset_index(name = 'total events')

Unnamed: 0,Product,Year-Month,total events
0,Basic API,2013-01,8389.0
1,Basic API,2014-01,45872.0
2,Cart,2013-01,3327761.0
3,Cart,2014-01,8615838.0
4,Marketplaces,2013-01,232075.0
5,Marketplaces,2014-01,290027.0
6,Recurring,2013-01,18137.0
7,Recurring,2014-01,105855.0


In [43]:
data_2013 = data[data['Year-Month'] == '2013-01']
data_2014 = data[data['Year-Month'] == '2014-01']

In [52]:
data_2013_grouped_by_product = data_2013.groupby(['Product','Year-Month'])['Usd Amount'].sum().reset_index(name = 'total amount')
data_2014_grouped_by_product = data_2014.groupby(['Product','Year-Month'])['Usd Amount'].sum().reset_index(name = 'total amount')
data_grouped_by_product = data_2013_grouped_by_product.merge(data_2014_grouped_by_product, on = 'Product', how = 'inner')
# percent change is measured against the 2013-01 base (total amount_x), not the 2014-01 value
data_grouped_by_product['pct of change'] = round((data_grouped_by_product['total amount_y'] - data_grouped_by_product['total amount_x'])/data_grouped_by_product['total amount_x'],2)
data_grouped_by_product

Unnamed: 0,Product,Year-Month_x,total amount_x,Year-Month_y,total amount_y,pct of change
0,Basic API,2013-01,25507208.0,2014-01,60970458.0,1.39
1,Cart,2013-01,170822087.0,2014-01,387688583.0,1.27
2,Marketplaces,2013-01,674669133.0,2014-01,857328305.0,0.27
3,Recurring,2013-01,12543585.0,2014-01,36304888.0,1.89


In [53]:
data_2013_grouped_by_segment = data_2013.groupby(['Segment','Year-Month'])['Usd Amount'].sum().reset_index(name = 'total amount')
data_2014_grouped_by_segment = data_2014.groupby(['Segment','Year-Month'])['Usd Amount'].sum().reset_index(name = 'total amount')
data_grouped_by_segment = data_2013_grouped_by_segment.merge(data_2014_grouped_by_segment, on = 'Segment', how = 'inner')
# percent change is measured against the 2013-01 base (total amount_x), not the 2014-01 value
data_grouped_by_segment['pct of change'] = round((data_grouped_by_segment['total amount_y'] - data_grouped_by_segment['total amount_x'])/data_grouped_by_segment['total amount_x'],2)
data_grouped_by_segment

Unnamed: 0,Segment,Year-Month_x,total amount_x,Year-Month_y,total amount_y,pct of change
0,E-Commerce Store,2013-01,170822087.0,2014-01,387688583.0,1.27
1,Platform,2013-01,674669133.0,2014-01,857328305.0,0.27
2,SaaS,2013-01,38050793.0,2014-01,97275346.0,1.56


In [41]:
data.groupby(['Segment','Year-Month'])['Count of events'].sum().reset_index(name = 'total events')

Unnamed: 0,Segment,Year-Month,total events
0,E-Commerce Store,2013-01,3327761.0
1,E-Commerce Store,2014-01,8615838.0
2,Platform,2013-01,232075.0
3,Platform,2014-01,290027.0
4,SaaS,2013-01,26526.0
5,SaaS,2014-01,151727.0


In [42]:
data.groupby(['Segment','Year-Month'])['Usd Amount'].sum().reset_index(name = 'total amount')

Unnamed: 0,Segment,Year-Month,total amount
0,E-Commerce Store,2013-01,170822087.0
1,E-Commerce Store,2014-01,387688583.0
2,Platform,2013-01,674669133.0
3,Platform,2014-01,857328305.0
4,SaaS,2013-01,38050793.0
5,SaaS,2014-01,97275346.0


In [54]:
data_2013_grouped_by_product = data_2013.groupby(['Product','Year-Month'])['Count of events'].sum().reset_index(name = 'total events')
data_2014_grouped_by_product = data_2014.groupby(['Product','Year-Month'])['Count of events'].sum().reset_index(name = 'total events')
data_grouped_by_product = data_2013_grouped_by_product.merge(data_2014_grouped_by_product, on = 'Product', how = 'inner')
# percent change is measured against the 2013-01 base (total events_x), not the 2014-01 value
data_grouped_by_product['pct of change'] = round((data_grouped_by_product['total events_y'] - data_grouped_by_product['total events_x'])/data_grouped_by_product['total events_x'],2)
data_grouped_by_product

Unnamed: 0,Product,Year-Month_x,total events_x,Year-Month_y,total events_y,pct of change
0,Basic API,2013-01,8389.0,2014-01,45872.0,4.47
1,Cart,2013-01,3327761.0,2014-01,8615838.0,1.59
2,Marketplaces,2013-01,232075.0,2014-01,290027.0,0.25
3,Recurring,2013-01,18137.0,2014-01,105855.0,4.84


In [55]:
data_2013_grouped_by_segment = data_2013.groupby(['Segment','Year-Month'])['Count of events'].sum().reset_index(name = 'total events')
data_2014_grouped_by_segment = data_2014.groupby(['Segment','Year-Month'])['Count of events'].sum().reset_index(name = 'total events')
data_grouped_by_segment = data_2013_grouped_by_segment.merge(data_2014_grouped_by_segment, on = 'Segment', how = 'inner')
# percent change is measured against the 2013-01 base (total events_x), not the 2014-01 value
data_grouped_by_segment['pct of change'] = round((data_grouped_by_segment['total events_y'] - data_grouped_by_segment['total events_x'])/data_grouped_by_segment['total events_x'],2)
data_grouped_by_segment

Unnamed: 0,Segment,Year-Month_x,total events_x,Year-Month_y,total events_y,pct of change
0,E-Commerce Store,2013-01,3327761.0,2014-01,8615838.0,1.59
1,Platform,2013-01,232075.0,2014-01,290027.0,0.25
2,SaaS,2013-01,26526.0,2014-01,151727.0,4.72


Comments: The SaaS segment and the Recurring product had the largest percentage increase in event count between Jan 2013 and Jan 2014. I would strongly recommend continuing to develop and promote the Recurring product.
The Recurring product was launched in May 2013, but the data as parsed contains only Jan 2013 and Jan 2014, so the launch cannot be evaluated here. With data for the months in between, I would measure the effectiveness of the launch.
Basic API, used by the SaaS segment, generated the fewest events of any product.
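
The period-over-period comparison can also be done in one step by pivoting the monthly totals wide and dividing the change by the earlier month's value. This is a minimal sketch, not the notebook's own method. It reuses the per-product event totals printed above, and the column name `growth_vs_base` is illustrative.

```python
import pandas as pd

# Event totals per product and month, copied from the grouped output above.
monthly = pd.DataFrame({
    "Product": ["Basic API", "Basic API", "Cart", "Cart",
                "Marketplaces", "Marketplaces", "Recurring", "Recurring"],
    "Year-Month": ["2013-01", "2014-01"] * 4,
    "total events": [8389, 45872, 3327761, 8615838,
                     232075, 290027, 18137, 105855],
})

# One row per product, one column per month.
wide = monthly.pivot(index="Product", columns="Year-Month", values="total events")

# Growth relative to the earlier month: (new - old) / old.
wide["growth_vs_base"] = (
    (wide["2014-01"] - wide["2013-01"]) / wide["2013-01"]
).round(2)

print(wide.sort_values("growth_vs_base", ascending=False))
```

Measured this way, a value of 1.0 means the count doubled. Recurring still ranks first, followed by Basic API, Cart, and Marketplaces.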

In [74]:
cart_product = data[data['Product'] == 'Cart']

In [78]:
cart_grouped = cart_product.groupby('product_event')['Count of events'].sum().reset_index(name = 'Total events')
cart_grouped

Unnamed: 0,product_event,Total events
0,Cart-Cart.AddItem,1652213.0
1,Cart-Cart.Checkout,1195557.0
2,Cart-Cart.PaymentSubmit,278835.0
3,Cart-Cart.ViewItem,8816994.0


In [80]:
payment_submission_rate = 278835.0/8816994.0
payment_submission_rate

0.0316247238004245

In [81]:
checkout_rate = 1195557.0/8816994.0
checkout_rate

0.1355968939073793

Comment: Relative to item views, the checkout rate is 14% and the payment submission rate is only 3%. Is there any way to increase the payment submission rate?
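
To see where the funnel loses the most volume, each step can be compared with the step before it as well as with item views. The sketch below assumes the funnel order ViewItem, AddItem, Checkout, PaymentSubmit described under Segment Details, and reuses the event totals printed above. These are ratios of event counts, not of unique shoppers, so they are a rough guide only. The column names `rate_vs_views` and `rate_vs_prev_step` are illustrative.

```python
import pandas as pd

# Cart event totals in assumed funnel order, copied from the output above.
funnel = pd.Series(
    {
        "Cart.ViewItem": 8816994,
        "Cart.AddItem": 1652213,
        "Cart.Checkout": 1195557,
        "Cart.PaymentSubmit": 278835,
    },
    name="events",
)

summary = pd.DataFrame({
    "events": funnel,
    # Share of item views that reach each step.
    "rate_vs_views": (funnel / funnel.iloc[0]).round(3),
    # Share of the previous step that reaches each step (NaN for the first step).
    "rate_vs_prev_step": (funnel / funnel.shift(1)).round(3),
})

print(summary)
```

On these totals, roughly 72% of add-to-cart events are followed by a checkout, but only about 23% of checkouts end in a submitted payment, which points at the checkout flow itself as the step to investigate first.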