![logo.png](https://github.com/interviewquery/takehomes/blob/stripe_1/stripe_1/logo.png?raw=1)



## Assignment

You are leading a project to analyze product performance at Stripe. We at Stripe are most interested in how the products are performing and growing as well as how to prioritize our product development efforts to maximize our growth. You have been provided an initial cut of data on a few flagship products, each of which is targeted at a specific user segment. From the data you've been given, please prepare a short presentation detailing your findings. 

**Product Usage Table:**

|Label| Description|
| - | :- |  
| `merchant` | This is the unique ID of each Stripe user |
|`date` | Data is aggregated up to the month level for each Stripe user.|
|`product` | This is the Stripe product that the user is using to charge their customers.|
|`event` | This is an action within a product. For more details on how products are used, see "segment details" below|
|`count of events` | Number of times the event occurred for that user and product during the month |
|`usd_amount` | Total amount, in US cents, that was processed for that API call |

**Segment Table:**

This is a mapping of merchant IDs to the user segmentation we have.

|Label| Description|
| - | :- | 
|`saas`| These businesses serve SaaS products, which means they primarily charge their customers on a recurring basis (usually monthly). We want them to use our Subscriptions payments product to charge regularly on a time interval.|
|`ecommerce` | These businesses use Stripe's shopping cart product and primarily sell physical or digital goods online. |
|`platforms` | These users are platforms upon which other users can sign up and charge for services through the Stripe API. Examples include ridesharing and delivery services (e.g. Lyft, TaskRabbit, Instacart). |
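
Since the segment table maps merchant IDs to segments, the two tables can be combined with a join on the merchant ID. The following is a minimal sketch on made-up rows; the `Segment` column name and the exact key column in `segmentation.csv` are assumptions, as that file's header is not shown here.

```python
import pandas as pd

# Toy stand-ins for the two CSV files; all values are made up for illustration.
usage = pd.DataFrame({
    "Merchant": ["m1", "m1", "m2"],
    "Product": ["Recurring", "Basic API", "Cart"],
    "Usd Amount": [150000.0, 50000.0, 80000.0],  # cents
})
segments = pd.DataFrame({
    "Merchant": ["m1", "m2"],           # assumed key column name
    "Segment": ["saas", "ecommerce"],   # assumed column name
})

# A left join keeps every usage row. validate="many_to_one" raises an error
# if a merchant ID appears more than once in the segment table, which would
# otherwise silently duplicate usage rows.
joined = usage.merge(segments, on="Merchant", how="left", validate="many_to_one")

# Total processed volume (in cents) per segment.
volume_by_segment = joined.groupby("Segment")["Usd Amount"].sum()
print(volume_by_segment)
```

Using `how="left"` also makes unmapped merchants visible: they show up with a missing `Segment` rather than disappearing from the result.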


## Segment Details

### SaaS

SaaS users have two options when they process recurring payments. The recurring payments product allows them to schedule automatically recurring payments on a fixed schedule, but the merchants can also manually create charges on Stripe for their recurring payments. Our hope with the recurring payments product is to make it easy for all users to automate their payments. The product was launched in May 2013. 
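
One way to track whether SaaS users are moving to the recurring payments product is the share of monthly volume that flows through it. The sketch below uses made-up numbers and assumes that manual charges are recorded under the `Basic API` product and automated ones under `Recurring`; that mapping is an assumption, not something the brief states.

```python
import pandas as pd

# Made-up monthly volumes for illustration only.
saas = pd.DataFrame({
    "Date": ["2013-05", "2013-05", "2013-06", "2013-06"],
    "Product": ["Basic API", "Recurring", "Basic API", "Recurring"],
    "Usd Amount": [900.0, 100.0, 600.0, 400.0],
})

# One row per month, one column per product.
monthly = saas.pivot_table(index="Date", columns="Product",
                           values="Usd Amount", aggfunc="sum", fill_value=0)

# Fraction of the month's volume that went through the Recurring product.
total = monthly["Recurring"] + monthly["Basic API"]
monthly["recurring_share"] = monthly["Recurring"] / total
print(monthly)
```

A rising `recurring_share` after the May 2013 launch would suggest adoption; a flat one would suggest that merchants are still creating charges manually.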

### E-Commerce Store

Our shopping cart product enables online e-commerce stores to sell goods. We track details on their website around the conversion funnel and actions that customers take. We can see when an item is viewed, added to the cart, when the checkout flow is initiated, and when it is completed with a payment submitted. 
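
The funnel described above can be summarized as step-to-step conversion rates. This is a rough sketch with invented counts; apart from `Cart.AddItem`, which appears in the data, the event names are hypothetical placeholders for the view, checkout, and payment steps.

```python
import pandas as pd

# Funnel steps in order. Only "Cart.AddItem" is a name seen in the data;
# the other three are hypothetical placeholders.
funnel_order = ["Cart.ViewItem", "Cart.AddItem", "Cart.Checkout", "Cart.PaymentSubmit"]

# Invented event counts for a single merchant-month.
counts = pd.Series([1000, 400, 200, 150], index=funnel_order, dtype="float64")

# Conversion from each step to the next (the first step has no predecessor).
step_conversion = counts / counts.shift(1)

# End-to-end conversion from first step to completed payment.
overall = counts.iloc[-1] / counts.iloc[0]

print(step_conversion)
print(f"overall conversion: {overall:.2%}")
```

The step with the lowest conversion rate is usually the first place to look for product issues.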

### Platforms 

Our Marketplace product allows platforms to charge on behalf of other users and pay out funds to each end automatically.



### Questions to guide thinking:

1. How is each of Stripe's products and segments performing, and where is it headed?

2. Are there any issues with the products that we should address?

3. Given more time and access to more data, what would you want to dig deeper into?

4. How should we prioritize development for different products, given our limited resources?




In [32]:
!git clone --branch stripe_1 https://github.com/interviewquery/takehomes.git
%cd takehomes/stripe_1
!if [[ $(ls *.zip) ]]; then unzip *.zip; fi
!ls

Cloning into 'takehomes'...
remote: Enumerating objects: 1768, done.
remote: Counting objects: 100% (576/576), done.
remote: Compressing objects: 100% (455/455), done.
remote: Total 1768 (delta 169), reused 481 (delta 120), pack-reused 1192
Receiving objects: 100% (1768/1768), 297.30 MiB | 20.59 MiB/s, done.
Resolving deltas: 100% (619/619), done.
/content/takehomes/stripe_1/takehomes/stripe_1
ls: cannot access '*.zip': No such file or directory
logo.png  product_usage.csv  segmentation.csv  takehomefile.ipynb


In [98]:
# import libraries and show floats with five decimal places
import pandas as pd
import numpy as np
pd.set_option('display.float_format', lambda x: '%.5f' % x)

In [85]:
# load the data
product = pd.read_csv("/content/takehomes/stripe_1/product_usage.csv")
segment = pd.read_csv("/content/takehomes/stripe_1/segmentation.csv")

In [86]:
product.isnull()

Unnamed: 0,Merchant,Date,Product,Event,Count of events,Usd Amount,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 12,Unnamed: 13,Unnamed: 14,Unnamed: 15,Unnamed: 16,Unnamed: 17,Unnamed: 18,Unnamed: 19,Unnamed: 20,Unnamed: 21
0,False,False,False,False,False,False,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
1,False,False,False,False,False,False,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
2,False,False,False,False,False,False,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
3,False,False,False,False,False,False,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
4,False,False,False,False,False,False,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1441,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
1442,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
1443,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
1444,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
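
A full boolean frame is hard to scan. A more compact check, sketched here on a small made-up frame, is to count missing values per column and then drop columns and rows that are entirely empty. The toy column names only mimic the ones in this notebook.

```python
import numpy as np
import pandas as pd

# Small made-up frame with one empty column and one empty row.
raw = pd.DataFrame({
    "Merchant": ["a", "b", np.nan],
    "Usd Amount": [100.0, np.nan, np.nan],
    "Unnamed: 6": [np.nan, np.nan, np.nan],
})

# Number of missing values in each column.
null_counts = raw.isnull().sum()
print(null_counts)

# Drop columns that are entirely empty, then rows that are entirely empty.
cleaned = raw.dropna(axis=1, how="all").dropna(axis=0, how="all")
print(cleaned)
```

Dropping with `how="all"` keeps rows that are only partially missing, so legitimately empty fields are preserved for later inspection.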


In [87]:
product.columns

Index(['Merchant', 'Date', 'Product', 'Event', 'Count of events', 'Usd Amount',
       'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', 'Unnamed: 9', 'Unnamed: 10',
       'Unnamed: 11', 'Unnamed: 12', 'Unnamed: 13', 'Unnamed: 14',
       'Unnamed: 15', 'Unnamed: 16', 'Unnamed: 17', 'Unnamed: 18',
       'Unnamed: 19', 'Unnamed: 20', 'Unnamed: 21'],
      dtype='object')

In [88]:
# keep only the six named columns; the 'Unnamed: N' columns appear to be empty
product = product[['Merchant', 'Date', 'Product', 'Event', 'Count of events', 'Usd Amount']].copy()

In [93]:
# drop rows in which every column is missing (e.g. the trailing blank rows)
product.dropna(axis=0, how='all', inplace=True)

In [99]:
product['Usd Amount'].describe()

count        598.00000
mean     3722130.84783
std      5004896.49185
min         5800.00000
25%       370251.75000
50%      1317444.50000
75%      4854264.75000
max     19976172.00000
Name: Usd Amount, dtype: float64
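
The table description says `usd_amount` is recorded in cents, so the summary above may be easier to read in dollars. A small sketch, using the minimum, median, and maximum from the output above:

```python
import pandas as pd

# Minimum, median, and maximum taken from the describe() output, in cents.
amounts_cents = pd.Series([5800.0, 1317444.5, 19976172.0], name="Usd Amount")

# Convert cents to dollars.
amounts_usd = amounts_cents / 100
print(amounts_usd)
```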

In [124]:
product.groupby('Product')['Usd Amount'].value_counts()

Product    Usd Amount   
Basic API  5800.00000       1
           10000.00000      1
           10500.00000      1
           10989.00000      1
           11700.00000      1
                           ..
Recurring  1433100.00000    1
           1529469.00000    1
           1596402.00000    1
           1632500.00000    1
           1899810.00000    1
Name: Usd Amount, Length: 587, dtype: int64
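
Because nearly every amount is unique, `value_counts` mostly returns counts of 1. An alternative for comparing products, sketched here on a few sample rows, is to aggregate the amounts per product. The `Cart` row with a missing amount is made up to show how missing values behave.

```python
import pandas as pd

# A few sample rows; the Basic API and Recurring amounts appear in the
# output above, while the Cart row is invented for illustration.
sample = pd.DataFrame({
    "Product": ["Basic API", "Basic API", "Recurring", "Cart"],
    "Usd Amount": [5800.0, 10000.0, 1899810.0, float("nan")],
})

# Non-null row count, total, and mean amount (in cents) per product.
summary = sample.groupby("Product")["Usd Amount"].agg(["count", "sum", "mean"])
print(summary)
```

Note that `count` ignores missing values, so a product with no recorded amounts shows a count of 0 rather than being dropped.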

In [104]:
product[product['Usd Amount'].isna()].head()

Unnamed: 0,Merchant,Date,Product,Event,Count of events,Usd Amount
147,8kkxv1xxbt9,01/08/2013,Cart,Cart.AddItem,6877.0,
148,8kkxv1xxbt9,01/09/2013,Cart,Cart.AddItem,6495.0,
149,8kkxv1xxbt9,01/10/2013,Cart,Cart.AddItem,7293.0,
150,8kkxv1xxbt9,01/11/2013,Cart,Cart.AddItem,6435.0,
151,8kkxv1xxbt9,01/12/2013,Cart,Cart.AddItem,8709.0,
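
Two follow-up checks suggest themselves here. The sketch below uses three of the rows shown above and assumes the dates are in day/month/year order (the data is monthly and the values run 01/08 through 01/12, so they look like the first day of consecutive months). It parses the dates and measures, per event, what fraction of rows has no amount.

```python
import pandas as pd

# Three of the rows shown above.
rows = pd.DataFrame({
    "Date": ["01/08/2013", "01/09/2013", "01/10/2013"],
    "Event": ["Cart.AddItem"] * 3,
    "Count of events": [6877.0, 6495.0, 7293.0],
    "Usd Amount": [float("nan")] * 3,
})

# Assumption: dates are day/month/year, i.e. the first day of each month.
rows["Date"] = pd.to_datetime(rows["Date"], format="%d/%m/%Y")

# Fraction of rows with a missing amount, per event type.
missing_by_event = rows["Usd Amount"].isna().groupby(rows["Event"]).mean()
print(missing_by_event)
```

If an event type is missing amounts in every row, the gaps are probably structural (the event does not move money) rather than a data quality problem.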
