### 2/28/2022

Below is a daily table of store accounts at Shopify (an online e-commerce and retail platform). The table is called store_account and its columns are:  

| Column Name | Data Type | Description |
|:--|:--|:--|
| store_id | integer | a unique Shopify store id  |
| date | string | date  |
| status | string | Possible values are: open, closed, fraud  |
| revenue | double | Amount of spend in USD  |

Here's some additional information about the table:
   - The granularity of the table is store_id and day  
   - Assume “closed” and “fraud” are permanent labels  
   - Active = daily revenue > 0   
   - Once Shopify labels an account as fraudulent, that account can no longer sell products   
   - Every day of the table has every store_id that has ever used Shopify  
   
Some clarifications:
   - We want one value for each day in the month   
   - A store can be fraudulent and active on the same day. E.g. it could generate revenue until 10AM, then be flagged as fraudulent from 10AM onward  
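
To make the definitions above concrete, here is a minimal sketch for a single day. The four rows are invented purely for illustration (they are not real Shopify data), and the variable names `sample`, `is_active`, and `is_active_fraud` are assumptions of this example.

```python
import pandas as pd

# Hypothetical rows for one day; all values are made up for illustration.
sample = pd.DataFrame({
    "store_id": [1, 2, 3, 4],
    "date": ["2022-02-01"] * 4,
    "status": ["open", "fraud", "fraud", "closed"],
    "revenue": [120.0, 35.5, 0.0, 0.0],
})

# Active = daily revenue > 0
is_active = sample["revenue"] > 0
# A store can be both active and fraudulent on the same day
is_active_fraud = is_active & (sample["status"] == "fraud")

n_active = int(is_active.sum())              # stores 1 and 2
n_active_fraud = int(is_active_fraud.sum())  # store 2 only
pct_active_fraud = 100 * n_active_fraud / n_active
print(n_active, n_active_fraud, pct_active_fraud)
```

Store 3 is fraudulent but earned nothing that day, so it is excluded from both the numerator and the denominator.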
   

**Given the above, write code using Python (Pandas library) to show what percent of active stores were fraudulent by day.** 

In [2]:
import pandas as pd

In [60]:
# create the store_account table (an empty placeholder with the expected columns; load the real data here)
store_account = pd.DataFrame(columns = ['store_id', 'date', 'status', 'revenue'])

In [63]:
# calculate the total number of active stores and the number of those active stores that are fraudulent per day
result = store_account.groupby('date').apply(
    lambda x: pd.Series(dict(
        n_active_stores = (x.revenue > 0).sum(),
        n_active_fraud_stores = ((x.revenue > 0) & (x.status == 'fraud')).sum()
    ))).reset_index()

In [56]:
# calculate the percent of active stores that are fraudulent, rounded to two decimal places
# (a day with no active stores gives 0/0, which pandas reports as NaN)
result['percent_active_fraud_stores'] = (100 * result['n_active_fraud_stores'] / result['n_active_stores']).round(2)

In [58]:
# drop the columns containing counts
result = result.drop(columns = ['n_active_fraud_stores', 'n_active_stores'])

In [None]:
# final dataframe with columns for day and percent of active stores that were fraudulent
result
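
Because the table above is created empty, the cells produce no visible output. As a rough end-to-end check, the sketch below runs the same idea on a small invented data set, using boolean flag columns and a grouped sum rather than `groupby().apply()`. The function name `percent_active_fraud_by_day` and all sample rows are hypothetical, and the result is expressed on a 0 to 100 scale.

```python
import numpy as np
import pandas as pd

def percent_active_fraud_by_day(df):
    """Percent (0 to 100) of active stores that are fraudulent, per day."""
    active = df["revenue"] > 0
    flags = pd.DataFrame({
        "date": df["date"],
        "active": active.astype(int),
        "active_fraud": (active & (df["status"] == "fraud")).astype(int),
    })
    counts = flags.groupby("date")[["active", "active_fraud"]].sum()
    # A day with no active stores has no defined percentage; keep it as NaN.
    pct = 100 * counts["active_fraud"] / counts["active"].replace(0, np.nan)
    return pct.round(2).rename("percent_active_fraud_stores").reset_index()

# Invented sample: three stores over three days.
sample = pd.DataFrame({
    "store_id": [1, 2, 3] * 3,
    "date": ["2022-02-01"] * 3 + ["2022-02-02"] * 3 + ["2022-02-03"] * 3,
    "status": ["open", "fraud", "open",
               "open", "fraud", "open",
               "closed", "fraud", "fraud"],
    "revenue": [100.0, 50.0, 25.0,
                80.0, 0.0, 20.0,
                0.0, 0.0, 0.0],
})

out = percent_active_fraud_by_day(sample)
print(out)
```

On this sample, the first day has three active stores of which one is fraudulent, the second day has two active stores and no active fraud, and the third day has no active stores at all, so its value stays undefined (NaN) rather than being reported as zero.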