## Case Study 2: Diagnostic Analysis for Main Metrics Performance

1. Python Developer: Sulaiha Subi
2. Created Date: 24-08-24
3. The goal of Case Study 2 is to optimize Floward's main metrics performance by:

    > Identifying Anomalies: Detecting unusual or unexpected changes in the main metrics over time. This involves pinpointing specific time periods where these anomalies occur, which could indicate potential issues or opportunities.
    
    > Understanding Driving Metrics: Determining which other metrics (from the provided order and marketing datasets) have a significant influence on the main metrics. This analysis will help in understanding the factors contributing to the anomalies and overall performance of the main metrics.
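
To make the first goal concrete, here is a minimal sketch of one common way to flag anomalies in a daily metric: a trailing rolling z-score. It is an illustration under assumptions, not the method this case study necessarily uses. The function name `flag_anomalies`, the 7-day window, the threshold of 3, and the synthetic series are all hypothetical.

```python
import numpy as np
import pandas as pd


def flag_anomalies(series: pd.Series, window: int = 7, threshold: float = 3.0) -> pd.Series:
    """Return a boolean Series marking points far from the trailing rolling mean."""
    # Use only past values (shift by 1) so a spike does not dilute its own baseline.
    past = series.shift(1)
    rolling_mean = past.rolling(window, min_periods=window).mean()
    rolling_std = past.rolling(window, min_periods=window).std()
    z_scores = (series - rolling_mean) / rolling_std
    # Comparisons against NaN (the warm-up period) evaluate to False.
    return z_scores.abs() > threshold


# Synthetic daily metric with one injected spike (illustrative data only).
dates = pd.date_range("2024-02-01", periods=30, freq="D")
values = pd.Series(0.05 + 0.001 * np.sin(np.arange(30)), index=dates)
values.iloc[20] = 0.40

anomalies = flag_anomalies(values)
print(values[anomalies])
```

The window and threshold trade sensitivity against false alarms, so they would need tuning against the real metrics.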

### Step 0: Install Libraries

In [1]:
%pip install pandas numpy matplotlib seaborn scikit-learn statsmodels scipy


Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


### Step 1: Load and Understand the Data

In [6]:
import pandas as pd

# Define the dataset file paths
main_metrics_file_path = '/Users/sulaihasubi/Documents/GitHub/Floward/Dataset/final_main_metrics.csv'
order_metrics_file_path = '/Users/sulaihasubi/Documents/GitHub/Floward/Dataset/final_order_metrics.csv'
marketing_metrics_file_path = '/Users/sulaihasubi/Documents/GitHub/Floward/Dataset/final_marketing_metrics.csv'

# Load the data into pandas dataframes
main_metrics_df = pd.read_csv(main_metrics_file_path)
order_metrics_df = pd.read_csv(order_metrics_file_path)
marketing_metrics_df = pd.read_csv(marketing_metrics_file_path)

# Display the first few rows of each dataframe to understand the structure
print("Main metrics data:")
display(main_metrics_df.head())

print("\nOrder metrics data:")
display(order_metrics_df.head())

print("\nMarketing metrics data:")
display(marketing_metrics_df.head())

Main metrics data:


| Unnamed: 0 | snapshot_date | main_metrics_3 | main_metrics_2 | main_metrics_1 | android_main_metrics_1 | ios_main_metrics_1 | web_main_metrics_1 | agent_main_metrics_1 | android_main_metrics_2 | ios_main_metrics_2 | ... | agent_main_metrics_3 | new_customer_main_metrics_1 | registered_user_main_metrics_1 | existing_customer_main_metrics_1 | new_customer_main_metrics_2 | registered_user_main_metrics_2 | existing_customer_main_metrics_2 | new_customer_main_metrics_3 | registered_user_main_metrics_3 | existing_customer_main_metrics_3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2024-04-26 | 0.068206 | 0.025122 | 0.026547 | 0.044868 | 0.035319 | 0.004567 | NaN | 0.02994 | 0.032209 | ... | NaN | 0.033368 | 0.051137 | 0.074245 | 0.060686 | 0.090244 | 0.103217 | 0.314336 | 0.093633 | 0.076582 |
| 1 | 2024-06-04 | 0.050436 | 0.057803 | 0.050395 | 0.123288 | 0.053946 | 0.020564 | 0.007371 | 0.143713 | 0.061399 | ... | 0.009585 | 0.025112 | 0.047641 | 0.127215 | 0.055409 | 0.087805 | 0.160188 | 0.206085 | 0.080429 | 0.082295 |
| 2 | 2024-03-30 | 0.420614 | 0.06514 | 0.191866 | 0.08041 | 0.064473 | 0.014183 | 0.364856 | 0.113772 | 0.065677 | ... | 0.185783 | 0.072485 | 0.076789 | 0.533942 | 0.08971 | 0.095122 | 0.177614 | 0.59017 | 0.247252 | 1.0 |
| 3 | 2024-04-08 | 0.136763 | 0.085149 | 0.110285 | 0.090209 | 0.099218 | 0.02707 | 0.062233 | 0.113772 | 0.091344 | ... | 0.054594 | 0.094578 | 0.107718 | 0.258643 | 0.108179 | 0.126829 | 0.209786 | 0.654769 | 0.252145 | 0.254314 |
| 4 | 2024-05-04 | 0.047299 | 0.051801 | 0.044137 | 0.064615 | 0.053189 | 0.016509 | NaN | 0.101796 | 0.058379 | ... | NaN | 0.040181 | 0.0541 | 0.090242 | 0.07124 | 0.085366 | 0.119303 | 0.304272 | 0.143298 | 0.082415 |



Order metrics data:


| Unnamed: 0 | order_date | order_metrics_2 | order_metrics_1 | order_metrics_3 | android_order_metrics_1 | ios_order_metrics_1 | web_order_metrics_1 | agent_order_metrics_1 | android_order_metrics_2 | ios_order_metrics_2 | ... | guest_order_metrics_5_4c | existing_customer_order_metrics_5_4c | registered_user_order_metrics_5_4c | order_metrics_5_4d | ios_order_metrics_5_4d | android_order_metrics_5_4d | web_order_metrics_5_4d | new_customer_order_metrics_5_4d | existing_customer_order_metrics_5_4d | registered_user_order_metrics_5_4d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2024-02-10 | 0.023302 | 0.030501 | 0.110195 | 0.178077 | 0.028842 | 0.064262 | 0.002072 | 0.134454 | 0.020504 | ... | 0.572278 | 0.439305 | 0.022205 | 0.706353 | 0.765346 | 0.575893 | 0.530597 | 0.8 | 0.561925 | 0.871783 |
| 1 | 2024-05-13 | 0.105988 | 0.089589 | 0.056895 | 0.149196 | 0.091034 | 0.070039 | 0.008814 | 0.176471 | 0.103353 | ... | 0.449763 | 0.657122 | 0.477181 | 0.854867 | 0.959752 | 0.487327 | 0.452474 | 0.421053 | 0.815769 | 0.762719 |
| 2 | 2024-08-01 | 0.090704 | 0.06784 | 0.036685 | 0.108641 | 0.071125 | 0.057607 | 0.005783 | 0.092437 | 0.097257 | ... | 0.51733 | 0.595018 | 0.621729 | 0.814913 | 0.910787 | 0.642857 | 0.369403 | 0.2 | 0.901091 | 0.83413 |
| 3 | 2024-04-15 | 0.008519 | 0.008631 | 0.077317 | 0.013046 | 0.004228 | 0.020959 | 0.019494 | 0.0 | 0.007481 | ... | 0.559948 | 0.434962 | 0.367865 | 0.591138 | 0.637635 | 0.642857 | 0.383795 | 0.8 | 0.686638 | 0.602332 |
| 4 | 2024-05-15 | 0.10649 | 0.091166 | 0.059408 | 0.152773 | 0.074921 | 0.040818 | 0.04426 | 0.159664 | 0.085896 | ... | 0.367991 | 0.749189 | 0.566338 | 0.66436 | 0.706241 | 0.52381 | 0.416418 | 0.62 | 0.589111 | 0.8148 |



Marketing metrics data:


| Unnamed: 0 | date | marketing_metrics_1 | marketing_metrics_2 | marketing_metrics_3 | marketing_metrics_4 | marketing_metrics_5 | marketing_metrics_6 |
|---|---|---|---|---|---|---|---|
| 0 | 2024-04-24 | 0.282093 | 0.065923 | 0.073126 | 0.152853 | 0.106035 | 0.034045 |
| 1 | 2024-06-07 | 0.206843 | 0.012596 | 0.049726 | 0.039554 | 0.269299 | 0.027199 |
| 2 | 2024-03-17 | 0.246269 | 0.334942 | 0.195247 | 0.959526 | 0.016029 | 0.083021 |
| 3 | 2024-06-13 | 0.119856 | 0.014389 | 0.072395 | 0.102228 | 0.32363 | 0.111552 |
| 4 | 2024-02-10 | 0.96214 | 0.088417 | 0.136015 | 0.015345 | 0.140563 | 0.058858 |
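
The previews above show two things that matter for time-based analysis: each file has an `Unnamed: 0` column, and the rows are not in date order. The sketch below shows one possible cleanup, assuming `Unnamed: 0` is a leftover row index written when the CSV was saved (not confirmed by the source). The helper name `tidy` is hypothetical, and the inline CSV reproduces only the first three marketing rows and two metric columns.

```python
import io

import pandas as pd

# First rows of the marketing preview, reproduced as inline CSV for illustration.
sample_csv = io.StringIO(
    "Unnamed: 0,date,marketing_metrics_1,marketing_metrics_2\n"
    "0,2024-04-24,0.282093,0.065923\n"
    "1,2024-06-07,0.206843,0.012596\n"
    "2,2024-03-17,0.246269,0.334942\n"
)


def tidy(df: pd.DataFrame, date_col: str) -> pd.DataFrame:
    """Drop the leftover index column, parse dates, and sort chronologically."""
    # Assumption: "Unnamed: 0" carries no information beyond the saved row index.
    df = df.drop(columns=["Unnamed: 0"], errors="ignore")
    df[date_col] = pd.to_datetime(df[date_col])
    return df.sort_values(date_col).set_index(date_col)


marketing_sample = tidy(pd.read_csv(sample_csv), "date")
print(marketing_sample)
```

The same helper could be applied to the other two frames with `snapshot_date` and `order_date` as the date column. Dropping `Unnamed: 0` and moving the date into the index reduces each frame's column count, so any reported shapes would change accordingly.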


In [10]:
# Check the shape of each dataframe
print(f"Main Metrics DataFrame Shape: {main_metrics_df.shape}")
print(f"Order Metrics DataFrame Shape: {order_metrics_df.shape}")
print(f"Marketing Metrics DataFrame Shape: {marketing_metrics_df.shape}")



Main Metrics DataFrame Shape: (183, 25)
Order Metrics DataFrame Shape: (183, 103)
Marketing Metrics DataFrame Shape: (183, 7)
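
All three frames have 183 rows, but equal row counts alone do not show that they cover the same dates, and the main metrics preview contains missing values in the `agent_` columns. The sketch below illustrates two quick checks that could follow. The helper names `date_coverage` and `missing_share` are hypothetical, and the tiny demo frames reuse a few values from the previews in place of the real data.

```python
import numpy as np
import pandas as pd


def date_coverage(frames: dict) -> dict:
    """Map each frame name to (first date, last date, number of unique dates)."""
    summary = {}
    for name, (df, date_col) in frames.items():
        dates = pd.to_datetime(df[date_col])
        summary[name] = (dates.min(), dates.max(), dates.nunique())
    return summary


def missing_share(df: pd.DataFrame) -> pd.Series:
    """Return the fraction of missing values per column, for columns that have any."""
    share = df.isna().mean()
    return share[share > 0].sort_values(ascending=False)


# Stand-in frames for illustration; swap in main_metrics_df and the others.
main_demo = pd.DataFrame({
    "snapshot_date": ["2024-04-26", "2024-06-04", "2024-03-30"],
    "main_metrics_1": [0.026547, 0.050395, 0.191866],
    "agent_main_metrics_1": [np.nan, 0.007371, 0.364856],
})
marketing_demo = pd.DataFrame({
    "date": ["2024-04-24", "2024-06-07", "2024-03-17"],
    "marketing_metrics_1": [0.282093, 0.206843, 0.246269],
})

coverage = date_coverage({
    "main": (main_demo, "snapshot_date"),
    "marketing": (marketing_demo, "date"),
})
missing = missing_share(main_demo)
print(coverage)
print(missing)
```

If the date ranges or unique-date counts differ between frames, a later join on date would silently drop or misalign rows, so it is worth confirming before any correlation or driver analysis.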
