# LioJotstar Merger: Data Analysis with Python for Strategic Optimization

## 4. Formulating Key Metrics for Data Overview
This notebook computes the key metrics that summarize the LioCinema and Jotstar datasets: content and user counts, paid, active, and inactive user shares, plan upgrade and downgrade rates, and watch time. Together they give a quick overview of each platform's key characteristics.

### Importing Required Libraries

In [4]:
import pandas as pd

### Loading Required DataFrames from Saved Parquet Files

In [6]:
try:
    jotstar_contents_df = pd.read_parquet('Parquet Data Files/02. Data Wrangling/Jotstar_db/contents.parquet')
    print("Jotstar - Contents table loaded successfully.")
    jotstar_subscribers_df = pd.read_parquet('Parquet Data Files/03. Feature Engineering/Jotstar_db/subscribers.parquet')
    print("Jotstar - Subscribers table loaded successfully.")
    jotstar_content_consumption_df = pd.read_parquet('Parquet Data Files/02. Data Wrangling/Jotstar_db/content_consumption.parquet')
    print("Jotstar - Content Consumption table loaded successfully.")
    liocinema_contents_df = pd.read_parquet('Parquet Data Files/02. Data Wrangling/LioCinema_db/contents.parquet')
    print("LioCinema - Contents table loaded successfully.")
    liocinema_subscribers_df = pd.read_parquet('Parquet Data Files/03. Feature Engineering/LioCinema_db/subscribers.parquet')
    print("LioCinema - Subscribers table loaded successfully.")
    liocinema_content_consumption_df = pd.read_parquet('Parquet Data Files/02. Data Wrangling/LioCinema_db/content_consumption.parquet')
    print("LioCinema - Content Consumption table loaded successfully.")
    print("\nData Loading Complete.")
    
except FileNotFoundError as e:
    print("Error: One or more Parquet files not found. Please check the file paths.")
    print(f"Details: {e}")
except Exception as e:
    print("An error occurred during data import.")
    print(f"Details: {e}")

Jotstar - Contents table loaded successfully.
Jotstar - Subscribers table loaded successfully.
Jotstar - Content Consumption table loaded successfully.
LioCinema - Contents table loaded successfully.
LioCinema - Subscribers table loaded successfully.
LioCinema - Content Consumption table loaded successfully.

Data Loading Complete.


### Computing KPIs for Overall Data Summary

#### 1. Total Content Items

In [9]:
JS_Total_Content_Items = len(jotstar_contents_df['Content ID'])
LC_Total_Content_Items = len(liocinema_contents_df['Content ID'])
print(f"Total Content Items in Jotstar = {JS_Total_Content_Items}")
print(f"Total Content Items in LioCinema = {LC_Total_Content_Items}")

Total Content Items in Jotstar = 2360
Total Content Items in LioCinema = 1250
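
`len(df['Content ID'])` counts rows, so the totals above equal the number of distinct content items only if each `Content ID` appears once. The sketch below shows one way to check that on a toy frame; the toy data and the variable names are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for a contents table (hypothetical data, one ID repeated on purpose).
toy_contents = pd.DataFrame({"Content ID": ["c1", "c2", "c2"]})

row_count = len(toy_contents["Content ID"])           # counts rows, duplicates included
unique_count = toy_contents["Content ID"].nunique()   # counts distinct IDs
has_duplicates = bool(toy_contents["Content ID"].duplicated().any())

print(row_count, unique_count, has_duplicates)
```

If the two counts agree on the real tables, the row count is a safe measure of library size.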


#### 2. Total Users

In [11]:
JS_Total_Users = len(jotstar_subscribers_df['User ID'])
LC_Total_Users = len(liocinema_subscribers_df['User ID'])
print(f"Total Users in Jotstar = {JS_Total_Users}")
print(f"Total Users in LioCinema = {LC_Total_Users}")

Total Users in Jotstar = 44620
Total Users in LioCinema = 183446


#### 3. Paid Users

In [13]:
# Every subscriber whose plan is anything other than "Free" is counted as paid.
JS_Paid_Users = len(jotstar_subscribers_df[jotstar_subscribers_df['New Subscription Plan'] != "Free"])
LC_Paid_Users = len(liocinema_subscribers_df[liocinema_subscribers_df['New Subscription Plan'] != "Free"])
print(f"Total Paid Users in Jotstar = {JS_Paid_Users}")
print(f"Total Paid Users in LioCinema = {LC_Paid_Users}")

Total Paid Users in Jotstar = 31677
Total Paid Users in LioCinema = 63499


#### 4. Paid Users %

In [15]:
JS_Paid_Users_pct = round((JS_Paid_Users/JS_Total_Users) * 100)
LC_Paid_Users_pct = round((LC_Paid_Users/LC_Total_Users) * 100)
print(f"Paid Users account for {JS_Paid_Users_pct}% of the Total Users in Jotstar.")
print(f"Paid Users account for {LC_Paid_Users_pct}% of the Total Users in LioCinema.")

Paid Users account for 71% of the Total Users in Jotstar.
Paid Users account for 35% of the Total Users in LioCinema.
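
The percentage KPIs in this notebook all follow the same pattern: a count divided by a total, multiplied by 100, and rounded to a whole number. As a minimal sketch, the hypothetical helper `pct_of_total` (not part of the original notebook) wraps that pattern and guards against an empty table. The inputs are the counts printed above.

```python
def pct_of_total(part: int, total: int) -> int:
    """Return `part` as a whole-number percentage of `total` (hypothetical helper)."""
    if total == 0:
        # Assumption: report 0% for an empty table instead of raising ZeroDivisionError.
        return 0
    return round(part / total * 100)


# Counts taken from the notebook output above.
js_paid_pct = pct_of_total(31677, 44620)
lc_paid_pct = pct_of_total(63499, 183446)
print(js_paid_pct, lc_paid_pct)
```

The same helper could serve the active, inactive, upgrade, and downgrade rates, since each one divides by the platform's total users.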


#### 5. Active Users

In [17]:
# Subscribers with no recorded 'Last Active Date' are counted as active.
JS_Active_Users = len(jotstar_subscribers_df[jotstar_subscribers_df['Last Active Date'].isnull()])
LC_Active_Users = len(liocinema_subscribers_df[liocinema_subscribers_df['Last Active Date'].isnull()])
print(f"Total Active Users in Jotstar = {JS_Active_Users}")
print(f"Total Active Users in LioCinema = {LC_Active_Users}")

Total Active Users in Jotstar = 37968
Total Active Users in LioCinema = 101141


#### 6. Inactive Users

In [19]:
# Subscribers with a recorded 'Last Active Date' are counted as inactive.
JS_Inactive_Users = len(jotstar_subscribers_df[jotstar_subscribers_df['Last Active Date'].notnull()])
LC_Inactive_Users = len(liocinema_subscribers_df[liocinema_subscribers_df['Last Active Date'].notnull()])
print(f"Total Inactive Users in Jotstar = {JS_Inactive_Users}")
print(f"Total Inactive Users in LioCinema = {LC_Inactive_Users}")

Total Inactive Users in Jotstar = 6652
Total Inactive Users in LioCinema = 82305


#### 7. Active Rate (%)

In [21]:
JS_Active_Rate = round((JS_Active_Users/JS_Total_Users) * 100)
LC_Active_Rate = round((LC_Active_Users/LC_Total_Users) * 100)
print(f"Jotstar's Active User Rate is {JS_Active_Rate}%.")
print(f"LioCinema's Active User Rate is {LC_Active_Rate}%.")

Jotstar's Active User Rate is 85%.
LioCinema's Active User Rate is 55%.


#### 8. Inactive Rate (%)

In [23]:
JS_Inactive_Rate = round((JS_Inactive_Users/JS_Total_Users) * 100)
LC_Inactive_Rate = round((LC_Inactive_Users/LC_Total_Users) * 100)
print(f"Jotstar's Inactive User Rate is {JS_Inactive_Rate}%.")
print(f"LioCinema's Inactive User Rate is {LC_Inactive_Rate}%.")

Jotstar's Inactive User Rate is 15%.
LioCinema's Inactive User Rate is 45%.
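
Because `Last Active Date` is either null or not null for every subscriber row, the active and inactive counts should add up to the total user count on each platform. The short check below confirms this with the figures printed above; the dictionary names are illustrative only.

```python
# Counts taken from the notebook output above.
totals = {"Jotstar": 44620, "LioCinema": 183446}
active = {"Jotstar": 37968, "LioCinema": 101141}
inactive = {"Jotstar": 6652, "LioCinema": 82305}

for platform, total in totals.items():
    # Every subscriber row is either active or inactive, so the two counts must add up.
    assert active[platform] + inactive[platform] == total, platform

print("Active + inactive counts match total users on both platforms.")
```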


#### 9. Upgraded Users

In [25]:
JS_Upgraded_Users = len(jotstar_subscribers_df[jotstar_subscribers_df['Plan Change Type'] == "Upgrade"])
LC_Upgraded_Users = len(liocinema_subscribers_df[liocinema_subscribers_df['Plan Change Type'] == "Upgrade"])
print(f"Total Upgraded Users in Jotstar = {JS_Upgraded_Users}")
print(f"Total Upgraded Users in LioCinema = {LC_Upgraded_Users}")

Total Upgraded Users in Jotstar = 4348
Total Upgraded Users in LioCinema = 4155


#### 10. Upgrade Rate (%)

In [27]:
JS_Upgrade_Rate = round((JS_Upgraded_Users/JS_Total_Users) * 100)
LC_Upgrade_Rate = round((LC_Upgraded_Users/LC_Total_Users) * 100)
print(f"Jotstar's Upgrade Rate is {JS_Upgrade_Rate}%.")
print(f"LioCinema's Upgrade Rate is {LC_Upgrade_Rate}%.")

Jotstar's Upgrade Rate is 10%.
LioCinema's Upgrade Rate is 2%.


#### 11. Downgraded Users

In [29]:
JS_Downgraded_Users = len(jotstar_subscribers_df[jotstar_subscribers_df['Plan Change Type'] == "Downgrade"])
LC_Downgraded_Users = len(liocinema_subscribers_df[liocinema_subscribers_df['Plan Change Type'] == "Downgrade"])
print(f"Total Downgraded Users in Jotstar = {JS_Downgraded_Users}")
print(f"Total Downgraded Users in LioCinema = {LC_Downgraded_Users}")

Total Downgraded Users in Jotstar = 2742
Total Downgraded Users in LioCinema = 20859


#### 12. Downgrade Rate (%)

In [31]:
JS_Downgrade_Rate = round((JS_Downgraded_Users/JS_Total_Users) * 100)
LC_Downgrade_Rate = round((LC_Downgraded_Users/LC_Total_Users) * 100)
print(f"Jotstar's Downgrade Rate is {JS_Downgrade_Rate}%.")
print(f"LioCinema's Downgrade Rate is {LC_Downgrade_Rate}%.")

Jotstar's Downgrade Rate is 6%.
LioCinema's Downgrade Rate is 11%.


#### 13. Total Watch Time (hrs)

In [33]:
JS_Total_Watch_Time_mins = jotstar_content_consumption_df['Total Watch Time (mins)'].sum()
JS_Total_Watch_Time_hrs = round(JS_Total_Watch_Time_mins/60, 2)
print(f"Total Watch Time for Jotstar: {round(JS_Total_Watch_Time_hrs / 1_000_000)} Million hours")
LC_Total_Watch_Time_mins = liocinema_content_consumption_df['Total Watch Time (mins)'].sum()
LC_Total_Watch_Time_hrs = round(LC_Total_Watch_Time_mins/60, 2)
print(f"Total Watch Time for LioCinema: {round(LC_Total_Watch_Time_hrs / 1_000_000)} Million hours")

Total Watch Time for Jotstar: 16 Million hours
Total Watch Time for LioCinema: 11 Million hours


#### 14. Average Watch Time (hrs)

In [35]:
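# Mean of 'Total Watch Time (mins)' across the rows of the content consumption table, converted to hours below.
JS_Average_Watch_Time_mins = jotstar_content_consumption_df['Total Watch Time (mins)'].mean()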
JS_Average_Watch_Time_hrs = round(JS_Average_Watch_Time_mins/60, 2)
print(f"Average Watch Time for Jotstar: {round(JS_Average_Watch_Time_hrs, 1)} hours")
LC_Average_Watch_Time_mins = liocinema_content_consumption_df['Total Watch Time (mins)'].mean()
LC_Average_Watch_Time_hrs = round(LC_Average_Watch_Time_mins/60, 2)
print(f"Average Watch Time for LioCinema: {round(LC_Average_Watch_Time_hrs, 1)} hours")

Average Watch Time for Jotstar: 117.2 hours
Average Watch Time for LioCinema: 25.6 hours
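
The averages above are means over the rows of each content consumption table. If a table holds more than one row per user, a per-row mean and a per-user mean can differ. The sketch below illustrates the difference on toy data; the presence of a `User ID` column in the consumption tables and all values shown are assumptions made for illustration.

```python
import pandas as pd

# Toy consumption table (hypothetical): user u1 has two rows, user u2 has one.
toy_consumption = pd.DataFrame(
    {
        "User ID": ["u1", "u1", "u2"],
        "Total Watch Time (mins)": [60, 120, 180],
    }
)

# Per-row mean, as computed in the cell above: (60 + 120 + 180) / 3 minutes.
per_row_mean_hrs = toy_consumption["Total Watch Time (mins)"].mean() / 60

# Per-user mean: total minutes per user first, then the mean across users.
per_user_mean_hrs = (
    toy_consumption.groupby("User ID")["Total Watch Time (mins)"].sum().mean() / 60
)

print(per_row_mean_hrs, per_user_mean_hrs)
```

Which of the two is the right "average watch time" depends on how the consumption tables are keyed, which this notebook does not state.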


#### 15. Upgrade vs. Downgrade Rate (%)

In [37]:
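# Upgraded users as a percentage of downgraded users; values above 100% mean more upgrades than downgrades.
JS_Upgrade_vs_Downgrade_Rate = round((JS_Upgraded_Users/JS_Downgraded_Users) * 100)
LC_Upgrade_vs_Downgrade_Rate = round((LC_Upgraded_Users/LC_Downgraded_Users) * 100)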
print(f"For Jotstar, the Upgrade vs. Downgrade Rate is approximately {JS_Upgrade_vs_Downgrade_Rate}%.")
print(f"For LioCinema, the Upgrade vs. Downgrade Rate is approximately {LC_Upgrade_vs_Downgrade_Rate}%.")

For Jotstar, the Upgrade vs. Downgrade Rate is approximately 159%.
For LioCinema, the Upgrade vs. Downgrade Rate is approximately 20%.
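
With the KPIs computed, a side-by-side table can make the two platforms easier to compare. The sketch below is one possible way to collect a selection of the printed values into a single DataFrame. The name `kpi_summary` and the choice of rows are assumptions; the numbers are copied from the outputs above rather than recomputed.

```python
import pandas as pd

# KPI values copied from the notebook outputs above; nothing is recomputed here.
kpi_summary = pd.DataFrame(
    {
        "Jotstar": [2360, 44620, 31677, 71, 85, 10, 6, 159],
        "LioCinema": [1250, 183446, 63499, 35, 55, 2, 11, 20],
    },
    index=[
        "Total Content Items",
        "Total Users",
        "Paid Users",
        "Paid Users (%)",
        "Active Rate (%)",
        "Upgrade Rate (%)",
        "Downgrade Rate (%)",
        "Upgrade vs. Downgrade Rate (%)",
    ],
)

print(kpi_summary)
```

In the notebook itself, the literal values could be replaced by the corresponding `JS_*` and `LC_*` variables so the table stays in sync with the data.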


### Upcoming Section: 'Data Analysis & Visualization' across seven objectives

1. Content Library
2. Subscribers
3. Inactivity
4. Upgrades
5. Downgrades
6. Engagement
7. Revenue

## Next Notebook: "5. Exploring Content Diversity Across Platforms"