## East-Asia Analysis

In [1]:
%matplotlib inline
import os
import pandas as pd
import requests
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import ipywidgets as widgets
from ipywidgets import interact, interactive, Output
import plotly.graph_objects as go
from IPython.display import display
from scipy.stats import gaussian_kde
import plotly.express as px
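from eastasia_functions import (
    prepare_ethnicity_proportion_data,
    plot_temporal_trends,
    create_gender_radar_chart,
    prepare_flourish_bar_race,
    create_age_distribution_violin_plots,
)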

In [2]:
import warnings
warnings.filterwarnings("ignore")

In [3]:
DATA_FOLDER = '/Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/data/final/'

In [4]:
def var_loader(data_folder, mode='hollywood'):
    """Load the eight CSV files for `mode` and return them as a list of DataFrames."""
    # File names, in the order the caller unpacks them
    filenames = [
        f"{mode}_data.csv",
        f"{mode}_data_ethnicity.csv",
        f"{mode}_ethnic_realworld.csv",
        f"male_{mode}_realworld_averages.csv",
        f"female_{mode}_realworld_averages.csv",
        f"bothsexes_{mode}_realworld_averages.csv",
        f"male_{mode}_realworld_proportions.csv",
        f"female_{mode}_realworld_proportions.csv",
    ]
    return [pd.read_csv(os.path.join(data_folder, mode, name)) for name in filenames]

# Load the East-Asia datasets into their respective DataFrames
eastasia_data, eastasia_data_ethnicity, eastasia_ethnic_realworld, \
male_eastasia_realworld_averages, female_eastasia_realworld_averages, \
bothsexes_eastasia_realworld_averages, male_eastasia_realworld_proportions, \
female_eastasia_realworld_proportions = var_loader(DATA_FOLDER, mode="eastasia")

### Ethnicity Analysis

**Disproportionate Representation:**
- **Chinese populations** dominate real-world demographics but are underrepresented in certain genres and periods.
- **Korean** and **Japanese** ethnicities often appear **overrepresented** in the cinema industry, particularly in specific genres like Drama or Science Fiction.

In [5]:
# Inputs
cinema_data = eastasia_data  
real_world_data = eastasia_ethnic_realworld 
output_path = '/Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/eastasia_ethnicity_1996_2012_single_legend.csv'

# Call the function
result_df = prepare_ethnicity_proportion_data(
    cinema_data=cinema_data,
    real_world_data=real_world_data,
    output_path=output_path,
    period='1996-2012'
)

# Display result
result_df.head()

Data saved to: /Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/eastasia_ethnicity_1996_2012_single_legend.csv


| | Ethnicity | Category | Percentage |
|---|---|---|---|
| 0 | Chinese | Real-World | 4.617209 |
| 1 | Chinese | Cinema | 14.600232 |
| 2 | Japanese | Real-World | 34.855235 |
| 3 | Japanese | Cinema | 8.34299 |
| 4 | Koreans | Real-World | 0.286529 |
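
One way to read a long-format table like this is to put the cinema and real-world shares side by side and take their ratio. The sketch below is only illustrative: it assumes the same three columns (`Ethnicity`, `Category`, `Percentage`), uses made-up groups and values, and introduces a hypothetical `Representation_Ratio` column that is not part of the original analysis.

```python
import pandas as pd

# Hypothetical long-format table shaped like result_df (values are illustrative only)
long_df = pd.DataFrame({
    "Ethnicity": ["A", "A", "B", "B"],
    "Category": ["Real-World", "Cinema", "Real-World", "Cinema"],
    "Percentage": [40.0, 10.0, 5.0, 15.0],
})

# One row per ethnicity, one column per category
wide = long_df.pivot(index="Ethnicity", columns="Category", values="Percentage")

# Ratio > 1: overrepresented on screen; ratio < 1: underrepresented
wide["Representation_Ratio"] = wide["Cinema"] / wide["Real-World"]
print(wide)
```

A ratio keeps small groups readable: a group with a 5% real-world share and a 15% screen share stands out even though both percentages are small.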


**Temporal Trends:**
- Representation shifts across time periods (e.g., 1950–1965 vs. 1996–2012), which suggests evolving industry preferences. These shifts may correlate with historical events, global interest in specific cultures, or the rise of particular national film industries (e.g., Korea's cinema boom in the 2000s).

In [6]:
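# Categorize release years into time periods.
# pd.cut bins are right-closed by default, so include_lowest=True keeps films released in 1950.
eastasia_data['Period'] = pd.cut(
    eastasia_data['release'],
    bins=[1950, 1965, 1980, 1995, 2012],
    labels=['1950–1965', '1966–1980', '1981–1995', '1996–2012'],
    include_lowest=True
)
# Count rows per (period, ethnicity), convert to a percentage within each period, then pivot for plotting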
cinema_representation = eastasia_data.groupby(['Period', 'actor_ethnicity_classification']).size().reset_index(name='Movie_Count')
cinema_representation['Proportion'] = cinema_representation.groupby('Period')['Movie_Count'].transform(lambda x: x / x.sum() * 100)
cinema_pivot = cinema_representation.pivot(index='Period', columns='actor_ethnicity_classification', values='Proportion').fillna(0)
output_path = "temporal_trends_with_annotations.html"

plot_temporal_trends(cinema_pivot, output_path)


Plot saved to temporal_trends_with_annotations.html
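
The period binning relies on `pd.cut`, whose intervals are right-closed by default. The sketch below uses a few made-up release years (not taken from the dataset) to show how boundary years are assigned and what `include_lowest=True` changes for the first bin edge.

```python
import pandas as pd

# Illustrative release years sitting on or near the bin edges
years = pd.Series([1950, 1951, 1965, 1966, 2012])
bins = [1950, 1965, 1980, 1995, 2012]
labels = ["1950–1965", "1966–1980", "1981–1995", "1996–2012"]

# Default: intervals are (1950, 1965], (1965, 1980], ... so 1950 itself falls outside
default_cut = pd.cut(years, bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left as well
inclusive_cut = pd.cut(years, bins=bins, labels=labels, include_lowest=True)

print(pd.DataFrame({"year": years, "default": default_cut, "inclusive": inclusive_cut}))
```

With the default settings a film released exactly in 1950 gets no period and silently drops out of any later `groupby`, which is worth checking before comparing period totals.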




### Key Events and Their Impact

1. **Golden Age of Japanese Cinema (1950–1965)**
- Japan dominated East Asian cinema with globally acclaimed films by Kurosawa, Ozu, and Mizoguchi.
- Deep themes and strong post-war industry led to high representation.
- [Reference: Scenario writers and scenario readers in the Golden Age of Japanese cinema](https://www.researchgate.net/publication/307866839_Scenario_writers_and_scenario_readers_in_the_Golden_Age_of_Japanese_cinema)

2. **Rise of Martial Arts Films (1966–1980)**
- Martial arts films, with stars such as Bruce Lee and Jackie Chan, thrived, mostly from Hong Kong studios.
- These films centered on Chinese culture but were not produced in mainland China, which contributed to the underrepresentation of mainland Chinese cinema.
- [Reference: Martial arts and the globalization of US and Asian film industries](https://www.researchgate.net/publication/233694107_Martial_arts_and_the_globalization_of_US_and_Asian_film_industries)

3. **Studio Ghibli’s Success (1981–1995)**
- Ghibli revolutionized animation globally with films like *My Neighbor Totoro* (1988); *Princess Mononoke* (1997) followed shortly after this period.
- Expanded Japan's dominance beyond live-action to animation.
- [Reference: The Myth of Ghibli: The Foundation and Early Industrial History of Studio Ghibli](https://www.researchgate.net/publication/368698265_The_Myth_of_Ghibli_The_Foundation_and_Early_Industrial_History_of_Studio_Ghibli)

4. **Korean Wave (1996–2012)**
- Korea emerged as a global cinema powerhouse with *Oldboy* and *The Host*.
- Government support and global appeal of *Hallyu* boosted representation.
- [Reference: Contemporary Korean Cinema: Challenges and the Transformation of 'Planet Hallyuwood'](https://www.researchgate.net/publication/266347427_Contemporary_Korean_Cinema_Challenges_and_the_Transformation_of_'Planet_Hallyuwood')

*Observations*
- **Chinese Cinema**: Underrepresented due to historical and political challenges.
- **Japanese & Korean Cinema**: Benefited from eras of innovation and global appeal.






### Gender Analysis

**Male Dominance Across Genres:**
In all periods, male representation is higher than female representation, especially in genres like Action/Adventure and Science Fiction.
*1996–2012:* Males dominate most genres, with their highest proportions in Action/Adventure (71.0%) and Thriller/Suspense (67.2%).
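
Per-genre gender shares of this kind can be computed with a row-normalized cross-tabulation. This is a minimal sketch on toy rows, not the computation behind the figures above: `actor_gender` matches the column used later in this notebook, while `genre` is a hypothetical column name and the values are invented.

```python
import pandas as pd

# Toy actor-level rows; `genre` is a hypothetical column name, values are illustrative
toy = pd.DataFrame({
    "genre": ["Action", "Action", "Action", "Action", "Romance", "Romance"],
    "actor_gender": ["Male", "Male", "Male", "Female", "Male", "Female"],
})

# normalize="index" makes each genre row sum to 1; multiply by 100 for percentages
gender_share = pd.crosstab(toy["genre"], toy["actor_gender"], normalize="index") * 100
print(gender_share.round(1))
```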

In [7]:
# File paths for output
radar_html_path = "/Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/radar_plot.html"

create_gender_radar_chart(eastasia_data, radar_html_path)


Radar chart saved at: /Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/radar_plot.html


**Genre-Specific Representation:**
There is a gradual increase in female representation in some genres (e.g., Drama) over time.
Genres like Romance and Horror show relatively higher female proportions compared to other genres.

In [8]:
# Define output path
output_csv_path = "/Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/flourish_bar_race.csv"

# Call the function
prepare_flourish_bar_race(eastasia_data, output_csv_path)

Data saved successfully at: /Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/flourish_bar_race.csv


**Real-World Comparison:**
The real-world population is split roughly 55% female to 45% male, but women are underrepresented in East Asian cinema relative to that share.
This gap persists across all periods and genres, indicating a systemic bias in gender representation.

**Chi-Square Test for Gender Proportions in Cinema vs. Real-World Population**

*Hypotheses*
1. **Null Hypothesis (H₀)**:  
   The gender proportions in East Asian cinema match the real-world population proportions (no significant difference).

2. **Alternative Hypothesis (H₁)**:  
   The gender proportions in East Asian cinema differ significantly from real-world population proportions.

In [9]:
from scipy.stats import chisquare

# Preparing the data for gender proportions in cinema
cinema_gender_proportions = eastasia_data.groupby('actor_gender').size()
cinema_gender_proportions = cinema_gender_proportions / cinema_gender_proportions.sum() * 100 

# Real-world proportions
real_world_proportions = pd.Series({'Male': 45, 'Female': 55})

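# Chi-square goodness-of-fit test
# Note: chisquare is defined on frequency counts; here it receives percentages,
# which is equivalent to assuming a sample of 100 actors.
observed = cinema_gender_proportions[['Male', 'Female']].values
expected = real_world_proportions[['Male', 'Female']].values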

chi_stat, p_value = chisquare(f_obs=observed, f_exp=expected)

chi_stat, p_value


(14.773476633480033, 0.00012122869591680186)

*Results*
- **Chi-Square Statistic**: 14.77  
- **P-Value**: 0.00012  

*Interpretation*
1. **Chi-Square Statistic (14.77)**:  
   This measures the discrepancy between observed (cinema proportions) and expected (real-world proportions) values. A higher value indicates a greater deviation from the expected proportions.

2. **P-Value (0.00012)**:  
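   The p-value is well below the conventional significance threshold of 0.05, so the result is statistically significant. Note that the test was run on percentages rather than raw counts, so the exact statistic and p-value depend on that scaling.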

*Conclusion:*


**Reject the Null Hypothesis (H₀)**: The gender proportions in East Asian cinema are significantly different from real-world gender proportions.
This result is consistent with a **systemic bias in gender representation** within East Asian cinema. Women, despite making up about 55% of the real-world population, are likely underrepresented in cinematic roles.
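
A chi-square goodness-of-fit statistic scales with sample size, so the same proportions give different statistics depending on whether counts or percentages are passed in. The sketch below illustrates this with hypothetical actor counts (640 male, 360 female, invented for the example) and the 45/55 reference split used above. It is not a re-run of the analysis.

```python
import numpy as np
from scipy.stats import chisquare

# Hypothetical actor counts (illustrative only, not from the dataset)
observed_counts = np.array([640, 360])        # [Male, Female]
reference_share = np.array([0.45, 0.55])      # real-world split used in this notebook

# Expected counts must sum to the same total as the observed counts
expected_counts = reference_share * observed_counts.sum()
stat_counts, p_counts = chisquare(f_obs=observed_counts, f_exp=expected_counts)

# Same proportions expressed as percentages (equivalent to n = 100)
observed_pct = observed_counts / observed_counts.sum() * 100
stat_pct, p_pct = chisquare(f_obs=observed_pct, f_exp=reference_share * 100)

print(f"counts:      chi2 = {stat_counts:.2f}, p = {p_counts:.3g}")
print(f"percentages: chi2 = {stat_pct:.2f}, p = {p_pct:.3g}")
```

With 1,000 observations the count-based statistic is ten times the percentage-based one, so reporting the test on raw counts reflects the evidence actually available in the data.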

*REFERENCES*

**Genre-Specific Representation**

Research indicates a gradual increase in female representation in certain genres over time. The book *Women in East Asian Cinema: Gender Representations, Creative Labour and Global Histories* (2023) discusses the evolving roles of women in East Asian film industries, highlighting increased female presence in genres such as Romance and Horror.

[Read more on Oxford Academic](https://academic.oup.com/edinburgh-scholarship-online/book/58312)

**Real-World Comparison**

The underrepresentation of women in cinema, compared to real-world demographics, remains a systemic issue. The same book examines how women's contributions in East Asian cinema have been historically marginalized, reflecting broader societal gender disparities.

[Read more on Oxford Academic](https://academic.oup.com/edinburgh-scholarship-online/book/58312)


### Age Analysis

**1. Dominance of Youth Representation:**

- Across all plots, actors predominantly fall within the 20-40 age range.
- Real-world populations show a more balanced age distribution, with representation across all age ranges, including older age groups.
- The cinema data skews heavily towards younger actors, reflecting industry preferences.

**2. Gender Differences:**

*Females:*
- Strong peak in the 20-30 age group, diminishing significantly past 40.
- This reflects age-related biases, particularly in female casting, where younger women dominate screen time.


*Males:*
- Peaks slightly later (30-40 age group) than females, with a more gradual decline.
- Males have more representation in older age groups compared to females, indicating longer careers.
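
The peak age group per gender can be checked by binning ages into decades and comparing shares. This is a rough sketch on invented rows: `actor_gender` matches the column used earlier in this notebook, while `actor_age` and `age_group` are hypothetical column names.

```python
import pandas as pd

# Toy rows; `actor_age` is a hypothetical column name and the ages are illustrative
toy = pd.DataFrame({
    "actor_gender": ["Female"] * 5 + ["Male"] * 5,
    "actor_age": [22, 25, 28, 34, 45, 27, 33, 36, 38, 52],
})

# Left-closed 10-year bins: [10, 20), [20, 30), ..., [80, 90)
edges = list(range(10, 91, 10))
labels = [f"{lo}-{lo + 9}" for lo in edges[:-1]]
toy["age_group"] = pd.cut(toy["actor_age"], bins=edges, labels=labels, right=False).astype(str)

# Share of each age group within each gender, in percent
share = pd.crosstab(toy["actor_gender"], toy["age_group"], normalize="index") * 100

# Age group holding the largest share for each gender
peak_group = share.idxmax(axis=1)
print(share.round(1))
print(peak_group)
```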


In [10]:
female_realworld = female_eastasia_realworld_averages
male_realworld = male_eastasia_realworld_averages
male_output_path = "/Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/male_violin_plot.html"
female_output_path = "/Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/female_violin_plot.html"

# Call the function
create_age_distribution_violin_plots(
    female_realworld, male_realworld, eastasia_data,
    male_output_path, female_output_path
)

Male violin plot saved at: /Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/male_violin_plot.html
Female violin plot saved at: /Users/zaynebmellouli/MA1/ada-2024-project-advanceddestroyers0fall/src/eastasia_analysis/female_violin_plot.html


The book *Women in East Asian Cinema: Gender Representations, Creative Labour and Global Histories* discusses the marginalization of older women in East Asian film industries. It highlights how cultural norms and industry practices contribute to the preference for younger female actors.  
Source: *Women in East Asian Cinema* ([Oxford Academic](https://academic.oup.com/edinburgh-scholarship-online/book/58312))
