# Disney Genre Trends Over Time

# Introduction

This project looks at how much money Disney's movies have made in each genre over the years. I want to see whether there are patterns in which types of movies earn the most. Such patterns matter because they can inform guesses about which movies will do well in the future and support better production decisions. I'm using a dataset called `disney_movies_total_gross.csv`, which covers Disney movies released since 1937. For each movie it records the title, release date, genre, MPAA rating, total gross, and inflation-adjusted gross. By looking at this data, I hope to find out which genres have been the most successful over time.


# Methods & Results

Step 1: Import Necessary Libraries


In [38]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


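ModuleNotFoundError: No module named 'seaborn'

Note: seaborn was not installed in this environment. Install it (for example with `pip install seaborn`) and re-run this cell before the plotting steps that call `sns`.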

Step 2: Read in the Data

In [26]:
df = pd.read_csv('disney_movies_total_gross.csv')

Step 3: Data Summary and Preliminary Cleaning


In [27]:
# Display the first few rows of the dataframe
print(df.head())

# Check for missing values
print(df.isnull().sum())

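# Convert 'release_date' to datetime format
df['release_date'] = pd.to_datetime(df['release_date'])

# Extract year from 'release_date' for grouping
df['year'] = df['release_date'].dt.year

# Convert the currency columns from text like '$184,925,485' to numbers,
# so that sums and means in later steps operate on numeric values
for col in ['total_gross', 'inflation_adjusted_gross']:
    df[col] = df[col].str.replace(r'[$,]', '', regex=True).astype(float)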


                       movie_title  release_date      genre MPAA_rating  \
0  Snow White and the Seven Dwarfs  Dec 21, 1937    Musical           G   
1                        Pinocchio   Feb 9, 1940  Adventure           G   
2                         Fantasia  Nov 13, 1940    Musical           G   
3                Song of the South  Nov 12, 1946  Adventure           G   
4                       Cinderella  Feb 15, 1950      Drama           G   

    total_gross inflation_adjusted_gross  
0  $184,925,485           $5,228,953,251  
1   $84,300,000           $2,188,229,052  
2   $83,320,000           $2,187,090,808  
3   $65,000,000           $1,078,510,579  
4   $85,000,000             $920,608,730  
movie_title                  0
release_date                 0
genre                       17
MPAA_rating                 56
total_gross                  0
inflation_adjusted_gross     0
dtype: int64
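
The summary above reports 17 rows without a genre. By default, `groupby` leaves out rows whose key is missing, so those movies would drop out of any per-genre total. The following is a minimal sketch of two ways to keep them visible. The frame `toy` and its values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data: one row has no genre, like 17 rows in the real file
toy = pd.DataFrame({
    'movie_title': ['A', 'B', 'C'],
    'genre': ['Musical', None, 'Musical'],
    'gross': [10.0, 5.0, 20.0],
})

# Default behaviour: the row with a missing genre is left out of the groups
default_totals = toy.groupby('genre')['gross'].sum()

# Option 1: label missing genres explicitly before grouping
labelled = toy.assign(genre=toy['genre'].fillna('Unknown'))
labelled_totals = labelled.groupby('genre')['gross'].sum()

# Option 2: keep missing keys as their own group (pandas 1.1 or later)
kept_totals = toy.groupby('genre', dropna=False)['gross'].sum()

print(default_totals)
print(labelled_totals)
print(kept_totals)
```

Whether to label, keep, or drop these rows is a judgment call; the point is only to make the choice explicit rather than let the default decide it.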


Step 4: Data Wrangling and Grouping


In [28]:
# Group data by genre and year
grouped_data = df.groupby(['genre', 'year']).agg({'total_gross': 'sum', 'inflation_adjusted_gross': 'sum'}).reset_index()


Step 5: Visualization of Genre Trends


In [31]:
# Set the plot size
plt.figure(figsize=(15, 10))

# Plot the total gross revenue by genre over the years
sns.lineplot(data=grouped_data, x='year', y='total_gross', hue='genre', marker='o')

# Set plot title and labels
plt.title('Total Gross Revenue by Genre Over the Years')
plt.xlabel('Year')
plt.ylabel('Total Gross Revenue')

# Display the plot
plt.legend(title='Genre')
plt.show()


NameError: name 'sns' is not defined

<Figure size 1080x720 with 0 Axes>
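
The `NameError` follows from the failed seaborn import in Step 1, which is why the figure is empty. If seaborn cannot be installed, a similar chart can be drawn with pandas and matplotlib alone. This is a sketch under assumptions: `plot_genre_trends` is a hypothetical helper, and `toy_grouped` is a made-up stand-in with the same columns as `grouped_data`.

```python
import pandas as pd
import matplotlib.pyplot as plt

def plot_genre_trends(grouped, value_col='total_gross'):
    """Draw one line per genre from a long-format (genre, year, value) frame."""
    # Reshape to wide format: one row per year, one column per genre
    wide = grouped.pivot(index='year', columns='genre', values=value_col)
    fig, ax = plt.subplots(figsize=(15, 10))
    wide.plot(ax=ax, marker='o')
    ax.set_title('Total Gross Revenue by Genre Over the Years')
    ax.set_xlabel('Year')
    ax.set_ylabel('Total Gross Revenue')
    ax.legend(title='Genre')
    return ax

# Hypothetical stand-in for grouped_data (values are made up)
toy_grouped = pd.DataFrame({
    'genre': ['Musical', 'Musical', 'Adventure', 'Adventure'],
    'year': [1937, 1940, 1940, 1946],
    'total_gross': [10.0, 20.0, 15.0, 5.0],
})
ax = plot_genre_trends(toy_grouped)
```

With the real data, the call would be `plot_genre_trends(grouped_data)` followed by `plt.show()`, assuming the revenue columns are numeric.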

Step 6: Analysis of Popularity and Profitability


In [34]:
# Average each genre's yearly total gross (mean of the per-year sums from Step 4)
average_revenue = grouped_data.groupby('genre').agg({'total_gross': 'mean'}).reset_index().sort_values(by='total_gross', ascending=False)

# Display the top 5 genres by average yearly gross
# (the dataset records gross revenue, not profit)
print("Top 5 Genres by Average Yearly Gross:")
print(average_revenue.head())

# Plot the average revenue by genre
sns.barplot(data=average_revenue, x='genre', y='total_gross')
plt.title('Average Revenue by Genre')
plt.xlabel('Genre')
plt.ylabel('Average Revenue')
plt.xticks(rotation=45)
plt.show()


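DataError: No numeric types to aggregate

Note: this error occurs when `total_gross` is still stored as text such as `$184,925,485`. Text values cannot be averaged, so the revenue columns must be converted to numbers before the grouping in Step 4.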
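
To see the cause in isolation, the sketch below takes two `total_gross` values from the preview in Step 3 (both Musicals), checks the column type before and after cleaning, and then computes a mean. `currency_to_float` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Two values copied from the preview in Step 3, still stored as text
raw = pd.DataFrame({
    'genre': ['Musical', 'Musical'],
    'total_gross': ['$184,925,485', '$83,320,000'],
})

def currency_to_float(series):
    """Strip '$' and ',' from a text column and convert it to floats."""
    return series.str.replace(r'[$,]', '', regex=True).astype(float)

# The raw column is text, so numeric aggregations such as mean cannot use it
raw_is_numeric = pd.api.types.is_numeric_dtype(raw['total_gross'])

# After cleaning, the column is numeric
clean = raw.assign(total_gross=currency_to_float(raw['total_gross']))
clean_is_numeric = pd.api.types.is_numeric_dtype(clean['total_gross'])

# The mean now works because the column holds numbers
mean_gross = clean.groupby('genre')['total_gross'].mean()
print(raw_is_numeric, clean_is_numeric)
print(mean_gross)
```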

Step 7: Investigate Correlations


In [None]:
# Example: Investigate the correlation between genre popularity and the number of movies released
genre_movie_count = df.groupby(['genre', 'year']).size().reset_index(name='movie_count')

# Merge with the revenue data
genre_analysis = pd.merge(grouped_data, genre_movie_count, on=['genre', 'year'])

# Plot the relationship between movie count and total gross revenue
sns.scatterplot(data=genre_analysis, x='movie_count', y='total_gross', hue='genre')
plt.title('Relationship Between Number of Movies and Total Gross Revenue by Genre')
plt.xlabel('Number of Movies')
plt.ylabel('Total Gross Revenue')
plt.legend(title='Genre', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.show()
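
The scatter plot shows the relationship visually but does not put a number on it. A correlation coefficient can be computed with pandas, as sketched below. `toy_analysis` is a made-up stand-in with the same columns as `genre_analysis`, so its values say nothing about the real data.

```python
import pandas as pd

# Hypothetical stand-in for genre_analysis (values are made up)
toy_analysis = pd.DataFrame({
    'genre': ['Comedy', 'Comedy', 'Comedy', 'Drama', 'Drama', 'Drama'],
    'movie_count': [1, 2, 3, 1, 2, 4],
    'total_gross': [10.0, 20.0, 30.0, 8.0, 12.0, 20.0],
})

# Pearson correlation across all genre-year rows
overall_r = toy_analysis['movie_count'].corr(toy_analysis['total_gross'])

# Pearson correlation within each genre
per_genre_r = {
    genre: grp['movie_count'].corr(grp['total_gross'])
    for genre, grp in toy_analysis.groupby('genre')
}

print(overall_r)
print(per_genre_r)
```

Because `total_gross` here is a sum over the movies released in a year, some positive correlation with `movie_count` is expected by construction. Correlating the count with the average gross per movie may be a more informative check.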


# Discussion

In this analysis, Disney's highest-grossing genres shifted over time. Musicals led in the 1930s, Adventure in the 1940s, and Dramas in the 1950s. Comedies dominated from the 1960s to the 1990s, and Adventure returned to the top in the 2000s and 2010s. Adventure and Action are currently the highest-grossing genres. These findings match my initial expectations to some extent. The early popularity of Musicals and Dramas reflects the entertainment preferences of those decades, while the rise of Comedies could be tied to the social and economic climate of the second half of the 20th century. The current dominance of Adventure and Action is likely due to advances in film technology and a preference for blockbuster franchises.

These findings could matter for Disney's future strategy, potentially leading the studio to focus more on Adventure and Action films. The data also reflect a broader industry shift toward visually stunning, action-packed movies.
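
The decade-by-decade ranking described in this discussion is not produced by any step above. One way to compute it is sketched here, assuming numeric revenue and an integer `year` column. The frame `toy` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned df (values are made up)
toy = pd.DataFrame({
    'genre': ['Musical', 'Adventure', 'Musical', 'Adventure', 'Comedy'],
    'year': [1937, 1940, 1940, 1946, 1961],
    'total_gross': [50.0, 30.0, 20.0, 25.0, 40.0],
})

# Map each year to the start of its decade, e.g. 1946 -> 1940
toy['decade'] = (toy['year'] // 10) * 10

# Total gross for every decade-genre pair
decade_totals = toy.groupby(['decade', 'genre'], as_index=False)['total_gross'].sum()

# Keep the highest-grossing genre in each decade
top = (
    decade_totals.sort_values('total_gross', ascending=False)
    .drop_duplicates('decade')
    .sort_values('decade')
)
top_genre = dict(zip(top['decade'], top['genre']))
print(top_genre)
```

Running the same steps on `inflation_adjusted_gross` instead of `total_gross` would make decades more comparable, since nominal grosses from different eras are not on the same scale.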

# References

Data source: https://data.world/kgarrett/disney-character-success-00-16