Author: Christopher Hunt Jr.
The film industry aims to optimize its performance by creating movies that resonate with audiences and generate substantial revenue. Navigating the diverse landscape of genres, ratings, and lengths poses challenges in determining the most effective approaches. This project dives into movie data obtained from external APIs, which was subsequently organized in an SQL database, to uncover patterns and insights that can inform decision-making.
External APIs, such as the IMDb API, were used to fetch comprehensive, up-to-date information about movies. The data was then organized and stored in an SQL database for efficient querying and analysis.
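The storage step described above could look roughly like the following. This is a minimal sketch, not the project's actual code: it assumes SQLite as the SQL engine, and the `movies` table, its column names, and the sample records are all hypothetical stand-ins for whatever the API returned.

```python
import sqlite3

# Hypothetical records shaped like an API response.
# Field names and values are illustrative only, not project data.
fetched = [
    {"id": "tt0000001", "title": "Example A", "year": 2000,
     "certification": "PG", "runtime": 95, "revenue": 1200000.0, "rating": 6.4},
    {"id": "tt0000002", "title": "Example B", "year": 2001,
     "certification": "R", "runtime": 121, "revenue": 850000.0, "rating": 7.1},
    # The same movie fetched twice, as can happen with paged API calls.
    {"id": "tt0000001", "title": "Example A", "year": 2000,
     "certification": "PG", "runtime": 95, "revenue": 1200000.0, "rating": 6.4},
]


def store_movies(conn, movies):
    """Create the (assumed) movies table and upsert the fetched records."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS movies (
            id            TEXT PRIMARY KEY,
            title         TEXT,
            year          INTEGER,
            certification TEXT,
            runtime       INTEGER,
            revenue       REAL,
            rating        REAL
        )
        """
    )
    # Named placeholders map directly onto the dictionary keys.
    # INSERT OR REPLACE keeps one row per primary key.
    conn.executemany(
        """
        INSERT OR REPLACE INTO movies
        VALUES (:id, :title, :year, :certification, :runtime, :revenue, :rating)
        """,
        movies,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store_movies(conn, fetched)
row_count = conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
print(row_count)
```

Using the movie identifier as the primary key means a repeated fetch overwrites the existing row instead of creating a duplicate.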
The data used for this analysis spans from the year 2000 to 2001, providing a snapshot of the film industry during that specific time frame.
The analysis relies heavily on external APIs, which allow the dataset to be refreshed and enriched, so the analysis reflects the most recent information those sources provide.
To prepare the data, a cleaning process was performed, followed by Exploratory Data Analysis (EDA).
- Plotted histograms and count plots for all columns during EDA, providing insights into various movie statistics.
- Explored the impact of MPAA ratings on revenue and the relationship between movie length and financial success.
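The cleaning step that preceded EDA involved dropping duplicates, creating new columns, and formatting revenue values. A minimal sketch of those three operations follows. The row layout, the `runtime_hours` and `revenue_display` column names, and the sample values are assumptions made for illustration.

```python
# Hypothetical raw rows; the second entry duplicates the first.
raw_rows = [
    {"id": "m1", "title": "Example A", "runtime": 170, "revenue": 1234567.5},
    {"id": "m1", "title": "Example A", "runtime": 170, "revenue": 1234567.5},
    {"id": "m2", "title": "Example B", "runtime": 88, "revenue": 64000.0},
]


def clean(rows):
    """Drop duplicate ids, add a derived column, and format revenue."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen:
            continue  # keep only the first occurrence of each movie
        seen.add(row["id"])
        new_row = dict(row)
        # Derived column: runtime in hours, convenient for length comparisons.
        new_row["runtime_hours"] = round(row["runtime"] / 60, 2)
        # Display column: revenue with a dollar sign, thousands separators, two decimals.
        new_row["revenue_display"] = f"${row['revenue']:,.2f}"
        cleaned.append(new_row)
    return cleaned


cleaned = clean(raw_rows)
for row in cleaned:
    print(row["title"], row["runtime_hours"], row["revenue_display"])
```

The numeric `revenue` column is left as it is, so later statistics use the raw numbers and the formatted string is used only for display.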
The average ratings of movies across various genres were analyzed to understand the audience's perception of each category. Here are the top genres, ranked by mean rating:
- Talk-Show: 8.25
- Biography: 6.57
- Music: 6.49
- History: 6.37
- Musical: 6.25
These findings provide insights into the genres that tend to receive higher average ratings, reflecting audience preferences based on content and storytelling.
These results complement the overall movie data analysis, offering a comprehensive understanding of both the financial success and audience reception across different genres.
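A ranking by mean rating like the one above can be computed directly in SQL. The sketch below assumes SQLite and a hypothetical two-table layout (`movies` and `movie_genres`), because a movie can belong to several genres. The project's real schema may differ, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed schema: one row per movie, one row per (movie, genre) pair.
    CREATE TABLE movies (id TEXT PRIMARY KEY, rating REAL);
    CREATE TABLE movie_genres (movie_id TEXT, genre TEXT);

    INSERT INTO movies VALUES ('m1', 8.0), ('m2', 6.0), ('m3', 7.0);
    INSERT INTO movie_genres VALUES
        ('m1', 'Biography'),
        ('m2', 'Biography'), ('m2', 'Comedy'),
        ('m3', 'Comedy'),    ('m3', 'Music');
    """
)

query = """
    SELECT g.genre,
           ROUND(AVG(m.rating), 2) AS mean_rating,
           COUNT(*)                AS n_movies
    FROM movie_genres AS g
    JOIN movies AS m ON m.id = g.movie_id
    GROUP BY g.genre
    ORDER BY mean_rating DESC, g.genre
"""
genre_ratings = conn.execute(query).fetchall()
for genre, mean_rating, n_movies in genre_ratings:
    print(f"{genre}: {mean_rating} (n={n_movies})")
```

Reporting `n_movies` next to the mean helps when reading such a ranking, since a genre with only a handful of titles can reach the top on very few observations.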
The analysis delves into the revenue generated by different movie genres, providing insights into the financial performance of each category. The top genres, ranked by total revenue, are as follows:
- History: $210,601,234.50
- Fantasy: $105,999,670.07
- Sci-Fi: $103,452,691.92
- Comedy: $100,810,512.36
- Animation: $64,438,521.25
These results highlight the genres that have historically performed well in terms of revenue.
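When ranking genres by revenue, it matters whether the total or the mean is used, because the two can order genres differently. The following sketch illustrates this with invented numbers. The `movies` and `movie_genres` tables are a hypothetical schema, and SQLite is assumed as the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed schema and made-up revenue figures, for illustration only.
    CREATE TABLE movies (id TEXT PRIMARY KEY, revenue REAL);
    CREATE TABLE movie_genres (movie_id TEXT, genre TEXT);

    INSERT INTO movies VALUES
        ('m1', 300.0), ('m2', 100.0), ('m3', 100.0), ('m4', 100.0), ('m5', 50.0);
    INSERT INTO movie_genres VALUES
        ('m1', 'History'),
        ('m2', 'Comedy'), ('m3', 'Comedy'), ('m4', 'Comedy'), ('m5', 'Comedy');
    """
)

rows = conn.execute(
    """
    SELECT g.genre,
           SUM(m.revenue) AS total_revenue,
           AVG(m.revenue) AS mean_revenue
    FROM movie_genres AS g
    JOIN movies AS m ON m.id = g.movie_id
    GROUP BY g.genre
    """
).fetchall()

# Rank the same aggregates two ways.
by_total = sorted(rows, key=lambda r: r[1], reverse=True)
by_mean = sorted(rows, key=lambda r: r[2], reverse=True)

print("Top by total:", by_total[0][0])
print("Top by mean: ", by_mean[0][0])
```

In this toy data, the genre with many modest releases leads on total revenue, while the genre with a single large release leads on mean revenue.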
- Hypothesis: MPAA rating significantly affects revenue.
- Test: ANOVA.
- Results: Significant revenue difference between 'G' and 'R' rated movies.
- Hypothesis: Movie length affects revenue.
- Test: 2-sample t-test.
- Results: No significant difference in revenue between movies longer than 2.5 hours and movies of 1.5 hours or less.
- Hypothesis: Movie genres have different average ratings.
- Test: ANOVA, Tukey HSD.
- Results: Significant rating differences among genres. Talk-Show, Biography, and Music genres have higher average ratings.
- Analysis: Compared revenue across genres to identify the highest earners.
- Results: "History" emerged with the highest total revenue.
- Dropped duplicates, created new columns, and formatted revenue values for clarity.
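The hypothesis tests listed above (ANOVA, Tukey HSD, and a 2-sample t-test) can be sketched with SciPy as follows. The samples are invented, and the 0.05 threshold is a conventional choice rather than one stated in the analysis. `scipy.stats.tukey_hsd` requires SciPy 1.8 or later. The t-test uses Welch's variant (`equal_var=False`), which is an assumption here, since the analysis only names a 2-sample t-test.

```python
from scipy import stats

ALPHA = 0.05  # conventional significance level (assumed)

# Hypothetical revenue samples, in millions, grouped by MPAA rating.
revenue_by_rating = {
    "G": [120.0, 135.0, 128.0, 142.0, 131.0],
    "PG-13": [80.0, 95.0, 88.0, 102.0, 91.0],
    "R": [40.0, 55.0, 48.0, 62.0, 51.0],
}
labels = list(revenue_by_rating.keys())
groups = list(revenue_by_rating.values())

# One-way ANOVA: does mean revenue differ across any of the groups?
anova = stats.f_oneway(*groups)
print(f"ANOVA p-value: {anova.pvalue:.4g}")

# Tukey HSD: which specific pairs of groups differ?
tukey = stats.tukey_hsd(*groups)
g_vs_r_p = tukey.pvalue[labels.index("G"), labels.index("R")]
print(f"G vs R p-value: {g_vs_r_p:.4g}")

# Two-sample t-test on hypothetical revenue for long and short movies.
long_movies = [50.0, 70.0, 60.0, 80.0, 65.0]   # longer than 2.5 hours
short_movies = [55.0, 68.0, 62.0, 75.0, 66.0]  # 1.5 hours or less
ttest = stats.ttest_ind(long_movies, short_movies, equal_var=False)
print(f"t-test p-value: {ttest.pvalue:.4g}")

rating_effect = anova.pvalue < ALPHA
length_effect = ttest.pvalue < ALPHA
```

ANOVA only indicates that at least one group differs. A pairwise statement, such as a difference between 'G' and 'R' rated movies, comes from a post-hoc comparison like Tukey HSD.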
This analysis is based on data fetched from external APIs, organized in an SQL database, and covering the years 2000 to 2001. It provides insights into the factors influencing movie revenue and ratings.
For any further inquiries or information, please contact:
- Christopher Hunt
- LinkedIn Profile: Christopher Hunt Jr.
- Email Address: cjhunt592.1@gmail.com


