KatherineCol/Correlation-Analysis.-Python

Correlation Analysis

Movie Correlation and Exploratory Data Analysis (EDA)

ABOUT DATA SET

The dataset contains 6820 movies (220 movies per year, 1986-2016). To establish the main points of interest in the data set, we used Exploratory Data Analysis (EDA) and correlation methods.

We explore the dataset through some visualizations to answer the following questions.

  • What are the top 5 movies by gross revenue?
  • Which stars made the most movies in this period? And which directors?
  • Which directors generated the most revenue?
  • What are the best movies by score?
  • What is the volume of movies coming out per year?
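Each of the questions above maps to a short aggregation. The following is a minimal sketch using pandas, not the repository's actual notebook code: the column names (`name`, `year`, `star`, `director`, `score`, `gross`) and all rows are hypothetical placeholders chosen only to illustrate the operations.

```python
import pandas as pd

# Hypothetical toy rows; column names and values are assumptions, not the real dataset.
movies = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F"],
    "year": [2014, 2014, 2015, 2015, 2015, 2016],
    "star": ["S1", "S2", "S1", "S3", "S1", "S2"],
    "director": ["D1", "D2", "D1", "D2", "D3", "D1"],
    "score": [7.1, 6.4, 8.2, 5.9, 7.7, 6.8],
    "gross": [120.0, 45.0, 310.0, 80.0, 150.0, 60.0],
})

# Top 5 movies by gross revenue.
top_by_gross = movies.nlargest(5, "gross")[["name", "gross"]]

# Number of movies per star and per director, most prolific first.
movies_per_star = movies["star"].value_counts()
movies_per_director = movies["director"].value_counts()

# Total revenue generated per director, highest first.
gross_by_director = (
    movies.groupby("director")["gross"].sum().sort_values(ascending=False)
)

# Best movies by score.
top_by_score = movies.nlargest(5, "score")[["name", "score"]]

# Volume of movies released per year.
movies_per_year = movies.groupby("year").size()
```

Each result is a regular pandas object, so it can be passed straight to a plotting call (for example `movies_per_year.plot(kind="bar")`) to produce the visualizations.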

FINDINGS

In our EDA it was clear that, throughout the last decade, average revenue was notably stable, even with a few peaks (e.g., 2015, the year of Star Wars VII, Jurassic World, Avengers: Age of Ultron, etc.). But in 2020, when cinemas and studios shut down, the industry's revenue fell by ~89%.
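A year-over-year comparison of this kind can be sketched as a group-by followed by a percentage change. The numbers below are made-up toy values, not the dataset's figures, and the column names `year` and `gross` are assumptions; the sketch only shows how such a drop would be computed.

```python
import pandas as pd

# Hypothetical toy values; they do not reproduce the figures quoted in the text.
yearly = pd.DataFrame({
    "year": [2018, 2018, 2019, 2019, 2020, 2020],
    "gross": [100.0, 140.0, 110.0, 130.0, 10.0, 20.0],
})

# Average gross revenue per release year.
mean_gross = yearly.groupby("year")["gross"].mean()

# Percentage change of the yearly average relative to the previous year.
pct_change = mean_gross.pct_change() * 100
```

The first year has no predecessor, so its percentage change is `NaN`; a large negative value in a later year marks a drop like the one described for 2020.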

Correlation: This analysis shows that variables such as company, director, star, and the country where a movie is released have little to no correlation with actual revenue, whereas budget and the number of votes a movie gets seem to have a greater impact on its earnings. Runtime also shows some correlation with budget, as longer films tend to cost more to produce. Votes and budget have the highest correlations with gross; company and gross, by contrast, have a low correlation.
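One common way to compare numeric and categorical columns in a single correlation matrix is to replace each category with an integer code and then compute Pearson correlations. The sketch below assumes that approach and uses hypothetical column names and toy values; it is not necessarily how the repository computed its results. Note that integer codes for nominal categories are arbitrary, so a correlation computed on them is only a rough check.

```python
import pandas as pd

# Hypothetical toy rows; column names and values are assumptions.
movies = pd.DataFrame({
    "company": ["C2", "C1", "C3", "C3", "C1", "C2"],
    "budget": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "votes": [1000, 2500, 2900, 4200, 5100, 5800],
    "runtime": [90, 95, 110, 105, 120, 130],
    "gross": [25.0, 38.0, 70.0, 85.0, 95.0, 130.0],
})

# Replace each categorical column with integer category codes.
categorical_columns = ["company"]
encoded = movies.copy()
for column in categorical_columns:
    encoded[column] = encoded[column].astype("category").cat.codes

# Pairwise Pearson correlation between all columns.
corr = encoded.corr(method="pearson")

# Correlation of every other variable with gross revenue, strongest first.
gross_corr = corr["gross"].drop("gross").sort_values(ascending=False)
```

Reading `gross_corr` from top to bottom ranks the variables by how strongly they move with revenue. The full `corr` matrix can also be drawn as a heatmap to spot pairs such as runtime and budget.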
