This project performs exploratory data analysis on a movie dataset (mymoviedb.csv) to uncover trends in movie popularity, voting patterns, and ratings. The goal is to understand the characteristics of popular and highly-rated movies using Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.
The dataset contains information about popular movies, including:
- Release_Date: Date the movie was released
- Title: Movie title
- Overview: Short plot summary
- Popularity: Popularity score
- Vote_Count: Number of votes the movie received
- Vote_Average: Average rating
- Original: (Possibly original language or production flag - TBD)

Release Date | Title | Popularity | Vote Count | Vote Average |
---|---|---|---|---|
2021-12-15 | Spider-Man: No Way Home | 5083.954 | 8940 | 8.3 |
2022-03-01 | The Batman | 3827.658 | 1151 | 8.1 |
2022-02-25 | No Exit | 2618.087 | 122 | 6.3 |
2021-11-24 | Encanto | 2402.201 | 5076 | 7.7 |
2021-12-22 | The King's Man | 1895.511 | 1793 | 7.0 |
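
As a minimal sketch, the sample rows above can be loaded into a Pandas DataFrame to check the column types and sort by popularity. The rows are inlined so the example runs without the CSV file; `load_movies` is a hypothetical helper name, not something defined by the project.

```python
import io

import pandas as pd

# The five sample rows from the table above, inlined as CSV text.
SAMPLE_CSV = """Release_Date,Title,Popularity,Vote_Count,Vote_Average
2021-12-15,Spider-Man: No Way Home,5083.954,8940,8.3
2022-03-01,The Batman,3827.658,1151,8.1
2022-02-25,No Exit,2618.087,122,6.3
2021-11-24,Encanto,2402.201,5076,7.7
2021-12-22,The King's Man,1895.511,1793,7.0
"""


def load_movies(source):
    """Read movie data and parse Release_Date as a datetime column."""
    return pd.read_csv(source, parse_dates=["Release_Date"])


df = load_movies(io.StringIO(SAMPLE_CSV))

# Inspect the inferred column types and the rows ordered by popularity.
print(df.dtypes)
print(df.sort_values("Popularity", ascending=False))
```

For the full dataset, the same helper could be called as `load_movies("mymoviedb.csv")`, assuming the file sits in the working directory and uses these column names.
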

Objectives:
- Load and clean the dataset
- Visualize distributions and correlations
- Identify patterns in movie ratings and popularity
- Determine factors that influence audience engagement

Tools used:
- Python 3.x
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Jupyter Notebook
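
The load-and-clean and correlation steps might look roughly like the sketch below. It is not the project's actual notebook code. `clean_movies` and the `Release_Year` column are hypothetical names, and the input frame reuses three sample rows plus one duplicated row and one invented placeholder row with an invalid date, so that the cleaning has something to remove.

```python
import pandas as pd


def clean_movies(df):
    """Drop exact duplicates and rows with unusable key fields; add Release_Year."""
    out = df.drop_duplicates().copy()
    # Unparseable dates become NaT so they can be dropped with the other gaps.
    out["Release_Date"] = pd.to_datetime(out["Release_Date"], errors="coerce")
    out = out.dropna(
        subset=["Release_Date", "Popularity", "Vote_Count", "Vote_Average"]
    )
    out["Release_Year"] = out["Release_Date"].dt.year
    return out.reset_index(drop=True)


raw = pd.DataFrame(
    {
        # Second row duplicates the first; last row is an invented placeholder.
        "Release_Date": ["2021-12-15", "2021-12-15", "2022-03-01", "2022-02-25", "not a date"],
        "Title": ["Spider-Man: No Way Home", "Spider-Man: No Way Home", "The Batman", "No Exit", "Placeholder"],
        "Popularity": [5083.954, 5083.954, 3827.658, 2618.087, 1.0],
        "Vote_Count": [8940, 8940, 1151, 122, 0],
        "Vote_Average": [8.3, 8.3, 8.1, 6.3, 0.0],
    }
)

movies = clean_movies(raw)

# Pairwise Pearson correlations between the numeric columns.
corr = movies[["Popularity", "Vote_Count", "Vote_Average"]].corr()
print(movies)
print(corr.round(2))
```

In a notebook, the resulting matrix could be passed to `seaborn.heatmap(corr, annot=True)` to visualize the correlations.
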



- Drama is the most frequent genre in the dataset, appearing more than 14% of the time among 19 other genres.
- 25.5% of the dataset (6520 rows) has a popular vote. Drama again ranks highest among fans, accounting for more than 18.5% of these popular movies.
- Spider-Man: No Way Home has the highest popularity rate in the dataset; its genres are Action, Adventure, and Science Fiction.
- "The united states, thread" has the lowest popularity rate in the dataset; its genres are Music, Drama, War, Sci-Fi, and History.
- 2020 is the year with the most movie releases in the dataset.
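
Findings of this kind could be reproduced with a few Pandas aggregations, sketched below. The sketch assumes a `Genre` column holding comma-separated genre names, which is not among the columns documented above. The rows are toy data invented for illustration, and the helper names are hypothetical. Here the genre share is measured over genre tags; the project may instead have measured it over movies.

```python
import pandas as pd

# Toy rows invented for illustration only; they are not taken from mymoviedb.csv.
toy = pd.DataFrame(
    {
        "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
        "Release_Date": pd.to_datetime(
            ["2019-05-01", "2020-01-10", "2020-07-04", "2021-03-15"]
        ),
        "Popularity": [120.5, 80.0, 310.2, 45.9],
        # Assumed format: genre names separated by ", ".
        "Genre": ["Drama, History", "Action, Drama", "Action, Adventure", "Drama"],
    }
)


def genre_share(df):
    """Percentage of all genre tags that belong to each genre."""
    genres = df["Genre"].str.split(", ").explode()
    return (genres.value_counts(normalize=True) * 100).round(1)


def most_popular(df):
    """Row with the highest Popularity score."""
    return df.loc[df["Popularity"].idxmax()]


def busiest_year(df):
    """Release year that contains the most movies."""
    return int(df["Release_Date"].dt.year.value_counts().idxmax())


shares = genre_share(toy)
top = most_popular(toy)
year = busiest_year(toy)

print(shares)
print(top["Title"], top["Genre"])
print(year)
```
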