A simple data science code and blog for Data Science nanodegree of Udacity

NoAbdulrahman/data_science_blog

Data science blog

Simple data science code and blog for Data Science nanodegree of Udacity

Project motivation

This is a simple project that investigates a movie dataset and the impact of genres on voting and popularity. Such an investigation can help in many cases, for example when offering promotions or sales on movies that belong to the genres with the highest popularity or average vote. The project aims to answer the following questions:

  1. Which movie has the highest average vote? What are its genres? How many votes did it receive?
  2. Which movie has the highest vote count? What are its genres and its average vote?
  3. What are the 5 most common genres?
  4. Which genre has the highest average vote?
  5. Which genre has the highest popularity?
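As a rough illustration of how questions 1 to 3 could be approached with pandas, here is a minimal sketch. It is not the script from this repository. The column names (`title`, `genres`, `vote_average`, `vote_count`), the `|` genre separator, and the toy rows are all assumptions made for the example.

```python
import pandas as pd

# Toy stand-in for the dataset: titles, values, and column names are hypothetical.
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genres": ["Drama|Comedy", "Documentary", "Action|Drama", "Comedy"],
    "vote_average": [6.5, 8.9, 7.2, 5.8],
    "vote_count": [1200, 40, 9500, 300],
})


def top_row(df, column):
    """Return the row that holds the maximum value of `column`."""
    return df.loc[df[column].idxmax()]


def genre_counts(df, sep="|"):
    """Count how many movies list each genre (one movie may list several)."""
    return df["genres"].str.split(sep).explode().value_counts()


best_rated = top_row(movies, "vote_average")   # question 1
most_voted = top_row(movies, "vote_count")     # question 2
common = genre_counts(movies).head(5)          # question 3

print(best_rated["title"], best_rated["genres"], best_rated["vote_count"])
print(most_voted["title"], most_voted["genres"], most_voted["vote_average"])
print(common)
```

Splitting the genre string and calling `explode` gives one row per movie and genre pair, so a movie with several genres is counted once for each of them.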

Installation

Python 3, pandas, and matplotlib are the only requirements.

File description

  • IMDB 5000 Movie Dataset (21 columns and 10866 rows)
  • Python script that runs simple exploration and analysis on the dataset

Results summary

  • The movie with the highest average vote belongs to the Documentary genre.
  • The movie with the highest vote count belongs to the 'Action', 'Adventure', 'Mystery', 'Science Fiction' and 'Thriller' genres.
  • Drama is the most common genre, followed by Comedy, then Thriller.
  • Movies of the Documentary genre have a higher average vote than those of any other genre.
  • Adventure is the most popular genre, followed by Fiction and Fantasy.
  • The least popular genres are Foreign and Documentary.
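Genre-level figures like the ones above could be derived by averaging the vote and popularity values per genre. The following is a minimal sketch under assumed column names (`genres`, `vote_average`, `popularity`) and an assumed `|` separator. The rows are toy values for illustration only and do not reproduce the results listed here.

```python
import pandas as pd

# Toy rows with hypothetical values; column names are assumptions.
movies = pd.DataFrame({
    "genres": ["Drama|Comedy", "Documentary", "Action|Adventure", "Adventure"],
    "vote_average": [6.0, 8.0, 7.0, 5.0],
    "popularity": [1.0, 0.2, 3.0, 2.0],
})


def genre_means(df, sep="|"):
    """Average vote and popularity per genre, one row per (movie, genre) pair."""
    long = df.assign(genre=df["genres"].str.split(sep)).explode("genre")
    return long.groupby("genre")[["vote_average", "popularity"]].mean()


means = genre_means(movies)
top_voted_genre = means["vote_average"].idxmax()
most_popular_genre = means["popularity"].idxmax()

print(means.sort_values("popularity", ascending=False))
print(top_voted_genre, most_popular_genre)
```

A movie with several genres contributes its values to each of them, which is one possible design choice; weighting by vote count would be another.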

Licensing and Acknowledgements

The dataset source is on Kaggle: https://www.kaggle.com/carolzhangdc/imdb-5000-movie-dataset
