A project for Udacity Data Scientist Nano Degree Project 1, using Kaggle data on highest grossing movies

shrutiturner/highestgrossingmovies

Analysis of Highest Grossing Movies

A project for Udacity Project 1, using Kaggle data on highest grossing movies.

1. Installations

The project runs in Jupyter Notebook. If you do not have the required libraries, install them with pip3 from your terminal/command line:

  • numpy
  • pandas
  • scikit-learn (imported as sklearn)
  • matplotlib
  • seaborn
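
As a quick check before opening the notebook, the sketch below reports which of the libraries listed above are not yet installed. It is an illustration rather than part of the project: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced here, and the only assumption about packaging is that scikit-learn is installed from pip as `scikit-learn` while being imported as `sklearn`.

```python
import importlib.util

# Maps each import name to its pip package name (hypothetical helper data).
# scikit-learn is imported as "sklearn" but installed as "scikit-learn".
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "sklearn": "scikit-learn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for import_name, pip_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-run install command for whatever is absent.
        print("Install with: pip3 install " + " ".join(missing))
    else:
        print("All required libraries are available.")
```

Running the script prints either a single pip3 command covering the missing packages or a confirmation that nothing needs installing.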

2. Project Motivation

The project was created for Project 1 of the Udacity Data Scientist Nanodegree programme, to give me an opportunity to practise working through a data science problem and communicating the findings.

3. File Descriptions

  • BlockbusterAnalysis.ipynb: the Jupyter notebook containing all the code and narrative used to analyse the data
  • blockbusters.csv: the data file from Kaggle

4. How to interact with the project

The project can be viewed as a standalone demonstration of a simple data science project. It can also be used as a reference if you are analysing this data yourself.
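
If you want to start your own analysis of the data, a minimal first step might look like the sketch below. It is not taken from the notebook: `load_blockbusters` and `summarise` are hypothetical helpers, the default path assumes blockbusters.csv sits in the working directory, and no column names are assumed because the file's schema is not described here.

```python
import pandas as pd


def load_blockbusters(path="blockbusters.csv"):
    """Read the Kaggle data file into a DataFrame (path is an assumption)."""
    return pd.read_csv(path)


def summarise(df):
    """Return basic facts about a DataFrame without assuming its columns."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        # Count of missing values per column, as plain Python ints.
        "missing": {col: int(n) for col, n in df.isna().sum().items()},
    }


if __name__ == "__main__":
    data = load_blockbusters()
    print(summarise(data))
    print(data.head())
```

Checking the row count, column names, and missing values first makes it easier to decide which columns need cleaning before any plotting or modelling.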

5. Licensing, Authors, Acknowledgements, etc.

The data licensing is as stated on Kaggle: "The raw data was taken from a crowdflower dataset. The irrelevant columns like Poster URLs and Date of Release were dropped. The ratings from the original dataset (Rotten Tomatoes freshness and audience scores) were all sheared down to only the IMDb ratings of the movies. If you need the original dataset and want to see how the original data looked like, follow the above link."

The sole author of the work is Shruti Turner.
