# Requirements


We expect you to use a tree-based model, but the rest of the decisions are up to you.

We will be looking for the following things:

- A clear problem statement and a description of your study's goals, to be included in the final report
- Data from IMDB
- Cleaned and refined data
- Visualization. Plots that describe your data and evaluate your model.
- Tree-based models (use any combination of ensemble techniques: random forests, bagging, boosting).
- A blog post presenting the results of your findings as a report to Netflix, including:
  - a problem statement,
  - summary statistics of the various factors (e.g. year, number of ratings),
  - your model,
  - at least 2 graphics,
  - and your recommendations for next steps.
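
To make the "bagging" idea in the requirements concrete, here is a minimal sketch in plain Python: each base learner is a one-split regression tree (a stump) fitted on a bootstrap sample, and the ensemble prediction is the average of the stumps. The function names and the toy data are illustrative assumptions, not part of the assignment. For the real project a library implementation such as scikit-learn's `RandomForestRegressor` or `BaggingRegressor` is the more likely choice.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single numeric feature.

    Returns (threshold, left_mean, right_mean).
    """
    best = None
    # Candidate thresholds exclude the largest x so the right side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Every x in this sample is identical: fall back to predicting the mean.
        overall = mean(ys)
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_estimators=25, seed=0):
    """Bagging: fit each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    """Average the predictions of all stumps."""
    return mean([predict_stump(stump, x) for stump in stumps])


# Synthetic toy data (not from IMDb): one numeric feature, a rating-like target.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [6.1, 5.9, 6.0, 6.2, 5.8, 8.1, 7.9, 8.0, 8.2, 7.8]

stumps = fit_bagged_stumps(xs, ys)
low = predict_bagged(stumps, 2)
high = predict_bagged(stumps, 9)
print(round(low, 2), round(high, 2))
```

A random forest adds one more ingredient on top of this: each split considers only a random subset of the features. Boosting instead fits the learners sequentially, each one on the errors of the previous ones.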

In [1]:
import pandas as pd
import numpy as np

In [2]:
!pip install imdbpie



In [3]:
from imdbpie import Imdb

# Imdb() alone creates a direct client; anonymize=True proxies the requests instead.
imdb = Imdb(anonymize=True)

In [6]:
imdb.search_for_title("The Dark Knight")

[{u'imdb_id': u'tt0468569', u'title': u'The Dark Knight', u'year': u'2008'},
 {u'imdb_id': u'tt1345836',
  u'title': u'The Dark Knight Rises',
  u'year': u'2012'},
 {u'imdb_id': u'tt2313197',
  u'title': u'Batman: The Dark Knight Returns, Part 1',
  u'year': u'2012'},
 {u'imdb_id': u'tt2166834',
  u'title': u'Batman: The Dark Knight Returns, Part 2',
  u'year': u'2013'},
 {u'imdb_id': u'tt1213819', u'title': u'The Dark Knight', u'year': u'1995'},
 {u'imdb_id': u'tt1774602', u'title': u'The Dark Knight', u'year': u'2008'},
 {u'imdb_id': u'tt2258647', u'title': u'The Dark Knight', u'year': u'2011'},
 {u'imdb_id': u'tt2098632',
  u'title': u'Batman: The Dark Knight',
  u'year': u'2008'},
 {u'imdb_id': u'tt2257218',
  u'title': u'The Dark Knight Retires',
  u'year': u'2013'},
 {u'imdb_id': u'tt1265589', u'title': u'Batman Unmasked', u'year': u'2008'},
 {u'imdb_id': u'tt0486410',
  u'title': u'Legends of the Dark Knight: The History of Batman',
  u'year': u'2005'},
 {u'imdb_id': u'tt0486908
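
The search output above contains several entries titled "The Dark Knight", so a title search alone does not identify a film. The sketch below filters hits by exact title and, optionally, year. The helper name `exact_matches` is hypothetical, and the sample list is a handful of hits copied from the output rather than a live API call.

```python
def exact_matches(results, title, year=None):
    """Return the hits whose title (and, if given, year) match exactly."""
    # The search results store the year as a string, so compare as strings.
    wanted_year = None if year is None else str(year)
    return [
        hit for hit in results
        if hit["title"] == title
        and (wanted_year is None or hit["year"] == wanted_year)
    ]


# A few hits copied from the search output above.
hits = [
    {"imdb_id": "tt0468569", "title": "The Dark Knight", "year": "2008"},
    {"imdb_id": "tt1345836", "title": "The Dark Knight Rises", "year": "2012"},
    {"imdb_id": "tt1213819", "title": "The Dark Knight", "year": "1995"},
    {"imdb_id": "tt1774602", "title": "The Dark Knight", "year": "2008"},
    {"imdb_id": "tt2098632", "title": "Batman: The Dark Knight", "year": "2008"},
]

same_title = exact_matches(hits, "The Dark Knight")
same_title_2008 = exact_matches(hits, "The Dark Knight", year=2008)
print([hit["imdb_id"] for hit in same_title_2008])
```

Even title plus year leaves two candidates here, which is why the later cells look titles up by `imdb_id` rather than by name.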

In [7]:
imdb.search_for_person("Christian Bale")

[{u'imdb_id': u'nm0000288', u'name': u'Christian Bale'},
 {u'imdb_id': u'nm7635250', u'name': u'Christian Balenciaga'},
 {u'imdb_id': u'nm3577667', u'name': u'Christian Bales'},
 {u'imdb_id': u'nm0160081', u'name': u'Roger Christian'},
 {u'imdb_id': u'nm2530201', u'name': u'Christian Bavle'},
 {u'imdb_id': u'nm1484525', u'name': u'Christian Balz'},
 {u'imdb_id': u'nm4569701', u'name': u'Christian A. Bayle'},
 {u'imdb_id': u'nm1491308', u'name': u'Pale Christian Thomas'},
 {u'imdb_id': u'nm6721313', u'name': u'Jean-Christian Bayle'},
 {u'imdb_id': u'nm6322979', u'name': u'Christian P. Beale'},
 {u'imdb_id': u'nm6338587', u'name': u'Christian Gayle'},
 {u'imdb_id': u'nm5858218', u'name': u'Christian Ball'},
 {u'imdb_id': u'nm3748638', u'name': u'David Christian Ball'},
 {u'imdb_id': u'nm1677412', u'name': u'Christian Haley'}]

In [8]:
title = imdb.get_title_by_id("tt0468569")
title.title

u'The Dark Knight'

In [9]:
title.rating

9

In [10]:
title.certification

u'PG-13'

In [11]:
person = imdb.get_person_by_id("nm0000151")
person.name

u'Morgan Freeman'

In [12]:
imdb.top_250()

[{u'can_rate': True,
  u'image': {u'height': 1388,
   u'url': u'https://images-na.ssl-images-amazon.com/images/M/MV5BODU4MjU4NjIwNl5BMl5BanBnXkFtZTgwMDU2MjEyMDE@._V1_.jpg',
   u'width': 933},
  u'num_votes': 1718414,
  u'rating': 9.3,
  u'tconst': u'tt0111161',
  u'title': u'The Shawshank Redemption',
  u'type': u'feature',
  u'year': u'1994'},
 {u'can_rate': True,
  u'image': {u'height': 1129,
   u'url': u'https://images-na.ssl-images-amazon.com/images/M/MV5BNTUxOTdjMDMtMWY1MC00MjkxLTgxYTMtYTM1MjU5ZTJlNTZjXkEyXkFqcGdeQXVyNTA4NzY1MzY@._V1_.jpg',
   u'width': 798},
  u'num_votes': 1174533,
  u'rating': 9.2,
  u'tconst': u'tt0068646',
  u'title': u'The Godfather',
  u'type': u'feature',
  u'year': u'1972'},
 {u'can_rate': True,
  u'image': {u'height': 1140,
   u'url': u'https://images-na.ssl-images-amazon.com/images/M/MV5BNDVjZjgxNTgtMGNhMC00YWU0LTg0YTQtNTkxNzBjMDBkNWYyXkEyXkFqcGdeQXVyNTA4NzY1MzY@._V1_.jpg',
   u'width': 800},
  u'num_votes': 804896,
  u'rating': 9,
  u'tconst': u'tt0071
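
The `top_250()` records are nested dicts with the year stored as a string, so they need light cleaning before modelling. The sketch below flattens each record into a row with numeric types and computes a few summary statistics. The helper name `to_row` is an assumption, and the two sample records are the first two entries from the output above, abridged to the fields used here.

```python
from statistics import mean


def to_row(record):
    """Keep the modelling-relevant fields and convert them to numeric types."""
    return {
        "tconst": record["tconst"],
        "title": record["title"],
        "year": int(record["year"]),        # the API returns the year as a string
        "rating": float(record["rating"]),  # whole-number ratings arrive as ints
        "num_votes": int(record["num_votes"]),
    }


# First two records from the top_250() output above, without the image fields.
records = [
    {"num_votes": 1718414, "rating": 9.3, "tconst": "tt0111161",
     "title": "The Shawshank Redemption", "type": "feature", "year": "1994"},
    {"num_votes": 1174533, "rating": 9.2, "tconst": "tt0068646",
     "title": "The Godfather", "type": "feature", "year": "1972"},
]

rows = [to_row(record) for record in records]

summary = {
    "n_titles": len(rows),
    "mean_rating": mean([row["rating"] for row in rows]),
    "min_year": min(row["year"] for row in rows),
    "max_year": max(row["year"] for row in rows),
    "total_votes": sum(row["num_votes"] for row in rows),
}
print(summary)
```

Since pandas is already imported in the notebook, the same rows can be passed to `pd.DataFrame(rows)` to get a table for plotting and model fitting.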