# Nearest Neighbor Movie Recommender Tutorial

Welcome! This is an IPython notebook, a way to write and run interactive Python programs.

The notebook has *cells* which contain code. To run a cell, select it and press Shift+Enter.

In [None]:
# Run this cell once when you open the notebook.
%reload_ext autoreload
%autoreload 2
from recommend import *

title2MVec_norm, title2movie = init()

## Activity: Choose your feature weights
The next piece of code determines which features will be used to compare movies. Currently we only have the "year" and "runtime" features turned on. Change the number next to a feature to 1 to use the feature, or 0 to ignore it. 

In fact, you can give each feature any weight you want, such as 0.5 (half as important), 2 (twice as important), or 10 (ten times as important), depending on how important you think that feature is. Just no negative numbers!

When you've selected your features and their weights, run the cell to save the weights, then run the cell below to see the movies that are the nearest neighbors to *Harry Potter and the Goblet of Fire*. 

* Keep tinkering with the feature weights to see how it affects the recommendations. Try to get really good movie recommendations!
* Once you've got good recommendations for Harry Potter, try entering another movie into the cell. Once you've got good recommendations for that movie, check back with Harry Potter. Are the recommendations still good? You want to find feature weights that work well for *all* movies at the same time.
* Look at the challenges on the board!
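
If you are curious how weights can change which movies count as "nearest", here is a minimal sketch. It assumes a weighted Euclidean distance over normalized feature values; the `recommend` module may compute distances differently. The names `toy_movies`, `weighted_distance`, and `nearest_neighbors` are made up for this illustration, and the feature values are invented, not taken from the tutorial's dataset.

```python
import math

# Hypothetical toy data: each movie maps a feature name to a normalized value.
# These numbers are invented for illustration only.
toy_movies = {
    "Movie A": {"year": 0.90, "runtime": 0.50},
    "Movie B": {"year": 0.85, "runtime": 0.90},
    "Movie C": {"year": 0.20, "runtime": 0.52},
}


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; a feature with weight 0 is ignored."""
    total = 0.0
    for feat, w in weights.items():
        if w < 0:
            raise ValueError(f"negative weight for {feat!r}")
        total += w * (a[feat] - b[feat]) ** 2
    return math.sqrt(total)


def nearest_neighbors(title, movies, weights):
    """Return all other titles, sorted from closest to furthest."""
    query = movies[title]
    others = [t for t in movies if t != title]
    return sorted(others, key=lambda t: weighted_distance(query, movies[t], weights))


# Only "year" counts: Movie B (similar year) comes first.
by_year = nearest_neighbors("Movie A", toy_movies, {"year": 1.0, "runtime": 0.0})

# Only "runtime" counts: Movie C (similar runtime) comes first.
by_runtime = nearest_neighbors("Movie A", toy_movies, {"year": 0.0, "runtime": 1.0})
```

In this toy example, switching the weight from "year" to "runtime" flips the order of the two neighbors, which is the same kind of effect you should see when you tinker with `feat2weight`.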


In [None]:
# Make sure that each number is followed by a comma, or you will get an "invalid syntax" error when you run this cell.
# You need to run this cell EVERY TIME you change the feature weights, or your changes won't take effect!

feat2weight = {
    'year': 1.0,
    'runtime': 1.0,
    'rating': 0.0,
    'mpaa': 0.0,
    'votes': 0.0,
    '% votes female': 0.0,
    '% votes non-US': 0.0,
    'age bracket with most votes': 0.0,
    'alcohol/drugs/smoking': 0.0,
    'frightening/intense scenes': 0.0,
    'profanity': 0.0,
    'sex & nudity': 0.0,
    'violence & gore': 0.0,
    'genres': 0.0,
    'countries': 0.0,
    'languages': 0.0,
    'aspect ratio': 0.0,
    'director': 0.0,
    'cast': 0.0,
    'production companies': 0.0,
    'cinematographer': 0.0,
    'original music': 0.0,
    'producer': 0.0,
    'writer': 0.0,
    'keywords': 0.0,
}

In [None]:
get_recommendations("harry potter and the goblet of fire", feat2weight, title2MVec_norm, title2movie)

## Check your overall score

To check the quality of your recommendations on the entire dataset, you can run the cell below to get your overall score. Your score is based on comparing your nearest neighbor recommendations with the recommendations that IMDb provides. 

For example, we have a dataset of 240 movies and *Alien* -> *The Terminator* is a recommendation pair provided by IMDb. We ask your recommender system to rank the whole dataset (except *Alien* itself, so 239 movies) by closeness to *Alien*. Each movie gets a recommendation rank from 0 (the closest movie) to 238 (the furthest movie). Suppose *The Terminator* has a recommendation rank of 58. Then your score for this pair is 58/238 = 24.37%, where 0% is best and 100% is worst. The closer your system ranks *The Terminator* to *Alien*, the lower (and better) your score.

Your overall score is the mean of your scores across all recommendation pairs provided by IMDb.
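
The arithmetic above can be sketched in a few lines. This is only an illustration of the scoring rule as described here, not the actual code behind `get_score`; the helper names `pair_score` and `overall_score` are hypothetical.

```python
def pair_score(rank, dataset_size):
    """Score for one recommendation pair, as a percentage (0 is best, 100 is worst).

    rank is 0 for the closest movie. The query movie itself is excluded from
    the ranking, so the furthest movie has rank dataset_size - 2.
    """
    worst_rank = dataset_size - 2
    return 100.0 * rank / worst_rank


def overall_score(ranks, dataset_size):
    """Mean of the per-pair scores (assumed: one rank per IMDb pair)."""
    scores = [pair_score(r, dataset_size) for r in ranks]
    return sum(scores) / len(scores)


# The Alien -> The Terminator example: rank 58 in a dataset of 240 movies.
alien_score = pair_score(58, 240)
print(f"{alien_score:.2f}%")  # prints 24.37%
```

For instance, if one pair had the best possible rank (0) and another had the worst (238), `overall_score([0, 238], 240)` would come out to 50.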

In [None]:
get_score(feat2weight, title2MVec_norm, title2movie)