### Content-Based Recommendation

The next type of recommendation system we wanted to explore was content-based. Our previous model looked at other users with similar interests and recommended titles that they liked. This system works in the other direction: it takes a movie that you like and, having learned some information about the film, recommends titles that are similar to it.

To do this, we gathered descriptions and genre tags for each film and then used some of Python's natural language processing tools to turn this text into numerical information. We used the following process:

 1. **TF-IDF Vectorization**
   - Short for Term Frequency - Inverse Document Frequency, this method assigns a value to each word based on the number of times it appears in documents. The value takes into account both how many times a word appears in a single description and how commonly it appears across all descriptions. A word receives a high TF-IDF score for a description if it appears many times in that description but is relatively uncommon across all descriptions. This is partially meant to filter out words that are common to movies in general.
   
   
 2. **Cosine Similarity**
  - Once each film is represented by a many-dimensional vector, a common way to determine how 'similar' two films are is to calculate the cosine of the angle between their vectors: the closer it is to 1, the more similar the films.
  
  
 3. **Sorting**
  - Now that we have a measure of similarity between every pair of movies, we can take in a single movie, sort the rest of the movies by how similar they are to our chosen film, and then return the top 10 most similar films.
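
The three steps above can be sketched from scratch with only the standard library. This is a minimal illustration, not the code in content_rec.py: the film titles and descriptions are invented toy data, the function names are hypothetical, and the weighting uses one simple TF-IDF variant (term frequency times `log(N / df)`). In practice a library implementation such as scikit-learn's `TfidfVectorizer` would typically handle tokenization and weighting.

```python
import math
from collections import Counter

# Toy descriptions, invented for illustration (not the project's data).
films = {
    "Space Saga": "space battle alien empire space fleet",
    "Star Patrol": "space fleet alien patrol",
    "Love in Paris": "romance paris love story",
    "Paris Nights": "paris romance night story",
}


def tfidf_vectors(docs):
    """Step 1: return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many descriptions contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # High when frequent in this description but rare overall.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Step 2: cosine of the angle between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(title, top_n=10):
    """Step 3: sort all other films by similarity to the chosen one."""
    titles = list(films)
    vectors = dict(zip(titles, tfidf_vectors(list(films.values()))))
    scores = [
        (other, cosine(vectors[title], vectors[other]))
        for other in titles
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


print(recommend("Space Saga", top_n=2))
```

With this toy data, "Star Patrol" ranks first for "Space Saga" because the two descriptions share several terms, while the two romance titles share none with it and score 0.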
  
  
We have put together a Python class to demonstrate our content-based recommender; its source code can be found in the src folder under the name [content_rec.py](../../src/content_rec.py). Below we initialize the ContentRecommender object and provide some examples of recommendations.

In [1]:
content = ContentRecommender()

NameError: name 'ContentRecommender' is not defined

In [None]:
content.recommend('Sunset Blvd. (a.k.a. Sunset Boulevard) (1950)')

In [2]:
content.recommend('Thor (2011)')

NameError: name 'content' is not defined

In [3]:
content.recommend('Journey 2: The Mysterious Island (2012)')

SyntaxError: invalid syntax (<ipython-input-3-f0aa34879fa7>, line 1)

If you would like to see some random recommendations, we have included the following method to generate suggestions based on random titles:

In [4]:
random_film = content.random_title()
content.recommend(random_film)

NameError: name 'content' is not defined

Our system seems to be working out well! We could further improve the recommendations we are seeing by including more descriptive information. Cast and crew names, for example, might be useful additions.
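
One way cast and crew names could be folded in is to append them to the text that gets vectorized. The sketch below is only an assumption about how that might look: the record layout, the field names (`description`, `genres`, `cast`, `crew`), and the `build_document` helper are all hypothetical and not part of content_rec.py. Names are collapsed into single tokens so that a first name shared by two different people does not count as a match.

```python
def build_document(film):
    """Join description, genres, cast, and crew into one text string.

    `film` is a hypothetical dict; any missing field is simply skipped.
    """
    people = film.get("cast", []) + film.get("crew", [])
    # "Jane Doe" -> "janedoe", so the full name acts as one token.
    name_tokens = [name.replace(" ", "").lower() for name in people]
    parts = [
        film.get("description", ""),
        " ".join(film.get("genres", [])),
        " ".join(name_tokens),
    ]
    return " ".join(part for part in parts if part)


# Invented example record, for illustration only.
example_film = {
    "description": "A detective chases a thief across the city.",
    "genres": ["Crime", "Thriller"],
    "cast": ["Jane Doe", "John Roe"],
    "crew": ["Alex Smith"],
}

print(build_document(example_film))
```

The resulting string can then go through the same TF-IDF and cosine similarity steps as the description-only text.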



## Final Results

We had good success with both our collaborative and content-based recommendation systems, as well as our Flask deployment. Our final collaborative model ended up with a mean absolute error (MAE) of about 0.6, which is not bad on a 5-point rating scale. Our content-based model shows good variety while still picking movies that are similar in genre and description.
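
Assuming the error figure above refers to mean absolute error (MAE), it can be read as "predictions are off by about 0.6 stars on average". The short sketch below shows the calculation on invented ratings; the numbers are toy data chosen for illustration and are not results from our model.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the gap between true and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented ratings on a 5-point scale, for illustration only.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.5, 4.0, 2.5]

# Absolute errors are 0.5, 0.5, 1.0, 0.5, so the mean is 2.5 / 4.
print(mean_absolute_error(actual_ratings, predicted_ratings))  # 0.625
```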

## Future Work

A good place to direct our efforts in the future would be speeding up our model training process so that our app deployment can respond faster. We should also consider combining parts of our content-based and collaborative systems into a hybrid recommender that makes even better recommendations.