<a href="https://colab.research.google.com/github/steimel60/ML/blob/main/DeepLearning/MovieRecs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -Uqq fastbook
import fastbook
fastbook.setup_book()
from fastbook import *
from fastai.vision.all import *


In [2]:
#Get Movie Rating Data
path = untar_data(URLs.ML_100k)
ratings = pd.read_csv(path/'u.data', delimiter='\t',header=None,names=['user','movie','rating','timestamp'])
ratings.head()

Unnamed: 0,user,movie,rating,timestamp
0,196,242,3,881250949
1,186,302,3,891717742
2,22,377,1,878887116
3,244,51,2,880606923
4,166,346,1,886397596


In [3]:
movies = pd.read_csv(path/'u.item', delimiter='|', encoding='latin-1', usecols=(0,1), names=('movie','title'), header=None)
movies.head()

Unnamed: 0,movie,title
0,1,Toy Story (1995)
1,2,GoldenEye (1995)
2,3,Four Rooms (1995)
3,4,Get Shorty (1995)
4,5,Copycat (1995)


In [4]:
#Merge movies and ratings
ratings = ratings.merge(movies)
ratings.head()

Unnamed: 0,user,movie,rating,timestamp,title
0,196,242,3,881250949,Kolya (1996)
1,63,242,3,875747190,Kolya (1996)
2,226,242,5,883888671,Kolya (1996)
3,154,242,3,879138235,Kolya (1996)
4,306,242,5,876503793,Kolya (1996)


In [5]:
from fastai.collab import *
from fastai.tabular.all import *

#Build Data Loaders
dls = CollabDataLoaders.from_df(ratings, item_name='title', bs=64)
dls.show_batch()

Unnamed: 0,user,title,rating
0,542,My Left Foot (1989),4
1,422,Event Horizon (1997),3
2,311,"African Queen, The (1951)",4
3,595,Face/Off (1997),4
4,617,Evil Dead II (1987),1
5,158,Jurassic Park (1993),5
6,836,Chasing Amy (1997),3
7,474,Emma (1996),3
8,466,Jackie Chan's First Strike (1996),3
9,554,Scream (1996),3


In [6]:
#Represent users and movies as matrices
n_users = len(dls.classes['user'])
n_movies = len(dls.classes['title'])
n_factors = 5

user_factors = torch.randn(n_users, n_factors)
movie_factors = torch.randn(n_movies, n_factors)

Our model works by learning a fixed number of latent factors for each user and each movie (a factor might end up capturing genre, a popular actor, etc.). The dot product of a user's factors and a movie's factors then gives a score for how well the movie matches that user's likes and dislikes.
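To make the dot-product idea concrete, here is a minimal plain-Python sketch. The factor values and their meanings (sci-fi, action, old movie) are made up for illustration; a trained model learns its own factors and they are not labeled.

```python
# Hypothetical latent factors, made up for illustration (not learned values).
# Each position is one factor, e.g. [sci-fi, action, old-movie].
user = [0.9, 0.8, -0.6]          # likes sci-fi and action, dislikes old movies
new_scifi = [0.98, 0.9, -0.9]    # a recent sci-fi action film
old_drama = [-0.99, -0.3, 0.8]   # an old, slow drama

def dot(a, b):
    """Dot product: multiply matching factors and add up the results."""
    return sum(x * y for x, y in zip(a, b))

match_scifi = dot(user, new_scifi)   # large positive score: good match
match_drama = dot(user, old_drama)   # negative score: poor match
print(match_scifi, match_drama)
```

A high score means the user's preferences and the movie's characteristics line up factor by factor; a negative score means they point in opposite directions.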

In [7]:
class DotProduct(Module):
  def __init__(self, n_users, n_movies, n_factors, y_range=(0,5.5)):
    self.user_factors = Embedding(n_users, n_factors)
    self.movie_factors = Embedding(n_movies, n_factors)
    self.y_range = y_range
  
  def forward(self, x):
    #forward runs whenever the model is called on a batch x of (user, movie) index pairs
    users = self.user_factors(x[:,0])
    movies = self.movie_factors(x[:,1])
    return sigmoid_range((users * movies).sum(dim=1), *self.y_range)

In [8]:
model = DotProduct(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3)

epoch,train_loss,valid_loss,time
0,1.005721,0.999734,00:06
1,0.885945,0.905953,00:06
2,0.693833,0.876002,00:06
3,0.484503,0.874067,00:06
4,0.369077,0.877741,00:06


This model can be improved by accounting for how positively or negatively a user tends to rate in general, and for how good or bad a movie is overall. We can capture both by adding a bias term for each user and each movie.
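As a rough sketch of how a bias shifts a prediction, the plain-Python stand-in below adds a user bias and a movie bias to the dot-product score and then squashes the result into the rating range. It assumes `sigmoid_range` computes `sigmoid(x) * (hi - lo) + lo`; the `predict` helper and all numbers are hypothetical.

```python
import math

def sigmoid_range(x, lo, hi):
    """Squash any real number into the interval (lo, hi) with a sigmoid."""
    return 1 / (1 + math.exp(-x)) * (hi - lo) + lo

def predict(dot, user_bias, movie_bias, y_range=(0, 5.5)):
    """Dot-product score plus both biases, mapped into the rating range."""
    return sigmoid_range(dot + user_bias + movie_bias, *y_range)

# Hypothetical numbers: same taste match, different movie bias.
neutral = predict(dot=0.5, user_bias=0.0, movie_bias=0.0)
well_liked = predict(dot=0.5, user_bias=0.0, movie_bias=1.0)
disliked = predict(dot=0.5, user_bias=0.0, movie_bias=-1.0)
print(disliked, neutral, well_liked)
```

With the taste match held fixed, a positive movie bias raises the predicted rating and a negative one lowers it, and the prediction always stays inside the chosen range.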

In [9]:
class DotProductBias(Module):
  def __init__(self, n_users, n_movies, n_factors, y_range=(0,5.5)):
    self.user_factors = Embedding(n_users, n_factors)
    self.user_bias = Embedding(n_users, 1)
    self.movie_factors = Embedding(n_movies, n_factors)
    self.movie_bias = Embedding(n_movies, 1)
    self.y_range = y_range

  def forward(self, x):
    #forward runs whenever the model is called on a batch x of (user, movie) index pairs
    users = self.user_factors(x[:,0])
    movies = self.movie_factors(x[:,1])
    res = (users * movies).sum(dim=1, keepdim=True)
    res += self.user_bias(x[:,0]) + self.movie_bias(x[:,1])
    return sigmoid_range(res, *self.y_range)

In [10]:
model = DotProductBias(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5,5e-3)

epoch,train_loss,valid_loss,time
0,0.928226,0.941926,00:07
1,0.821354,0.864699,00:07
2,0.61655,0.869891,00:06
3,0.410764,0.890642,00:07
4,0.292861,0.897089,00:07


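Our training loss improved, but the final validation loss got worse (0.897 versus 0.878 without biases), and within this run it bottomed out after epoch 1 and then rose - we are overfitting. Weight decay can help. The idea is to add the sum of the squared weights, scaled by a coefficient wd, to the loss function, which encourages the optimizer to keep the weights small. Large weights allow sharper, more complex functions, which fit the training set closely but tend to generalize poorly.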
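The penalty can be sketched in a few lines of plain Python. This shows the "add the squared weights to the loss" view described above; the weights, data loss, and helper names are hypothetical, and in practice an optimizer may apply the equivalent decay step directly to the weights rather than computing the penalty.

```python
def l2_penalty(weights, wd):
    """Weight-decay term: wd times the sum of squared weights."""
    return wd * sum(w * w for w in weights)

def l2_penalty_grad(weights, wd):
    """Gradient of the penalty with respect to each weight: 2 * wd * w."""
    return [2 * wd * w for w in weights]

# Hypothetical weights and data loss, chosen only for illustration.
weights = [3.0, -2.0, 0.5]
data_loss = 0.80
wd = 0.1

total_loss = data_loss + l2_penalty(weights, wd)
grads = l2_penalty_grad(weights, wd)
print(total_loss, grads)
```

Because the penalty's gradient is proportional to each weight, every update step pulls large weights toward zero harder than small ones.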

In [11]:
#Use the parameter wd for weight decay
model = DotProductBias(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3, wd=.1)

epoch,train_loss,valid_loss,time
0,0.96501,0.95394,00:09
1,0.855112,0.8849,00:07
2,0.753521,0.83965,00:07
3,0.579463,0.824811,00:07
4,0.494321,0.826084,00:07


Next, we rebuild the Embedding layer used above from scratch to see what it actually does.
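An embedding layer is, in effect, a weight matrix indexed by an integer ID. The plain-Python sketch below (with a tiny made-up matrix and hypothetical helper names) shows that a direct row lookup gives the same result as multiplying a one-hot vector by the matrix, which is why the lookup can be trained like any other linear layer.

```python
# A tiny hypothetical embedding matrix: 4 items, 3 factors each.
weights = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

def lookup(matrix, idx):
    """Embedding as a direct row lookup."""
    return matrix[idx]

def one_hot_matmul(matrix, idx):
    """Same result via a one-hot vector times the matrix."""
    one_hot = [1.0 if i == idx else 0.0 for i in range(len(matrix))]
    n_cols = len(matrix[0])
    return [sum(one_hot[r] * matrix[r][c] for r in range(len(matrix)))
            for c in range(n_cols)]

row_a = lookup(weights, 2)
row_b = one_hot_matmul(weights, 2)
print(row_a, row_b)
```

The lookup is simply the cheaper way to compute the same thing, since it skips all the multiplications by zero.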

In [12]:
#A tensor must be wrapped in nn.Parameter for the module to register it as a trainable parameter
#Example
class T(Module):
  def __init__(self): self.a = nn.Parameter(torch.ones(3))
L(T().parameters())

(#1) [Parameter containing:
tensor([1., 1., 1.], requires_grad=True)]

In [23]:
#Func to create tensor param with random init
def create_params(size):
  return nn.Parameter(torch.zeros(*size).normal_(0,.01))
#Rewrite our DotProductBias class
class DotProductBias(Module):
  def __init__(self, n_users, n_movies, n_factors, y_range=(0,5.5)):
    self.user_factors = create_params([n_users, n_factors])
    self.user_bias = create_params([n_users,1])
    self.movie_factors = create_params([n_movies, n_factors])
    self.movie_bias = create_params([n_movies,1])
    self.y_range = y_range

  def forward(self, x):
    #forward runs whenever the model is called on a batch x of (user, movie) index pairs
    users = self.user_factors[x[:,0]]
    movies = self.movie_factors[x[:,1]]
    res = (users * movies).sum(dim=1, keepdim=True)
    res += self.user_bias[x[:,0]] + self.movie_bias[x[:,1]]
    return sigmoid_range(res, *self.y_range)

In [24]:
#Train with new model
model = DotProductBias(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5,5e-3,wd=.1)

epoch,train_loss,valid_loss,time
0,0.96764,0.949474,00:07
1,0.831107,0.871592,00:07
2,0.735717,0.835443,00:07
3,0.577442,0.825345,00:07
4,0.477205,0.825371,00:07


We can inspect the learned movie biases to learn something about the movies themselves.

In [25]:
#Low bias means people don't like it even if it fits their typical movie type
movie_bias = learn.model.movie_bias.squeeze()
idxs = movie_bias.argsort()[:5]
[dls.classes['title'][i] for i in idxs]

['Children of the Corn: The Gathering (1996)',
 'Lawnmower Man 2: Beyond Cyberspace (1996)',
 'Beautician and the Beast, The (1997)',
 'Robocop 3 (1993)',
 'Mortal Kombat: Annihilation (1997)']

In [26]:
#High bias means it's popular even if it's not typically what a user watches
idxs = movie_bias.argsort(descending=True)[:5]
[dls.classes['title'][i] for i in idxs]

['Titanic (1997)',
 'L.A. Confidential (1997)',
 'Shawshank Redemption, The (1994)',
 'Star Wars (1977)',
 "Schindler's List (1993)"]

Using fastai.collab

In [27]:
learn = collab_learner(dls, n_factors=50, y_range=(0,5.5))
learn.fit_one_cycle(5, 5e-3, wd=.1)

epoch,train_loss,valid_loss,time
0,0.938367,0.951819,00:10
1,0.853161,0.879172,00:07
2,0.728618,0.837595,00:07
3,0.57375,0.823362,00:08
4,0.490188,0.822555,00:11


In [28]:
#Print layer names
learn.model

EmbeddingDotBias(
  (u_weight): Embedding(944, 50)
  (i_weight): Embedding(1665, 50)
  (u_bias): Embedding(944, 1)
  (i_bias): Embedding(1665, 1)
)

In [29]:
#use layer names to replicate previous analysis
movie_bias = learn.model.i_bias.weight.squeeze()
idxs = movie_bias.argsort(descending=True)[:5]
[dls.classes['title'][i] for i in idxs]

['L.A. Confidential (1997)',
 'Titanic (1997)',
 'Silence of the Lambs, The (1991)',
 'Shawshank Redemption, The (1994)',
 'Star Wars (1977)']

We can use the learned movie factors to find similar movies: compute the cosine similarity between one movie's factor vector and every other movie's. The movies with the highest similarity should be the most alike.
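The cell that follows measures closeness with cosine similarity, which compares the direction of two factor vectors and ignores their length. A minimal plain-Python sketch of that measure, using made-up vectors rather than learned factors:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical factor vectors, not learned values.
thriller_a = [0.9, -0.2, 0.4]
thriller_b = [1.8, -0.4, 0.8]   # same direction as thriller_a, twice as long
comedy = [-0.7, 0.6, 0.1]

sim_same = cosine_similarity(thriller_a, thriller_b)
sim_diff = cosine_similarity(thriller_a, comedy)
print(sim_same, sim_diff)
```

Two vectors pointing the same way score close to 1 regardless of magnitude, while vectors pointing in opposing directions score below 0.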

In [30]:
movie_factors = learn.model.i_weight.weight
idx = dls.classes['title'].o2i['Silence of the Lambs, The (1991)']
distances = nn.CosineSimilarity(dim=1)(movie_factors, movie_factors[idx][None])
idx = distances.argsort(descending=True)[1]
dls.classes['title'][idx]

'Bewegte Mann, Der (1994)'

### Or... Deep Learning

We can also build a recommendation system by changing our architecture to a deep learning model.
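The idea is to replace the dot product with a small neural network: concatenate the user and item embedding vectors, pass them through a hidden layer with a ReLU, and squash the single output into the rating range. The plain-Python sketch below walks through one forward pass; the sizes, weights, and helper functions are hypothetical and hand-picked, not learned.

```python
import math

def linear(x, weights, bias):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in x]

def sigmoid_range(x, lo, hi):
    """Squash a real number into the interval (lo, hi)."""
    return 1 / (1 + math.exp(-x)) * (hi - lo) + lo

# Hypothetical tiny sizes and hand-picked weights, for illustration only.
user_emb = [0.5, -1.0]            # 2 user factors
item_emb = [1.0, 0.25, -0.5]      # 3 item factors (the two sizes may differ)
x = user_emb + item_emb           # concatenation: 5 inputs

w1 = [[0.2, 0.0, 0.4, 0.0, -0.2],
      [-0.5, 0.3, 0.0, 0.1, 0.0]]   # hidden layer: 5 inputs -> 2 units
b1 = [0.0, 0.1]
w2 = [[1.0, -1.0]]                  # output layer: 2 units -> 1 score
b2 = [0.0]

hidden = relu(linear(x, w1, b1))
rating = sigmoid_range(linear(hidden, w2, b2)[0], 0, 5.5)
print(hidden, rating)
```

Because the embeddings are concatenated rather than multiplied element-wise, the user and item vectors no longer need to have the same number of factors.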

In [31]:
embs = get_emb_sz(dls)
embs

[(944, 74), (1665, 102)]
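The sizes above are consistent with a simple rule of thumb. The sketch below assumes fastai's default heuristic is `min(600, round(1.6 * n ** 0.56))`, where `n` is the number of categories; that formula reproduces the two widths shown, but treat it as an assumption about the library's internals rather than a documented guarantee.

```python
def emb_sz_rule(n_cat):
    """Rule-of-thumb embedding width for a category with n_cat levels."""
    return min(600, round(1.6 * n_cat ** 0.56))

# 944 user classes and 1665 title classes, as reported by get_emb_sz(dls).
sizes = [(n, emb_sz_rule(n)) for n in (944, 1665)]
print(sizes)
```

The width grows slowly with the number of categories and is capped at 600, so very large vocabularies do not get proportionally large embeddings.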

In [36]:
class CollabNN(Module):
  def __init__(self, user_sz, item_sz, y_range=(0,5.5), n_act=100):
    self.user_factors = Embedding(*user_sz)
    self.item_factors = Embedding(*item_sz)
    self.layers = nn.Sequential(
        nn.Linear(user_sz[1] + item_sz[1], n_act),
        nn.ReLU(),
        nn.Linear(n_act, 1)
    )
    self.y_range = y_range

  def forward(self, x):
    embs = self.user_factors(x[:,0]), self.item_factors(x[:,1])
    x = self.layers(torch.cat(embs, dim=1))
    return sigmoid_range(x, *self.y_range)

model = CollabNN(*embs)

In [37]:
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3, wd=.1)

epoch,train_loss,valid_loss,time
0,0.952649,0.967145,00:08
1,0.931209,0.909929,00:07
2,0.869018,0.89534,00:07
3,0.835766,0.878659,00:07
4,0.78591,0.879087,00:07
