<a href="https://colab.research.google.com/github/Firojpaudel/Machine-Learning-Notes/blob/main/Practical%20Deep%20Learning%20For%20Coders/Chapter_8.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Collaborative Filtering: _Movie Recommendation_

**Key Idea:** Recommend items based on user behavior patterns, not item features.

**Example:** Netflix suggests movies by finding users with similar viewing histories.

**Latent Factors:** Hidden preferences (e.g., genre, era) inferred from data, not explicitly stated.
***

#### **Dataset: MovieLens**
We use the MovieLens 100k subset (100,000 ratings). Each row contains:

- `User ID, Movie ID, Rating, Timestamp`

In [1]:
## First Setting up the notebook

%reload_ext autoreload
%autoreload 2
%matplotlib inline

## Installing the dependencies
!pip install -Uqq fastbook
import fastbook
# fastbook.setup_book()

## Importing the necessary libraries
from fastbook import *
from fastai.callback.fp16 import *
from fastai.collab import *
from fastai.tabular.all import *
from fastai.vision.all import *

##### A. Getting Dataset

In [2]:
path = untar_data(URLs.ML_100k)
path.ls()

(#23) [Path('/root/.fastai/data/ml-100k/u1.test'),Path('/root/.fastai/data/ml-100k/u4.test'),Path('/root/.fastai/data/ml-100k/u.info'),Path('/root/.fastai/data/ml-100k/u.data'),Path('/root/.fastai/data/ml-100k/u1.base'),Path('/root/.fastai/data/ml-100k/u.user'),Path('/root/.fastai/data/ml-100k/u4.base'),Path('/root/.fastai/data/ml-100k/ub.base'),Path('/root/.fastai/data/ml-100k/u5.base'),Path('/root/.fastai/data/ml-100k/u.occupation'),Path('/root/.fastai/data/ml-100k/u3.base'),Path('/root/.fastai/data/ml-100k/u.item'),Path('/root/.fastai/data/ml-100k/u5.test'),Path('/root/.fastai/data/ml-100k/ua.base'),Path('/root/.fastai/data/ml-100k/mku.sh'),Path('/root/.fastai/data/ml-100k/README'),Path('/root/.fastai/data/ml-100k/ua.test'),Path('/root/.fastai/data/ml-100k/allbut.pl'),Path('/root/.fastai/data/ml-100k/u2.test'),Path('/root/.fastai/data/ml-100k/u.genre')...]

In [3]:
ratings = pd.read_csv(
    path/'u.data', delimiter='\t', header=None,
    names= ['user', 'movie', 'rating', 'timestamp']
)

ratings.head()

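|   | user | movie | rating | timestamp |
|---|------|-------|--------|-----------|
| 0 | 196 | 242 | 3 | 881250949 |
| 1 | 186 | 302 | 3 | 891717742 |
| 2 | 22 | 377 | 1 | 878887116 |
| 3 | 244 | 51 | 2 | 880606923 |
| 4 | 166 | 346 | 1 | 886397596 |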


In [4]:
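##@ Optional: a hand-made example of how latent factors score a match.
## Each position is one factor; the dot product is high when user and movie agree.
last_skywalker = np.array([0.98, 0.9, -0.9])   # movie factors
user1 = np.array([0.9, 0.8, -0.6])             # user factors
(user1 * last_skywalker).sum()                 # dot product = predicted affinity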

2.1420000000000003
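
To see the opposite case, the same user vector can be scored against a movie whose factors point the other way. This is only an illustrative sketch: the `casablanca` vector and the reading of the three factors as (science fiction, action, old movie) are assumptions, not values learned from the dataset.

```python
import numpy as np

# Assumed meaning of the three factors: (sci-fi, action, old movie).
user1 = np.array([0.9, 0.8, -0.6])            # likes sci-fi and action, dislikes old movies
last_skywalker = np.array([0.98, 0.9, -0.9])  # modern sci-fi action film
casablanca = np.array([-0.99, -0.3, 0.8])     # hypothetical factors for an old drama

match_good = float(user1 @ last_skywalker)    # positive score: factors agree
match_bad = float(user1 @ casablanca)         # negative score: factors disagree

print(round(match_good, 3), round(match_bad, 3))
```

A higher dot product means the user's preferences and the movie's characteristics line up, which is the quantity the models below learn to predict.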

##### B. Creating DataLoaders

In [5]:
#@ Loading the movie titles and joining them to the ratings on the shared `movie` column
movies= pd.read_csv(path/'u.item', delimiter='|', encoding='latin-1', usecols=(0,1),
                    names= ('movie', 'title'), header= None)
ratings= ratings.merge(movies)
ratings.head()

|   | user | movie | rating | timestamp | title |
|---|------|-------|--------|-----------|-------|
| 0 | 196 | 242 | 3 | 881250949 | Kolya (1996) |
| 1 | 186 | 302 | 3 | 891717742 | L.A. Confidential (1997) |
| 2 | 22 | 377 | 1 | 878887116 | Heavyweights (1994) |
| 3 | 244 | 51 | 2 | 880606923 | Legends of the Fall (1994) |
| 4 | 166 | 346 | 1 | 886397596 | Jackie Brown (1997) |


In [6]:
#@ Constructing DataLoaders

dls = CollabDataLoaders.from_df(ratings, item_name= 'title', bs=64)
dls.show_batch()

|   | user | title | rating |
|---|------|-------|--------|
| 0 | 782 | Starship Troopers (1997) | 2 |
| 1 | 943 | Judge Dredd (1995) | 3 |
| 2 | 758 | Mission: Impossible (1996) | 4 |
| 3 | 94 | Farewell My Concubine (1993) | 5 |
| 4 | 23 | Psycho (1960) | 4 |
| 5 | 296 | Secrets & Lies (1996) | 5 |
| 6 | 940 | American President, The (1995) | 4 |
| 7 | 334 | Star Trek VI: The Undiscovered Country (1991) | 1 |
| 8 | 380 | Braveheart (1995) | 4 |
| 9 | 690 | So I Married an Axe Murderer (1993) | 1 |


In [7]:
#@ Creating randomly initialised latent-factor matrices (one row per user, one row per movie)
n_users = len(dls.classes["user"])
n_movies = len(dls.classes["title"])
n_factors = 5
user_factors = torch.randn(n_users, n_factors)
movie_factors = torch.randn(n_movies, n_factors)

In [8]:
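#@ One-hot lookup vs. direct indexing
one_hot_3 = one_hot(3, n_users).float()   # vector of zeros with a 1 at index 3
user_factors.t() @ one_hot_3              # matrix product picks out row 3 (result not displayed)
user_factors[3]                           # same vector, obtained by plain indexing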

tensor([-0.4586, -0.9915, -0.4052, -0.3621, -0.5908])

**Embedding:**
- An **Embedding** is a layer that looks up a row of a weight matrix by integer index. Its derivative is calculated so that it is identical to what a matrix multiplication with a one-hot-encoded vector would give, but the lookup is a computational shortcut: it indexes directly instead of building the one-hot vector and doing the full product. The matrix that the one-hot vector would multiply is called the **Embedding Matrix**.
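
The equivalence can be checked numerically. The following is a minimal NumPy sketch (toy sizes and random values are assumptions, and NumPy stands in for PyTorch here) that compares both the forward result and the gradient with respect to the embedding matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_factors = 6, 3
emb = rng.normal(size=(n_users, n_factors))   # toy embedding matrix
idx = np.array([3, 1, 3])                     # a batch of user indices (3 appears twice)

# Forward pass: one-hot matrix product vs. direct row lookup.
one_hot = np.eye(n_users)[idx]                # shape (3, n_users)
via_matmul = one_hot @ emb
via_index = emb[idx]

# Backward pass: gradient of the output w.r.t. `emb`.
grad_out = rng.normal(size=via_index.shape)   # pretend upstream gradient
grad_matmul = one_hot.T @ grad_out            # what the matrix product would give
grad_index = np.zeros_like(emb)
np.add.at(grad_index, idx, grad_out)          # scatter-add rows; handles repeated indices

print(np.allclose(via_matmul, via_index), np.allclose(grad_matmul, grad_index))
```

Rows that were never looked up receive a zero gradient in both versions, which is why indexing is so much cheaper for large, sparse lookups.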

##### **1. Introduction to Collaborative Filtering**

Collaborative filtering predicts user preferences by analyzing their interactions with items and comparing them to similar users.

- Example: Movie recommendation systems.
- **Core Idea**: Use a matrix of user-item interactions to learn latent factors for users and items.
- Often modeled as a regression task to predict interaction scores (e.g., ratings).
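
As a rough illustration of the user-item matrix, the sketch below places the first five ratings shown earlier into a small grid. The helper names (`u_pos`, `m_pos`) are made up for this example, and a real system would not materialise the full matrix.

```python
import numpy as np

# The first five rows of `ratings` shown earlier: (user, movie, rating).
rows = [(196, 242, 3), (186, 302, 3), (22, 377, 1), (244, 51, 2), (166, 346, 1)]

users = sorted({u for u, _, _ in rows})
movies = sorted({m for _, m, _ in rows})
u_pos = {u: i for i, u in enumerate(users)}   # raw user ID -> row index
m_pos = {m: j for j, m in enumerate(movies)}  # raw movie ID -> column index

# NaN marks "not rated"; these are the cells the model has to predict.
matrix = np.full((len(users), len(movies)), np.nan)
for u, m, r in rows:
    matrix[u_pos[u], m_pos[m]] = r

observed = int(np.isfinite(matrix).sum())
print(f"{observed} of {matrix.size} cells observed")
```

Most cells are empty, and filling them in with plausible scores is exactly the regression task described above.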

##### **2. Dot Product Model**





A basic model that computes user-item interaction using the dot product of their latent factors.

In [9]:
#@ Model Definition
class DotProduct(Module):
    def __init__(self, n_users, n_movies, n_factors):
        self.user_factors = Embedding(n_users, n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)

    def forward(self, x):
        users = self.user_factors(x[:, 0])
        movies = self.movie_factors(x[:, 1])
        return (users * movies).sum(dim=1)

In [10]:
x, y = dls.one_batch()
x.shape

torch.Size([64, 2])
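
The shape `[64, 2]` means each row of a batch holds one (user index, title index) pair. A small NumPy sketch of what `forward` does with such a batch, using toy sizes and random factors that are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, n_factors = 10, 8, 4       # toy sizes, not the MovieLens ones
user_factors = rng.normal(size=(n_users, n_factors))
movie_factors = rng.normal(size=(n_movies, n_factors))

# A mini-batch shaped like `x`: column 0 = user index, column 1 = movie index.
x = np.array([[0, 3], [7, 1], [2, 2]])

users = user_factors[x[:, 0]]                 # (batch, n_factors)
movies = movie_factors[x[:, 1]]               # (batch, n_factors)
preds = (users * movies).sum(axis=1)          # one dot product per row -> (batch,)

print(preds.shape)
```

Each prediction is the dot product of one user's row with one movie's row, so the output has one value per batch item.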

In [11]:
'''
Training:
Using fastai's `Learner` to train the model with a mean squared error loss.
'''
model = DotProduct(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3)

epoch,train_loss,valid_loss,time
0,1.323817,1.340935,00:16
1,1.017078,1.092095,00:07
2,0.873056,0.974699,00:09
3,0.762055,0.89763,00:08
4,0.719645,0.874279,00:08
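
To make the loss values easier to read, here is a rough NumPy equivalent of a flattened mean squared error (a sketch of the idea behind `MSELossFlat`, not fastai's implementation; `mse_flat` and the toy numbers are made up). It also converts the final validation loss from the table above into an RMSE.

```python
import math
import numpy as np

def mse_flat(preds, targets):
    """Flatten both arrays, then average the squared differences."""
    preds = np.asarray(preds, dtype=float).reshape(-1)
    targets = np.asarray(targets, dtype=float).reshape(-1)
    return float(np.mean((preds - targets) ** 2))

# Made-up predictions of shape (3, 1) against targets of shape (3,).
toy_loss = mse_flat([[3.5], [4.0], [1.0]], [4, 4, 2])

# Final validation loss from the table above, converted to RMSE.
rmse = math.sqrt(0.874279)

print(round(toy_loss, 4), round(rmse, 3))
```

An RMSE of roughly 0.935 means the predictions are off by a bit under one rating point on the validation set.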


---

##### **3. Adding Constraints to Predictions**
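Real-world interaction scores usually have fixed bounds (MovieLens ratings run from 1 to 5). A **sigmoid range** squashes the raw dot product into such an interval. Because a sigmoid only approaches its bounds asymptotically, the upper bound is set slightly above the top rating, hence `y_range=(0, 5.5)`, so that a prediction of 5 stays reachable.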
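
A plain-Python sketch of the squashing step, assuming the usual scaled-sigmoid formula `sigmoid(x) * (high - low) + low`. The name `sigmoid_range_py` is made up here to avoid shadowing fastai's `sigmoid_range`.

```python
import math

def sigmoid_range_py(x, low, high):
    """Squash x into the open interval (low, high) with a scaled sigmoid."""
    return 1 / (1 + math.exp(-x)) * (high - low) + low

for raw in (-10.0, 0.0, 3.0, 10.0):
    print(raw, round(sigmoid_range_py(raw, 0, 5.5), 4))
```

A raw score of 0 maps to the midpoint of the range, and very large or very small scores get close to the bounds without ever touching them.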


In [12]:
#@@ Modified Model
class DotProductWithRange(Module):
    def __init__(self, n_users, n_movies, n_factors, y_range=(0, 5.5)):
        self.user_factors = Embedding(n_users, n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)
        self.y_range = y_range

    def forward(self, x):
        users = self.user_factors(x[:, 0])
        movies = self.movie_factors(x[:, 1])
        res = (users * movies).sum(dim=1)
        return sigmoid_range(res, *self.y_range)

In [13]:
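##@ Training again, this time with the range-constrained model
## NOTE: the losses logged below were recorded with the plain `DotProduct` model,
## so they do not yet reflect the effect of `sigmoid_range`.
model = DotProductWithRange(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3)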

epoch,train_loss,valid_loss,time
0,1.339878,1.337699,00:16
1,1.049726,1.099613,00:12
2,0.919396,1.007986,00:08
3,0.818623,0.911449,00:08
4,0.793476,0.886434,00:08


---

##### **4. Incorporating User and Movie Bias**
Some users and items exhibit biases:
- Users might rate higher/lower on average.
- Some movies might generally be rated higher/lower.
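
A small numeric sketch of why a bias term helps. All vectors and bias values below are invented for illustration, and the scores are raw values before any sigmoid range is applied.

```python
import numpy as np

# Two users with identical taste factors but different rating habits.
movie_vec = np.array([0.5, -0.2, 0.8])
taste = np.array([0.4, 0.1, 0.6])
movie_bias = 0.3                                # assumed: a generally well-liked movie
user_bias = {"generous": 0.7, "harsh": -0.7}    # assumed per-user offsets

base = float(taste @ movie_vec)                 # same for both users
scores = {name: base + b + movie_bias for name, b in user_bias.items()}

print(round(base, 2), {k: round(v, 2) for k, v in scores.items()})
```

Without the bias terms both users would get the same prediction. With them, the model can shift each score up or down without distorting the latent factors.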

In [14]:
#@ Model with Bias
class DotProductBias(Module):
    def __init__(self, n_users, n_movies, n_factors, y_range=(0, 5.5)):
        self.user_factors = Embedding(n_users, n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)
        self.user_bias = Embedding(n_users, 1)
        self.movie_bias = Embedding(n_movies, 1)
        self.y_range = y_range

    def forward(self, x):
        users = self.user_factors(x[:, 0])
        movies = self.movie_factors(x[:, 1])
        res = (users * movies).sum(dim=1, keepdim=True)
        res += self.user_bias(x[:, 0]) + self.movie_bias(x[:, 1])
        return sigmoid_range(res, *self.y_range)

In [15]:
#@ Training again for the model with bias
model = DotProductBias(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3)

epoch,train_loss,valid_loss,time
0,0.867532,0.946176,00:08
1,0.564744,0.906652,00:09
2,0.418702,0.939039,00:09
3,0.316487,0.951245,00:08
4,0.303472,0.952837,00:10
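
The log above can also be read programmatically. This short sketch copies the logged losses and finds the epoch with the lowest validation loss, along with the gap between validation and training loss at each epoch.

```python
# Losses copied from the training log above.
train_loss = [0.867532, 0.564744, 0.418702, 0.316487, 0.303472]
valid_loss = [0.946176, 0.906652, 0.939039, 0.951245, 0.952837]

# Epoch (0-based) with the lowest validation loss.
best_epoch = min(range(len(valid_loss)), key=valid_loss.__getitem__)

# Gap between validation and training loss at each epoch.
gap = [round(v - t, 3) for t, v in zip(train_loss, valid_loss)]

print(best_epoch, gap)
```

Validation loss is lowest at epoch 1 and the gap keeps widening afterwards, which is the usual signature of overfitting.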


---

##### **5. Regularization: Weight Decay**
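Overfitting occurs when the model fits the training data so closely that it generalizes poorly. The bias model above shows the pattern: training loss keeps falling while validation loss rises after epoch 1.  
Apply **Weight Decay** (L2 regularization) to reduce this risk: the sum of the squared weights, scaled by `wd`, is added to the loss, which discourages large weights.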
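
A minimal NumPy sketch of the penalty and its gradient, assuming plain SGD and an L2 penalty of the form `wd * sum(params ** 2)`. The function names and numbers are illustrative, and fastai's optimizers may apply the decay differently internally.

```python
import numpy as np

def loss_with_wd(params, data_loss, wd):
    """Data loss plus the L2 penalty wd * sum(params ** 2)."""
    return data_loss + wd * float(np.sum(params ** 2))

def wd_grad(params, wd):
    """Gradient of the penalty term alone: d/dp [wd * p**2] = 2 * wd * p."""
    return 2 * wd * params

params = np.array([0.5, -2.0, 3.0])
wd = 0.1

penalty = loss_with_wd(params, data_loss=0.0, wd=wd)   # 0.1 * (0.25 + 4 + 9)
step = params - 0.1 * wd_grad(params, wd)              # one SGD step on the penalty only

print(round(penalty, 3), step)
```

Each step pulls every weight a little toward zero, in proportion to its size, so large weights are penalised most.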

In [16]:
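##@ Training with Weight Decay
## Note: `model` is the DotProductBias instance trained in the previous cell, so this run
## continues from those weights (hence the low starting train_loss below).
## Re-create the model first to measure the effect of weight decay from scratch.
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3, wd=0.1)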

epoch,train_loss,valid_loss,time
0,0.336025,0.935951,00:10
1,0.372306,0.920442,00:08
2,0.341161,0.905442,00:09
3,0.304106,0.893533,00:09
4,0.288899,0.89032,00:08


---

##### **6. Creating a Custom Embedding Layer**
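Instead of using an `Embedding` layer, the same lookup tables can be built by hand. Wrapping a tensor in `nn.Parameter` registers it as a trainable parameter of the module, and indexing it with the user or movie indices performs the lookup.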

In [17]:
##@ Custom Embedding Model

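def create_params(size):
    # Trainable tensor of the given shape, initialised from a normal
    # distribution with mean 0 and standard deviation 0.01.
    return nn.Parameter(torch.zeros(*size).normal_(0, 0.01))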

class DotProductBiasCustom(Module):
    def __init__(self, n_users, n_movies, n_factors, y_range=(0, 5.5)):
        self.user_factors = create_params([n_users, n_factors])
        self.movie_factors = create_params([n_movies, n_factors])
        self.user_bias = create_params([n_users])
        self.movie_bias = create_params([n_movies])
        self.y_range = y_range

    def forward(self, x):
        users = self.user_factors[x[:, 0]]
        movies = self.movie_factors[x[:, 1]]
        res = (users * movies).sum(dim=1)
        res += self.user_bias[x[:, 0]] + self.movie_bias[x[:, 1]]
        return sigmoid_range(res, *self.y_range)

In [18]:
##@ Training the Custom Embedded Model
model = DotProductBiasCustom(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3, wd=0.1)

epoch,train_loss,valid_loss,time
0,0.878707,0.940981,00:10
1,0.676217,0.885803,00:09
2,0.512723,0.869106,00:09
3,0.457896,0.857801,00:09
4,0.444701,0.853635,00:09

