- Here, we will look at different ways to use data to make recommendations.
- Services such as Twitter, Netflix, and YouTube recommend accounts you might want to follow or videos you might want to watch.
- Let's consider the problem of <u>recommending new interests to a user based on her currently specified interests.</u>

In [2]:
# Let's use this dataset

users_interests = [
["Hadoop", "Big Data", "HBase", "Java", "Spark", "Storm", "Cassandra"],
["NoSQL", "MongoDB", "Cassandra", "HBase", "Postgres"],
["Python", "scikit-learn", "scipy", "numpy", "statsmodels", "pandas"],
["R", "Python", "statistics", "regression", "probability"],
["machine learning", "regression", "decision trees", "libsvm"],
["Python", "R", "Java", "C++", "Haskell", "programming languages"],
["statistics", "probability", "mathematics", "theory"],
["machine learning", "scikit-learn", "Mahout", "neural networks"],
["neural networks", "deep learning", "Big Data", "artificial intelligence"],
["Hadoop", "Java", "MapReduce", "Big Data"],
["statistics", "R", "statsmodels"],
["C++", "deep learning", "artificial intelligence", "probability"],
["pandas", "R", "Python"],
["databases", "HBase", "Postgres", "MySQL", "MongoDB"],
["libsvm", "regression", "support vector machines"]
]

# 1. Recommending What's popular
- When the user is new, all we have is a list of their existing interests.
- In that case, we can suggest whatever is popular overall to expand their interests.

In [3]:
from collections import Counter

popular_interests = Counter(interest for user_interests in users_interests
                           for interest in user_interests)
print(f"{popular_interests=}")

popular_interests=Counter({'Python': 4, 'R': 4, 'Big Data': 3, 'HBase': 3, 'Java': 3, 'statistics': 3, 'regression': 3, 'probability': 3, 'Hadoop': 2, 'Cassandra': 2, 'MongoDB': 2, 'Postgres': 2, 'scikit-learn': 2, 'statsmodels': 2, 'pandas': 2, 'machine learning': 2, 'libsvm': 2, 'C++': 2, 'neural networks': 2, 'deep learning': 2, 'artificial intelligence': 2, 'Spark': 1, 'Storm': 1, 'NoSQL': 1, 'scipy': 1, 'numpy': 1, 'decision trees': 1, 'Haskell': 1, 'programming languages': 1, 'mathematics': 1, 'theory': 1, 'Mahout': 1, 'MapReduce': 1, 'databases': 1, 'MySQL': 1, 'support vector machines': 1})


In [7]:
# Suggest most popular interests
from typing import List, Tuple

def most_popular_new_interests(user_interests: List[str],
                               max_results: int = 5) -> List[Tuple[str, int]]:
    suggestions = [(interest, frequency) 
                   for interest, frequency in popular_interests.most_common()
                   if interest not in user_interests]
    return suggestions[:max_results]              

In [8]:
# For user 1, whose interests are ["Hadoop", "Big Data", "HBase", "Java", "Spark", "Storm", "Cassandra"],
# the most popular new suggestions would be
most_popular_new_interests(["Hadoop", "Big Data", "HBase", "Java", "Spark", "Storm", "Cassandra"])

[('Python', 4),
 ('R', 4),
 ('statistics', 3),
 ('regression', 3),
 ('probability', 3)]

In [9]:
# For user 3 suggestions would be
most_popular_new_interests(["Python", "scikit-learn", "scipy", "numpy", "statsmodels", "pandas"])

[('R', 4), ('Big Data', 3), ('HBase', 3), ('Java', 3), ('statistics', 3)]

# 2. User-based collaborative filtering

- Here, we search for users whose interests are similar to those of our target user.
- Then we suggest the interests of those similar users to the target user.
- We measure the similarity of two users with *cosine similarity*.

**How do we do this?**
- 1. We create a list of the unique interests across all users. (unique_interests)
- 2. Then we create an interest vector for each user based on that list, with 1 at the positions of the user's interests and 0 elsewhere. (user_interest_vectors)
- 3. Then we calculate the similarity between each pair of users using cosine similarity. (user_similarities)
- 4. Then, for each interest held by a similar user, we add that user's similarity with the target user to the interest's score.
- 5. Sort the scores and suggest the highest-scoring new interests to the target user.
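
The notebook imports `cosine_similarity` from the `scratch` package further down. For readers who do not have that package, a minimal stand-in might look like the sketch below. The function body is an assumption based on the standard definition (dot product divided by the product of the vector lengths), not a copy of the library code, and the three toy vectors are hypothetical.

```python
import math
from typing import List

def cosine_similarity(v: List[float], w: List[float]) -> float:
    """Cosine of the angle between v and w (assumes neither vector is all zeros)."""
    dot_product = sum(v_i * w_i for v_i, w_i in zip(v, w))
    norm_v = math.sqrt(sum(v_i * v_i for v_i in v))
    norm_w = math.sqrt(sum(w_i * w_i for w_i in w))
    return dot_product / (norm_v * norm_w)

# Toy 0/1 interest vectors over a three-interest vocabulary (hypothetical data)
same = cosine_similarity([1, 1, 0], [1, 1, 0])       # identical interests
disjoint = cosine_similarity([1, 0, 0], [0, 1, 1])   # no shared interests
partial = cosine_similarity([1, 1, 0], [1, 0, 0])    # one shared interest
```

For 0/1 vectors the score lies between 0 (nothing in common) and 1 (identical interest lists), which is why step 3 can use it directly as a weight.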

**Understanding with Example**
- Imagine you have a few users with their respective interests:

> - User 0: ["Python", "Data Science"]
> - User 1: ["Machine Learning", "Data Science"]
> - User 2: ["Python", "Artificial Intelligence"]
> - User 3: ["Machine Learning", "Python"]

- Goal:
We want to suggest new interests to User 0 based on the interests of users who are similar to User 0.

- Steps:

1. Identify Similar Users:  

- Let's say, after calculating similarities using cosine similarity, we find:
> - User 1 has a similarity score of 0.7 with User 0.
> - User 2 has a similarity score of 0.8 with User 0.
> - User 3 has a similarity score of 0.6 with User 0.

2. Collect Interests of Similar Users:

> - User 1's interests: ["Machine Learning", "Data Science"]
> - User 2's interests: ["Python", "Artificial Intelligence"]
> - User 3's interests: ["Machine Learning", "Python"]

3. Aggregate Interest Scores:

- For each interest from the similar users, add the similarity score of the user who has that interest.
- "Machine Learning" is mentioned by User 1 and User 3: Score: 0.7 (from User 1) + 0.6 (from User 3) = 1.3
- "Artificial Intelligence" is mentioned by User 2: Score: 0.8 (from User 2) = 0.8
- "Data Science" and "Python" are already in User 0's interests, so we ignore them for new suggestions.

4. Sort and Suggest New Interests:

- Sort interests by score: [("Machine Learning", 1.3), ("Artificial Intelligence", 0.8)]
- Suggest the top interests not already in User 0's list:
- Suggested Interests: ["Machine Learning", "Artificial Intelligence"]
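
The aggregation in steps 3 and 4 of this example can be sketched in a few lines. The similarity scores are the illustrative values assumed in the example (they are not computed), and the names `interests_by_user`, `similarity_to_user_0`, and `suggest_for_user_0` are hypothetical helpers introduced only for this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical data copied from the worked example
interests_by_user: Dict[int, List[str]] = {
    0: ["Python", "Data Science"],
    1: ["Machine Learning", "Data Science"],
    2: ["Python", "Artificial Intelligence"],
    3: ["Machine Learning", "Python"],
}
# Illustrative similarity scores with User 0 (assumed, not computed)
similarity_to_user_0: Dict[int, float] = {1: 0.7, 2: 0.8, 3: 0.6}

def suggest_for_user_0() -> List[Tuple[str, float]]:
    # Add each similar user's similarity to every interest that user holds
    scores: Dict[str, float] = defaultdict(float)
    for other_id, similarity in similarity_to_user_0.items():
        for interest in interests_by_user[other_id]:
            scores[interest] += similarity
    # Rank by score and drop interests User 0 already has
    known = set(interests_by_user[0])
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [(interest, score) for interest, score in ranked if interest not in known]

toy_suggestions = suggest_for_user_0()
```

Up to floating-point rounding, "Machine Learning" scores 1.3 and "Artificial Intelligence" scores 0.8, matching the hand calculation.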

In [10]:
# Find a sorted set of unique interests of all users

unique_interests = sorted(set(interest for user_interests in users_interests
                          for interest in user_interests))
print(f"{unique_interests=}")

unique_interests=['Big Data', 'C++', 'Cassandra', 'HBase', 'Hadoop', 'Haskell', 'Java', 'Mahout', 'MapReduce', 'MongoDB', 'MySQL', 'NoSQL', 'Postgres', 'Python', 'R', 'Spark', 'Storm', 'artificial intelligence', 'databases', 'decision trees', 'deep learning', 'libsvm', 'machine learning', 'mathematics', 'neural networks', 'numpy', 'pandas', 'probability', 'programming languages', 'regression', 'scikit-learn', 'scipy', 'statistics', 'statsmodels', 'support vector machines', 'theory']


In [11]:
# interest vector for each user

Vector = List[int]

def make_user_interest_vector(user_interests: List[str]) -> Vector:
    """
    Given a list of interests, produce a vector whose ith element is 1
    if unique_interests[i] is in the list, 0 otherwise
    """
    return [1 if interest in user_interests else 0 for interest in unique_interests]

user_1_interests = ["Hadoop", "Big Data", "HBase", "Java", "Spark", "Storm", "Cassandra"]
v = make_user_interest_vector(user_1_interests)

print(f"user_1_interest_vector: {v}")

user_1_interest_vector: [1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


In [14]:
# interest vector for all users

user_interest_vectors = [make_user_interest_vector(user_interests) for user_interests in users_interests]
print(user_interest_vectors[:5])  # First five user's vector

[[1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]]


In [17]:
# Compute pairwise similarities

from scratch.nlp import cosine_similarity

user_similarities = ([[cosine_similarity(interest_vector_i, interest_vector_j) for interest_vector_i in user_interest_vectors]
                      for interest_vector_j in user_interest_vectors])
print(f"{user_similarities[0]=}")  # User 0's similarity with other users 
assert 0.15 < user_similarities[0][5] < 0.16   # Very few shared interests between user0 and user5
assert 0.56 < user_similarities[0][9] < 0.57   # Many shared interests between user0 and user9

user_similarities[0]=[1.0, 0.3380617018914066, 0.0, 0.0, 0.0, 0.1543033499620919, 0.0, 0.0, 0.1889822365046136, 0.5669467095138409, 0.0, 0.0, 0.0, 0.1690308509457033, 0.0]


In [19]:
# Function to return a list of users with their similarity with one user

from typing import Tuple
def most_similar_users_to(user_id: int) -> List[Tuple[int, float]]:
    pairs = [(id, similarity) for id, similarity in enumerate(user_similarities[user_id]) if similarity > 0 and id != user_id]
    return sorted(pairs, key=lambda pair: pair[-1], reverse=True)

print(most_similar_users_to(0))

[(9, 0.5669467095138409), (1, 0.3380617018914066), (8, 0.1889822365046136), (13, 0.1690308509457033), (5, 0.1543033499620919)]


In [20]:
# Make suggestions
# For each interest of a similar user, add that user's similarity to the interest's score

from collections import defaultdict
from typing import Dict

def user_based_suggestions(user_id: int, include_current_interests: bool = False):

    suggestions: Dict[str, float] = defaultdict(float)
    
    for other_user_id, similarity in most_similar_users_to(user_id):
        for interest in users_interests[other_user_id]:
            suggestions[interest] +=similarity
            
    # sort suggestions based on scores
    suggestions = sorted(suggestions.items(), key= lambda pair: pair[-1], reverse=True)

    # Exclude already existing interest of user_id
    if include_current_interests:
        return suggestions
    else:
        return [(suggestion,weight) for suggestion, weight in suggestions
                if suggestion not in users_interests[user_id]]
    

In [21]:
user_based_suggestions(0) 

[('MapReduce', 0.5669467095138409),
 ('MongoDB', 0.50709255283711),
 ('Postgres', 0.50709255283711),
 ('NoSQL', 0.3380617018914066),
 ('neural networks', 0.1889822365046136),
 ('deep learning', 0.1889822365046136),
 ('artificial intelligence', 0.1889822365046136),
 ('databases', 0.1690308509457033),
 ('MySQL', 0.1690308509457033),
 ('Python', 0.1543033499620919),
 ('R', 0.1543033499620919),
 ('C++', 0.1543033499620919),
 ('Haskell', 0.1543033499620919),
 ('programming languages', 0.1543033499620919)]

- The weights calculated (based on similarities) are used to rank and order the suggestions, but these weights are not inherently meaningful beyond that purpose.

- When the number of interests or items is small, user-based collaborative filtering can effectively suggest new interests. However, as the number of interests grows, it becomes increasingly difficult to find similar users, and recommendations may become less relevant or useful.

- In high-dimensional spaces (lots of interests), vectors (representing users' interests) are generally far apart and point in different directions. This makes it challenging to identify users with genuinely similar interest profiles.

- On a platform like Amazon, where a user might have bought thousands of items over many years, it's hard to find another user with a similar purchase history. Even if you find the "most similar" user, their interests might still be very different, leading to poor recommendations.
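
A small simulation can make the sparsity problem concrete. This is a sketch under simplifying assumptions: every simulated user likes exactly five interests chosen uniformly at random, and the helper names (`random_interest_vector`, `cosine`, `average_similarity`) are hypothetical. Real interest data is not uniformly random, so the numbers only illustrate the trend.

```python
import math
import random
from typing import List

def random_interest_vector(num_interests: int, num_liked: int,
                           rng: random.Random) -> List[int]:
    """0/1 vector with exactly num_liked ones at random positions."""
    liked = set(rng.sample(range(num_interests), num_liked))
    return [1 if i in liked else 0 for i in range(num_interests)]

def cosine(v: List[int], w: List[int]) -> float:
    dot_product = sum(v_i * w_i for v_i, w_i in zip(v, w))
    return dot_product / (math.sqrt(sum(v_i * v_i for v_i in v)) *
                          math.sqrt(sum(w_i * w_i for w_i in w)))

def average_similarity(num_interests: int, num_liked: int = 5,
                       num_pairs: int = 200, seed: int = 0) -> float:
    """Average cosine similarity over random pairs of simulated users."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_pairs):
        v = random_interest_vector(num_interests, num_liked, rng)
        w = random_interest_vector(num_interests, num_liked, rng)
        total += cosine(v, w)
    return total / num_pairs

small_catalog = average_similarity(20)      # 20 possible interests
large_catalog = average_similarity(2000)    # 2000 possible interests
```

With the larger catalog, two random users almost never share an interest, so the average similarity collapses toward zero and the "most similar" user is often not very similar at all.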

# 3. Item-based collaborative filtering

- Unlike user-based collaborative filtering, here we compute similarities between interests directly and rank suggestions by how similar they are to the target user's current interests.


**How will we do it?**

1. We start from the user-interest vectors built earlier (one row per user, one column per interest).

2. We transpose this matrix to get an interest-user matrix (one row per interest, one column per user).

3. Then we find the similarity between each pair of interests, e.g. the similarity between these two vectors:

"Big Data": [1, 0, 1, 0, 1]
"Hadoop": [1, 0, 0, 0, 1]

4. Then, for each of the target user's interests, we add up the similarity scores of the interests similar to it.
5. Finally, we sort the scores and suggest the top ones.
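
The transpose and the similarity between the two example vectors can be sketched as follows. The five-user matrix is hypothetical, chosen so that its columns match the "Big Data" and "Hadoop" vectors quoted in step 3, and `cosine` is a local stand-in for the library function.

```python
import math
from typing import List

# Hypothetical user-interest matrix: 5 users (rows) x 2 interests (columns)
interest_names = ["Big Data", "Hadoop"]
user_rows: List[List[int]] = [
    [1, 1],
    [0, 0],
    [1, 0],
    [0, 0],
    [1, 1],
]

# Transpose: one row per interest, one column per user
interest_rows = [list(column) for column in zip(*user_rows)]

def cosine(v: List[int], w: List[int]) -> float:
    dot_product = sum(v_i * w_i for v_i, w_i in zip(v, w))
    return dot_product / (math.sqrt(sum(v_i * v_i for v_i in v)) *
                          math.sqrt(sum(w_i * w_i for w_i in w)))

# Two users share both interests; "Big Data" has 3 fans and "Hadoop" has 2,
# so the similarity is 2 / (sqrt(3) * sqrt(2)), roughly 0.816
big_data_vs_hadoop = cosine(interest_rows[0], interest_rows[1])
```

Two interests come out as similar when largely the same users hold both, regardless of how similar those users are to the target user overall.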

In [22]:
print(f"{user_interest_vectors[0]=}")

user_interest_vectors[0]=[1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


In [28]:
# Create transpose matrix
from scratch.linear_algebra import shape

interest_user_matrix = [[user_interest_vector[j] for user_interest_vector in user_interest_vectors]
                        for j, _ in enumerate(unique_interests)]

assert shape(interest_user_matrix) == (36,15)        # 36 interests and 15 users
assert shape(user_interest_vectors) == (15,36) 
print(f"{interest_user_matrix[0]=}")                 # See, user 0, 8, 9 show interest in first interest "Big Data"

interest_user_matrix[0]=[1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]


In [29]:
# Cosine similarity between each row i.e. each interest
# i.e. similarity of 36 interests with 36 interests
interest_similarities =  [[cosine_similarity(interest_vector_i, interest_vector_j)
                         for interest_vector_i in interest_user_matrix]
                         for interest_vector_j in interest_user_matrix]

assert shape(interest_similarities) == (36,36)
print(interest_similarities[0]) # similarity between "Big Data" and all other interests

[1.0, 0.0, 0.4082482904638631, 0.3333333333333333, 0.8164965809277261, 0.0, 0.6666666666666666, 0.0, 0.5773502691896258, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5773502691896258, 0.5773502691896258, 0.4082482904638631, 0.0, 0.0, 0.4082482904638631, 0.0, 0.0, 0.0, 0.4082482904638631, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]


In [30]:
def most_similar_interests_to(interest_id: int):
    similarities = interest_similarities[interest_id]
    pairs = [(unique_interests[other_interest_id], similarity)
             for other_interest_id, similarity in enumerate(similarities)
             if interest_id != other_interest_id and similarity > 0]
    return sorted(pairs, key= lambda pairs: pairs[-1], reverse=True)

In [31]:
most_similar_interests_to(0)

[('Hadoop', 0.8164965809277261),
 ('Java', 0.6666666666666666),
 ('MapReduce', 0.5773502691896258),
 ('Spark', 0.5773502691896258),
 ('Storm', 0.5773502691896258),
 ('Cassandra', 0.4082482904638631),
 ('artificial intelligence', 0.4082482904638631),
 ('deep learning', 0.4082482904638631),
 ('neural networks', 0.4082482904638631),
 ('HBase', 0.3333333333333333)]

In [32]:
# sum up similarities of interests similar to target user

# interests similar to target user's interest are:
# print(f" most_similar_interests = {most_similar_interests_to(0)}")

def item_based_suggestions(user_id: int,
                           include_current_interests: bool = False):
    # Add similar interests
    suggestions = defaultdict(float)
    user_interest_vector = user_interest_vectors[user_id]
    # print(f"{user_interest_vector=}")

    for interest_id, is_interested in enumerate(user_interest_vector):
        if is_interested == 1:
            # print(f"{interest_id=}")
            similar_interests = most_similar_interests_to(interest_id)
            # print(f"{similar_interests=}")
            for interest, similarity in similar_interests:
                suggestions[interest] += similarity
        # print(suggestions)
    
    # Sort them by weight
    suggestions = sorted(suggestions.items(),
                         key = lambda pair: pair[-1],
                         reverse= True)
    # print(suggestions)

    if include_current_interests:
        return suggestions
    else:
        return [(suggestion, weight)
                for suggestion, weight in suggestions
                if suggestion not in users_interests[user_id]]
        

In [33]:
item_based_suggestions(0)  

[('MapReduce', 1.861807319565799),
 ('MongoDB', 1.3164965809277263),
 ('Postgres', 1.3164965809277263),
 ('NoSQL', 1.2844570503761732),
 ('MySQL', 0.5773502691896258),
 ('databases', 0.5773502691896258),
 ('Haskell', 0.5773502691896258),
 ('programming languages', 0.5773502691896258),
 ('artificial intelligence', 0.4082482904638631),
 ('deep learning', 0.4082482904638631),
 ('neural networks', 0.4082482904638631),
 ('C++', 0.4082482904638631),
 ('Python', 0.2886751345948129),
 ('R', 0.2886751345948129)]

# 4. Matrix Factorization for Recommendation

- Sometimes, instead of a 1/0 interest matrix, we have numeric ratings.
- Example: Amazon review ratings range from 1 to 5 stars.
- In this section we’ll assume we have such ratings data and try to learn a model that can predict the rating for a given user and item.


1. User preference and matrix representation
   
   - Users' preferences can be represented as a [num_users, num_items] matrix.
   - In scenarios like Amazon reviews, ratings (e.g., 1 to 5 stars) can be represented as numbers in the same matrix format.

2. Predictive Modeling

- Goal: Develop a model to predict ratings for a given user-item pair.
- Latent Types: Assume each user and item has a latent "type" represented by a vector of numbers.
- User Types: Represented as a [num_users, dim] matrix.
- Item Types: Represented as a [dim, num_items] matrix.

3. Matrix Factorization
- Factoring Process: Decompose the preference matrix into the product of a user matrix and an item matrix.
- The product of these matrices approximates the original user-item rating matrix.

4. Dataset used
- MovieLens 100k Dataset: Contains user ratings (1 to 5) for many movies.
- Users rate only a subset of movies.
- The system aims to predict ratings for movies a user hasn't rated.
- Data download link: http://files.grouplens.org/datasets/movielens/ml-100k.zip
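
The shapes involved in the factorization can be checked with a tiny example. This is only a sketch: the factor values are made up (in practice they are learned), `dim` is 2, and `matmul` is a hypothetical plain-Python helper rather than a library function.

```python
from typing import List

Matrix = List[List[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product; assumes the inner dimensions agree."""
    return [[sum(a_ik * b_kj for a_ik, b_kj in zip(row, col))
             for col in zip(*b)]
            for row in a]

# Hypothetical latent factors with dim = 2
user_factors: Matrix = [[1.0, 0.5],     # user 0
                        [0.0, 2.0],     # user 1
                        [1.5, 1.0]]     # user 2   -> shape [3, 2]
item_factors: Matrix = [[2.0, 0.0, 1.0, 3.0],
                        [1.0, 2.0, 0.0, 0.5]]      # shape [2, 4]

# [num_users, dim] times [dim, num_items] gives [num_users, num_items]
approx_ratings = matmul(user_factors, item_factors)
```

Each entry `approx_ratings[u][i]` is the dot product of user `u`'s row and item `i`'s column, so the model can produce a number for every user-item pair, including pairs that were never rated.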

**Let's code matrix factorization**

1. We will only use two files
> - 1. u.item is a pipe-delimited file with many columns, but we only need the first two (Movie id | Movie title).
> - 2. u.data is a tab-delimited file with 4 columns. We only need the first three (User id, Movie id, Rating).


In [34]:
# variables holding path of file

MOVIES = "ml-100k/u.item"
RATINGS = "ml-100k/u.data"

2. Create a class of named tuple for easy handling of data

In [35]:
from typing import NamedTuple

class Rating(NamedTuple):
    user_id: str
    movies_id: str
    rating: float

# The movie ID and user IDs are actually integers, but they’re not consecutive, which
# means if we worked with them as integers we’d end up with a lot of wasted dimensions
# (unless we renumbered everything). So to keep it simpler we’ll just treat them as strings

3. Read files and explore.

In [36]:
# We specify this encoding to avoid a UnicodeDecodeError.
# See: https://stackoverflow.com/a/53136168/1076346.

import csv

with open(MOVIES, encoding = "iso-8859-1") as f:
    reader = csv.reader(f, delimiter = "|")
    movies = {movie_id: title for movie_id, title, *_ in reader}   # 1st column is the id, 2nd column is the movie title
print(list(movies.items())[:5])

[('1', 'Toy Story (1995)'), ('2', 'GoldenEye (1995)'), ('3', 'Four Rooms (1995)'), ('4', 'Get Shorty (1995)'), ('5', 'Copycat (1995)')]


In [37]:
# List of [Rating]

with open(RATINGS, encoding="iso-8859-1") as f:
    reader = csv.reader(f, delimiter = "\t")
    ratings = [Rating(user_id, movies_id, float(rating)) for user_id, movies_id, rating, _ in reader]

print(ratings[:5])

[Rating(user_id='196', movies_id='242', rating=3.0), Rating(user_id='186', movies_id='302', rating=3.0), Rating(user_id='22', movies_id='377', rating=1.0), Rating(user_id='244', movies_id='51', rating=2.0), Rating(user_id='166', movies_id='346', rating=1.0)]


In [38]:
assert len(movies) == 1682
assert len({rating.user_id for rating in ratings}) == 943  # the set comprehension removes duplicate user ids


3. Let's play with the data -- find average ratings of 'Star Wars' movies

In [39]:
import re
from typing import Dict, List

# Create a dictionary mapping each Star Wars movie id to an empty list of ratings.
# Note: the pattern below spells "Enpire", so "Empire Strikes Back" is not matched
# and only two titles are collected.

star_wars_ratings: Dict[str, List[int]] = {
    movie_id: []
    for movie_id, title in movies.items()
    if re.search("Star Wars|Enpire Strikes Back|Jedi", title)
}
star_wars_ratings  # star wars movie id : empty list

{'50': [], '181': []}

In [40]:
# Iterate over ratings, accumulating Star wars ones
for rating in ratings:
    if rating.movies_id in star_wars_ratings:
        star_wars_ratings[rating.movies_id].append(int(rating.rating))  # movie_id : List of ratings by users

In [41]:
# Compute average rating of each movie
avg_ratings = [(sum(title_ratings) / len(title_ratings), movie_id) 
                for movie_id, title_ratings in star_wars_ratings.items()]

In [42]:
# print them in order
for avg_rating, movie_id in sorted(avg_ratings, reverse= True):
    print(f"{avg_rating:.2f} {movies[movie_id]}")
    

4.36 Star Wars (1977)
4.01 Return of the Jedi (1983)


4. Let's create a model to predict these ratings

In [43]:
# split ratings data for training/testing

import random

random.seed(0)
random.shuffle(ratings)

split1 = int(len(ratings) * 0.7)
split2 = int(len(ratings) * 0.85)

train = ratings[:split1]  # 70% of data
validation = ratings[split1:split2]  # 15% of data
test = ratings[split2:] # 15% of data

4(a). Create a baseline model

- We will calculate the average of ALL ratings in the training set.
- It provides a simple benchmark to compare against our more complex model.
- If the trained model does not outperform this baseline, it is not worth the extra effort (it may be overfitted, use poorly selected features, or contain an implementation error).
- It gives us a reference error against which to measure improvements to our model.

Overall, it helps ensure that any improvement in our model's predictions comes from the model's capabilities, not from random chance.


In [45]:
# Create a baseline model

avg_rating = sum(int(rating.rating) for rating in train) / len(train)

baseline_error = sum([(rating.rating - avg_rating) ** 2
                     for rating in test]) / len(test)            # MSE error
print(baseline_error)  # We hope to do better than MSE 1.26

1.2609526646939684


4(b). Embeddings

1. We create embeddings for users and movies. An embedding is simply a vector of numbers that represents a user or a movie.
2. The dot product of a user's embedding and a movie's embedding gives a single number, which is our predicted rating. We need to train the embeddings so that this product approximates the actual ratings.
3. We store the embeddings in dicts of the form {ID: Vector}.
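
As a quick illustration of point 2, the predicted rating for one user-movie pair is just a dot product of two short vectors. The IDs and vector values below are hypothetical, and `dot` is a local stand-in for the helper the notebook imports later.

```python
from typing import Dict, List

def dot(v: List[float], w: List[float]) -> float:
    """Sum of element-wise products."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

# Hypothetical 2-dimensional embeddings keyed by string ID
toy_user_vectors: Dict[str, List[float]] = {"u1": [0.5, 1.0]}
toy_movie_vectors: Dict[str, List[float]] = {"m1": [2.0, 3.0]}

def predict(user_id: str, movie_id: str) -> float:
    """Predicted rating = dot product of the two embeddings."""
    return dot(toy_user_vectors[user_id], toy_movie_vectors[movie_id])

toy_prediction = predict("u1", "m1")   # 0.5 * 2.0 + 1.0 * 3.0
```

Training then amounts to nudging these vectors so that predictions like this one move closer to the observed ratings.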

In [46]:
from scratch.deep_learning import random_tensor

EMBEDDING_DIM = 2 # 2 element vectors

# Find unique ids

user_ids = {rating.user_id for rating in ratings}   # set of user ids
movie_ids = {rating.movies_id for rating in ratings}  # set of movie ids

In [47]:
# Random vector per id

user_vectors = {user_id: random_tensor(EMBEDDING_DIM) 
                for user_id in user_ids}
movie_vectors = {movie_id: random_tensor(EMBEDDING_DIM)
                 for movie_id in movie_ids}

In [49]:
# Let's write training loop
from typing import List, Optional
import tqdm
from scratch.linear_algebra import dot

def loop(dataset: List[Rating], learning_rate: Optional[float] = None) -> None:
    with tqdm.tqdm(dataset) as t:  # Initializes dataset with progress bar
        loss = 0 
        avg_loss = 0
        for i, rating in enumerate(t):  # i is index of each row of dataset and rating is tuple
            movie_vector = movie_vectors[rating.movies_id]
            user_vector = user_vectors[rating.user_id]
            predicted = dot(user_vector, movie_vector)
            error = predicted - float(rating.rating)
            loss += error**2

            if learning_rate is not None:  # only update the vectors when a learning rate is given
                # Gradient of error**2 w.r.t. the user vector is 2 * error * movie_vector;
                # the constant factor 2 is dropped here (it is absorbed into the learning rate)
                user_gradient = [error * m_j for m_j in movie_vector]
                movie_gradient = [error * u_j for u_j in user_vector]

                # Take gradient steps
                for j in range(EMBEDDING_DIM):
                    user_vector[j] -= learning_rate * user_gradient[j]   # in-place update of user_vectors
                    movie_vector[j] -= learning_rate * movie_gradient[j] # in-place update of movie_vectors
            avg_loss = loss/(i+1)
            if i%1000 == 0:
                t.set_description(f"avg loss: {avg_loss=}")   # refresh the displayed average loss every 1000 ratings
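
To see what a single update inside the loop does, the sketch below isolates the gradient step and applies it repeatedly to one made-up user-movie pair. The starting vectors, the rating of 4.0, and the helper name `sgd_step` are assumptions for illustration. The update rule mirrors the loop above (error times the other vector, scaled by the learning rate).

```python
from typing import List

def dot(v: List[float], w: List[float]) -> float:
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def sgd_step(user_vector: List[float], movie_vector: List[float],
             rating: float, learning_rate: float) -> float:
    """Update both vectors in place; return the squared error before the update."""
    error = dot(user_vector, movie_vector) - rating
    # Compute both gradients before touching either vector
    user_gradient = [error * m_j for m_j in movie_vector]
    movie_gradient = [error * u_j for u_j in user_vector]
    for j in range(len(user_vector)):
        user_vector[j] -= learning_rate * user_gradient[j]
        movie_vector[j] -= learning_rate * movie_gradient[j]
    return error ** 2

# Hypothetical starting vectors and a single observed rating of 4.0
toy_user = [0.1, 0.2]
toy_movie = [0.3, 0.4]
losses = [sgd_step(toy_user, toy_movie, rating=4.0, learning_rate=0.05)
          for _ in range(100)]
final_prediction = dot(toy_user, toy_movie)
```

On this single pair the squared error shrinks step by step and the prediction approaches 4.0. On the real dataset each vector is pulled by many ratings at once, so the loss settles at a compromise rather than at zero.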

In [51]:
# x = int(0.7 * len(ratings))
# train_data2 = ratings[:x]

learning_rate = 0.05
for epoch in range(20):
    learning_rate *= 0.9  #let's reduce LR with each step
    # print(epoch, learning_rate)
    loop(train, learning_rate=learning_rate)
    loop(validation)  
# print(movie_vectors)
loop(test)

avg loss: avg_loss=0.9314589262782678: 100%|██████████| 70000/70000 [00:00<00:00, 534661.39it/s]
avg loss: avg_loss=0.9970691628623705: 100%|██████████| 15000/15000 [00:00<00:00, 1197209.57it/s]
avg loss: avg_loss=0.9189001503117463: 100%|██████████| 70000/70000 [00:00<00:00, 636801.78it/s]
avg loss: avg_loss=0.9810445040724082: 100%|██████████| 15000/15000 [00:00<00:00, 1157773.32it/s]
avg loss: avg_loss=0.8996865255800943: 100%|██████████| 70000/70000 [00:00<00:00, 647063.72it/s]
avg loss: avg_loss=0.9663587952482031: 100%|██████████| 15000/15000 [00:00<00:00, 1151184.95it/s]
avg loss: avg_loss=0.8827232259038751: 100%|██████████| 70000/70000 [00:00<00:00, 640368.52it/s]
avg loss: avg_loss=0.9535585670634156: 100%|██████████| 15000/15000 [00:00<00:00, 1194164.56it/s]
avg loss: avg_loss=0.8678858520642012: 100%|██████████| 70000/70000 [00:00<00:00, 808464.83it/s]
avg loss: avg_loss=0.942482809725308: 100%|██████████| 15000/15000 [00:00<00:00, 401884.14it/s]
avg loss: avg_loss=0.854866

- On the training data, the model reaches an avg_loss of about 0.76 after 20 epochs.
- On the validation data, the avg_loss is about 0.89.
- On the test data, it is about 0.911.
- Next, we will use PCA to inspect the learned vectors.