### **Movie Recommendation with User Preferences**

**Scenario:** You're building a movie recommendation system. Users can rate movies on a scale of 1 (dislike) to 5 (like). Your goal is to create a system that recommends movies to users based on their preferences.

**Tasks:**

**1. Data Representation:**

1. Represent each user as a vector where each element corresponds to a movie genre. Initialize all elements to 0.
2. For each movie a user rates, update the element for each of that movie's genres using the rating value.

**2. Recommendation:**

1. Define a function that recommends movies for a user. It takes the user's preference vector (or a PCA-reduced version of it) and a list of movies with genre information as input.
2. Calculate a similarity score between the user's preference vector and the genre vector of each movie (e.g., using dot product).
3. Recommend movies with the highest similarity scores to the user.
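
As a rough illustration of the scoring step described above, the sketch below ranks two made-up movies for one user with a dot product. The genre order, the preference values, and the movie names are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical genre order, for illustration only.
genres = ["Action", "Crime", "Drama"]

# A user whose preference scores are Action 3, Crime 4, Drama 5.
user_pref = np.array([3.0, 4.0, 5.0])

# Binary genre vectors for two made-up movies (1 = movie has that genre).
movie_vectors = {
    "Crime drama": np.array([0.0, 1.0, 1.0]),
    "Action film": np.array([1.0, 0.0, 0.0]),
}

# The dot product adds up the user's scores for the genres a movie has.
scores = {title: float(np.dot(user_pref, vec)) for title, vec in movie_vectors.items()}

# Rank titles from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)  # {'Crime drama': 9.0, 'Action film': 3.0}
print(ranked)  # ['Crime drama', 'Action film']
```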


### Importing Libraries

In [1]:
import pandas as pd
import numpy as np

### Reading the CSV file

In [2]:
# Local path to the ratings file (not a URL)
csv_path = "/home/haria/Documents/Learnings/LLM Advanced/Mathematics/movie_ratings.csv"
movie_ratings = pd.read_csv(csv_path, encoding='utf-8')

### Printing the first 5 rows

In [3]:
movie_ratings.head()

Unnamed: 0,movieId,userId,title,genres,rating
0,m1,u1,The Godfather,"['Crime', 'Drama']",5
1,m2,u1,The Shawshank Redemption,"['Drama', 'Crime']",4
2,m3,u1,The Dark Knight,"['Action', 'Crime']",3
3,m4,u2,The Lord of the Rings: The Fellowship of the Ring,"['Adventure', 'Fantasy']",4
4,m5,u2,The Matrix,"['Action', 'Sci-Fi']",5


### Getting unique genres

In [4]:
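import ast

get_genres = set()
for genres in movie_ratings['genres']:
    # Each cell holds a list literal such as "['Crime', 'Drama']".
    # ast.literal_eval parses it without executing arbitrary code, unlike eval.
    get_genres.update(ast.literal_eval(genres))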

get_genres

{'Action',
 'Adventure',
 'Comedy',
 'Crime',
 'Drama',
 'Fantasy',
 'History',
 'Romance',
 'Sci-Fi',
 'Thriller',
 'War'}
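
The same set of genres can also be collected without an explicit loop. The sketch below is one possible variant, not the notebook's method: it runs on a small stand-in DataFrame (the `sample` name and its two rows are assumptions) and uses `ast.literal_eval` together with pandas' `Series.explode`.

```python
import ast
import pandas as pd

# Small stand-in for movie_ratings; the real data comes from the CSV file.
sample = pd.DataFrame({
    "title": ["The Godfather", "The Matrix"],
    "genres": ["['Crime', 'Drama']", "['Action', 'Sci-Fi']"],
})

# Turn each string into a real Python list. literal_eval accepts only literals.
parsed = sample["genres"].apply(ast.literal_eval)

# explode() yields one row per genre; unique() keeps the distinct names.
unique_genres = sorted(parsed.explode().unique())
print(unique_genres)  # ['Action', 'Crime', 'Drama', 'Sci-Fi']
```

Parsing the column once and storing the result would also avoid re-parsing the same strings in later cells.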

### Assigning an index to each unique genre

In [5]:
# Note: a set has no guaranteed order, so these IDs can differ between runs.
genre_ids = {genre: idx for idx, genre in enumerate(get_genres)}

print("\nUnique Movie Genres with IDs:")
for genre in get_genres:
    print(f"{genre}: {genre_ids[genre]}")


Unique Movie Genres with IDs:
Adventure: 0
Sci-Fi: 1
Comedy: 2
Crime: 3
War: 4
Action: 5
Drama: 6
Thriller: 7
History: 8
Romance: 9
Fantasy: 10
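
The IDs printed above follow the iteration order of a set, which for strings can change from one Python session to the next. If reproducible indices matter, one option is to sort the genres before enumerating them. This is a sketch on a stand-in set, and the IDs it yields would differ from the ones used in the rest of this notebook.

```python
# Stand-in for the get_genres set built earlier.
get_genres = {"Drama", "Action", "Crime", "Sci-Fi"}

# Sorting fixes the order, so each genre gets the same index on every run.
genre_ids = {genre: idx for idx, genre in enumerate(sorted(get_genres))}
print(genre_ids)  # {'Action': 0, 'Crime': 1, 'Drama': 2, 'Sci-Fi': 3}
```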


### Creating and initializing user vectors

In [7]:
user_vector = {}
for user_id in movie_ratings['userId'].unique():
    user_vector[user_id] = [0] * len(get_genres)
user_vector

{'u1': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u2': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u3': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u4': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u5': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u6': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u7': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u8': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u9': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u10': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u11': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u12': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u13': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u14': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'u15': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}

### Updating user vectors with ratings

In [8]:
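import ast

for idx, row in movie_ratings.iterrows():
    user_id = row['userId']
    genres = ast.literal_eval(row['genres'])  # parse the list literal without eval
    rating = row['rating']

    for genre in genres:
        genre_idx = genre_ids[genre]
        if user_vector[user_id][genre_idx] == 0:
            # First rating for this genre (0 marks "not rated yet")
            user_vector[user_id][genre_idx] = rating
        else:
            # Average the stored value with the new rating. This weights recent
            # ratings more heavily; it is not the arithmetic mean of all ratings.
            current_value = user_vector[user_id][genre_idx]
            user_vector[user_id][genre_idx] = (current_value + rating) / 2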

print("\nUser Vectors:")
for user_id, vector in user_vector.items():
    print(f"User {user_id}: {vector}")


User Vectors:
User u1: [0, 0, 0, 3.75, 0, 3, 4.5, 0, 0, 0, 0]
User u2: [4, 5, 0, 0, 0, 5, 0, 0, 0, 0, 4]
User u3: [5, 5, 4, 4, 0, 5, 0, 0, 0, 0, 0]
User u4: [0, 0, 0, 5, 0, 0, 4.5, 0, 0, 4, 5]
User u5: [0, 0, 0, 4, 0, 0, 5, 4, 0, 0, 0]
User u6: [0, 0, 0, 0, 4, 0, 4.5, 0, 5, 0, 0]
User u7: [0, 0, 0, 4.5, 0, 0, 4.5, 0, 0, 0, 0]
User u8: [4, 0, 0, 5, 0, 5, 0, 0, 0, 0, 4]
User u9: [4, 5, 0, 0, 4, 5, 0, 0, 0, 0, 0]
User u10: [0, 0, 5, 4.5, 0, 0, 4, 0, 0, 0, 0]
User u11: [0, 0, 0, 5, 0, 0, 4.5, 5, 0, 4, 0]
User u12: [0, 0, 0, 5, 0, 0, 4.5, 4, 0, 0, 5]
User u13: [0, 0, 0, 0, 4, 0, 4.5, 0, 5, 0, 0]
User u14: [0, 0, 0, 4.5, 0, 0, 4.5, 0, 0, 0, 0]
User u15: [0, 0, 0, 5, 0, 5, 0, 0, 0, 0, 0]
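
The update rule above averages the stored value with each new rating, which is not the same as the arithmetic mean once a genre has three or more ratings. The sketch below compares the two on the three Crime ratings that `u1` has in the first rows of the data (5, 4, 3), then shows one possible way to get true per-genre means with pandas. The `df` frame is a stand-in with already-parsed genre lists, not the notebook's `movie_ratings`.

```python
import pandas as pd

ratings = [5, 4, 3]  # u1's Crime ratings, in the order they appear

# Rule used in the notebook: average the stored value with each new rating.
running = 0
for r in ratings:
    running = r if running == 0 else (running + r) / 2

# Plain arithmetic mean of the same ratings.
mean = sum(ratings) / len(ratings)
print(running, mean)  # 3.75 4.0

# Stand-in data with genres already parsed into lists.
df = pd.DataFrame({
    "userId": ["u1", "u1", "u1"],
    "genres": [["Crime", "Drama"], ["Drama", "Crime"], ["Action", "Crime"]],
    "rating": [5, 4, 3],
})

# One row per (user, genre, rating), then the mean rating per user and genre.
genre_means = df.explode("genres").groupby(["userId", "genres"])["rating"].mean()
print(genre_means.to_dict())
# {('u1', 'Action'): 3.0, ('u1', 'Crime'): 4.0, ('u1', 'Drama'): 4.5}
```

Which rule is preferable depends on whether recent ratings should count for more. The running rule reproduces the 3.75 shown for `u1` above, while the arithmetic mean would give 4.0.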


### Defining the recommend_movies function

In [9]:
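def recommend_movies(user_id, user_vectors, movie_df, genre_ids, num_recommendations=3):
    """Return the top-scoring (title, score) pairs for a user.

    movie_df needs a 'title' column and a 'genres' column of '|'-separated names.
    """
    user_vector = np.array(user_vectors[user_id])
    recommendations = []
    for _, row in movie_df.iterrows():
        movie_title = row['title']
        genres = row['genres'].split('|')
        # Build a binary genre vector; genres missing from genre_ids are skipped
        movie_vector = np.zeros(len(genre_ids))
        for genre in genres:
            if genre in genre_ids:
                movie_vector[genre_ids[genre]] = 1
        # The dot product sums the user's scores over the movie's genres
        similarity_score = np.dot(user_vector, movie_vector)
        recommendations.append((movie_title, similarity_score))
    # Highest score first; ties keep their original order (the sort is stable)
    recommendations.sort(key=lambda x: x[1], reverse=True)
    return recommendations[:num_recommendations]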
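
With binary movie vectors, the dot product ignores any genre the user has not rated, so a movie that adds such a genre scores the same as one without it. Cosine similarity is a common alternative that normalizes both vectors. The sketch below is an assumption-labeled variant, not part of the original notebook; `cosine_score` is a hypothetical helper and the vectors are made up.

```python
import numpy as np

def cosine_score(user_vec, movie_vec):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(user_vec) * np.linalg.norm(movie_vec)
    return float(np.dot(user_vec, movie_vec) / denom) if denom else 0.0

# Made-up user who rated genre 0 and genre 1 but not genre 2.
user = np.array([4.0, 3.0, 0.0])
one_genre = np.array([1.0, 0.0, 0.0])   # movie with genre 0 only
two_genres = np.array([1.0, 0.0, 1.0])  # movie with genre 0 plus the unrated genre 2

# The dot product cannot tell the two movies apart.
print(np.dot(user, one_genre), np.dot(user, two_genres))  # 4.0 4.0

# Cosine similarity ranks the movie without the unrated genre higher.
print(cosine_score(user, one_genre))
print(cosine_score(user, two_genres))
```

Swapping this in for `np.dot` inside `recommend_movies` would change the scale of the scores (cosine values lie between -1 and 1) and may change the ranking.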

### Calling the function with sample movies

In [12]:
movies_df = pd.DataFrame({
    'title': ['Life Is Beautiful', 'Cobra', 'Student Of The Year', 'Evaru', 'Ghazi Attack', 'Major'],
    'genres': ['Fantasy|Comedy', 'Action|Thriller', 'Adventure|Comedy', 'Crime|Thriller', 'War|History',
               'Action|Thriller']
})
user_id = "u1"
recommendations = recommend_movies(user_id, user_vector, movies_df, genre_ids)
print("\nRecommended Movies:")
for movie, score in recommendations:
    print(f"{movie} (Similarity Score: {score})")


Recommended Movies:
Evaru (Similarity Score: 3.75)
Cobra (Similarity Score: 3.0)
Major (Similarity Score: 3.0)
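
The per-movie loop can also be written as a single matrix product. The sketch below is one possible vectorized form: it copies the genre IDs and the `u1` vector from the outputs above, builds a movie-by-genre matrix for the same six titles, and scores all of them at once. The names `movie_matrix` and `top_titles` are introduced here for illustration.

```python
import numpy as np

# Genre IDs and u1's vector, copied from the outputs shown earlier.
genre_ids = {"Adventure": 0, "Sci-Fi": 1, "Comedy": 2, "Crime": 3, "War": 4,
             "Action": 5, "Drama": 6, "Thriller": 7, "History": 8,
             "Romance": 9, "Fantasy": 10}
u1 = np.array([0, 0, 0, 3.75, 0, 3, 4.5, 0, 0, 0, 0])

titles = ["Life Is Beautiful", "Cobra", "Student Of The Year",
          "Evaru", "Ghazi Attack", "Major"]
genres = ["Fantasy|Comedy", "Action|Thriller", "Adventure|Comedy",
          "Crime|Thriller", "War|History", "Action|Thriller"]

# One row per movie, one column per genre (1 = movie has that genre).
movie_matrix = np.zeros((len(titles), len(genre_ids)))
for row, genre_str in enumerate(genres):
    for genre in genre_str.split("|"):
        movie_matrix[row, genre_ids[genre]] = 1

# A single matrix-vector product gives every movie's dot-product score.
scores = movie_matrix @ u1

# Stable argsort on the negated scores keeps tied movies in their input order.
top = np.argsort(-scores, kind="stable")[:3]
top_titles = [titles[i] for i in top]
print(top_titles)  # ['Evaru', 'Cobra', 'Major']
```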
