## Individual Recommender system
Running the cell below you can actually put the individual recommender system in practice. Do not add any actual code about the recommenders in this notebook, do that in 'recommender.ipynb' and then import the necessary methods you need. (ask chatgpt how to do so) 

In [10]:
import pandas as pd
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
import keyboard
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsRegressor

In [57]:

ratings_df = pd.read_csv('Data/filtered_ratings.csv')
movies = pd.read_csv('Data/movies.csv')

movie_ratings = {}

while True:
    movie_name = input("Enter a movie name (or 'done' to finish): ")
    
    if movie_name.lower() == 'done' or keyboard.is_pressed('esc'):
        break
    
    matched_movies = process.extractOne(movie_name, movies['title'], scorer=fuzz.token_set_ratio)

    if matched_movies[1] >= 80:  # Similarity threshold
        movie_id = movies[movies['title'] == matched_movies[0]]['movieId'].values[0]
        correct_movie_name = matched_movies[0]
        while True:
            rating = input(f"Enter a rating for '{correct_movie_name}' (1-5): ")
            try:
                rating = float(rating)
                if 1 <= rating <= 5:
                    break
               
                else:
                    print("Rating must be between 1 and 5.")
            except ValueError:
                print("Invalid rating. Please enter a number between 1 and 5.")
        
        movie_ratings[movie_id] = rating
        print(f"Rating for '{correct_movie_name}' added.")
    else:
        print(f"Movie '{movie_name}' not found in the database.")
    
if movie_ratings:
    while True:
        recommendation_type = input("Enter 'genre' for genre-based recommendation or 'user' for similar user recommendation: ").lower()
        if recommendation_type == 'genre':

            #----------------------------ADD YOUR WORK HERE-----------------------------
            print('ADD HERE YOUR GENRE-BASED RECOMMENDATIONS BASED ON \'movie_ratings\'')
            print('ADD HERE YOUR EXPLANATION ON THE MOVIE RECOMMENDATIONS')
            #----------------------------ADD YOUR WORK HERE-----------------------------

        elif recommendation_type == 'user':

            #----------------------------ADD YOUR WORK HERE-----------------------------
            print("ADD HERE YOUR USER-BASED RECOMMENDATIONS BASED ON \'movie_ratings\'")
            print('ADD HERE YOUR EXPLANATION ON THE MOVIE RECOMMENDATIONS')
            #----------------------------ADD YOUR WORK HERE-----------------------------

        else:
            print("Invalid choice. Please enter 'genre' or 'user'.")
        break


print("MovieId-Rating Dictionary:")
print(movie_ratings)


Enter a movie name (or 'done' to finish): aquaman
Movie 'aquaman' not found in the database.
Enter a movie name (or 'done' to finish): terminator
Enter a rating for 'Terminator 2: Judgment Day (1991)' (1-5): 5
Rating for 'Terminator 2: Judgment Day (1991)' added.
Enter a movie name (or 'done' to finish): rambo
Enter a rating for 'Rambo: First Blood Part II (1985)' (1-5): 4
Rating for 'Rambo: First Blood Part II (1985)' added.
Enter a movie name (or 'done' to finish): jumanji
Enter a rating for 'Jumanji (1995)' (1-5): 4
Rating for 'Jumanji (1995)' added.
Enter a movie name (or 'done' to finish): nemo
Enter a rating for 'Little Nemo: Adventures in Slumberland (1992)' (1-5): 2
Rating for 'Little Nemo: Adventures in Slumberland (1992)' added.
Enter a movie name (or 'done' to finish): done
Enter 'genre' for genre-based recommendation or 'user' for similar user recommendation: genre
ADD HERE YOUR GENRE-BASED RECOMMENDATIONS BASED ON 'movie_ratings'
ADD HERE YOUR EXPLANATION ON THE MOVIE RECO

In [27]:
def get_movie_name(movie_id):
    movie_row = movies[movies['movieId'] == movie_id]

    if not movie_row.empty:
        return movie_row['title'].iloc[0]
    else:
        return "Movie not in the list"

### Indivual Recimmender using KNN

In [58]:
users_ratings = ratings_df.groupby(['userId']).count()

selected = users_ratings['rating'] > 200
selected_users = users_ratings.loc[selected]
random_selected = selected_users.sample() # sample() returns a random row from the dataframe. The returned object is a dataframe with one row. If you pass a number as argument you specify to select more than one row.
select_column_df = random_selected.reset_index()['userId'] # reset_index() create a new index, and the user became a column. Then, we can filter using the column name
selected_user = select_column_df.iloc[0] # iloc select by index, since our dataframe only has one row we read it from the index 0
# selected_user = 19
print("Selected user: " + str(selected_user))

Selected user: 2171


In [59]:
selected_user_ratings = ratings_df.loc[ratings_df['userId'] == selected_user]
selected_user_ratings['item'] = selected_user_ratings.index 
selected_user_ratings.to_csv('selected_user_ratings.csv', index=False)
selected_user_ratings = selected_user_ratings.sort_values(by='item', ascending=True)
print("Rated movies: " + str(selected_user_ratings.shape[0]))
# display(selected_user_ratings.head(10))

Rated movies: 673


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  selected_user_ratings['item'] = selected_user_ratings.index


In [60]:
selected_movie_ids = set(selected_user_ratings['movieId'])
rated_movies_df = movies.reindex(list(selected_user_ratings['movieId']))
# rated_movies_df = movies_df[movies_df['movieId'].isin(selected_movie_ids)]
rated_movies_df = rated_movies_df[['title', 'genres']]
rated_movies_df['item'] = rated_movies_df.index 
rated_movies_df.to_csv('rated_movies_df.csv', index=False)

In [61]:
diff = set(movies.index) - set(rated_movies_df.index)
unrated_movies_df = movies.loc[diff]
# display(unrated_movies_df.head())
unrated_movies_df = unrated_movies_df[['title', 'genres']]
print("Unrated movies: " + str(unrated_movies_df.shape[0]))
# display(unrated_movies_df.head(10))

Unrated movies: 26630


In [62]:
rated_movies_df = rated_movies_df.join(selected_user_ratings.set_index('movieId')['rating'], on='item')
print("Rated movies: " + str(rated_movies_df.shape[0]))
# display(rated_movies_df.head(10))

Rated movies: 673


In [63]:
rated_movies_df['genres'].fillna('', inplace=True)
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(rated_movies_df['genres'])
rated_movies_df['genres'].fillna('', inplace=True)
y = rated_movies_df['rating']
neigh = KNeighborsRegressor(n_neighbors=5)
neigh.fit(X, y)
X_unrated = vectorizer.transform(unrated_movies_df['genres'])
y_unrated = neigh.predict(X_unrated)
unrated_movies_df['predicted_ratings_KNN'] = y_unrated

In [64]:
user_ratings = movie_ratings  # Use the provided 'movie_ratings' dictionary

# Train a KNN model to predict ratings
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(rated_movies_df['genres'])
y = rated_movies_df['rating']

neigh = KNeighborsRegressor(n_neighbors=5)
neigh.fit(X, y)

#Predict ratings for unrated movies
X_unrated = vectorizer.transform(unrated_movies_df['genres'])
y_unrated = neigh.predict(X_unrated)

#Dictionary with movie IDs and predicted ratings
predicted_ratings = {}
for idx, movie_id in enumerate(unrated_movies_df.index):
    predicted_ratings[movie_id] = y_unrated[idx]

user_and_predicted_ratings = {**user_ratings, **predicted_ratings}

# Sort the movies based on the combined ratings
sorted_movies = sorted(user_and_predicted_ratings.items(), key=lambda x: x[1], reverse=True)

# Print the top 10 recommended movies
print("Top 10 Recommended Movies:")
for i, (movie_id, rating) in enumerate(sorted_movies[:10], 1):
    print(f"{i}. Movie ID: {movie_id}, Rating: {rating}")
for i, (movie_id, rating) in enumerate(sorted_movies[:10], 1):
    print(f"{i}. Movie : {get_movie_name(movie_id)}, Rating: {rating}")

Top 10 Recommended Movies:
1. Movie ID: 589, Rating: 5.0
2. Movie ID: 814, Rating: 4.6
3. Movie ID: 3147, Rating: 4.6
4. Movie ID: 4772, Rating: 4.6
5. Movie ID: 6631, Rating: 4.6
6. Movie ID: 7898, Rating: 4.6
7. Movie ID: 8672, Rating: 4.6
8. Movie ID: 10784, Rating: 4.6
9. Movie ID: 11328, Rating: 4.6
10. Movie ID: 16048, Rating: 4.6
1. Movie : Terminator 2: Judgment Day (1991), Rating: 5.0
2. Movie : Boy Called Hate, A (1995), Rating: 4.6
3. Movie : Green Mile, The (1999), Rating: 4.6
4. Movie : Dinner Rush (2000), Rating: 4.6
5. Movie : Man's Best Friend (1993), Rating: 4.6
6. Movie : Junior Bonner (1972), Rating: 4.6
7. Movie : Battle Hymn (1957), Rating: 4.6
8. Movie : Movie not in the list, Rating: 4.6
9. Movie : Movie not in the list, Rating: 4.6
10. Movie : Movie not in the list, Rating: 4.6


## Group Recommender system
In this case I decided not to ask the users for the ratings because it is counterproductive. In this case, the group ratings will be already defined in the code


In [6]:
#Some random group users movie ratings
group_movie_ratings = [
    {1: 5, 2: 3, 13: 4, 3: 4.5},
    {190: 2, 9372: 4, 837: 1.5},
    {89: 3.5, 7521: 3, 90: 3.5}
]

In [7]:
#----------------------------ADD YOUR WORK HERE-----------------------------
print("ADD HERE YOUR MOVIE RECOMMENDATIONS BASED ON \'group_movie_ratings\'")
print('ADD HERE YOUR EXPLANATION ON THE MOVIE RECOMMENDATIONS')
#----------------------------ADD YOUR WORK HERE-----------------------------

ADD HERE YOUR MOVIE RECOMMENDATIONS BASED ON 'group_movie_ratings'
ADD HERE YOUR EXPLANATION ON THE MOVIE RECOMMENDATIONS
