# Movie Recommendation System

This notebook demonstrates a movie recommendation system that uses collaborative filtering to suggest movies to users based on their past ratings.

## Dataset

The dataset consists of 30 users and their ratings for 10 movies. The ratings are on a scale of 1 to 5, with `NaN` indicating that the user has not rated the movie.
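
Representing "not rated" as `NaN` rather than 0 matters for any statistic computed on the ratings. The sketch below illustrates this on a hypothetical three-user, two-movie frame (the name `toy` and its values are made up for illustration and are not part of the notebook's dataset).

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the values are illustrative only.
toy = pd.DataFrame({
    "UserID": [1, 2, 3],
    "Movie1": [5, np.nan, 3],
    "Movie2": [np.nan, np.nan, 4],
})

movies = toy.drop(columns="UserID")

# Number of missing ratings per movie.
missing_per_movie = movies.isna().sum()

# mean() skips NaN by default, so unrated entries do not affect the average.
mean_skipping_nan = movies.mean()

# Treating "not rated" as a rating of 0 pulls every average down.
mean_with_zeros = movies.fillna(0).mean()

print(missing_per_movie)
print(mean_skipping_nan)
print(mean_with_zeros)
```

In this toy frame, the `NaN`-aware mean of `Movie1` is 4.0, while the zero-filled mean drops to about 2.67.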

## Objective

The goal is to:
1. Identify users who have rated (and are therefore assumed to have watched) all 10 movies.
2. For every other user, recommend an unwatched movie based on the ratings of similar users.

## Users Who Have Watched All Movies

The following users have rated all 10 movies:
- User 3
- User 4
- User 18
- User 20
- User 21
- User 30
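
The cell that produces this list is not shown in the notebook. One plausible way to derive it is a row-wise `notna().all(axis=1)` check, sketched here on a hypothetical four-user frame (`toy`, `has_rated_all`, and `complete_user_ids` are illustrative names, not the notebook's).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the notebook's 30-user dataset.
toy = pd.DataFrame({
    "UserID": [1, 2, 3, 4],
    "Movie1": [5, 3, np.nan, 4],
    "Movie2": [np.nan, 2, 3, 3],
    "Movie3": [4, 5, 4, 5],
})

# True for users whose row of ratings contains no NaN.
has_rated_all = toy.drop(columns="UserID").notna().all(axis=1)

complete_user_ids = toy.loc[has_rated_all, "UserID"].tolist()
print(f"Number of users: {len(complete_user_ids)}")
print(f"User IDs: {complete_user_ids}")
```

Applying the same two steps to `ratings_df` should yield the six user IDs listed above.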

## Code Explanation

1. **Dataset Creation:**
   - A dictionary `data` is created with the keys `UserID` and `Movie1` through `Movie10`, each mapping to 30 values (one per user).
   - This dictionary is converted into a pandas DataFrame called `ratings_df`.

2. **Identify Users Who Have Watched All Movies:**
   - Check for `NaN` values in the DataFrame.
   - Identify users with no `NaN` values, indicating they have rated all movies.

3. **User Input:**
   - Prompt the user to enter their User ID (between 1 and 30).

4. **Movie Recommendation:**
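   - If the user has unrated movies, compute each movie's average rating (ignoring `NaN`) to use as a fallback prediction.
   - Compute the cosine similarity between the user and every other user, treating unrated movies as 0.
   - Predict a rating for each unrated movie as the similarity-weighted average of the other users' ratings, and recommend the movie with the highest predicted rating.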
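
The recommendation step can also be written without explicit loops. The following is a minimal vectorized sketch of the same idea, assuming the ratings frame contains only movie columns and that the user is addressed by row position; the function name `predict_unrated` and the toy data are hypothetical and not part of the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity


def predict_unrated(ratings: pd.DataFrame, user_pos: int) -> pd.Series:
    """Similarity-weighted predictions for one user's unrated movies.

    `ratings` holds only movie columns (no UserID); `user_pos` is a row position.
    """
    # Cosine similarity to every user, with unrated entries treated as 0.
    sims = cosine_similarity(ratings.fillna(0).to_numpy())[user_pos].copy()
    sims[user_pos] = 0.0  # exclude the user's own row from the weighting

    values = ratings.to_numpy(dtype=float)
    rated = ~np.isnan(values)  # True where a user rated a movie
    weighted_sum = sims @ np.where(rated, values, 0.0)
    weight_total = sims @ rated.astype(float)

    # Fall back to the movie's plain average when no neighbour carries weight.
    fallback = ratings.mean().to_numpy()
    safe_total = np.where(weight_total > 0, weight_total, 1.0)
    predicted = np.where(weight_total > 0, weighted_sum / safe_total, fallback)

    unrated = ~rated[user_pos]
    return pd.Series(predicted[unrated], index=ratings.columns[unrated])


# Hypothetical toy ratings: the first user has not rated Movie2.
toy = pd.DataFrame({
    "Movie1": [5, 4, 1],
    "Movie2": [np.nan, 5, 2],
    "Movie3": [4, 4, 1],
})

preds = predict_unrated(toy, user_pos=0)
print(preds)
print("Recommended:", preds.idxmax())
```

Only users who actually rated a movie contribute to that movie's prediction, which is why the weights are summed separately per movie rather than once per user.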



In [4]:
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
# Create a dataset with 30 users and 10 movies
data = {
    'UserID': range(1, 31),
    'Movie1': [5, 3, 4, 4, np.nan, 5, 4, 1, np.nan, 3, 2, 5, 4, 3, 4, np.nan, 2, 4, 5, 1, 3, 5, np.nan, 4, 2, 3, 4, 5, np.nan, 2],
    'Movie2': [np.nan, 2, 3, 3, 2, 4, np.nan, 3, 3, 2, 4, np.nan, 5, 3, 3, 2, 4, 5, 3, 4, 5, 2, 4, 3, 2, 5, 3, np.nan, 2, 3],
    'Movie3': [4, np.nan, 4, 5, 1, 4, 4, 2, 2, 5, 3, 4, 5, 2, 1, 4, np.nan, 3, 5, 2, 1, 4, 3, 2, 5, 4, np.nan, 1, 5, 3],
    'Movie4': [5, 3, 4, 3, np.nan, 5, np.nan, 3, 3, 4, 5, 2, 4, 3, 5, 2, 4, 3, 2, 5, 4, np.nan, 2, 3, 5, 2, 3, 4, 5, 1],
    'Movie5': [3, 4, 2, 4, 3, np.nan, 5, np.nan, 2, 4, 5, 3, 4, np.nan, 3, 2, 5, 4, 3, 5, 2, 4, 3, np.nan, 4, 5, 2, 3, 4, 5],
    'Movie6': [4, 3, 4, 3, 4, 5, np.nan, 3, 4, 3, 2, 4, 5, 3, np.nan, 2, 4, 5, 3, 2, 4, 5, 3, 4, np.nan, 2, 5, 3, 4, 5],
    'Movie7': [5, 4, 5, 2, np.nan, 4, 4, 3, 3, np.nan, 4, 5, 3, 2, 1, 4, 3, 5, np.nan, 2, 5, 4, 3, 2, 1, 4, 3, 2, 5, 4],
    'Movie8': [4, 3, 5, 4, 3, 5, 5, 3, 4, 4, np.nan, 2, 4, 5, 3, 4, 2, 5, 4, 3, 2, 4, 5, 3, 4, np.nan, 2, 5, 3, 4],
    'Movie9': [3, np.nan, 4, 4, 2, 3, 5, 2, 2, 5, 4, 3, np.nan, 5, 4, 3, 2, 5, 4, 3, 2, 5, 3, 4, 3, 5, 4, 2, 3, 5],
    'Movie10': [4, 3, 5, 5, 3, 4, np.nan, 4, 4, 4, 3, 2, 4, 5, 3, np.nan, 4, 5, 3, 2, 4, 5, 3, 2, 5, 4, 3, 5, 4, 3]
}

# Convert the dictionary to a DataFrame
ratings_df = pd.DataFrame(data)
ratings_df

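|   | UserID | Movie1 | Movie2 | Movie3 | Movie4 | Movie5 | Movie6 | Movie7 | Movie8 | Movie9 | Movie10 |
|---|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|---------|
| 0 | 1 | 5.0 | NaN | 4.0 | 5.0 | 3.0 | 4.0 | 5.0 | 4.0 | 3.0 | 4.0 |
| 1 | 2 | 3.0 | 2.0 | NaN | 3.0 | 4.0 | 3.0 | 4.0 | 3.0 | NaN | 3.0 |
| 2 | 3 | 4.0 | 3.0 | 4.0 | 4.0 | 2.0 | 4.0 | 5.0 | 5.0 | 4.0 | 5.0 |
| 3 | 4 | 4.0 | 3.0 | 5.0 | 3.0 | 4.0 | 3.0 | 2.0 | 4.0 | 4.0 | 5.0 |
| 4 | 5 | NaN | 2.0 | 1.0 | NaN | 3.0 | 4.0 | NaN | 3.0 | 2.0 | 3.0 |
| 5 | 6 | 5.0 | 4.0 | 4.0 | 5.0 | NaN | 5.0 | 4.0 | 5.0 | 3.0 | 4.0 |
| 6 | 7 | 4.0 | NaN | 4.0 | NaN | 5.0 | NaN | 4.0 | 5.0 | 5.0 | NaN |
| 7 | 8 | 1.0 | 3.0 | 2.0 | 3.0 | NaN | 3.0 | 3.0 | 3.0 | 2.0 | 4.0 |
| 8 | 9 | NaN | 3.0 | 2.0 | 3.0 | 2.0 | 4.0 | 3.0 | 4.0 | 2.0 | 4.0 |
| 9 | 10 | 3.0 | 2.0 | 5.0 | 4.0 | 4.0 | 3.0 | NaN | 4.0 | 5.0 | 4.0 |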


Users Who Have Watched All Movies

Number of Users: 6

User IDs: 3, 4, 18, 20, 21, 30

In [5]:
# Get user input for the user ID
user_id = int(input("Enter your User ID (1-30): "))

# Check if the user ID is valid
if user_id < 1 or user_id > 30:
    print("Invalid User ID. Please enter a valid User ID between 1 and 30.")
else:
    # Find movies that the user hasn't watched yet
    user_ratings = ratings_df.loc[ratings_df['UserID'] == user_id].drop('UserID', axis=1).values.flatten()
    unwatched_movies = np.isnan(user_ratings)

    # Check if there are any unwatched movies
    if not np.any(unwatched_movies):
        print(f"User {user_id} has watched all the movies.")
    else:
        # Per-movie average rating (NaN ignored); used as a fallback prediction below
        avg_ratings = ratings_df.drop('UserID', axis=1).mean()

        # Pairwise cosine similarity between all users, with unrated movies treated as 0
        user_movie_matrix = ratings_df.drop('UserID', axis=1).fillna(0).values
        similarities = cosine_similarity(user_movie_matrix)

        # Row position of the user in the similarity matrix
        # (assumes UserIDs run from 1 to 30 in row order, as in the dataset above)
        user_index = user_id - 1
        similar_users = similarities[user_index]

        # Predict ratings for unwatched movies based on similar users
        predicted_ratings = {}
        for i in range(len(unwatched_movies)):
            if unwatched_movies[i]:
                # Average rating of the movie; use .iloc because the Series is indexed by movie name
                avg_rating = avg_ratings.iloc[i]

                # Weighted sum of ratings from similar users
                weighted_sum = 0
                sum_of_weights = 0
                for j in range(len(similar_users)):
                    if j != user_index and not np.isnan(ratings_df.iloc[j, i+1]):
                        weighted_sum += similar_users[j] * ratings_df.iloc[j, i+1]
                        sum_of_weights += similar_users[j]

                # Predicted rating
                if sum_of_weights != 0:
                    predicted_rating = weighted_sum / sum_of_weights
                else:
                    predicted_rating = avg_rating

                predicted_ratings[f'Movie{i+1}'] = predicted_rating

        # Check if there are any predicted ratings
        if predicted_ratings:
            # Recommend the movie with the highest predicted rating
            recommended_movie = max(predicted_ratings, key=predicted_ratings.get)
            print(f"Recommended Movie for User {user_id} :--> {recommended_movie} with predicted rating {predicted_ratings[recommended_movie]:.2f}")
        else:
            print(f"Could not predict any unwatched movies for User {user_id}.")

Enter your User ID (1-30): 1
Recommended Movie for User 1 :--> Movie2 with predicted rating 3.23
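
The cell above passes the raw input straight to `int(...)`, which raises `ValueError` when the input is not a number. A minimal sketch of a more defensive parser follows; the helper name `parse_user_id` and its `low`/`high` parameters are hypothetical and only illustrate one way to combine the conversion with the range check.

```python
from typing import Optional


def parse_user_id(text: str, low: int = 1, high: int = 30) -> Optional[int]:
    """Return the ID as an int, or None if `text` is not an integer in [low, high]."""
    try:
        user_id = int(text.strip())
    except ValueError:
        # Non-numeric or empty input
        return None
    return user_id if low <= user_id <= high else None


# Try a few representative inputs, valid and invalid.
for raw in ["1", " 18 ", "31", "abc", ""]:
    print(repr(raw), "->", parse_user_id(raw))
```

A caller could then print the existing "Invalid User ID" message whenever the helper returns `None`, covering both out-of-range and non-numeric input with one branch.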
