### **Generate Random Ratings Data**

This cell generates a random **user-movie ratings matrix** for testing the movie recommendation system, with one row per user and one column per movie. Ratings range from 1 to 5, and 0 marks an unrated movie. Each entry is unrated with probability 0.4, rated 5 with probability 0.2, and rated 1, 2, 3, or 4 with probability 0.1 each. The matrix is saved as a **CSV file** (`ratings.csv`), which the C++ program reads later for prediction and recommendation.


In [4]:
import pandas as pd
import numpy as np

# Number of users and movies
num_users = 20   # can be changed as needed
num_movies = 15  # can be changed as needed

# Generate a random matrix with ratings (1 to 5) and 0 for unrated movies
np.random.seed(42)
ratings = np.random.choice([0, 1, 2, 3, 4, 5], size=(num_users, num_movies), p=[0.4, 0.1, 0.1, 0.1, 0.1, 0.2])

df = pd.DataFrame(ratings, columns=[f"Movie{i+1}" for i in range(num_movies)])

df.to_csv("ratings.csv", index=False)
print("Generated 'ratings.csv' with random user-movie ratings!")

Generated 'ratings.csv' with random user-movie ratings!
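
The resulting file has a header row (`Movie1`, `Movie2`, ...) followed by one row per user, so any reader has to skip the first line and decide what an empty field means. The sketch below shows that parsing logic in Python on a tiny hand-written sample in the same layout. The function name `load_ratings_matrix` and the sample data are illustrative assumptions, not output from the cell above.

```python
import csv
import io

def load_ratings_matrix(stream):
    """Parse a ratings CSV: skip the header row, map empty fields to 0."""
    reader = csv.reader(stream)
    next(reader, None)  # skip the "Movie1,Movie2,..." header row
    matrix = []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        matrix.append([int(v) if v.strip() else 0 for v in row])
    return matrix

# Hand-written sample in the same layout as ratings.csv (hypothetical data).
sample = io.StringIO("Movie1,Movie2,Movie3\n5,0,3\n0,,4\n")
matrix = load_ratings_matrix(sample)
print(matrix)  # [[5, 0, 3], [0, 0, 4]]
```

To read the real file, pass `open("ratings.csv", newline="")` instead of the in-memory sample.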



#### **Functions in the C++ Code:**

1. **`loadRatingsMatrix()`**:
   - This function loads the user-movie ratings matrix from a **CSV file** (`ratings.csv`). It skips the header row and stores empty fields as 0, the same value used for unrated movies.

2. **`calculateCosineSimilarity()`**:
   - This function computes the **cosine similarity** between two users based on their movie ratings. It only considers movies that both users have rated, and returns 0 when the two users have no rated movies in common.

3. **`calculateAllSimilarities()`**:
   - This function calculates the cosine similarity between the target user and every other user. The target user's own entry is left at 0.

4. **`predictRatings()`**:
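   - This function predicts a rating for each movie the target user has not rated, as a similarity-weighted average of the ratings from users who rated that movie and have a positive similarity to the target user. Movies the target user has already rated, and movies that no similar user has rated, keep a prediction of 0.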

5. **`recommendTopNMovies()`**:
   - This function recommends the **top N movies** based on predicted ratings, sorted in descending order.

6. **`calculateRMSE()`**:
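   - This function calculates the **Root Mean Squared Error (RMSE)** between the predicted and actual ratings over the movies the target user has rated. Because `predictRatings()` leaves the prediction at 0 for those movies, the value reported measures how far the user's actual ratings are from 0 rather than true prediction accuracy.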

7. **`main()`**:
   - The program prompts the user for input:
     1. The 1-based index of the target user for whom movie recommendations are needed (user 1 is the first data row of the CSV).
     2. The number of top recommended movies to display.
   - It then calculates the predicted ratings, recommends the top N movies, and displays the RMSE for the given user.
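
The similarity and prediction steps described above can be sketched in Python on a toy matrix. This is a minimal illustration of the same logic, not a port of the C++ program: the names `cosine_similarity` and `predict_rating` and the three-user matrix are assumptions made for the example.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity restricted to movies both users rated (non-zero)."""
    pairs = [(a, b) for a, b in zip(u, v) if a != 0 and b != 0]
    if not pairs:
        return 0.0  # no co-rated movies
    dot = sum(a * b for a, b in pairs)
    norm_u = math.sqrt(sum(a * a for a, _ in pairs))
    norm_v = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_u * norm_v)

def predict_rating(matrix, target, movie):
    """Similarity-weighted average of other users' ratings for one movie."""
    numerator = denominator = 0.0
    for user, row in enumerate(matrix):
        if user == target or row[movie] == 0:
            continue  # skip the target and users who did not rate the movie
        sim = cosine_similarity(matrix[target], row)
        if sim > 0:
            numerator += sim * row[movie]
            denominator += sim
    return numerator / denominator if denominator else 0.0

# Hypothetical toy data: user 0 has not rated the third movie.
matrix = [
    [5, 3, 0],
    [4, 2, 4],
    [1, 5, 2],
]
prediction = predict_rating(matrix, target=0, movie=2)
print(round(prediction, 2))
```

User 1 agrees with user 0 more closely than user 2 does, so the prediction lands nearer to user 1's rating of 4 than to user 2's rating of 2. One property of this similarity is worth keeping in mind: because ratings are positive and only co-rated movies count, two users who share exactly one rated movie always get a similarity of 1.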


In [20]:
%%writefile movie_recommender.cpp
#include <iostream>
#include <vector>
#include <string>
#include <utility>
#include <fstream>
#include <sstream>
#include <cmath>
#include <algorithm>

using namespace std;

// Function to load the ratings matrix from a CSV file
vector<vector<int>> loadRatingsMatrix(const string &filename) {
    ifstream file(filename);
    vector<vector<int>> matrix;
    string line;

    bool skipHeader=true;
    while(getline(file, line)){
        if(skipHeader){
            skipHeader=false;
            continue; //Skip the header row
        }
        vector<int>row;
        stringstream ss(line);
        string value;

        while(getline(ss, value, ',')){
            row.push_back(value.empty()?0:stoi(value)); //for empty values
        }
        matrix.push_back(row);
    }
    file.close();
    return matrix;
}

// Function to compute cosine similarity between two users
double calculateCosineSimilarity(const vector<int> &user1, const vector<int> &user2) {
    double dotProduct = 0.0, norm1 = 0.0, norm2 = 0.0;

    for (size_t i = 0; i < user1.size(); ++i) {
        if (user1[i] != 0 && user2[i] != 0) { // Only consider rated movies
            dotProduct += user1[i] * user2[i];
            norm1 += user1[i] * user1[i];
            norm2 += user2[i] * user2[i];
        }
    }

    if (norm1 == 0 || norm2 == 0) return 0; // Avoid division by zero
    return dotProduct / (sqrt(norm1) * sqrt(norm2));
}

// Function to calculate all user-user similarities
vector<double> calculateAllSimilarities(const vector<vector<int>> &matrix, int targetUser) {
    vector<double> similarities(matrix.size(), 0.0);

    for (size_t i = 0; i < matrix.size(); ++i) {
        if (i != targetUser) {
            similarities[i] = calculateCosineSimilarity(matrix[targetUser], matrix[i]);
        }
    }
    return similarities;
}

// Function to predict ratings for a target user
vector<double> predictRatings(const vector<vector<int>> &matrix, int targetUser) {
    int numMovies = matrix[0].size();
    vector<double> predictedRatings(numMovies, 0.0);
    vector<double> similarities = calculateAllSimilarities(matrix, targetUser);

    for (int movie = 0; movie < numMovies; ++movie) {
        if (matrix[targetUser][movie] == 0) { // Predict only for unrated movies
            double numerator = 0.0, denominator = 0.0;

            for (size_t user = 0; user < matrix.size(); ++user) {
                if (matrix[user][movie] != 0 && similarities[user] > 0) {
                    numerator += similarities[user] * matrix[user][movie];
                    denominator += fabs(similarities[user]);
                }
            }

            predictedRatings[movie] = (denominator != 0) ? numerator / denominator : 0;
        }
    }

    return predictedRatings;
}

// Function to recommend top N movies for a user
void recommendTopNMovies(const vector<double> &predictions, int N) {
    vector<pair<double, int>> movies;

    for (int i = 0; i < predictions.size(); ++i) {
        if (predictions[i] > 0) {
            movies.push_back({predictions[i], i + 1}); // Movie index starts at 1
        }
    }

    // Sort movies by predicted ratings in descending order
    sort(movies.rbegin(), movies.rend());

    // Print top N movies
    cout << "Top " << N << " Recommended Movies:\n";
    for (int i = 0; i < N && i < movies.size(); ++i) {
        cout << "Movie " << movies[i].second << " - Predicted Rating: " << movies[i].first << endl;
    }
}

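// Function to calculate RMSE (Root Mean Square Error)
// Note: predictRatings() leaves predictions at 0 for movies the user has already
// rated, so this compares the user's actual ratings against 0.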
double calculateRMSE(const vector<vector<int>> &matrix, const vector<double> &predictions, int targetUser) {
    double error = 0.0;
    int count = 0;

    for (size_t i = 0; i < predictions.size(); ++i) {
        if (matrix[targetUser][i] != 0) { // Compare only for rated movies
            double diff = predictions[i] - matrix[targetUser][i];
            error += diff * diff;
            count++;
        }
    }

    return(count>0)?sqrt(error/count):0.0;
}

int main() {
    string filename="ratings.csv"; // CSV file with user-movie ratings

    // User input for target user and number of recommended movies
    int targetUser, top, temp;

    cout << "Enter the user index for which you want to recommend movies: ";
    cin >> temp;
    targetUser=temp-1;

    cout<<"Enter the number of top recommended movies: ";
    cin>>top;

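    // Load the ratings matrix
    vector<vector<int>> ratingsMatrix=loadRatingsMatrix(filename);

    // Stop early if the file could not be read or the user index is out of range
    if(ratingsMatrix.empty() || targetUser<0 || targetUser>=static_cast<int>(ratingsMatrix.size())){
        cerr<<"Error: could not load '"<<filename<<"' or the user index is out of range.\n";
        return 1;
    }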

    // Predict ratings for the target user
    vector<double> predictedRatings = predictRatings(ratingsMatrix, targetUser);

    // Print predicted ratings
    cout<<"Predicted Ratings for unrated movies for User "<<targetUser+1<<":\n";
    for(int i=0; i<predictedRatings.size(); i++) {
        if(ratingsMatrix[targetUser][i]==0){ // Show only unrated movies
            cout<<"Movie "<<i+1<<": "<<predictedRatings[i]<<endl;
        }
    }

    // Recommend top n movies
    recommendTopNMovies(predictedRatings, top);

    // Calculate and print RMSE
    double rmse=calculateRMSE(ratingsMatrix, predictedRatings, targetUser);
    cout<<"\nPerformance Report:\n";
    cout<<"RMSE for User "<<targetUser+1<<": "<<rmse<<endl;

    return 0;
}


Overwriting movie_recommender.cpp


In [21]:
!g++ movie_recommender.cpp -o movie_recommender
!./movie_recommender

Enter the user index for which you want to recommend movies: 10
Enter the number of top recommended movies: 2
Predicted Ratings for User 10:
Movie 1: 3.8522
Movie 4: 3.16814
Movie 7: 3.31233
Movie 9: 3.43923
Movie 10: 3.34571
Movie 11: 3.82229
Movie 14: 3.63333
Movie 15: 2.98331
Top 2 Recommended Movies:
Movie 1 - Predicted Rating: 3.8522
Movie 11 - Predicted Rating: 3.82229

Performance Report:
RMSE for User 10: 3.4641
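
The reported RMSE of 3.4641 is approximately the square root of 12. That is consistent with predictions being left at 0 for every movie the user has already rated, in which case the metric reduces to the root mean square of the user's own ratings. One common alternative is a leave-one-out evaluation: hide each known rating in turn, predict it from the other users, and compare. The sketch below illustrates both quantities on a toy matrix. All function names and data here are hypothetical, and this is not the evaluation the C++ program performs.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity restricted to movies both users rated (non-zero)."""
    pairs = [(a, b) for a, b in zip(u, v) if a != 0 and b != 0]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_u = math.sqrt(sum(a * a for a, _ in pairs))
    norm_v = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_u * norm_v)

def predict_hidden(matrix, target, movie):
    """Predict matrix[target][movie] with that rating hidden from the target."""
    masked = list(matrix[target])
    masked[movie] = 0  # the rating being evaluated must not influence similarity
    numerator = denominator = 0.0
    for user, row in enumerate(matrix):
        if user == target or row[movie] == 0:
            continue
        sim = cosine_similarity(masked, row)
        if sim > 0:
            numerator += sim * row[movie]
            denominator += sim
    return numerator / denominator if denominator else None

def leave_one_out_rmse(matrix, target):
    """RMSE over the target's rated movies, each predicted while hidden."""
    squared_errors = []
    for movie, actual in enumerate(matrix[target]):
        if actual == 0:
            continue
        predicted = predict_hidden(matrix, target, movie)
        if predicted is not None:  # skip movies no neighbour can predict
            squared_errors.append((predicted - actual) ** 2)
    if not squared_errors:
        return 0.0
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rmse_against_zero(ratings):
    """RMSE obtained when every rated movie is 'predicted' as 0."""
    rated = [r for r in ratings if r != 0]
    return math.sqrt(sum(r * r for r in rated) / len(rated)) if rated else 0.0

# Hypothetical toy data, not the notebook's ratings.csv.
matrix = [
    [5, 3, 4, 0],
    [4, 2, 5, 3],
    [5, 4, 3, 1],
]
zero_rmse = rmse_against_zero(matrix[0])
loo_rmse = leave_one_out_rmse(matrix, target=0)
print(f"RMSE against zero predictions: {zero_rmse:.4f}")
print(f"Leave-one-out RMSE:            {loo_rmse:.4f}")
```

On this toy matrix the leave-one-out error is far smaller than the error against zero predictions, since every held-out prediction lies between the neighbours' ratings for that movie.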
