# ML Class Project
   ## Movie Recommendation System Evaluation Using KNN and SVD
        - KNN: We start with a matrix where rows represent movies and columns represent users. Each cell contains the rating a user gave to a movie, and missing ratings are treated as zeros. KNN computes the cosine similarity directly between rows (movies) of this matrix.

        - SVD: Decomposes the same matrix into three matrices: A=U⋅Σ⋅V^T. Only the k largest singular values are retained, which reduces dimensionality, filters noise, and keeps the most significant patterns in the data. Cosine similarity is then computed between the latent vectors of the movies in the reduced space.
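
The two approaches described above can be compared on a tiny example. This is only a sketch: the 3x4 rating matrix, the helper `cosine_sim_rows`, and the choice `k = 2` are made up for illustration and are not part of the project code.

```python
import numpy as np

# Toy movie-by-user rating matrix (rows = movies, columns = users); 0 = unrated.
# The values are invented for illustration only.
A = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 0.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
])

def cosine_sim_rows(M):
    """Cosine similarity between every pair of rows of M (hypothetical helper)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    unit = M / norms
    return unit @ unit.T

# KNN-style: similarity computed directly on the raw rating rows.
raw_sim = cosine_sim_rows(A)

# SVD-style: keep the k largest singular values, then compare latent vectors.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)  # s is sorted in descending order
latent = U[:, :k] * s[:k]                          # movie embeddings U_k Σ_k
latent_sim = cosine_sim_rows(latent)

print(np.round(raw_sim, 2))
print(np.round(latent_sim, 2))
```

In both matrices, movies 0 and 1 (rated similarly by the same users) score much closer to each other than either does to movie 2.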
        
        

## Setup and Data Pre-Processing
    Here we load all necessary libraries:
        - pandas
        - NumPy
        - SciPy
        - scikit-learn
    A .bat file is included that installs all required libraries (pip must already be installed to run it).
    We also read the movies and ratings *.csv* files and load them as pandas DataFrames.
    The ratings are then pivoted into a movie-by-user matrix and converted to a sparse matrix.

In [1]:
from sklearn.neighbors import NearestNeighbors
from sklearn.metrics.pairwise import cosine_similarity
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds
import pandas as pd
import numpy as np

# Load datasets and make them DataFrames
movies = pd.read_csv('movies.csv')
ratings = pd.read_csv('ratings.csv')

# Create a pivot table: rows = movies, columns = users, values = ratings
movie_user_matrix = ratings.pivot_table(index='movieId', columns='userId', values='rating').fillna(0)
csr_data = csr_matrix(movie_user_matrix.values)  # Convert to a compressed sparse row (CSR) matrix
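
To see what the pivot and sparse conversion produce, here is a small sketch on made-up data. The `toy_ratings` frame is hypothetical and only mimics the `userId`, `movieId`, and `rating` columns used above.

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Hypothetical toy ratings in the same long format as ratings.csv
toy_ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 3],
    "movieId": [10, 20, 10, 30],
    "rating":  [4.0, 5.0, 3.0, 2.0],
})

# Same pivot as above: rows = movies, columns = users, missing ratings become 0
toy_matrix = toy_ratings.pivot_table(
    index="movieId", columns="userId", values="rating"
).fillna(0)
toy_csr = csr_matrix(toy_matrix.values)

# Fraction of cells that hold an actual rating
density = toy_csr.nnz / (toy_csr.shape[0] * toy_csr.shape[1])
print(toy_matrix)
print(f"shape={toy_csr.shape}, stored ratings={toy_csr.nnz}, density={density:.2f}")
```

Only the non-zero ratings are stored in the CSR matrix, which is why the sparse format pays off when most user-movie pairs are unrated.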

## Searching the Database for Queried Movie
The main program uses this function to find the movie the user is referring to. The search is case-insensitive and returns every movie whose title contains the search query.
The user then enters a number to select the intended movie from the results. The function returns the selected movie's row index in the movie-user matrix along with its title.

In [2]:
# Function to search and select movie
def find_movie_index_by_name(movie_title):
    # Find matching movies containing the input string (case-insensitive)
    matching_movies = movies[movies['title'].str.contains(movie_title, case=False, regex=False)]
    if matching_movies.empty:
        print("No movies found with that name. Please try again.")
        return None, None
    
    # Display matching movie options
    print("\nMatching Movies:")
    for i, (_, row) in enumerate(matching_movies.iterrows(), start=1):
        print(f"{i}. {row['title']}")
    
    # Choose from findings
    while True:
        try:
            choice = int(input("\nEnter the number of the movie you want: ")) - 1
            if choice < 0 or choice >= len(matching_movies):
                print("Invalid choice. Please select a valid number.")
                continue
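            selected_movie = matching_movies.iloc[choice]
            movie_id = selected_movie['movieId']
            if movie_id not in movie_user_matrix.index:
                # A movie with no ratings has no row in the pivot table
                print("That movie has no ratings, so no recommendations are available.")
                return None, None
            movie_idx = movie_user_matrix.index.get_loc(movie_id)
            return movie_idx, selected_movie['title']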
        except ValueError:
            print("Invalid input. Please enter a number.")
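
The matching step can also be tried without the interactive prompt. The sketch below is an assumption-laden stand-in: `search_titles` and `toy_movies` are hypothetical names, and only the `str.contains` call mirrors the function above.

```python
import pandas as pd

def search_titles(movies_df, query):
    """Return rows whose title contains `query` (case-insensitive, literal match)."""
    mask = movies_df["title"].str.contains(query, case=False, regex=False)
    return movies_df[mask].reset_index(drop=True)

# Hypothetical stand-in for movies.csv
toy_movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Matrix, The (1999)", "Animatrix, The (2003)", "Dr. No (1962)"],
})

hits = search_titles(toy_movies, "matrix")
print(hits["title"].tolist())
```

Because `regex=False` is set, characters such as `.` or `(` in the query are matched literally, so a search for "Dr." does not behave like a regular expression.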

## Recommendation Algorithms
## -- Recommendation Algorithm with KNN

In [None]:
# Train the KNN model
knn = NearestNeighbors(metric='cosine', algorithm='brute', n_neighbors=15)
knn.fit(csr_data)
# Recommend similar movies
def recommend_movies(movie_idx, n_recommendations=5):
    # Get k-neighbors distances and indices
    distances, indices = knn.kneighbors(csr_data[movie_idx].reshape(1, -1), 
                                        n_neighbors=n_recommendations+1)
    
    # Print
    print("\nRecommended Movies KNN:")  
    for i in range(1, len(indices.flatten())):  # Skip itself
        movie_id = movie_user_matrix.index[indices.flatten()[i]]
        movie_title = movies[movies['movieId'] == movie_id]['title'].values[0]
        print(f"{i}. {movie_title} (Similarity: {1 - distances.flatten()[i]:.2f})")
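
The similarity printed above is `1 - distance` because `metric='cosine'` makes `NearestNeighbors` return cosine distances. A small check on made-up data (the `toy` matrix and `toy_knn` model are hypothetical) might look like this:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors
from sklearn.metrics.pairwise import cosine_similarity

# Toy movie-by-user matrix, invented for illustration
toy = csr_matrix(np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 0.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
]))

toy_knn = NearestNeighbors(metric="cosine", algorithm="brute")
toy_knn.fit(toy)

# Ask for 3 neighbors of movie 0; here the first hit is the movie itself
dist, idx = toy_knn.kneighbors(toy[0], n_neighbors=3)
sims = 1 - dist.flatten()

# Cross-check against cosine similarity computed directly
expected = cosine_similarity(toy[0], toy).flatten()[idx.flatten()]
print(idx.flatten(), np.round(sims, 2))
```

Skipping the first neighbor assumes the query movie is returned first. If two movies had identical rating vectors, that ordering would not be guaranteed, so filtering by index would be the safer choice.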

## -- Recommendation Algorithm with SVD

In [None]:
# Set SVD Model
U, sigma, Vt = svds(csr_data, k=50)  # k is the number of latent factors
sigma = np.diag(sigma)  # Convert sigma into a diagonal matrix

# Compute the movie embeddings in latent space
movie_embeddings = np.dot(U, sigma)  # Project movies into the reduced space

# Compute cosine similarity between movies
cosine_sim = cosine_similarity(movie_embeddings)
def recommend_movies_svd(movie_idx,n_recommendations=5):
    # Get Cosine Similarity and sort
    sim_scores = cosine_sim[movie_idx]
    similar_movies = sorted(enumerate(sim_scores), key=lambda x: x[1], reverse=True)
    
    # Print
    print("\nRecommended Movies SVD:")
    count = 0
    for i, score in similar_movies:
        if i == movie_idx:  # Skip itself
            continue
        movie_id = movie_user_matrix.index[i]
        movie_title = movies[movies['movieId'] == movie_id]['title'].values[0]
        print(f"{count + 1}. {movie_title} (Similarity: {score:.2f})")
        count += 1
        if count >= n_recommendations:
            break
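
The cell above builds the full movie-by-movie similarity matrix up front. One possible alternative, sketched here with a hypothetical `top_similar` helper and made-up embeddings, is to compare only the queried row against all others and pick the top entries with `np.argsort`:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def top_similar(embeddings, query_idx, n=5):
    """Return (index, similarity) pairs for the n rows most similar to query_idx."""
    # One row against all rows: shape (n_items,) instead of (n_items, n_items)
    scores = cosine_similarity(embeddings[query_idx:query_idx + 1], embeddings).ravel()
    scores[query_idx] = -np.inf          # exclude the query itself
    top = np.argsort(scores)[::-1][:n]   # indices of the largest scores, descending
    return [(int(i), float(scores[i])) for i in top]

# Hypothetical 2-D embeddings, invented for illustration
toy_embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
])
result = top_similar(toy_embeddings, 0, n=2)
print(result)
```

This trades a little extra work per query for much lower memory use, which may matter when the number of movies is large.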

# Main
This runs all necessary steps. It first asks for a movie name or the exit command. If a movie is found, it uses the information returned by the search function to run the KNN and SVD algorithms and print their recommendations.
- The notebook has occasionally been unstable on my computer. If serious problems arise while running it, use the included *.py* file, which should run more reliably.

In [5]:
# Main (make it its own function to prevent slip ups)
def main():
    print("Welcome to Angel's ML Movie Recommendation!")
    while True:
        user_input = input("\nSearch for a movie by name (type 'exit' to quit): ").strip()
        if user_input.lower() == 'exit':
            print("Goodbye!")
            break
        
        movie_idx, movie_title = find_movie_index_by_name(user_input)
        if movie_idx is not None:
            print(f"\nYou selected: {movie_title}")
            recommend_movies(movie_idx)
            recommend_movies_svd(movie_idx)

# Call main (always bottom of the page)
if __name__ == "__main__":
    main()

Welcome to Angel's ML Movie Recommendation!

Matching Movies:
1. Dr. No (1962)

You selected: Dr. No (1962)

Recommended Movies:
1. Thunderball (1965) (Similarity: 0.77)
2. Live and Let Die (1973) (Similarity: 0.73)
3. Goldfinger (1964) (Similarity: 0.72)
4. On Her Majesty's Secret Service (1969) (Similarity: 0.67)
5. From Russia with Love (1963) (Similarity: 0.67)

Recommended Movies:
1. Thunderball (1965) (Similarity: 0.93)
2. Live and Let Die (1973) (Similarity: 0.90)
3. Goldfinger (1964) (Similarity: 0.87)
4. On Her Majesty's Secret Service (1969) (Similarity: 0.87)
5. From Russia with Love (1963) (Similarity: 0.84)

Matching Movies:
1. Matrix, The (1999)
2. Matrix Reloaded, The (2003)
3. Matrix Revolutions, The (2003)
4. Animatrix, The (2003)

You selected: Matrix, The (1999)

Recommended Movies:
1. Fight Club (1999) (Similarity: 0.71)
2. Star Wars: Episode V - The Empire Strikes Back (1980) (Similarity: 0.70)
3. Saving Private Ryan (1998) (Similarity: 0.68)
4. Star Wars: Episode 