# **Collaborative Filtering Recommendation System Notebook**
- In this notebook, we will explore and implement a collaborative filtering recommendation system. Collaborative filtering is a popular recommendation technique that relies on user-item interaction data to make personalized recommendations. 

## **1. Introduction**
- **What is Collaborative Filtering?**
    - Collaborative filtering is a recommendation technique that makes automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaboration). It assumes that users who have agreed in the past tend to agree again in the future.

- **How Does it Work?**
    - Collaborative filtering works in the following way:
        - **User-Item Interaction Data**: Collect data on how users interact with items (e.g., ratings, purchase history).

        - **User Similarity or Item Similarity**: Calculate the similarity between users or items based on their interaction patterns. Common similarity metrics include cosine similarity and Pearson correlation.

        - **Recommendation**: To make recommendations, identify users or items that are similar to the target user or item. Recommend items that similar users have liked (user-based) or recommend items that are similar to items the user has liked (item-based).

    - In this notebook, we will use an anime recommendation dataset and implement both user-based and item-based collaborative filtering.
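
To make the two similarity metrics named above concrete, here is a minimal sketch on hypothetical rating vectors. The names `cosine_sim`, `pearson_sim`, `alice`, `bob`, and `carol` are illustrative assumptions and are not part of the dataset or the rest of the notebook.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two rating vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_sim(a, b):
    """Pearson correlation, computed as cosine similarity of mean-centered vectors."""
    return cosine_sim(a - a.mean(), b - b.mean())

# Hypothetical ratings of the same four items by three users
alice = np.array([9.0, 7.0, 8.0, 3.0])
bob = np.array([8.0, 6.0, 7.0, 2.0])     # same preferences as alice, one point lower
carol = np.array([2.0, 9.0, 3.0, 10.0])  # roughly opposite preferences

print("alice vs bob:  ", cosine_sim(alice, bob), pearson_sim(alice, bob))
print("alice vs carol:", cosine_sim(alice, carol), pearson_sim(alice, carol))
```

Because Pearson correlation centers each user's ratings first, it ignores a constant offset between two users and can go negative for opposite tastes, whereas cosine similarity on non-negative ratings stays between 0 and 1.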

-----------------------------------

## **2. Data Preparation**
**Dataset**
   - We will use two datasets: one describing the anime titles (`anime.csv`) and one containing user ratings of those titles (`rating.csv`).

In [None]:
# Import needed modules
import pandas as pd
import numpy as np
import scipy as sp
from sklearn.metrics.pairwise import cosine_similarity
import operator

In [None]:
# Load the anime dataset
anime_df = pd.read_csv('/kaggle/input/anime-recommendations-database/anime.csv')
anime_df.head()

In [None]:
# Load the rating dataset
rating_df = pd.read_csv('/kaggle/input/anime-recommendations-database/rating.csv')
rating_df.head()

### **Data Preprocessing**
- Before building the recommendation system, we need to preprocess the data. This includes handling missing ratings and dropping duplicate entries.

In [None]:
# To keep computation manageable, keep only ratings from users with user_id <= 10000
# (.copy() avoids modifying a view of the original dataframe in later cells)
rating_df = rating_df[rating_df.user_id <= 10000].copy()

In [None]:
# Preprocess the rating dataset (treat -1 ratings as missing values)
# np.nan keeps the column numeric; pd.NA would turn it into an object column
rating_df['rating'] = rating_df['rating'].replace(-1, np.nan)

In [None]:
# Drop duplicate (user_id, anime_id) pairs, keeping the first occurrence
rating_df = rating_df.drop_duplicates(['user_id', 'anime_id'])

In [None]:
# Create a user-item interaction matrix
user_item_matrix = rating_df.pivot(index='user_id', columns='anime_id', values='rating')
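
The preprocessing steps above can be traced on a tiny example. This is only a sketch: `toy_ratings` and its values are made up for illustration, and `sparsity` is an extra diagnostic that the notebook itself does not compute.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings; -1 is treated as a missing rating, as in the cells above
toy_ratings = pd.DataFrame({
    'user_id':  [1, 1, 2, 2, 2, 3],
    'anime_id': [10, 20, 10, 10, 30, 20],
    'rating':   [8, -1, 7, 7, 9, 6],
})

# Same three steps as the notebook: mark missing, de-duplicate, pivot
toy_ratings['rating'] = toy_ratings['rating'].replace(-1, np.nan)
toy_ratings = toy_ratings.drop_duplicates(['user_id', 'anime_id'])
toy_matrix = toy_ratings.pivot(index='user_id', columns='anime_id', values='rating')
print(toy_matrix)

# Fraction of user-item cells that hold no rating
sparsity = toy_matrix.isna().to_numpy().mean()
print(f"sparsity: {sparsity:.2f}")
```

Rows are users, columns are anime, and every pair without a rating becomes `NaN`. On real rating data most cells are usually empty, which is why the later cells call `fillna(0)` before computing similarities.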

## **3. User-Based Collaborative Filtering**
**User Similarity**
- To implement user-based collaborative filtering, we need to calculate the similarity between users based on their ratings. We can use similarity metrics such as cosine similarity or Pearson correlation.

In [None]:
# Calculate user similarity using cosine similarity
user_similarity = cosine_similarity(user_item_matrix.fillna(0))
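
Filling missing ratings with 0 treats "not rated" as the lowest possible score. As a hedged alternative, the Pearson correlation mentioned above can be computed over co-rated anime only with pandas' `DataFrame.corr`, which excludes missing values pairwise. The sketch below uses a hypothetical `toy_matrix`; the `min_periods=2` threshold is an arbitrary assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical user-item matrix: rows are users, columns are anime ids
toy_matrix = pd.DataFrame(
    {10: [9.0, 8.0, 2.0], 20: [7.0, 6.0, 9.0], 30: [8.0, np.nan, 3.0], 40: [3.0, 2.0, np.nan]},
    index=pd.Index([1, 2, 3], name='user_id'),
)

# corr() correlates columns, so transpose to put users in the columns.
# min_periods=2 requires at least two co-rated anime per user pair.
user_pearson = toy_matrix.T.corr(method='pearson', min_periods=2)
print(user_pearson)
```

The result is a DataFrame labeled by `user_id` on both axes, so `user_pearson.loc[1, 2]` looks up a pair directly without converting ids to row positions.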

**Making Recommendations**
- To make recommendations for a target user, we identify the users most similar to the target user and recommend anime that they have rated positively (6 or higher here) and that the target user has not rated yet.

In [None]:
def user_based_recommendations(user_id, user_item_matrix, user_similarity, n=5):
    # user_similarity is indexed by row position, so translate the user_id label
    target_pos = user_item_matrix.index.get_loc(user_id)
    user_scores = user_similarity[target_pos]

    # Sort users (identified by row position) by similarity in descending order
    similar_users = sorted(enumerate(user_scores), key=lambda x: x[1], reverse=True)

    # Anime the target user has already rated
    target_user_rated_anime = set(user_item_matrix.loc[user_id].dropna().index)

    # Initialize a list to store recommended anime
    recommended_anime = []

    for pos, score in similar_users:
        if pos == target_pos:  # Exclude the target user
            continue

        # Get the anime the similar user has rated positively (6 or higher)
        rated_anime = user_item_matrix.iloc[pos]
        positively_rated_anime = rated_anime[rated_anime >= 6].index

        # Add unseen anime in a stable order, skipping duplicates
        for anime_id in positively_rated_anime:
            if anime_id not in target_user_rated_anime and anime_id not in recommended_anime:
                recommended_anime.append(anime_id)

        # Limit the number of recommendations
        if len(recommended_anime) >= n:
            break

    return recommended_anime[:n]
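
The function above walks through similar users one at a time. A common variant, sketched here under stated assumptions, ranks unseen anime by a similarity-weighted sum of the ratings given by the `k` nearest users. The helper `weighted_user_scores`, the toy data, and the choice `k=2` are all hypothetical and not part of the notebook.

```python
import numpy as np
import pandas as pd

def weighted_user_scores(user_id, matrix, similarity, k=2):
    """Rank anime the user has not rated by a similarity-weighted sum of neighbor ratings."""
    sims = similarity.loc[user_id].drop(user_id)  # similarity to every other user
    neighbors = sims.nlargest(k)                  # the k most similar users
    # Multiply each neighbor's ratings by that neighbor's similarity, then add up;
    # missing ratings are skipped by sum() and so contribute nothing
    scores = matrix.loc[neighbors.index].mul(neighbors, axis=0).sum(axis=0)
    unseen = matrix.loc[user_id].isna()
    return scores[unseen].sort_values(ascending=False)

# Hypothetical user-item matrix: rows are users, columns are anime ids
matrix = pd.DataFrame(
    {10: [9.0, 8.0, 2.0], 20: [8.0, 9.0, np.nan], 30: [np.nan, 7.0, 3.0], 40: [np.nan, np.nan, 9.0]},
    index=pd.Index([1, 2, 3], name='user_id'),
)

# Cosine similarity between users, kept in a DataFrame labeled by user_id
filled = matrix.fillna(0).to_numpy()
unit = filled / np.linalg.norm(filled, axis=1, keepdims=True)
similarity = pd.DataFrame(unit @ unit.T, index=matrix.index, columns=matrix.index)

ranked = weighted_user_scores(1, matrix, similarity)
print(ranked)
```

In this toy data, user 2 is much closer to user 1 than user 3 is, so anime 30 (rated by both neighbors) ranks above anime 40 (rated highly, but only by the distant user 3).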


## **4. Item-Based Collaborative Filtering**
**Item Similarity**
- To implement item-based collaborative filtering, we calculate the similarity between anime based on user ratings.

In [None]:
# Calculate item similarity using cosine similarity
item_similarity = cosine_similarity(user_item_matrix.fillna(0).T)
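
`cosine_similarity` returns a plain NumPy array, so its rows and columns are positions, not `anime_id` values. One way to avoid mixing the two up is to wrap the result in a DataFrame labeled by the matrix columns. The sketch below does this on a hypothetical `toy_matrix`; the anime ids are made up.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical user-item matrix with non-contiguous anime ids as columns
toy_matrix = pd.DataFrame(
    {101: [9.0, 8.0, np.nan], 205: [8.0, 9.0, 2.0], 999: [np.nan, 1.0, 9.0]},
    index=pd.Index([1, 2, 3], name='user_id'),
)

# Label both axes of the similarity matrix with the anime ids
item_sim = pd.DataFrame(
    cosine_similarity(toy_matrix.fillna(0).T),
    index=toy_matrix.columns,
    columns=toy_matrix.columns,
)

# Most similar anime to id 101, excluding itself
nearest = item_sim.loc[101].drop(101).sort_values(ascending=False)
print(nearest)
```

With labels in place, lookups such as `item_sim.loc[101]` work for any id, whereas positional indexing like `array[101 - 1]` would fail or silently return the wrong row when ids are not contiguous.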

**Making Recommendations**
- To make item-based recommendations, we identify anime similar to those the user has already rated positively.

In [None]:
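def item_based_recommendations(user_id, user_item_matrix, item_similarity, n=5):
    user_ratings = user_item_matrix.loc[user_id]

    # Get the anime the user has rated positively (6 or higher)
    positively_rated_anime = user_ratings[user_ratings >= 6].index

    # Anime the user has already rated
    user_rated_anime = set(user_ratings.dropna().index)

    # item_similarity is indexed by column position, so keep the anime_id labels at hand
    anime_ids = user_item_matrix.columns

    # Initialize a list to store recommended anime
    recommended_anime = []

    for anime_id in positively_rated_anime:
        # Get the similarity scores for the liked anime
        liked_pos = anime_ids.get_loc(anime_id)
        similar_anime = item_similarity[liked_pos]

        # Sort similar anime (identified by column position) by similarity in descending order
        similar_anime_sorted = sorted(enumerate(similar_anime), key=lambda x: x[1], reverse=True)

        for pos, score in similar_anime_sorted:
            candidate = anime_ids[pos]  # Translate the position back to an anime_id
            # Exclude anime the user has already rated (including the liked anime) and duplicates
            if candidate not in user_rated_anime and candidate not in recommended_anime:
                recommended_anime.append(candidate)

            # Limit the number of recommendations
            if len(recommended_anime) >= n:
                break

        # Stop scanning further liked anime once enough recommendations are collected
        if len(recommended_anime) >= n:
            break

    return recommended_anime[:n]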
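
The function above fills the list from the first liked anime onward. As an alternative sketch, each unseen anime can instead be scored by its total similarity to everything the user liked, so that all liked anime contribute to the ranking. The helper `item_scores`, the parameter `like_threshold`, and the toy data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

def item_scores(user_id, matrix, item_sim, like_threshold=6):
    """Rank unseen anime by their summed similarity to the anime the user liked."""
    user_ratings = matrix.loc[user_id]
    liked = user_ratings[user_ratings >= like_threshold].index
    unseen = user_ratings[user_ratings.isna()].index
    # Rows: liked anime, columns: unseen anime; sum each column over the liked rows
    return item_sim.loc[liked, unseen].sum(axis=0).sort_values(ascending=False)

# Hypothetical user-item matrix: rows are users 1 to 4, columns are anime ids
matrix = pd.DataFrame(
    {10: [9.0, 8.0, 9.0, np.nan], 20: [np.nan, 9.0, 8.0, np.nan],
     30: [np.nan, np.nan, 2.0, 9.0], 40: [2.0, np.nan, np.nan, 8.0]},
    index=pd.Index([1, 2, 3, 4], name='user_id'),
)

# Cosine similarity between anime (columns), labeled by anime id
filled = matrix.fillna(0).to_numpy()
unit = filled / np.linalg.norm(filled, axis=0, keepdims=True)
item_sim = pd.DataFrame(unit.T @ unit, index=matrix.columns, columns=matrix.columns)

ranked = item_scores(1, matrix, item_sim)
print(ranked)
```

User 1 liked only anime 10, and anime 20 was rated highly by the same users who liked anime 10, so it ranks above anime 30.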

## **5. Test Recommendation System**

In [None]:
# Choose a user ID for testing
user_id = 9  # Replace with the user ID you want to test

# Test user-based recommendations
user_recommendations = user_based_recommendations(user_id, user_item_matrix, user_similarity, n=5)

# Test item-based recommendations
item_recommendations = item_based_recommendations(user_id, user_item_matrix, item_similarity, n=5)


In [None]:
# Display the recommendations
print(f"User-Based Recommendations for User {user_id}:")
for anime_id in user_recommendations:
    anime_name = anime_df.loc[anime_df['anime_id'] == anime_id, 'name'].values[0]
    print(f"- {anime_name}")

print(f"\nItem-Based Recommendations for User {user_id}:")
for anime_id in item_recommendations:
    anime_name = anime_df.loc[anime_df['anime_id'] == anime_id, 'name'].values[0]
    print(f"- {anime_name}")
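
The lookup with `.values[0]` raises an `IndexError` if a recommended id has no row in `anime_df`. A more defensive lookup, sketched here on hypothetical data (`toy_anime`, `recommended_ids`, and the titles are made up), builds an id-to-name Series once and uses `reindex`, which returns `NaN` for unknown ids.

```python
import pandas as pd

# Hypothetical metadata table with the same column names as anime_df
toy_anime = pd.DataFrame({
    'anime_id': [10, 20, 30],
    'name': ['Title A', 'Title B', 'Title C'],
})
recommended_ids = [30, 10, 999]  # 999 has no metadata row

# Map anime_id -> name once, then look up all ids in recommendation order
names = toy_anime.set_index('anime_id')['name']
labels = [
    title if pd.notna(title) else f"(unknown anime_id {anime_id})"
    for anime_id, title in names.reindex(recommended_ids).items()
]

for label in labels:
    print(f"- {label}")
```

This assumes `anime_id` is unique in the metadata table, which `reindex` requires.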