<div style="text-align: center; font-size: 46px; color: blue;">
    <u><b>12. Recommendation System</b></u>
</div>

<div style="text-align: center; font-size: 30px; color: Violet;">
    <u><b>Recommendation System for Anime</b></u>
</div>

## Objective:

The objective of this program is to build a content-based recommendation system for anime using cosine similarity. The system compares anime on their genre, number of episodes, and average rating, and recommends the titles whose feature vectors are most similar. The goal is to help users discover anime that resemble the ones they already enjoy.
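
Cosine similarity compares the direction of two feature vectors: $\cos(\theta) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$. The snippet below is a minimal sketch of that formula; the three-dimensional vectors and the `cosine` helper are made up for illustration and are not taken from the dataset.

```python
import numpy as np

def cosine(a, b):
    """Cosine of the angle between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical feature vectors: [action, comedy, scaled_rating]
anime_a = [1.0, 0.0, 0.8]
anime_b = [1.0, 0.0, 0.6]   # same genre as anime_a, slightly lower rating
anime_c = [0.0, 1.0, 0.8]   # different genre, same rating

sim_ab = cosine(anime_a, anime_b)
sim_ac = cosine(anime_a, anime_c)
print("a vs b:", round(sim_ab, 3))
print("a vs c:", round(sim_ac, 3))
```

Two anime that share a genre score much closer to 1 than two anime that only share a rating, which is the behavior the recommender relies on.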

In [3]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score


In [7]:
# 1. Load and preprocess data
df = pd.read_csv("anime.csv")

# Handle missing genres and ratings
df = df.fillna({"genre": "", "rating": df["rating"].mean()})

# Convert episodes to numeric ("Unknown" becomes NaN), then fill with the median.
# Assigning the result back avoids the chained inplace fillna that pandas warns about.
df["episodes"] = pd.to_numeric(df["episodes"], errors="coerce")
df["episodes"] = df["episodes"].fillna(df["episodes"].median())

# Keep the necessary columns
df = df[["anime_id", "name", "genre", "type", "episodes", "rating"]]


In [15]:
#2.Feature extraction
#TF-IDF on genre
tfidf=TfidfVectorizer(stop_words="english")
tfidf_matrix=tfidf.fit_transform(df["genre"])

In [16]:
#Normalize Numeric Feature
scaler=MinMaxScaler()
numeric_features=scaler.fit_transform(df[["episodes","rating"]])

In [17]:
#combine features
final_features=np.hstack([tfidf_matrix.toarray(),numeric_features])


In [18]:
#3.Cosine Similarity
cosine_sim=cosine_similarity(final_features,final_features)

In [21]:
# 4. Recommendation function
def recommend_anime(title, df, cosine_sim, top_n=10, threshold=0.3):
    if title not in df["name"].values:
        print("Anime not found in dataset")
        return None

    # Positional row of the first match; cosine_sim is indexed by position
    idx = np.flatnonzero(df["name"].values == title)[0]
    scores = sorted(enumerate(cosine_sim[idx]), key=lambda x: x[1], reverse=True)
    # Drop the query anime itself and anything below the similarity threshold
    filtered = [(i, s) for i, s in scores if i != idx and s >= threshold]

    top_indices = [i for i, s in filtered[:top_n]]

    print(f"\nTop {top_n} recommendations for: {title}\n")
    return df.iloc[top_indices][["name", "genre", "rating"]]
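
The notebook never calls the recommendation function, so the following is a minimal end-to-end sketch of the same idea on a toy table. The four rows, the `toy` DataFrame, and the `most_similar` helper are hypothetical stand-ins for `anime.csv` and `recommend_anime`; only the scikit-learn calls match the ones used above.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data standing in for anime.csv
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "genre": ["Action, Adventure", "Action, Adventure",
              "Comedy, Romance", "Comedy, School"],
    "episodes": [12, 24, 12, 13],
    "rating": [8.0, 7.5, 6.5, 7.0],
})

# Same feature recipe as the notebook: TF-IDF genres + scaled numeric columns
genre_tfidf = TfidfVectorizer().fit_transform(toy["genre"]).toarray()
numeric = MinMaxScaler().fit_transform(toy[["episodes", "rating"]])
features = np.hstack([genre_tfidf, numeric])
sim = cosine_similarity(features)

def most_similar(title, top_n=2):
    """Names of the top_n rows most similar to `title`, excluding itself."""
    pos = int(np.flatnonzero((toy["name"] == title).to_numpy())[0])
    order = [i for i in np.argsort(-sim[pos]) if i != pos][:top_n]
    return toy.iloc[order]["name"].tolist()

result = most_similar("A")
print(result)
```

Here "B" shares both genres with "A", so it ranks first even though its episode count differs.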
    
                 
    
        

In [24]:
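# 5. Evaluation

# An anime counts as "relevant" if its rating is at least 7.5
df["relevant"] = (df["rating"] >= 7.5).astype(int)

train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

y_true = test_df["relevant"]
# Note: y_pred applies the same rating >= 7.5 rule that defines y_true, so the
# scores below are 1.0 by construction and do not measure recommendation quality.
y_pred = [1 if r >= 7.5 else 0 for r in test_df["rating"]]

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

print("\n==============EVALUATION METRICS====================")
print("\nPrecision:", precision)
print("\nRecall:", recall)
print("\nf1:", f1)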





Precision: 1.0

Recall: 1.0

f1: 1.0
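
The perfect scores above follow from comparing the rating threshold against itself, so they say nothing about the recommendations. One common alternative is to score each recommendation list with precision@k and recall@k. The sketch below assumes a hypothetical definition of relevance (a set of anime ids judged relevant for one query) and made-up ids; it is not the evaluation used in this notebook.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the first k recommended items that are relevant."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the first k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)

# Hypothetical anime ids for a single query
recommended = [101, 205, 330, 412, 518]   # ranked output of the recommender
relevant = {205, 412, 999}                # items judged relevant for this query

p = precision_at_k(recommended, relevant, k=5)
r = recall_at_k(recommended, relevant, k=5)
print("precision@5:", p)
print("recall@5:", r)
```

Averaging these values over many query titles gives a score that actually depends on what `recommend_anime` returns.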


# INTERVIEW QUESTIONS 

## 1. Can you explain the difference between user-based and item-based collaborative filtering?

User-Based Collaborative Filtering:

- Focuses on finding users with similar preferences
- Calculates similarity between users
- Recommends items liked by similar users
- Example: "Users like you also liked these movies"
- Computing user-to-user similarity becomes more expensive as the number of users grows
- User behavior can change, so recommendations are less stable
- Suitable for small or medium user bases

Item-Based Collaborative Filtering:

- Focuses on finding items that are rated similarly by the same users
- Calculates similarity between items
- Recommends items similar to those the user liked
- Example: "People who liked this movie also liked these"
- More efficient for large datasets, especially when there are far more users than items
- Item relationships change slowly, so recommendations are more stable
- Used by platforms like Amazon and Netflix
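
The two approaches differ only in which axis of the rating matrix is compared. A minimal sketch, assuming a made-up 4x4 rating matrix in which 0 means "not rated" (treating missing ratings as zeros is a simplification) and a hypothetical `cosine_rows` helper:

```python
import numpy as np

# Hypothetical ratings: rows = users, columns = items, 0 = not rated
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine_rows(M):
    """Pairwise cosine similarity between the rows of M."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for empty rows
    unit = M / norms
    return unit @ unit.T

user_sim = cosine_rows(R)      # user-based: compare rows (users)
item_sim = cosine_rows(R.T)    # item-based: compare columns (items)

print("user 0 vs user 1:", round(user_sim[0, 1], 3))
print("user 0 vs user 2:", round(user_sim[0, 2], 3))
print("item 0 vs item 1:", round(item_sim[0, 1], 3))
print("item 0 vs item 2:", round(item_sim[0, 2], 3))
```

Users 0 and 1 rate the same items highly, so they come out as neighbors; items 0 and 1 are liked by the same users, so they come out as similar items.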

## 2. What is collaborative filtering, and how does it work?

Collaborative filtering (CF) is a recommendation technique that:

- Suggests items based on user behavior and preferences
- Does not require item features or content
- Makes recommendations using patterns from many users

How it works:

- Collects user–item interaction data (ratings, likes, clicks, purchases)
- Identifies similar users or similar items
- Uses similarity measures such as cosine similarity and Pearson correlation
- Predicts how much a user may like an item
- Recommends the top-rated or most relevant items
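
The prediction step can be sketched as a similarity-weighted average. The example below is one simple item-based variant, assuming a made-up rating matrix (0 means "not rated") and a hypothetical `predict_rating` helper; production systems typically add mean-centering and neighbor selection.

```python
import numpy as np

# Hypothetical ratings: rows = users, columns = items, 0 = not rated
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def predict_rating(R, user, item):
    """Item-based CF: average the user's own ratings, weighted by how
    similar each rated item is to the target item (cosine similarity)."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0
    sims = (R.T @ R[:, item]) / (norms * norms[item])
    rated = R[user] > 0          # items this user has already rated
    rated[item] = False          # never use the target item itself
    weights = sims[rated]
    if weights.sum() == 0:
        return 0.0               # no usable neighbors
    return float(weights @ R[user, rated] / weights.sum())

pred = predict_rating(R, user=0, item=2)
print("predicted rating for user 0, item 2:", round(pred, 2))
```

User 0 gave a low rating to item 3, which is the item most similar to item 2, so the predicted rating for item 2 is pulled toward the low end of the scale.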
