## Data Description:
    anime_id: Unique ID of each anime.
    name: Anime title.
    type: Anime broadcast type, such as TV, OVA, etc.
    genre: Anime genre(s), as a comma-separated list.
    episodes: The number of episodes of each anime.
    rating: The average rating given to each anime by the users who rated it.
    members: Number of community members for each anime.
## Objective:
    The objective of this assignment is to implement a recommendation system using cosine similarity on an anime dataset. 
## Dataset:
    Use the Anime Dataset, which contains information about various anime, including their titles, genres, number of episodes, and user ratings.

In [1]:
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Tasks:
## Data Preprocessing:
    Load the dataset into a suitable data structure (e.g., pandas DataFrame).
    Handle missing values, if any.
    Explore the dataset to understand its structure and attributes.

In [2]:
df=pd.read_csv('anime.csv')

In [3]:
df.head()

Unnamed: 0,anime_id,name,genre,type,episodes,rating,members
0,32281,Kimi no Na wa.,"Drama, Romance, School, Supernatural",Movie,1,9.37,200630
1,5114,Fullmetal Alchemist: Brotherhood,"Action, Adventure, Drama, Fantasy, Magic, Mili...",TV,64,9.26,793665
2,28977,Gintama°,"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,9.25,114262
3,9253,Steins;Gate,"Sci-Fi, Thriller",TV,24,9.17,673572
4,9969,Gintama&#039;,"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,9.16,151266


In [4]:
df.isnull().sum()

anime_id      0
name          0
genre        62
type         25
episodes      0
rating      230
members       0
dtype: int64

In [5]:
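# Fill missing ratings with the mean rating and missing types with the most frequent type.
# The genre column is not filled here.
df.fillna({'rating':df['rating'].mean(),'type':df['type'].mode()[0]},inplace=True)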

In [6]:
df.isnull().sum()

anime_id     0
name         0
genre       62
type         0
episodes     0
rating       0
members      0
dtype: int64
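
The check above still reports 62 missing `genre` values. One possible way to handle them is to fill the text column with a placeholder label. The sketch below uses a small made-up frame rather than `anime.csv`, and the `'Unknown'` label is an assumption chosen for illustration, not something the dataset defines.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the anime DataFrame (values are made up for illustration).
toy = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'genre': ['Action, Comedy', np.nan, 'Drama'],
    'rating': [8.0, np.nan, 6.0],
})

# Fill the text column with a placeholder and the numeric column with its mean.
toy_filled = toy.fillna({'genre': 'Unknown', 'rating': toy['rating'].mean()})

# Count the missing values that remain in each column.
missing_after = toy_filled.isnull().sum()
print(missing_after)
```

Dropping the affected rows with `dropna(subset=['genre'])` would be an alternative if a placeholder genre is not wanted.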

In [7]:
from sklearn.preprocessing import StandardScaler
# Standardize rating to zero mean and unit variance
std=StandardScaler()
df[['rating']]=std.fit_transform(df[['rating']])
df.head()

Unnamed: 0,anime_id,name,genre,type,episodes,rating,members
0,32281,Kimi no Na wa.,"Drama, Romance, School, Supernatural",Movie,1,2.847535,200630
1,5114,Fullmetal Alchemist: Brotherhood,"Action, Adventure, Drama, Fantasy, Magic, Mili...",TV,64,2.73938,793665
2,28977,Gintama°,"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,2.729547,114262
3,9253,Steins;Gate,"Sci-Fi, Thriller",TV,24,2.650889,673572
4,9969,Gintama&#039;,"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,2.641057,151266


## Feature Extraction:
    Decide on the features that will be used for computing similarity (e.g., genres, user ratings).
    Convert categorical features into numerical representations if necessary.
    Normalize numerical features if required.
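
As a rough illustration of the last two points, the comma-separated `genre` strings could be turned into one 0/1 column per genre and the rating rescaled to the same range. This is only a sketch on made-up rows; the toy frame and the choice of `MinMaxScaler` are assumptions, and the cells that follow use a different feature matrix.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy rows shaped like the anime data (values are made up for illustration).
toy = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'genre': ['Action, Comedy', 'Comedy', 'Drama, Action'],
    'rating': [8.0, 6.0, 7.0],
}).set_index('name')

# One 0/1 column per genre; the separator matches the ", " used in the genre column.
genre_features = toy['genre'].str.get_dummies(sep=', ')

# Rescale rating to [0, 1] so it is on the same scale as the genre indicators.
rating_scaled = MinMaxScaler().fit_transform(toy[['rating']])

# Combine the genre indicators and the scaled rating into one feature matrix.
features = genre_features.assign(rating=rating_scaled[:, 0])
print(features)
```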

In [8]:
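# One row per title and one column per type; each cell holds the mean standardized rating
df1=df.pivot_table(index='name',columns='type',values='rating')
# A title has no rating under the types it does not belong to, so fill those cells with 0
df1.fillna(0,axis=1,inplace=True)
df1.head()
df1.shape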

(12292, 6)

In [9]:
from sklearn.metrics.pairwise import cosine_similarity

In [10]:
cos_sim=cosine_similarity(df1)

In [11]:
cos_sim.shape

(12292, 12292)

In [12]:
pd.DataFrame(cos_sim[3])

Unnamed: 0,0
0,0.0
1,0.0
2,1.0
3,1.0
4,0.0
...,...
12287,1.0
12288,1.0
12289,0.0
12290,0.0
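
The scores shown are only 0.0 and 1.0, which is what one would expect if each row of `df1` has a single nonzero entry (assuming each title appears under one type): two such rows either point along the same axis or are orthogonal. The sketch below checks this on made-up rows, computing cosine similarity by hand and comparing it with `cosine_similarity`; the helper `cosine` and the sample values are hypothetical.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Made-up rows shaped like df1: columns are types, one nonzero rating per row.
rows = np.array([
    [2.8, 0.0, 0.0],   # type 0, above-average rating
    [0.5, 0.0, 0.0],   # type 0, above-average rating
    [-1.2, 0.0, 0.0],  # type 0, below-average rating
    [0.0, 1.7, 0.0],   # type 1
])

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pairwise similarities computed by hand and with scikit-learn.
manual = np.array([[cosine(a, b) for b in rows] for a in rows])
library = cosine_similarity(rows)
print(np.round(manual, 2))
```

With features like these, the magnitude of the rating drops out: rows of the same type score 1 when their standardized ratings share a sign and -1 when they do not, and rows of different types score 0.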


In [13]:
df1.index

Index(['&quot;0&quot;',
       '&quot;Aesop&quot; no Ohanashi yori: Ushi to Kaeru, Yokubatta Inu',
       '&quot;Bungaku Shoujo&quot; Kyou no Oyatsu: Hatsukoi',
       '&quot;Bungaku Shoujo&quot; Memoire',
       '&quot;Bungaku Shoujo&quot; Movie', '&quot;Eiji&quot;',
       '&quot;Eiyuu&quot; Kaitai', '.hack//G.U. Returner',
       '.hack//G.U. Trilogy', '.hack//G.U. Trilogy: Parody Mode',
       ...
       's.CRY.ed', 'vivi', 'xxxHOLiC', 'xxxHOLiC Kei',
       'xxxHOLiC Movie: Manatsu no Yoru no Yume', 'xxxHOLiC Rou',
       'xxxHOLiC Shunmuki', 'Üks Uks', 'ēlDLIVE', '◯'],
      dtype='object', name='name', length=12292)

## Recommendation System:
    Design a function to recommend anime based on cosine similarity.
    Given a target anime, recommend a list of similar anime based on cosine similarity scores.
    Experiment with different threshold values for similarity scores to adjust the recommendation list size.

In [14]:
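def recommended_movie(similar_movie,threshold=1):
    """Print titles whose cosine similarity to similar_movie is at least threshold."""
    if similar_movie in df1.index:
        # Row position of the target title in the similarity matrix
        index= np.where(similar_movie==df1.index)[0][0]
        similar=list(enumerate(cos_sim[index]))
        # Keep scores at or above the threshold, excluding the target itself
        filtered_similar = [
            (i, score) for i, score in similar
            if score >= threshold and i != index
        ]
        # Sort by score, highest first; the slice skips the first match and keeps the next five
        filtered_similar = sorted(filtered_similar, reverse=True, key=lambda x: x[1])[1:6]
        print('Recommended movie of ',similar_movie)
        print('*'*30)
        for i in filtered_similar:
            print(df1.index[i[0]])
    else:
        print('Movie is not in the list')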

In [15]:
recommended_movie('Kimi no Na wa.')

Recommended movie of  Kimi no Na wa.
******************************
.hack//G.U. Trilogy
.hack//The Movie: Sekai no Mukou ni
009 Re:Cyborg
1000-nen Joou: Queen Millennia
11-nin Iru!


In [16]:
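def recommended_movie(similar_movie,threshold=0.8):
    """Same as above, with a lower default threshold and a longer result list."""
    if similar_movie in df1.index:
        # Row position of the target title in the similarity matrix
        index= np.where(similar_movie==df1.index)[0][0]
        similar=list(enumerate(cos_sim[index]))
        # Keep scores at or above the threshold, excluding the target itself
        filtered_similar = [
            (i, score) for i, score in similar
            if score >= threshold and i != index
        ]
        # Sort by score, highest first; the slice skips the first match and keeps the next seven
        filtered_similar = sorted(filtered_similar, reverse=True, key=lambda x: x[1])[1:8]
        print('Recommended movie of ',similar_movie)
        print('*'*30)
        for i in filtered_similar:
            print(df1.index[i[0]])
    else:
        print('Movie is not in the list')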

In [17]:
recommended_movie('Kimi no Na wa.')

Recommended movie of  Kimi no Na wa.
******************************
.hack//G.U. Trilogy
.hack//The Movie: Sekai no Mukou ni
009 Re:Cyborg
1000-nen Joou: Queen Millennia
11-nin Iru!
8-gatsu no Symphony: Shibuya 2002-2003
Aa! Megami-sama! Movie
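
Both versions of `recommended_movie` remove the target with `i != index` and then slice from position 1, so the highest-ranked remaining title is never shown. A possible variant is sketched below: it returns the results instead of printing them and takes the top `top_n` entries without skipping any. The name `recommend`, its parameters, and the toy genre matrix are hypothetical and are not part of the notebook above.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

def recommend(title, features, threshold=0.5, top_n=5):
    """Return up to top_n (title, score) pairs with similarity >= threshold."""
    if title not in features.index:
        return []
    # Similarity of the target row to every row, computed on demand.
    sim = cosine_similarity(features.loc[[title]], features)[0]
    # Drop the target itself, then filter by threshold and rank by score.
    scores = pd.Series(sim, index=features.index).drop(title)
    scores = scores[scores >= threshold].sort_values(ascending=False)
    return list(scores.head(top_n).items())

# Made-up genre indicator matrix for illustration.
toy_features = pd.DataFrame(
    [[1, 1, 0], [1, 0, 0], [0, 0, 1], [1, 1, 1]],
    index=['A', 'B', 'C', 'D'],
    columns=['Action', 'Comedy', 'Drama'],
)

# A lower threshold admits more titles; a higher one shortens the list.
loose = recommend('A', toy_features, threshold=0.5)
strict = recommend('A', toy_features, threshold=0.8)
print(loose)
print(strict)
```

On this toy matrix the threshold visibly changes the list size, because the genre-based scores take intermediate values rather than only 0 and 1.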


## Interview Questions:
    1. Can you explain the difference between user-based and item-based collaborative filtering?
    
           User-based: finds users similar to the target user based on past behavior, and recommends items liked by those similar users.

           Item-based: finds items similar to the ones the target user has already interacted with, and recommends those similar items.
           
    2. What is collaborative filtering, and how does it work?

           Collaborative filtering is a recommendation technique that uses past user-item interactions to predict future user preferences. It measures how alike users (or items) are with a similarity metric such as cosine similarity or Euclidean distance, and then recommends items favored by the most similar users, or items closest to those the user already liked.
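
The two flavours of collaborative filtering can be illustrated on a small user-item rating matrix: the user-based view compares rows, and the item-based view compares columns. This is a minimal sketch with made-up ratings; the matrix, the user and item labels, and treating 0 as "not rated" are all assumptions for illustration.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Made-up user-item ratings (0 = not rated); rows are users, columns are items.
ratings = pd.DataFrame(
    [[5, 4, 0], [4, 5, 0], [0, 1, 5]],
    index=['u1', 'u2', 'u3'],
    columns=['item_a', 'item_b', 'item_c'],
)

# User-based: compare rows (users) with each other.
user_sim = pd.DataFrame(cosine_similarity(ratings),
                        index=ratings.index, columns=ratings.index)

# Item-based: compare columns (items) by transposing the matrix.
item_sim = pd.DataFrame(cosine_similarity(ratings.T),
                        index=ratings.columns, columns=ratings.columns)

# Most similar user to u1 and most similar item to item_a, excluding themselves.
nearest_user = user_sim['u1'].drop('u1').idxmax()
nearest_item = item_sim['item_a'].drop('item_a').idxmax()
print(nearest_user, nearest_item)
```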
           