# __Item-Based Collaborative Filtering__

- Let's build a simple item-based collaborative filter, which compares items by the ratings users gave them.


## Step 1: Import Required Libraries

- Import pandas and cosine_similarity from sklearn

In [None]:
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

## Step 2: Define the Dataset

- Create a dictionary with users and their ratings for different web series


In [None]:
dataset = {
    'user1': {'Special Ops': 5,
              'Criminal Justice': 3,
              'Panchayat': 3,
              'Sacred Games': 3,
              'Apharan': 2,
              'Mirzapur': 3},

    'user2': {'Special Ops': 5,
              'Criminal Justice': 3,
              'Sacred Games': 5,
              'Panchayat': 5,
              'Mirzapur': 3,
              'Apharan': 3},

    'user3': {'Special Ops': 2,
              'Panchayat': 5,
              'Sacred Games': 3,
              'Mirzapur': 4},

    'user4': {'Panchayat': 5,
              'Mirzapur': 4,
              'Sacred Games': 4},

    'user5': {'Special Ops': 4,
              'Criminal Justice': 4,
              'Panchayat': 4,
              'Mirzapur': 3,
              'Apharan': 2},

    'user6': {'Special Ops': 3,
              'Panchayat': 4,
              'Mirzapur': 3,
              'Sacred Games': 5,
              'Apharan': 3},

    'user7': {'Panchayat': 4,
              'Apharan': 1,
              'Sacred Games': 4},
}

__Observation:__
- The dataset maps each of the seven users to the web series they watched and the rating they gave each one.

## Step 3: Create a DataFrame of the Dataset

- Convert the dataset to a DataFrame
- Once converted, the DataFrame has missing values (NaN) wherever a user did not watch a web series.
- Hence, fill those entries with **Not Seen Yet**


In [None]:
dataset_df=pd.DataFrame(dataset)
dataset_df.fillna("Not Seen Yet",inplace=True)
dataset_df

| | user1 | user2 | user3 | user4 | user5 | user6 | user7 |
|---|---|---|---|---|---|---|---|
| Special Ops | 5 | 5 | 2.0 | Not Seen Yet | 4.0 | 3.0 | Not Seen Yet |
| Criminal Justice | 3 | 3 | Not Seen Yet | Not Seen Yet | 4.0 | Not Seen Yet | Not Seen Yet |
| Panchayat | 3 | 5 | 5.0 | 5.0 | 4.0 | 4.0 | 4.0 |
| Sacred Games | 3 | 5 | 3.0 | 4.0 | Not Seen Yet | 5.0 | 4.0 |
| Apharan | 2 | 3 | Not Seen Yet | Not Seen Yet | 2.0 | 3.0 | 1.0 |
| Mirzapur | 3 | 3 | 4.0 | 4.0 | 3.0 | 3.0 | Not Seen Yet |


__Observations:__
- Each column is a user and each row is a web series; the cells hold the ratings.
- **Not Seen Yet** marks the web series a user has not watched.

## Step 4: Define unique_items Function

- Create a custom function to get the unique web series in the dataset


In [None]:
def unique_items():
    # Collect every series title rated by any user; the set drops duplicates
    unique_items_set = {item for person in dataset for item in dataset[person]}
    # Set order is arbitrary, so the list order may differ between runs
    return list(unique_items_set)

In [None]:
unique_items()

['Apharan',
 'Mirzapur',
 'Special Ops',
 'Criminal Justice',
 'Panchayat',
 'Sacred Games']

__Observations:__
- The output lists the six unique web series in the dataset.
- The titles are plain text, so two items cannot be compared by name alone.
- Hence, define a function that measures the similarity between two items from the ratings users gave them.

## Step 5: Define a Function to Find item_similarity
- Find the users who rated both items.

- Get those users' ratings for item 1 and item 2.

- Calculate the cosine similarity between the item 1 and item 2 rating vectors.
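
For reference, the last step uses the standard cosine similarity formula. The symbols $a$, $b$, and $u$ are notation introduced for this note: $a$ and $b$ are the ratings of item 1 and item 2, and $u$ ranges over the users who rated both items.

```latex
\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
           = \frac{\sum_{u} a_u b_u}{\sqrt{\sum_{u} a_u^{2}} \, \sqrt{\sum_{u} b_u^{2}}}
```

Because all ratings are positive, the result lies between 0 and 1, with 1 meaning the two rating vectors point in exactly the same direction.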

In [None]:
def item_similarity(item1, item2):
    # Users who rated both items
    both_rated = [person for person in dataset
                  if item1 in dataset[person] and item2 in dataset[person]]

    # With no common raters there is nothing to compare
    if len(both_rated) == 0:
        return 0

    # cosine_similarity expects 2D input: one row per rating vector
    item1_ratings = [[dataset[person][item1] for person in both_rated]]
    item2_ratings = [[dataset[person][item2] for person in both_rated]]

    cs = cosine_similarity(item1_ratings, item2_ratings)
    return cs[0][0]

In [None]:
print("Cosine Similarity:: ",item_similarity('Panchayat','Special Ops'))

Cosine Similarity::  0.9199418174856334


__Observation:__
- The cosine similarity between Panchayat and Special Ops is about 0.92, computed over the five users who rated both.
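
The score can be checked by hand. The sketch below uses only the standard library and copies the ratings of the five users who rated both series (user1, user2, user3, user5, user6) from the dataset in Step 2; the variable names are chosen for this illustration.

```python
import math

# Ratings from user1, user2, user3, user5, user6 (the users who rated both series)
panchayat = [3, 5, 5, 4, 4]
special_ops = [5, 5, 2, 4, 3]

# Dot product of the two rating vectors
dot = sum(p * s for p, s in zip(panchayat, special_ops))        # 78

# Euclidean length of each rating vector
norm_panchayat = math.sqrt(sum(p * p for p in panchayat))       # sqrt(91)
norm_special_ops = math.sqrt(sum(s * s for s in special_ops))   # sqrt(79)

similarity = dot / (norm_panchayat * norm_special_ops)
print(round(similarity, 4))  # 0.9199
```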

## Step 6: Check for Similarity Between Multiple Items
- Define a function, `most_similar_items`, that scores every other item against a target item and sorts the results from most to least similar

In [None]:
def most_similar_items(target_item):
    un_lst=unique_items()
    scores = [(item_similarity(target_item,other_item),target_item+" --> "+other_item) for other_item in un_lst if other_item!=target_item]
    scores.sort(reverse=True)
    return scores

In [None]:
print(most_similar_items('Panchayat'))

[(0.9908301680442989, 'Panchayat --> Mirzapur'), (0.9749005254295224, 'Panchayat --> Sacred Games'), (0.9701425001453318, 'Panchayat --> Criminal Justice'), (0.9563650695950072, 'Panchayat --> Apharan'), (0.9199418174856334, 'Panchayat --> Special Ops')]


__Observations:__
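- Taking Panchayat as the target item, the function returns a similarity score for every other item, sorted from most to least similar.
- Mirzapur is the closest match (about 0.99) and Special Ops the most distant (about 0.92). All scores are high because every rating is positive, so the rating vectors point in broadly similar directions.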
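
The same idea extends to every pair of items at once. The following is a minimal sketch, not part of the original notebook: it uses only the standard library, and the names `ratings`, `cosine`, and `pair_scores` are hypothetical. The ratings are the Step 2 dataset written compactly.

```python
import math
from itertools import combinations

# Same ratings as the dataset in Step 2, written compactly
ratings = {
    'user1': {'Special Ops': 5, 'Criminal Justice': 3, 'Panchayat': 3,
              'Sacred Games': 3, 'Apharan': 2, 'Mirzapur': 3},
    'user2': {'Special Ops': 5, 'Criminal Justice': 3, 'Sacred Games': 5,
              'Panchayat': 5, 'Mirzapur': 3, 'Apharan': 3},
    'user3': {'Special Ops': 2, 'Panchayat': 5, 'Sacred Games': 3, 'Mirzapur': 4},
    'user4': {'Panchayat': 5, 'Mirzapur': 4, 'Sacred Games': 4},
    'user5': {'Special Ops': 4, 'Criminal Justice': 4, 'Panchayat': 4,
              'Mirzapur': 3, 'Apharan': 2},
    'user6': {'Special Ops': 3, 'Panchayat': 4, 'Mirzapur': 3,
              'Sacred Games': 5, 'Apharan': 3},
    'user7': {'Panchayat': 4, 'Apharan': 1, 'Sacred Games': 4},
}

def cosine(item1, item2):
    """Cosine similarity over the users who rated both items (0 if none did)."""
    pairs = [(r[item1], r[item2]) for r in ratings.values()
             if item1 in r and item2 in r]
    if not pairs:
        return 0
    dot = sum(a * b for a, b in pairs)
    norm1 = math.sqrt(sum(a * a for a, _ in pairs))
    norm2 = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm1 * norm2)

# Sorted titles give a stable pair order between runs
items = sorted({item for r in ratings.values() for item in r})
pair_scores = {(a, b): cosine(a, b) for a, b in combinations(items, 2)}

# Print all pairs from most to least similar
for (a, b), score in sorted(pair_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{a} <-> {b}: {score:.4f}")
```

Cosine similarity is symmetric, so scoring each unordered pair once is enough: six items give 15 pairs.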

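## Step 7: Find Seen and Unseen Movies for a Target User

- Define a function that, for a target user, returns the movies (web series) the user has not seen yet, which are the candidates for recommendation, along with the movies the user has already seen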


In [None]:
def target_movies_to_users(target_person):
    # Movies the target user has already rated, in dataset order
    target_person_movie_lst = list(dataset[target_person])

    # Movies in the catalogue that the target user has not rated
    s = set(unique_items())
    recommended_movies = list(s.difference(target_person_movie_lst))

    # Always return a pair so the caller can unpack it,
    # even when the user has seen everything (empty first list)
    return recommended_movies, target_person_movie_lst

__Observations:__
- The function returns the unseen (recommended) movies and the target person's list of seen movies.
- Call the function for user 7 to get the unseen and seen movies.
- Put both lists in a dictionary and display it as a DataFrame.

In [None]:
unseen_movies,seen_movies=target_movies_to_users('user7')

dct = {"Unseen Movies":unseen_movies,"Seen Movies":seen_movies}
pd.DataFrame(dct)

| | Unseen Movies | Seen Movies |
|---|---|---|
| 0 | Criminal Justice | Panchayat |
| 1 | Special Ops | Apharan |
| 2 | Mirzapur | Sacred Games |


__Observation:__
- In the above output, we can see the list of movies seen and unseen by user 7.
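
The notebook stops at listing seen and unseen movies. One common way to turn those lists into ranked recommendations, offered here only as a sketch and not as the notebook's own method, is to predict a rating for each unseen item as the similarity-weighted average of the user's ratings on the items they have seen. The names `ratings`, `cosine`, and `predict_ratings` are hypothetical, and the code uses only the standard library.

```python
import math

# Same ratings as the dataset in Step 2, written compactly
ratings = {
    'user1': {'Special Ops': 5, 'Criminal Justice': 3, 'Panchayat': 3,
              'Sacred Games': 3, 'Apharan': 2, 'Mirzapur': 3},
    'user2': {'Special Ops': 5, 'Criminal Justice': 3, 'Sacred Games': 5,
              'Panchayat': 5, 'Mirzapur': 3, 'Apharan': 3},
    'user3': {'Special Ops': 2, 'Panchayat': 5, 'Sacred Games': 3, 'Mirzapur': 4},
    'user4': {'Panchayat': 5, 'Mirzapur': 4, 'Sacred Games': 4},
    'user5': {'Special Ops': 4, 'Criminal Justice': 4, 'Panchayat': 4,
              'Mirzapur': 3, 'Apharan': 2},
    'user6': {'Special Ops': 3, 'Panchayat': 4, 'Mirzapur': 3,
              'Sacred Games': 5, 'Apharan': 3},
    'user7': {'Panchayat': 4, 'Apharan': 1, 'Sacred Games': 4},
}

def cosine(item1, item2):
    """Cosine similarity over the users who rated both items (0 if none did)."""
    pairs = [(r[item1], r[item2]) for r in ratings.values()
             if item1 in r and item2 in r]
    if not pairs:
        return 0
    dot = sum(a * b for a, b in pairs)
    norm1 = math.sqrt(sum(a * a for a, _ in pairs))
    norm2 = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm1 * norm2)

def predict_ratings(target_person):
    """Predict a rating for each unseen item from the user's own ratings."""
    seen = ratings[target_person]
    all_items = {item for r in ratings.values() for item in r}
    predictions = {}
    for candidate in all_items - set(seen):
        # Weight each seen item by its similarity to the candidate
        weights = {item: cosine(candidate, item) for item in seen}
        total = sum(weights.values())
        if total > 0:  # skip candidates with no usable similarity
            predictions[candidate] = sum(weights[i] * seen[i] for i in seen) / total
    return predictions

predictions = predict_ratings('user7')
# Highest predicted rating first
for item, score in sorted(predictions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item}: {score:.2f}")
```

Each prediction is a weighted average of the user's own ratings, so it always falls between their lowest and highest rating (1 and 4 for user 7).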