# What Is Item-Based Collaborative Filtering?

- Item-based collaborative filtering is a technique used in recommender systems to provide personalized recommendations from user-item interaction data such as ratings. It is a form of collaborative filtering that measures the similarity between items rather than between users.

- In item-based collaborative filtering, recommendations are generated by identifying items that are similar to the ones a user has already shown interest in. The underlying assumption is that a user who likes or interacts with a particular item is likely to be interested in items similar to it.

- The process of item-based collaborative filtering typically involves the following steps:

- Data collection: Gather data on user-item interactions, such as ratings, reviews, or purchase history.

- Item similarity calculation: Calculate the similarity between items based on various metrics, such as cosine similarity or Pearson correlation. The similarity is usually determined by comparing the ratings or preferences of users who have interacted with both items.

- Neighborhood selection: Identify a subset of similar items for each item in the system. This subset, known as the item's neighborhood, consists of items that are most similar to the item in question.

- Recommendation generation: Once the neighborhoods are established, the system generates recommendations from the items a user has already rated or interacted with. For a given user, it looks up the neighborhoods of those items, keeps the neighbors the user has not interacted with yet, and recommends them on the assumption that the user will likely be interested in them.

- Item-based collaborative filtering has several advantages. Item-to-item similarities tend to change slowly, so they can be precomputed, which keeps the method efficient for large datasets and item catalogs. It can also serve a new user after only a few interactions, although it cannot recommend a brand-new item until that item has collected some ratings (the item side of the "cold start" problem). Additionally, it can provide accurate recommendations when the item similarities are reliable.

- However, item-based collaborative filtering can suffer from the "sparsity" problem, where the user-item interaction matrix is sparse, meaning that most users have only interacted with a small fraction of the available items. In such cases, it can be challenging to find a sufficient number of similar items for recommendation.

- Overall, item-based collaborative filtering is a popular and effective approach in building recommender systems, particularly in scenarios where item similarities are well-defined and easily calculated.
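
The four steps described above can be sketched on a toy rating matrix. This is a minimal illustration under stated assumptions, not the pipeline used later in this notebook: the item names, the ratings, the neighborhood size `k`, and the `recommend` helper are all made up, and unrated entries are stored as 0 for simplicity.

```python
import numpy as np

# Step 1 (data collection): hypothetical ratings, rows = users, columns = items.
# A value of 0 means "not rated" in this simplified sketch.
items = ["cleanser", "toner", "mask", "cream"]
ratings = np.array([
    [5, 0, 1, 0],
    [4, 5, 0, 1],
    [5, 4, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Step 2 (item similarity): cosine similarity between every pair of item columns.
norms = np.linalg.norm(ratings, axis=0)
similarity = (ratings.T @ ratings) / np.outer(norms, norms)

# Step 3 (neighborhood selection): the k most similar other items for each item.
k = 2
neighborhoods = {}
for i in range(len(items)):
    ranked = [j for j in np.argsort(-similarity[i]) if j != i]
    neighborhoods[i] = ranked[:k]

# Step 4 (recommendation): score each unrated item by the similarity-weighted
# average of the user's ratings on that item's neighbors.
def recommend(user_index, top_n=1):
    user = ratings[user_index]
    scores = {}
    for i in range(len(items)):
        if user[i] > 0:
            continue  # skip items the user already rated
        rated_neighbors = [j for j in neighborhoods[i] if user[j] > 0]
        if not rated_neighbors:
            continue  # nothing to base a score on
        weights = similarity[i, rated_neighbors]
        scores[items[i]] = float(weights @ user[rated_neighbors] / weights.sum())
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend(0))
```

For the first user, who rated the cleanser highly, the sketch ranks the toner first because the toner's ratings move together with the cleanser's.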

![](https://predictivehacks.com/wp-content/uploads/2020/06/recommenders_systems.png)

# Road Map

- 1. Preparation of Data Set
- 2. Creating User Skincare Df
- 3. Making Item-Based Skincare Suggestions
- 4. Preparation of Study Script

# 1. Preparation of Data Set

In [None]:
# import Required Libraries

import pandas as pd
import numpy as np

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [None]:
# Adjusting pandas display settings (show all rows and columns, wider output)

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 500)
pd.set_option('display.expand_frame_repr', False)

In [None]:
# Loading the Data Set

skincare = pd.read_csv('skincare.csv')
rating = pd.read_csv('item based.csv')

In [None]:
# Merging skincare and rating data sets

df = skincare.merge(rating, how="left", on="SkincareID")
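
A left merge keeps every row of the left table, so a product without any rating would survive the merge with missing values in the rating columns. The sketch below shows this on two hypothetical miniature tables (the real CSV files are not reproduced here); `products`, `votes`, and their contents are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the product list and the ratings file
products = pd.DataFrame({"SkincareID": [1, 2, 3],
                         "Nama": ["Gel A", "Wash B", "Mask C"]})
votes = pd.DataFrame({"SkincareID": [1, 1, 2],
                      "UserID": [10, 11, 10],
                      "Rating": [4, 5, 2]})

# how="left" keeps every product; indicator=True adds a "_merge" column
# that records whether a matching rating row was found
merged = products.merge(votes, how="left", on="SkincareID", indicator=True)

# Products that never received a rating appear as "left_only" with NaN ratings
unrated = merged.loc[merged["_merge"] == "left_only", "Nama"].tolist()

print(merged)
print(unrated)
```

Checking for `left_only` rows after a merge is a cheap way to spot products that would otherwise end up as all-NaN columns later on.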

In [None]:
# Preliminary examination of the data set

def check_df(dataframe, head=5):
    print('##################### Shape #####################')
    print(dataframe.shape)
    print('##################### Types #####################')
    print(dataframe.dtypes)
    print('##################### Head #####################')
    print(dataframe.head(head))
    print('##################### Tail #####################')
    print(dataframe.tail(head))
    print('##################### NA #####################')
    print(dataframe.isnull().sum())
    print('##################### Quantiles #####################')
    print(dataframe.describe([0, 0.05, 0.50, 0.95, 0.99, 1]).T)

check_df(df)

##################### Shape #####################
(200, 5)
##################### Types #####################
SkincareID      int64
Nama           object
UserID          int64
Cocok/Tidak    object
Rating          int64
dtype: object
##################### Head #####################
   SkincareID                Nama  UserID  Cocok/Tidak  Rating
0           1  Lightening Day Gel       1        Cocok       3
1           1  Lightening Day Gel       2        Cocok       3
2           1  Lightening Day Gel       3  Tidak Cocok       2
3           1  Lightening Day Gel       4        Cocok       5
4           1  Lightening Day Gel       5  Tidak Cocok       2
##################### Tail #####################
     SkincareID             Nama  UserID  Cocok/Tidak  Rating
195          10  Jelly Mask Cool      16        Cocok       5
196          10  Jelly Mask Cool      17  Tidak Cocok       2
197          10  Jelly Mask Cool      18        Cocok       5
198          10  Jelly Mask Cool      19  T

# 2. Creating User Skincare Df

In [None]:
# Count how many ratings each skincare product received.
skincare_counts = pd.DataFrame(df["Nama"].value_counts())

In [None]:
skincare_counts.head(20)

| Nama | count |
| --- | --- |
| Lightening Day Gel | 20 |
| ERHA 1 Facial Wash For Normal & Dry Skin | 20 |
| Scarlett Whitening & Hydrating Gel Mask | 20 |
| Pixy White Aqua Pore Cleanse Micellar Foam | 20 |
| Bija Trouble Facial Wash | 20 |
| AcneAct Gentle Acne Mosturizer | 20 |
| Peel Off Mask Lavender | 20 |
| Wonderskin Ultimate Cream | 20 |
| Light Complete White Speed Foam | 20 |
| Jelly Mask Cool | 20 |


In [None]:
df["Nama"].nunique()

10

In [None]:
df["Nama"].unique()

array(['Lightening Day Gel', 'ERHA 1 Facial Wash For Normal & Dry Skin',
       'Scarlett Whitening & Hydrating Gel Mask',
       'Pixy White Aqua Pore Cleanse Micellar Foam',
       'Bija Trouble Facial Wash', 'AcneAct Gentle Acne Mosturizer',
       'Peel Off Mask Lavender', 'Wonderskin Ultimate Cream',
       'Light Complete White Speed Foam', 'Jelly Mask Cool'], dtype=object)

In [None]:
# Creating User Skincare Df
user_skincare_df = df.pivot_table(index=["UserID"], columns=["Nama"], values="Rating")

In [None]:
user_skincare_df

| UserID | AcneAct Gentle Acne Mosturizer | Bija Trouble Facial Wash | ERHA 1 Facial Wash For Normal & Dry Skin | Jelly Mask Cool | Light Complete White Speed Foam | Lightening Day Gel | Peel Off Mask Lavender | Pixy White Aqua Pore Cleanse Micellar Foam | Scarlett Whitening & Hydrating Gel Mask | Wonderskin Ultimate Cream |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 4.0 | 4.0 | 1.0 | 2.0 | 1.0 | 3.0 | 4.0 | 1.0 | 2.0 | 1.0 |
| 2 | 2.0 | 5.0 | 3.0 | 2.0 | 3.0 | 3.0 | 4.0 | 4.0 | 4.0 | 3.0 |
| 3 | 4.0 | 5.0 | 4.0 | 3.0 | 4.0 | 2.0 | 2.0 | 5.0 | 5.0 | 4.0 |
| 4 | 5.0 | 2.0 | 5.0 | 4.0 | 4.0 | 5.0 | 5.0 | 3.0 | 3.0 | 4.0 |
| 5 | 5.0 | 2.0 | 4.0 | 3.0 | 5.0 | 2.0 | 2.0 | 4.0 | 4.0 | 2.0 |
| 6 | 2.0 | 1.0 | 4.0 | 4.0 | 2.0 | 4.0 | 2.0 | 5.0 | 1.0 | 5.0 |
| 7 | 4.0 | 4.0 | 2.0 | 2.0 | 2.0 | 3.0 | 5.0 | 2.0 | 2.0 | 5.0 |
| 8 | 5.0 | 2.0 | 4.0 | 5.0 | 4.0 | 5.0 | 5.0 | 4.0 | 4.0 | 5.0 |
| 9 | 4.0 | 3.0 | 2.0 | 4.0 | 2.0 | 2.0 | 4.0 | 2.0 | 2.0 | 4.0 |
| 10 | 5.0 | 1.0 | 4.0 | 5.0 | 4.0 | 5.0 | 4.0 | 4.0 | 5.0 | 4.0 |
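
The reshaping done by `pivot_table` is easier to see on a tiny example. The sketch below uses a hypothetical long-format table (`long_df` and its values are made up): each row is one rating, and the pivot turns it into one row per user and one column per product. Two details are worth noting: combinations that never occur become `NaN`, and repeated user-product pairs are averaged, because `pivot_table` aggregates with the mean by default.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, product) rating.
# User 2 rated "Gel A" twice to show the default mean aggregation.
long_df = pd.DataFrame({
    "UserID": [1, 1, 2, 2, 2],
    "Nama":   ["Gel A", "Wash B", "Gel A", "Gel A", "Mask C"],
    "Rating": [4, 2, 5, 3, 1],
})

# Rows become users, columns become products, cells hold the (mean) rating
wide = long_df.pivot_table(index="UserID", columns="Nama", values="Rating")

print(wide)
```

Here user 2 ends up with 4.0 for "Gel A" (the mean of 5 and 3), and user 1 has `NaN` for "Mask C" because no such rating exists.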


In [None]:
user_skincare_df.shape

(20, 10)

In [None]:
user_skincare_df.columns

Index(['AcneAct Gentle Acne Mosturizer', 'Bija Trouble Facial Wash', 'ERHA 1 Facial Wash For Normal & Dry Skin', 'Jelly Mask Cool', 'Light Complete White Speed Foam', 'Lightening Day Gel', 'Peel Off Mask Lavender', 'Pixy White Aqua Pore Cleanse Micellar Foam', 'Scarlett Whitening & Hydrating Gel Mask', 'Wonderskin Ultimate Cream'], dtype='object', name='Nama')

# 3. Making Item-Based Skincare Suggestions

In [None]:
# Custom function to compute the cosine similarity between two skincare items

def item_similarity(item1, item2):
    ratings = user_skincare_df  # user x item rating matrix built above
    # An item name that is not a column has no ratings, so report zero similarity
    if item1 not in ratings.columns or item2 not in ratings.columns:
        return 0

    # Keep only the users who rated both items
    both_rated = ratings[item1].notna() & ratings[item2].notna()
    number_of_ratings = both_rated.sum()
    if number_of_ratings == 0:
        return 0

    item1_ratings = [ratings.loc[both_rated, item1].to_numpy()]
    item2_ratings = [ratings.loc[both_rated, item2].to_numpy()]
    cs = cosine_similarity(item1_ratings, item2_ratings)
    return cs[0][0]

In [None]:
print("Cosine Similarity:: ",item_similarity('AcneAct Gentle Acne Mosturizer','Bija Trouble Facial Wash Ops'))

Cosine Similarity::  0
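
This section uses two different similarity measures: the custom function relies on cosine similarity, while the `corrwith` calls that follow compute Pearson correlation. The two can disagree noticeably on raw ratings, because Pearson correlation is the cosine similarity of the mean-centered vectors. The sketch below illustrates the difference with two made-up rating vectors; `cosine`, `pearson`, `item_x`, and `item_y` are names introduced only for this example.

```python
import numpy as np

def cosine(a, b):
    # Cosine of the angle between the two raw rating vectors
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson(a, b):
    # Pearson correlation: cosine similarity after subtracting each vector's mean
    return cosine(a - a.mean(), b - b.mean())

# Hypothetical ratings from the same four users for two items.
# Users who like item_x dislike item_y, and vice versa.
item_x = np.array([5.0, 4.0, 1.0, 2.0])
item_y = np.array([1.0, 2.0, 5.0, 4.0])

print(round(cosine(item_x, item_y), 3))   # positive, since all ratings are positive
print(round(pearson(item_x, item_y), 3))  # strongly negative after centering
```

Because ratings on a 1 to 5 scale are all positive, raw cosine similarity stays positive even for items with opposite rating patterns, which is one reason to prefer a centered measure when ranking similar items.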


In [None]:
skincare_name = "Lightening Day Gel"

In [None]:
skincare_name = user_skincare_df[skincare_name]

In [None]:
user_skincare_df.corrwith(skincare_name).sort_values(ascending=False).head(10)


| Nama | Correlation |
| --- | --- |
| Lightening Day Gel | 1.0 |
| ERHA 1 Facial Wash For Normal & Dry Skin | 0.541936 |
| AcneAct Gentle Acne Mosturizer | 0.502989 |
| Jelly Mask Cool | 0.471783 |
| Wonderskin Ultimate Cream | 0.40693 |
| Light Complete White Speed Foam | 0.392084 |
| Pixy White Aqua Pore Cleanse Micellar Foam | 0.179784 |
| Scarlett Whitening & Hydrating Gel Mask | 0.153214 |
| Peel Off Mask Lavender | 0.03412 |
| Bija Trouble Facial Wash | -0.655184 |
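
To make the ranking above easier to interpret, here is a small sketch of what `corrwith` does: it computes the Pearson correlation between one column (the selected product) and every column of the frame, using the shared `UserID` index to align ratings. The `toy` frame and its values are hypothetical and chosen so the result can be checked by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical user x product ratings for four users
toy = pd.DataFrame({
    "Gel A":  [5, 4, 1, 2],
    "Wash B": [4, 5, 2, 1],
    "Mask C": [1, 2, 5, 4],
}, index=pd.Index([1, 2, 3, 4], name="UserID"))

# Correlate every product column with the selected product's ratings
target = toy["Gel A"]
similar = toy.corrwith(target).sort_values(ascending=False)

print(similar)

# The same number can be reproduced with NumPy for a single pair of columns
print(np.corrcoef(toy["Gel A"], toy["Wash B"])[0, 1])
```

The selected product always correlates perfectly with itself, so the first row of such a ranking carries no information; the useful suggestions start from the second row.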


In [None]:
skincare_name2 = "Bija Trouble Facial Wash"

In [None]:
skincare_name = user_skincare_df[skincare_name2]

In [None]:
user_skincare_df.corrwith(skincare_name).sort_values(ascending=False).head(10)

| Nama | Correlation |
| --- | --- |
| Bija Trouble Facial Wash | 1.0 |
| Scarlett Whitening & Hydrating Gel Mask | 0.084737 |
| Peel Off Mask Lavender | -0.11559 |
| Pixy White Aqua Pore Cleanse Micellar Foam | -0.168045 |
| ERHA 1 Facial Wash For Normal & Dry Skin | -0.411005 |
| Wonderskin Ultimate Cream | -0.426424 |
| Light Complete White Speed Foam | -0.437856 |
| AcneAct Gentle Acne Mosturizer | -0.534571 |
| Jelly Mask Cool | -0.534803 |
| Lightening Day Gel | -0.655184 |


In [None]:
skincare_name3 = "Peel Off Mask Lavender"

In [None]:
skincare_name = user_skincare_df[skincare_name3]

In [None]:
user_skincare_df.corrwith(skincare_name).sort_values(ascending=False).head(10)

| Nama | Correlation |
| --- | --- |
| Peel Off Mask Lavender | 1.0 |
| Wonderskin Ultimate Cream | 0.185554 |
| AcneAct Gentle Acne Mosturizer | 0.13695 |
| Lightening Day Gel | 0.03412 |
| Light Complete White Speed Foam | 0.03236 |
| ERHA 1 Facial Wash For Normal & Dry Skin | -0.058116 |
| Pixy White Aqua Pore Cleanse Micellar Foam | -0.089156 |
| Bija Trouble Facial Wash | -0.11559 |
| Scarlett Whitening & Hydrating Gel Mask | -0.145372 |
| Jelly Mask Cool | -0.19335 |


In [None]:
skincare_name4 = "ERHA 1 Facial Wash For Normal & Dry Skin"

In [None]:
skincare_name = user_skincare_df[skincare_name4]

In [None]:
user_skincare_df.corrwith(skincare_name).sort_values(ascending=False).head(10)

| Nama | Correlation |
| --- | --- |
| ERHA 1 Facial Wash For Normal & Dry Skin | 1.0 |
| Light Complete White Speed Foam | 0.681586 |
| Pixy White Aqua Pore Cleanse Micellar Foam | 0.67402 |
| Scarlett Whitening & Hydrating Gel Mask | 0.548009 |
| Lightening Day Gel | 0.541936 |
| Wonderskin Ultimate Cream | 0.435385 |
| Jelly Mask Cool | 0.433952 |
| AcneAct Gentle Acne Mosturizer | 0.295315 |
| Peel Off Mask Lavender | -0.058116 |
| Bija Trouble Facial Wash | -0.411005 |


In [None]:
skincare_name5 = "Pixy White Aqua Pore Cleanse Micellar Foam"

In [None]:
skincare_name = user_skincare_df[skincare_name5]

In [None]:
user_skincare_df.corrwith(skincare_name).sort_values(ascending=False).head(10)

| Nama | Correlation |
| --- | --- |
| Pixy White Aqua Pore Cleanse Micellar Foam | 1.0 |
| ERHA 1 Facial Wash For Normal & Dry Skin | 0.67402 |
| Light Complete White Speed Foam | 0.62047 |
| Scarlett Whitening & Hydrating Gel Mask | 0.386717 |
| Wonderskin Ultimate Cream | 0.224679 |
| Jelly Mask Cool | 0.221339 |
| Lightening Day Gel | 0.179784 |
| AcneAct Gentle Acne Mosturizer | 0.035766 |
| Peel Off Mask Lavender | -0.089156 |
| Bija Trouble Facial Wash | -0.168045 |


In [None]:
skincare_name = pd.Series(user_skincare_df.columns).sample(1).values[0]


In [None]:
skincare_name = user_skincare_df[skincare_name]

In [None]:
user_skincare_df.corrwith(skincare_name).sort_values(ascending=False).head(10)

| Nama | Correlation |
| --- | --- |
| Lightening Day Gel | 1.0 |
| ERHA 1 Facial Wash For Normal & Dry Skin | 0.541936 |
| AcneAct Gentle Acne Mosturizer | 0.502989 |
| Jelly Mask Cool | 0.471783 |
| Wonderskin Ultimate Cream | 0.40693 |
| Light Complete White Speed Foam | 0.392084 |
| Pixy White Aqua Pore Cleanse Micellar Foam | 0.179784 |
| Scarlett Whitening & Hydrating Gel Mask | 0.153214 |
| Peel Off Mask Lavender | 0.03412 |
| Bija Trouble Facial Wash | -0.655184 |


In [None]:
def check_skincare(keyword, user_skincare_df):
    return [col for col in user_skincare_df.columns if keyword in col]

In [None]:
check_skincare("Scarlett", user_skincare_df)

['Scarlett Whitening & Hydrating Gel Mask']
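
The keyword lookup above is case-sensitive, so searching for "scarlett" or "mask" in lowercase would miss names that are capitalized differently. A possible case-insensitive variant is sketched below; `find_items` and the sample `names` list are hypothetical and not part of the original notebook.

```python
def find_items(keyword, item_names):
    # Compare case-folded strings so "mask" also matches "Mask"
    keyword = keyword.casefold()
    return [name for name in item_names if keyword in str(name).casefold()]

# A few product names taken from the data set shown above
names = [
    "Scarlett Whitening & Hydrating Gel Mask",
    "Jelly Mask Cool",
    "Lightening Day Gel",
]

print(find_items("mask", names))
```

With the pivot table from this notebook, the column index could be passed directly, for example `find_items("mask", user_skincare_df.columns)`.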

# 4. Preparation of Study Script

In [None]:
def create_user_skincare_df():
    import pandas as pd

    # Load the product list and the user ratings
    skincare = pd.read_csv('skincare.csv')
    rating = pd.read_csv('item based.csv')

    # Merge the datasets on 'SkincareID'
    merged_df = skincare.merge(rating, how="left", on="SkincareID")

    # Pivot to a user x product rating matrix (rows: UserID, columns: Nama)
    user_skincare_df = merged_df.pivot_table(index=["UserID"], columns=["Nama"], values="Rating")

    return user_skincare_df


In [None]:
user_skincare_df = create_user_skincare_df()

In [None]:
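def item_based_recommender(skincare_name, user_skincare_df):
    # Ratings column of the selected product
    item_ratings = user_skincare_df[skincare_name]
    # Pearson correlation of every product with the selected one, highest first
    return user_skincare_df.corrwith(item_ratings).sort_values(ascending=False).head(10)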

In [None]:
item_based_recommender("Lightening Day Gel", user_skincare_df)

| Nama | Correlation |
| --- | --- |
| Lightening Day Gel | 1.0 |
| ERHA 1 Facial Wash For Normal & Dry Skin | 0.541936 |
| AcneAct Gentle Acne Mosturizer | 0.502989 |
| Jelly Mask Cool | 0.471783 |
| Wonderskin Ultimate Cream | 0.40693 |
| Light Complete White Speed Foam | 0.392084 |
| Pixy White Aqua Pore Cleanse Micellar Foam | 0.179784 |
| Scarlett Whitening & Hydrating Gel Mask | 0.153214 |
| Peel Off Mask Lavender | 0.03412 |
| Bija Trouble Facial Wash | -0.655184 |
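
The ranking returned above always starts with the queried product itself at 1.0, which is not a useful suggestion. One possible refinement, sketched below under stated assumptions, drops the queried product, ignores products whose correlation is undefined, and lets the caller choose how many suggestions to return. The function name `recommend_similar_items`, its `top_n` parameter, and the `toy` ratings are hypothetical additions for illustration.

```python
import pandas as pd

def recommend_similar_items(item_name, ratings, top_n=5):
    """Rank the other items by Pearson correlation with `item_name`."""
    if item_name not in ratings.columns:
        raise KeyError(f"Unknown item: {item_name!r}")
    correlations = ratings.corrwith(ratings[item_name])
    # Remove the queried item itself and any items with undefined correlation
    correlations = correlations.drop(labels=item_name).dropna()
    return correlations.sort_values(ascending=False).head(top_n)

# Hypothetical user x product ratings for four users
toy = pd.DataFrame({
    "Gel A":  [5, 4, 1, 2],
    "Wash B": [4, 5, 2, 1],
    "Mask C": [1, 2, 5, 4],
}, index=pd.Index([1, 2, 3, 4], name="UserID"))

print(recommend_similar_items("Gel A", toy, top_n=1))
```

Raising a `KeyError` for an unknown name is a design choice made here so that a typo in the product name fails loudly instead of silently returning an empty result.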
