## What is Collaborative Filtering?
- Collaborative filtering finds users with similar tastes and recommends what those similar users like, for example by grouping users into clusters of similar types.
- Four types:
    - Memory-Based
    - Model-Based
    - Hybrid
    - Deep Learning
## Advantages
- Content-based recommender systems need descriptive item features, which limits their use cases and can raise their time complexity. Collaborative filtering needs only user preferences (such as ratings) to personalize content.

## Similarity

<div>
<img src="col_fil.png" width="600">
</div>

- Cosine Similarity = $\frac{A \cdot B}{\|A\| \times \|B\|} = \frac{\sum_{i=1}^{n} A_i \times B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \times \sqrt{\sum_{i=1}^{n} B_i^2}}$
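
As a worked instance of the formula, take two illustrative rating vectors $A = (5, 4, 0, 5)$ and $B = (0, 1, 0, 2)$. These are chosen for illustration and happen to match users `u1` and `u3` in the example below once missing ratings are treated as 0.

$$
\cos(A, B) = \frac{5 \cdot 0 + 4 \cdot 1 + 0 \cdot 0 + 5 \cdot 2}{\sqrt{5^2 + 4^2 + 0^2 + 5^2} \times \sqrt{0^2 + 1^2 + 0^2 + 2^2}} = \frac{14}{\sqrt{66} \times \sqrt{5}} \approx 0.7707
$$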


- User-based: find users similar to the target user, then recommend products those users liked.
- Item-based: find products similar to the ones the target user already liked, then recommend those.

- Reference: https://www.geeksforgeeks.org/collaborative-filtering-ml/

In [3]:
import math
import numpy as np
import pandas as pd
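# Ratings matrix: rows are users, columns are movies; None marks a missing rating.
df = pd.DataFrame(
    {
        'm1': [5, 4, None, 1],
        'm2': [4, None, 1, 2],
        'm3': [None, 3, None, None],
        'm4': [5, None, 2, None],
    },
    index=['u1', 'u2', 'u3', 'u4'],
)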
df

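|    | m1  | m2  | m3  | m4  |
|----|-----|-----|-----|-----|
| u1 | 5.0 | 4.0 | NaN | 5.0 |
| u2 | 4.0 | NaN | 3.0 | NaN |
| u3 | NaN | 1.0 | NaN | 2.0 |
| u4 | 1.0 | 2.0 | NaN | NaN |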


In [8]:
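# Treat missing ratings as 0 so every user has a full-length vector.
# Note: this makes an unrated movie look the same as a rating of 0.
df_filled = df.fillna(0)

def cosine_similarity(v1, v2):
    """Return the cosine similarity of two equal-length rating vectors."""
    dot_product = sum(a * b for a, b in zip(v1, v2))
    norm_v1 = math.sqrt(sum(a * a for a in v1))
    norm_v2 = math.sqrt(sum(b * b for b in v2))

    # An all-zero vector has no direction; define its similarity as 0.
    if norm_v1 == 0 or norm_v2 == 0:
        return 0.0

    return dot_product / (norm_v1 * norm_v2)


# User-user similarity: each row of df_filled is one user's rating vector.
rows = df_filled.values.tolist()
cosine_sim_matrix = [[cosine_similarity(row1, row2) for row2 in rows] for row1 in rows]
cosine_sim_df = pd.DataFrame(cosine_sim_matrix, index=df.index, columns=df.index)

cosine_sim_df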

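|    | u1       | u2       | u3       | u4       |
|----|----------|----------|----------|----------|
| u1 | 1.0      | 0.492366 | 0.770675 | 0.715626 |
| u2 | 0.492366 | 1.0      | 0.0      | 0.357771 |
| u3 | 0.770675 | 0.0      | 1.0      | 0.4      |
| u4 | 0.715626 | 0.357771 | 0.4      | 1.0      |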
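
The user-user similarities above can be turned into a recommendation score. The following is a minimal sketch, assuming a similarity-weighted average of other users' ratings, which is one common scheme and not the only one. It uses plain dictionaries with the same ratings as the example table so that it runs on its own; `user_vector` and `predict_user_based` are hypothetical helper names introduced for illustration.

```python
import math

# Ratings copied from the example table; a missing key means "not rated".
ratings = {
    "u1": {"m1": 5, "m2": 4, "m4": 5},
    "u2": {"m1": 4, "m3": 3},
    "u3": {"m2": 1, "m4": 2},
    "u4": {"m1": 1, "m2": 2},
}
items = ["m1", "m2", "m3", "m4"]


def cosine_similarity(v1, v2):
    dot_product = sum(a * b for a, b in zip(v1, v2))
    norm_v1 = math.sqrt(sum(a * a for a in v1))
    norm_v2 = math.sqrt(sum(b * b for b in v2))
    if norm_v1 == 0 or norm_v2 == 0:
        return 0.0
    return dot_product / (norm_v1 * norm_v2)


def user_vector(user):
    # Missing ratings become 0, mirroring df.fillna(0).
    return [ratings[user].get(item, 0) for item in items]


def predict_user_based(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        # Only other users who actually rated the item can contribute.
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(user_vector(user), user_vector(other))
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    if weight_total == 0:
        return None  # no similar user has rated this item
    return weighted_sum / weight_total


# u3 has not rated m1; u1, u2 and u4 have.
print(round(predict_user_based("u3", "m1"), 2))  # about 3.63
```

Here `u1` dominates the estimate for `u3` because it is the most similar user, while `u2` contributes nothing since its similarity to `u3` is 0. For `m3`, the only rater is `u2`, so the sketch returns `None` for `u3` rather than guessing.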


In [9]:
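# Item-item similarity: transpose so each row is one movie's vector of user ratings.
rows = df_filled.T.values.tolist()

cosine_sim_matrix = [[cosine_similarity(row1, row2) for row2 in rows] for row1 in rows]

# This overwrites the user-user matrix from the previous cell.
cosine_sim_df = pd.DataFrame(cosine_sim_matrix, index=df.columns, columns=df.columns)

cosine_sim_df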

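|    | m1       | m2       | m3       | m4       |
|----|----------|----------|----------|----------|
| m1 | 1.0      | 0.740779 | 0.617213 | 0.716335 |
| m2 | 0.740779 | 1.0      | 0.0      | 0.891485 |
| m3 | 0.617213 | 0.0      | 1.0      | 0.0      |
| m4 | 0.716335 | 0.891485 | 0.0      | 1.0      |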
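
The item-item similarities can be used the same way. This is a minimal sketch, assuming the score for an unrated movie is the similarity-weighted average of the ratings the user gave to other movies. It is self-contained and uses plain dictionaries with the same ratings as the example table; `item_vector` and `predict_item_based` are hypothetical helper names introduced for illustration.

```python
import math

# Ratings copied from the example table; a missing key means "not rated".
ratings = {
    "u1": {"m1": 5, "m2": 4, "m4": 5},
    "u2": {"m1": 4, "m3": 3},
    "u3": {"m2": 1, "m4": 2},
    "u4": {"m1": 1, "m2": 2},
}
users = list(ratings)


def cosine_similarity(v1, v2):
    dot_product = sum(a * b for a, b in zip(v1, v2))
    norm_v1 = math.sqrt(sum(a * a for a in v1))
    norm_v2 = math.sqrt(sum(b * b for b in v2))
    if norm_v1 == 0 or norm_v2 == 0:
        return 0.0
    return dot_product / (norm_v1 * norm_v2)


def item_vector(item):
    # One entry per user; missing ratings become 0, mirroring df.fillna(0).
    return [ratings[user].get(item, 0) for user in users]


def predict_item_based(user, item):
    """Similarity-weighted average of the user's own ratings on other items."""
    weighted_sum = 0.0
    weight_total = 0.0
    for rated_item, rating in ratings[user].items():
        if rated_item == item:
            continue
        sim = cosine_similarity(item_vector(item), item_vector(rated_item))
        weighted_sum += sim * rating
        weight_total += sim
    if weight_total == 0:
        return None  # nothing the user rated is similar to this item
    return weighted_sum / weight_total


# u3 rated m2 (1) and m4 (2); both are fairly similar to m1.
print(round(predict_item_based("u3", "m1"), 2))  # about 1.49
```

On this toy data the item-based estimate for `u3` on `m1` (about 1.49) is much lower than a user-based estimate would be, because it is built only from `u3`'s own low ratings. With so few ratings, either number should be read as an illustration of the mechanics rather than a reliable prediction.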
