# Collaborative Filtering Hybrid

1. Create the User-Item Interaction Matrix:
Rows are users, columns are items (books), and each cell records an interaction (e.g., a purchase) between that user and that book.

2. Compute Collaborative Filtering Similarity:
Calculate item-item cosine similarity from the interaction matrix, so that books bought by the same users score as similar.

3. Create Item Feature Matrix:
This matrix includes content-based features like book author and category.

4. Compute Content-Based Similarity:
Calculate the similarity between items based on their content features.

5. Combine Similarity Scores:
Average the collaborative filtering and content-based similarity matrices to get a hybrid similarity score for each pair of books.

6. Generate Recommendations:
Use the hybrid similarity scores to rank, for a given book, the other books most similar to it.

### 1. Create the User-Item Interaction Matrix:

In [12]:
import pandas as pd
import numpy as np

# Example data
data = pd.DataFrame({
    'user_id': [1, 1, 2, 2, 3, 3, 4],
    'book_id': ['A', 'B', 'A', 'C', 'B', 'C', 'D'],
    'interaction': [1, 1, 1, 1, 1, 1, 1]  # Interaction could be a purchase
})
data

| | user_id | book_id | interaction |
|---|---|---|---|
| 0 | 1 | A | 1 |
| 1 | 1 | B | 1 |
| 2 | 2 | A | 1 |
| 3 | 2 | C | 1 |
| 4 | 3 | B | 1 |
| 5 | 3 | C | 1 |
| 6 | 4 | D | 1 |


In [13]:
# Create user-item matrix
user_item_matrix = pd.pivot_table(data, index='user_id', columns='book_id', values='interaction', fill_value=0)
user_item_matrix

| user_id (rows) / book_id (columns) | A | B | C | D |
|---|---|---|---|---|
| 1 | 1.0 | 1.0 | 0.0 | 0.0 |
| 2 | 1.0 | 0.0 | 1.0 | 0.0 |
| 3 | 0.0 | 1.0 | 1.0 | 0.0 |
| 4 | 0.0 | 0.0 | 0.0 | 1.0 |
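
One detail worth checking when building this matrix from a real purchase log: `pd.pivot_table` aggregates duplicate (user, book) rows, and its default `aggfunc` is the mean. The sketch below uses a small hypothetical log (not the notebook's data) to show how `aggfunc='sum'` keeps purchase counts while `aggfunc='max'` keeps a binary "bought at least once" flag. Which one is appropriate depends on what the interaction is meant to express.

```python
import pandas as pd

# Hypothetical log in which user 1 bought book A twice
log = pd.DataFrame({
    'user_id': [1, 1, 1, 2],
    'book_id': ['A', 'A', 'B', 'A'],
    'interaction': [1, 1, 1, 1],
})

# Count repeated purchases
counts = pd.pivot_table(log, index='user_id', columns='book_id',
                        values='interaction', aggfunc='sum', fill_value=0)

# Keep only a binary "has interacted" signal
binary = pd.pivot_table(log, index='user_id', columns='book_id',
                        values='interaction', aggfunc='max', fill_value=0)

print(counts)
print(binary)
```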


### 2. Compute Collaborative Filtering Similarity:

In [17]:
from sklearn.metrics.pairwise import cosine_similarity

# Compute item-item similarity using collaborative filtering
item_similarity_cf = cosine_similarity(user_item_matrix.T)
item_similarity_cf

array([[1. , 0.5, 0.5, 0. ],
       [0.5, 1. , 0.5, 0. ],
       [0.5, 0.5, 1. , 0. ],
       [0. , 0. , 0. , 1. ]])

In [18]:
item_similarity_cf_df = pd.DataFrame(item_similarity_cf, index=user_item_matrix.columns, columns=user_item_matrix.columns)
item_similarity_cf_df

| book_id | A | B | C | D |
|---|---|---|---|---|
| A | 1.0 | 0.5 | 0.5 | 0.0 |
| B | 0.5 | 1.0 | 0.5 | 0.0 |
| C | 0.5 | 0.5 | 1.0 | 0.0 |
| D | 0.0 | 0.0 | 0.0 | 1.0 |
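
To see where a value such as 0.5 comes from, the cosine similarity of two books can be computed by hand from their columns in the user-item matrix. This is a minimal sketch; the helper name `cosine` is introduced here for illustration and is not part of the notebook.

```python
import numpy as np

# Purchase vectors over users 1..4, i.e. the A and B columns of the user-item matrix
book_a = np.array([1, 1, 0, 0])
book_b = np.array([1, 0, 1, 0])

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the vector norms."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# One shared buyer (user 1), and each book has two buyers: 1 / (sqrt(2) * sqrt(2))
sim_ab = cosine(book_a, book_b)
print(round(sim_ab, 2))
```

Book D shares no buyers with any other book, so its dot product with every other column is zero, which is why its row contains only zeros off the diagonal.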


### 3. Create Item Feature Matrix:

In [20]:
# Example book features (author and category)
book_features = pd.DataFrame({
    'book_id': ['A', 'B', 'C', 'D'],
    'author': ['Author1', 'Author2', 'Author1', 'Author3'],
    'category': ['Fiction', 'Non-Fiction', 'Fiction', 'Non-Fiction']
})
book_features

| | book_id | author | category |
|---|---|---|---|
| 0 | A | Author1 | Fiction |
| 1 | B | Author2 | Non-Fiction |
| 2 | C | Author1 | Fiction |
| 3 | D | Author3 | Non-Fiction |


In [21]:
# Convert categorical features to numerical values (e.g., using one-hot encoding)
book_features_encoded = pd.get_dummies(book_features.set_index('book_id'))
book_features_encoded

| book_id | author_Author1 | author_Author2 | author_Author3 | category_Fiction | category_Non-Fiction |
|---|---|---|---|---|---|
| A | True | False | False | True | False |
| B | False | True | False | False | True |
| C | True | False | False | True | False |
| D | False | False | True | False | True |
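
The `True`/`False` values above are what recent pandas versions produce by default. If explicit numeric columns are preferred before computing similarities, `pd.get_dummies` accepts a `dtype` argument. A small sketch, rebuilding the same example features so it runs on its own:

```python
import pandas as pd

book_features = pd.DataFrame({
    'book_id': ['A', 'B', 'C', 'D'],
    'author': ['Author1', 'Author2', 'Author1', 'Author3'],
    'category': ['Fiction', 'Non-Fiction', 'Fiction', 'Non-Fiction'],
})

# One-hot encode, asking for floats instead of booleans
encoded = pd.get_dummies(book_features.set_index('book_id'), dtype=float)
print(encoded.dtypes)
```

The encoded values are the same either way; only the column dtype changes.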


### 4. Compute Content-Based Similarity:

In [22]:
# Compute item-item similarity using content-based filtering
item_similarity_cb = cosine_similarity(book_features_encoded)
item_similarity_cb

array([[1. , 0. , 1. , 0. ],
       [0. , 1. , 0. , 0.5],
       [1. , 0. , 1. , 0. ],
       [0. , 0.5, 0. , 1. ]])

In [23]:
item_similarity_cb_df = pd.DataFrame(item_similarity_cb, index=book_features_encoded.index, columns=book_features_encoded.index)
item_similarity_cb_df

| book_id | A | B | C | D |
|---|---|---|---|---|
| A | 1.0 | 0.0 | 1.0 | 0.0 |
| B | 0.0 | 1.0 | 0.0 | 0.5 |
| C | 1.0 | 0.0 | 1.0 | 0.0 |
| D | 0.0 | 0.5 | 0.0 | 1.0 |


### 5. Combine Similarity Scores:

In [25]:
# Combine collaborative filtering and content-based similarity scores
hybrid_similarity = (item_similarity_cf_df + item_similarity_cb_df) / 2
hybrid_similarity

| book_id | A | B | C | D |
|---|---|---|---|---|
| A | 1.0 | 0.25 | 0.75 | 0.0 |
| B | 0.25 | 1.0 | 0.25 | 0.25 |
| C | 0.75 | 0.25 | 1.0 | 0.0 |
| D | 0.0 | 0.25 | 0.0 | 1.0 |
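
The simple average above weights both signals equally. A common generalization is a weighted blend, sketched below under the assumption that a single weight `alpha` is enough. The function name `combine_similarity` and the `alpha` parameter are hypothetical additions, not part of the notebook. With `alpha=0.5` it reproduces the table above.

```python
import pandas as pd

books = ['A', 'B', 'C', 'D']
# Similarity matrices copied from the collaborative and content-based steps
cf = pd.DataFrame([[1.0, 0.5, 0.5, 0.0],
                   [0.5, 1.0, 0.5, 0.0],
                   [0.5, 0.5, 1.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]], index=books, columns=books)
cb = pd.DataFrame([[1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 0.5],
                   [1.0, 0.0, 1.0, 0.0],
                   [0.0, 0.5, 0.0, 1.0]], index=books, columns=books)

def combine_similarity(cf, cb, alpha=0.5):
    """Blend two item-item similarity matrices.

    alpha is the weight on collaborative filtering; (1 - alpha) goes to content.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Align the content matrix to the collaborative one so labels line up
    cb_aligned = cb.reindex(index=cf.index, columns=cf.columns)
    return alpha * cf + (1 - alpha) * cb_aligned

equal_blend = combine_similarity(cf, cb)           # same as (cf + cb) / 2
cf_heavy = combine_similarity(cf, cb, alpha=0.8)   # leans on purchase behaviour
print(cf_heavy.round(2))
```

A lower `alpha` may be useful for books with few interactions, where the content features carry most of the usable signal. The right value would normally be chosen by evaluating on held-out data.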


### 6. Generate Recommendations:

In [26]:
def get_recommendations(book_id, hybrid_similarity, top_n=5):
    # Get the similarity scores for the given book
    similar_books = hybrid_similarity[book_id].sort_values(ascending=False)
    # Exclude the book itself from the recommendations
    similar_books = similar_books.drop(book_id)
    # Return the top N similar books
    return similar_books.head(top_n)

# Example: Get recommendations for book 'A'
recommendations = get_recommendations('A', hybrid_similarity)
recommendations


book_id
C    0.75
B    0.25
D    0.00
Name: A, dtype: float64
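
The function above recommends books similar to a single book. One way to turn that into recommendations for a user is to sum the hybrid similarity over every book the user has already interacted with and then drop those books from the result. The sketch below assumes this simple sum-based scoring. The name `recommend_for_user` is hypothetical, and the matrices are copied from the earlier outputs so the block runs on its own.

```python
import pandas as pd

books = ['A', 'B', 'C', 'D']
hybrid = pd.DataFrame([[1.00, 0.25, 0.75, 0.00],
                       [0.25, 1.00, 0.25, 0.25],
                       [0.75, 0.25, 1.00, 0.00],
                       [0.00, 0.25, 0.00, 1.00]], index=books, columns=books)
user_item = pd.DataFrame([[1, 1, 0, 0],
                          [1, 0, 1, 0],
                          [0, 1, 1, 0],
                          [0, 0, 0, 1]], index=[1, 2, 3, 4], columns=books)

def recommend_for_user(user_id, user_item, hybrid, top_n=5):
    """Score unseen books by their summed similarity to the user's books."""
    row = user_item.loc[user_id]
    owned = row[row > 0].index
    # Add up the similarity rows of every book the user already has
    scores = hybrid.loc[owned].sum(axis=0)
    # Do not recommend books the user already interacted with
    scores = scores.drop(owned)
    return scores.sort_values(ascending=False).head(top_n)

recs = recommend_for_user(1, user_item, hybrid)
print(recs)
```

For user 1, who has books A and B, book C scores 0.75 + 0.25 = 1.0 and book D scores 0.0 + 0.25 = 0.25, so C is ranked first.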