## Collaborative Recommendation System using VAECF (Cornac)

## 1. Objective  

The goal of this notebook is to build a **Collaborative Filtering Recommendation System** using the **Variational Autoencoder for Collaborative Filtering (VAECF)** model provided by the Cornac library.  

Unlike sequential or content-based recommenders, collaborative filtering relies purely on **user–item interaction patterns** (views and purchases). This allows the system to uncover latent user and item representations that drive recommendations.  

## 2. Imports and Setup

In [None]:
%pip install scipy
%pip install cornac
%pip install pandas
%pip install numpy
%pip install tensorflow
%pip install matplotlib
%pip install scikit-learn
%pip install torch

In [None]:
import warnings
warnings.simplefilter("ignore")

In [None]:
import tensorflow as tf
from tensorflow.python.client import device_lib

# List the devices TensorFlow can see and count the available GPUs
print(device_lib.list_local_devices())
print("Number of GPUs present:", len(tf.config.list_physical_devices('GPU')))

In [None]:
from sklearn.preprocessing import MinMaxScaler, LabelEncoder
import pandas as pd
from cornac.models.vaecf import VAECF
from cornac.eval_methods import RatioSplit
from cornac.metrics import Recall, NDCG
from cornac.data import Dataset
import torch


## 3. Data Loading  

In [None]:
data1 = pd.read_csv('/content/sample_data/processed_data.csv')
data2 = pd.read_csv('/content/sample_data/processed_data1.csv')

data = pd.concat([data1, data2], ignore_index=True)

In [None]:
data.head(20)

## 4. Event Weighting  

To capture **interaction strength**, we assign weights:  

- `view` = **1.0**  
- `purchase` = **3.0**  

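The weights are then divided by the maximum weight, so a view becomes about 0.33 and a purchase becomes 1.0. Event types outside the mapping receive a weight of 0. A purchase therefore carries three times the weight of a view.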
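
To make the arithmetic concrete, the short sketch below applies the same mapping and max-normalization to a few toy events in plain Python. The event list and the `refund` type are made up for illustration; unmapped event types fall back to 0.0, mirroring the `fillna(0.0)` step in the notebook's own cell.

```python
# Toy events; "refund" is a hypothetical type that is not in the mapping.
events = ["view", "purchase", "view", "refund"]

event_mapping = {"view": 1.0, "purchase": 3.0}

# Unmapped event types get weight 0.0 (same idea as .fillna(0.0)).
raw_weights = [event_mapping.get(event, 0.0) for event in events]

# Divide by the largest weight so the maximum becomes 1.0.
max_weight = max(raw_weights)
normalized = [round(weight / max_weight, 3) for weight in raw_weights]

print(normalized)  # [0.333, 1.0, 0.333, 0.0]
```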


In [None]:


# Map each event type to its weight; unmapped event types get 0.0
event_mapping = {
    "view": 1.0,
    "purchase": 3.0
}
data["event_weight"] = data["event_type"].map(event_mapping).fillna(0.0)

# Scale so that the largest weight becomes 1.0
data["event_weight"] = data["event_weight"] / data["event_weight"].max()


In [None]:
# Inspect the distinct event types across the whole dataset
data['event_type'].unique()

## 5. Encoding Users and Items  

We convert `user_id` and `product_id` into integer encodings for compatibility with Cornac.
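
The idea behind label encoding can be sketched in plain Python: sort the unique raw IDs and give each one a consecutive integer. The IDs below are invented, and the helper names `fit_label_map` and `inverse_label_map` are illustrative only; they are not part of scikit-learn or Cornac.

```python
def fit_label_map(raw_ids):
    """Map each distinct raw ID to a consecutive integer, in sorted order."""
    classes = sorted(set(raw_ids))
    return {raw: idx for idx, raw in enumerate(classes)}


def inverse_label_map(label_map):
    """Build the reverse lookup from integer code back to raw ID."""
    return {idx: raw for raw, idx in label_map.items()}


raw_user_ids = [555475417, 1005115, 555475417, 520088904]  # invented IDs

user_map = fit_label_map(raw_user_ids)
encoded = [user_map[uid] for uid in raw_user_ids]
print(encoded)  # [2, 0, 2, 1]

# Decoding goes through the inverse map and recovers the raw IDs.
user_map_inv = inverse_label_map(user_map)
decoded = [user_map_inv[code] for code in encoded]
print(decoded == raw_user_ids)  # True
```

Because the model only ever sees the integer codes, any raw ID used at recommendation time has to pass through the same mapping first.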


In [None]:
# Encode the event type as an integer label
event_encode = LabelEncoder()
data['event_type'] = event_encode.fit_transform(data['event_type'])

# Encode raw user and product IDs as consecutive integers
user_encoder = LabelEncoder()
item_encoder = LabelEncoder()
data['user'] = user_encoder.fit_transform(data['user_id'])
data['item'] = item_encoder.fit_transform(data['product_id'])


## 6. Train/Test Split  

We use Cornac’s `RatioSplit` to divide the dataset into **80% training** and **20% test**.  

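- `rating_threshold=0.5` marks interactions with a weight at or above 0.5 as positive feedback for ranking evaluation. With the weights from Section 4, purchases (1.0) count as positives and views (about 0.33) do not.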
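
A rough sketch of what the two settings mean, using toy `(user, item, weight)` tuples and plain Python rather than Cornac itself. The helper `ratio_split` is hypothetical and only mimics the idea of a random 80/20 split; the comparison assumes that a weight at or above the threshold counts as positive.

```python
import random


def ratio_split(interactions, test_size=0.2, seed=42):
    """Shuffle the interactions and hold out a fraction of them for testing."""
    shuffled = interactions[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: 5 users x 4 items; some pairs are "purchases" (1.0), the rest "views" (1/3).
interactions = [
    (user, item, 1.0 if (user + item) % 3 == 0 else 1 / 3)
    for user in range(5)
    for item in range(4)
]

train, test = ratio_split(interactions, test_size=0.2)
print(len(train), len(test))  # 16 4

# Only held-out interactions at or above the threshold are treated as positives.
rating_threshold = 0.5
test_positives = [(user, item) for user, item, weight in test if weight >= rating_threshold]
print(test_positives)
```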


In [None]:
# Build (user, item, weight) triples for Cornac
interactions = list(zip(data['user'], data['item'], data['event_weight']))

# Split train/test
eval_method = RatioSplit(
    data=interactions,
    test_size=0.2,
    rating_threshold=0.5,  # weights >= 0.5 count as positive in ranking evaluation
    exclude_unknowns=True,
    verbose=True
)



## 7. VAECF Model Training  

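We train a **Variational Autoencoder for Collaborative Filtering (VAECF)** model with the following hyperparameters:  

- `k=64` latent factors  
- `autoencoder_structure=[200]` (one hidden layer of 200 units)  
- `n_epochs=10`  
- `batch_size=512`  
- `learning_rate=1e-4`  

Training runs on GPU if available.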


In [None]:
# VAE-CF model
model = VAECF(
    k=64,  # Latent factors
    autoencoder_structure=[200],  # One hidden layer with 200 units
    n_epochs=10,
    batch_size=512,
    learning_rate=1e-4,
    use_gpu=True,  # Set to False to train on CPU
    verbose=True
)

# Train on the training split
model.fit(eval_method.train_set)

# Save the trained model; Cornac writes it under a directory with this name
model.save("collab_recommender.keras")

## 8. Save the Collaborative Recommender System

In [None]:
model.save("Collaborative_Recommender_Model")


## 9. Model Loading

In [None]:

# Backup the original torch.load
original_torch_load = torch.load

# Define a patched version of torch.load that always forces CPU
def torch_load_cpu(*args, **kwargs):
    kwargs['map_location'] = torch.device('cpu')
    kwargs['weights_only'] = False  # Needed for PyTorch 2.6+
    return original_torch_load(*args, **kwargs)

# Patch torch.load
torch.load = torch_load_cpu

# Now load the Cornac model (this uses our patched torch.load)
model = VAECF.load("collab_recommender.keras/VAECF")

# Restore original torch.load after loading
torch.load = original_torch_load
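
One possible hardening of this patch-and-restore pattern is a small context manager that restores the original attribute even if loading raises an exception. The sketch below is generic Python: `patched_attr` is a hypothetical helper, `model.pkl` is a placeholder file name, and a `SimpleNamespace` stands in for the `torch` module so the example runs without PyTorch.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def patched_attr(obj, name, replacement):
    """Temporarily replace obj.<name>, restoring the original on exit."""
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield original
    finally:
        # Runs on normal exit and when an exception propagates.
        setattr(obj, name, original)


# Stand-in for a module exposing a `load` function (not the real torch).
fake_torch = SimpleNamespace(load=lambda path, **kwargs: ("loaded", path, kwargs))
original_load = fake_torch.load


def load_on_cpu(path, **kwargs):
    """Wrapper that forces a CPU map_location before delegating."""
    kwargs["map_location"] = "cpu"
    return original_load(path, **kwargs)


with patched_attr(fake_torch, "load", load_on_cpu):
    result = fake_torch.load("model.pkl")

print(result)                            # ('loaded', 'model.pkl', {'map_location': 'cpu'})
print(fake_torch.load is original_load)  # True
```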



## 10. Inspecting the Data Before Using the Model

In [None]:
data = pd.read_csv("processed_data1.csv")

In [None]:
data.head()

## 11. Generating Recommendations  

We define a function `recommend_for_users()` that:  
- Scores all unseen items for a given user.  
- Returns **Top-N recommendations**.  
- Handles **cold-start users** by falling back to default items.
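
The three steps can be sketched without Cornac, using a plain dictionary of scores. Everything here is illustrative: `top_n_unseen` and `recommend` are hypothetical helpers, and the scores, item names, and fallback list are invented.

```python
def top_n_unseen(scores, seen_items, top_n=10):
    """Return the top_n (item, score) pairs among items the user has not seen."""
    candidates = [(item, score) for item, score in scores.items() if item not in seen_items]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]


def recommend(user, user_scores, user_history, fallback_items, top_n=10):
    """Recommend for a known user, or fall back to default items for an unknown one."""
    if user not in user_scores:
        # Cold-start: no scores exist, so return fallback items without scores.
        return [(item, None) for item in fallback_items[:top_n]]
    return top_n_unseen(user_scores[user], user_history.get(user, set()), top_n)


user_scores = {"u1": {"A": 0.9, "B": 0.4, "C": 0.7, "D": 0.1}}  # invented scores
user_history = {"u1": {"A"}}                                    # u1 already saw item A
fallback = ["D", "A"]                                           # invented default items

known = recommend("u1", user_scores, user_history, fallback, top_n=2)
cold = recommend("u2", user_scores, user_history, fallback, top_n=2)

print(known)  # [('C', 0.7), ('B', 0.4)]
print(cold)   # [('D', None), ('A', None)]
```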


In [None]:
def recommend_for_users(model, eval_method, user_ids, top_n=10, fallback_items=None):
    train_set = eval_method.train_set
    uid_map = train_set.uid_map
    iid_map = train_set.iid_map
    iid_map_inv = {v: k for k, v in iid_map.items()}

    results = {}

    for user_id_str in user_ids:
        print(f"\n User: {user_id_str}")

        if user_id_str not in uid_map:
            print("Cold-start user (not in training data).")

            if fallback_items:
                fallback_recs = fallback_items[:top_n]
                print(f"Fallback recommendations: {fallback_recs}")
                results[user_id_str] = [(item_id, None) for item_id in fallback_recs]
            else:
                print("No fallback available.")
                results[user_id_str] = []
            continue

        user_id = uid_map[user_id_str]

        # Score every item for this user in a single call
        scores = model.score(user_id)

        # Items the user already interacted with in the training set
        user_interactions = train_set.matrix[user_id]
        seen_items = set(user_interactions.indices)

        # Keep unseen items only and take the Top-N by score
        unseen_scores = [(item_id, score) for item_id, score in enumerate(scores) if item_id not in seen_items]
        top_items = sorted(unseen_scores, key=lambda x: x[1], reverse=True)[:top_n]
        rec_items = [(iid_map_inv[item_id], round(float(score), 4)) for item_id, score in top_items]

        for item_id, score in rec_items:
            print(f"🛍️ Item ID: {item_id} | Score: {score}")

        results[user_id_str] = rec_items

    return results








In [None]:
user_list = ['555475417','1005115']

In [None]:
model.device = torch.device('cpu')


## 12. Example Recommendation  

We now test the recommender on a sample user (`555475417`).


In [None]:
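# user_ids must match the keys of train_set.uid_map, i.e. the encoded
# `user` values from Section 5; IDs not found there are treated as cold-start
recommendations = recommend_for_users(
    model=model,
    eval_method=eval_method,
    user_ids=['555475417'],
    top_n=5,
)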


## 13. Overall Conclusion  

- The **collaborative VAECF model** learns latent user and item representations from user–item interaction patterns and uses them to produce Top-N recommendations.  
- Unlike content-based methods, it does not rely on product metadata, but instead learns from **implicit feedback signals** (views, purchases).  
- The system complements the **Sequential** and **Content-Based Recommenders**, creating a robust **hybrid recommendation pipeline**.  
