# <center>Laptop Recommendation System (Content-Based)</center>


**Recommendation systems** are intelligent algorithms used to suggest relevant items to users based on preferences, behavior, or similarity. They are widely used in e-commerce (Amazon), entertainment (Netflix, Spotify), and social platforms (YouTube, Twitter).

In this notebook, we build a **Content-Based Laptop Recommendation System for the Tunisian market**, using a dataset scraped from popular local e-commerce platforms.

### 📚 Recommendation Techniques Overview

| Technique                  | Description                                                                                          | Best Use Case                                                          |
|----------------------------|------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------|
| Content-Based Filtering    | Recommends items similar to those the user liked based on item attributes                           | When item attributes are available but user data is not               |
| Collaborative Filtering    | Recommends based on what similar users liked (user-item matrix)                                      | Requires user-item interactions (ratings, clicks)                     |
| Hybrid Methods             | Combines both collaborative and content-based features                                               | Used in advanced systems like Netflix                                 |
| Deep Learning Approaches   | Uses embeddings and neural networks for complex pattern recognition                                  | For large-scale platforms with rich data and computation resources    |


### ✅ Why Content-Based Filtering?
We choose **Content-Based Filtering** because:

- ✅ We have rich item attributes: brand, processor, GPU, RAM, screen size, etc.
- ❌ We don’t have user interaction data (no clicks or ratings).
- 🎯 We aim to recommend laptops similar to a selected one based on its specs.

This approach lets us recommend similar laptops from item attributes alone, without needing any user history.

### 📏 Evaluation Metrics

We generate recommendations with two methods, **Cosine Similarity** and **KNN**, and compare them using a custom **Overlap Score**, which quantifies how closely the recommended laptops match the input features.

- **Overlap Score:** The percentage of features on which a recommended laptop exactly matches the input laptop, averaged across all recommendations. Higher scores indicate more relevant suggestions.
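
Written as a formula, assuming every feature is weighted equally and a match means exact equality (as in the `evaluate_overlap` function used later in this notebook), the score for an input laptop $x$ with recommendations $r_1, \dots, r_K$ over the feature set $F$ is:

$$
\text{Overlap}(x) = \frac{1}{K} \sum_{k=1}^{K} \frac{1}{|F|} \sum_{f \in F} \mathbf{1}\left[ r_{k,f} = x_f \right]
$$

For example, with 10 features, a recommendation that matches the input on 5 of them contributes 0.5 to the average.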

Other metrics we consider:

| Metric                  | Description                                                                 |
|-------------------------|-----------------------------------------------------------------------------|
| Visual Similarity Check | Manually verify if recommended laptops match specs                         |
| Diversity               | Recommendations shouldn't all be duplicates                                |
| Future Feedback         | In production, user click-through or satisfaction metrics can be collected |

<br><br>

### <center>Ready to explore Content-Based Laptop Recommendation System?<br> Let’s go! 🚀</center>

## 📦 Load Dataset

In [1]:
import pandas as pd
import numpy as np
import category_encoders as ce
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt
import seaborn as sns
import joblib

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

# Set seaborn theme
sns.set(style="whitegrid", palette="muted", font_scale=1.1)
plt.rcParams['figure.figsize'] = (6, 4)

In [2]:
full_data = pd.read_csv('/kaggle/input/tunisia-laptop-market-cleaned-dataset-2025/tunisia_laptop_prices_2025.csv')
full_data.shape

(2969, 15)

In [3]:
full_data = full_data.drop_duplicates(subset=['store','brand', 'screen_size', 'processor', 'ram', 'SSD', 'HDD', 'gpu', 'os', 'gamer', 'price'])
full_data.shape

(2668, 15)

## 🧹 Preprocessing: Features for Recommendation

In [4]:
# Features used to compare laptops
features = ['brand', 'screen_size', 'processor', 'ram', 'SSD', 'HDD', 'gpu', 'os', 'gamer', 'price']
data = full_data[features].copy()

In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2668 entries, 0 to 2968
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   brand        2668 non-null   object 
 1   screen_size  2668 non-null   float64
 2   processor    2668 non-null   object 
 3   ram          2668 non-null   int64  
 4   SSD          2668 non-null   int64  
 5   HDD          2668 non-null   int64  
 6   gpu          2668 non-null   object 
 7   os           2668 non-null   object 
 8   gamer        2668 non-null   int64  
 9   price        2668 non-null   float64
dtypes: float64(2), int64(4), object(4)
memory usage: 229.3+ KB


In [6]:
# Target encoding: replace each categorical value with a smoothed mean of the target (price)
categorical_cols = data.select_dtypes(include='object').columns.tolist()
encoder = ce.TargetEncoder(cols=categorical_cols)
data_encoded = encoder.fit_transform(data, data['price'])

In [7]:
# === Feature matrix ===
feature_matrix = data_encoded.values
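
To illustrate what target encoding does to a categorical column, here is a minimal sketch on made-up data that replaces each brand with the mean price of that brand. It is a simplification: `category_encoders.TargetEncoder` additionally smooths each category mean toward the global mean, so its values will differ from these. The `toy` frame and the `brand_encoded` column are hypothetical names used only for this example.

```python
import pandas as pd

# Toy data (hypothetical values, not taken from the real dataset)
toy = pd.DataFrame({
    "brand": ["Hp", "Hp", "Asus", "Asus", "Dell"],
    "price": [1500.0, 1700.0, 1300.0, 1900.0, 1200.0],
})

# Plain (unsmoothed) target encoding: map each category to its mean price
brand_means = toy.groupby("brand")["price"].mean()
toy["brand_encoded"] = toy["brand"].map(brand_means)

print(toy)
```

After encoding, every column is numeric, which is what allows the whole row to be used as a vector in the similarity computations that follow.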

## 🔹 Cosine Similarity Recommendation

In [8]:
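def recommend_cosine(input_row, top_n=6):
    # Encode the query with the same fitted encoder used for the catalogue
    input_encoded = encoder.transform(input_row)
    similarities = cosine_similarity(input_encoded, feature_matrix)[0]
    # Sort by descending similarity; position 0 is skipped on the assumption that
    # the query laptop is itself in the dataset (otherwise the best match is dropped)
    top_indices = similarities.argsort()[::-1][1:top_n+1]
    # feature_matrix rows align positionally with full_data, so iloc is safe here
    return full_data.iloc[top_indices].reset_index(drop=True)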
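
One thing worth keeping in mind is that the encoded feature matrix mixes columns on very different scales (prices in the thousands, `gamer` as 0 or 1), and cosine similarity on raw values is dominated by the largest columns. The sketch below, on a made-up three-column matrix, compares raw cosine similarity with cosine similarity after standardizing each column. Standardizing is a suggested variation, not what this notebook does, and all values and column meanings here are hypothetical.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import StandardScaler

# Toy feature matrix (hypothetical): columns are [ram_gb, gamer_flag, price]
X = np.array([
    [8.0, 0.0, 1500.0],
    [32.0, 1.0, 1550.0],
    [8.0, 0.0, 2500.0],
])
query = np.array([[8.0, 0.0, 1600.0]])

# Raw cosine similarity: the price column dominates every vector,
# so all rows look almost identical to the query
raw_sim = cosine_similarity(query, X)[0]

# Standardize each column (zero mean, unit variance), then compare again
scaler = StandardScaler().fit(X)
scaled_sim = cosine_similarity(scaler.transform(query), scaler.transform(X))[0]

print("raw   :", raw_sim.round(5))
print("scaled:", scaled_sim.round(5))
```

On the raw values all three similarities are nearly indistinguishable, while the standardized version separates the 32 GB gaming laptop from the query clearly.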

In [9]:
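def evaluate_overlap(input_row, recommended_df):
    """Mean fraction of `features` on which each recommendation exactly equals the input."""
    query = input_row.iloc[0]
    matches = []

    for _, row in recommended_df.iterrows():
        # Count the features whose value is identical to the query's value
        match_count = sum(row[col] == query[col] for col in features)
        matches.append(match_count / len(features))
    return np.mean(matches)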
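
As a small worked example of the same idea, the sketch below computes the overlap score on made-up data with four features. The function name `overlap_score` and the toy rows are hypothetical; it takes the column list as an argument instead of reading the global `features`, but otherwise follows the same exact-equality rule.

```python
import pandas as pd

def overlap_score(input_row: pd.DataFrame, recommended_df: pd.DataFrame, cols: list) -> float:
    """Mean fraction of `cols` on which each recommendation exactly equals the input."""
    query = input_row.iloc[0]
    # One boolean column per feature: True where the recommendation equals the query
    matches = pd.DataFrame({col: recommended_df[col] == query[col] for col in cols})
    # Row mean = share of matching features; overall mean = score across recommendations
    return float(matches.mean(axis=1).mean())

cols = ["brand", "ram", "SSD", "gamer"]
query_row = pd.DataFrame([{"brand": "Hp", "ram": 8, "SSD": 512, "gamer": 1}])
recs = pd.DataFrame([
    {"brand": "Hp", "ram": 8, "SSD": 512, "gamer": 0},    # 3 of 4 features match
    {"brand": "Asus", "ram": 8, "SSD": 256, "gamer": 0},  # 1 of 4 features match
])

score = overlap_score(query_row, recs, cols)
print(f"Overlap: {score:.2f}")
```

Because the comparison is exact equality, a continuous column such as `price` will rarely count as a match, which limits how high the score can get in practice.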

In [10]:
# Input row to test
input_row = pd.DataFrame([{
    'brand': 'Hp',
    'screen_size': 15.6,
    'processor': 'AMD Ryzen 5',
    'ram': 8,
    'SSD': 512,
    'HDD': 0,
    'gpu': 'AMD',
    'os': 'Windows',
    'gamer': 1,
    'price': 1580.0
}])

In [11]:
cosine_recommendations = recommend_cosine(input_row, top_n=6)
cosine_score = evaluate_overlap(input_row, cosine_recommendations)

print("🔎 Cosine Recommendations:")
display(cosine_recommendations[['brand', 'name', 'price', 'store']])

print(f"✅ Cosine Similarity Overlap Score: {cosine_score * 100:.2f}%")


🔎 Cosine Recommendations:


|   | brand | name | price | store |
|---|-------|------|-------|-------|
| 0 | Asus | PC PORTABLE ASUS M3502QA RAYZEN 5 8GO 512GO SS... | 1790.0 | agora |
| 1 | Asus | Pc Portable Asus Vivobook Go 15 E1504FA AMD Ry... | 1329.0 | SpaceNet |
| 2 | Asus | PC Portable ASUS Vivobook 15 X1504VA Intel Cor... | 1709.0 | Mytek |
| 3 | Asus | Pc Portable ASUS Vivobook 16 X1605VA / i5-1342... | 1719.0 | tunisianet |
| 4 | Asus | PC Portable ASUS Vivobook 16 X1605VA i5 13è Gé... | 1719.0 | Mytek |
| 5 | Asus | PC PORTABLE ASUS VIVOVOOK 15 X1502VA I5-13500H... | 1689.0 | agora |


✅ Cosine Similarity Overlap Score: 50.00%


## 🔹 KNN-Based Recommendation

In [12]:
def recommend_knn(input_row, top_n=5):
    input_encoded = encoder.transform(input_row).values
    # Ask for one extra neighbor because the nearest one is discarded below
    knn = NearestNeighbors(n_neighbors=top_n + 1, metric='euclidean')
    knn.fit(feature_matrix)
    distances, indices = knn.kneighbors(input_encoded)
    # Skip the closest row, assuming it is the query laptop itself
    return full_data.iloc[indices[0][1:]].reset_index(drop=True)
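
Both recommenders drop the single closest row on the assumption that it is the query laptop itself. When the query is a hand-built configuration that is not in the dataset, that row is a genuine best match. The sketch below shows one possible alternative on toy data: drop a neighbor only when its distance is zero. The helper name `nearest_excluding_exact` is hypothetical, and note that it may return fewer than `top_n` rows if the catalogue contains several exact duplicates of the query.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nearest_excluding_exact(query, matrix, top_n=3):
    """Indices of the nearest rows, skipping rows identical to the query."""
    k = min(top_n + 1, len(matrix))
    knn = NearestNeighbors(n_neighbors=k, metric="euclidean").fit(matrix)
    distances, indices = knn.kneighbors(query)
    # A zero distance means the query itself (or an exact duplicate) is in the matrix
    keep = ~np.isclose(distances[0], 0.0)
    return indices[0][keep][:top_n]

# Toy 2-D catalogue (hypothetical values)
catalogue = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])

in_catalogue = nearest_excluding_exact(np.array([[0.0, 0.0]]), catalogue, top_n=2)
not_in_catalogue = nearest_excluding_exact(np.array([[0.9, 0.0]]), catalogue, top_n=2)

print(in_catalogue, not_in_catalogue)
```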

In [13]:
knn_recommendations = recommend_knn(input_row, top_n=6)
knn_score = evaluate_overlap(input_row, knn_recommendations)

print("🔎 KNN Recommendations:")
display(knn_recommendations[['brand', 'name', 'price', 'store']])
print(f"✅ KNN Overlap Score: {knn_score * 100:.2f}%")

🔎 KNN Recommendations:


|   | brand | name | price | store |
|---|-------|------|-------|-------|
| 0 | Asus | PC PORTABLE ASUS M3502QA RAYZEN 5 8GO 512GO SS... | 1790.0 | agora |
| 1 | Asus | Pc Portable Asus Vivobook Go 15 E1504FA AMD Ry... | 1329.0 | SpaceNet |
| 2 | Hp | Pc Portable HP 15 RYZEN 5 \| 8GO - 512SSD - SILVER | 1339.0 | graiet |
| 3 | Dell | Pc Portable DELL Inspiron 3535 / Ryzen 5 7520U... | 1229.0 | tunisianet |
| 4 | Dell | Pc Portable DELL Inspiron 3535 / Ryzen 5 7520U... | 1219.0 | tunisianet |
| 5 | Asus | Pc Portable Asus Vivobook 15 M1502YA Ryzen 7 8... | 1625.0 | SpaceNet |


✅ KNN Overlap Score: 65.00%


## 🏆 Final Selection & Save Model


In [14]:
best_method = 'Cosine Similarity' if cosine_score >= knn_score else 'KNN'

print(f"\n🏆 Best Recommendation Method: {best_method}")


🏆 Best Recommendation Method: KNN


## 🔁 Fit & Save KNN model

In [15]:
knn_model = NearestNeighbors(metric='euclidean')
knn_model.fit(feature_matrix)

np.save("feature_matrix.npy", feature_matrix)
joblib.dump(knn_model, "knn_recommendation_model.joblib")

['knn_recommendation_model.joblib']
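
For completeness, here is a minimal sketch of how a saved model of this kind could be reloaded and queried later. It uses a toy matrix and a temporary directory so that it runs on its own; the values are hypothetical. Since the saved model was fit with default settings, the number of neighbors is passed at query time. In a real deployment the fitted `encoder` would presumably need to be saved as well, because new inputs must be encoded the same way before querying.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy stand-in for the encoded feature matrix (hypothetical values)
toy_matrix = np.array([
    [8.0, 512.0], [16.0, 512.0], [8.0, 256.0],
    [32.0, 1024.0], [16.0, 256.0], [4.0, 128.0],
])

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "knn_recommendation_model.joblib")
    joblib.dump(NearestNeighbors(metric="euclidean").fit(toy_matrix), model_path)

    # Later, for example in a serving script: reload the fitted model
    loaded = joblib.load(model_path)

# Query the reloaded model, choosing the number of neighbors at call time
distances, indices = loaded.kneighbors(np.array([[8.0, 500.0]]), n_neighbors=2)
print(indices[0], distances[0].round(2))
```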

## 🎯 Conclusion

In this notebook, we built a **Content-Based Laptop Recommendation System** for the Tunisian market from laptop attributes alone, since no user interaction data is available. We compared two methods, **Cosine Similarity** and **K-Nearest Neighbors (KNN)**. On our single test input, **KNN** reached the higher overlap score (65% versus 50%) and was selected as the recommendation method.



### What’s Next?

- 📊 Check out my **[Laptop Price Prediction Notebook](https://www.kaggle.com/code/dhaouadiibtihel98/tunisian-laptop-price-prediction-with-xgboost)** — a complementary project that predicts laptop prices using machine learning models.  
- Explore the full project workflow, from data scraping and cleaning to model training and deployment, on my **[GitHub repository](https://github.com/ibtihel-dhaouadi/laptop-price-prediction-tn)**.


### <center>🙏 Your Support Matters!</center>

**<center>If you found this notebook useful or interesting, please upvote and leave your comments below</center>**
**<center>Your feedback motivates me and helps improve future projects! 😊💻</center>**
