# AI-Powered Product Recommendation System for E-commerce

**Module E: AI Applications – Individual Open Project**

---

## 1. Problem Definition & Objective

### Problem Statement
Modern e-commerce platforms list so many products that customers struggle to find items matching their preferences. Without personalized recommendations, users may feel overwhelmed, which can reduce engagement and sales.

### Objective
The objective of this project is to develop an AI-powered product recommendation system that suggests relevant products to users based on product similarity. The system aims to enhance user experience and support better decision-making in e-commerce platforms.


In [19]:
import pandas as pd

# Creating a synthetic e-commerce dataset
data = {
    "product_name": [
        "iPhone 14", "Samsung Galaxy S23", "MacBook Air M2", "HP Pavilion Laptop",
        "Nike Running Shoes", "Adidas Sneakers", "Levi's Denim Jacket", "Zara Summer Dress",
        "Wooden Dining Table", "Office Chair", "LED Study Lamp", "Wall Clock"
    ],
    "category": [
        "Electronics", "Electronics", "Electronics", "Electronics",
        "Fashion", "Fashion", "Fashion", "Fashion",
        "Home & Furniture", "Home & Furniture", "Home & Furniture", "Home & Furniture"
    ],
    "description": [
        "Apple smartphone with advanced camera and iOS",
        "Android smartphone with high performance processor",
        "Lightweight laptop with Apple M2 chip",
        "Affordable laptop suitable for office and study work",
        "Comfortable running shoes for daily workouts",
        "Stylish sneakers for casual wear",
        "Classic denim jacket for all seasons",
        "Lightweight dress perfect for summer outings",
        "Solid wooden dining table for family use",
        "Ergonomic office chair with back support",
        "Energy efficient LED lamp for studying",
        "Decorative wall clock for home interiors"
    ]
}

df = pd.DataFrame(data)
df


| | product_name | category | description |
|---|---|---|---|
| 0 | iPhone 14 | Electronics | Apple smartphone with advanced camera and iOS |
| 1 | Samsung Galaxy S23 | Electronics | Android smartphone with high performance proce... |
| 2 | MacBook Air M2 | Electronics | Lightweight laptop with Apple M2 chip |
| 3 | HP Pavilion Laptop | Electronics | Affordable laptop suitable for office and stud... |
| 4 | Nike Running Shoes | Fashion | Comfortable running shoes for daily workouts |
| 5 | Adidas Sneakers | Fashion | Stylish sneakers for casual wear |
| 6 | Levi's Denim Jacket | Fashion | Classic denim jacket for all seasons |
| 7 | Zara Summer Dress | Fashion | Lightweight dress perfect for summer outings |
| 8 | Wooden Dining Table | Home & Furniture | Solid wooden dining table for family use |
| 9 | Office Chair | Home & Furniture | Ergonomic office chair with back support |


## 2. Data Understanding & Preparation

### Dataset Source
The dataset used in this project is a **synthetic e-commerce product dataset** created specifically for academic purposes. It contains product names, categories, and textual descriptions, which are used to compute similarity between products.

### Data Exploration
Basic exploration is performed to understand the structure and quality of the dataset.


In [20]:
# Display basic information about the dataset
df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   product_name  12 non-null     object
 1   category      12 non-null     object
 2   description   12 non-null     object
dtypes: object(3)
memory usage: 420.0+ bytes


In [21]:
# Check for missing values
df.isnull().sum()


| | 0 |
|---|---|
| product_name | 0 |
| category | 0 |
| description | 0 |


In [22]:
# Check for duplicate records
df.duplicated().sum()


np.int64(0)

## 3. Model / System Design

### AI Technique Used
This project uses a **content-based recommendation system**. Each product's category and description are converted to a TF-IDF vector, and products are recommended by the cosine similarity between these vectors.

### System Workflow
Product Features → Text Vectorization (TF-IDF) → Similarity Computation → Product Recommendations

### Justification
Content-based filtering is suitable for this project as it does not require user interaction history and works well for small-to-medium datasets. It is also explainable and easy to interpret.
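
For reference, the two numeric steps of the workflow can be written out. This is a sketch that assumes scikit-learn's default `TfidfVectorizer` settings (smoothed IDF, L2 normalization); here $n$ is the number of products, $\mathrm{df}(t)$ is the number of product texts containing term $t$, and $\mathbf{w}_{d}$ is the vector of weights $w_{t,d}$ for product $d$.

$$
\mathrm{idf}(t) = \ln\frac{1 + n}{1 + \mathrm{df}(t)} + 1,
\qquad
w_{t,d} = \mathrm{tf}(t, d)\cdot \mathrm{idf}(t),
\qquad
\mathrm{sim}(d_i, d_j) = \frac{\mathbf{w}_{d_i} \cdot \mathbf{w}_{d_j}}{\lVert \mathbf{w}_{d_i} \rVert_2 \, \lVert \mathbf{w}_{d_j} \rVert_2}
$$

A term that appears in fewer product texts gets a larger idf, so a rare shared word contributes more to the similarity than a common one.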


In [23]:
# Combine category and description into a single feature
df['combined_features'] = df['category'] + " " + df['description']

# Display updated dataset
df[['product_name', 'combined_features']].head()


| | product_name | combined_features |
|---|---|---|
| 0 | iPhone 14 | Electronics Apple smartphone with advanced cam... |
| 1 | Samsung Galaxy S23 | Electronics Android smartphone with high perfo... |
| 2 | MacBook Air M2 | Electronics Lightweight laptop with Apple M2 chip |
| 3 | HP Pavilion Laptop | Electronics Affordable laptop suitable for off... |
| 4 | Nike Running Shoes | Fashion Comfortable running shoes for daily wo... |


In [24]:
from sklearn.feature_extraction.text import TfidfVectorizer

# Convert text data into numerical vectors
vectorizer = TfidfVectorizer(stop_words='english')
tfidf_matrix = vectorizer.fit_transform(df['combined_features'])

tfidf_matrix.shape


(12, 57)

In [25]:
from sklearn.metrics.pairwise import cosine_similarity

# Compute cosine similarity between products
similarity_matrix = cosine_similarity(tfidf_matrix)

similarity_matrix.shape


(12, 12)

In [26]:
def recommend_products(product_name, top_n=5):
    """Return the top_n products most similar to product_name."""
    matches = df.index[df['product_name'] == product_name]
    if len(matches) == 0:
        return "Product not found"

    index = matches[0]
    # Rank all products by cosine similarity to the query, highest first
    similarity_scores = sorted(enumerate(similarity_matrix[index]), key=lambda x: x[1], reverse=True)

    # Exclude the query product by index rather than assuming it sorts first
    recommended_indices = [i for i, _ in similarity_scores if i != index][:top_n]
    return df['product_name'].iloc[recommended_indices]


In [27]:
recommend_products("iPhone 14")


| | product_name |
|---|---|
| 2 | MacBook Air M2 |
| 1 | Samsung Galaxy S23 |
| 3 | HP Pavilion Laptop |
| 4 | Nike Running Shoes |
| 5 | Adidas Sneakers |


## 4. Evaluation & Analysis

### Evaluation Method
The recommendation system is evaluated qualitatively by examining the relevance of recommended products based on cosine similarity scores.

### Sample Output
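For the query "iPhone 14", the top three recommendations (MacBook Air M2, Samsung Galaxy S23, HP Pavilion Laptop) are all Electronics products that share category or description terms with the query, which indicates that content-based matching works as intended. The last two (Nike Running Shoes, Adidas Sneakers) are Fashion items that share no terms with the query; they appear only because `top_n=5` exceeds the three other Electronics products in the dataset.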
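
The qualitative check can be complemented by a rough numeric one. The sketch below rebuilds the 12-product dataset and computes, for each product, the share of its top-k neighbours that fall in the same category. The function name `category_precision_at_k` and the use of category match as a proxy for relevance are assumptions made for illustration, not part of the original evaluation.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# The same 12 products as the dataset, as (product_name, category, description)
rows = [
    ("iPhone 14", "Electronics", "Apple smartphone with advanced camera and iOS"),
    ("Samsung Galaxy S23", "Electronics", "Android smartphone with high performance processor"),
    ("MacBook Air M2", "Electronics", "Lightweight laptop with Apple M2 chip"),
    ("HP Pavilion Laptop", "Electronics", "Affordable laptop suitable for office and study work"),
    ("Nike Running Shoes", "Fashion", "Comfortable running shoes for daily workouts"),
    ("Adidas Sneakers", "Fashion", "Stylish sneakers for casual wear"),
    ("Levi's Denim Jacket", "Fashion", "Classic denim jacket for all seasons"),
    ("Zara Summer Dress", "Fashion", "Lightweight dress perfect for summer outings"),
    ("Wooden Dining Table", "Home & Furniture", "Solid wooden dining table for family use"),
    ("Office Chair", "Home & Furniture", "Ergonomic office chair with back support"),
    ("LED Study Lamp", "Home & Furniture", "Energy efficient LED lamp for studying"),
    ("Wall Clock", "Home & Furniture", "Decorative wall clock for home interiors"),
]
products = pd.DataFrame(rows, columns=["product_name", "category", "description"])
features = products["category"] + " " + products["description"]
similarity = cosine_similarity(
    TfidfVectorizer(stop_words="english").fit_transform(features)
)

def category_precision_at_k(k=3):
    """Mean fraction of each product's top-k neighbours sharing its category."""
    categories = products["category"].to_numpy()
    per_product = []
    for i in range(len(products)):
        ranked = np.argsort(-similarity[i], kind="stable")  # highest score first
        top_k = ranked[ranked != i][:k]                      # drop the query itself
        per_product.append(np.mean(categories[top_k] == categories[i]))
    return float(np.mean(per_product))

print(f"category precision@3: {category_precision_at_k(3):.3f}")
```

Each product has exactly three same-category neighbours, so `k=3` is the largest value for which a perfect score is possible. A result below 1.0 would mean that some cross-category product (for example, one sharing a word such as "lightweight" or "office") outranked a same-category one.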

### Limitations
- The system does not use user behavior or purchase history  
- Recommendations depend heavily on textual descriptions  
- Scalability is limited for very large catalogs, because the full pairwise similarity matrix grows quadratically with the number of products
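
On the scalability point, one possible mitigation is to query neighbours per product instead of storing every pairwise score. The sketch below uses scikit-learn's `NearestNeighbors` with cosine distance on the sparse TF-IDF matrix. It is a minimal illustration on six products from the dataset, not the project's implementation, and the names `catalog`, `knn`, and `recommend_sparse` are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Six products from the dataset (category + description)
catalog = {
    "iPhone 14": "Electronics Apple smartphone with advanced camera and iOS",
    "Samsung Galaxy S23": "Electronics Android smartphone with high performance processor",
    "MacBook Air M2": "Electronics Lightweight laptop with Apple M2 chip",
    "Nike Running Shoes": "Fashion Comfortable running shoes for daily workouts",
    "Adidas Sneakers": "Fashion Stylish sneakers for casual wear",
    "Office Chair": "Home & Furniture Ergonomic office chair with back support",
}
names = list(catalog)
tfidf = TfidfVectorizer(stop_words="english").fit_transform(list(catalog.values()))

# Brute-force cosine search on the sparse matrix; neighbours are computed
# per query, so the full n x n similarity matrix is never stored.
knn = NearestNeighbors(metric="cosine", algorithm="brute").fit(tfidf)

def recommend_sparse(product_name, top_n=2):
    """Return up to top_n nearest products by cosine distance."""
    i = names.index(product_name)
    # Ask for one extra neighbour because the query itself is normally returned too
    _, neighbours = knn.kneighbors(tfidf[i:i + 1], n_neighbors=top_n + 1)
    return [names[j] for j in neighbours[0] if j != i][:top_n]

print(recommend_sparse("iPhone 14"))
```

Each query still scans every product, so this reduces memory rather than per-query time; very large catalogs would likely need an approximate nearest-neighbour index instead.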


## 5. Ethical Considerations & Responsible AI

- The dataset used in this project is synthetic and does not contain personal or sensitive user data  
- Bias in product descriptions may affect recommendation diversity  
- The system ensures user privacy by avoiding personal data collection  
- The recommendations are intended for responsible and transparent use


## 6. Conclusion & Future Scope

### Conclusion
This project demonstrates the implementation of an AI-powered content-based recommendation system for e-commerce platforms. On the 12-product synthetic dataset, the system suggests relevant products by ranking them on TF-IDF cosine similarity.

### Future Scope
- Integration of user interaction and purchase history  
- Use of collaborative or hybrid recommendation techniques  
- Deployment as a web-based application  
- Real-time recommendation updates using streaming data
