PersonalizeX – Product Recommendation System

Live demo: https://personalizex-ritwik.streamlit.app

PersonalizeX is a small recommendation system that suggests products to a user based on what they and other users have interacted with. It uses collaborative filtering with cosine similarity, and there's a little Streamlit app to click around in.

I built it mainly to understand how the "recommended for you" sections on shopping sites actually work, instead of importing a library and trusting whatever it returns. The dataset is small and made up on purpose, so I could read the whole thing and check the recommendations by hand.

What it does

Loads the product catalog and the user–item interaction data
Builds a user–item matrix from the interactions
Computes cosine similarity between users and between products
Recommends 5 products for a selected user (from similar users), skipping anything they've already interacted with
Shows products similar to a chosen product
Has a small script that checks recommendation quality with a leave-one-out hit rate

Tech stack

Python, pandas, scikit-learn (for cosine similarity), and Streamlit for the UI.

The data

Both files live in data/, and they're synthetic — I wrote them by hand to keep things small.

products.csv — 24 products across 6 categories:

Column	Meaning
product_id	unique id like P001
product_name	name of the product
category	Electronics, Books, Home, Sports, Fashion, Beauty
price	price in dollars
rating	average rating out of 5

user_interactions.csv — 10 users, about 5 interactions each:

Column	Meaning
user_id	unique id like U1
product_id	the product they interacted with
interaction_score	how strong the interaction was (below)

The interaction score is a rough strength signal: 1 = viewed, 2 = added to cart, 3 = purchased, 4 = liked / reviewed highly. So a 4 counts for more than a 1 when measuring how similar two users are.

How it works

Everything is built around one table — a user–item matrix where rows are users, columns are products, and each cell is the interaction score (0 if the user never touched that product).
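As a rough illustration of that table, here is a minimal pandas sketch. The sample rows are invented, and aggfunc="max" (keep the strongest signal if a user–product pair appears more than once) is an assumption, not necessarily what recommender.py does.

```python
import pandas as pd

# Hypothetical rows in the shape of user_interactions.csv (values invented for illustration).
interactions = pd.DataFrame(
    {
        "user_id": ["U1", "U1", "U2", "U2", "U3"],
        "product_id": ["P001", "P002", "P002", "P003", "P001"],
        "interaction_score": [3, 1, 4, 2, 2],
    }
)

# Rows are users, columns are products; pairs with no interaction become 0.
# aggfunc="max" is an assumption: it keeps the strongest score for a repeated pair.
user_item = interactions.pivot_table(
    index="user_id",
    columns="product_id",
    values="interaction_score",
    aggfunc="max",
    fill_value=0,
)

print(user_item)
```

With the real file, the DataFrame would presumably come from pd.read_csv("data/user_interactions.csv") instead of being typed inline.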

Once that table exists, cosine similarity compares two rows (users) or two columns (products) by the angle between them. Two users come out "similar" if they interacted with similar products in similar ways. I like that it looks at the pattern rather than how much total activity a user has.
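To make the "angle, not magnitude" point concrete: cosine similarity between two vectors u and v is their dot product divided by the product of their lengths. The sketch below uses scikit-learn's cosine_similarity on a small made-up matrix. The numbers are illustrative only and are not taken from the project's data.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical user-item rows (3 users x 4 products), invented for illustration.
user_item = np.array(
    [
        [4, 2, 0, 0],  # heavy user
        [2, 1, 0, 0],  # light user with the same pattern at half the strength
        [0, 0, 3, 1],  # user with no products in common with the other two
    ]
)

user_sim = cosine_similarity(user_item)    # shape (3, 3): user vs user
item_sim = cosine_similarity(user_item.T)  # shape (4, 4): product vs product

print(np.round(user_sim, 2))
```

The heavy and light users point in the same direction, so their similarity is 1 even though their totals differ, while the third user shares nothing with them and scores 0.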

User-based recommendations: for the selected user, I take every other user's row, weight it by how similar that user is, and add them up. Products that similar users liked a lot float to the top. I divide by the total similarity (otherwise products that lots of people touched always win), drop anything the user has already interacted with, and return the top 5.
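The steps above could be sketched roughly as follows. The function name recommend_for_user and the sample matrix are hypothetical. The sketch also makes one assumption about "divide by the total similarity": it normalises each product by the similarity of the users who actually interacted with it, since dividing every product by one constant would not change the ranking. The real recommender.py may do this differently.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def recommend_for_user(user_item, user_idx, top_n=5):
    """Return column indices of up to top_n unseen products, best first."""
    sims = cosine_similarity(user_item)[user_idx].copy()
    sims[user_idx] = 0.0  # the user should not vote for themselves

    # Similarity-weighted sum of every other user's scores, per product.
    weighted = sims @ user_item

    # Assumption: normalise per product by the similarity of users who touched it.
    touched = (user_item > 0).astype(float)
    total_sim = sims @ touched
    scores = np.divide(
        weighted,
        total_sim,
        out=np.zeros_like(weighted, dtype=float),
        where=total_sim > 0,
    )

    # Drop products the user has already interacted with.
    scores[user_item[user_idx] > 0] = -np.inf

    ranked = np.argsort(-scores)
    return [int(i) for i in ranked if scores[i] > 0][:top_n]


# Hypothetical 4 users x 5 products matrix, invented for illustration.
user_item = np.array(
    [
        [4, 3, 0, 0, 0],  # target user
        [4, 2, 3, 0, 0],
        [0, 0, 1, 4, 0],
        [3, 0, 0, 0, 2],
    ]
)

print(recommend_for_user(user_item, 0))
```

Products that only dissimilar users touched end up with a score of 0 and are left out, so the list can be shorter than top_n on a small matrix.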

Product-based recommendations: same idea, but I transpose the matrix so products become the rows. Two products are similar if the same kinds of users interacted with both — so this is based on behavior, not on the category or description.
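A small sketch of the product-to-product version, assuming the matrix is a pandas DataFrame with product ids as column labels. The helper name similar_products and the sample values are hypothetical.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical user-item matrix (values invented for illustration).
user_item = pd.DataFrame(
    [[4, 3, 0, 0], [3, 4, 0, 1], [0, 0, 4, 2], [0, 1, 3, 0]],
    index=["U1", "U2", "U3", "U4"],
    columns=["P001", "P002", "P003", "P004"],
)

# Transpose so each product is a row, then compare products by who interacted with them.
item_sim = pd.DataFrame(
    cosine_similarity(user_item.T),
    index=user_item.columns,
    columns=user_item.columns,
)


def similar_products(product_id, top_n=3):
    """Return the top_n most similar product ids, excluding the product itself."""
    return item_sim[product_id].drop(product_id).nlargest(top_n).index.tolist()


print(similar_products("P001"))
```

Nothing here looks at category or price: P001 and P002 come out close only because U1 and U2 both scored them highly.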

Checking if it's any good

Evaluating a recommender was the part I understood least going in. evaluate.py does a simple leave-one-out check: for each user it hides the product with their highest interaction score, rebuilds the model without it, and sees whether that product comes back in the top 5. The hit rate is the number of hits divided by the number of users.

On this dataset it lands around 0.9, but I wouldn't read too much into that — the dataset is tiny and the hidden item is each user's strongest signal, so it's really a "is the logic doing something sensible" check, not a real score.
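The leave-one-out check described above might look something like this. It is a sketch under assumptions, not the contents of evaluate.py: top_k is a simplified stand-in recommender (no normalisation, to keep it short), ties for the highest score are broken by taking the first product, and the matrix is invented.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def top_k(user_item, user_idx, k=5):
    """Stand-in recommender: similarity-weighted scores over unseen products."""
    sims = cosine_similarity(user_item)[user_idx].copy()
    sims[user_idx] = 0.0
    scores = sims @ user_item
    scores[user_item[user_idx] > 0] = -np.inf  # never recommend seen products
    return np.argsort(-scores)[:k]


def leave_one_out_hit_rate(user_item, k=5):
    """Hide each user's strongest interaction and count how often it returns in the top k."""
    hits = 0
    n_users = user_item.shape[0]
    for u in range(n_users):
        held_out = int(np.argmax(user_item[u]))  # first product with the highest score
        train = user_item.copy()
        train[u, held_out] = 0  # hide it before similarities are computed
        if held_out in top_k(train, u, k):
            hits += 1
    return hits / n_users


# Hypothetical 5 users x 6 products matrix, invented for illustration.
user_item = np.array(
    [
        [4, 3, 0, 0, 0, 0],
        [4, 3, 2, 0, 0, 0],
        [0, 0, 0, 4, 3, 0],
        [0, 0, 0, 4, 3, 1],
        [0, 2, 0, 0, 0, 4],
    ]
)

print(leave_one_out_hit_rate(user_item, k=2))
```

The last user's strongest product is one that no similar user cares about, so it is not recovered. That is the kind of miss this metric is meant to surface.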

Running it

git clone https://github.com/<your-username>/personalizex.git
cd personalizex
pip install -r requirements.txt
streamlit run app.py

The app opens at http://localhost:8501. To see the logic without the UI:

python recommender.py   # prints sample recommendations
python evaluate.py      # prints the hit rate

If you'd rather use an isolated environment first: python -m venv venv, then source venv/bin/activate (Windows: venv\Scripts\activate).

Screenshots

These live in screenshots/ (there's a short note in that folder on how to capture them).

What I learned

A user–item matrix is just a table, but turning the interaction rows into it (and deciding the blanks should be 0) is what makes the math possible.
I assumed cosine similarity cared about magnitude — it doesn't, it's the angle. That clicked once I saw a very active user and a light user come out as "similar."
User-based and item-based filtering can disagree, which was surprising the first time.
Dividing by the total similarity actually mattered — before I added it, a couple of popular products showed up in almost every recommendation.
Evaluating recommenders is harder than building them. A hit rate on a small handmade dataset is reassuring but not really a metric.

Limitations

It's a learning project, so there's a lot it doesn't do:

The dataset is small and synthetic, so results are illustrative, not realistic.
It only uses interaction patterns — no product text, images, or descriptions.
Nothing is real-time; it's all computed from static CSV files.
No deep learning or matrix factorization.
It can't handle cold-start users or products (no interactions yet) — that needs a separate approach.
Scores aren't mean-centered per user, so cosine similarity on the raw scores doesn't correct for users who score everything high or low.

If I kept working on it

A content-based fallback (category/price) for cold-start cases
Try matrix factorization (SVD) and compare it against this
Swap in a larger public retail dataset
Add precision@k and recall@k alongside the hit rate
Show why each product was recommended

This is an educational project — the data is synthetic and the goal was to understand a basic recommender, not to build a production system.
