## 🏷️ Data Scenario: Auto-Tag and Categorize E-commerce Products

At **ShopSphere**, product categorization is vital.

🛒 Each week, thousands of new items are uploaded. Manual categorization is slow and leads to mistakes.

Your task is to build an **AI solution** that automatically assigns each product to a category using semantic similarity between its description and the category names.

In [None]:
import numpy as np
import pandas as pd
from langchain_openai import OpenAIEmbeddings
from sklearn.metrics.pairwise import cosine_similarity

# Candidate categories; each product is assigned to exactly one of these
possible_categories = ['Electronics', 'Clothing', 'Home Decor', 'Sports', 'Toys']

Let's load the dataset and take a look at it. The later steps assume it has a `description` column.

In [None]:
products = pd.read_csv('products.csv')
products.head()

⚙️ First, we create the embedding model and embed the category names. Product descriptions are embedded in the next step.

In [None]:
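# Reads the API key from the OPENAI_API_KEY environment variable
embeddings = OpenAIEmbeddings()
# One embedding vector per category name, in the same order as possible_categories
category_embeddings = embeddings.embed_documents(possible_categories)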
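
To make "semantic similarity" concrete: cosine similarity is the dot product of two vectors divided by the product of their lengths, so vectors pointing in nearly the same direction score close to 1. The sketch below computes it by hand in plain Python on made-up 3-dimensional vectors. The values and the "wireless earbuds" product are invented for illustration; real embeddings have far more dimensions and are not reproduced here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration only (not real model output)
toy_categories = {
    "Electronics": [0.9, 0.1, 0.0],
    "Clothing": [0.1, 0.9, 0.1],
}
toy_product = [0.8, 0.2, 0.1]  # imagine this encodes "wireless earbuds"

# Score the product against every category and keep the best match
scores = {name: cosine(toy_product, vec) for name, vec in toy_categories.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

The toy product points in almost the same direction as the `Electronics` vector, so that category wins. `cosine_similarity` from scikit-learn performs the same calculation for whole matrices of vectors at once.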

🔍 Then, we'll assign each product to its closest matching category.

In [None]:
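# Embed all descriptions in batched requests instead of one API call per row
descriptions = products['description'].tolist()
product_embeddings = embeddings.embed_documents(descriptions)

# Similarity matrix: one row per product, one column per category
sims = cosine_similarity(product_embeddings, category_embeddings)

# Assign each product the category with the highest similarity
best_idx = np.argmax(sims, axis=1)
products['predicted_category'] = [possible_categories[i] for i in best_idx]
products.head()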
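
Picking the highest-scoring category always returns an answer, even when no category fits well. One possible refinement is to fall back to a manual-review label when the best score is low. The sketch below is an assumption layered on top of this notebook: the names `pick_category`, `min_score`, and the `'Uncategorized'` label are hypothetical, the scores are made up, and the threshold is arbitrary. It would need tuning on real data, since typical score ranges depend on the embedding model.

```python
def pick_category(scores, categories, min_score=0.3, fallback="Uncategorized"):
    """Return the best-scoring category, or `fallback` if the top score is too low.

    scores: one similarity value per entry in `categories`, in the same order.
    min_score and fallback are illustrative defaults, not tuned values.
    """
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best_index] < min_score:
        return fallback
    return categories[best_index]

categories = ['Electronics', 'Clothing', 'Home Decor', 'Sports', 'Toys']

# Made-up similarity rows: one clear winner, one with no convincing match
confident = pick_category([0.12, 0.81, 0.20, 0.15, 0.10], categories)
unsure = pick_category([0.12, 0.18, 0.20, 0.15, 0.10], categories)
print(confident, unsure)
```

Rows that come back as `'Uncategorized'` can then be routed to a person instead of being filed under a poor match.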

💾 Save the categorized product list for catalog review.

In [None]:
products.to_csv('products_with_categories.csv', index=False)

## 🎉 Wrapping Up

Hope you enjoyed this automation project! 🎯

👉 What other product metadata would you predict automatically?
👉 Could embeddings help with customer search too?

💬 Share your ideas or experiments in the comments below!