# MANVUE Fashion Product Search with CLIP + FAISS + MongoDB
## AI-Powered Visual Search for Men's Fashion E-commerce

This notebook implements a complete visual search system using:
- **CLIP** (Contrastive Language-Image Pre-training) for image embeddings
- **FAISS** (Facebook AI Similarity Search) for fast similarity search
- **MongoDB GridFS** for image storage and retrieval
- **Fashion Product Dataset** for training and testing

**Features:**
- Upload user images and find similar products
- Store product embeddings in FAISS index
- Save images and metadata in MongoDB
- Train traditional ML algorithms on CLIP embeddings
- Real-time similarity search with sub-second response times

**Integration with MANVUE:**
- Connects to MANVUE MongoDB Atlas database
- Generates embeddings for all product images
- Provides API-ready search functionality


In [None]:
# STEP 1: Install Dependencies
# GridFS ships with pymongo, so no separate "gridfs" package is needed; pandas is used in STEP 5
%pip install pymongo pillow transformers torch torchvision faiss-cpu datasets scikit-learn pandas

print("✅ Dependencies installed successfully!")


In [None]:
# STEP 2: Import Required Libraries
from pymongo import MongoClient
import gridfs
import requests, io, datetime, json
from PIL import Image
import numpy as np
import torch
from transformers import CLIPProcessor, CLIPModel
import faiss
from datasets import load_dataset

print("✅ All libraries imported successfully!")


In [None]:
# STEP 3: Connect to MongoDB Atlas
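# Read the connection string from the environment instead of hardcoding credentials in the notebook.
# MANVUE_MONGO_URI is an assumed variable name; set it to your Atlas URI, which has the form
# mongodb+srv://<username>:<password>@manvue.ilich4r.mongodb.net/?retryWrites=true&w=majority&appName=MANVUE
import os
MONGO_URI = os.environ["MANVUE_MONGO_URI"]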
client = MongoClient(MONGO_URI)
db = client["MANVUE"]
fs = gridfs.GridFS(db)
print("✅ Connected to MongoDB Atlas")


## Dataset Setup Instructions

Before running the remaining cells, download the Kaggle Fashion Product Text Images Dataset and make it available in your Colab environment:

### Option 1: Using Kaggle API (Recommended)
```python
# Install Kaggle API
!pip install kaggle

# Upload your kaggle.json file (get it from your Kaggle account settings)
# Then run:
!kaggle datasets download -d nirmalsankalana/fashion-product-text-images-dataset
!unzip fashion-product-text-images-dataset.zip
```

### Option 2: Manual Upload
1. Download the dataset from: https://www.kaggle.com/datasets/nirmalsankalana/fashion-product-text-images-dataset
2. Upload the extracted folder to your Colab environment
3. Update the `DATASET_PATH` variable in the next cell

### Expected Dataset Structure:
```
fashion-product-text-images-dataset/
├── images/           # Folder containing product images
│   ├── 0.jpg
│   ├── 1.jpg
│   └── ...
└── styles.csv        # CSV file with product metadata
```
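
Before running the loading cell, it can help to confirm that the uploaded folder matches the tree above. The following is a minimal sketch, not part of the original notebook: `check_dataset_layout` is a hypothetical helper, and it assumes that `styles.csv` has an `id` column whose values name the `.jpg` files in `images/`.

```python
import csv
from pathlib import Path

def check_dataset_layout(dataset_path, id_column="id", sample_size=20):
    """Return a list of problems found in the dataset folder (empty if none).

    Hypothetical helper: assumes images/<id>.jpg and a styles.csv with an
    `id` column, matching the expected structure shown above.
    """
    root = Path(dataset_path)
    images_dir = root / "images"
    csv_file = root / "styles.csv"
    problems = []

    if not images_dir.is_dir():
        problems.append(f"missing folder: {images_dir}")
    if not csv_file.is_file():
        problems.append(f"missing file: {csv_file}")
    if problems:
        return problems

    with csv_file.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if id_column not in (reader.fieldnames or []):
            return [f"column '{id_column}' not found in {csv_file.name}"]
        # Only check the first few rows so the check stays fast
        for row_number, row in enumerate(reader):
            if row_number >= sample_size:
                break
            image_file = images_dir / f"{row[id_column]}.jpg"
            if not image_file.is_file():
                problems.append(f"no image for id {row[id_column]}")
    return problems
```

Calling `check_dataset_layout("/content/fashion-product-text-images-dataset")` returns an empty list when the sampled rows all have a matching image; anything else points at what to fix before STEP 5.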


In [None]:
# OPTIONAL: Auto-download Kaggle Dataset
# Uncomment and run this cell to automatically download the dataset

# !pip install kaggle
# !kaggle datasets download -d nirmalsankalana/fashion-product-text-images-dataset
# !unzip -q fashion-product-text-images-dataset.zip
# !rm fashion-product-text-images-dataset.zip

print("📝 If you haven't downloaded the dataset yet, please:")
print("1. Upload your kaggle.json file to Colab")
print("2. Uncomment and run the commands above")
print("3. Or manually upload the dataset files")


In [None]:
# STEP 4: Load CLIP Model
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"🖥️  Using device: {device}")

clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

print("✅ CLIP model loaded successfully!")


In [None]:
# STEP 5: Load Kaggle Fashion Dataset
print("📦 Loading Kaggle fashion product dataset...")

# First, download and extract the dataset from Kaggle
# You need to upload the dataset files to your Colab environment
# The dataset should contain:
# - A folder with product images
# - A CSV file with product metadata

import pandas as pd
import os
from pathlib import Path

# Update these paths based on your uploaded dataset structure
DATASET_PATH = "/content/fashion-product-text-images-dataset"  # Update this path
IMAGES_FOLDER = os.path.join(DATASET_PATH, "images")  # Update if different
CSV_FILE = os.path.join(DATASET_PATH, "styles.csv")  # Update if different

# Load the CSV metadata
try:
    df = pd.read_csv(CSV_FILE)
    print(f"✅ Loaded CSV with {len(df)} products")
    print(f"📊 Columns: {list(df.columns)}")
    
    # Display first few rows to understand the structure
    print("\n📋 Sample data:")
    print(df.head())
    
    # Limit to first 500 products for demo (remove the next line to use the full dataset)
    df = df.head(500)
    print(f"📦 Using {len(df)} products for this demo")
    
except FileNotFoundError:
    print("❌ Dataset files not found. Please upload the Kaggle dataset to your Colab environment.")
    print("Expected structure:")
    print("- /content/fashion-product-text-images-dataset/")
    print("  - images/ (folder with product images)")
    print("  - styles.csv (CSV file with product metadata)")
    df = None


In [None]:
# STEP 6: Build Embeddings and Save Products in MongoDB
print("🔄 Processing images and generating embeddings...")

if df is not None:
    image_embeddings = []
    metadata = []
    
    # Common column names in fashion datasets (adjust based on actual CSV structure)
    # These are typical column names - update based on your actual CSV
    id_col = 'id' if 'id' in df.columns else df.columns[0]
    name_col = 'productDisplayName' if 'productDisplayName' in df.columns else 'name'
    category_col = 'masterCategory' if 'masterCategory' in df.columns else 'category'
    color_col = 'baseColour' if 'baseColour' in df.columns else 'color'
    
    print(f"📊 Using columns: ID={id_col}, Name={name_col}, Category={category_col}, Color={color_col}")
    
    for i, row in df.iterrows():
        try:
            # Construct image path - common patterns in fashion datasets
            image_id = str(row[id_col])
            image_path = os.path.join(IMAGES_FOLDER, f"{image_id}.jpg")
            
            # Try different image extensions if .jpg doesn't exist
            if not os.path.exists(image_path):
                for ext in ['.png', '.jpeg', '.JPG', '.PNG']:
                    alt_path = os.path.join(IMAGES_FOLDER, f"{image_id}{ext}")
                    if os.path.exists(alt_path):
                        image_path = alt_path
                        break
            
            if not os.path.exists(image_path):
                print(f"⚠️  Image not found for ID {image_id}")
                continue
                
            # Load and process image
            img = Image.open(image_path).convert("RGB")
            
            # Get product text description
            text = str(row.get(name_col, f"Product {image_id}"))
            
            # Generate CLIP embeddings (truncation keeps long product names within CLIP's 77-token text limit)
            inputs = clip_processor(text=[text], images=[img], return_tensors="pt", padding=True, truncation=True).to(device)
            with torch.no_grad():
                outputs = clip_model(**inputs)
            
            img_emb = outputs.image_embeds / outputs.image_embeds.norm(p=2, dim=-1, keepdim=True)
            emb = img_emb.cpu().numpy()[0]
            
            # Save image in MongoDB
            buf = io.BytesIO()
            img.save(buf, format="JPEG")
            buf.seek(0)
            fs_id = fs.put(buf.read(), filename=f"product_{image_id}.jpg", metadata={
                "name": text,
                "category": str(row.get(category_col, "")),
                "color": str(row.get(color_col, "")),
                "product_id": image_id
            })
            
            # Store embedding + metadata
            image_embeddings.append(emb)
            metadata.append({
                "filename": f"product_{image_id}.jpg",
                "name": text,
                "category": str(row.get(category_col, "")),
                "color": str(row.get(color_col, "")),
                "product_id": image_id
            })
            
            if (i+1) % 100 == 0:
                print(f"Processed {i+1} items")
                
        except Exception as e:
            print(f"❌ Error processing item {i}: {e}")
            continue
    
    if image_embeddings:
        image_embeddings = np.array(image_embeddings).astype("float32")
        print(f"✅ Generated embeddings for {len(image_embeddings)} products")
    else:
        print("❌ No embeddings generated. Please check your dataset structure.")
        image_embeddings = []
        metadata = []
else:
    print("❌ Cannot process images - dataset not loaded")
    image_embeddings = []
    metadata = []


In [None]:
# STEP 7: Save Metadata and Build FAISS Index
print("💾 Saving metadata...")
with open("metadata.json", "w") as f:
    json.dump(metadata, f)

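print("🔍 Building FAISS index...")
# STEP 6 leaves image_embeddings as an empty list when nothing was processed, so guard before using .shape
if len(image_embeddings) > 0:
    dim = image_embeddings.shape[1]
    index = faiss.IndexFlatL2(dim)
    index.add(image_embeddings)
    faiss.write_index(index, "fashion.index")

    print("✅ FAISS index built and saved!")
    print(f"📊 Index contains {index.ntotal} vectors with {dim} dimensions")
else:
    print("❌ No embeddings available - skipping FAISS index build")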
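
STEP 6 L2-normalizes every embedding, and STEP 7 indexes them with `IndexFlatL2`, which reports squared Euclidean distances. For unit vectors these two facts combine into the identity ||a - b||² = 2 - 2·cos(a, b), so ranking by smallest L2 distance matches ranking by highest cosine similarity. Below is a small pure-Python sketch of that identity; the toy vectors and helper names are illustrative and not taken from the notebook.

```python
import math

def normalize(v):
    """Scale a vector to unit length (the same L2 normalization used in STEP 6)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    """Squared Euclidean distance, the quantity a flat L2 index reports."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity of two unit vectors is just their dot product."""
    return sum(x * y for x, y in zip(a, b))

query = normalize([1.0, 2.0, 3.0])
products = [normalize(v) for v in ([1.0, 2.0, 2.5], [3.0, -1.0, 0.5], [0.5, 1.0, 4.0])]

for p in products:
    # For unit vectors: squared L2 distance == 2 - 2 * cosine similarity
    assert math.isclose(squared_l2(query, p), 2 - 2 * cosine(query, p), abs_tol=1e-12)

# Both orderings list the same products from most to least similar
by_distance = sorted(range(len(products)), key=lambda i: squared_l2(query, products[i]))
by_cosine = sorted(range(len(products)), key=lambda i: -cosine(query, products[i]))
```

If cosine scores are wanted directly, an inner-product index such as `faiss.IndexFlatIP` over the same normalized vectors is a common alternative.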


In [None]:
# STEP 8: Define Search Function
def upload_user_and_find(user_image_path, username="guest", top_k=5):
    """
    Upload a user image and find similar products using CLIP + FAISS
    """
    # Save uploaded image in MongoDB
    with open(user_image_path, "rb") as f:
        img_bytes = f.read()
    user_file_id = fs.put(img_bytes, filename=f"user_{username}_{datetime.datetime.now().timestamp()}.jpg",
                          metadata={"username": username, "upload_time": datetime.datetime.now()})
    
    # Get embedding
    user_img = Image.open(user_image_path).convert("RGB")
    inputs = clip_processor(images=[user_img], return_tensors="pt").to(device)

    with torch.no_grad():
        outputs = clip_model.get_image_features(**inputs)
    emb = outputs / outputs.norm(p=2, dim=-1, keepdim=True)
    emb = emb.cpu().numpy().astype("float32")

    # Search FAISS (indices are padded with -1 when fewer than top_k vectors exist, so skip those)
    D, I = index.search(emb, top_k)
    results = [metadata[idx] for idx in I[0] if idx >= 0]

    # Save query + results in MongoDB
    query_doc = {
        "username": username,
        "uploaded_file_id": str(user_file_id),
        "timestamp": datetime.datetime.now(),
        "results": results
    }
    db["queries"].insert_one(query_doc)

    return results

print("✅ Search function defined!")


In [None]:
# STEP 9: Test the Search Function
# Upload a sample image to test the search functionality
# You can upload any fashion image using the file upload button in Colab

# Example usage (uncomment and modify the path when you have an image):
# user_image = "/content/sample_fashion.jpg"  # Replace with your uploaded image path
# results = upload_user_and_find(user_image, username="test_user", top_k=5)
# 
# print("🔎 Similar Products Found:")
# for i, result in enumerate(results, 1):
#     print(f"{i}. {result['name']} (Category: {result['category']})")

print("📝 Ready to test! Upload an image and uncomment the code above to test the search function.")
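
`upload_user_and_find` returns only the metadata for each match. When displaying results it is often useful to attach a score as well. The sketch below is one hedged way to do that from the `D` and `I` arrays that `index.search` returns. `attach_scores` is a hypothetical helper, and the cosine conversion assumes L2-normalized embeddings in an `IndexFlatL2` index, as in this notebook. Plain lists stand in for the NumPy rows so the example runs without FAISS.

```python
def attach_scores(distances, indices, metadata):
    """Pair each hit with its squared L2 distance and a derived cosine similarity.

    `distances` and `indices` are one row each of the D and I arrays
    (for example D[0] and I[0]). Entries with index -1 are padding and are skipped.
    """
    scored = []
    for dist, idx in zip(distances, indices):
        if idx < 0:
            continue  # FAISS pads with -1 when it finds fewer than top_k results
        item = dict(metadata[int(idx)])  # copy so the catalog entry is not mutated
        item["distance"] = float(dist)
        item["similarity"] = 1.0 - float(dist) / 2.0  # valid for unit-length vectors only
        scored.append(item)
    return scored

# Toy stand-ins for the metadata list and one search row
toy_metadata = [{"name": "Blue Shirt"}, {"name": "Black Jeans"}]
hits = attach_scores([0.0, 0.5, 3.4e38], [1, 0, -1], toy_metadata)
```

Copying each metadata entry keeps the in-memory catalog unchanged, so the scores of one query never leak into the results of the next.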


## Bonus: Train Traditional ML Algorithms on CLIP Embeddings

Let's train some traditional machine learning algorithms using the CLIP embeddings as features to classify fashion categories.


In [None]:
# Import scikit-learn for traditional ML algorithms
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

print("✅ Scikit-learn imported for ML algorithms")


In [None]:
# Prepare dataset for traditional ML
print("🔄 Preparing dataset for machine learning...")

if len(image_embeddings) > 0 and len(metadata) > 0:
    # X = embeddings, y = category
    X = image_embeddings
    y = [m["category"] if m["category"] and m["category"] != "nan" else "unknown" for m in metadata]
    
    # Convert to numpy arrays
    y = np.array(y)
    
    # Split into train/test
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    
    print(f"✅ Dataset prepared: {X_train.shape[0]} training samples, {X_test.shape[0]} test samples")
    print(f"📊 Categories: {np.unique(y)}")
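    # Count samples per category directly (hashing labels into buckets gives misleading counts)
    categories, counts = np.unique(y, return_counts=True)
    print(f"📊 Category distribution: {dict(zip(categories.tolist(), counts.tolist()))}")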
else:
    print("❌ No data available for ML training. Please ensure the dataset was loaded correctly.")
    X_train, X_test, y_train, y_test = None, None, None, None
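
With only 500 demo products, some categories may have very few examples, which makes accuracy on a random 80/20 split noisy. Here is a minimal sketch of inspecting the label counts and dropping categories below a threshold before splitting. `filter_rare_categories` and the `min_count` threshold are illustrative assumptions, not part of the original pipeline, and the labels are made-up sample values.

```python
from collections import Counter

def filter_rare_categories(labels, min_count=2):
    """Return (kept_positions, counts): positions whose label occurs at least min_count times."""
    counts = Counter(labels)
    kept = [i for i, label in enumerate(labels) if counts[label] >= min_count]
    return kept, counts

# Made-up labels for illustration
labels = ["Apparel", "Footwear", "Apparel", "Accessories", "Footwear", "Apparel"]
kept, counts = filter_rare_categories(labels, min_count=2)
print(counts.most_common())
```

The returned positions can index both arrays, for example `X[kept]` and `y[kept]`. Once every remaining class has at least two samples, passing `stratify=y[kept]` to `train_test_split` becomes an option.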


In [None]:
# Train and evaluate multiple ML algorithms
if X_train is not None and y_train is not None:
    print("🤖 Training machine learning models...")
    
    # 1. Logistic Regression
    log_reg = LogisticRegression(max_iter=2000, random_state=42)
    log_reg.fit(X_train, y_train)
    y_pred_lr = log_reg.predict(X_test)
    acc_lr = accuracy_score(y_test, y_pred_lr)
    print(f"📊 Logistic Regression Accuracy: {acc_lr:.3f}")
    
    # 2. Support Vector Machine
    svm_clf = SVC(kernel="linear", random_state=42)
    svm_clf.fit(X_train, y_train)
    y_pred_svm = svm_clf.predict(X_test)
    acc_svm = accuracy_score(y_test, y_pred_svm)
    print(f"📊 SVM Accuracy: {acc_svm:.3f}")
    
    # 3. k-Nearest Neighbors
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_train, y_train)
    y_pred_knn = knn.predict(X_test)
    acc_knn = accuracy_score(y_test, y_pred_knn)
    print(f"📊 kNN Accuracy: {acc_knn:.3f}")
    
    # 4. Random Forest
    rf = RandomForestClassifier(n_estimators=100, random_state=42)
    rf.fit(X_train, y_train)
    y_pred_rf = rf.predict(X_test)
    acc_rf = accuracy_score(y_test, y_pred_rf)
    print(f"📊 Random Forest Accuracy: {acc_rf:.3f}")
    
    print("\n=== Final Results ===")
    print(f"Logistic Regression: {acc_lr:.3f}")
    print(f"SVM: {acc_svm:.3f}")
    print(f"kNN: {acc_knn:.3f}")
    print(f"Random Forest: {acc_rf:.3f}")
    
    best_model = max([("Logistic Regression", acc_lr), ("SVM", acc_svm), 
                      ("kNN", acc_knn), ("Random Forest", acc_rf)], key=lambda x: x[1])
    print(f"\n🏆 Best performing model: {best_model[0]} with {best_model[1]:.3f} accuracy")
else:
    print("❌ Cannot train ML models - no data available")


## Summary

🎉 **Congratulations!** You've successfully implemented a complete visual search system for MANVUE using the Kaggle Fashion Product Text Images Dataset:

### What We Built:
1. **CLIP-based Image Embeddings** - Using OpenAI's CLIP model to generate rich image representations
2. **FAISS Similarity Search** - Fast, scalable similarity search for finding similar products
3. **MongoDB Integration** - Storing images and metadata in GridFS
4. **Traditional ML Classification** - Training multiple algorithms on CLIP embeddings
5. **Real-time Search API** - Function to upload user images and find similar products

### Dataset Used:
- **Source**: [Kaggle Fashion Product Text Images Dataset](https://www.kaggle.com/datasets/nirmalsankalana/fashion-product-text-images-dataset)
- **Content**: Fashion product images with detailed metadata
- **Structure**: Images folder + CSV metadata file
- **Scale**: Configurable (demo uses 500 products, can be scaled to full dataset)

### Key Features:
- ✅ **Flexible dataset loading** - Works with various CSV column structures
- ✅ **Robust image processing** - Handles multiple image formats
- ✅ **Sub-second search** response times
- ✅ **MongoDB storage** for images and metadata
- ✅ **Multiple ML algorithms** trained and compared
- ✅ **API-ready** search functionality

### Next Steps:
1. **Download the files**: `fashion.index` and `metadata.json` for your production system
2. **Integrate with your API**: Expose the search function from a Python service that your Node.js or Python backend can call
3. **Scale up**: Process the full dataset (remove the `.head(500)` limit)
4. **Deploy**: Set up the visual search in your MANVUE e-commerce platform
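
Search results are looked up by position: row `i` of the FAISS index corresponds to `metadata[i]`. Before deploying the two files it is worth confirming that they still agree. The following is a hedged sketch of such a check. `load_metadata` is a hypothetical helper, the required keys are the ones written in STEP 6, and `expected_count` would be `index.ntotal` after `faiss.read_index("fashion.index")`.

```python
import json

REQUIRED_KEYS = {"filename", "name", "category", "color", "product_id"}

def load_metadata(path, expected_count):
    """Load metadata.json and verify it lines up with the FAISS index size."""
    with open(path, encoding="utf-8") as f:
        metadata = json.load(f)
    if len(metadata) != expected_count:
        raise ValueError(
            f"metadata has {len(metadata)} entries but the index holds {expected_count} vectors"
        )
    for position, entry in enumerate(metadata):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"entry {position} is missing keys: {sorted(missing)}")
    return metadata
```

Failing loudly at startup is a deliberate choice here: a silent mismatch between the two files would return the wrong product for every query.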

The system is now ready to power visual search in your MANVUE fashion store! 🛍️
