# AI Fashion Assistant v2.4.5 - Multi-Modal RAG

**Day 3: Multimodal Retrieval**

---

**Project:** AI Fashion Assistant (TÜBİTAK 2209-A)  
**Student:** Hatice Baydemir  
**Date:** January 8, 2026  
**Version:** 2.4.5

---

## Goal

Implement a multimodal retrieval system:
1. Load FAISS indices (text + image)
2. Build MultiModalRetriever class
3. Implement fusion strategies
4. Test 3 retrieval modes (text-only, image-only, multimodal)
5. Attribute-based filtering
6. Compare results

---

## PART 1: Setup

In [113]:
from google.colab import drive
drive.mount('/content/drive')

import os
os.chdir('/content/drive/MyDrive/ai_fashion_assistant_v2')

print('Drive mounted')
print(f'Working directory: {os.getcwd()}')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Drive mounted
Working directory: /content/drive/MyDrive/ai_fashion_assistant_v2


In [114]:
import json
import numpy as np
import pandas as pd
import pickle
from pathlib import Path
from typing import Dict, List, Tuple, Optional
import matplotlib.pyplot as plt
import torch
from PIL import Image

print('Imports complete')

Imports complete


---

## PART 2: Load FAISS Indices

In [115]:
# Install FAISS
!pip install -q faiss-cpu

import faiss

print('FAISS installed')

FAISS installed


In [116]:
# Load FAISS index (hybrid)
faiss_index_path = 'indexes/faiss_hybrid_hnsw.index'

if Path(faiss_index_path).exists():
    faiss_index = faiss.read_index(faiss_index_path)
    print(f'✓ FAISS index loaded: {faiss_index.ntotal} vectors, {faiss_index.d} dimensions')
else:
    print(f'✗ FAISS index not found')
    faiss_index = None

✓ FAISS index loaded: 44417 vectors, 2304 dimensions


In [117]:
# Check if image FAISS index exists (from v2.1)
image_index_path = 'v2.1-core-ml-plus/embeddings/faiss_index_clip.index'

if Path(image_index_path).exists():
    image_index = faiss.read_index(image_index_path)
    print(f'✓ Image index loaded: {image_index.ntotal} vectors, {image_index.d} dimensions')
else:
    print(f'✗ Image index not found at {image_index_path}')
    print('\nWill create image index if needed...')
    image_index = None

✗ Image index not found at v2.1-core-ml-plus/embeddings/faiss_index_clip.index

Will create image index if needed...
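
Since the prebuilt image index is missing, PART 5 falls back to `faiss.IndexFlatL2`, which performs an exact, exhaustive search and reports squared Euclidean distances. As a rough mental model only, the same search can be written in plain Python. The function name `flat_l2_search` and the toy vectors are illustrative and not part of the project:

```python
from typing import List, Sequence, Tuple

def flat_l2_search(database: Sequence[Sequence[float]],
                   query: Sequence[float],
                   k: int) -> Tuple[List[float], List[int]]:
    """Exhaustive nearest-neighbour search by squared L2 distance.

    Returns (distances, indices), closest first, mirroring the
    (distances, indices) order that a FAISS search returns.
    """
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, vec)), i)
        for i, vec in enumerate(database)
    )
    top = scored[:k]
    return [dist for dist, _ in top], [idx for _, idx in top]

# Toy 2-d "embeddings"; the row position plays the role of the FAISS index id.
db = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
distances, indices = flat_l2_search(db, [1.0, 1.0], k=3)
print(indices, distances)
```

A flat index compares the query against every stored vector, so the results are exact but the cost grows linearly with the catalogue size.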


---

## PART 3: Load Product Metadata & Embeddings

In [118]:
# Load product metadata
products_df = pd.read_csv('data/processed/meta_ssot.csv')

print(f'Loaded {len(products_df)} products')
print(f'Columns: {list(products_df.columns)}')

Loaded 44417 products
Columns: ['id', 'productDisplayName', 'masterCategory', 'subCategory', 'articleType', 'baseColour', 'gender', 'season', 'year', 'usage', 'desc', 'image_path', 'text_embedding', 'image_embedding', 'hybrid_embedding']


In [119]:
# Load text embeddings (mpnet 768d)
text_emb_path = 'v2.0-baseline/embeddings/text/mpnet_768d.npy'

if Path(text_emb_path).exists():
    text_embeddings = np.load(text_emb_path)
    print(f'✓ Text embeddings loaded: {text_embeddings.shape}')
else:
    print(f'✗ Text embeddings not found')
    text_embeddings = None

✓ Text embeddings loaded: (44417, 768)


In [120]:
# Load image embeddings (CLIP 768d)
image_emb_path = 'v2.0-baseline/embeddings/image/clip_image_768d.npy'

if Path(image_emb_path).exists():
    image_embeddings = np.load(image_emb_path)
    print(f'✓ Image embeddings loaded: {image_embeddings.shape}')
else:
    print(f'✗ Image embeddings not found')
    image_embeddings = None

✓ Image embeddings loaded: (44417, 768)


In [121]:
# Load CLIP model for encoding queries
from transformers import CLIPProcessor, CLIPModel

model_name = "openai/clip-vit-large-patch14"
print(f'Loading CLIP model: {model_name}')
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(model_name)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

print(f'✓ CLIP model loaded on {device}')

Loading CLIP model: openai/clip-vit-large-patch14
✓ CLIP model loaded on cpu


In [122]:
# Encoding functions
def encode_text_clip(text: str) -> np.ndarray:
    """Encode text with CLIP (768d)"""
    inputs = processor(text=[text], return_tensors="pt", padding=True).to(device)
    with torch.no_grad():
        outputs = model.get_text_features(**inputs)
    return outputs.cpu().numpy()[0]

def encode_image_clip(image_path: str) -> np.ndarray:
    """Encode image with CLIP (768d)"""
    image = Image.open(image_path).convert('RGB')
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.get_image_features(**inputs)
    return outputs.cpu().numpy()[0]

print('Encoding functions ready')

Encoding functions ready
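
The two encoders above return raw CLIP features without L2 normalization. If cosine similarity is the intended notion of closeness, one option is to scale both stored and query vectors to unit length, because for unit vectors the squared L2 distance equals `2 - 2 * cosine`, so an L2 index then ranks exactly like cosine similarity. The sketch below checks that identity in plain Python; the helper names are illustrative, and whether the stored `.npy` embeddings are already normalized is not stated in this notebook:

```python
import math
from typing import List, Sequence

def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a, b = [3.0, 4.0], [1.0, 0.0]
a_unit, b_unit = l2_normalize(a), l2_normalize(b)

# For unit vectors: ||a - b||^2 == 2 - 2 * cos(a, b)
lhs = squared_l2(a_unit, b_unit)
rhs = 2 - 2 * cosine(a, b)
print(round(lhs, 6), round(rhs, 6))
```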


---

## PART 4: MultiModalRetriever Class

In [123]:
# PART 4: MultiModalRetriever Class

class MultiModalRetriever:
    """Multimodal retrieval with fixed-weight score fusion (alpha is set by hand, not learned)"""

    def __init__(self,
                 text_index: faiss.Index,
                 image_index: faiss.Index,
                 products_df: pd.DataFrame,
                 alpha: float = 0.7):
        """
        Args:
            text_index: FAISS index for text embeddings
            image_index: FAISS index for image embeddings
            products_df: Product metadata
            alpha: Text weight in fusion (0.7 = 70% text, 30% image)
        """
        self.text_index = text_index
        self.image_index = image_index
        self.products_df = products_df
        self.alpha = alpha

    def retrieve_by_text(self, text_query: str, k: int = 20) -> Tuple[List[int], List[float]]:
        """Text-only retrieval (baseline)"""
        # Encode query
        text_emb = encode_text_clip(text_query)
        text_emb = text_emb.reshape(1, -1).astype('float32')

        # Search
        distances, indices = self.text_index.search(text_emb, k)

        return indices[0].tolist(), distances[0].tolist()

    def retrieve_by_image(self, image_path: str, k: int = 20) -> Tuple[List[int], List[float]]:
        """Image-only retrieval (new capability)"""
        if self.image_index is None:
            return [], []

        # Encode image
        img_emb = encode_image_clip(image_path)
        img_emb = img_emb.reshape(1, -1).astype('float32')

        # Search
        distances, indices = self.image_index.search(img_emb, k)

        return indices[0].tolist(), distances[0].tolist()

    def retrieve_multimodal(self,
                           image_path: Optional[str] = None,
                           text_query: Optional[str] = None,
                           k: int = 20) -> List[int]:
        """Multimodal fusion retrieval (fuses the top-50 candidates of each modality)"""

        # Get rankings from both modalities
        if image_path:
            img_indices, img_scores = self.retrieve_by_image(image_path, k=50)
        else:
            img_indices, img_scores = [], []

        if text_query:
            text_indices, text_scores = self.retrieve_by_text(text_query, k=50)
        else:
            text_indices, text_scores = [], []

        # Fusion strategy
        if not img_indices and not text_indices:
            return []
        elif not img_indices:
            return text_indices[:k]
        elif not text_indices:
            return img_indices[:k]

        # Normalize scores to [0, 1]
        text_scores_norm = self._normalize_scores(text_scores)
        img_scores_norm = self._normalize_scores(img_scores)

        # Create score maps
        text_score_map = {idx: score for idx, score in zip(text_indices, text_scores_norm)}
        img_score_map = {idx: score for idx, score in zip(img_indices, img_scores_norm)}

        # Combine all product IDs
        all_products = set(text_indices) | set(img_indices)

        # Fusion: weighted combination
        fused_scores = {}
        for prod_id in all_products:
            text_score = text_score_map.get(prod_id, 0)
            img_score = img_score_map.get(prod_id, 0)
            fused_scores[prod_id] = self.alpha * text_score + (1 - self.alpha) * img_score

        # Sort by fused score
        ranked = sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)

        return [prod_id for prod_id, score in ranked[:k]]

    def _normalize_scores(self, scores: List[float]) -> List[float]:
        """Normalize scores to [0, 1] using min-max"""
        if not scores:
            return []

        scores = np.array(scores)

        # IndexFlatL2 returns squared L2 distances: smaller distance = more similar
        # Negate so that a smaller distance gives a larger score
        similarities = -scores

        # Min-max normalize to [0, 1]
        min_sim = similarities.min()
        max_sim = similarities.max()

        if max_sim - min_sim > 0:
            normalized = (similarities - min_sim) / (max_sim - min_sim)
        else:
            normalized = np.ones_like(similarities)

        return normalized.tolist()

print('MultiModalRetriever class defined')

MultiModalRetriever class defined
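
To make the fusion step concrete, here is a small worked example that follows the same two stages as `_normalize_scores` and `retrieve_multimodal`: negate and min-max scale the distances, then take the alpha-weighted sum. It is a plain-Python sketch with made-up ids and distances; the function names are illustrative, and the tie-break by id is added here only to keep the output deterministic:

```python
from typing import Dict, List

def min_max_similarity(distances: List[float]) -> List[float]:
    """Negate distances (smaller distance = higher score), then scale to [0, 1]."""
    if not distances:
        return []
    sims = [-d for d in distances]
    lo, hi = min(sims), max(sims)
    if hi == lo:
        return [1.0] * len(sims)
    return [(s - lo) / (hi - lo) for s in sims]

def fuse(text_hits: Dict[int, float],
         image_hits: Dict[int, float],
         alpha: float = 0.7) -> List[int]:
    """Weighted sum of normalized scores; an id missing from one modality scores 0 there."""
    candidates = set(text_hits) | set(image_hits)
    fused = {
        c: alpha * text_hits.get(c, 0.0) + (1 - alpha) * image_hits.get(c, 0.0)
        for c in candidates
    }
    # Highest fused score first; ties broken by id (an addition for determinism).
    return sorted(fused, key=lambda c: (-fused[c], c))

# Made-up candidates: ids with their distances per modality.
text_ids, text_dist = [10, 11, 12], [0.2, 0.5, 0.8]
image_ids, image_dist = [12, 13, 10], [0.1, 0.3, 0.9]

text_scores = dict(zip(text_ids, min_max_similarity(text_dist)))
image_scores = dict(zip(image_ids, min_max_similarity(image_dist)))

ranking = fuse(text_scores, image_scores, alpha=0.7)
print(ranking)  # [10, 11, 12, 13]
```

One property worth noticing: min-max scaling always assigns 0 to the last candidate of each list, so that candidate contributes nothing from its own modality, and the scores depend on which candidates happen to be in the pool rather than on absolute distance.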


---

## PART 5: Initialize Retriever

In [124]:
# PART 5: Create FAISS Indices with CLIP Text Embeddings

print("Creating CLIP text embeddings for all products...")
print("This will take ~10 minutes for 44,417 products")

clip_text_embeddings = []

batch_size = 100  # Chunk size for progress reporting (each text is still encoded individually)
total_products = len(products_df)

for i in range(0, total_products, batch_size):
    batch = products_df.iloc[i:i+batch_size]

    for idx, row in batch.iterrows():

        text = f"{row['baseColour']} {row['articleType']}"


        emb = encode_text_clip(text)
        clip_text_embeddings.append(emb)

    # Progress
    if (i + batch_size) % 1000 == 0:
        print(f"  Processed {i + batch_size}/{total_products} products...")

clip_text_embeddings = np.array(clip_text_embeddings).astype('float32')
print(f"✓ CLIP text embeddings created: {clip_text_embeddings.shape}")



text_index = faiss.IndexFlatL2(clip_text_embeddings.shape[1])
text_index.add(clip_text_embeddings)
print(f'✓ Text index created with CLIP: {text_index.ntotal} vectors')


if image_embeddings is not None:
    image_index = faiss.IndexFlatL2(image_embeddings.shape[1])
    image_index.add(image_embeddings.astype('float32'))
    print(f'✓ Image index created: {image_index.ntotal} vectors')
else:
    image_index = None
    print('✗ Cannot create image index')


# Index → Product ID mapping
index_to_product_id = products_df['id'].values
print(f'✓ Created index-to-ID mapping: {len(index_to_product_id)} products')


# Initialize retriever
if text_index is not None and image_index is not None:
    retriever = MultiModalRetriever(
        text_index=text_index,
        image_index=image_index,
        products_df=products_df,
        alpha=0.7
    )
    print('✓ MultiModalRetriever initialized with CLIP text embeddings')
    print(f'  Alpha (text weight): {retriever.alpha}')
    print(f'  Text: CLIP, Image: CLIP → Consistent embedding space!')
else:
    print('✗ Cannot initialize retriever - missing embeddings')
    retriever = None

Creating CLIP text embeddings for all products...
This will take ~10 minutes for 44,417 products
  Processed 1000/44417 products...
  Processed 2000/44417 products...
  Processed 3000/44417 products...
  Processed 4000/44417 products...
  Processed 5000/44417 products...
  Processed 6000/44417 products...
  Processed 7000/44417 products...
  Processed 8000/44417 products...
  Processed 9000/44417 products...
  Processed 10000/44417 products...
  Processed 11000/44417 products...
  Processed 12000/44417 products...
  Processed 13000/44417 products...
  Processed 14000/44417 products...
  Processed 15000/44417 products...
  Processed 16000/44417 products...
  Processed 17000/44417 products...
  Processed 18000/44417 products...
  Processed 19000/44417 products...
  Processed 20000/44417 products...
  Processed 21000/44417 products...
  Processed 22000/44417 products...
  Processed 23000/44417 products...
  Processed 24000/44417 products...
  Processed 25000/44417 products...
  Processed 
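
The loop above slices the DataFrame in chunks of 100 but still calls `encode_text_clip` once per row, so the model runs once for every product. Because the processor call already takes a list (`text=[text]`), a whole chunk could in principle go through one forward pass. The sketch below shows only the chunking pattern; `encode_batch` is a hypothetical callable, and a stub stands in for the model so the snippet runs on its own:

```python
from typing import Callable, Iterator, List, Sequence

def chunks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def encode_all(texts: Sequence[str],
               encode_batch: Callable[[Sequence[str]], List[List[float]]],
               batch_size: int = 100) -> List[List[float]]:
    """Encode texts chunk by chunk, calling `encode_batch` once per chunk."""
    embeddings: List[List[float]] = []
    for batch in chunks(texts, batch_size):
        embeddings.extend(encode_batch(batch))
    return embeddings

# Stub encoder (hypothetical): records each batch size and returns 1-d vectors.
calls: List[int] = []

def fake_encode_batch(batch: Sequence[str]) -> List[List[float]]:
    calls.append(len(batch))
    return [[float(len(text))] for text in batch]

texts = [f"colour{i} shirt" for i in range(250)]
vectors = encode_all(texts, fake_encode_batch, batch_size=100)
print(len(vectors), calls)
```

With the real model, `encode_batch` would presumably wrap the same `processor(...)` and `model.get_text_features(...)` calls used in `encode_text_clip`, passing the list of texts instead of a single-element list.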

---

## PART 6: Test Retrieval Strategies

In [125]:
# PART 6: Test Retrieval Strategies

# Load test queries from Day 2
test_queries_df = pd.read_csv('v2.4.5-multimodal-rag/evaluation/results/image_queries.csv')

print(f'Loaded {len(test_queries_df)} test queries')

# Test all 3 strategies on sample queries
results_comparison = []

for idx, row in test_queries_df.head(5).iterrows():
    product_id = row['product_id']
    product_name = row['original_name']
    text_query = row['generated_query']

    # Get image path
    product_row = products_df[products_df['id'] == product_id].iloc[0]
    image_path = product_row['image_path']

    # Fix path
    if 'ai_fashion_assistant_v1' in str(image_path):
        image_path = image_path.replace('ai_fashion_assistant_v1', 'ai_fashion_assistant_v2')

    # Alternative path
    alt_path = f'data/raw/images/{product_id}.jpg'

    # Check which path exists (TypeError covers a non-string image_path such as NaN)
    image_exists = False
    try:
        if Path(image_path).exists():
            image_exists = True
    except (OSError, TypeError):
        pass

    if not image_exists:
        try:
            if Path(alt_path).exists():
                image_path = alt_path
                image_exists = True
        except OSError:
            pass

    print(f"\nTesting: {product_name[:50]}")
    print(f"Query: '{text_query}'")

    # Strategy 1: Text-only
    text_results, _ = retriever.retrieve_by_text(text_query, k=10)
    print(f"  Text-only: {len(text_results)} results")

    # Strategy 2: Image-only
    if image_exists:
        try:
            img_results, _ = retriever.retrieve_by_image(image_path, k=10)
            print(f"  Image-only: {len(img_results)} results")
        except Exception as e:
            img_results = []
            print(f"  Image-only: Error - {str(e)[:50]}")
    else:
        img_results = []
        print(f"  Image-only: Image not available")

    # Strategy 3: Multimodal fusion
    if image_exists:
        try:
            mm_results = retriever.retrieve_multimodal(
                image_path=image_path,
                text_query=text_query,
                k=10
            )
            print(f"  Multimodal: {len(mm_results)} results")
        except Exception as e:
            mm_results = text_results[:10]
            print(f"  Multimodal: Using text-only (error)")
    else:
        mm_results = text_results[:10]
        print(f"  Multimodal: Using text-only (no image)")

    # Convert FAISS indices to product IDs
    text_product_ids = [index_to_product_id[idx] for idx in text_results]
    img_product_ids = [index_to_product_id[idx] for idx in img_results] if img_results else []
    mm_product_ids = [index_to_product_id[idx] for idx in mm_results]

    results_comparison.append({
        'product_id': product_id,
        'product_name': product_name,
        'query': text_query,
        'text_only': text_product_ids,
        'image_only': img_product_ids,
        'multimodal': mm_product_ids
    })

print(f'\nTested {len(results_comparison)} queries')

Loaded 5 test queries

Testing: Mark Taylor Men White Striped Shirt
Query: 'white shirts'
  Text-only: 10 results
  Image-only: 10 results
  Multimodal: 10 results

Testing: Arrow Woman Sylvia Blue Shirt
Query: 'blue shirts'
  Text-only: 10 results
  Image-only: 10 results
  Multimodal: 10 results

Testing: Locomotive Men Check Ladislav Purple Shirts
Query: 'purple shirts'
  Text-only: 10 results
  Image-only: 10 results
  Multimodal: 10 results

Testing: Highlander Men Solid Poplin Purple Shirts
Query: 'purple shirts'
  Text-only: 10 results
  Image-only: 10 results
  Multimodal: 10 results

Testing: Jealous 21 Women Supper Zipped Ankle Black Jeans
Query: 'black jeans'
  Text-only: 10 results
  Image-only: 10 results
  Multimodal: 10 results

Tested 5 queries
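
The path handling in the loop above (rewrite the stored path, then fall back to `data/raw/images/<id>.jpg`) can be condensed into one helper that returns the first candidate pointing at a real file. This is a sketch only: `first_existing` is an illustrative name, and the temporary directory stands in for the project's image folders:

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional

def first_existing(candidates: Iterable[object]) -> Optional[str]:
    """Return the first candidate that is a non-empty string naming an existing file."""
    for candidate in candidates:
        if not isinstance(candidate, str) or not candidate:
            continue  # skips None/NaN values coming from missing metadata
        if Path(candidate).is_file():
            return candidate
    return None

with tempfile.TemporaryDirectory() as tmp:
    real = Path(tmp) / "8859.jpg"
    real.write_bytes(b"")  # empty placeholder file
    missing_path = str(Path(tmp) / "missing.jpg")

    found = first_existing([float("nan"), missing_path, str(real)])
    not_found = first_existing([None, missing_path])

print(found is not None, not_found)
```

Returning `None` when nothing matches lets the caller branch on a single value instead of tracking a separate `image_exists` flag.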


---

## PART 7: Attribute-Based Filtering

In [126]:
# PART 7: Attribute-Based Filtering - FIXED VERSION

# Load V2.1 attributes
v21_attrs_long = pd.read_csv('v2.1-core-ml-plus/evaluation/results/product_attributes.csv')
print(f"Loaded V2.1 attributes: {v21_attrs_long.shape[0]} rows")

# Pivot from long to wide format (the correct method)
attr_df = v21_attrs_long.pivot_table(
    index='product_id',
    columns='category',
    values='value',
    aggfunc='first'
).reset_index()

# Pivot the confidences separately
conf_df = v21_attrs_long.pivot_table(
    index='product_id',
    columns='category',
    values='confidence',
    aggfunc='first'
).reset_index()

# Rename confidence columns
conf_df = conf_df.rename(
    columns=lambda col: col if col == 'product_id' else f'{col}_confidence'
)

# Merge
attr_df = attr_df.merge(conf_df, on='product_id')

print(f"Transformed: {attr_df.shape}")
print(f"Columns: {[c for c in attr_df.columns if 'confidence' not in c][:10]}")
print(f"\nSample:")
print(attr_df[['product_id', 'pattern', 'style']].head())


# Attribute matching - SIMPLIFIED
def calculate_attribute_match(query_attrs: Dict, product_attrs: Dict) -> float:
    """Calculate attribute matching score"""
    matches = 0
    total = 0

    # Compare only pattern and style (color may be missing in v2.1)
    for attr_name in ['pattern', 'style']:
        if attr_name in query_attrs and attr_name in product_attrs:
            q_val = query_attrs.get(attr_name)
            p_val = product_attrs.get(attr_name)

            if pd.notna(q_val) and pd.notna(p_val):
                total += 1
                # Substring match (more flexible)
                if str(q_val).lower() in str(p_val).lower() or str(p_val).lower() in str(q_val).lower():
                    matches += 1

    return matches / total if total > 0 else 0.0


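# Filtering function - SIMPLIFIED
def filter_by_attributes(results: List[int],
                        query_product_id: int,
                        attr_df: pd.DataFrame,
                        threshold: float = 0.2) -> List[Tuple[int, float]]:
    """Keep results whose attribute match score is >= threshold.

    Results without an attribute row are kept with score 0.0; if the query
    product has no attribute row, all results are returned with score 0.0.
    """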

    # Query product attributes
    query_row = attr_df[attr_df['product_id'] == query_product_id]
    if len(query_row) == 0:
        return [(pid, 0.0) for pid in results]

    query_attrs = {
        'pattern': query_row.iloc[0].get('pattern'),
        'style': query_row.iloc[0].get('style')
    }

    filtered = []
    for prod_id in results:
        prod_row = attr_df[attr_df['product_id'] == prod_id]

        if len(prod_row) > 0:
            prod_attrs = {
                'pattern': prod_row.iloc[0].get('pattern'),
                'style': prod_row.iloc[0].get('style')
            }
            score = calculate_attribute_match(query_attrs, prod_attrs)
            if score >= threshold:
                filtered.append((prod_id, score))
        else:
            filtered.append((prod_id, 0.0))

    filtered.sort(key=lambda x: x[1], reverse=True)
    return filtered


print('Filtering functions defined')


# Apply filtering
for result in results_comparison:
    product_id = result['product_id']

    filtered = filter_by_attributes(
        result['multimodal'],
        product_id,
        attr_df,
        threshold=0.2
    )

    result['filtered'] = [pid for pid, score in filtered]
    result['filter_scores'] = [score for pid, score in filtered]

    print(f"\n{result['product_name'][:40]}:")
    print(f"  Before: {len(result['multimodal'])}, After: {len(result['filtered'])}")
    if result['filter_scores']:
        print(f"  Avg score: {np.mean(result['filter_scores']):.3f}")

print('\n✓ Filtering applied')

Loaded V2.1 attributes: 307720 rows
Transformed: (42388, 21)
Columns: ['product_id', 'fit', 'formality', 'length', 'material_appearance', 'neckline', 'occasion', 'pattern', 'season', 'sleeve']

Sample:
category  product_id      pattern             style
0                  0    checkered  minimalist style
1                  1  solid color  minimalist style
2                  2          NaN  minimalist style
3                  3  solid color  minimalist style
4                  4  solid color  minimalist style
Filtering functions defined

Mark Taylor Men White Striped Shirt:
  Before: 10, After: 10
  Avg score: 0.000

Arrow Woman Sylvia Blue Shirt:
  Before: 10, After: 9
  Avg score: 0.722

Locomotive Men Check Ladislav Purple Shi:
  Before: 10, After: 8
  Avg score: 0.812

Highlander Men Solid Poplin Purple Shirt:
  Before: 10, After: 7
  Avg score: 0.857

Jealous 21 Women Supper Zipped Ankle Bla:
  Before: 10, After: 9
  Avg score: 0.667

✓ Filtering applied
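
The matching rule used above can be checked by hand on a few cases. The sketch below re-implements it in plain Python, with `None` standing in for a missing (NaN) attribute; the function name and the sample attribute values are illustrative:

```python
from typing import Dict, Optional, Sequence

def attribute_match(query: Dict[str, Optional[str]],
                    product: Dict[str, Optional[str]],
                    names: Sequence[str] = ("pattern", "style")) -> float:
    """Fraction of comparable attributes that match by case-insensitive substring."""
    matches = 0
    total = 0
    for name in names:
        q_val, p_val = query.get(name), product.get(name)
        if q_val is None or p_val is None:
            continue  # attribute missing on either side: not comparable
        total += 1
        q_low, p_low = q_val.lower(), p_val.lower()
        if q_low in p_low or p_low in q_low:
            matches += 1
    return matches / total if total else 0.0

query = {"pattern": "checkered", "style": "minimalist style"}

full = attribute_match(query, {"pattern": "Checkered", "style": "minimalist style"})
half = attribute_match(query, {"pattern": "solid color", "style": "minimalist style"})
style_only = attribute_match(query, {"pattern": None, "style": "minimalist style"})
nothing = attribute_match(query, {"pattern": None, "style": None})
print(full, half, style_only, nothing)  # 1.0 0.5 1.0 0.0
```

With only two attributes the score can take just three values (0, 0.5, 1), so `threshold=0.2` amounts to requiring at least one matching attribute. A result list that survives intact with an average score of 0.000, as in the first query above, can only come from the fallback branches of `filter_by_attributes`, where a missing attribute row yields score 0.0 without applying the threshold.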


---

## PART 8: Results Analysis

In [127]:
# Analyze overlap between strategies
overlap_stats = []

for result in results_comparison:
    text_set = set(result['text_only'])
    img_set = set(result['image_only']) if result['image_only'] else set()
    mm_set = set(result['multimodal'])

    overlap_stats.append({
        'product_name': result['product_name'][:40],
        'text_only_count': len(text_set),
        'image_only_count': len(img_set),
        'multimodal_count': len(mm_set),
        'text_image_overlap': len(text_set & img_set) if img_set else 0,
        'multimodal_unique': len(mm_set - text_set - img_set)
    })

overlap_df = pd.DataFrame(overlap_stats)

print('Retrieval Strategy Overlap Analysis')
print('='*60)
print(overlap_df.to_string(index=False))
print('\nAverage Overlap:')
print(f"  Text-Image: {overlap_df['text_image_overlap'].mean():.1f} products")
print(f"  Multimodal unique: {overlap_df['multimodal_unique'].mean():.1f} products")

Retrieval Strategy Overlap Analysis
                            product_name  text_only_count  image_only_count  multimodal_count  text_image_overlap  multimodal_unique
     Mark Taylor Men White Striped Shirt               10                10                10                   0                  7
           Arrow Woman Sylvia Blue Shirt               10                10                10                   0                  7
Locomotive Men Check Ladislav Purple Shi               10                10                10                   0                  5
Highlander Men Solid Poplin Purple Shirt               10                10                10                   0                  6
Jealous 21 Women Supper Zipped Ankle Bla               10                10                10                   2                  5

Average Overlap:
  Text-Image: 0.4 products
  Multimodal unique: 6.0 products


In [128]:
# ANALYZE THE OVERLAP PROBLEM

print("=== DETAILED OVERLAP ANALYSIS ===\n")

for i, result in enumerate(results_comparison):
    print(f"\n{i+1}. {result['product_name'][:50]}")
    print(f"   Query Product ID: {result['product_id']}")

    text_set = set(result['text_only'])
    img_set = set(result['image_only'])
    mm_set = set(result['multimodal'])

    # Intersection
    text_img_overlap = text_set & img_set

    print(f"\n   Text-only top 5: {result['text_only'][:5]}")
    print(f"   Image-only top 5: {result['image_only'][:5]}")
    print(f"   Multimodal top 5: {result['multimodal'][:5]}")

    print(f"\n   Text ∩ Image: {len(text_img_overlap)} products")
    if text_img_overlap:
        print(f"   Overlapping IDs: {list(text_img_overlap)[:5]}")

    # Check product names
    print(f"\n   Text results (names):")
    for pid in result['text_only'][:3]:
        prod = products_df[products_df['id'] == pid]
        if len(prod) > 0:
            print(f"     {pid}: {prod.iloc[0]['productDisplayName'][:50]}")

    print(f"\n   Image results (names):")
    for pid in result['image_only'][:3]:
        prod = products_df[products_df['id'] == pid]
        if len(prod) > 0:
            print(f"     {pid}: {prod.iloc[0]['productDisplayName'][:50]}")

=== DETAILED OVERLAP ANALYSIS ===


1. Mark Taylor Men White Striped Shirt
   Query Product ID: 8859

   Text-only top 5: [np.int64(4943), np.int64(59297), np.int64(4944), np.int64(14052), np.int64(9463)]
   Image-only top 5: [np.int64(8859), np.int64(8861), np.int64(8853), np.int64(8872), np.int64(8866)]
   Multimodal top 5: [np.int64(8723), np.int64(16358), np.int64(43304), np.int64(10803), np.int64(9464)]

   Text ∩ Image: 0 products

   Text results (names):
     4943: Gini and Jony Boy's Kaleb White Brown Kidswear
     59297: U.S. Polo Assn. Men White & Navy Blue Shirt
     4944: Gini and Jony Boy's Kaleb White Brown Infant Kidsw

   Image results (names):
     8859: Mark Taylor Men White Striped Shirt
     8861: Mark Taylor Men White Striped Shirt
     8853: Mark Taylor Men White Striped Shirt

2. Arrow Woman Sylvia Blue Shirt
   Query Product ID: 18889

   Text-only top 5: [np.int64(22395), np.int64(12190), np.int64(34036), np.int64(13846), np.int64(22359)]
   Image-only top 5: 

---

## PART 9: Save Results

In [130]:
# Save retrieval results
EVAL_DIR = Path('v2.4.5-multimodal-rag/evaluation/results')
EVAL_DIR.mkdir(parents=True, exist_ok=True)

# Save comparison results (JSON)
with open(EVAL_DIR / 'retrieval_comparison.json', 'w') as f:
    json.dump(results_comparison, f, indent=2, default=str)
print(f'Saved: {EVAL_DIR / "retrieval_comparison.json"}')

# Save overlap analysis (CSV)
overlap_df.to_csv(EVAL_DIR / 'strategy_overlap.csv', index=False)
print(f'Saved: {EVAL_DIR / "strategy_overlap.csv"}')

# Save filtered results
filtered_results = []
for result in results_comparison:
    if 'filtered' in result:
        filtered_results.append({
            'product_id': result['product_id'],
            'product_name': result['product_name'],
            'filtered_count': len(result['filtered']),
            'avg_match_score': np.mean(result['filter_scores']) if result['filter_scores'] else 0
        })

filtered_df = pd.DataFrame(filtered_results)
filtered_df.to_csv(EVAL_DIR / 'attribute_filtering_results.csv', index=False)
print(f'Saved: {EVAL_DIR / "attribute_filtering_results.csv"}')

print('\nAll results saved')

Saved: v2.4.5-multimodal-rag/evaluation/results/retrieval_comparison.json
Saved: v2.4.5-multimodal-rag/evaluation/results/strategy_overlap.csv
Saved: v2.4.5-multimodal-rag/evaluation/results/attribute_filtering_results.csv

All results saved
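
One detail about the JSON file: the result lists hold `np.int64` values (visible in the detailed overlap output), which `json` cannot serialize natively, so `default=str` writes them as strings such as `"4943"`. If integer ids are preferred when reloading, a converter that casts to `int` is one option. The sketch below uses a small stand-in class instead of NumPy so that it runs anywhere; `FakeInt64` is purely illustrative:

```python
import json

class FakeInt64:
    """Stand-in for numpy.int64: integer-like, but not a subclass of int."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __int__(self) -> int:
        return self.value

    def __str__(self) -> str:
        return str(self.value)

record = {"product_id": 8859, "text_only": [FakeInt64(4943), FakeInt64(59297)]}

# default=str: unsupported objects are written as strings.
as_strings = json.loads(json.dumps(record, default=str))

# default=int: unsupported objects are cast to int (fails for non-integer objects).
as_ints = json.loads(json.dumps(record, default=int))

print(as_strings["text_only"], as_ints["text_only"])
```

The trade-off is that `default=int` raises an error for objects that cannot be cast to an integer, whereas `default=str` never fails but changes the type of the ids.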


---

## Summary

In [131]:
print('='*60)
print('DAY 3: MULTIMODAL RETRIEVAL COMPLETE')
print('='*60)

print('\nCompleted:')
print(f'  ✓ FAISS indices loaded (text + image)')
print(f'  ✓ MultiModalRetriever class implemented')
print(f'  ✓ 3 retrieval strategies tested')
print(f'  ✓ Fusion with α={retriever.alpha} (70% text, 30% image)')
print(f'  ✓ Attribute-based filtering')
print(f'  ✓ {len(results_comparison)} queries compared')

print('\nKey Findings:')
print(f'  - Text-Image overlap: {overlap_df["text_image_overlap"].mean():.1f} products')
print(f'  - Multimodal adds: {overlap_df["multimodal_unique"].mean():.1f} unique products')
print(f'  - Attribute filtering: {filtered_df["avg_match_score"].mean():.3f} avg match')

print('\nOutput Files:')
print('  - retrieval_comparison.json')
print('  - strategy_overlap.csv')
print('  - attribute_filtering_results.csv')

print('\nNext Steps (Day 4):')
print('  1. Load v2.2 RAG pipeline')
print('  2. Create VisualRAGPipeline class')
print('  3. Generate visual-aware prompts')
print('  4. Test with multimodal retrieval')
print('  5. Response quality check')

print('='*60)

DAY 3: MULTIMODAL RETRIEVAL COMPLETE

Completed:
  ✓ FAISS indices loaded (text + image)
  ✓ MultiModalRetriever class implemented
  ✓ 3 retrieval strategies tested
  ✓ Fusion with α=0.7 (70% text, 30% image)
  ✓ Attribute-based filtering
  ✓ 5 queries compared

Key Findings:
  - Text-Image overlap: 0.4 products
  - Multimodal adds: 6.0 unique products
  - Attribute filtering: 0.612 avg match

Output Files:
  - retrieval_comparison.json
  - strategy_overlap.csv
  - attribute_filtering_results.csv

Next Steps (Day 4):
  1. Load v2.2 RAG pipeline
  2. Create VisualRAGPipeline class
  3. Generate visual-aware prompts
  4. Test with multimodal retrieval
  5. Response quality check
