# Vibe Matcher — Prototype

This notebook runs the pipeline end to end: data prep → embeddings (TF-IDF as a stand-in) → cosine-similarity search → evaluation. Swap the TF-IDF vectors for OpenAI embeddings when you run the notebook from Colab or GitHub with API access.


In [None]:

# Vibe Matcher compact pipeline (runnable)
import time

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

products = [
    {
        "name": "Boho Dress",
        "desc": "Flowy cotton dress with earthy tones, tassels and relaxed silhouette \u2014 perfect for festival and beach vibes.",
        "tags": [
            "boho",
            "festival",
            "earthy"
        ]
    },
    {
        "name": "Urban Bomber Jacket",
        "desc": "Sleek bomber jacket with matte finish, structured shoulders, and reflective trim for street-smart, energetic urban chic.",
        "tags": [
            "urban",
            "chic",
            "energetic"
        ]
    },
    {
        "name": "Cozy Knit Cardigan",
        "desc": "Oversized knit cardigan, soft wool blend, neutral colors \u2014 cozy, warm, hygge-friendly layering piece.",
        "tags": [
            "cozy",
            "casual",
            "warm"
        ]
    },
    {
        "name": "Minimalist Slip Dress",
        "desc": "Clean lines, satin finish slip dress for minimalist and elegant evening looks.",
        "tags": [
            "minimal",
            "elegant",
            "evening"
        ]
    },
    {
        "name": "Sporty High-Tops",
        "desc": "Lightweight high-top sneakers with breathable mesh and energetic pop \u2014 made for active urban movement.",
        "tags": [
            "sporty",
            "urban",
            "active"
        ]
    },
    {
        "name": "Vintage Denim Jacket",
        "desc": "Washed denim with retro patches and relaxed fit \u2014 nostalgic, casual, and versatile.",
        "tags": [
            "vintage",
            "casual",
            "retro"
        ]
    },
    {
        "name": "Tailored Blazer",
        "desc": "Sharp tailored blazer with slim lapel \u2014 professional, polished, and modern office-ready.",
        "tags": [
            "professional",
            "polished",
            "modern"
        ]
    },
    {
        "name": "Festival Fringe Top",
        "desc": "Beaded fringe crop top with bold colors \u2014 playful, festival-ready statement piece.",
        "tags": [
            "festival",
            "playful",
            "bold"
        ]
    }
]
df = pd.DataFrame(products)

# Fit TF-IDF on the product descriptions; this is the stand-in "embedding" model.
vectorizer = TfidfVectorizer(max_features=500)
tfidf_matrix = vectorizer.fit_transform(df['desc'].tolist())

def get_embeddings(texts, method='tfidf'):
    """Return one embedding row per input text."""
    if method == 'tfidf':
        return vectorizer.transform(texts).toarray()
    elif method == 'openai':
        # Example stub: replace with an OpenAI API call in Colab.
        # Assumes the openai>=1.0 client (openai.Embedding.create was removed in 1.0).
        # from openai import OpenAI
        # client = OpenAI()  # reads OPENAI_API_KEY from the environment
        # resp = client.embeddings.create(input=texts, model="text-embedding-ada-002")
        # return [d.embedding for d in resp.data]
        raise RuntimeError("OpenAI not available here. Replace with your API call.")
    else:
        raise ValueError(f"Unknown embedding method: {method!r}")

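def search_top_k(query, k=3, embed_method='tfidf', fallback_threshold=0.35):
    """Return the top-k products for a query, plus a fallback message on weak matches."""
    t0 = time.perf_counter()
    q_emb = get_embeddings([query], method=embed_method)
    # Query and catalog vectors must come from the same embedding method.
    if embed_method == 'tfidf':
        doc_emb = tfidf_matrix.toarray()
    else:
        doc_emb = get_embeddings(df['desc'].tolist(), method=embed_method)
    sims = cosine_similarity(q_emb, doc_emb)[0]
    top_idx = sims.argsort()[::-1][:k]  # indices sorted by descending similarity
    results = [{"rank": i+1, "name": df.iloc[idx]['name'], "score": float(sims[idx])} for i, idx in enumerate(top_idx)]
    best_score = float(sims[top_idx[0]])
    fallback = None
    if best_score < fallback_threshold:
        fallback = "No strong match. Consider asking for clarification."
    latency_ms = (time.perf_counter() - t0) * 1000.0
    return {"query": query, "results": results, "best_score": best_score, "fallback": fallback, "latency_ms": latency_ms}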

# Example run
print(search_top_k("energetic urban chic", embed_method='tfidf'))
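
The overview lists an evaluation step that the cell above does not implement. Here is a minimal sketch of one way to approach it, assuming a search function that returns the same dict shape as `search_top_k`. The `evaluate` helper, the test queries, and the `stub_search` stand-in (with made-up scores) are illustrative assumptions, not part of the original pipeline. The 0.7 cutoff is the uncalibrated starting value from the reflection notes.

```python
import time

def evaluate(search_fn, queries, good_threshold=0.7):
    """Run each query once; record best score, a good/weak label, and latency."""
    rows = []
    for q in queries:
        t0 = time.perf_counter()
        out = search_fn(q)
        latency_ms = (time.perf_counter() - t0) * 1000.0
        best = out["best_score"]
        rows.append({
            "query": q,
            "top_match": out["results"][0]["name"] if out["results"] else None,
            "best_score": best,
            "label": "good" if best > good_threshold else "weak",
            "latency_ms": latency_ms,
        })
    return rows

# Hypothetical stand-in with made-up scores so this block runs on its own;
# in the notebook, pass search_top_k instead.
def stub_search(query):
    score = 0.8 if "urban" in query else 0.2
    return {"query": query,
            "results": [{"rank": 1, "name": "Urban Bomber Jacket", "score": score}],
            "best_score": score, "fallback": None}

report = evaluate(stub_search, ["energetic urban chic", "cozy weekend"])
for row in report:
    print(f'{row["query"]!r}: {row["label"]} ({row["best_score"]:.2f}), {row["latency_ms"]:.2f} ms')
```

In the notebook, `evaluate(search_top_k, [...])` would report real TF-IDF scores. These tend to sit well below 0.7 for short queries, so `good_threshold` likely needs lowering for this embedding.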


## Deliverables

- Notebook: `Vibe_Matcher_Notebook.ipynb` (this file)
- 1-paragraph intro and reflection bullets (below)

### Intro

At Nexora, customers expect rapid, personalized discovery. A lightweight 'Vibe Matcher' uses natural-language descriptions to match product aesthetics to a user's mood or 'vibe'. This prototype shows how embeddings plus vector search can surface relevant fashion items quickly, and it provides a foundation that can scale to production-grade matching with OpenAI embeddings and a vector database such as Pinecone.

### Reflection

- Swap TF-IDF for OpenAI `text-embedding-ada-002` (or a domain fine-tuned model) for richer semantic matching.
- Use a vector DB (Pinecone/Weaviate) to scale search and filter by metadata (tags, price, inventory).
- Handle cold-start and low-confidence results with dynamic fallback prompts or clarifying follow-up questions.
- Add behavioral signals (clicks, purchases) for learning-to-rank and personalization.
- Evaluate with human judgments and A/B testing; treat cosine similarity > 0.7 as a starting cutoff for 'good', then calibrate it per domain.
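
The metadata-filtering idea in the second bullet can be tried in memory before committing to a vector DB. This is a minimal sketch, assuming each item carries a vector and a tag list. `filtered_search`, the toy 2-D vectors, and the item records are hypothetical. They only mimic a filter-then-rank query and do not reflect the actual Pinecone or Weaviate APIs.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def filtered_search(query_vec, items, required_tag=None, k=3):
    """Keep items carrying required_tag (if given), then rank by cosine similarity."""
    pool = [it for it in items if required_tag is None or required_tag in it["tags"]]
    ranked = sorted(pool, key=lambda it: cosine(query_vec, it["vec"]), reverse=True)
    return [(it["name"], round(cosine(query_vec, it["vec"]), 3)) for it in ranked[:k]]

# Toy 2-D vectors, made up for illustration (not real embeddings).
items = [
    {"name": "Boho Dress", "tags": ["boho", "festival", "earthy"], "vec": [0.9, 0.1]},
    {"name": "Festival Fringe Top", "tags": ["festival", "playful", "bold"], "vec": [0.6, 0.8]},
    {"name": "Tailored Blazer", "tags": ["professional", "polished", "modern"], "vec": [0.1, 0.9]},
]

print(filtered_search([0.2, 1.0], items, required_tag="festival"))
```

Without `required_tag`, the blazer ranks first for this query vector. With the filter, it is removed before ranking, so only festival-tagged items compete.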