Here’s a straightforward guide to fine-tuning or adapting a model to make your fashion bot more "Pinterest-trendy":

✅ Goal: Make your fashion bot recommend outfits closer to Pinterest trends
You have two main options:

Option 1: Fine-Tune a Model (Full or LoRA)
🔧 Use case: You have thousands of Pinterest-style fashion prompts & outputs (text or image-text)
1. Gather Data
Structure your dataset like this:

json
{
  "messages": [
    {"role": "user", "content": "What should I wear to a brunch in spring?"},
    {"role": "assistant", "content": "Try a flowy floral midi dress with white sneakers and a small woven handbag—Pinterest-perfect vibes."}
  ]
}
✅ Get ~500–5000+ examples that reflect trendy, Pinterest-style responses.
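Before uploading, it can help to sanity-check the file locally. The sketch below assumes one JSON object per line in the chat format shown above; `validate_chat_jsonl` and the `ALLOWED_ROLES` set are hypothetical names for illustration, not part of any OpenAI tooling.

```python
import json

# Assumption: only these roles are expected in the training data.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_jsonl(lines):
    """Return a list of (line_number, problem) tuples for chat-format JSONL lines."""
    problems = []
    for n, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((n, f"invalid JSON: {exc.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((n, "missing or empty 'messages' list"))
            continue
        for msg in messages:
            if not isinstance(msg, dict) or msg.get("role") not in ALLOWED_ROLES:
                problems.append((n, "message with unknown or missing role"))
                break
            if not isinstance(msg.get("content"), str):
                problems.append((n, "message content is not a string"))
                break
        else:
            # Only reached when every message passed the checks above.
            if messages[-1]["role"] != "assistant":
                problems.append((n, "last message is not from the assistant"))
    return problems


# Made-up sample lines: one valid record, one without a reply, one that is not JSON.
sample = [
    '{"messages": [{"role": "user", "content": "Brunch outfit?"}, '
    '{"role": "assistant", "content": "A floral midi dress with white sneakers."}]}',
    '{"messages": [{"role": "user", "content": "No reply here"}]}',
    'not json',
]
for line_no, problem in validate_chat_jsonl(sample):
    print(line_no, problem)
```

In practice the same function could be fed `open("dataset.jsonl", encoding="utf-8")` instead of the in-memory sample.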

2. Upload & Fine-Tune
Use OpenAI's CLI or API:

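bash
# The legacy `fine_tunes` commands only accepted base models; chat models such as
# gpt-3.5-turbo go through the fine-tuning jobs endpoint. With the openai>=1.0 CLI
# the flow looks roughly like this (confirm the flags with `openai api --help`):
openai api files.create -f dataset.jsonl -p fine-tune
openai api fine_tuning.jobs.create -F <FILE_ID> -m gpt-3.5-turbo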
Docs: https://platform.openai.com/docs/guides/fine-tuning

⏳ Limitations
Fine-tuning is limited to specific models (GPT-3.5-turbo here, not GPT-4 at the time of writing); check the docs for the current list

Cost can add up for large datasets

Not ideal if you want to react to fast-changing Pinterest trends

Option 2: Use Retrieval + Reranking (Smarter Way for Trends)
✅ Use case: Trends change fast → don’t retrain, retrieve and rank from live Pinterest-style data
🧩 Setup:
Scrape or collect Pinterest outfit descriptions (text + maybe image captions)

Embed them into a vector database (like ChromaDB, Pinecone, Weaviate)

User asks a question → fetch top 10 relevant outfits → rerank using a model

python
# User: "What’s a trendy outfit for spring 2025 brunch?"

# Backend:
# 1. Embed the query
# 2. Search the vector DB
# 3. Return the top 10 Pinterest-style outfits
# 4. Use GPT-4 to rerank or summarize them
🔁 Why better:
Stay current with trend data

Easily swap out old data for new Pinterest scrapes

No retraining required
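The retrieve-then-rerank flow can be sketched without any external services. Everything here is illustrative: the three-dimensional embeddings are made up, `pinned_on` is a hypothetical metadata field, and sorting by recency merely stands in for the model-based reranking step described above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, items, k=10):
    """Stage 1: return the k items whose embeddings are closest to the query."""
    ranked = sorted(items, key=lambda it: cosine(query_vec, it["embedding"]), reverse=True)
    return ranked[:k]


def rerank(candidates, score_fn, top_n=3):
    """Stage 2: reorder the retrieved candidates with a second scoring function."""
    return sorted(candidates, key=score_fn, reverse=True)[:top_n]


# Toy catalog with made-up embeddings and dates.
catalog = [
    {"text": "Floral midi dress with white sneakers", "embedding": [0.9, 0.1, 0.0], "pinned_on": "2025-03-01"},
    {"text": "Wool coat with leather boots", "embedding": [0.0, 0.2, 0.9], "pinned_on": "2025-01-10"},
    {"text": "Linen shirt dress with woven bag", "embedding": [0.8, 0.3, 0.1], "pinned_on": "2025-04-12"},
]

query_vec = [1.0, 0.2, 0.0]  # stands in for the embedded user query
candidates = retrieve(query_vec, catalog, k=2)
# ISO dates sort correctly as strings, so the newest pin wins the rerank.
best = rerank(candidates, score_fn=lambda it: it["pinned_on"], top_n=1)
print(best[0]["text"])
```

Keeping the two stages separate means the reranker can later be swapped for a model call without touching the retrieval code.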

✨ Bonus: Vision-Augmented Fashion Bot
If you want it to see Pinterest images and learn from them:

Use GPT-4-vision or a CLIP-based model

Extract captions or keywords from Pinterest image descriptions

Pair with outfit metadata (color, season, vibe)

🚀 Recommendation:
If you're doing fashion and trends:

❗ Use retrieval-based method with GPT-4 reranking for freshness
➕ Optionally fine-tune GPT-3.5-turbo on your branded style if needed


In [3]:
%%capture
!pip install gdown
!gdown "1igAuIEW_4h_51BG1o05WS0Q0-Cp17_-t&confirm=t"
!unzip data


In [1]:
%%capture
!pip install -U fashion-clip

In [4]:
from collections import Counter

import numpy as np
import pandas as pd
from PIL import Image
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import *
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

from fashion_clip.fashion_clip import FashionCLIP

In [5]:
%%capture
fclip = FashionCLIP('fashion-clip')

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.


In [6]:
articles = pd.read_csv("data_for_fashion_clip/articles.csv")

# drop items that have the same description
subset = articles.drop_duplicates("detail_desc").copy()

# remove items of unknown category
subset = subset[~subset["product_group_name"].isin(["Unknown"])]

# FashionCLIP has a limit of 77 tokens; to play it safe, keep only descriptions of 5 to 39
# whitespace-separated words (a rough proxy for the token count)
subset = subset[subset["detail_desc"].apply(lambda x : 4 < len(str(x).split()) < 40)]

# We also drop product types that occur 10 times or fewer in this subset of the data
most_frequent_product_types = [k for k, v in Counter(subset["product_type_name"].tolist()).items() if v > 10]
subset = subset[subset["product_type_name"].isin(most_frequent_product_types)]

# lots of data here, but we will use only the descriptions and a couple of other columns
subset.head(10)

Unnamed: 0,article_id,product_code,prod_name,product_type_no,product_type_name,product_group_name,graphical_appearance_no,graphical_appearance_name,colour_group_code,colour_group_name,...,department_name,index_code,index_name,index_group_no,index_group_name,section_no,section_name,garment_group_no,garment_group_name,detail_desc
0,108775044,108775,Strap top,253,Vest top,Garment Upper body,1010016,Solid,10,White,...,Jersey Basic,A,Ladieswear,1,Ladieswear,16,Womens Everyday Basics,1002,Jersey Basic,Jersey top with narrow shoulder straps.
1,176754003,176754,2 Row Braided Headband (1),74,Hair/alice band,Accessories,1010016,Solid,17,Yellowish Brown,...,Hair Accessories,C,Ladies Accessories,1,Ladieswear,66,Womens Small accessories,1019,Accessories,Two-strand hairband with braids in imitation s...
3,189634031,189634,Long Leg Leggings,273,Leggings/Tights,Garment Lower body,1010016,Solid,93,Dark Green,...,Basic 1,D,Divided,2,Divided,51,Divided Basics,1002,Jersey Basic,Leggings in stretch jersey with an elasticated...
4,194270044,194270,HELENA 2-pack tanktop,253,Vest top,Garment Upper body,1010016,Solid,51,Light Pink,...,Young Girl Jersey Basic,I,Children Sizes 134-170,4,Baby/Children,79,Girls Underwear & Basics,1002,Jersey Basic,Tops in soft organic cotton jersey.
5,203027047,203027,Linni tee (1),255,T-shirt,Garment Upper body,1010017,Stripe,10,White,...,Basic 1,D,Divided,2,Divided,51,Divided Basics,1002,Jersey Basic,Short-sleeved top in jersey with sewn-in turn-...
6,212042070,212042,Mimmi sneaker,94,Sneakers,Shoes,1010016,Solid,10,White,...,Divided Shoes,D,Divided,2,Divided,52,Divided Accessories,1020,Shoes,Cotton trainers with closed lacing and a loop ...
8,215303001,215303,Coolio sunglasses,81,Sunglasses,Accessories,1010016,Solid,17,Yellowish Brown,...,Sunglasses,C,Ladies Accessories,1,Ladieswear,66,Womens Small accessories,1019,Accessories,Sunglasses with plastic frames and UV-protecti...
9,216081011,216081,Norling Knit,245,Cardigan,Garment Upper body,1010010,Melange,8,Dark Grey,...,Tops Knitwear DS,D,Divided,2,Divided,58,Divided Selected,1003,Knitwear,Cardigan in a bouclé knit made from a wool ble...
10,218829015,218829,Paris glove.,71,Gloves,Accessories,1010016,Solid,17,Yellowish Brown,...,Gloves/Hats,C,Ladies Accessories,1,Ladieswear,65,Womens Big accessories,1019,Accessories,"Gloves in soft, supple leather. Lined."
11,228257001,228257,20 den 2p Tights,304,Underwear Tights,Socks & Tights,1010016,Solid,9,Black,...,Tights basic,B,Lingeries/Tights,1,Ladieswear,62,"Womens Nightwear, Socks & Tigh",1021,Socks and Tights,Tights with an elasticated waist. 20 denier.


Vectorization of Clothing Inputs (note: the image inputs will need to be adjusted)

In [7]:
from PIL import Image
import torch

# Load FashionCLIP
#from fashion_clip.fashion_clip import FashionCLIP
#fclip = FashionCLIP('fashion-clip')


In [8]:
pip install chromadb



Connect to ChromaDB

In [None]:
#pip install chromadb

import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(name="clothing_image_embeddings")

for idx, row in articles.iterrows():
    try:
        # Load and preprocess image ** replace this section with the
        image_path = f"images/{row['article_id']}.jpg"  # adjust path based on your folder structure
        image = Image.open(image_path).convert("RGB")

        # Generate image embedding
        embedding = fclip.encode_images([image], batch_size=1)[0].tolist()  # Convert to list for ChromaDB

        # Construct metadata
        metadata = {
            "category": row["product_type_name"],
            "season": str(row.get("season", "unknown")),  # If season exists
            "gender": row.get("index_name", "unknown")
        }

        # Add to ChromaDB
        collection.add(
            embeddings=[embedding],
            documents=[row["detail_desc"]],
            metadatas=[metadata],
            ids=[f"item_{row['article_id']}"]
        )

    except Exception as e:
        print(f"Skipping item {row['article_id']} due to error: {e}")


Skipping item 108775044 due to error: [Errno 2] No such file or directory: 'images/108775044.jpg'
Skipping item 176754003 due to error: [Errno 2] No such file or directory: 'images/176754003.jpg'
Skipping item 176754019 due to error: [Errno 2] No such file or directory: 'images/176754019.jpg'
Skipping item 189634031 due to error: [Errno 2] No such file or directory: 'images/189634031.jpg'
Skipping item 194270044 due to error: [Errno 2] No such file or directory: 'images/194270044.jpg'
Skipping item 203027047 due to error: [Errno 2] No such file or directory: 'images/203027047.jpg'
Skipping item 212042070 due to error: [Errno 2] No such file or directory: 'images/212042070.jpg'
Skipping item 214844001 due to error: [Errno 2] No such file or directory: 'images/214844001.jpg'
Skipping item 215303001 due to error: [Errno 2] No such file or directory: 'images/215303001.jpg'
Skipping item 216081011 due to error: [Errno 2] No such file or directory: 'images/216081011.jpg'
Skipping item 218829
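The loop above encodes and inserts one row at a time, which gets slow on a full catalog. A minimal sketch of batching follows; `chunked` is a hypothetical helper, the rows are made up, and the FashionCLIP and ChromaDB calls are left as comments because they assume that `encode_text` returns a NumPy array and that the `fclip` and `collection` objects from this notebook are in scope.

```python
def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


# Made-up rows standing in for DataFrame records.
rows = [{"article_id": i, "detail_desc": f"item {i}"} for i in range(7)]

for batch in chunked(rows, size=3):
    ids = [f"text_{r['article_id']}" for r in batch]
    texts = [r["detail_desc"] for r in batch]
    # In the notebook, one encode call and one insert per batch would go here, e.g.:
    # embeddings = fclip.encode_text(texts, batch_size=len(texts)).tolist()
    # collection.add(ids=ids, documents=texts, embeddings=embeddings)
    print(len(batch), ids[0])
```

With a DataFrame, `df.to_dict("records")` would supply the rows; for images, missing files would need to be filtered out of each batch before encoding.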

Querying the db

In [None]:
# Search with an uploaded image
query_image = Image.open("user_upload.jpg").convert("RGB")
query_embedding = fclip.encode_images([query_image], batch_size=1)[0].tolist()

results = collection.query(
    query_embeddings=[query_embedding],
    n_results=5,
    # Optional metadata filtering; ChromaDB requires "$and" to combine several conditions
    where={"$and": [{"season": "summer"}, {"gender": "Ladieswear"}]}
)
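ChromaDB returns query results as nested lists, with one inner list per query embedding. A small sketch of flattening the first query's hits into records is shown below; `flatten_results` is a hypothetical helper, and the `example` dict is hand-written to imitate the result shape (the ids, texts, and distances are made up, not real output).

```python
def flatten_results(results):
    """Flatten the hits for the first query into a list of dicts."""
    ids = results["ids"][0]
    docs = results["documents"][0]
    metas = results["metadatas"][0]
    dists = results["distances"][0]
    return [
        {"id": i, "description": d, "metadata": m, "distance": dist}
        for i, d, m, dist in zip(ids, docs, metas, dists)
    ]


# Hand-written stand-in for a collection.query(...) result with two hits.
example = {
    "ids": [["item_1", "item_2"]],
    "documents": [["Jersey top with narrow shoulder straps.", "Cardigan in a knit wool blend."]],
    "metadatas": [[{"category": "Vest top"}, {"category": "Cardigan"}]],
    "distances": [[0.12, 0.34]],
}

for hit in flatten_results(example):
    print(hit["id"], hit["metadata"]["category"], hit["distance"])
```

Smaller distances mean closer matches, so the first record is the best hit.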

If we don't have image data available, we can also use encode_text...

I tested this one because I have text data available; it includes a sample AI search.

In [10]:
from fashion_clip.fashion_clip import FashionCLIP
import chromadb
import pandas as pd

client = chromadb.PersistentClient(path="./chroma_db")

# Load FashionCLIP
fclip = FashionCLIP('fashion-clip')

# Load and filter your dataset (if not done already)
df = pd.read_csv("data_for_fashion_clip/articles.csv")
df = df.drop_duplicates("detail_desc")
df = df[~df["product_group_name"].isin(["Unknown"])]
df = df[df["detail_desc"].apply(lambda x: 4 < len(str(x).split()) < 40)]
most_frequent = df["product_type_name"].value_counts()[df["product_type_name"].value_counts() > 10].index
df = df[df["product_type_name"].isin(most_frequent)].copy()

df = df.head(100)

#def encode_batch(example):
    #return {
        #"embedding": fclip.encode_text([example["detail_desc"]], batch_size=1)[0].tolist()
    #}

# 4. Apply it to your dataset
#df = df.map(encode_batch)

# Initialize ChromaDB
collection = client.get_or_create_collection(name="clothing_text_embeddings")

# Batch loop: Encode and insert into ChromaDB
for idx, row in df.iterrows():
    try:
        text = row["detail_desc"]
        text_embedding = fclip.encode_text([text], batch_size = 1)[0].tolist()

        metadata = {
            "category": row["product_type_name"],
            "group": row["product_group_name"],
            "gender": row["index_name"],
            "id": str(row["article_id"])
        }

        collection.add(
            embeddings=[text_embedding],
            documents=[text],
            metadatas=[metadata],
            ids=[f"text_{row['article_id']}"]
        )

    except Exception as e:
        print(f"Error embedding item {row['article_id']}: {e}")

Map:   0%|          | 0/1 [00:00<?, ? examples/s]

100%|██████████| 1/1 [00:00<00:00,  7.58it/s]



In [11]:
pip install gradio



In [12]:
import gradio as gr

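def search_clothes(user_query):
    # Skeleton of the search flow: parse_query, get_embedding, search_db and format_results
    # are placeholders that are not defined at this point; the next cell implements the same flow inline
    filters = parse_query(user_query)  # GPT or rule-based
    vector = get_embedding(user_query)
    results = search_db(vector, filters)
    return format_results(results)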

gr.Interface(fn=search_clothes, inputs="text", outputs="html").launch()


It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://ae411f36487032d6fc.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




AI Search Interface

In [13]:
# Dummy parser for now (can use GPT later)
# Note: the text collection above stores no "season" metadata, so a season filter matches nothing there
def parse_query(text):
    text = text.lower()
    filters = {}
    if "summer" in text: filters["season"] = "summer"
    if "winter" in text: filters["season"] = "winter"
    if "dress" in text: filters["category"] = "Dress"
    return filters

# Search function
def search_clothes(user_query):
    try:
        # encode the query into a vector embedding
        query_vector = fclip.encode_text([user_query], batch_size = 1)[0].tolist()
        # look for known keywords
        filters = parse_query(user_query)

        # ChromaDB takes a single condition as a plain dict, but several conditions must be combined with "$and"
        if len(filters) > 1:
            where = {"$and": [{key: value} for key, value in filters.items()]}
        else:
            where = filters or None

        # Vector search: find the stored vectors closest to the query vector
        results = collection.query(
            query_embeddings=[query_vector],
            n_results=5,
            where=where
        )

        # Format results
        items = results["documents"][0]
        metadatas = results["metadatas"][0]
        display = ""
        for item, meta in zip(items, metadatas):
            display += f"<b>{meta.get('category', 'Item')}</b>: {item}<br><i>{meta}</i><br><br>"
        return display or "No results found."

    except Exception as e:
        return f"Error: {str(e)}"

# Launch app
gr.Interface(fn=search_clothes, inputs="text", outputs="html", title="AI Fashion Search").launch()


It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://dea8bd7716900f7909.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


