<a href="https://colab.research.google.com/github/MehediAhamed/vlmrun-cookbook/blob/fashion-product-catalog-code-fix/notebooks/05_case_study_image_catalogue.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div align="center">
<p align="center" style="width: 100%;">
    <img src="https://raw.githubusercontent.com/vlm-run/.github/refs/heads/main/profile/assets/vlm-black.svg" alt="VLM Run Logo" width="80" style="margin-bottom: -5px; color: #2e3138; vertical-align: middle; padding-right: 5px;"><br>
</p>
<p align="center"><a href="https://docs.vlm.run"><b>Website</b></a> | <a href="https://docs.vlm.run/"><b>API Docs</b></a> | <a href="https://docs.vlm.run/blog"><b>Blog</b></a> | <a href="https://discord.gg/AMApC2UzVY"><b>Discord</b></a>
</p>
<p align="center">
<a href="https://discord.gg/AMApC2UzVY"><img alt="Discord" src="https://img.shields.io/badge/discord-chat-purple?color=%235765F2&label=discord&logo=discord"></a>
<a href="https://twitter.com/vlmrun"><img alt="Twitter Follow" src="https://img.shields.io/twitter/follow/vlmrun.svg?style=social&logo=twitter"></a>
</p>
</div>

Welcome to **[VLM Run Cookbooks](https://github.com/vlm-run/vlmrun-cookbook)**, a comprehensive collection of examples and notebooks demonstrating the power of structured visual understanding using the [VLM Run Platform](https://app.vlm.run).

## Case Study: Fashion Product Catalogue Analysis

This notebook demonstrates how to use VLM Run to analyze fashion product images and extract structured information including:
- Detailed product descriptions
- Product categories
- Target gender
- Seasonal classification

We'll use a sample dataset from the Fashion Product Images dataset to showcase VLM Run's retail product analysis capabilities.

### Environment Setup

To get started, install the VLM Run Python SDK and sign up for an API key on the [VLM Run App](https://app.vlm.run).
- Store the VLM Run API key under the `VLMRUN_API_KEY` environment variable.

## Prerequisites

* Python 3.9+
* VLM Run API key (get one at [app.vlm.run](https://app.vlm.run))

## Setup

First, let's install the required packages:

In [1]:
! pip install vlmrun --upgrade --quiet
! pip install vlmrun-hub --upgrade --quiet


In [2]:
! pip install datasets --quiet

In [3]:
import os
import getpass

VLMRUN_BASE_URL = os.getenv("VLMRUN_BASE_URL", "https://api.vlm.run/v1")
VLMRUN_API_KEY = os.getenv("VLMRUN_API_KEY", None)
if VLMRUN_API_KEY is None:
    VLMRUN_API_KEY = getpass.getpass()



Let's initialize the VLM Run client.

In [4]:
from vlmrun.client import VLMRun

vlm_client = VLMRun(base_url=VLMRUN_BASE_URL, api_key=VLMRUN_API_KEY)

### Load the dataset

Let's load the first 10 examples of the fashion dataset (the `train[:10]` split slice) and inspect them.

In [None]:
from datasets import load_dataset


print("Loading fashion dataset...")
ds = load_dataset("ashraq/fashion-product-images-small", split="train[:10]")
print(f"Loaded {len(ds)} images successfully")
ds

Let's convert the dataset to a pandas DataFrame so it is easier to inspect.

In [6]:
import pandas as pd

# Keep only the columns we need: the image plus the ground-truth labels
ds = ds.select_columns(["image", "productDisplayName", "gender", "masterCategory", "season"])
# Build a pandas DataFrame row by row; the image column holds decoded PIL images
df = pd.DataFrame([{**row} for row in ds])
df

| | image | productDisplayName | gender | masterCategory | season |
|---|---|---|---|---|---|
| 0 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Turtle Check Men Navy Blue Shirt | Men | Apparel | Fall |
| 1 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Peter England Men Party Blue Jeans | Men | Apparel | Summer |
| 2 | `<PIL.Image.Image image mode=L size=60x80 at 0x...` | Titan Women Silver Watch | Women | Accessories | Winter |
| 3 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Manchester United Men Solid Black Track Pants | Men | Apparel | Fall |
| 4 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Puma Men Grey T-shirt | Men | Apparel | Summer |
| 5 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Inkfruit Mens Chain Reaction T-shirt | Men | Apparel | Summer |
| 6 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Fabindia Men Striped Green Shirt | Men | Apparel | Summer |
| 7 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Jealous 21 Women Purple Shirt | Women | Apparel | Summer |
| 8 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Puma Men Pack of 3 Socks | Men | Accessories | Summer |
| 9 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Skagen Men Black Watch | Men | Accessories | Winter |


### Image -> JSON with VLM Run


Now, let's call the VLM Run API to get the predictions for the dataframe using the `vlm_client.image.generate` method. In this example, we'll use the `retail.product-catalog` domain to get the predictions.

In [7]:
# Define a function to get predictions for an image from the VLM Run API
def get_predictions(image):
    """Generate predictions for an image using the VLM Run API."""
    try:
        response = vlm_client.image.generate(
            images=[image],
            domain="retail.product-catalog"
        )
        return response.response
    except Exception as e:
        print(f"Error getting predictions for image: {e}")
        return None

# Apply the get_predictions function to every image in the dataframe
df_response = df.assign(
    response=df["image"].apply(get_predictions)
)
df_response

| | image | productDisplayName | gender | masterCategory | season | response |
|---|---|---|---|---|---|---|
| 0 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Turtle Check Men Navy Blue Shirt | Men | Apparel | Fall | `{'description': 'A young man is pictured weari...` |
| 1 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Peter England Men Party Blue Jeans | Men | Apparel | Summer | `{'description': 'The image displays a pair of ...` |
| 2 | `<PIL.Image.Image image mode=L size=60x80 at 0x...` | Titan Women Silver Watch | Women | Accessories | Winter | `{'description': 'A sleek silver-toned wristwat...` |
| 3 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Manchester United Men Solid Black Track Pants | Men | Apparel | Fall | `{'description': 'A pair of black track pants w...` |
| 4 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Puma Men Grey T-shirt | Men | Apparel | Summer | `{'description': 'A young man is pictured weari...` |
| 5 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Inkfruit Mens Chain Reaction T-shirt | Men | Apparel | Summer | `{'description': 'A smiling man is wearing a gr...` |
| 6 | `<PIL.JpegImagePlugin.JpegImageFile image mode=...` | Fabindia Men Striped Green Shirt | Men | Apparel | Summer | `{'description': 'A man models a long-sleeved, ...` |
| 7 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Jealous 21 Women Purple Shirt | Women | Apparel | Summer | `{'description': 'A woman models a short-sleeve...` |
| 8 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Puma Men Pack of 3 Socks | Men | Accessories | Summer | `{'description': 'A trio of socks is displayed,...` |
| 9 | `<PIL.Image.Image image mode=RGB size=60x80 at ...` | Skagen Men Black Watch | Men | Accessories | Winter | `{'description': 'This is a minimalist black wr...` |
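
The `get_predictions` function above catches any exception and returns `None`, so a transient network error silently leaves a gap in the results. One possible refinement is to retry a failed call a few times with exponential backoff before giving up. The sketch below is a generic helper, not part of the VLM Run SDK; the names `call_with_retries`, `max_attempts`, `base_delay`, and `sleep` are hypothetical and chosen only for illustration.

```python
import time


def call_with_retries(fn, *args, max_attempts=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying on any exception with exponential backoff.

    Hypothetical helper for illustration only. Returns None if every attempt fails,
    mirroring the None-on-error behaviour of get_predictions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            if attempt == max_attempts:
                print(f"Giving up after {attempt} attempts: {exc}")
                return None
            # Delay doubles after each failure: base_delay, 2 * base_delay, 4 * base_delay, ...
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)
```

Under these assumptions, the body of `get_predictions` could wrap its API call as `call_with_retries(vlm_client.image.generate, images=[image], domain="retail.product-catalog")` and then read `.response` from the result when it is not `None`.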


In [9]:
# Expand the response dicts into columns and display the predictions dataframe
pd.set_option("display.max_colwidth", None)
# Replace failed (None) responses with empty dicts so those rows become NaN instead of raising
responses = [r if r is not None else {} for r in df_response["response"]]
df_response_json = pd.concat([df_response[["image"]], pd.json_normalize(responses)], axis=1)
df_response_json


| | image | description | category | season | gender |
|---|---|---|---|---|---|
| 0 | `<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=60x80 at 0x7C72D6A73320>` | A young man is pictured wearing a short-sleeved plaid shirt with a dark collar. He is also wearing dark jeans. | apparel | summer | men |
| 1 | `<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=60x80 at 0x7C72DC0E2990>` | The image displays a pair of men's blue denim jeans, characterized by a slightly faded wash and a classic straight-leg fit. They are worn by a male model with a grey t-shirt and dark shoes, against a plain white background. | Apparel | spring | men |
| 2 | `<PIL.Image.Image image mode=L size=60x80 at 0x7C72DDB0E420>` | A sleek silver-toned wristwatch featuring a round white dial with minimal markings. It has a delicate silver mesh band and a polished case. | accessories | summer | women |
| 3 | `<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=60x80 at 0x7C72D6A70D10>` | A pair of black track pants with a straight fit and an orange patch or logo on the left thigh. They appear to be comfortable and suitable for casual or athletic wear. | Apparel | fall | men |
| 4 | `<PIL.Image.Image image mode=RGB size=60x80 at 0x7C72D6A77B60>` | A young man is pictured wearing a short-sleeved, collared polo shirt in a muted blue-grey color. The shirt features a small white logo on the left chest and appears to be made of a textured fabric. | Apparel | summer | men |
| 5 | `<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=60x80 at 0x7C72D6A777D0>` | A smiling man is wearing a grey short-sleeved t-shirt with a colorful graphic print on the front. The t-shirt appears to be made of a soft, casual fabric. | Apparel | summer | men |
| 6 | `<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=60x80 at 0x7C72D6A76000>` | A man models a long-sleeved, collared shirt in an olive green color, with the sleeves rolled up to the forearms. The shirt is paired with blue jeans, presenting a casual yet neat appearance. | Shirt | spring | men |
| 7 | `<PIL.Image.Image image mode=RGB size=60x80 at 0x7C72D6A78530>` | A woman models a short-sleeved collared shirt in a dark reddish-purple hue. The shirt features a V-neckline, suggesting a comfortable and casual style. | apparel | summer | women |
| 8 | `<PIL.Image.Image image mode=RGB size=60x80 at 0x7C72D6A78950>` | A trio of socks is displayed, featuring black, white, and navy/dark grey colors. These are crew-length socks, with one clearly showing a small Puma logo near the cuff. | accessories | fall | men |
| 9 | `<PIL.Image.Image image mode=RGB size=60x80 at 0x7C72D6A78CB0>` | This is a minimalist black wristwatch with a round face, featuring a monochromatic design with subtle hour markers and a smooth black strap. | watches | fall | men |
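
The predicted labels vary in casing (`apparel` vs. `Apparel`), while the dataset's own labels are capitalized (`Men`, `Apparel`, `Fall`). If you want a rough sense of how often predictions agree with the dataset labels, one option is to lowercase both sides before comparing. The sketch below uses plain Python and the first four rows shown above; the helper names `normalize` and `agreement` and the `pred_*` keys are hypothetical, introduced only for this illustration.

```python
def normalize(label):
    """Lowercase and strip a label so 'Apparel' and 'apparel' compare equal."""
    return str(label).strip().lower()


def agreement(rows, truth_key, pred_key):
    """Fraction of rows whose normalized prediction equals the normalized dataset label."""
    matches = sum(normalize(r[truth_key]) == normalize(r[pred_key]) for r in rows)
    return matches / len(rows)


# Rows 0-3 from the tables above: dataset labels alongside VLM Run predictions.
rows = [
    {"gender": "Men", "masterCategory": "Apparel", "season": "Fall",
     "pred_gender": "men", "pred_category": "apparel", "pred_season": "summer"},
    {"gender": "Men", "masterCategory": "Apparel", "season": "Summer",
     "pred_gender": "men", "pred_category": "Apparel", "pred_season": "spring"},
    {"gender": "Women", "masterCategory": "Accessories", "season": "Winter",
     "pred_gender": "women", "pred_category": "accessories", "pred_season": "summer"},
    {"gender": "Men", "masterCategory": "Apparel", "season": "Fall",
     "pred_gender": "men", "pred_category": "Apparel", "pred_season": "fall"},
]

for truth_key, pred_key in [("gender", "pred_gender"),
                            ("masterCategory", "pred_category"),
                            ("season", "pred_season")]:
    print(f"{truth_key}: {agreement(rows, truth_key, pred_key):.2f}")
```

Exact matching like this is strict: finer-grained predictions such as `Shirt` or `watches` would count as mismatches against `Apparel` or `Accessories`, so a mapping between label vocabularies may be needed before drawing conclusions.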


Now, let's define a few utilities for rendering the dataframe with images / tags.

In [10]:
from IPython.display import HTML, display
from vlmrun.common.image import encode_image

style = """
<style>
    .tag {
        padding: 4px 8px;
        border-radius: 4px;
        font-size: 12px;
        background: #f0f0f0;
        color: #666;
        font-weight: 300;
    }
</style>
"""

formatters = {
    "image": lambda x: f"<img src='{encode_image(x, format='JPEG')}' width='100'>",
    "category": lambda x: f"<span class='tag'>{x}</span>",
    "season": lambda x: f"<span class='tag'>{x}</span>",
    "gender": lambda x: f"<span class='tag'>{x}</span>",
}
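
The `image` formatter works because `encode_image` returns a string that can be placed directly in an `<img src=...>` attribute. Assuming that string is a base64 data URI (an assumption; the SDK's exact output is not shown in this notebook), the idea can be sketched with the standard library alone. The helpers `to_data_uri` and `image_tag` below are hypothetical and operate on raw image bytes rather than PIL images.

```python
import base64


def to_data_uri(raw_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a data URI usable in an <img> src attribute."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_tag(raw_bytes, width=100):
    """Build an inline <img> tag, in the same shape as the 'image' formatter above."""
    return f"<img src='{to_data_uri(raw_bytes)}' width='{width}'>"


# Placeholder bytes stand in for real JPEG data in this sketch.
print(image_tag(b"abc"))
```

Embedding images inline this way keeps the rendered HTML self-contained, at the cost of a larger notebook output for big images.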

Display the dataframe with images and VLM Run predictions.

In [12]:
display(HTML(style + df_response_json.to_html(formatters=formatters, escape=False)))

| | image | description | category | season | gender |
|---|---|---|---|---|---|
| 0 | | A young man is pictured wearing a short-sleeved plaid shirt with a dark collar. He is also wearing dark jeans. | apparel | summer | men |
| 1 | | The image displays a pair of men's blue denim jeans, characterized by a slightly faded wash and a classic straight-leg fit. They are worn by a male model with a grey t-shirt and dark shoes, against a plain white background. | Apparel | spring | men |
| 2 | | A sleek silver-toned wristwatch featuring a round white dial with minimal markings. It has a delicate silver mesh band and a polished case. | accessories | summer | women |
| 3 | | A pair of black track pants with a straight fit and an orange patch or logo on the left thigh. They appear to be comfortable and suitable for casual or athletic wear. | Apparel | fall | men |
| 4 | | A young man is pictured wearing a short-sleeved, collared polo shirt in a muted blue-grey color. The shirt features a small white logo on the left chest and appears to be made of a textured fabric. | Apparel | summer | men |
| 5 | | A smiling man is wearing a grey short-sleeved t-shirt with a colorful graphic print on the front. The t-shirt appears to be made of a soft, casual fabric. | Apparel | summer | men |
| 6 | | A man models a long-sleeved, collared shirt in an olive green color, with the sleeves rolled up to the forearms. The shirt is paired with blue jeans, presenting a casual yet neat appearance. | Shirt | spring | men |
| 7 | | A woman models a short-sleeved collared shirt in a dark reddish-purple hue. The shirt features a V-neckline, suggesting a comfortable and casual style. | apparel | summer | women |
| 8 | | A trio of socks is displayed, featuring black, white, and navy/dark grey colors. These are crew-length socks, with one clearly showing a small Puma logo near the cuff. | accessories | fall | men |
| 9 | | This is a minimalist black wristwatch with a round face, featuring a monochromatic design with subtle hour markers and a smooth black strap. | watches | fall | men |


## Additional Resources
- [VLM Run Documentation](https://docs.vlm.run)
- [API Reference](https://docs.vlm.run/)
- [More Examples](https://github.com/vlm-run/vlmrun-cookbook)
- [Fashion Dataset](https://huggingface.co/datasets/ashraq/fashion-product-images-small)