OpenTryOn: Open-source AI toolkit for fashion tech and virtual try-on

OpenTryOn is an open-source AI toolkit designed for fashion technology and virtual try-on applications. This project provides a comprehensive suite of tools for garment segmentation, human parsing, pose estimation, and virtual try-on using state-of-the-art diffusion models.

📚 Documentation: Comprehensive documentation is available at https://tryonlabs.github.io/opentryon/

Star History

🎯 Features

Virtual Try-On:
- Amazon Nova Canvas virtual try-on using AWS Bedrock
- Kling AI virtual try-on using Kolors API
- Segmind Try-On Diffusion API integration
- Advanced diffusion-based virtual try-on capabilities using TryOnDiffusion
Image Generation:
- Nano Banana (Gemini 2.5 Flash Image) for fast, efficient image generation
- Nano Banana Pro (Gemini 3 Pro Image Preview) for advanced 4K image generation with search grounding
- FLUX.2 [PRO] high-quality image generation with text-to-image, image editing, and multi-image composition
- FLUX.2 [FLEX] flexible image generation with advanced controls (guidance, steps, prompt upsampling)
Datasets Module:
- Fashion-MNIST dataset loader with automatic download
- VITON-HD dataset loader with lazy loading via PyTorch DataLoader
- Class-based adapter pattern for easy dataset integration
- Support for both small and large datasets
Garment Preprocessing:
- Garment segmentation using U2Net
- Garment extraction and preprocessing
- Human segmentation and parsing
Pose Estimation: OpenPose-based pose keypoint extraction for garments and humans
Outfit Generation: FLUX.1-dev LoRA-based outfit generation from text descriptions
Model Swap: Swap garments on different models
Interactive Demos: Gradio-based web interfaces for all features
Preprocessing Pipeline: Complete preprocessing pipeline for training and inference

📚 Documentation

Complete documentation for OpenTryOn is available at https://tryonlabs.github.io/opentryon/

The documentation includes:

Getting Started guides
API Reference for all modules
Usage examples and tutorials
Datasets documentation (Fashion-MNIST, VITON-HD)
API adapters documentation (Segmind, Kling AI, Amazon Nova Canvas)
Interactive demos and examples
Advanced guides and troubleshooting

Visit the documentation site to explore all features, learn how to use OpenTryOn, and get started quickly!

🚀 Installation

Prerequisites

Python 3.10
CUDA-capable GPU (recommended)
Conda or Miniconda

Step 1: Clone the Repository

git clone https://github.com/tryonlabs/opentryon.git
cd opentryon

Step 2: Create Conda Environment

conda env create -f environment.yml
conda activate opentryon

Alternatively, you can install dependencies using pip:

pip install -r requirements.txt

Step 3: Install Package

pip install -e .

Step 4: Environment Variables

Create a .env file in the project root with the following variables:

U2NET_CLOTH_SEG_CHECKPOINT_PATH=cloth_segm.pth

# AWS Credentials for Amazon Nova Canvas (optional, can use AWS CLI default profile)
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AMAZON_NOVA_REGION=us-east-1  # Optional: us-east-1, ap-northeast-1, eu-west-1
AMAZON_NOVA_MODEL_ID=amazon.nova-canvas-v1:0  # Optional

# Kling AI Credentials (required for Kling AI virtual try-on)
KLING_AI_API_KEY=your_kling_api_key
KLING_AI_SECRET_KEY=your_kling_secret_key
KLING_AI_BASE_URL=https://api-singapore.klingai.com  # Optional, defaults to Singapore endpoint

# Segmind Credentials (required for Segmind virtual try-on)
SEGMIND_API_KEY=your_segmind_api_key

# Google Gemini Credentials (required for Nano Banana image generation)
GEMINI_API_KEY=your_gemini_api_key

# BFL API Credentials (required for FLUX.2 image generation)
BFL_API_KEY=your_bfl_api_key

Notes:

Download the U2Net checkpoint file from the huggingface-cloth-segmentation repository
For Amazon Nova Canvas, ensure you have AWS credentials configured (via .env file or AWS CLI) and Nova Canvas enabled in your AWS Bedrock console
For Kling AI, obtain your API key and secret key from the Kling AI Developer Portal
For Segmind, obtain your API key from the Segmind API Portal
For Nano Banana, obtain your API key from the Google AI Studio
For FLUX.2 models, obtain your API key from BFL AI

🎮 Quick Start

Basic Preprocessing

from dotenv import load_dotenv
load_dotenv()

from tryon.preprocessing import segment_garment, extract_garment, segment_human

# Segment garment
segment_garment(
    inputs_dir="data/original_cloth",
    outputs_dir="data/garment_segmented",
    cls="upper"  # Options: "upper", "lower", "all"
)

# Extract garment
extract_garment(
    inputs_dir="data/original_cloth",
    outputs_dir="data/cloth",
    cls="upper",
    resize_to_width=400
)

# Segment human
segment_human(
    image_path="data/original_human/model.jpg",
    output_dir="data/human_segmented"
)

Command Line Interface

# Segment garment
python main.py --dataset data --action segment_garment --cls upper

# Extract garment
python main.py --dataset data --action extract_garment --cls upper

# Segment human
python main.py --dataset data --action segment_human

📖 Usage

Datasets Module

The tryon.datasets module provides easy-to-use interfaces for downloading and loading datasets commonly used in fashion and virtual try-on applications. The module uses a class-based adapter pattern for consistency and extensibility.

Supported Datasets

Fashion-MNIST: A dataset of Zalando's article images (60K training, 10K test, 10 classes, 28×28 grayscale images)
VITON-HD: A high-resolution virtual try-on dataset (11,647 training pairs, 2,032 test pairs, 1024×768 RGB images)
Subjects200K: A large-scale dataset with 200,000 paired images for subject consistency research (loaded from HuggingFace)

Quick Example

from tryon.datasets import FashionMNIST, VITONHD
from torchvision import transforms

# Fashion-MNIST: Small dataset, loads entirely into memory
fashion_dataset = FashionMNIST(download=True)
(train_images, train_labels), (test_images, test_labels) = fashion_dataset.load(
    normalize=True,
    flatten=False
)
print(f"Training set: {train_images.shape}")  # (60000, 28, 28)

# VITON-HD: Large dataset, uses lazy loading via DataLoader
viton_dataset = VITONHD(data_dir="./datasets/viton_hd", download=False)
transform = transforms.Compose([
    transforms.Resize((512, 384)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])
train_loader = viton_dataset.get_dataloader(
    split='train',
    batch_size=8,
    shuffle=True,
    transform=transform
)

# Subjects200K: Large-scale paired images from HuggingFace
from tryon.datasets import Subjects200K

subjects_dataset = Subjects200K()
hf_dataset = subjects_dataset.get_hf_dataset()
sample = hf_dataset['train'][0]
image = sample['image']  # PIL Image with paired images
collection = sample['collection']  # 'collection_1', 'collection_2', or 'collection_3'

# Get PyTorch DataLoader with quality filtering
dataloader = subjects_dataset.get_dataloader(
    batch_size=16,
    transform=transform,
    collection='collection_2',
    filter_high_quality=True
)

Documentation

For comprehensive documentation, API reference, usage examples, and best practices, see the Datasets Module Documentation.

Key Features:

✅ Automatic download for Fashion-MNIST
✅ Lazy loading for large datasets (VITON-HD)
✅ PyTorch DataLoader integration
✅ Consistent API across datasets
✅ Class-based and function-based interfaces
✅ Support for custom transforms and preprocessing

Virtual Try-On with Amazon Nova Canvas

Generate realistic virtual try-on images using Amazon Nova Canvas through AWS Bedrock. This feature combines a source image (person/model) with a reference image (garment/product) to create realistic try-on results.

Prerequisites

AWS Account Setup:
- Ensure you have an AWS account with access to Amazon Bedrock
- Enable Nova Canvas model access in the AWS Bedrock console (Model access section)
- Configure AWS credentials (via .env file or AWS CLI)
Image Requirements:
- Maximum image size: 4.1M pixels (equivalent to 2,048 x 2,048)
- Supported formats: JPG, PNG
- Both source and reference images must meet size requirements

Command Line Usage

# Basic usage with GARMENT mask (default) - Nova Canvas
python vton.py --provider nova --source data/person.jpg --reference data/garment.jpg

# Specify garment class - Nova Canvas
python vton.py --provider nova --source person.jpg --reference garment.jpg --garment-class LOWER_BODY

# Use IMAGE mask type with custom mask - Nova Canvas
python vton.py --provider nova --source person.jpg --reference garment.jpg --mask-type IMAGE --mask-image mask.png

# Use different AWS region - Nova Canvas
python vton.py --provider nova --source person.jpg --reference garment.jpg --region ap-northeast-1

# Basic usage - Kling AI
python vton.py --provider kling --source person.jpg --reference garment.jpg

# Specify model version - Kling AI
python vton.py --provider kling --source person.jpg --reference garment.jpg --model kolors-virtual-try-on-v1-5

# Basic usage - Segmind
python vton.py --provider segmind --source person.jpg --reference garment.jpg --category "Upper body"

# Specify inference parameters - Segmind
python vton.py --provider segmind --source person.jpg --reference garment.jpg --category "Lower body" --num-steps 35 --guidance-scale 2.5

# Save output to specific directory
python vton.py --provider nova --source person.jpg --reference garment.jpg --output-dir results/

Python API Usage

from dotenv import load_dotenv
load_dotenv()

from tryon.api import AmazonNovaCanvasVTONAdapter
from PIL import Image

# Initialize adapter
adapter = AmazonNovaCanvasVTONAdapter(region="us-east-1")

# Generate virtual try-on images
images = adapter.generate_and_decode(
    source_image="data/person.jpg",
    reference_image="data/garment.jpg",
    mask_type="GARMENT",  # Options: "GARMENT", "IMAGE"
    garment_class="UPPER_BODY"  # Options: "UPPER_BODY", "LOWER_BODY", "FULL_BODY", "FOOTWEAR"
)

# Save results
for idx, image in enumerate(images):
    image.save(f"outputs/vton_result_{idx}.png")

Mask Types

GARMENT (Default): Automatically detects and masks garment area based on garment class
- UPPER_BODY: Tops, shirts, jackets, hoodies
- LOWER_BODY: Pants, skirts, shorts
- FULL_BODY: Dresses, jumpsuits
- FOOTWEAR: Shoes, boots
IMAGE: Uses a custom black-and-white mask image
- Black areas = replaced with garment
- White areas = preserved from source image

Supported AWS Regions

us-east-1 (US East - N. Virginia) - Default
ap-northeast-1 (Asia Pacific - Tokyo)
eu-west-1 (Europe - Ireland)

Example: Complete Workflow

from tryon.api import AmazonNovaCanvasVTONAdapter

# Initialize adapter
adapter = AmazonNovaCanvasVTONAdapter(region="us-east-1")

# Generate try-on for upper body garment
images = adapter.generate_and_decode(
    source_image="data/person.jpg",
    reference_image="data/shirt.jpg",
    mask_type="GARMENT",
    garment_class="UPPER_BODY"
)

# Generate try-on for lower body garment
images = adapter.generate_and_decode(
    source_image="data/person.jpg",
    reference_image="data/pants.jpg",
    mask_type="GARMENT",
    garment_class="LOWER_BODY"
)

# Save all results
for idx, image in enumerate(images):
    image.save(f"outputs/result_{idx}.png")

Reference: Amazon Nova Canvas Virtual Try-On Documentation

Virtual Try-On with Kling AI

Generate realistic virtual try-on images using Kling AI's Kolors virtual try-on API. This feature combines a source image (person/model) with a reference image (garment/product) to create realistic try-on results with automatic task polling until completion.

Prerequisites

Kling AI Account Setup:
- Sign up for a Kling AI account at Kling AI Developer Portal
- Obtain your API key (access key) and secret key from the developer portal
- Configure credentials in your .env file (see Environment Variables section)
Image Requirements:
- Maximum image size: 16M pixels (equivalent to 4,096 x 4,096)
- Maximum dimension: 4,096 pixels per side
- Supported formats: JPG, PNG
- Both source and reference images must meet size requirements

Command Line Usage

# Basic usage
python vton.py --provider kling --source person.jpg --reference garment.jpg

# Specify model version
python vton.py --provider kling --source person.jpg --reference garment.jpg --model kolors-virtual-try-on-v1-5

# Use custom base URL
python vton.py --provider kling --source person.jpg --reference garment.jpg --base-url https://api-singapore.klingai.com

# Save output to specific directory
python vton.py --provider kling --source person.jpg --reference garment.jpg --output-dir results/

Python API Usage

from dotenv import load_dotenv
load_dotenv()

from tryon.api import KlingAIVTONAdapter
from PIL import Image

# Initialize adapter (uses environment variables by default)
adapter = KlingAIVTONAdapter()

# Or specify credentials directly
adapter = KlingAIVTONAdapter(
    api_key="your_api_key",
    secret_key="your_secret_key",
    base_url="https://api-singapore.klingai.com"  # Optional
)

# Generate virtual try-on images
images = adapter.generate_and_decode(
    source_image="data/person.jpg",
    reference_image="data/garment.jpg",
    model="kolors-virtual-try-on-v1-5"  # Optional, uses API default if not specified
)

# Save results
for idx, image in enumerate(images):
    image.save(f"outputs/vton_result_{idx}.png")

Model Versions

Kling AI supports multiple model versions:

kolors-virtual-try-on-v1: Original model version
kolors-virtual-try-on-v1-5: Enhanced version

If not specified, the API uses the default model version.

Asynchronous Processing

Kling AI processes virtual try-on requests asynchronously. The adapter automatically:

Submits the request and receives a task_id
Polls the task status endpoint until completion
Returns image URLs when the task succeeds
Raises errors if the task fails or times out (default timeout: 5 minutes)

You can customize polling behavior:

# Manual polling
adapter = KlingAIVTONAdapter()

# Submit task
response = adapter.generate(
    source_image="person.jpg",
    reference_image="garment.jpg"
)
# This automatically polls until completion

# Or poll manually with custom settings
task_id = "your_task_id"
image_urls = adapter.poll_task_until_complete(
    task_id=task_id,
    poll_interval=2,  # Check every 2 seconds
    max_wait_time=600  # Maximum 10 minutes
)

Example: Complete Workflow

from tryon.api import KlingAIVTONAdapter

# Initialize adapter
adapter = KlingAIVTONAdapter()

# Generate try-on
images = adapter.generate_and_decode(
    source_image="data/person.jpg",
    reference_image="data/shirt.jpg",
    model="kolors-virtual-try-on-v1-5"
)

# Save all results
for idx, image in enumerate(images):
    image.save(f"outputs/result_{idx}.png")

Supported Base URLs

https://api-singapore.klingai.com (Singapore) - Default
Other regional endpoints may be available (check Kling AI documentation)

Reference: Kling AI API Documentation

Virtual Try-On with Segmind

Generate realistic virtual try-on images using Segmind's Try-On Diffusion API. This feature combines a model image (person) with a cloth image (garment/product) to create realistic try-on results.

Prerequisites

Segmind Account Setup:
- Sign up for a Segmind account at Segmind API Portal
- Obtain your API key from the Segmind dashboard
- Configure credentials in your .env file (see Environment Variables section)
Image Requirements:
- Images can be provided as file paths, URLs, or base64-encoded strings
- Supported formats: JPG, PNG
- Both model and cloth images must be valid image files

Command Line Usage

# Basic usage
python vton.py --provider segmind --source person.jpg --reference garment.jpg --category "Upper body"

# Specify garment category
python vton.py --provider segmind --source person.jpg --reference garment.jpg --category "Lower body"

# Use custom inference parameters
python vton.py --provider segmind --source person.jpg --reference garment.jpg --category "Dress" --num-steps 35 --guidance-scale 2.5 --seed 42

# Save output to specific directory
python vton.py --provider segmind --source person.jpg --reference garment.jpg --category "Upper body" --output-dir results/

Python API Usage

from dotenv import load_dotenv
load_dotenv()

from tryon.api import SegmindVTONAdapter
from PIL import Image

# Initialize adapter (uses environment variable by default)
adapter = SegmindVTONAdapter()

# Or specify API key directly
adapter = SegmindVTONAdapter(api_key="your_api_key")

# Generate virtual try-on images
images = adapter.generate_and_decode(
    model_image="data/person.jpg",
    cloth_image="data/garment.jpg",
    category="Upper body",  # Options: "Upper body", "Lower body", "Dress"
    num_inference_steps=35,  # Optional: 20-100, default: 25
    guidance_scale=2.5,  # Optional: 1-25, default: 2
    seed=42  # Optional: -1 to 999999999999999, default: -1
)

# Save results
for idx, image in enumerate(images):
    image.save(f"outputs/vton_result_{idx}.png")

Garment Categories

Segmind supports three garment categories:

"Upper body": Tops, shirts, jackets, hoodies (default)
"Lower body": Pants, skirts, shorts
"Dress": Dresses, jumpsuits

Inference Parameters

num_inference_steps: Number of denoising steps (default: 25, range: 20-100)
- Higher values may produce better quality but take longer
guidance_scale: Scale for classifier-free guidance (default: 2, range: 1-25)
- Higher values make the model follow the input more closely
seed: Seed for reproducible results (default: -1 for random, range: -1 to 999999999999999)

Example: Complete Workflow

from tryon.api import SegmindVTONAdapter

# Initialize adapter
adapter = SegmindVTONAdapter()

# Generate try-on for upper body garment
images = adapter.generate_and_decode(
    model_image="data/person.jpg",
    cloth_image="data/shirt.jpg",
    category="Upper body"
)

# Generate try-on for lower body garment with custom parameters
images = adapter.generate_and_decode(
    model_image="data/person.jpg",
    cloth_image="data/pants.jpg",
    category="Lower body",
    num_inference_steps=35,
    guidance_scale=2.5,
    seed=42
)

# Save all results
for idx, image in enumerate(images):
    image.save(f"outputs/result_{idx}.png")

Reference: Segmind Try-On Diffusion API Documentation

Image Generation with Nano Banana

Generate high-quality images using Google's Gemini image generation models (Nano Banana and Nano Banana Pro). These models support text-to-image generation, image editing, multi-image composition, and batch generation.

Prerequisites

Google Gemini Account Setup:
- Sign up for a Google AI Studio account at Google AI Studio
- Obtain your API key from the API Keys page
- Configure credentials in your .env file (see Environment Variables section)
Model Selection:
- Nano Banana (Gemini 2.5 Flash Image): Fast, efficient, 1024px resolution - ideal for high-volume tasks
- Nano Banana Pro (Gemini 3 Pro Image Preview): Advanced, up to 4K resolution, search grounding - ideal for professional production

Command Line Usage

# Text-to-image with Nano Banana (Fast)
python image_gen.py --provider nano-banana --prompt "A stylish fashion model wearing a modern casual outfit in a studio setting"

# Text-to-image with Nano Banana Pro (4K)
python image_gen.py --provider nano-banana-pro --prompt "Professional fashion photography of elegant evening wear on a runway" --resolution 4K

# Image editing
python image_gen.py --provider nano-banana --mode edit --image person.jpg --prompt "Change the outfit to a formal business suit"

# Multi-image composition
python image_gen.py --provider nano-banana --mode compose --images outfit1.jpg outfit2.jpg --prompt "Create a fashion catalog layout combining these clothing styles"

# Batch generation
python image_gen.py --provider nano-banana --batch prompts.txt --output-dir results/

Python API Usage

Nano Banana (Fast):

from dotenv import load_dotenv
load_dotenv()

from tryon.api.nano_banana import NanoBananaAdapter

# Initialize adapter
adapter = NanoBananaAdapter()

# Text-to-image generation
images = adapter.generate_text_to_image(
    prompt="A stylish fashion model wearing a modern casual outfit in a studio setting",
    aspect_ratio="16:9"  # Optional: "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"
)

# Image editing
images = adapter.generate_image_edit(
    image="person.jpg",
    prompt="Change the outfit to a formal business suit"
)

# Multi-image composition
images = adapter.generate_multi_image(
    images=["outfit1.jpg", "outfit2.jpg"],
    prompt="Create a fashion catalog layout combining these clothing styles"
)

# Batch generation
results = adapter.generate_batch([
    "A fashion model showcasing summer collection",
    "Professional photography of formal wear",
    "Casual street style outfit on a model"
])

# Save results
for idx, image in enumerate(images):
    image.save(f"outputs/generated_{idx}.png")

Nano Banana Pro (Advanced):

from tryon.api.nano_banana import NanoBananaProAdapter

# Initialize adapter
adapter = NanoBananaProAdapter()

# Text-to-image with 4K resolution
images = adapter.generate_text_to_image(
    prompt="Professional fashion photography of elegant evening wear on a runway",
    resolution="4K",  # Options: "1K", "2K", "4K"
    aspect_ratio="16:9",
    use_search_grounding=True  # Optional: Use Google Search for real-world grounding
)

# Image editing with 2K resolution
images = adapter.generate_image_edit(
    image="person.jpg",
    prompt="Change the outfit to a formal business suit",
    resolution="2K"
)

# Save results
images[0].save("result.png")

Supported Features

Text-to-Image: Generate images from text descriptions
Image Editing: Edit images using text prompts (add, remove, modify elements)
Multi-Image Composition: Combine multiple images with style transfer
Batch Generation: Generate multiple images in batch
Aspect Ratios: 10 supported aspect ratios (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)
High Resolution: Up to 4K resolution with Nano Banana Pro
Search Grounding: Real-world grounding using Google Search (Nano Banana Pro only)

Aspect Ratios

Nano Banana (1024px):

"1:1" (1024x1024)
"16:9" (1344x768)
"9:16" (768x1344)
And 7 more options

Nano Banana Pro (1K/2K/4K):

Same aspect ratios with resolution-specific dimensions
"1K": Standard resolution
"2K": High resolution
"4K": Ultra-high resolution

Reference: Gemini Image Generation Documentation

Image Generation with FLUX.2

Generate high-quality images using FLUX.2 [PRO] and FLUX.2 [FLEX] models from BFL AI. These models support text-to-image generation, image editing, multi-image composition, and advanced controls.

Prerequisites

BFL AI Account Setup:
- Sign up for a BFL AI account at BFL AI
- Obtain your API key from the BFL AI dashboard
- Configure credentials in your .env file (see Environment Variables section)
Model Selection:
- FLUX.2 [PRO]: High-quality image generation with standard controls - ideal for most use cases
- FLUX.2 [FLEX]: Flexible generation with advanced controls (guidance scale, steps, prompt upsampling) - ideal for fine-tuned control

Command Line Usage

# Text-to-image with FLUX.2 PRO
python image_gen.py --provider flux2-pro --prompt "A professional fashion model wearing elegant evening wear" --width 1024 --height 1024

# Text-to-image with FLUX.2 FLEX (Advanced controls)
python image_gen.py --provider flux2-flex --prompt "A stylish fashion model wearing elegant evening wear" --width 1024 --height 1024 --guidance 7.5 --steps 50

# Image editing
python image_gen.py --provider flux2-pro --mode edit --image person.jpg --prompt "Change the outfit to casual streetwear"

# Multi-image composition
python image_gen.py --provider flux2-pro --mode compose --images outfit1.jpg outfit2.jpg --prompt "Combine these clothing styles into a cohesive outfit"

Python API Usage

FLUX.2 [PRO]:

from dotenv import load_dotenv
load_dotenv()

from tryon.api import Flux2ProAdapter

# Initialize adapter
adapter = Flux2ProAdapter()

# Text-to-image generation
images = adapter.generate_text_to_image(
    prompt="A professional fashion model wearing elegant evening wear on a runway",
    width=1024,
    height=1024,
    seed=42
)

# Image editing
images = adapter.generate_image_edit(
    prompt="Change the outfit to casual streetwear style",
    input_image="model.jpg",
    width=1024,
    height=1024
)

# Multi-image composition
images = adapter.generate_multi_image(
    prompt="Create a fashion catalog layout combining these clothing styles",
    images=["outfit1.jpg", "outfit2.jpg", "accessories.jpg"],
    width=1024,
    height=1024
)

# Save results
images[0].save("result.png")

FLUX.2 [FLEX]:

from tryon.api import Flux2FlexAdapter

# Initialize adapter
adapter = Flux2FlexAdapter()

# Text-to-image with advanced controls
images = adapter.generate_text_to_image(
    prompt="A stylish fashion model wearing elegant evening wear",
    width=1024,
    height=1024,
    guidance=7.5,  # Higher guidance = more adherence to prompt (1.5-10)
    steps=50,  # More steps = higher quality (default: 28)
    prompt_upsampling=True,  # Enhance prompt quality
    seed=42
)

# Image editing with advanced controls
images = adapter.generate_image_edit(
    prompt="Transform the outfit to match a vintage 1920s fashion style",
    input_image="model.jpg",
    width=1024,
    height=1024,
    guidance=8.0,
    steps=50,
    prompt_upsampling=True
)

# Save results
images[0].save("result.png")

Supported Features

Text-to-Image: Generate images from text descriptions
Image Editing: Edit images using text prompts (add, remove, modify elements)
Multi-Image Composition: Combine up to 8 images with style transfer
Custom Dimensions: Control width and height (minimum: 64 pixels)
Advanced Controls (FLEX only): Guidance scale (1.5-10), steps (default: 28), prompt upsampling
Reproducibility: Seed support for consistent results
Safety Controls: Moderation tolerance (0-5, default: 2)
Output Formats: JPEG or PNG

Key Differences: PRO vs FLEX

FLUX.2 [PRO]: Simpler API, faster generation, good for most use cases
FLUX.2 [FLEX]: Advanced controls (guidance, steps, prompt upsampling), more fine-tuned control over generation quality

Reference: FLUX.2 API Documentation

Preprocessing Functions

Segment Garment

Segments garments from images using U2Net model.

from tryon.preprocessing import segment_garment

segment_garment(
    inputs_dir="path/to/input/images",
    outputs_dir="path/to/output/segments",
    cls="upper"  # "upper", "lower", or "all"
)

Extract Garment

Extracts and preprocesses garments from images.

from tryon.preprocessing import extract_garment

extract_garment(
    inputs_dir="path/to/input/images",
    outputs_dir="path/to/output/garments",
    cls="upper",
    resize_to_width=400
)

Segment Human

Segments human subjects from images.

from tryon.preprocessing import segment_human

segment_human(
    image_path="path/to/human/image.jpg",
    output_dir="path/to/output/directory"
)

🎨 Demos

The project includes several interactive Gradio demos for easy experimentation:

Extract Garment Demo

python run_demo.py --name extract_garment

Model Swap Demo

python run_demo.py --name model_swap

Outfit Generator Demo

python run_demo.py --name outfit_generator

Each demo launches a web interface where you can interact with the models through a user-friendly UI.

📁 Project Structure

opentryon/
├── tryon/                    # Main try-on preprocessing module
│   ├── api/                 # API adapters
│   │   ├── nova_canvas.py  # Amazon Nova Canvas VTON adapter
│   │   ├── kling_ai.py     # Kling AI VTON adapter
│   │   ├── segmind.py      # Segmind Try-On Diffusion adapter
│   │   ├── nano_banana/    # Nano Banana (Gemini) image generation adapters
│   │   │   └── adapter.py  # NanoBananaAdapter and NanoBananaProAdapter
│   │   └── flux2.py        # FLUX.2 [PRO] and [FLEX] image generation adapters
│   ├── datasets/            # Dataset loaders
│   │   ├── base.py         # Base dataset interface
│   │   ├── fashion_mnist.py # Fashion-MNIST dataset
│   │   ├── viton_hd.py     # VITON-HD dataset
│   │   ├── example_usage.py # Usage examples
│   │   └── README.md       # Datasets documentation
│   ├── preprocessing/        # Preprocessing utilities
│   │   ├── captioning/       # Image captioning
│   │   ├── sam2/            # SAM2 segmentation
│   │   ├── u2net/           # U2Net segmentation models
│   │   └── utils.py         # Utility functions
│   └── models/              # Model implementations
│       └── ootdiffusion/    # OOTDiffusion model
├── tryondiffusion/          # TryOnDiffusion implementation
│   ├── diffusion.py         # Diffusion model
│   ├── network.py           # Network architecture
│   ├── trainer.py           # Training utilities
│   ├── pre_processing/      # Preprocessing for training
│   └── utils/               # Utility functions
├── demo/                    # Interactive demos
│   ├── extract_garment/     # Garment extraction demo
│   ├── model_swap/          # Model swap demo
│   └── outfit_generator/    # Outfit generator demo
├── scripts/                 # Installation scripts
├── main.py                  # Main CLI entry point
├── vton.py                  # Virtual try-on CLI (Amazon Nova Canvas, Kling AI, Segmind)
├── image_gen.py             # Image generation CLI (Nano Banana, FLUX.2)
├── run_demo.py              # Demo launcher
├── requirements.txt         # Python dependencies
└── environment.yml          # Conda environment

🗺️ TryOnDiffusion: Roadmap

Based on the TryOnDiffusion paper:

~~Prepare initial implementation~~
Test initial implementation with small dataset (VITON-HD)
Gather sufficient data and compute resources
Prepare and train final implementation
Publicly release parameters

🤝 Contributing

We welcome contributions! Please follow these steps:

1. Open an Issue

We recommend opening an issue (if one doesn't already exist) and discussing your intended changes before making any modifications. This helps us provide feedback and confirm the planned changes.

2. Fork and Set Up

Fork the repository
Set up the environment using the installation instructions above
Install dependencies
Make your changes

3. Create Pull Request

Create a pull request to the main branch from your fork's branch. Please ensure:

Your code follows the project's style guidelines
You've tested your changes
Documentation is updated if needed

4. Review Process

Once the pull request is created, we will review the code changes and merge the pull request as soon as possible.

Writing Documentation

If you're interested in improving documentation, you can:

Add content to README.md
Create new documentation files as needed
Submit a pull request with your documentation improvements

For detailed contribution guidelines, see CONTRIBUTING.md.

📝 Requirements

Key dependencies include:

PyTorch (== 2.1.2)
torchvision (== 0.16.2)
diffusers (== 0.29.2)
transformers (== 4.42.4)
opencv-python (== 4.8.1.78)
scikit-image (== 0.22.0)
numpy (== 1.26.4)
einops (== 0.7.0)
requests (>= 2.31.0)
PyJWT (>= 2.10.1)
boto3 (== 1.40.64)
python-dotenv (== 1.0.1)
google-genai (== 1.52.0)

See requirements.txt or environment.yml for the complete list of dependencies.

📚 Additional Resources

TryOnDiffusion Paper: arXiv:2306.08276
Amazon Nova Canvas: AWS Blog Post
Kling AI: Kling AI API Documentation
Segmind: Segmind Try-On Diffusion API
Nano Banana: Gemini Image Generation Documentation
FLUX.2: BFL AI Documentation
Discord Community: Join our Discord
Outfit Generator Model: FLUX.1-dev LoRA Outfit Generator

📄 License

All material is made available under Creative Commons BY-NC 4.0.

You can use the material for non-commercial purposes, as long as you:

Give appropriate credit by citing our original GitHub repository
Indicate any changes that you've made to the code

Made with ❤️ by TryOn Labs

Name		Name	Last commit message	Last commit date
Latest commit History 144 Commits
.github/workflows		.github/workflows
demo		demo
docs		docs
fashion-prompt-builder		fashion-prompt-builder
scripts		scripts
tryon		tryon
tryondiffusion		tryondiffusion
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
README.md		README.md
ROADMAP.md		ROADMAP.md
caption_images.py		caption_images.py
environment.yml		environment.yml
image_gen.py		image_gen.py
main.py		main.py
requirements.txt		requirements.txt
run_demo.py		run_demo.py
run_ootd.py		run_ootd.py
setup.py		setup.py
vton.py		vton.py

tryonlabs/opentryon

Folders and files

Latest commit

History

Repository files navigation

OpenTryOn: Open-source AI toolkit for fashion tech and virtual try-on

Star History

🎯 Features

📋 Table of Contents

📚 Documentation

🚀 Installation

Prerequisites

Step 1: Clone the Repository

Step 2: Create Conda Environment

Step 3: Install Package

Step 4: Environment Variables

🎮 Quick Start

Basic Preprocessing

Command Line Interface

📖 Usage

Datasets Module

Supported Datasets

Quick Example

Documentation

Virtual Try-On with Amazon Nova Canvas

Prerequisites

Command Line Usage

Python API Usage

Mask Types

Supported AWS Regions

Example: Complete Workflow

Virtual Try-On with Kling AI

Prerequisites

Command Line Usage

Python API Usage

Model Versions

Asynchronous Processing

Example: Complete Workflow

Supported Base URLs

Virtual Try-On with Segmind

Prerequisites

Command Line Usage

Python API Usage

Garment Categories

Inference Parameters

Example: Complete Workflow

Image Generation with Nano Banana

Prerequisites

Command Line Usage

Python API Usage

Supported Features

Aspect Ratios

Image Generation with FLUX.2

Prerequisites

Command Line Usage

Python API Usage

Supported Features

Key Differences: PRO vs FLEX

Preprocessing Functions

Segment Garment

Extract Garment

Segment Human

🎨 Demos

Extract Garment Demo

Model Swap Demo

Outfit Generator Demo

📁 Project Structure

🗺️ TryOnDiffusion: Roadmap

🤝 Contributing

1. Open an Issue

2. Fork and Set Up

3. Create Pull Request

4. Review Process

Writing Documentation

📝 Requirements

📚 Additional Resources

📄 License

About

Topics

Resources

Packages