<a href="https://colab.research.google.com/github/MarcoFierimonte/virtual-try-off/blob/main/Python_Script_for_AI_Fashion_Flat_Lay.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

For **Path 2 (Serverless GPU / API)**, you move away from all-in-one consumer apps and build your own pipeline on hosted AI model APIs. This is the "Developer Way": you control each step of the output and pay only for what you run.

I’ve structured this script using **Replicate** (a popular platform for hosted models), but the logic works similarly for Fal.ai.

### Prerequisites

You will need an API token from your [Replicate account](https://replicate.com/account).

```bash
pip install replicate
export REPLICATE_API_TOKEN=your_token_here

```
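The `replicate` client reads the token from the `REPLICATE_API_TOKEN` environment variable. In a notebook such as Colab, an `export` run through a `!` shell command does not carry over to the Python kernel, so it can help to check the variable from Python before calling any model. The helper below is a minimal sketch; `get_replicate_token` is a hypothetical name, not part of the `replicate` package.

```python
import os


def get_replicate_token(env=None):
    """Return the Replicate token from the environment, or raise if it is unset.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    # Treat the placeholder from the shell snippet as "not configured"
    if not token or token == "your_token_here":
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set. In a notebook, assign it with "
            "os.environ['REPLICATE_API_TOKEN'] = '...' before running the pipeline."
        )
    return token
```

Calling `get_replicate_token()` at the top of the notebook fails early with a readable message instead of an authentication error in the middle of the pipeline.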

---

### The Python Script: The "Reconstruction" Pipeline

This script performs three steps: it **segments** the garment (a background-removal model, `lucataco/remove-bg`, rather than SAM 2), **classifies** the style (Llama 3.2 Vision), and **generates** the professional flat-lay (Flux Fill).

```python
import json
import re

import replicate


def first_output(output):
    """Return a single URL string from a replicate.run() result.

    Depending on the model and client version, the result may be a URL string,
    a file object whose str() is its URL, or a list of either.
    """
    if isinstance(output, (list, tuple)):
        output = output[0]
    return str(output)


# 1. SEGMENTATION: Isolate the garment from the messy photo
def segment_garment(image_url):
    print("--- Step 1: Segmenting Garment ---")
    # This calls a background-removal model (not SAM 2): it keeps the main
    # subject and makes the rest of the image transparent
    output = replicate.run(
        "lucataco/remove-bg:95fcc2a26d3899cd6c2691c900465aaeff466285a65c14638cc5f36f34befaf1",
        input={"image": image_url}
    )
    # URL of the garment with a transparent background
    cutout_url = first_output(output)
    print(f"Isolated Garment: {cutout_url}")
    return cutout_url


# 2. CLASSIFICATION: Identify the item for better prompting
def classify_garment(image_url):
    print("--- Step 2: Classifying Style ---")
    prompt = """Analyze this clothing item. Return ONLY a JSON object with:
    {"type": "t-shirt/hoodie/etc", "color": "color", "material": "material", "fit": "oversized/slim"}"""

    output = replicate.run(
        "meta/llama-3.2-11b-vision-instruct",
        input={
            "image": image_url,
            "prompt": prompt,
            "max_new_tokens": 100
        }
    )
    # Language models return text in chunks; join them into one string
    description = "".join(str(chunk) for chunk in output)
    # The reply may wrap the JSON in prose or a code fence, so extract the
    # outermost {...} span before parsing
    match = re.search(r"\{.*\}", description, re.DOTALL)
    if match is None:
        raise ValueError(f"No JSON object in model reply: {description!r}")
    return json.loads(match.group(0))


# 3. GENERATION: Create the professional Flat-Lay
def generate_flat_lay(isolated_image_url, details):
    print("--- Step 3: Generating Flat Lay ---")
    # We use Flux Fill to "re-contextualize" the isolated shirt
    prompt = f"A professional studio flat lay of a {details['fit']} {details['color']} {details['type']} made of {details['material']}. Placed on a clean, minimalist off-white surface with soft natural lighting and elegant shadows. High-end e-commerce photography."

    output = replicate.run(
        "black-forest-labs/flux-fill-pro",
        # Check these input names and values against the model's schema on Replicate
        input={
            "image": isolated_image_url,
            "prompt": prompt,
            "guidance_scale": 30,
            "output_format": "webp"
        }
    )
    # The result may be a single file or a list; indexing a plain URL string
    # with [0] would return only its first character
    return first_output(output)


# --- EXECUTION ---
INPUT_IMAGE = "https://your-storage.com/messy_photo.jpg"

try:
    # Run the pipeline
    cutout_url = segment_garment(INPUT_IMAGE)
    garment_data = classify_garment(cutout_url)
    final_flat_lay = generate_flat_lay(cutout_url, garment_data)

    print("\n--- Success! ---")
    print(f"Your professional flat-lay is ready: {final_flat_lay}")

except Exception as e:
    print(f"Error in pipeline: {e}")
```
---

### Why this works

* **Remove-BG:** The script calls `lucataco/remove-bg`, a background-removal model, rather than SAM 2. Basic background removal often leaves "halos," so inspect the cutout along clothing edges (stitching, frayed ends); if they look rough, a SAM 2-based segmentation endpoint is a possible substitute.
* **Llama 3.2 Vision:** Rather than just saying "the shirt," this step lets the prompt say "an oversized heavyweight cotton charcoal tee." That specificity gives **Flux** more to work with when rendering texture and drape in the final image.
* **Flux Fill:** Unlike plain text-to-image generation, a **Fill** (inpainting) model keeps the shape of your original cutout and fills in the lighting and background around it, so the result looks like a real photo.
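To illustrate how the classification result feeds the generation prompt, here is a minimal sketch of a prompt builder that tolerates missing or empty fields. `build_flat_lay_prompt` and `DEFAULTS` are hypothetical names, and the fallback values are assumptions; the wording of the prompt follows Step 3 of the script.

```python
# Fallbacks used when the vision model omits a field (assumed values)
DEFAULTS = {"type": "garment", "color": "", "material": "", "fit": ""}


def build_flat_lay_prompt(details):
    """Turn the classification dict into the Step 3 prompt, skipping empty fields."""
    # Keep only fields the model actually filled in, on top of the defaults
    merged = {**DEFAULTS, **{k: v for k, v in details.items() if v}}
    words = [merged["fit"], merged["color"], merged["type"]]
    subject = " ".join(str(w) for w in words if w)
    if merged["material"]:
        subject += f" made of {merged['material']}"
    # Pick "a" or "an" so the sentence reads naturally
    article = "an" if subject[0].lower() in "aeiou" else "a"
    return (
        f"A professional studio flat lay of {article} {subject}. "
        "Placed on a clean, minimalist off-white surface with soft natural "
        "lighting and elegant shadows. High-end e-commerce photography."
    )
```

With a complete classification, the subject reads "an oversized charcoal tee made of heavyweight cotton"; with only `{"type": "hoodie"}`, it falls back to "a hoodie" instead of raising a `KeyError`.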

### Comparison of Costs (Approximate)

| Action | Model | Cost per Run |
| --- | --- | --- |
| **Segmentation** | Remove-BG | ~$0.002 |
| **Classification** | Llama 3.2 Vision | ~$0.005 |
| **Generation** | Flux.1 Fill | ~$0.03 - $0.05 |
| **Total** |  | **~$0.04 - $0.06 (under $0.10 per image)** |
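The per-image total is the sum of the three rows, which makes it easy to project the cost of a batch. The sketch below copies the approximate figures from the table; `STEP_COSTS` and `estimate_cost` are hypothetical names, and actual prices should be checked on each model's page.

```python
# Approximate per-run cost ranges in USD (low, high), copied from the table
STEP_COSTS = {
    "segmentation": (0.002, 0.002),
    "classification": (0.005, 0.005),
    "generation": (0.03, 0.05),
}


def estimate_cost(num_images, step_costs=STEP_COSTS):
    """Return the (low, high) estimated cost in USD for a batch of images."""
    low = sum(lo for lo, _ in step_costs.values()) * num_images
    high = sum(hi for _, hi in step_costs.values()) * num_images
    # Round to cents; the inputs are rough estimates anyway
    return round(low, 2), round(high, 2)
```

Under these figures, `estimate_cost(1000)` comes to roughly $37 to $57 for a thousand images.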
