Wardrobe Creation Script

This notebook section downloads the clothes dataset and then samples images from its class folders to build two simulated wardrobes.



In [None]:
# Install all necessary packages for Kaggle, file manipulation, and Gemini Vision API
!pip install pandas numpy matplotlib pillow google-generativeai tqdm kagglehub --quiet

0. Download the Dataset

Run this cell to download the dataset with the kagglehub package, which stores it in a local cache directory.

In [None]:
import kagglehub
import os
from pathlib import Path
import random
import shutil

# --- Configuration ---
DATASET_REF = "ryanbadai/clothes-dataset"
DATASET_PATH = Path(kagglehub.dataset_download(DATASET_REF))
print(f" Dataset downloaded to: {DATASET_PATH}")

# Define the root for the wardrobes
WARDROBE_ROOT = Path("simulated_wardrobes")
WARDROBE_ROOT.mkdir(exist_ok=True)

# Define the input directory where the class folders are located
INPUT_DIR = DATASET_PATH / "Clothes_Dataset" 

if not INPUT_DIR.is_dir():
    # Attempt to adjust for common dataset nesting issues
    INPUT_DIR = DATASET_PATH / DATASET_REF.split("/")[-1] 
    if not INPUT_DIR.is_dir():
        print(f"Error: Could not find the main class folder. Expected it at {DATASET_PATH / 'Clothes_Dataset'} or {INPUT_DIR}.")
        print("Please manually check the downloaded structure at:", DATASET_PATH)
    else:
        print(f"Class folders found at: {INPUT_DIR}")


 Dataset downloaded to: /home/ines/.cache/kagglehub/datasets/ryanbadai/clothes-dataset/versions/1
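
Before sampling, it can help to confirm that the class folders are where the code expects them and that each one contains images. The following is a minimal sketch, not part of the original notebook: `count_images_per_class` and `IMAGE_SUFFIXES` are hypothetical names, and the extension list simply mirrors the `.jpg`/`.jpeg` filter used later in this section.

```python
from pathlib import Path

# Assumption: only these extensions count as images, matching the sampling code.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}

def count_images_per_class(input_root):
    """Return {class_folder_name: image_count} for each subfolder of input_root."""
    counts = {}
    for class_dir in sorted(Path(input_root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for p in class_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

In the notebook, `count_images_per_class(INPUT_DIR)` should list one entry per class folder; an empty result or a folder with a count of zero suggests the path or the extension filter needs adjusting.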


1. Define Wardrobe Structure

The plans below sort the 15 dataset classes into four categories (Tops, Bottoms, Outers, and Dresses), with per-category counts that match your request. The assignment of classes by gender is an educated guess: the male plan leaves out the Rok (skirt) and Gaun (dress) classes.

In [None]:
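# --- Wardrobe Blueprints ---
# Each category maps to (number of items to pick, source class folders).
# Totals: 26 items for the female wardrobe, 23 for the male wardrobe.
# Class folder names are Indonesian: Jaket = jacket, Jaket_Olahraga = sports jacket,
# Mantel = coat, Celana_Panjang = long pants, Celana_Pendek = shorts, Rok = skirt,
# Kaos = T-shirt, Kemeja = shirt, Sweter = sweater, Gaun = dress.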

FEMALE_WARDROBE_PLAN = {
    "Outers": (5, ["Blazer", "Hoodie", "Jaket", "Jaket_Denim", "Mantel", "Jaket_Olahraga"]),
    "Bottoms": (8, ["Celana_Panjang", "Celana_Pendek", "Jeans", "Rok"]),
    "Tops": (10, ["Kaos", "Kemeja", "Polo", "Sweter"]),
    "Dresses": (3, ["Gaun"])
}

MALE_WARDROBE_PLAN = {
    "Outers": (5, ["Blazer", "Hoodie", "Jaket", "Jaket_Denim", "Mantel", "Jaket_Olahraga"]),
    "Bottoms": (8, ["Celana_Panjang", "Celana_Pendek", "Jeans"]),
    "Tops": (10, ["Kaos", "Kemeja", "Polo", "Sweter"]),
    "Dresses": (0, [])
}

ALL_PLANS = {
    "Female_Wardrobe": FEMALE_WARDROBE_PLAN,
    "Male_Wardrobe": MALE_WARDROBE_PLAN
}

def create_simulated_wardrobe(wardrobe_name, plan, input_root, output_root):
    """Selects and copies a random sample of images based on the defined plan."""
    
    # Setup output directory
    output_dir = output_root / wardrobe_name
    output_dir.mkdir(exist_ok=True)
    
    total_images = 0
    
    print(f"\n--- Creating {wardrobe_name} ---")
    
    for category_type, (count_needed, class_list) in plan.items():
        if count_needed == 0:
            continue
            
        # Collect every candidate image path from the classes in this category
        candidate_paths = []
        for class_name in class_list:
            class_dir = input_root / class_name
            if class_dir.is_dir():
                # sorted() fixes the order so that a seeded random.sample is repeatable
                candidate_paths.extend(sorted(
                    p for p in class_dir.iterdir()
                    if p.suffix.lower() in ('.jpg', '.jpeg')
                ))

        # Never ask for more images than are available
        count_to_sample = min(count_needed, len(candidate_paths))

        # Sample from the pooled candidates without replacement
        # (classes with more images are proportionally more likely to be picked)
        selected_images = random.sample(candidate_paths, count_to_sample)
        
        # Copy selected images to the new wardrobe folder
        for i, src_path in enumerate(selected_images):
            # Prefix the file name with category, index, and class (e.g., 'Tops_01_Kaos_<original name>')
            new_name = f"{category_type}_{i+1:02d}_{src_path.parent.name}_{src_path.name}"
            dest_path = output_dir / new_name
            shutil.copy(src_path, dest_path)
            total_images += 1
            
        print(f"   - {category_type}: Copied {len(selected_images)}/{count_needed} items.")

    print(f"Total items in {wardrobe_name}: {total_images}")
    return output_dir
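
The function above draws from one pooled list per category, so a class with many images can crowd out the others. If an even spread across classes is preferred, one option is to give each class its own quota and sample within it. This is a sketch of that alternative, not the notebook's method; `allocate_evenly` and `sample_per_class` are hypothetical helpers.

```python
import random

def allocate_evenly(count_needed, class_names):
    """Split count_needed across class_names as evenly as possible.

    The first (count_needed % n) classes receive one extra item.
    """
    if not class_names:
        return {}
    base, extra = divmod(count_needed, len(class_names))
    return {
        name: base + (1 if i < extra else 0)
        for i, name in enumerate(class_names)
    }

def sample_per_class(paths_by_class, count_needed, rng=random):
    """Pick up to count_needed paths, drawing each class's quota separately."""
    quotas = allocate_evenly(count_needed, sorted(paths_by_class))
    selected = []
    for name, quota in quotas.items():
        pool = sorted(paths_by_class[name])  # stable order for seeded sampling
        selected.extend(rng.sample(pool, min(quota, len(pool))))
    return selected
```

With 5 outer items and 6 outer classes, for example, five classes get one item each and the last gets none. If a class holds fewer images than its quota, this sketch returns fewer than `count_needed` items rather than topping up from other classes.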

In [None]:
# Set random seed for reproducibility
random.seed(42)

female_dir = create_simulated_wardrobe(
    "Female_Wardrobe", 
    FEMALE_WARDROBE_PLAN, 
    INPUT_DIR, 
    WARDROBE_ROOT
)

male_dir = create_simulated_wardrobe(
    "Male_Wardrobe", 
    MALE_WARDROBE_PLAN, 
    INPUT_DIR, 
    WARDROBE_ROOT
)

print("\n\n✅ Wardrobe creation complete.")
print(f"Female Wardrobe images are in: {female_dir}")
print(f"Male Wardrobe images are in: {male_dir}")
print("\nThese folders now contain the images for testing your Gemini analysis script.")


--- Creating Female_Wardrobe ---
   - Outers: Copied 5/5 items.
   - Bottoms: Copied 8/8 items.
   - Tops: Copied 10/10 items.
   - Dresses: Copied 3/3 items.
Total items in Female_Wardrobe: 26

--- Creating Male_Wardrobe ---
   - Outers: Copied 5/5 items.
   - Bottoms: Copied 8/8 items.
   - Tops: Copied 10/10 items.
Total items in Male_Wardrobe: 23


✅ Wardrobe creation complete.
Female Wardrobe images are in: simulated_wardrobes/Female_Wardrobe
Male Wardrobe images are in: simulated_wardrobes/Male_Wardrobe

These folders now contain the images for testing your Gemini analysis script.
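
As a quick sanity check, the copied files can be counted per category and compared with the plan. The sketch below assumes the naming scheme used by the copy step, where the category is the text before the first underscore; `count_by_category` is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path

def count_by_category(wardrobe_dir):
    """Count files per category, taking the text before the first underscore."""
    return dict(Counter(
        p.name.split("_", 1)[0]
        for p in Path(wardrobe_dir).iterdir()
        if p.is_file()
    ))
```

Calling `count_by_category(female_dir)` should reproduce the per-category counts printed above. Counts higher than the plan may mean the folder still holds files from an earlier run with a different sample, since the output folder is not cleared between runs.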
