# <img src="assets/doro.png" width="32" height="32"> Dataset Preparation

Prepare training datasets with automated tagging and curation tools.

## 📋 **Workflow**
1. **Upload dataset** (ZIP or folder)
2. **Visual curation** with FiftyOne
3. **Auto-tagging** (WD14/BLIP)
4. **Caption editing** and trigger words
5. **Move to training** in `Unified_LoRA_Trainer.ipynb`

## 📚 **Documentation**
- [Dataset Preparation Guide](docs/dataset-guides/dataset_preparation.md) - Complete workflow
- [Creating Characters Guide](docs/dataset-guides/creating-characters.md) - Character LoRA methods

---

## <img src="assets/OTNDORODUSKFIXED.png" width="32" height="32"> 1. Setup Validation

**Purpose:** Validate the environment, and complete setup if it has not been done yet.

**When to run:** On first-time setup, or if you get "module not found" errors.

**Skip if:** You already ran setup in `Unified_LoRA_Trainer.ipynb`.

```python
# **CELL 1A:** Environment Validation

from shared_managers import create_widget

# Initialize and display the simplified setup widget (validation only)
setup_widget = create_widget('setup_simple')
setup_widget.display()
```
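
If the setup cell fails with a "module not found" error, it can help to see which imports cannot be resolved before re-running setup. The sketch below is a minimal standalone check using only the standard library. It is not part of the setup widget, and the helper name `missing_modules` is hypothetical; `shared_managers` is the only module name taken from this notebook.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located on the import path."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for dotted names with a missing parent, or for modules
            # that are loaded but carry no import spec.
            found = False
        if not found:
            missing.append(name)
    return missing


# `shared_managers` is imported by the cells in this notebook;
# add any other module names you want to check.
print(missing_modules(["shared_managers"]))
```

An empty list suggests the import path is fine; any listed name points at what setup still needs to provide.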

## <img src="assets/doro_fubuki.png" width="32" height="32"> 2. Dataset Management

**Purpose:** Auto-tag and prepare your curated images for training.

**Features:**
- 📁 **Dataset input** and directory selection
- 🏷️ **Auto-tagging** with WD14 v3 or BLIP
- ✏️ **Caption editing** and bulk operations
- 🎯 **Trigger word** injection
- 🚫 **Tag filtering** and blacklists

See [Dataset Preparation Guide](docs/dataset-guides/dataset_preparation.md) for detailed workflow.
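
To make the trigger word and blacklist features concrete, here is a rough sketch of the kind of caption transformation they imply. It assumes captions are comma-separated tag strings, as WD14-style taggers commonly produce. The function `process_caption`, the example trigger word `doro_style`, and the sample tags are illustrative only; the widget's actual implementation may differ.

```python
def process_caption(caption, trigger_word, blacklist=()):
    """Drop blacklisted tags and place the trigger word first.

    Assumes `caption` is a comma-separated list of tags.
    """
    blocked = {tag.strip().lower() for tag in blacklist}
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    kept = [
        t for t in tags
        # Skip blacklisted tags and any existing copy of the trigger word,
        # so the trigger word appears exactly once.
        if t.lower() not in blocked and t.lower() != trigger_word.lower()
    ]
    return ", ".join([trigger_word] + kept)


print(process_caption("1girl, solo, watermark, smile", "doro_style",
                      blacklist=["watermark"]))
# -> doro_style, 1girl, solo, smile
```

Removing an existing copy of the trigger word before prepending keeps the operation safe to run more than once on the same caption.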

```python
# **CELL 3:** Dataset Tagging Widget (After Curation)

from shared_managers import create_widget

# Initialize and display the dataset widget for tagging curated images
dataset_widget = create_widget('dataset')
dataset_widget.display()
```

---

## <img src="assets/OTNANGELDOROFIX.png" width="32" height="32"> Next Steps

1. **Note your dataset path** for training setup
2. **Remember your trigger word** for generation
3. **Open** `Unified_LoRA_Trainer.ipynb` for training
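
Before switching notebooks, a quick sanity check of the dataset folder can catch missing captions or trigger words. The sketch below assumes each image has a sidecar `.txt` caption with the same file stem and that images sit directly in the dataset folder. `summarize_dataset`, `IMAGE_EXTS`, and the example path are hypothetical names, not part of this project.

```python
from pathlib import Path

# Assumed set of image formats; adjust to match your dataset.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def summarize_dataset(dataset_dir, trigger_word):
    """Count images and list those lacking a caption or the trigger word."""
    root = Path(dataset_dir)
    images = sorted(p for p in root.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    uncaptioned, untriggered = [], []
    for img in images:
        caption_file = img.with_suffix(".txt")  # assumed sidecar caption
        if not caption_file.is_file():
            uncaptioned.append(img.name)
        elif trigger_word.lower() not in caption_file.read_text(encoding="utf-8").lower():
            untriggered.append(img.name)
    return {
        "images": len(images),
        "uncaptioned": uncaptioned,
        "missing_trigger": untriggered,
    }


# Example call (hypothetical path and trigger word):
# print(summarize_dataset("datasets/my_character", "doro_style"))
```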

---

## <img src="assets/OTNEARTHFIXDORO.png" width="32" height="32"> Troubleshooting

**Common issues:** See [Troubleshooting Guide](docs/guides/troubleshooting.md#dataset-issues) for solutions.

**Quick fixes:**
- **No images found**: Check ZIP structure and file formats
- **Tagging failed**: Verify internet connection and disk space
- **Missing trigger words**: Use bulk edit or check injection settings
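
The first two quick fixes can also be checked by hand. This is a minimal sketch, assuming the extensions in `IMAGE_EXTS` cover your formats (the list is an assumption, not the widget's actual filter). `images_in_zip` and `free_gib` are hypothetical helpers built on the standard `zipfile` and `shutil` modules.

```python
import shutil
import zipfile
from pathlib import PurePosixPath

# Assumed set of image formats; adjust to match your dataset.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def images_in_zip(zip_path):
    """Return archive member paths that look like images, keeping their folders."""
    with zipfile.ZipFile(zip_path) as zf:
        return [
            name for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
            and PurePosixPath(name).suffix.lower() in IMAGE_EXTS
        ]


def free_gib(path="."):
    """Return free disk space at `path` in GiB."""
    return shutil.disk_usage(path).free / 2**30


# Example calls (hypothetical archive name):
# print(images_in_zip("my_dataset.zip"))
# print(f"{free_gib():.1f} GiB free")
```

If the returned list is empty, the archive may contain unsupported formats; if every path shares a nested folder prefix, the ZIP structure may be the reason no images were found.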