# Step 5: Iteration - Hybrid Acquisition Loop

This is where the loop pays off. Instead of labeling more data at random, we use a **hybrid acquisition strategy**:

- **30% Coverage Refresh** (ZCore): Ensures we don't tunnel-vision on failures
- **70% Targeted Failure Mining**: Focus on where the model struggles

This step teaches you to:
1. Implement the hybrid acquisition recipe
2. Use embedding-based neighbor expansion
3. Balance FN vs FP vs class confusion in your selection
4. Iterate without contaminating your test set

**Why This Matters**: Only chasing failures creates a model that's great at edge cases and terrible at normal cases. The coverage budget keeps you honest.

## Load the Dataset

In [None]:
import fiftyone as fo
from fiftyone import ViewField as F
import fiftyone.brain as fob
import numpy as np

# Load dataset
dataset = fo.load_dataset("kitti_annotation_tutorial")

# Get the remaining unlabeled pool
pool_remaining = dataset.match_tags("split:pool").match(F("annotation_status") != "annotated")

# Get validation failures from Step 4
failures_view = dataset.load_saved_view("eval_v0_failures")

print(f"Remaining pool: {len(pool_remaining)} samples")
print(f"Failure samples from eval: {len(failures_view)} samples")

## Define Acquisition Budget

For iteration 1, we'll select another batch. The split:
- 30% from ZCore (coverage)
- 70% from failure mining (targeted)

In [None]:
# Define batch size for iteration 1
batch_size = int(0.15 * len(pool_remaining))  # 15% of remaining pool

# Split budget
coverage_budget = int(0.30 * batch_size)  # 30% for coverage
targeted_budget = batch_size - coverage_budget  # 70% for targeted

# Further split targeted budget
fn_budget = int(0.50 * targeted_budget)  # 50% of targeted for FN
fp_budget = int(0.30 * targeted_budget)  # 30% of targeted for FP
confusion_budget = targeted_budget - fn_budget - fp_budget  # 20% for confusion

print("Acquisition Budget for Iteration 1:")
print(f"  Total batch: {batch_size} samples")
print(f"  ├── Coverage (ZCore): {coverage_budget} (30%)")
print(f"  └── Targeted: {targeted_budget} (70%)")
print(f"      ├── FN mining: {fn_budget} (35% of total)")
print(f"      ├── FP mining: {fp_budget} (21% of total)")
print(f"      └── Confusion: {confusion_budget} (14% of total)")
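
The same split can be wrapped in a small helper so later iterations can reuse it. This is a minimal sketch, not part of any library: `split_budget` and its parameter names are hypothetical, and the default fractions simply mirror the ones used above. Assigning the remainder to the last bucket at each level means the parts always add up to the batch size, even after integer truncation.

```python
def split_budget(pool_size, batch_frac=0.15, coverage_frac=0.30,
                 fn_frac=0.50, fp_frac=0.30):
    """Split an acquisition batch into coverage / FN / FP / confusion budgets.

    Hypothetical helper: the fractions default to the values used in this step.
    """
    batch_size = int(batch_frac * pool_size)
    coverage = int(coverage_frac * batch_size)
    targeted = batch_size - coverage      # remainder goes to targeted mining
    fn = int(fn_frac * targeted)
    fp = int(fp_frac * targeted)
    confusion = targeted - fn - fp        # remainder goes to confusion
    return {
        "batch": batch_size,
        "coverage": coverage,
        "fn": fn,
        "fp": fp,
        "confusion": confusion,
    }


# Illustrative pool size, not taken from the dataset
budget = split_budget(pool_size=1000)
print(budget)
```

Because of truncation, the realized percentages can differ by a sample or two from the nominal 30/35/21/14 split.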

## Part 1: Coverage Refresh (ZCore)

First, we select coverage-optimized samples from the remaining pool using the same uniqueness-based approach from Step 2.

In [None]:
# Ensure embeddings exist on remaining pool
# (Should already exist from Step 2)
has_embeddings = pool_remaining.exists("embeddings")
print(f"Samples with embeddings: {len(has_embeddings)}/{len(pool_remaining)}")

# If missing, compute them
if len(has_embeddings) < len(pool_remaining):
    print("Computing missing embeddings...")
    missing = pool_remaining.match(~F("embeddings").exists())
    fob.compute_visualization(
        missing,
        embeddings="embeddings",
        brain_key="img_viz_iter1"
    )

In [None]:
# Compute uniqueness on remaining pool
fob.compute_uniqueness(
    pool_remaining,
    uniqueness_field="uniqueness_v1",
    embeddings="embeddings"
)

print("Uniqueness scores computed for remaining pool")

In [None]:
# Select coverage samples (highest uniqueness)
coverage_samples = pool_remaining.sort_by("uniqueness_v1", reverse=True).limit(coverage_budget)
coverage_ids = list(coverage_samples.values("id"))

print(f"Coverage selection: {len(coverage_ids)} samples")

## Part 2: Targeted Failure Mining

Now we mine for samples similar to our failure cases. The strategy:
1. Take the failure samples from evaluation
2. Find their nearest neighbors in embedding space
3. Select neighbors that are still unlabeled

This expands around the failure cases rather than re-selecting the failures themselves, which come from the validation set and are already labeled.
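
To make the idea concrete, here is a toy sketch of neighbor expansion using only the standard library. The 2-D vectors, the sample IDs, and the helper names (`cosine`, `expand_around`) are made up for illustration; real embeddings have far more dimensions, and the notebook's own implementation follows in the next cells.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def expand_around(failures, pool, k=1):
    """Return IDs of the k most similar unlabeled pool samples per failure."""
    selected = []
    for f_vec in failures.values():
        ranked = sorted(pool, key=lambda sid: cosine(f_vec, pool[sid]), reverse=True)
        for sid in ranked[:k]:
            if sid not in selected:  # a sample can be close to several failures
                selected.append(sid)
    return selected


# Hypothetical toy data: one failure embedding and three unlabeled pool samples
failures = {"val_017": [1.0, 0.1]}
pool = {
    "pool_a": [0.9, 0.2],    # nearly the same direction as the failure
    "pool_b": [0.0, 1.0],    # nearly orthogonal
    "pool_c": [-1.0, 0.0],   # opposite direction
}

neighbors = expand_around(failures, pool, k=1)
print(neighbors)
```

The failure sample itself never appears in the result, because only pool samples are candidates.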

In [None]:
# Get embeddings for remaining pool
# values() reads each field in bulk instead of loading every sample twice
pool_with_emb = pool_remaining.exists("embeddings")
pool_ids_with_emb = pool_with_emb.values("id")
pool_embeddings = np.array(pool_with_emb.values("embeddings"))

print(f"Pool samples with embeddings: {len(pool_ids_with_emb)}")

In [None]:
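# Helper function to find nearest neighbors
from sklearn.metrics.pairwise import cosine_similarity

def find_neighbors(query_embeddings, pool_embeddings, pool_ids, n_neighbors=5):
    """Return pool IDs among the top neighbors of any query, most similar first."""
    if len(query_embeddings) == 0 or len(pool_embeddings) == 0:
        return []

    # Cosine similarity matrix with shape (n_queries, n_pool)
    similarities = cosine_similarity(query_embeddings, pool_embeddings)

    # For each query, keep the indices of its n_neighbors most similar pool samples
    n_neighbors = max(1, min(n_neighbors, similarities.shape[1]))
    top_indices = np.argsort(similarities, axis=1)[:, -n_neighbors:]
    candidate_indices = np.unique(top_indices)

    # Rank candidates by their best similarity to any query, so truncating
    # the result to a budget keeps the closest neighbors and is deterministic
    best_sim = similarities[:, candidate_indices].max(axis=0)
    order = np.argsort(-best_sim, kind="stable")

    return [pool_ids[i] for i in candidate_indices[order]]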

In [None]:
# Mine around high-FN samples
high_fn_samples = failures_view.match_tags("failure:high_fn")
fn_embeddings = np.array([s.embeddings for s in high_fn_samples if s.embeddings is not None])

fn_neighbors = find_neighbors(
    fn_embeddings, 
    pool_embeddings, 
    pool_ids_with_emb,
    n_neighbors=max(1, fn_budget // max(1, len(fn_embeddings)))
)

# Filter to only include samples not already selected for coverage
# (a set makes each membership check O(1))
coverage_id_set = set(coverage_ids)
fn_selection = [sid for sid in fn_neighbors if sid not in coverage_id_set][:fn_budget]
print(f"FN-targeted selection: {len(fn_selection)} samples")

In [None]:
# Mine around high-FP samples
high_fp_samples = failures_view.match_tags("failure:high_fp")
fp_embeddings = np.array([s.embeddings for s in high_fp_samples if s.embeddings is not None])

fp_neighbors = find_neighbors(
    fp_embeddings,
    pool_embeddings,
    pool_ids_with_emb,
    n_neighbors=max(1, fp_budget // max(1, len(fp_embeddings)))
)

# Filter out already selected
already_selected = set(coverage_ids + fn_selection)
fp_selection = [sid for sid in fp_neighbors if sid not in already_selected][:fp_budget]
print(f"FP-targeted selection: {len(fp_selection)} samples")

In [None]:
# For confusion cases, we'd ideally look at class-confused samples
# For simplicity, we'll select random samples from remaining pool
import random

already_selected = set(coverage_ids + fn_selection + fp_selection)
remaining_ids = [sid for sid in pool_ids_with_emb if sid not in already_selected]

random.seed(42)
confusion_selection = random.sample(remaining_ids, min(confusion_budget, len(remaining_ids)))
print(f"Confusion/diversity selection: {len(confusion_selection)} samples")
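
If your Step 4 evaluation gives you matched ground-truth/prediction label pairs per validation sample, the random placeholder above could be replaced by seeds ranked by class confusion. The sketch below assumes a hypothetical `matches` dict (sample ID mapped to a list of `(gt_label, pred_label)` pairs); how you extract those pairs depends on your evaluation setup and is not shown. The resulting seed IDs would then feed the same neighbor expansion used for FN and FP mining.

```python
from collections import Counter


def rank_confusion_seeds(matches, top_k=10):
    """Rank samples by how many matched boxes have the wrong class.

    `matches` is a hypothetical structure:
    {sample_id: [(gt_label, pred_label), ...]} for spatially matched boxes.
    """
    confusion_counts = Counter()   # wrong-class matches per sample
    pair_counts = Counter()        # how often each (gt, pred) confusion occurs
    for sample_id, pairs in matches.items():
        for gt_label, pred_label in pairs:
            if gt_label != pred_label:
                confusion_counts[sample_id] += 1
                pair_counts[(gt_label, pred_label)] += 1
    seeds = [sid for sid, _ in confusion_counts.most_common(top_k)]
    return seeds, pair_counts


# Made-up example data using KITTI-style class names
matches = {
    "val_001": [("Car", "Car"), ("Van", "Car"), ("Truck", "Van")],
    "val_002": [("Pedestrian", "Pedestrian")],
    "val_003": [("Cyclist", "Pedestrian")],
}

seeds, pair_counts = rank_confusion_seeds(matches, top_k=2)
print(seeds)
print(dict(pair_counts))
```

The `pair_counts` tally is also useful on its own: it shows which class pairs are confused most often, which can guide annotation guidelines as well as selection.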

## Combine and Tag Batch v1

In [None]:
# Combine all selections
batch_v1_ids = coverage_ids + fn_selection + fp_selection + confusion_selection

# Remove duplicates (shouldn't be any, but just in case), preserving selection order
batch_v1_ids = list(dict.fromkeys(batch_v1_ids))

print(f"\nBatch v1 Total: {len(batch_v1_ids)} samples")
print(f"  Coverage: {len(coverage_ids)}")
print(f"  FN-targeted: {len(fn_selection)}")
print(f"  FP-targeted: {len(fp_selection)}")
print(f"  Confusion/diversity: {len(confusion_selection)}")

In [None]:
# Tag the batch
batch_v1 = dataset.select(batch_v1_ids)

# Add tags
batch_v1.tag_samples("batch:v1")
batch_v1.tag_samples("to_annotate")

# Update annotation status
batch_v1.set_values("annotation_status", ["selected"] * len(batch_v1))

# Tag by source for tracking
dataset.select(coverage_ids).tag_samples("source:coverage_v1")
dataset.select(fn_selection).tag_samples("source:fn_mining_v1")
dataset.select(fp_selection).tag_samples("source:fp_mining_v1")
dataset.select(confusion_selection).tag_samples("source:diversity_v1")

print("\nBatch v1 tagged and ready for annotation!")

In [None]:
# Save as a view
dataset.save_view("batch_v1_to_annotate", batch_v1)
print(f"Saved view: batch_v1_to_annotate ({len(batch_v1)} samples)")

## Visualize the Selection

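Let's inspect the hybrid selection in the App. If you computed a visualization in Step 2, open the Embeddings panel to see where the batch falls in embedding space.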

In [None]:
# Launch the App
session = fo.launch_app(batch_v1)

In [None]:
# Summary of selection sources
print("\nBatch v1 Selection Sources:")
for source_tag in ["source:coverage_v1", "source:fn_mining_v1", "source:fp_mining_v1", "source:diversity_v1"]:
    count = len(batch_v1.match_tags(source_tag))
    print(f"  {source_tag}: {count} samples")

## The Complete Iteration Recipe

Here's the full loop you'll repeat:

```
1. Annotate batch_v1 (Step 3 workflow)
2. Retrain YOLOv8 on all annotated data (Step 4 workflow)
3. Evaluate on validation (compare to v0)
4. Check frozen test set (occasional reality check)
5. Check golden set (detect label drift)
6. Select batch_v2 using this hybrid recipe
7. Repeat until stopping criteria
```

### Stopping Criteria

Stop iterating when:
- Gains per labeled sample flatten (diminishing returns)
- Remaining failures are mostly label ambiguity (irreducible)
- Production-like metrics hit acceptable thresholds
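
The first criterion can be checked mechanically. A minimal sketch, assuming you log one `(num_labeled, metric)` pair per iteration; the function name, the `min_gain_per_100` threshold, the `patience` window, and the numbers in the example are illustrative, not recommendations.

```python
def gains_have_flattened(history, min_gain_per_100=0.005, patience=2):
    """Return True if each of the last `patience` iterations gained less than
    `min_gain_per_100` metric points per 100 newly labeled samples.

    `history` is a list of (num_labeled, metric) tuples, oldest first.
    """
    if len(history) < patience + 1:
        return False  # not enough iterations to judge
    recent = history[-(patience + 1):]
    for (n_prev, m_prev), (n_curr, m_curr) in zip(recent, recent[1:]):
        added = n_curr - n_prev
        if added <= 0:
            continue  # no new labels in this step, so it says nothing about gains
        gain_per_100 = (m_curr - m_prev) / added * 100
        if gain_per_100 >= min_gain_per_100:
            return False  # still improving fast enough
    return True


# Made-up history: (labeled samples, validation mAP)
history = [(500, 0.42), (800, 0.51), (1100, 0.515), (1400, 0.517)]
print(gains_have_flattened(history))
```

The other two criteria (irreducible label ambiguity, production-like thresholds) need human judgment and task-specific targets, so they are left out of this check.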

In [None]:
# Track progress across iterations
print("="*50)
print("ANNOTATION LOOP PROGRESS")
print("="*50)

total_pool = len(dataset.match_tags("split:pool"))
annotated_v0 = len(dataset.match_tags("annotated:v0"))
selected_v1 = len(dataset.match_tags("batch:v1"))
remaining = total_pool - annotated_v0 - selected_v1

print(f"\nPool total:        {total_pool}")
print(f"├── Annotated v0:  {annotated_v0} ({100*annotated_v0/total_pool:.1f}%)")
print(f"├── Selected v1:   {selected_v1} ({100*selected_v1/total_pool:.1f}%)")
print(f"└── Remaining:     {remaining} ({100*remaining/total_pool:.1f}%)")

print(f"\nAfter v1 annotation: {100*(annotated_v0+selected_v1)/total_pool:.1f}% labeled")
print("="*50)

## Important Warnings

### Don't Contaminate Your Test Set

The frozen test set (`split:test`) should NEVER be used for:
- Selection decisions
- Training data
- Frequent evaluation

Check the test set only occasionally (e.g., every 3-4 iterations) as a reality check.
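
A cheap guard is to assert, before tagging each batch, that no selected ID belongs to the test split. The sketch below works on plain ID lists; `check_no_test_leakage` is a hypothetical helper, and in this notebook the two lists might come from `batch_v1_ids` and `dataset.match_tags("split:test").values("id")`.

```python
def check_no_test_leakage(batch_ids, test_ids):
    """Raise ValueError if any selected sample is also in the frozen test set."""
    leaked = sorted(set(batch_ids) & set(test_ids))
    if leaked:
        raise ValueError(
            f"{len(leaked)} selected samples are in the test set: {leaked[:5]}"
        )
    return True


# Made-up IDs for illustration
test_ids = ["t1", "t2", "t3"]
clean_batch = ["p1", "p2", "p3"]
leaky_batch = ["p1", "t2"]

print(check_no_test_leakage(clean_batch, test_ids))

try:
    check_no_test_leakage(leaky_batch, test_ids)
except ValueError as err:
    print(f"Blocked: {err}")
```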

### Don't Only Chase Failures

The 30% coverage budget isn't optional. Without it, you'll:
- Overweight rare edge cases
- Distort class priors
- Create a model that's worse on normal cases

### Track Label Quality, Not Just Quantity

Every iteration, spot-check:
- Are FPs actually model errors, or label errors?
- Are FNs missed because they're hard, or because the label is wrong?
- Is the golden set still accurate?

## Summary

In this step, you:

1. **Defined a hybrid acquisition budget**:
   - 30% coverage (ZCore)
   - 35% FN mining
   - 21% FP mining
   - 14% diversity/confusion

2. **Implemented embedding-based neighbor expansion** - Don't just pick failures; expand around them

3. **Selected Batch v1** - Ready for annotation

4. **Learned the complete iteration loop** - Annotate → Train → Evaluate → Select → Repeat

**Key Insight**: Smart iteration beats random labeling. The hybrid strategy balances exploration (coverage) and exploitation (failure mining).

**Artifacts Created**:
- `batch:v1` tag on selected samples
- Source tags for tracking (`source:coverage_v1`, etc.)
- `batch_v1_to_annotate` saved view

**Your Turn**: Repeat Steps 3-5 with batch_v1, then batch_v2, etc. until your model meets your quality bar.