# AutoML Architecture Search

This notebook demonstrates MetaGen's AutoML capabilities for discovering optimal architectures.

## Overview

MetaGen can automatically search for architectures that balance:
- **Parameters**: Stay within budget
- **Latency**: Meet inference requirements
- **Performance**: Maximize capability
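
One common way to balance objectives like these is to fold them into a single score that rewards capability and penalizes budget overruns. The sketch below is illustrative only: `toy_score`, its arguments, and its weighting are hypothetical and are not taken from MetaGen, whose scoring is not shown in this notebook.

```python
def toy_score(capability, params_b, latency_ms, max_params_b, max_latency_ms,
              penalty_weight=1.0):
    """Hypothetical scalarized objective: reward capability, penalize overruns.

    All names and weights here are illustrative assumptions, not MetaGen's API.
    """
    # Relative overshoot of each budget; zero when the candidate is within budget.
    param_overrun = max(0.0, params_b / max_params_b - 1.0)
    latency_overrun = max(0.0, latency_ms / max_latency_ms - 1.0)
    return capability - penalty_weight * (param_overrun + latency_overrun)


# A candidate inside both budgets keeps its full capability score.
within = toy_score(capability=0.8, params_b=7.5, latency_ms=90,
                   max_params_b=8.0, max_latency_ms=100)
# A candidate 25% over the parameter budget is penalized.
over = toy_score(capability=0.8, params_b=10.0, latency_ms=90,
                 max_params_b=8.0, max_latency_ms=100)
print(within, over)
```

A single score is easy to rank but hides the trade-offs between objectives, which is why the notebook also looks at the Pareto front in section 5.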

In [None]:
# Imports
import json
from pathlib import Path

from metagen.automl.objectives import compute_pareto_front
from metagen.automl.search_engine import ArchitectureSearchEngine
from metagen.specs.loader import load_spec

## 1. Load a Spec

Start with a text LLM spec as our optimization target:

In [None]:
# Load spec
spec_path = Path("../specs/text_llm_8b.yaml")
spec, seed = load_spec(spec_path)

print(f"Spec: {spec.name}")
print(f"Target parameter budget: {spec.constraints.parameter_budget.max}")
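
To get a feel for what a parameter budget implies, the following sketch estimates the size of a decoder-only transformer from its shape. It is a rough approximation under stated assumptions (standard multi-head attention, tied embeddings, no biases or layer norms, a hypothetical vocabulary of 32,000 tokens) and is not MetaGen's own estimator.

```python
def estimate_params_billion(hidden_size, num_layers, ffn_multiplier=4.0,
                            vocab_size=32_000):
    """Rough decoder-only transformer parameter count, in billions.

    Ignores biases, layer norms, and positional embeddings, and assumes tied
    input/output embeddings. An approximation, not MetaGen's estimator.
    """
    attention = 4 * hidden_size * hidden_size  # Q, K, V, and output projections
    ffn = 2 * hidden_size * int(ffn_multiplier * hidden_size)  # up and down projections
    embeddings = vocab_size * hidden_size
    return (num_layers * (attention + ffn) + embeddings) / 1e9


estimate = estimate_params_billion(hidden_size=4096, num_layers=32)
print(f"{estimate:.2f}B")
```

Because the per-layer cost grows with the square of `hidden_size`, width changes move the total much faster than depth changes do.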

## 2. Random Search

Random search samples architectures independently of one another, which makes it a cheap baseline. Here we run it with a budget of 20:

In [None]:
# Create search engine
engine = ArchitectureSearchEngine(spec=spec, seed=42)

# Run random search
results = engine.search(budget=20, strategy="random")

print(f"Searched {len(results.candidates)} candidates")
print("\nTop 5 candidates:")
for i, c in enumerate(results.top_k(5), 1):
    print(
        f"  {i}. Score: {c.score:.3f}, Params: {c.params_billion:.2f}B, "
        f"Hidden: {c.hidden_size}, Layers: {c.num_layers}"
    )
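
For intuition, random search can be sketched in a few lines of plain Python. Everything below is an assumption made for illustration: the search space, `sample_candidate`, `random_search`, and the placeholder objective `toy_evaluate` are hypothetical and do not mirror `ArchitectureSearchEngine` internals.

```python
import random

# Hypothetical search space; the values are illustrative, not MetaGen's.
SEARCH_SPACE = {
    "hidden_size": [1024, 2048, 3072, 4096],
    "num_layers": [12, 16, 24, 32],
    "num_heads": [8, 16, 32],
}


def sample_candidate(rng):
    """Draw one architecture uniformly at random from the search space."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}


def random_search(evaluate, budget, seed=0):
    """Evaluate `budget` random candidates and return them sorted best-first."""
    rng = random.Random(seed)  # local RNG keeps the search reproducible
    scored = []
    for _ in range(budget):
        candidate = sample_candidate(rng)
        scored.append((evaluate(candidate), candidate))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_evaluate(candidate):
    """Placeholder objective: prefer roughly 12 * L * h^2 close to 4B parameters."""
    approx_params = 12 * candidate["num_layers"] * candidate["hidden_size"] ** 2
    return -abs(approx_params / 1e9 - 4.0)


ranked = random_search(toy_evaluate, budget=20, seed=42)
best_score, best_candidate = ranked[0]
print(best_score, best_candidate)
```

Passing a fixed seed to a local `random.Random` instance makes repeated runs return the same ranking, which is the same reason the notebook passes `seed=42` to the engine.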

## 3. Evolutionary Search

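Evolutionary search refines a population of candidates over several generations instead of sampling each one independently. Here we run 5 generations with a population of 10, for a budget of 50: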

In [None]:
# Create engine with evolution strategy
engine_evo = ArchitectureSearchEngine(spec=spec, seed=42)

# Run evolutionary search
results_evo = engine_evo.search(budget=50, strategy="evolution", generations=5, population_size=10)

print("Evolutionary search complete")
print("\nTop 5 candidates:")
for i, c in enumerate(results_evo.top_k(5), 1):
    print(
        f"  {i}. Score: {c.score:.3f}, Params: {c.params_billion:.2f}B, "
        f"Hidden: {c.hidden_size}, Layers: {c.num_layers}"
    )
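
The idea behind an evolutionary strategy can be sketched as a loop of selection and mutation. This is a minimal illustration, assuming truncation selection with elitism and single-field mutation; the search space, `fitness`, `mutate`, and `evolve` are hypothetical names, and MetaGen's `"evolution"` strategy may work differently.

```python
import random

# Hypothetical search space; values are illustrative, not MetaGen's.
SPACE = {
    "hidden_size": [1024, 2048, 3072, 4096],
    "num_layers": [12, 16, 24, 32],
}


def fitness(candidate):
    """Toy objective: closeness of ~12 * L * h^2 to a 4B-parameter target."""
    params_b = 12 * candidate["num_layers"] * candidate["hidden_size"] ** 2 / 1e9
    return -abs(params_b - 4.0)


def mutate(candidate, rng):
    """Copy the candidate and resample one randomly chosen field."""
    child = dict(candidate)
    field = rng.choice(list(SPACE))
    child[field] = rng.choice(SPACE[field])
    return child


def evolve(generations, population_size, seed=0):
    """Run a simple elitist evolutionary loop and return the fittest candidate."""
    rng = random.Random(seed)
    population = [
        {name: rng.choice(choices) for name, choices in SPACE.items()}
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Keep the fitter half as parents (elitism), then refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        parents = population[: max(1, population_size // 2)]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)


best = evolve(generations=5, population_size=10, seed=42)
print(best, fitness(best))
```

Because the fitter half always survives, the best fitness in this sketch never decreases from one generation to the next.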

## 4. Compare Search Strategies

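Compare the best candidate from each strategy. The two runs used different budgets (20 for random search, 50 for evolutionary search), so this is not a like-for-like comparison: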

In [None]:
# Best from each strategy
best_random = results.top_k(1)[0]
best_evo = results_evo.top_k(1)[0]

print("Comparison:")
print(f"{'Metric':<20} {'Random':<15} {'Evolution':<15}")
print("-" * 50)
print(f"{'Score':<20} {best_random.score:<15.3f} {best_evo.score:<15.3f}")
print(f"{'Params (B)':<20} {best_random.params_billion:<15.2f} {best_evo.params_billion:<15.2f}")
print(f"{'Hidden Size':<20} {best_random.hidden_size:<15} {best_evo.hidden_size:<15}")
print(f"{'Layers':<20} {best_random.num_layers:<15} {best_evo.num_layers:<15}")
print(f"{'Heads':<20} {best_random.num_heads:<15} {best_evo.num_heads:<15}")

## 5. Pareto Front Analysis

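A candidate is Pareto-optimal (non-dominated) if no other candidate is at least as good on every objective and strictly better on at least one. Compute that set from the evolutionary search results: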

In [None]:
# Compute Pareto front
pareto = compute_pareto_front(results_evo.candidates)

print(f"Pareto-optimal candidates: {len(pareto)}")
print("\nPareto front (non-dominated solutions, first 5 shown):")
for i, c in enumerate(pareto[:5], 1):
    print(
        f"  {i}. Params: {c.params_billion:.2f}B, "
        f"Latency: {c.latency_ms:.0f}ms, Score: {c.score:.3f}"
    )
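
Non-dominance is straightforward to check by hand. The sketch below assumes three objectives (parameters and latency minimized, score maximized); which objectives `compute_pareto_front` actually uses is not shown in this notebook, and `dominates` and `pareto_front` are hypothetical helper names.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and better on one.

    Each point is (params_billion, latency_ms, score): the first two are
    minimized and the last is maximized. This objective set is an assumption.
    """
    no_worse = a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[0] < b[0] or a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points using a simple O(n^2) scan."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidates for illustration only.
candidates = [
    (7.0, 90.0, 0.80),
    (8.0, 100.0, 0.85),
    (8.0, 110.0, 0.84),  # dominated by the second point
    (3.0, 40.0, 0.60),
]
front = pareto_front(candidates)
print(front)
```

The quadratic scan is fine for a few dozen candidates; larger searches would typically sort on one objective first to cut down the comparisons.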

## 6. Visualize Results

Plot parameters against score with the Pareto front highlighted, and parameters against latency colored by score:

In [None]:
import matplotlib.pyplot as plt

# Extract data
params = [c.params_billion for c in results_evo.candidates]
scores = [c.score for c in results_evo.candidates]
latencies = [c.latency_ms for c in results_evo.candidates]

# Pareto front data
pareto_params = [c.params_billion for c in pareto]
pareto_scores = [c.score for c in pareto]

# Create figure
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Plot 1: Params vs Score
axes[0].scatter(params, scores, alpha=0.6, label="All candidates")
axes[0].scatter(pareto_params, pareto_scores, c="red", s=100, marker="*", label="Pareto front")
axes[0].set_xlabel("Parameters (B)")
axes[0].set_ylabel("Score")
axes[0].set_title("Parameters vs Score")
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Plot 2: Params vs Latency
sc = axes[1].scatter(params, latencies, alpha=0.6, c=scores, cmap="viridis")
fig.colorbar(sc, ax=axes[1], label="Score")  # legend for the color encoding
axes[1].set_xlabel("Parameters (B)")
axes[1].set_ylabel("Latency (ms)")
axes[1].set_title("Parameters vs Latency (color = score)")
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 7. Export Best Architecture

Save the best architecture for synthesis:

In [None]:
# Get best candidate
best = results_evo.top_k(1)[0]

# Create architecture config
architecture_config = {
    "hidden_size": best.hidden_size,
    "num_layers": best.num_layers,
    "num_heads": best.num_heads,
    "ffn_multiplier": best.ffn_multiplier,
    "estimated_params_billion": best.params_billion,
    "estimated_latency_ms": best.latency_ms,
    "optimization_score": best.score,
}

# Save to file
output_path = Path("./outputs/best_architecture.json")
output_path.parent.mkdir(parents=True, exist_ok=True)

with open(output_path, "w", encoding="utf-8") as f:
    json.dump(architecture_config, f, indent=2)

print(f"Best architecture saved to: {output_path}")
print(json.dumps(architecture_config, indent=2))
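
Before handing an exported file to a later step, it can help to reload it and run a few sanity checks. The sketch below is a hypothetical helper, not part of MetaGen: `load_architecture` and the specific checks are assumptions, and the demo writes made-up values to a temporary directory instead of touching `./outputs`.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = {"hidden_size", "num_layers", "num_heads", "ffn_multiplier"}


def load_architecture(path):
    """Load an exported architecture config and run basic sanity checks."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    # Multi-head attention splits the hidden dimension evenly across heads.
    if config["hidden_size"] % config["num_heads"] != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    return config


# Round-trip demo with made-up values in a temporary directory.
demo_config = {
    "hidden_size": 4096,
    "num_layers": 32,
    "num_heads": 32,
    "ffn_multiplier": 4.0,
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "best_architecture.json"
    demo_path.write_text(json.dumps(demo_config), encoding="utf-8")
    loaded = load_architecture(demo_path)

head_dim = loaded["hidden_size"] // loaded["num_heads"]
print(f"Per-head dimension: {head_dim}")
```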

## 8. Search with Different Constraints

The same search can be run against a spec with a much smaller parameter budget:

In [None]:
# Load tiny spec
tiny_spec_path = Path("../specs/text_llm_tiny.yaml")
tiny_spec, tiny_seed = load_spec(tiny_spec_path)

print(f"Tiny spec target: {tiny_spec.constraints.parameter_budget.max}")

# Search
tiny_engine = ArchitectureSearchEngine(spec=tiny_spec, seed=42)
tiny_results = tiny_engine.search(budget=20, strategy="evolution", generations=3)

print("\nTop 3 tiny architectures:")
for i, c in enumerate(tiny_results.top_k(3), 1):
    print(
        f"  {i}. Params: {c.params_billion * 1000:.0f}M, "
        f"Hidden: {c.hidden_size}, Layers: {c.num_layers}"
    )

## Next Steps

- [03_multi_modal.ipynb](03_multi_modal.ipynb) - Different modalities
- [AutoML Guide](../../docs/user-guide/automl_guide.md) - Complete reference