# Goal-Driven Optimization

> Learn how to use OptimizationGoal to control workflow behavior and tolerance settings

**20 minutes** | **Level: Intermediate**

---

## What You'll Learn

By the end of this notebook, you will be able to:

- Understand all 5 `OptimizationGoal` values and their behaviors
- Know which internal settings each goal applies
- Combine goals with `WorkflowTier` for fine-grained control
- Choose the right goal for your specific use case

---

## Learning Path

**You are here:** Workflow System > **Optimization Goals**

```
fit() Quickstart --> Workflow Tiers --> [You are here] --> Workflow Presets
```

**Recommended flow:**
- **Previous:** [02_workflow_tiers.ipynb](02_workflow_tiers.ipynb) - Understanding processing tiers
- **Next:** [04_workflow_presets.ipynb](04_workflow_presets.ipynb) - Named configurations

**Alternative paths:**
- Want global optimization details? Go to [../07_global_optimization/](../07_global_optimization/)
- Need YAML configuration? Go to [05_yaml_configuration.ipynb](05_yaml_configuration.ipynb)

---

## Before You Begin

**Required knowledge:**
- Basic Python and NumPy
- Familiarity with `fit()` from [01_fit_quickstart.ipynb](01_fit_quickstart.ipynb)

**Required software:**
- NLSQ >= 0.3.4
- Python >= 3.12

**First time with NLSQ?** Start here: [01_fit_quickstart.ipynb](01_fit_quickstart.ipynb)

---

## Why This Matters

Different curve fitting scenarios require different optimization strategies:
- **Exploratory analysis** needs speed over precision
- **Publication-quality results** need tight tolerances and validation
- **Memory-constrained environments** need efficient processing
- **Complex problems** need global search capabilities

`OptimizationGoal` lets you express your intent, and NLSQ automatically configures:
- Convergence tolerances (gtol, ftol, xtol)
- Multi-start optimization settings
- Memory/speed tradeoffs

**Common use cases:**
- Quick exploration during development: `OptimizationGoal.FAST`
- Production fitting with unknown problem conditioning: `OptimizationGoal.ROBUST`
- Final publication-quality results: `OptimizationGoal.QUALITY`

---

## Quick Start (30 seconds)

See goal-driven optimization in action:

In [None]:
# Configure matplotlib for inline plotting (MUST come before imports)
%matplotlib inline

In [None]:
import numpy as np
import jax.numpy as jnp
from nlsq import fit, WorkflowConfig, OptimizationGoal

# Define model and generate data
def model(x, a, b):
    return a * jnp.exp(-b * x)

x = np.linspace(0, 5, 100)
y = 2.5 * np.exp(-1.3 * x) + 0.1 * np.random.randn(100)

# Build a QUALITY-goal config with multi-start
# (shown for illustration only; it is not passed to fit() below)
config = WorkflowConfig(goal=OptimizationGoal.QUALITY, enable_multistart=True, n_starts=10)

# Fit with the same multi-start settings passed directly to fit()
popt, pcov = fit(model, x, y, p0=[1, 1], multistart=True, n_starts=10)
print(f"Fitted: a={popt[0]:.3f}, b={popt[1]:.3f}")

---

## Setup

In [None]:
import numpy as np
import jax.numpy as jnp
import matplotlib.pyplot as plt

from nlsq import fit, curve_fit
from nlsq import WorkflowConfig, WorkflowTier, OptimizationGoal
from nlsq.workflow import calculate_adaptive_tolerances, DatasetSizeTier

# Set random seed for reproducibility
np.random.seed(42)

---

## Tutorial Content

### Section 1: The OptimizationGoal Enum

NLSQ provides 5 optimization goals, each representing a different priority.

In [None]:
# Display all OptimizationGoal values
print("OptimizationGoal Values:")
print("=" * 60)

for goal in OptimizationGoal:
    print(f"  {goal.name:<20} = {goal.value}")

In [None]:
# Goal descriptions and behaviors
goal_info = {
    OptimizationGoal.FAST: {
        "description": "Prioritize speed with local optimization only",
        "tolerances": "One tier looser",
        "multistart": "Disabled",
        "use_case": "Quick exploration, well-conditioned problems",
    },
    OptimizationGoal.ROBUST: {
        "description": "Standard tolerances with multi-start for better global optimum",
        "tolerances": "Dataset-appropriate",
        "multistart": "Enabled",
        "use_case": "Production use, unknown problem conditioning",
    },
    OptimizationGoal.GLOBAL: {
        "description": "Synonym for ROBUST (emphasizes global optimization)",
        "tolerances": "Dataset-appropriate",
        "multistart": "Enabled",
        "use_case": "Same as ROBUST, semantic clarity",
    },
    OptimizationGoal.MEMORY_EFFICIENT: {
        "description": "Minimize memory usage with standard tolerances",
        "tolerances": "Dataset-appropriate",
        "multistart": "Disabled",
        "use_case": "Memory-constrained environments, very large datasets",
    },
    OptimizationGoal.QUALITY: {
        "description": "Highest precision/accuracy as TOP PRIORITY",
        "tolerances": "One tier tighter",
        "multistart": "Enabled + validation passes",
        "use_case": "Publication-quality results, critical applications",
    },
}

print("\nGoal Details:")
print("=" * 80)

for goal, info in goal_info.items():
    print(f"\n{goal.name}:")
    print(f"  Description:  {info['description']}")
    print(f"  Tolerances:   {info['tolerances']}")
    print(f"  Multi-start:  {info['multistart']}")
    print(f"  Use case:     {info['use_case']}")
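
Multi-start, which the goal details above list as enabled for `ROBUST`, `GLOBAL`, and `QUALITY`, means running a local optimizer from several starting points and keeping the best result. The sketch below illustrates the idea on a toy one-dimensional problem in plain Python. It is not NLSQ's implementation: `objective`, `local_minimize`, and `multistart_minimize` are hypothetical names, and fixed-step gradient descent stands in for the real solver.

```python
# Illustrative only: a toy multi-start loop, not NLSQ's implementation.

def objective(x):
    """Double-well function: local minimum near x = 0.96,
    lower (global) minimum near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Analytic derivative of objective()."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def local_minimize(x0, step=0.01, n_iter=2000):
    """Fixed-step gradient descent from a single starting point."""
    x = x0
    for _ in range(n_iter):
        x -= step * gradient(x)
    return x


def multistart_minimize(starts):
    """Run the local optimizer from every start and keep the lowest objective."""
    candidates = [local_minimize(x0) for x0 in starts]
    return min(candidates, key=objective)


single = local_minimize(2.0)
best = multistart_minimize([-2.0, -1.0, 0.0, 1.0, 2.0])

print(f"single start: x={single:.3f}, f={objective(single):.3f}")
print(f"multi-start:  x={best:.3f}, f={objective(best):.3f}")
```

A single start at `x = 2.0` settles in the shallower minimum. The multi-start run also tries starting points on the other side of the barrier and returns the deeper minimum, at the cost of five local solves instead of one.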

### Section 2: How Goals Affect Tolerances

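Each goal starts from a base tolerance chosen by dataset size and then shifts it: `FAST` moves one tier looser, `QUALITY` moves one tier tighter, and `ROBUST` keeps the dataset-appropriate value. The cells below print the resulting `gtol` values.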
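
As a mental model, the goal-dependent shift can be pictured as moving up or down a ladder of tolerance tiers. The sketch below uses a hypothetical `Goal` enum, made-up tier thresholds, and made-up tolerance values purely for illustration, and it assumes that larger datasets get looser base tolerances. The actual values come from `calculate_adaptive_tolerances` and `DatasetSizeTier`, which the cells in this section query directly.

```python
from enum import Enum


class Goal(Enum):
    """Hypothetical stand-in for OptimizationGoal (illustration only)."""
    FAST = "fast"
    ROBUST = "robust"
    QUALITY = "quality"


# (max_points, base_tolerance), ordered from tightest to loosest.
# Thresholds and tolerances are invented for this sketch.
TIERS = [
    (1_000, 1e-12),
    (100_000, 1e-10),
    (10_000_000, 1e-8),
    (float("inf"), 1e-6),
]

# +1 moves one tier looser, -1 one tier tighter, 0 keeps the base tier.
TIER_SHIFT = {Goal.FAST: +1, Goal.ROBUST: 0, Goal.QUALITY: -1}


def sketch_tolerance(n_points, goal):
    """Pick the base tier from the dataset size, then shift it by goal."""
    base = next(i for i, (max_pts, _) in enumerate(TIERS) if n_points <= max_pts)
    # Clamp so the shift never runs off either end of the ladder.
    shifted = min(max(base + TIER_SHIFT[goal], 0), len(TIERS) - 1)
    return TIERS[shifted][1]


for goal in Goal:
    print(f"{goal.name:<8} {sketch_tolerance(50_000, goal):.0e}")
```

The clamp is a design choice of this sketch: at the tightest tier, `QUALITY` has nowhere tighter to go, so it keeps the base value rather than raising an error.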

In [None]:
# Show tolerance calculation for different dataset sizes and goals
dataset_sizes = [500, 5_000, 50_000, 500_000, 5_000_000]
goals_to_compare = [OptimizationGoal.FAST, OptimizationGoal.ROBUST, OptimizationGoal.QUALITY]

print("Adaptive Tolerances (gtol) by Dataset Size and Goal:")
print("=" * 70)
print(f"{'Dataset Size':<15} {'FAST':<15} {'ROBUST':<15} {'QUALITY':<15}")
print("-" * 70)

for n_points in dataset_sizes:
    tols = {}
    for goal in goals_to_compare:
        tols[goal.name] = calculate_adaptive_tolerances(n_points, goal)['gtol']
    
    print(f"{n_points:>12,}   {tols['FAST']:<15.0e} {tols['ROBUST']:<15.0e} {tols['QUALITY']:<15.0e}")

In [None]:
# List the dataset size tiers and their base tolerances
print("\nDatasetSizeTier Reference:")
print("=" * 50)
print(f"{'Tier':<15} {'Max Points':<15} {'Base Tolerance'}")
print("-" * 50)

for tier in DatasetSizeTier:
    # A tier with infinite max_points has no upper bound
    if tier.max_points == float('inf'):
        max_pts_str = 'unlimited'
    else:
        max_pts_str = f"{int(tier.max_points):,}"
    print(f"{tier.name:<15} {max_pts_str:<15} {tier.tolerance:.0e}")

### Section 3: Practical Comparison

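Let's compare goals on a concrete fitting problem: a synthetic exponential decay with known parameters, fitted by passing each goal name as the `preset` argument of `fit()`.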

In [None]:
# Define exponential decay model
def exponential_decay(x, a, b, c):
    """Exponential decay: y = a * exp(-b * x) + c"""
    return a * jnp.exp(-b * x) + c

In [None]:
# Generate synthetic data
n_samples = 1000
x_data = np.linspace(0, 5, n_samples)

# True parameters
true_a, true_b, true_c = 3.0, 1.2, 0.5

y_true = true_a * np.exp(-true_b * x_data) + true_c
noise = 0.1 * np.random.randn(n_samples)
y_data = y_true + noise

# Initial guess and bounds
p0 = [1.0, 0.5, 0.0]
bounds = ([0.1, 0.1, -1.0], [10.0, 5.0, 2.0])

print(f"True parameters: a={true_a}, b={true_b}, c={true_c}")
print(f"Dataset size: {n_samples} points")

In [None]:
import time

# Test each goal
results = {}
goals_to_test = ['fast', 'robust', 'global']

print("\nTesting different goals:")
print("=" * 70)

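for goal_name in goals_to_test:
    # perf_counter() is a monotonic, high-resolution clock suited to timing.
    # The first call may also include one-time JAX compilation overhead.
    start_time = time.perf_counter()
    
    popt, pcov = fit(
        exponential_decay,
        x_data,
        y_data,
        p0=p0,
        bounds=bounds,
        preset=goal_name,
    )
    
    elapsed = time.perf_counter() - start_time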
    
    # Calculate SSR
    y_pred = exponential_decay(x_data, *popt)
    ssr = float(jnp.sum((y_data - y_pred) ** 2))
    
    # Calculate absolute parameter errors against the known true values
    param_errors = [abs(est - truth) for est, truth in zip(popt, (true_a, true_b, true_c))]
    
    results[goal_name] = {
        'popt': popt,
        'ssr': ssr,
        'time': elapsed,
        'errors': param_errors,
    }
    
    print(f"\n{goal_name.upper()}:")
    print(f"  Time:       {elapsed:.4f}s")
    print(f"  SSR:        {ssr:.6f}")
    print(f"  Parameters: a={popt[0]:.4f}, b={popt[1]:.4f}, c={popt[2]:.4f}")
    print(f"  Errors:     a_err={param_errors[0]:.4f}, b_err={param_errors[1]:.4f}, c_err={param_errors[2]:.4f}")

### Section 4: Using WorkflowConfig with Goals

In [None]:
# Create WorkflowConfig with different goals
print("WorkflowConfig with Different Goals:")
print("=" * 60)

for goal in [OptimizationGoal.FAST, OptimizationGoal.ROBUST, OptimizationGoal.QUALITY]:
    config = WorkflowConfig(goal=goal)
    print(f"\n{goal.name}:")
    print(f"  tier:              {config.tier.name}")
    print(f"  gtol:              {config.gtol}")
    print(f"  enable_multistart: {config.enable_multistart}")

In [None]:
# Combine goal with tier override
print("\nCombining Goals with Tiers:")
print("=" * 60)

# Quality goal with streaming tier for large datasets
config_quality_streaming = WorkflowConfig(
    goal=OptimizationGoal.QUALITY,
    tier=WorkflowTier.STREAMING,
    enable_multistart=True,
    n_starts=20,
)

print("Quality + Streaming:")
print(f"  tier:              {config_quality_streaming.tier.name}")
print(f"  goal:              {config_quality_streaming.goal.name}")
print(f"  enable_multistart: {config_quality_streaming.enable_multistart}")
print(f"  n_starts:          {config_quality_streaming.n_starts}")

In [None]:
# Memory-efficient with chunked tier
config_memory_chunked = WorkflowConfig(
    goal=OptimizationGoal.MEMORY_EFFICIENT,
    tier=WorkflowTier.CHUNKED,
    chunk_size=50000,
)

print("\nMemory-Efficient + Chunked:")
print(f"  tier:       {config_memory_chunked.tier.name}")
print(f"  goal:       {config_memory_chunked.goal.name}")
print(f"  chunk_size: {config_memory_chunked.chunk_size}")

### Section 5: GLOBAL vs ROBUST

Note: `GLOBAL` and `ROBUST` are functionally identical. `GLOBAL` is provided for semantic clarity when you want to emphasize global optimization.
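
One way to picture this equivalence is an enum that keeps both names as distinct members and maps one onto the other before any settings are looked up. The sketch below uses a hypothetical `Goal` enum from the standard library; it only mirrors the behavior described here and is not NLSQ's source code.

```python
from enum import Enum


class Goal(Enum):
    """Hypothetical stand-in for OptimizationGoal (illustration only)."""
    FAST = "fast"
    ROBUST = "robust"
    GLOBAL = "global"

    @classmethod
    def normalize(cls, goal):
        """Map GLOBAL onto ROBUST; return every other goal unchanged."""
        return cls.ROBUST if goal is cls.GLOBAL else goal


print(Goal.normalize(Goal.GLOBAL).name)  # ROBUST
print(Goal.normalize(Goal.FAST).name)    # FAST
print([g.name for g in Goal])            # GLOBAL still appears when iterating
```

Giving `GLOBAL` its own value, rather than reusing the value of `ROBUST`, matters in Python: members that share a value become aliases and are skipped when iterating over the enum.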

In [None]:
# Demonstrate GLOBAL normalization
print("GLOBAL and ROBUST Equivalence:")
print("=" * 50)

# The normalize function converts GLOBAL to ROBUST
normalized = OptimizationGoal.normalize(OptimizationGoal.GLOBAL)
print(f"  OptimizationGoal.GLOBAL normalizes to: {normalized.name}")

# Both produce same tolerances
tols_global = calculate_adaptive_tolerances(10000, OptimizationGoal.GLOBAL)
tols_robust = calculate_adaptive_tolerances(10000, OptimizationGoal.ROBUST)

print(f"  GLOBAL gtol: {tols_global['gtol']}")
print(f"  ROBUST gtol: {tols_robust['gtol']}")
print(f"  Same: {tols_global['gtol'] == tols_robust['gtol']}")

### Section 6: Visualization

In [None]:
# Create comparison visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Colors for goals
colors = {'fast': 'blue', 'robust': 'green', 'global': 'red'}

# Top left: Tolerance comparison across dataset sizes
ax1 = axes[0, 0]
sizes = np.logspace(2, 8, 50).astype(int)

for goal in [OptimizationGoal.FAST, OptimizationGoal.ROBUST, OptimizationGoal.QUALITY]:
    tols = [calculate_adaptive_tolerances(n, goal)['gtol'] for n in sizes]
    ax1.loglog(sizes, tols, label=goal.name, linewidth=2)

ax1.set_xlabel('Dataset Size (points)')
ax1.set_ylabel('gtol')
ax1.set_title('Adaptive Tolerances by Goal')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Top right: SSR comparison
ax2 = axes[0, 1]
goal_names = list(results.keys())
ssrs = [results[g]['ssr'] for g in goal_names]
bars = ax2.bar(goal_names, ssrs, color=[colors[g] for g in goal_names])
ax2.set_xlabel('Goal')
ax2.set_ylabel('Sum of Squared Residuals')
ax2.set_title('Fit Quality by Goal')
for bar, ssr in zip(bars, ssrs):
    ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height(), 
             f"{ssr:.4f}", ha='center', va='bottom', fontsize=9)

# Bottom left: Time comparison
ax3 = axes[1, 0]
times = [results[g]['time'] for g in goal_names]
bars = ax3.bar(goal_names, times, color=[colors[g] for g in goal_names])
ax3.set_xlabel('Goal')
ax3.set_ylabel('Time (seconds)')
ax3.set_title('Computation Time by Goal')
for bar, t in zip(bars, times):
    ax3.text(bar.get_x() + bar.get_width()/2, bar.get_height(), 
             f"{t:.3f}s", ha='center', va='bottom', fontsize=9)

# Bottom right: Parameter errors
ax4 = axes[1, 1]
x_pos = np.arange(len(goal_names))
width = 0.25

for i, param in enumerate(['a', 'b', 'c']):
    errors = [results[g]['errors'][i] for g in goal_names]
    ax4.bar(x_pos + i*width, errors, width, label=f'{param} error')

ax4.set_xlabel('Goal')
ax4.set_ylabel('Absolute Error')
ax4.set_title('Parameter Errors by Goal')
ax4.set_xticks(x_pos + width)
ax4.set_xticklabels(goal_names)
ax4.legend()

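plt.tight_layout()

# Create the output directory first; savefig() fails if it does not exist
import os
os.makedirs('figures', exist_ok=True)
plt.savefig('figures/03_goal_comparison.png', dpi=300, bbox_inches='tight')
plt.show()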

---

## Key Takeaways

After completing this notebook, remember:

1. **5 Optimization Goals:**
   - `FAST`: Speed over precision, looser tolerances
   - `ROBUST`/`GLOBAL`: Balanced with multi-start enabled
   - `MEMORY_EFFICIENT`: Prioritizes low memory usage
   - `QUALITY`: Tightest tolerances, validation passes

2. **Goals affect tolerances adaptively:** Tolerances scale with dataset size, and goals shift tolerances tighter (QUALITY) or looser (FAST).

3. **Combine goals with tiers:** Use `WorkflowConfig(goal=..., tier=...)` for fine-grained control over both optimization priority and processing strategy.

4. **GLOBAL = ROBUST:** They are functionally identical; use GLOBAL for semantic emphasis on global optimization.

---

## Common Questions

**Q: Which goal should I use for exploratory analysis?**

A: Use `OptimizationGoal.FAST` for quick exploration. It uses looser tolerances and disables multi-start, giving you fast results when precision isn't critical.

**Q: When should I use QUALITY vs ROBUST?**

A: Use `QUALITY` for final, publication-quality results where accuracy is paramount. Use `ROBUST` for production code where you want good results with reasonable computation time.

**Q: Can I use different goals for the same problem?**

A: Yes! A common pattern is to use `FAST` during development, then `ROBUST` for production, and `QUALITY` for final publication results.

---

## Related Resources

**Next steps:**
- [04_workflow_presets.ipynb](04_workflow_presets.ipynb) - Explore named preset configurations
- [05_yaml_configuration.ipynb](05_yaml_configuration.ipynb) - File-based configuration

**Further reading:**
- [API Documentation](https://nlsq.readthedocs.io/)
- [GitHub Repository](https://github.com/imewei/NLSQ)

**Need help?**
- [Discussions](https://github.com/imewei/NLSQ/discussions)
- [Report issues](https://github.com/imewei/NLSQ/issues)

---

## Glossary

**OptimizationGoal:** An enum that specifies the optimization priority (FAST, ROBUST, GLOBAL, MEMORY_EFFICIENT, QUALITY).

**Tolerance:** Convergence threshold for optimization (gtol for gradient, ftol for function, xtol for parameters).

**Multi-start:** Optimization technique that runs a local optimizer from multiple starting points and keeps the best result, improving the chance of finding the global optimum.

**Tier:** Processing strategy (STANDARD, CHUNKED, STREAMING) based on dataset size and memory.
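
To make the Tolerance entry concrete, the sketch below shows what the three thresholds typically test. It follows the conventions documented for SciPy's `least_squares` (relative cost change for `ftol`, relative step size for `xtol`, gradient infinity norm for `gtol`). Whether NLSQ applies exactly these criteria is an assumption, and `first_satisfied_test` is a hypothetical helper written for this illustration.

```python
import math


def first_satisfied_test(cost_prev, cost_new, x, dx, grad,
                         ftol=1e-8, xtol=1e-8, gtol=1e-8):
    """Return the name of the first stopping test that passes, or None.

    Hypothetical helper: mirrors SciPy-style criteria, not NLSQ internals.
    """
    # gtol: the largest gradient component is close to zero.
    if max(abs(g) for g in grad) < gtol:
        return "gtol"
    # ftol: the cost barely changed relative to its previous value.
    if abs(cost_prev - cost_new) < ftol * cost_prev:
        return "ftol"
    # xtol: the parameter step is tiny relative to the parameter norm.
    norm_x = math.sqrt(sum(v * v for v in x))
    norm_dx = math.sqrt(sum(v * v for v in dx))
    if norm_dx < xtol * (xtol + norm_x):
        return "xtol"
    return None


# Still making progress on all three measures: no test passes.
print(first_satisfied_test(1.0, 0.5, [1.0, 2.0], [0.1, 0.1], [0.5, -0.2]))
# Gradient has vanished: the gtol test passes.
print(first_satisfied_test(1.0, 0.5, [1.0, 2.0], [0.1, 0.1], [1e-10, 0.0]))
```

Looser tolerances make each of these tests pass earlier, which is why `FAST` tends to finish sooner and `QUALITY` tends to iterate longer.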

In [None]:
# Final summary
print("Summary")
print("=" * 60)
print(f"True parameters: a={true_a}, b={true_b}, c={true_c}")
print()
print("Goal recommendations:")
print("  - Exploratory analysis:    OptimizationGoal.FAST")
print("  - Production fitting:      OptimizationGoal.ROBUST")
print("  - Global search emphasis:  OptimizationGoal.GLOBAL")
print("  - Memory constraints:      OptimizationGoal.MEMORY_EFFICIENT")
print("  - Publication quality:     OptimizationGoal.QUALITY")