# WORKFLOW_PRESETS Guide

> Master named workflow configurations for rapid development

**15 minutes** | **Level: Beginner**

---

## What You'll Learn

By the end of this notebook, you will be able to:

- Understand all entries in the `WORKFLOW_PRESETS` dictionary
- Use presets for common fitting scenarios
- Inspect preset configurations
- Customize presets as starting points for advanced use

---

## Learning Path

**You are here:** Workflow System > **Workflow Presets**

```
fit() Quickstart --> Workflow Tiers --> Optimization Goals --> [You are here]
```

**Recommended flow:**
- **Previous:** [03_optimization_goals.ipynb](03_optimization_goals.ipynb) - Goal-driven optimization
- **Next:** [05_yaml_configuration.ipynb](05_yaml_configuration.ipynb) - File-based configuration

**Alternative paths:**
- Need global optimization? Go to [../07_global_optimization/](../07_global_optimization/)
- Large dataset fitting? Go to [../06_large_datasets/](../06_large_datasets/)

---

## Before You Begin

**Required knowledge:**
- Basic Python and NumPy
- Familiarity with `curve_fit()` or `fit()`

**Required software:**
- NLSQ >= 0.3.4
- Python >= 3.12

**First time with NLSQ?** Start here: [01_fit_quickstart.ipynb](01_fit_quickstart.ipynb)

---

## Why This Matters

Configuring optimization parameters can be complex. Presets provide:
- **Quick starts**: Use well-tested configurations immediately
- **Best practices**: Expert-tuned settings for common scenarios
- **Consistency**: Reproducible configurations across projects
- **Customization**: Modify presets to suit your specific needs

**Common use cases:**
- Quick exploratory analysis: `preset='fast'`
- Publication-quality results: `preset='quality'`
- Large dataset processing: `preset='large_robust'` or `preset='streaming'`

---

## Quick Start (30 seconds)

Fit a model with a preset in a single `fit()` call:

In [None]:
# Configure matplotlib for inline plotting (MUST come before imports)
%matplotlib inline

In [None]:
import numpy as np
import jax.numpy as jnp
from nlsq import fit

# Define model and data
def model(x, a, b):
    return a * jnp.exp(-b * x)
x = np.linspace(0, 5, 100)
y = 2.5 * np.exp(-1.3 * x) + 0.1 * np.random.randn(100)

# Fit using the 'quality' preset
popt, pcov = fit(model, x, y, p0=[1, 1], preset='quality')
print(f"Fitted: a={popt[0]:.3f}, b={popt[1]:.3f}")

---

## Setup

In [None]:
import numpy as np
import jax.numpy as jnp
import matplotlib.pyplot as plt
from pprint import pprint

from nlsq import fit, curve_fit
from nlsq import WorkflowConfig, WORKFLOW_PRESETS

# Set random seed for reproducibility
np.random.seed(42)

---

## Tutorial Content

### Section 1: Available Presets

NLSQ provides 7 built-in workflow presets for different scenarios.

In [None]:
# List all available presets
print("Available WORKFLOW_PRESETS:")
print("=" * 60)

for preset_name in WORKFLOW_PRESETS:
    description = WORKFLOW_PRESETS[preset_name].get('description', 'No description')
    print(f"  {preset_name:<20} - {description}")

### Section 2: Inspecting Presets

You can inspect any preset to see its full configuration.

In [None]:
# Examine the 'standard' preset
print("'standard' preset configuration:")
print("-" * 40)
pprint(WORKFLOW_PRESETS['standard'])

In [None]:
# Examine the 'quality' preset
print("'quality' preset configuration:")
print("-" * 40)
pprint(WORKFLOW_PRESETS['quality'])

In [None]:
# Examine the 'streaming' preset for large datasets
print("'streaming' preset configuration:")
print("-" * 40)
pprint(WORKFLOW_PRESETS['streaming'])

### Section 3: Preset Comparison Table

The next cell prints a side-by-side comparison of the main settings in every preset.

In [None]:
# Create comparison table
print("Preset Comparison:")
print("=" * 100)
print(f"{'Preset':<18} {'Tier':<12} {'Goal':<16} {'Multistart':<12} {'n_starts':<10} {'gtol':<12}")
print("-" * 100)

for name, config in WORKFLOW_PRESETS.items():
    tier = config.get('tier', 'STANDARD')
    goal = config.get('goal', 'ROBUST')
    multistart = config.get('enable_multistart', False)
    n_starts = config.get('n_starts', 0)
    gtol = config.get('gtol', 1e-8)
    
    multistart_str = 'Yes' if multistart else 'No'
    
    print(f"{name:<18} {tier:<12} {goal:<16} {multistart_str:<12} {n_starts:<10} {gtol:<12.0e}")

### Section 4: Using Presets with fit()

The simplest way to use a preset is with the `preset` parameter.

In [None]:
# Generate test data
def exponential_model(x, a, b, c):
    """Exponential decay: y = a * exp(-b * x) + c"""
    return a * jnp.exp(-b * x) + c

n_samples = 500
x_data = np.linspace(0, 5, n_samples)

# True parameters
true_a, true_b, true_c = 3.0, 1.2, 0.5
y_true = true_a * np.exp(-true_b * x_data) + true_c
noise = 0.15 * np.random.randn(n_samples)
y_data = y_true + noise

# Initial guess and bounds
p0 = [1.0, 0.5, 0.0]
bounds = ([0.1, 0.1, -1.0], [10.0, 5.0, 2.0])

print(f"True parameters: a={true_a}, b={true_b}, c={true_c}")

In [None]:
import time

# Test different presets
presets_to_test = ['fast', 'standard', 'quality']
results = {}

print("\nTesting presets:")
print("=" * 70)

for preset_name in presets_to_test:
    start_time = time.perf_counter()  # monotonic, high-resolution timer
    
    popt, pcov = fit(
        exponential_model,
        x_data,
        y_data,
        p0=p0,
        bounds=bounds,
        preset=preset_name,
    )
    
    elapsed = time.perf_counter() - start_time
    
    # Calculate SSR
    y_pred = exponential_model(x_data, *popt)
    ssr = float(jnp.sum((y_data - y_pred) ** 2))
    
    results[preset_name] = {
        'popt': popt,
        'ssr': ssr,
        'time': elapsed,
    }
    
    print(f"\n{preset_name.upper()}:")
    print(f"  Time:       {elapsed:.4f}s")
    print(f"  SSR:        {ssr:.6f}")
    print(f"  Parameters: a={popt[0]:.4f}, b={popt[1]:.4f}, c={popt[2]:.4f}")
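
The first timed call in a loop like the one above can include one-time costs, such as JAX tracing and compilation of the model, so the first preset tested may look slower than it is. A minimal sketch of a fairer timing helper follows, using only the standard library. `time_call` is a hypothetical helper (not part of NLSQ), and the warm-up and repeat counts are arbitrary choices.

```python
import time


def time_call(fn, *args, warmup=1, repeats=3, **kwargs):
    """Return (result, best_seconds) for fn(*args, **kwargs).

    Warm-up calls run first and are excluded from the timing.
    """
    for _ in range(warmup):
        fn(*args, **kwargs)

    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


# A plain Python function stands in for a fit call here
result, seconds = time_call(sum, range(1_000_000))
print(f"result={result}, best of 3: {seconds:.4f}s")
```

To time a fit, one would pass `fit` and its arguments in place of `sum`. If the timed function returns JAX arrays, calling `.block_until_ready()` on them inside the timed region avoids measuring only the asynchronous dispatch.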

### Section 5: Creating WorkflowConfig from Presets

Use `WorkflowConfig.from_preset()` to create a configuration object.

In [None]:
# Create config from preset
config = WorkflowConfig.from_preset('quality')

print("WorkflowConfig from 'quality' preset:")
print("-" * 50)
print(f"  tier:              {config.tier.name}")
print(f"  goal:              {config.goal.name}")
print(f"  enable_multistart: {config.enable_multistart}")
print(f"  n_starts:          {config.n_starts}")
print(f"  gtol:              {config.gtol}")
print(f"  ftol:              {config.ftol}")
print(f"  xtol:              {config.xtol}")

In [None]:
# Check the preset origin
print(f"\nPreset origin: '{config.preset}'")
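
This guide does not define `gtol`, `ftol`, and `xtol`. In SciPy's `least_squares`, which uses the same names, they bound the gradient norm, the relative change in cost, and the relative step size. The sketch below assumes NLSQ follows similar conventions, which is an assumption rather than something confirmed here. `termination_flags` is a hypothetical helper, and the tests are simplified (no bounds scaling).

```python
import numpy as np


def termination_flags(cost_prev, cost_new, x, step, grad,
                      ftol=1e-8, xtol=1e-8, gtol=1e-8):
    """Evaluate simplified SciPy-style stopping tests for one accepted step."""
    d_cost = cost_prev - cost_new
    return {
        # Cost decreased by less than a fraction ftol of its value
        "ftol": bool(d_cost < ftol * cost_prev),
        # Step is small relative to the current parameter vector
        "xtol": bool(np.linalg.norm(step) < xtol * (xtol + np.linalg.norm(x))),
        # Largest gradient component is below gtol
        "gtol": bool(np.linalg.norm(grad, ord=np.inf) < gtol),
    }


x = np.array([3.0, 1.2, 0.5])
step = np.array([1e-6, 0.0, 0.0])
grad = np.array([1e-9, -5e-10, 2e-10])

loose = termination_flags(1.0, 1.0 - 1e-10, x, step, grad, gtol=1e-8)
tight = termination_flags(1.0, 1.0 - 1e-10, x, step, grad, gtol=1e-12)
print("gtol=1e-8: ", loose)
print("gtol=1e-12:", tight)
```

With the same gradient, the looser `gtol` test passes while the tighter one does not, which is why tighter tolerances generally mean more iterations before the optimizer stops.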

### Section 6: Customizing Presets

Start from a preset and override specific values.

In [None]:
# Customize the 'quality' preset
base_config = WorkflowConfig.from_preset('quality')

# Override specific values
custom_config = base_config.with_overrides(
    n_starts=30,              # More starting points
    sampler='sobol',          # Use Sobol sampling
    gtol=1e-12,               # Tighter tolerance
)

print("Customized 'quality' preset:")
print("-" * 50)
print(f"  Original n_starts: {base_config.n_starts}")
print(f"  Custom n_starts:   {custom_config.n_starts}")
print(f"  Original sampler:  {base_config.sampler}")
print(f"  Custom sampler:    {custom_config.sampler}")
print(f"  Original gtol:     {base_config.gtol}")
print(f"  Custom gtol:       {custom_config.gtol}")

In [None]:
# Create a large dataset config from 'streaming' preset
streaming_config = WorkflowConfig.from_preset('streaming')

# Customize for memory-constrained environment
memory_config = streaming_config.with_overrides(
    memory_limit_gb=4.0,
    chunk_size=10000,
)

print("\nMemory-optimized streaming config:")
print("-" * 50)
print(f"  tier:            {memory_config.tier.name}")
print(f"  memory_limit_gb: {memory_config.memory_limit_gb}")
print(f"  chunk_size:      {memory_config.chunk_size}")

### Section 7: Preset Descriptions

Each preset is designed for a specific use case.

In [None]:
# Detailed preset documentation
preset_docs = {
    'standard': {
        'summary': 'Default curve_fit() behavior',
        'best_for': 'Well-conditioned problems with good initial guesses',
        'tradeoffs': 'Balanced speed/accuracy, no global search',
    },
    'quality': {
        'summary': 'Highest precision fitting',
        'best_for': 'Publication results, parameter uncertainty estimation',
        'tradeoffs': 'Slower due to multi-start and tight tolerances',
    },
    'fast': {
        'summary': 'Speed-optimized fitting',
        'best_for': 'Exploratory analysis, development, quick iterations',
        'tradeoffs': 'May converge to local minima',
    },
    'large_robust': {
        'summary': 'Chunked processing with multi-start',
        'best_for': 'Large datasets (1M-100M points) needing global search',
        'tradeoffs': 'Memory-efficient but slower than standard',
    },
    'streaming': {
        'summary': 'Streaming for huge datasets',
        'best_for': 'Datasets that exceed available memory (100M+ points)',
        'tradeoffs': 'No covariance matrix, approximate convergence',
    },
    'hpc_distributed': {
        'summary': 'Multi-GPU/node HPC clusters',
        'best_for': 'PBS Pro, Slurm clusters with checkpoint recovery',
        'tradeoffs': 'Requires HPC environment setup',
    },
    'memory_efficient': {
        'summary': 'Minimize memory footprint',
        'best_for': 'Memory-constrained systems, edge devices',
        'tradeoffs': 'Smaller chunk sizes = more overhead',
    },
}

print("Preset Use Case Guide:")
print("=" * 80)

for name, doc in preset_docs.items():
    print(f"\n{name.upper()}:")
    print(f"  Summary:    {doc['summary']}")
    print(f"  Best for:   {doc['best_for']}")
    print(f"  Tradeoffs:  {doc['tradeoffs']}")

### Section 8: Visualization

In [None]:
# Visualize preset characteristics
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

preset_names = list(results.keys())
colors = {'fast': 'blue', 'standard': 'green', 'quality': 'red'}

# SSR comparison
ax1 = axes[0]
ssrs = [results[p]['ssr'] for p in preset_names]
bars = ax1.bar(preset_names, ssrs, color=[colors[p] for p in preset_names])
ax1.set_xlabel('Preset')
ax1.set_ylabel('Sum of Squared Residuals')
ax1.set_title('Fit Quality by Preset')
for bar, ssr in zip(bars, ssrs):
    ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height(), 
             f"{ssr:.4f}", ha='center', va='bottom', fontsize=9)

# Time comparison
ax2 = axes[1]
times = [results[p]['time'] for p in preset_names]
bars = ax2.bar(preset_names, times, color=[colors[p] for p in preset_names])
ax2.set_xlabel('Preset')
ax2.set_ylabel('Time (seconds)')
ax2.set_title('Computation Time by Preset')
for bar, t in zip(bars, times):
    ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height(), 
             f"{t:.3f}s", ha='center', va='bottom', fontsize=9)

# Tolerance comparison
ax3 = axes[2]
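# Use .get() with the same fallback as the comparison table, in case a preset omits 'gtol'
tols = [WORKFLOW_PRESETS[p].get('gtol', 1e-8) for p in preset_names]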
bars = ax3.bar(preset_names, tols, color=[colors[p] for p in preset_names])
ax3.set_xlabel('Preset')
ax3.set_ylabel('gtol')
ax3.set_title('Tolerance (gtol) by Preset')
ax3.set_yscale('log')
for bar, t in zip(bars, tols):
    ax3.text(bar.get_x() + bar.get_width()/2, bar.get_height(), 
             f"{t:.0e}", ha='center', va='bottom', fontsize=9)

plt.tight_layout()
import os
os.makedirs('figures', exist_ok=True)  # savefig() does not create missing directories
plt.savefig('figures/04_preset_comparison.png', dpi=300, bbox_inches='tight')
plt.show()

---

## Key Takeaways

After completing this notebook, remember:

1. **7 Built-in Presets:** `standard`, `quality`, `fast`, `large_robust`, `streaming`, `hpc_distributed`, `memory_efficient`

2. **Using presets:** Simply pass `preset='name'` to `fit()` or use `WorkflowConfig.from_preset('name')`

3. **Inspecting presets:** Access a configuration via `WORKFLOW_PRESETS['name']`, and pretty-print it with `pprint(WORKFLOW_PRESETS['name'])`

4. **Customization:** Use `config.with_overrides(...)` to modify preset values while keeping the base configuration
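
The override pattern in item 4 can be illustrated without NLSQ installed. The sketch below is a stand-in, not NLSQ's implementation: `SketchConfig`, its fields, and its default values are hypothetical, and it assumes the copy is made with `dataclasses.replace` on a frozen dataclass.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for a workflow configuration (not NLSQ's class)."""

    n_starts: int = 10
    gtol: float = 1e-10
    sampler: str = "lhs"

    def with_overrides(self, **kwargs):
        # replace() builds a new instance and leaves the original untouched.
        # Unknown field names raise TypeError.
        return replace(self, **kwargs)


base = SketchConfig()
custom = base.with_overrides(n_starts=30, gtol=1e-12)

print(f"base:   n_starts={base.n_starts}, gtol={base.gtol}")
print(f"custom: n_starts={custom.n_starts}, gtol={custom.gtol}")
print(f"sampler unchanged: {custom.sampler == base.sampler}")
```

Returning a new object instead of mutating the base means one preset-derived configuration can safely seed several variants.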

---

## Common Questions

**Q: Which preset should I use for a standard fitting problem?**

A: Start with `preset='standard'`. If you need better global search, try `preset='quality'`. For quick exploration, use `preset='fast'`.

**Q: How do I know if I need 'large_robust' vs 'streaming'?**

A: Use `'large_robust'` for datasets up to ~100M points that fit in memory (with chunking). Use `'streaming'` for truly massive datasets that cannot fit in memory at all.
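
As a rough aid for this decision, the sketch below estimates the size of the raw data arrays and compares it with a memory budget. Both functions are hypothetical helpers, not NLSQ logic. The estimate assumes two float64 arrays (`x` and `y`), and the `headroom` fraction reserved for working arrays is an arbitrary choice.

```python
def data_size_gb(n_points, n_arrays=2, bytes_per_value=8):
    """Approximate size of the input arrays in gigabytes (float64 by default)."""
    return n_points * n_arrays * bytes_per_value / 1e9


def suggest_large_data_preset(n_points, available_gb, headroom=0.5):
    """Pick a preset name based on whether the data fits in a memory budget."""
    # Keep part of the memory free for chunk-level work arrays
    budget = available_gb * headroom
    if data_size_gb(n_points) <= budget:
        return "large_robust"
    return "streaming"


for n in (100_000_000, 2_000_000_000):
    size = data_size_gb(n)
    choice = suggest_large_data_preset(n, available_gb=16.0)
    print(f"{n:>13,} points ~ {size:.1f} GB -> {choice}")
```

The real memory footprint also depends on the number of parameters and on how the library chunks its work, so treat the result as a starting point only.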

**Q: Can I create my own presets?**

A: Yes! Create a `WorkflowConfig` with your desired settings and save it as a dictionary. You can also use YAML configuration files (see the next tutorial).
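
One way to organize such dictionaries is a small project-local registry. The sketch below uses plain Python only. `base_preset`, `MY_PRESETS`, and `register_preset` are hypothetical names, the keys mirror those read in the comparison table earlier, and the values are made up for illustration.

```python
# Stand-in for one preset entry; values are illustrative, not NLSQ defaults
base_preset = {
    "description": "Illustrative base preset",
    "enable_multistart": True,
    "n_starts": 10,
    "gtol": 1e-10,
}

MY_PRESETS = {}  # hypothetical project-local registry


def register_preset(name, base, **overrides):
    """Store a copy of `base` with `overrides` applied under `name`."""
    unknown = set(overrides) - set(base)
    if unknown:
        # Catch typos such as n_start instead of n_starts
        raise KeyError(f"unknown preset keys: {sorted(unknown)}")
    MY_PRESETS[name] = {**base, **overrides}
    return MY_PRESETS[name]


register_preset("thorough", base_preset, n_starts=30, gtol=1e-12)
print(MY_PRESETS["thorough"])
```

How such a dictionary is handed back to NLSQ is not shown in this guide, so check the API documentation before relying on a particular constructor.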

---

## Related Resources

**Next steps:**
- [05_yaml_configuration.ipynb](05_yaml_configuration.ipynb) - File-based configuration
- [../07_global_optimization/](../07_global_optimization/) - Global optimization details

**Further reading:**
- [API Documentation](https://nlsq.readthedocs.io/)
- [GitHub Repository](https://github.com/imewei/NLSQ)

**Need help?**
- [Discussions](https://github.com/imewei/NLSQ/discussions)
- [Report issues](https://github.com/imewei/NLSQ/issues)

---

## Glossary

**WORKFLOW_PRESETS:** Dictionary containing named workflow configurations.

**WorkflowConfig:** Configuration dataclass for workflow settings.

**Preset:** Pre-defined configuration optimized for a specific use case.

**with_overrides():** Method to create a modified copy of a configuration.

In [None]:
# Final summary
print("Summary")
print("=" * 60)
print()
print("Available presets:")
for name in WORKFLOW_PRESETS:
    desc = WORKFLOW_PRESETS[name].get('description', '')
    print(f"  - {name}: {desc}")
print()
print("Quick usage:")
print("  fit(model, x, y, preset='quality')")
print("  WorkflowConfig.from_preset('quality')")
print("  config.with_overrides(n_starts=30)")