# 📘 Large Dataset Fitting: Handle Millions of Data Points

> Master NLSQ's strategies for fitting curves to datasets too large for memory

⏱️ **20-30 minutes** | 📊 **Level: ●●○ Intermediate** | 🏷️ **Memory Management** | **Performance** | **Scalability**

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/imewei/NLSQ/blob/main/examples/notebooks/02_core_tutorials/large_dataset_demo.ipynb)

---

In [1]:
# @title Install NLSQ (run once in Colab)
import sys

if 'google.colab' in sys.modules:
    print("Running in Google Colab - installing NLSQ...")
    !pip install -q nlsq
    print("✅ NLSQ installed successfully!")
else:
    print("Not running in Colab - assuming NLSQ is already installed")

Not running in Colab - assuming NLSQ is already installed


## 🗺️ Learning Path

**You are here:** Core Tutorials > **Large Dataset Fitting**

```
Getting Started → Quickstart → [Large Dataset Demo] ← You are here → GPU Optimization
```

**Prerequisites:**

- ✓ Completed [NLSQ Quickstart](../01_getting_started/nlsq_quickstart.ipynb)
- ✓ Familiar with NumPy arrays and JAX basics
- ✓ Understand basic curve fitting concepts
- ✓ Knowledge of memory constraints in data processing

**Recommended flow:**

- ← **Previous:** [NLSQ Quickstart](../01_getting_started/nlsq_quickstart.ipynb)
- → **Next (Recommended):** [GPU Optimization Deep Dive](../03_advanced/gpu_optimization_deep_dive.ipynb)
- → **Alternative:** [Performance Optimization](performance_optimization_demo.ipynb)

---

## 🎯 What You'll Learn

After completing this tutorial, you will be able to:

- ✓ **Estimate memory requirements** before fitting to avoid out-of-memory errors
- ✓ **Use automatic chunking** for datasets larger than available memory
- ✓ **Implement streaming optimization** for unlimited dataset sizes (100M+ points)
- ✓ **Choose between chunking vs streaming** approaches based on dataset characteristics
- ✓ **Configure memory limits** and use context managers for temporary settings
- ✓ **Monitor and troubleshoot** large dataset fits with progress reporting

---

## 💡 Why This Matters

**The problem:** SciPy's `curve_fit` loads entire datasets into memory, failing on large datasets or becoming prohibitively slow. For datasets >1M points, traditional approaches either crash or require excessive computation time.

**NLSQ's solution:**

- **Automatic memory management** - Detects available memory and optimizes strategy
- **GPU acceleration** - 150-270x faster than CPU-only approaches
- **Intelligent chunking** - Achieves <1% error for well-conditioned problems
- **Streaming optimization** - Handles unlimited dataset sizes with zero accuracy loss
- **Progress reporting** - Track long-running fits in real-time

**Real-world use cases:**

- 🔬 **High-throughput screening** - Millions of measurements from automated experiments
- 📡 **Sensor calibration** - Continuous data streams from IoT devices
- 🧬 **Genomics data fitting** - Large-scale biological datasets
- 🌡️ **Climate model parameter estimation** - Decades of environmental measurements
- 📊 **Financial time series** - Years of high-frequency trading data

**When to use this approach:**

- ✅ **Good for:** Datasets >100K points, memory-constrained environments, production systems
- ❌ **Not needed for:** Small datasets (<10K points) → Use [Quickstart](../01_getting_started/nlsq_quickstart.ipynb) instead

**Performance characteristics:**

- **Speed:** GPU acceleration provides 150-270x speedup vs SciPy
- **Memory:** Processes datasets 10-100x larger than available RAM
- **Accuracy:** <1% error with chunking, zero loss with streaming

---

## ⚡ Quick Start

Fit a 1 million point dataset in 3 steps:

```python
import jax.numpy as jnp
import numpy as np

from nlsq import fit_large_dataset

# 1. Generate data
x = np.linspace(0, 5, 1_000_000)
y = 5.0 * np.exp(-1.2 * x) + 0.5 + np.random.normal(0, 0.05, 1_000_000)

# 2. Define model (uses jax.numpy so NLSQ can trace and differentiate it)
def exponential_decay(x, a, b, c):
    return a * jnp.exp(-b * x) + c

# 3. Fit automatically
result = fit_large_dataset(exponential_decay, x, y, p0=[4.0, 1.0, 0.4])
print(f"Parameters: {result.popt}")
```

**Expected output:**

```
✅ Fit completed in 0.8 seconds
Parameters: [5.001, 1.199, 0.500]
Relative errors: <0.1%
```

---

## 📖 Setup and Imports

First, let's import the necessary modules and verify the Python version.

In [2]:
# Configure matplotlib for inline plotting in VS Code/Jupyter
# MUST come before importing matplotlib
%matplotlib inline

In [3]:
# Check Python version
import sys

print(f"✅ Python {sys.version_info.major}.{sys.version_info.minor} meets requirements")

import time

import jax.numpy as jnp
import numpy as np

from nlsq import (
    AlgorithmSelector,
    CurveFit,
    LargeDatasetConfig,
    LargeDatasetFitter,
    LDMemoryConfig,
    MemoryConfig,
    __version__,
    auto_select_algorithm,
    configure_for_large_datasets,
    curve_fit_large,
    estimate_memory_requirements,
    fit_large_dataset,
    get_memory_config,
    large_dataset_context,
    memory_context,
    set_memory_limits,
)

print(f"NLSQ version: {__version__}")
print("NLSQ Large Dataset Demo - Enhanced Version")

✅ Python 3.13 meets requirements


NLSQ version: 0.5.5.dev1
NLSQ Large Dataset Demo - Enhanced Version


### Define Model Functions

We'll use several model functions throughout this tutorial to demonstrate different aspects of large dataset fitting.

In [4]:
def exponential_decay(x, a, b, c):
    """Exponential decay model with offset: y = a * exp(-b * x) + c"""
    return a * jnp.exp(-b * x) + c


def polynomial_model(x, a, b, c, d):
    """Polynomial model: y = a*x^3 + b*x^2 + c*x + d"""
    return a * x**3 + b * x**2 + c * x + d


def gaussian(x, a, mu, sigma, offset):
    """Gaussian model: y = a * exp(-((x - mu)^2) / (2*sigma^2)) + offset"""
    return a * jnp.exp(-((x - mu) ** 2) / (2 * sigma**2)) + offset


def complex_model(x, a, b, c, d, e, f):
    """Complex model with many parameters for algorithm selection testing"""
    return a * jnp.exp(-b * x) + c * jnp.sin(d * x) + e * x**2 + f

## 1. Memory Estimation

**Key concept:** Before fitting large datasets, use `estimate_memory_requirements()` to predict memory usage and determine the optimal processing strategy.

**Why it matters:** Prevents out-of-memory errors and helps you choose between single-pass, chunked, or streaming approaches.

**How it works:**

1. Calculates memory needed for data arrays (x, y)
2. Estimates Jacobian matrix size (n_points × n_params)
3. Accounts for JAX compilation overhead
4. Recommends chunk count based on available memory
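
As a rough cross-check on the estimation steps, the arithmetic can be sketched in plain Python. The constants here are not NLSQ's internal formula: they are an assumption fitted to the figures this notebook logs (146 bytes per point for 3 parameters, 170 bytes for 4), and the 1,000,000-point chunk size is likewise read off the logged output. `approx_memory_gb` and `approx_n_chunks` are hypothetical helper names.

```python
import math

# Assumed constants, inferred from this notebook's log output rather than
# taken from NLSQ's source: 146 bytes/point at 3 params, 170 at 4 params.
BASE_BYTES_PER_POINT = 74        # hypothetical fixed overhead per data point
BYTES_PER_PARAM = 24             # hypothetical cost per parameter per point
OBSERVED_CHUNK_SIZE = 1_000_000  # chunk size reported in the logs


def approx_memory_gb(n_points: int, n_params: int) -> float:
    """Approximate total memory in GiB for a fit of the given size."""
    bytes_per_point = BASE_BYTES_PER_POINT + BYTES_PER_PARAM * n_params
    return n_points * bytes_per_point / 1024**3


def approx_n_chunks(n_points: int, chunk_size: int = OBSERVED_CHUNK_SIZE) -> int:
    """Number of chunks needed if each chunk holds at most chunk_size points."""
    return max(1, math.ceil(n_points / chunk_size))


for n in (100_000, 1_000_000, 10_000_000):
    print(f"{n:>11,} points: {approx_memory_gb(n, 3):.3f} GB, "
          f"{approx_n_chunks(n)} chunk(s)")
```

A back-of-the-envelope model like this is only useful for sanity-checking orders of magnitude; for real decisions, rely on the values returned by `estimate_memory_requirements()`.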

In [5]:
def demo_memory_estimation():
    """Demonstrate memory estimation capabilities."""
    print("=" * 60)
    print("MEMORY ESTIMATION DEMO")
    print("=" * 60)

    # Estimate requirements for different dataset sizes
    test_cases = [
        (100_000, 3, "Small dataset"),
        (1_000_000, 3, "Medium dataset"),
        (10_000_000, 3, "Large dataset"),
        (50_000_000, 3, "Very large dataset"),
        (100_000_000, 3, "Extremely large dataset"),
    ]

    for n_points, n_params, description in test_cases:
        stats = estimate_memory_requirements(n_points, n_params)
        print(f"\n{description} ({n_points:,} points, {n_params} parameters):")
        print(f"  Total memory estimate: {stats.total_memory_estimate_gb:.3f} GB")
        print(f"  Number of chunks: {stats.n_chunks}")

        # Determine strategy description
        if stats.n_chunks == 1:
            print("  Strategy: Single pass (fits in memory)")
        elif stats.n_chunks > 1:
            print(f"  Strategy: Chunked processing ({stats.n_chunks} chunks)")

        # For very large datasets, suggest streaming
        if n_points > 50_000_000:
            print("  💡 Consider: Streaming optimization for zero accuracy loss")

demo_memory_estimation()

MEMORY ESTIMATION DEMO

Small dataset (100,000 points, 3 parameters):
  Total memory estimate: 0.014 GB
  Number of chunks: 1
  Strategy: Single pass (fits in memory)

Medium dataset (1,000,000 points, 3 parameters):
  Total memory estimate: 0.136 GB
  Number of chunks: 1
  Strategy: Single pass (fits in memory)

Large dataset (10,000,000 points, 3 parameters):
  Total memory estimate: 1.360 GB
  Number of chunks: 10
  Strategy: Chunked processing (10 chunks)

Very large dataset (50,000,000 points, 3 parameters):
  Total memory estimate: 6.799 GB
  Number of chunks: 50
  Strategy: Chunked processing (50 chunks)

Extremely large dataset (100,000,000 points, 3 parameters):
  Total memory estimate: 13.597 GB
  Number of chunks: 100
  Strategy: Chunked processing (100 chunks)
  💡 Consider: Streaming optimization for zero accuracy loss


## 2. Advanced Configuration & Algorithm Selection

**Key concept:** NLSQ provides sophisticated configuration management and automatic algorithm selection for optimal performance.

**Features:**

- **`get_memory_config()`** - View current memory settings
- **`configure_for_large_datasets()`** - Optimize settings for large data
- **`auto_select_algorithm()`** - Automatically choose best optimization algorithm
- **Context managers** - Temporarily change settings for specific operations

In [6]:
def demo_advanced_configuration():
    """Demonstrate advanced configuration and algorithm selection."""
    print("=" * 60)
    print("ADVANCED CONFIGURATION & ALGORITHM SELECTION DEMO")
    print("=" * 60)

    # Current memory configuration
    current_config = get_memory_config()
    print("Current memory configuration:")
    print(f"  Memory limit: {current_config.memory_limit_gb} GB")
    print(f"  Mixed precision fallback: {current_config.enable_mixed_precision_fallback}")

    # Automatically configure for large datasets
    print("\nConfiguring for large dataset processing...")
    configure_for_large_datasets(memory_limit_gb=8.0, enable_chunking=True)

    # Show updated configuration
    new_config = get_memory_config()
    print(f"Updated memory limit: {new_config.memory_limit_gb} GB")

    # Generate test dataset for algorithm selection
    print("\n=== Algorithm Selection Demo ===")
    np.random.seed(42)

    # Test different model complexities
    test_cases = [
        ("Simple exponential", exponential_decay, 3, [5.0, 1.2, 0.5]),
        ("Polynomial", polynomial_model, 4, [0.1, -0.5, 2.0, 1.0]),
        ("Complex multi-param", complex_model, 6, [3.0, 0.8, 1.5, 2.0, 0.1, 0.2]),
    ]

    for model_name, model_func, n_params, true_params in test_cases:
        print(f"\n{model_name} ({n_params} parameters):")

        # Generate sample data
        n_sample = 10000  # Smaller sample for algorithm analysis
        x_sample = np.linspace(0, 5, n_sample)
        y_sample = model_func(x_sample, *true_params) + np.random.normal(
            0, 0.05, n_sample
        )

        # Get algorithm recommendation
        try:
            recommendations = auto_select_algorithm(model_func, x_sample, y_sample)
            print(f"  Recommended algorithm: {recommendations['algorithm']}")
            print(f"  Recommended tolerance: {recommendations['ftol']}")
            print(f"  Problem complexity: {recommendations.get('complexity', 'Unknown')}")

            # Estimate memory for full dataset
            large_n = 1_000_000  # 1M points
            stats = estimate_memory_requirements(large_n, n_params)
            print(f"  Memory for 1M points: {stats.total_memory_estimate_gb:.3f} GB")
            print(f"  Chunking strategy: {'Required' if stats.n_chunks > 1 else 'Not needed'}")
        except Exception as e:
            print(f"  Algorithm selection failed: {e}")
            print(f"  Using default settings for {model_name}")

# Run the demo
demo_advanced_configuration()

ADVANCED CONFIGURATION & ALGORITHM SELECTION DEMO
Current memory configuration:
  Memory limit: 8.0 GB
  Mixed precision fallback: True

Configuring for large dataset processing...
Updated memory limit: 8.0 GB

=== Algorithm Selection Demo ===

Simple exponential (3 parameters):


  Recommended algorithm: trf
  Recommended tolerance: 1e-08
  Problem complexity: Unknown
  Memory for 1M points: 0.136 GB
  Chunking strategy: Not needed

Polynomial (4 parameters):
  Recommended algorithm: trf
  Recommended tolerance: 1e-08
  Problem complexity: Unknown
  Memory for 1M points: 0.158 GB
  Chunking strategy: Not needed

Complex multi-param (6 parameters):
  Recommended algorithm: trf
  Recommended tolerance: 1e-08
  Problem complexity: Unknown
  Memory for 1M points: 0.203 GB
  Chunking strategy: Not needed


## 3. Basic Large Dataset Fitting

**Key function:** `fit_large_dataset()` - Convenience function for automatic large dataset handling

**Features:**

- Automatic memory management
- Progress reporting for long-running fits
- Intelligent strategy selection (single-pass, chunked, or streaming)
- Returns standard `OptimizeResult` with fitted parameters

In [7]:
def demo_basic_large_dataset_fitting():
    """Demonstrate basic large dataset fitting."""
    print("\n" + "=" * 60)
    print("BASIC LARGE DATASET FITTING DEMO")
    print("=" * 60)

    # Generate synthetic large dataset (1M points)
    print("Generating 1M point exponential decay dataset...")
    np.random.seed(42)
    n_points = 1_000_000
    x_data = np.linspace(0, 5, n_points, dtype=np.float64)
    true_params = [5.0, 1.2, 0.5]
    noise_level = 0.05
    y_true = true_params[0] * np.exp(-true_params[1] * x_data) + true_params[2]
    y_data = y_true + np.random.normal(0, noise_level, n_points)

    print(f"Dataset: {n_points:,} points")
    print(
        f"True parameters: a={true_params[0]}, b={true_params[1]}, c={true_params[2]}"
    )

    # Fit using convenience function
    print("\nFitting with automatic memory management...")
    start_time = time.time()
    result = fit_large_dataset(
        exponential_decay,
        x_data,
        y_data,
        p0=[4.0, 1.0, 0.4],
        memory_limit_gb=2.0,  # 2GB limit
        show_progress=True,
    )
    fit_time = time.time() - start_time

    if result.success:
        fitted_params = np.array(result.popt)
        errors = np.abs(fitted_params - np.array(true_params))
        rel_errors = errors / np.array(true_params) * 100
        print(f"\n✅ Fit completed in {fit_time:.2f} seconds")
        print(
            f"Fitted parameters: [{fitted_params[0]:.3f}, {fitted_params[1]:.3f}, {fitted_params[2]:.3f}]"
        )
        print(f"Absolute errors: [{errors[0]:.4f}, {errors[1]:.4f}, {errors[2]:.4f}]")
        print(
            f"Relative errors: [{rel_errors[0]:.2f}%, {rel_errors[1]:.2f}%, {rel_errors[2]:.2f}%]"
        )
    else:
        print(f"❌ Fit failed: {result.message}")


# Run the demo
demo_basic_large_dataset_fitting()

INFO:nlsq.streaming.large_dataset:Dataset analysis for 1,000,000 points, 3 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 146.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.14 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 1,000,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 1


INFO:nlsq.streaming.large_dataset:Fitting dataset in single chunk


INFO:nlsq.curve_fit:Starting curve fit n_params=3 | n_data_points=1000000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=3 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08



BASIC LARGE DATASET FITTING DEMO
Generating 1M point exponential decay dataset...
Dataset: 1,000,000 points
True parameters: a=5.0, b=1.2, c=0.5

Fitting with automatic memory management...


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=3 | n_residuals=1000000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=33388.95287858993 | grad_norm=1.3658e+05 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1451.3621154759871 | grad_norm=1.1619e+04 | step=4.142463035441596 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1250.605606748687 | grad_norm=469.9238 | step=4.142463035441596 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=1250.4680999115978 | grad_norm=0.1148 | step=4.142463035441596 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=2.853038s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=4 | final_cost=1250.4681 | elapsed=2.853s | final_gradient_norm=5.6343e-07


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=3.857506s

✅ Fit completed in 4.06 seconds
Fitted parameters: [5.000, 1.200, 0.500]
Absolute errors: [0.0002, 0.0000, 0.0001]
Relative errors: [0.00%, 0.00%, 0.03%]


## 4. Context Managers for Temporary Configuration

**Key concept:** Use context managers to temporarily change settings without affecting global state.

**Available contexts:**

- **`memory_context(MemoryConfig)`** - Temporarily change memory settings
- **`large_dataset_context(LargeDatasetConfig)`** - Optimize for large dataset processing

**Why use context managers:**

- Settings automatically restore after the context exits
- Safe for nested operations
- Allows experimenting with different configurations
- No risk of forgetting to restore settings
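
To make the restore-on-exit behaviour concrete, here is a minimal, library-independent sketch of the pattern using `contextlib`. `DemoConfig`, `get_config`, and `temporary_config` are hypothetical names invented for this illustration; they are not NLSQ's implementation of `memory_context`, only the general save/swap/restore idea such a context manager typically follows.

```python
from contextlib import contextmanager
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical stand-in for a memory configuration object."""
    memory_limit_gb: float = 8.0


_current = DemoConfig()  # module-level "global" setting


def get_config() -> DemoConfig:
    """Return the configuration that is active right now."""
    return _current


@contextmanager
def temporary_config(config: DemoConfig):
    """Swap in config for the duration of the with-block, then restore."""
    global _current
    previous = _current
    _current = config
    try:
        yield config
    finally:
        # Runs even if the block raises, so the old setting always returns.
        _current = previous


with temporary_config(DemoConfig(memory_limit_gb=0.5)):
    print("inside:", get_config().memory_limit_gb)
    # Nesting works because each level remembers its own previous value.
    with temporary_config(replace(get_config(), memory_limit_gb=0.25)):
        print("nested:", get_config().memory_limit_gb)
    print("back to outer:", get_config().memory_limit_gb)
print("restored:", get_config().memory_limit_gb)
```

The `try`/`finally` is the important design choice: it is what guarantees the previous settings come back even when a fit inside the block fails.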

In [8]:
def demo_context_managers():
    """Demonstrate context managers for temporary configuration."""
    print("\n" + "=" * 60)
    print("CONTEXT MANAGERS DEMO")
    print("=" * 60)

    # Show current configuration
    original_mem_config = get_memory_config()
    print(f"Original memory limit: {original_mem_config.memory_limit_gb} GB")

    # Generate test data
    np.random.seed(555)
    n_points = 500_000
    x_data = np.linspace(0, 5, n_points)
    y_data = exponential_decay(x_data, 4.0, 1.5, 0.3) + np.random.normal(
        0, 0.05, n_points
    )
    print(f"Test dataset: {n_points:,} points")

    # Test 1: Memory context for memory-constrained fitting
    print("\n--- Test 1: Memory-constrained fitting ---")
    constrained_config = MemoryConfig(
        memory_limit_gb=0.5,  # Very low limit
        enable_mixed_precision_fallback=True,
    )

    with memory_context(constrained_config):
        temp_config = get_memory_config()
        print(f"Inside context memory limit: {temp_config.memory_limit_gb} GB")
        print(f"Mixed precision enabled: {temp_config.enable_mixed_precision_fallback}")

        start_time = time.time()
        result1 = fit_large_dataset(
            exponential_decay, x_data, y_data, p0=[3.5, 1.3, 0.25], show_progress=False
        )
        time1 = time.time() - start_time

        if result1.success:
            print(f"✅ Constrained fit completed: {time1:.3f}s")
            print(f"   Parameters: {result1.popt}")
        else:
            print(f"❌ Constrained fit failed: {result1.message}")

    # Check that configuration is restored
    restored_config = get_memory_config()
    print(f"After context memory limit: {restored_config.memory_limit_gb} GB")

    # Test 2: Large dataset context for optimized processing
    print("\n--- Test 2: Large dataset optimization ---")
    ld_config = LargeDatasetConfig()

    with large_dataset_context(ld_config):
        print("Inside large dataset context - chunking optimized")
        start_time = time.time()
        result2 = fit_large_dataset(
            exponential_decay, x_data, y_data, p0=[3.5, 1.3, 0.25], show_progress=False
        )
        time2 = time.time() - start_time

        if result2.success:
            print(f"✅ Optimized fit completed: {time2:.3f}s")
            print(f"   Parameters: {result2.popt}")
        else:
            print(f"❌ Optimized fit failed: {result2.message}")

    print("\n✓ Context managers allow flexible, temporary configuration changes!")


# Run the demo
demo_context_managers()

INFO:nlsq.streaming.large_dataset:Dataset analysis for 500,000 points, 3 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 146.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.07 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 500,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 1


INFO:nlsq.streaming.large_dataset:Fitting dataset in single chunk


INFO:nlsq.curve_fit:Starting curve fit n_params=3 | n_data_points=500000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False



CONTEXT MANAGERS DEMO
Original memory limit: 8.0 GB
Test dataset: 500,000 points

--- Test 1: Memory-constrained fitting ---
Inside context memory limit: 0.5 GB
Mixed precision enabled: True


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=3 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=3 | n_residuals=500000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=3381.8141561933803 | grad_norm=2.2684e+04 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=636.8694109166609 | grad_norm=626.3187 | step=3.7419914484135317 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=627.7287040967236 | grad_norm=9.6397 | step=3.7419914484135317 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=627.7285590198587 | grad_norm=2.7295e-04 | step=3.7419914484135317 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=1.587960s


INFO:nlsq.least_squares:Convergence reason=Both `ftol` and `xtol` termination conditions are satisfied. | iterations=4 | final_cost=627.7286 | elapsed=1.588s | final_gradient_norm=1.6683e-08


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=2.150968s




INFO:nlsq.streaming.large_dataset:Dataset analysis for 500,000 points, 3 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 146.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.07 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 500,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 1


INFO:nlsq.streaming.large_dataset:Fitting dataset in single chunk


INFO:nlsq.curve_fit:Starting curve fit n_params=3 | n_data_points=500000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=3 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08

✅ Constrained fit completed: 2.303s
   Parameters: [4.00017172 1.49998995 0.29995423]
After context memory limit: 8.0 GB

--- Test 2: Large dataset optimization ---
Inside large dataset context - chunking optimized


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=3 | n_residuals=500000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=3381.8141561933803 | grad_norm=2.2684e+04 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=636.8694109166609 | grad_norm=626.3187 | step=3.7419914484135317 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=627.7287040967236 | grad_norm=9.6397 | step=3.7419914484135317 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=627.7285590198587 | grad_norm=2.7295e-04 | step=3.7419914484135317 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.554497s


INFO:nlsq.least_squares:Convergence reason=Both `ftol` and `xtol` termination conditions are satisfied. | iterations=4 | final_cost=627.7286 | elapsed=0.554s | final_gradient_norm=1.6683e-08


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.946995s

✅ Optimized fit completed: 1.030s
   Parameters: [4.00017172 1.49998995 0.29995423]

✓ Context managers allow flexible, temporary configuration changes!


## 5. Chunked Processing

**Key concept:** For datasets that don't fit in memory, NLSQ automatically chunks the data and processes it in batches using an advanced exponential moving average algorithm.

**How it works:**

1. Dataset divided into manageable chunks based on memory limit
2. Each chunk processed separately to compute partial gradient
3. Gradients combined using exponential moving average
4. Achieves <1% error for well-conditioned problems

**When to use:**

- Dataset larger than available RAM
- Memory-constrained environments
- Well-conditioned optimization problems
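
The idea of combining per-chunk results with an exponential moving average (EMA) can be illustrated with a small, dependency-free sketch. This is a deliberate simplification and not NLSQ's internal algorithm: it fits a straight line to each chunk in closed form and blends the resulting parameter estimates, rather than gradients. The function names and the smoothing factor `alpha` are assumptions made for the example.

```python
import random


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def chunked_ema_fit(xs, ys, chunk_size, alpha=0.3):
    """Fit each chunk separately and blend the results with an EMA."""
    estimate = None
    for start in range(0, len(xs), chunk_size):
        chunk_fit = fit_line(xs[start:start + chunk_size],
                             ys[start:start + chunk_size])
        if estimate is None:
            estimate = chunk_fit  # first chunk seeds the running estimate
        else:
            estimate = tuple((1 - alpha) * old + alpha * new
                             for old, new in zip(estimate, chunk_fit))
    return estimate


rng = random.Random(123)
# Randomly drawn x values, so every chunk covers the full range.
xs = [rng.uniform(-2.0, 2.0) for _ in range(20_000)]
ys = [2.0 * x + 1.5 + rng.gauss(0.0, 0.1) for x in xs]

slope, intercept = chunked_ema_fit(xs, ys, chunk_size=5_000)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

The sketch also hints at why the approach suits well-conditioned problems: when each chunk is representative of the whole dataset, the per-chunk estimates agree closely and the blend stays near the full-data answer. Chunks that each cover only a narrow slice of `x` would give noisier individual fits.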

In [9]:
def demo_chunked_processing():
    """Demonstrate chunked processing with progress reporting."""
    print("\n" + "=" * 60)
    print("CHUNKED PROCESSING DEMO")
    print("=" * 60)

    # Generate a dataset that will require chunking
    print("Generating 2M point polynomial dataset...")
    np.random.seed(123)
    n_points = 2_000_000
    x_data = np.linspace(-2, 2, n_points, dtype=np.float64)
    true_params = [0.5, -1.2, 2.0, 1.5]
    noise_level = 0.1
    y_true = (
        true_params[0] * x_data**3
        + true_params[1] * x_data**2
        + true_params[2] * x_data
        + true_params[3]
    )
    y_data = y_true + np.random.normal(0, noise_level, n_points)

    print(f"Dataset: {n_points:,} points")
    print(f"True parameters: {true_params}")

    # Create fitter with limited memory to force chunking
    fitter = LargeDatasetFitter(memory_limit_gb=0.5)  # Small limit to force chunking

    # Get processing recommendations
    recs = fitter.get_memory_recommendations(n_points, 4)
    print(f"\nProcessing strategy: {recs['processing_strategy']}")
    print(f"Chunk size: {recs['recommendations']['chunk_size']:,}")
    print(f"Number of chunks: {recs['recommendations']['n_chunks']}")
    print(
        f"Memory estimate: {recs['recommendations']['total_memory_estimate_gb']:.2f} GB"
    )

    # Fit with progress reporting
    print("\nFitting with chunked processing...")
    start_time = time.time()
    result = fitter.fit_with_progress(
        polynomial_model, x_data, y_data, p0=[0.4, -1.0, 1.8, 1.2]
    )
    fit_time = time.time() - start_time

    if result.success:
        fitted_params = np.array(result.popt)
        errors = np.abs(fitted_params - np.array(true_params))
        rel_errors = errors / np.abs(np.array(true_params)) * 100
        print(f"\n✅ Chunked fit completed in {fit_time:.2f} seconds")

        if hasattr(result, "n_chunks"):
            print(
                f"Used {result.n_chunks} chunks with {result.success_rate:.1%} success rate"
            )

        print(f"Fitted parameters: {fitted_params}")
        print(f"Absolute errors: {errors}")
        print(f"Relative errors: {rel_errors}%")
    else:
        print(f"❌ Chunked fit failed: {result.message}")


# Run the demo
demo_chunked_processing()

INFO:nlsq.streaming.large_dataset:Dataset analysis for 2,000,000 points, 4 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 170.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.32 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 1,000,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 2


INFO:nlsq.streaming.large_dataset:Dataset analysis for 2,000,000 points, 4 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 170.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.32 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 1,000,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 2


INFO:nlsq.streaming.large_dataset:Auto-enabled mixed precision for chunked processing (50% additional memory savings)


INFO:nlsq.streaming.large_dataset:Mixed precision optimization enabled (float32 -> float64 fallback)


INFO:nlsq.streaming.large_dataset:Fitting dataset using 2 chunks


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=1000000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08



CHUNKED PROCESSING DEMO
Generating 2M point polynomial dataset...
Dataset: 2,000,000 points
True parameters: [0.5, -1.2, 2.0, 1.5]

Processing strategy: chunked
Chunk size: 1,000,000
Number of chunks: 2
Memory estimate: 0.32 GB

Fitting with chunked processing...


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=1000000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=236991.47534147004 | grad_norm=2.0206e+06 | nfev=1


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=2.714109s


INFO:nlsq.least_squares:Convergence reason=`gtol` termination condition is satisfied. | iterations=1 | final_cost=5002.5690 | elapsed=2.714s | final_gradient_norm=8.6732e-10


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=3.703224s




INFO:nlsq.streaming.large_dataset:Progress: 1/2 chunks (50.0%) - ETA: 3.9s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=1000000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=1000000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=5017.4989236345355 | grad_norm=1.4634e+04 | nfev=1


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.040305s


INFO:nlsq.least_squares:Convergence reason=`gtol` termination condition is satisfied. | iterations=1 | final_cost=5004.8308 | elapsed=0.040s | final_gradient_norm=1.8653e-10


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.609524s




INFO:nlsq.streaming.large_dataset:Progress: 2/2 chunks (100.0%) - ETA: 0.0s


INFO:nlsq.streaming.large_dataset:Chunked fit completed with 100.0% success rate

✅ Chunked fit completed in 4.73 seconds
Used 2 chunks with 100.0% success rate
Fitted parameters: [ 0.49954434 -1.1991878   2.00000184  1.49975205]
Absolute errors: [4.55663613e-04 8.12197789e-04 1.84432241e-06 2.47951909e-04]
Relative errors: [9.11327226e-02 6.76831491e-02 9.22161203e-05 1.65301273e-02]%


## 6. Streaming Optimization for Unlimited Datasets

**Key concept:** For datasets too large to fit in memory, NLSQ uses streaming optimization with mini-batch gradient descent. Unlike subsampling (removed), streaming processes **100% of data with zero accuracy loss**.

**⚠️ Deprecation Notice:**

- **Removed:** Subsampling (which caused data loss)
- **Added:** Streaming optimization (processes all data)
- **Deprecated:** `enable_sampling`, `sampling_threshold`, `max_sampled_size` parameters now emit warnings

**How streaming works:**

1. Processes data in sequential batches
2. Uses mini-batch gradient descent
3. No data is skipped or discarded
4. Zero accuracy loss compared to full dataset processing

**When to use:**

- Dataset > available RAM
- Unlimited or continuously generated data
- When accuracy is critical
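
The batch-by-batch idea can be sketched without any library. The example below is an illustration under stated assumptions, not NLSQ's streaming optimizer: it uses a linear model (chosen because plain gradient descent converges predictably on it), a fixed learning rate, and a hypothetical `batches` generator that produces data on the fly so that only one batch is ever held in memory.

```python
import random


def batches(n_points, batch_size, rng):
    """Yield (xs, ys) batches one at a time, as a data stream would."""
    for _ in range(n_points // batch_size):
        xs = [rng.random() for _ in range(batch_size)]
        ys = [3.0 * x + 0.2 + rng.gauss(0.0, 0.05) for x in xs]
        yield xs, ys


def streaming_fit(stream, lr=0.2):
    """Mini-batch gradient descent for y = m * x + c on a batch stream."""
    m, c = 0.0, 0.0
    for xs, ys in stream:
        n = len(xs)
        residuals = [m * x + c - y for x, y in zip(xs, ys)]
        # Gradient of half the mean squared error with respect to m and c.
        grad_m = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_c = sum(residuals) / n
        m -= lr * grad_m
        c -= lr * grad_c
    return m, c


rng = random.Random(777)
m, c = streaming_fit(batches(n_points=300_000, batch_size=100, rng=rng))
print(f"m={m:.3f}, c={c:.3f}")
```

Every generated point contributes to exactly one parameter update and is then released, which is what lets this style of optimization scale to data that never fits in memory at once.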

In [10]:
def demo_streaming_optimization():
    """Demonstrate streaming optimization for unlimited datasets."""
    print("\n" + "=" * 60)
    print("STREAMING OPTIMIZATION DEMO")
    print("=" * 60)

    # Simulate a very large dataset scenario
    print("Simulating extremely large dataset (100M points)...")
    print("Using streaming optimization for zero data loss\n")

    n_points_full = 100_000_000  # 100M points
    true_params = [3.0, 0.8, 0.2]

    # For demo purposes, generate a representative dataset
    # In production, streaming would process full dataset in batches
    print("Generating representative dataset for demo...")
    np.random.seed(777)
    n_demo = 1_000_000  # 1M points for demo
    x_data = np.linspace(0, 10, n_demo)
    y_data = exponential_decay(x_data, *true_params) + np.random.normal(0, 0.1, n_demo)

    # Memory estimation
    stats = estimate_memory_requirements(n_points_full, len(true_params))
    print(f"\nFull dataset memory estimate: {stats.total_memory_estimate_gb:.1f} GB")
    print(f"Number of chunks required: {stats.n_chunks}")

    # Configure streaming optimization
    print("\nConfiguring streaming optimization...")
    config = LDMemoryConfig(
        memory_limit_gb=4.0,
        use_streaming=True,  # Enable streaming
        streaming_batch_size=50000,  # Process 50K points per batch
    )
    fitter = LargeDatasetFitter(config=config)

    print("\nFitting with streaming optimization...")
    print("(Processing 100% of data in batches)\n")

    try:
        start_time = time.time()
        result = fitter.fit(exponential_decay, x_data, y_data, p0=[2.5, 0.6, 0.15])
        fit_time = time.time() - start_time

        if result.success:
            print(f"\n✅ Streaming fit completed in {fit_time:.2f} seconds")
            print(f"\nFitted parameters: {result.x}")
            print(f"True parameters:    {true_params}")

            errors = np.abs(result.x - np.array(true_params))
            rel_errors = errors / np.abs(np.array(true_params)) * 100
            print(f"Relative errors:    {[f'{e:.2f}%' for e in rel_errors]}")
            print("\nℹ️ Streaming processed 100% of data (zero accuracy loss)")
        else:
            print(f"❌ Streaming fit failed: {result.message}")
    except Exception as e:
        print(f"❌ Error during streaming fit: {e}")


demo_streaming_optimization()


STREAMING OPTIMIZATION DEMO
Simulating extremely large dataset (100M points)...
Using streaming optimization for zero data loss

Generating representative dataset for demo...


INFO:nlsq.streaming.large_dataset:Dataset analysis for 1,000,000 points, 3 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 146.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.14 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 1,000,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 1


INFO:nlsq.streaming.large_dataset:Fitting dataset in single chunk


INFO:nlsq.curve_fit:Starting curve fit n_params=3 | n_data_points=1000000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=3 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08



Full dataset memory estimate: 13.6 GB
Number of chunks required: 100

Configuring streaming optimization...

Fitting with streaming optimization...
(Processing 100% of data in batches)



INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=3 | n_residuals=1000000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=10166.595424129187 | grad_norm=1.7173e+04 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=5168.709318538771 | grad_norm=1.3786e+04 | step=2.5753640519351824 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=5001.171831123859 | grad_norm=41.5858 | step=2.5753640519351824 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=5001.169795820514 | grad_norm=0.0057 | step=2.5753640519351824 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.667063s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=4 | final_cost=5001.1698 | elapsed=0.667s | final_gradient_norm=9.3184e-07


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=1.151636s





✅ Streaming fit completed in 1.25 seconds

Fitted parameters: [3.00009638 0.80008703 0.20011093]
True parameters:    [3.0, 0.8, 0.2]
Relative errors:    ['0.00%', '0.01%', '0.06%']

ℹ️ Streaming processed 100% of data (zero accuracy loss)
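The memory figures in the output above follow from simple arithmetic on the logged per-point estimate (146 bytes per point for 3 parameters). The sketch below reproduces them, assuming "GB" in the log means 2^30 bytes, which is the interpretation that matches the printed values. `estimate_total_gb` is a hypothetical helper written for this illustration; it is not NLSQ's `estimate_memory_requirements`.

```python
BYTES_PER_GIB = 1024**3  # assumption: the log's "GB" is 2**30 bytes


def estimate_total_gb(n_points: int, bytes_per_point: float) -> float:
    """Total memory estimate as points times bytes per point, in GiB."""
    return n_points * bytes_per_point / BYTES_PER_GIB


# Per-point figure taken from the log output for the 3-parameter model
full_gb = estimate_total_gb(100_000_000, 146.0)  # full 100M-point scenario
demo_gb = estimate_total_gb(1_000_000, 146.0)    # 1M-point demo dataset

print(f"Full dataset: {full_gb:.1f} GB")  # matches the 13.6 GB reported above
print(f"Demo dataset: {demo_gb:.2f} GB")  # matches the 0.14 GB in the log
```

Running this kind of estimate before a fit is a quick way to see whether a dataset fits under a chosen `memory_limit_gb`.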


## 7. curve_fit_large Convenience Function

**Key function:** `curve_fit_large()` provides automatic detection and handling of large datasets.

**Features:**

- Automatic dataset size detection
- Intelligent processing strategy selection
- SciPy-compatible API (drop-in replacement)
- Returns standard `(popt, pcov)` tuple

**When to use:**

- You want automatic handling of both small and large datasets
- You are migrating from SciPy's `curve_fit`
- You don't want to manually configure chunking/streaming
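Because the return value is a standard `(popt, pcov)` tuple, the usual post-processing applies regardless of which strategy produced it. The sketch below shows how standard errors and parameter correlations can be read off a covariance matrix. It uses `np.polyfit(..., cov=True)` purely as a stand-in that also returns a parameter vector and covariance matrix; it does not call NLSQ, and `summarize_covariance` is a hypothetical helper name.

```python
import numpy as np


def summarize_covariance(pcov):
    """Return (standard errors, correlation matrix) from a covariance matrix."""
    pcov = np.asarray(pcov, dtype=float)
    std = np.sqrt(np.diag(pcov))      # 1-sigma uncertainty per parameter
    corr = pcov / np.outer(std, std)  # normalize to unit diagonal
    return std, corr


# Stand-in fit: a straight line, so the example needs only NumPy
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, x.size)
popt, pcov = np.polyfit(x, y, deg=1, cov=True)

std, corr = summarize_covariance(pcov)
for name, value, err in zip(["slope", "intercept"], popt, std):
    print(f"{name}: {value:.4f} +/- {err:.4f}")
print("correlation matrix:")
print(corr)
```

Large off-diagonal correlations indicate that two parameters trade off against each other, which is worth checking before quoting individual uncertainties.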

In [11]:
def demo_curve_fit_large():
    """Demonstrate the curve_fit_large convenience function."""
    print("\n" + "=" * 60)
    print("CURVE_FIT_LARGE CONVENIENCE FUNCTION DEMO")
    print("=" * 60)

    # Generate test dataset
    print("Generating 3M point dataset for curve_fit_large demo...")
    np.random.seed(789)
    n_points = 3_000_000
    x_data = np.linspace(0, 10, n_points, dtype=np.float64)
    true_params = [5.0, 5.0, 1.5, 0.5]
    y_true = gaussian(x_data, *true_params)
    y_data = y_true + np.random.normal(0, 0.1, n_points)

    print(f"Dataset: {n_points:,} points")
    print(
        f"True parameters: a={true_params[0]:.2f}, mu={true_params[1]:.2f}, sigma={true_params[2]:.2f}, offset={true_params[3]:.2f}"
    )

    # Use curve_fit_large - automatic large dataset handling
    print("\nUsing curve_fit_large with automatic optimization...")
    start_time = time.time()
    popt, pcov = curve_fit_large(
        gaussian,
        x_data,
        y_data,
        p0=[4.5, 4.8, 1.3, 0.4],
        memory_limit_gb=1.0,  # Force chunking with low memory limit
        show_progress=True,
        auto_size_detection=True,  # Automatically detect large dataset
    )
    fit_time = time.time() - start_time

    errors = np.abs(popt - np.array(true_params))
    # Divide by |true value| so relative errors stay positive for negative parameters
    rel_errors = errors / np.abs(np.array(true_params)) * 100
    print(f"\n✅ curve_fit_large completed in {fit_time:.2f} seconds")
    print(f"Fitted parameters: {popt}")
    print(f"Absolute errors: {errors}")
    print(f"Relative errors: {rel_errors}%")

    # Show parameter uncertainties from covariance matrix
    param_std = np.sqrt(np.diag(pcov))
    print(f"Parameter uncertainties (std): {param_std}")


# Run the demo
demo_curve_fit_large()


CURVE_FIT_LARGE CONVENIENCE FUNCTION DEMO
Generating 3M point dataset for curve_fit_large demo...


INFO:nlsq.streaming.large_dataset:Dataset analysis for 3,000,000 points, 4 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 170.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.47 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 300,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 10


INFO:nlsq.streaming.large_dataset:Auto-enabled mixed precision for chunked processing (50% additional memory savings)


INFO:nlsq.streaming.large_dataset:Mixed precision optimization enabled (float32 -> float64 fallback)


INFO:nlsq.streaming.large_dataset:Fitting dataset using 10 chunks


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


Dataset: 3,000,000 points
True parameters: a=5.00, mu=5.00, sigma=1.50, offset=0.50

Using curve_fit_large with automatic optimization...


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=4504.637924565895 | grad_norm=4.2087e+04 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1539.0775457264363 | grad_norm=4750.7958 | step=6.71863081289633 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1500.882649087269 | grad_norm=54.1765 | step=3.359315406448165 | nfev=4


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=1500.8661268353856 | grad_norm=24.1317 | step=0.4199144258060207 | nfev=7


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=4 | cost=1500.8647711010408 | grad_norm=3.0261 | step=0.2099572129030103 | nfev=9


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=5 | cost=1500.8647401618496 | grad_norm=0.7967 | step=0.10497860645150517 | nfev=11


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=6 | cost=1500.8647380586845 | grad_norm=3.2991 | step=0.026244651612876292 | nfev=12


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=7 | cost=1500.8647079501825 | grad_norm=0.2075 | step=0.052489303225752584 | nfev=13


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=1.724199s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=8 | final_cost=1500.8647 | elapsed=1.724s | final_gradient_norm=0.8581


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=2.386124s




INFO:nlsq.streaming.large_dataset:Progress: 1/10 chunks (10.0%) - ETA: 23.0s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1500.2015949163897 | grad_norm=690.5659 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1499.9268526414614 | grad_norm=483.5279 | step=0.5913725706262271 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1499.6771777786362 | grad_norm=39.6605 | step=0.29568628531311353 | nfev=5


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=1499.6753500701545 | grad_norm=8.7392 | step=0.14784314265655676 | nfev=7


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=4 | cost=1499.6751527575452 | grad_norm=2.1297 | step=0.07392157132827838 | nfev=9


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=5 | cost=1499.6749858639846 | grad_norm=8.3226 | step=0.07392157132827838 | nfev=10


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=6 | cost=1499.6747624911072 | grad_norm=8.2090 | step=0.14784314265655676 | nfev=11


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=7 | cost=1499.6745912452989 | grad_norm=2.0351 | step=0.07392157132827838 | nfev=13


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=8 | cost=1499.6744490881515 | grad_norm=7.9604 | step=0.07392157132827838 | nfev=14


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=9 | cost=1499.674255456539 | grad_norm=7.8527 | step=0.07392157132827838 | nfev=15


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=10 | cost=1499.674072127344 | grad_norm=7.7164 | step=0.07392157132827838 | nfev=16


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=11 | cost=1499.673899047033 | grad_norm=7.5833 | step=0.07392157132827838 | nfev=17


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=12 | cost=1499.6737356782291 | grad_norm=7.4536 | step=0.07392157132827838 | nfev=18


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=13 | cost=1499.6735815144248 | grad_norm=7.3269 | step=0.07392157132827838 | nfev=19


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=14 | cost=1499.6734360794353 | grad_norm=7.2034 | step=0.07392157132827838 | nfev=20


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=15 | cost=1499.6732989253144 | grad_norm=7.0828 | step=0.07392157132827838 | nfev=21


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=16 | cost=1499.673169630433 | grad_norm=6.9652 | step=0.07392157132827838 | nfev=22


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=17 | cost=1499.673047797704 | grad_norm=6.8503 | step=0.07392157132827838 | nfev=23


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=18 | cost=1499.6729330529433 | grad_norm=6.7381 | step=0.07392157132827838 | nfev=24


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=19 | cost=1499.672825043354 | grad_norm=6.6286 | step=0.07392157132827838 | nfev=25


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=20 | cost=1499.6727234361238 | grad_norm=6.5217 | step=0.07392157132827838 | nfev=26


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=21 | cost=1499.672627917128 | grad_norm=6.4172 | step=0.07392157132827838 | nfev=27


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=22 | cost=1499.6725381897252 | grad_norm=6.3152 | step=0.07392157132827838 | nfev=28


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=23 | cost=1499.6724539736433 | grad_norm=6.2155 | step=0.07392157132827838 | nfev=29


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=24 | cost=1499.6723750039437 | grad_norm=6.1181 | step=0.07392157132827838 | nfev=30


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=25 | cost=1499.67230103006 | grad_norm=6.0229 | step=0.07392157132827838 | nfev=31


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=26 | cost=1499.6722318149066 | grad_norm=5.9299 | step=0.07392157132827838 | nfev=32


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=27 | cost=1499.6721671340479 | grad_norm=5.8390 | step=0.07392157132827838 | nfev=33


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=28 | cost=1499.672106774925 | grad_norm=5.7501 | step=0.07392157132827838 | nfev=34


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=29 | cost=1499.6720505361386 | grad_norm=5.6631 | step=0.07392157132827838 | nfev=35


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=30 | cost=1499.6719982267782 | grad_norm=5.5781 | step=0.07392157132827838 | nfev=36


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=31 | cost=1499.6719496657975 | grad_norm=5.4950 | step=0.07392157132827838 | nfev=37


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=32 | cost=1499.6719046814321 | grad_norm=5.4136 | step=0.07392157132827838 | nfev=38


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=33 | cost=1499.6718631106576 | grad_norm=5.3341 | step=0.07392157132827838 | nfev=39


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=34 | cost=1499.67182479868 | grad_norm=5.2562 | step=0.07392157132827838 | nfev=40


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=35 | cost=1499.671789598463 | grad_norm=5.1800 | step=0.07392157132827838 | nfev=41


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=36 | cost=1499.671757370284 | grad_norm=5.1055 | step=0.07392157132827838 | nfev=42


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=37 | cost=1499.6717279813204 | grad_norm=5.0325 | step=0.07392157132827838 | nfev=43


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=38 | cost=1499.6717013052598 | grad_norm=4.9610 | step=0.07392157132827838 | nfev=44


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=39 | cost=1499.6716772219384 | grad_norm=4.8911 | step=0.07392157132827838 | nfev=45


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=40 | cost=1499.671655616999 | grad_norm=4.8226 | step=0.07392157132827838 | nfev=46


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=41 | cost=1499.671636381574 | grad_norm=4.7555 | step=0.07392157132827838 | nfev=47


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=42 | cost=1499.6716194119836 | grad_norm=4.6898 | step=0.07392157132827838 | nfev=48


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.595501s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=43 | final_cost=1499.6716 | elapsed=0.596s | final_gradient_norm=4.6254


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.762132s




INFO:nlsq.streaming.large_dataset:Progress: 2/10 chunks (20.0%) - ETA: 13.7s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1685.3476844736037 | grad_norm=2.3091e+04 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1521.973650389447 | grad_norm=8832.9070 | step=0.870913237260669 | nfev=4


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1508.941065903442 | grad_norm=4803.7909 | step=0.870913237260669 | nfev=5


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=1504.8408595586548 | grad_norm=2857.0230 | step=0.870913237260669 | nfev=6


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=4 | cost=1502.3471090158996 | grad_norm=113.7448 | step=0.4354566186303345 | nfev=8


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=5 | cost=1502.3340248841905 | grad_norm=50.3163 | step=0.21772830931516726 | nfev=10


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=6 | cost=1502.3283101458464 | grad_norm=214.7546 | step=0.21772830931516726 | nfev=11


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=7 | cost=1502.308748656028 | grad_norm=211.3454 | step=0.21772830931516726 | nfev=12


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=8 | cost=1502.2906733094678 | grad_norm=213.7847 | step=0.21772830931516726 | nfev=13


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=9 | cost=1502.273745218542 | grad_norm=215.7213 | step=0.21772830931516726 | nfev=14


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=10 | cost=1502.2586868426329 | grad_norm=217.1940 | step=0.21772830931516726 | nfev=15


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=11 | cost=1502.2464751529942 | grad_norm=218.1157 | step=0.21772830931516726 | nfev=16


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=12 | cost=1502.2384648192456 | grad_norm=218.3884 | step=0.21772830931516726 | nfev=17


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=13 | cost=1502.2292455862982 | grad_norm=172.1698 | step=0.21772830931516726 | nfev=18


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=14 | cost=1502.2173640714989 | grad_norm=1.0163 | step=0.21772830931516726 | nfev=19


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.209892s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=15 | final_cost=1502.2174 | elapsed=0.210s | final_gradient_norm=5.5986e-06


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.333804s




INFO:nlsq.streaming.large_dataset:Progress: 3/10 chunks (30.0%) - ETA: 9.1s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1656.4477587988117 | grad_norm=1.2872e+04 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1501.6501318476164 | grad_norm=3317.5823 | step=6.8469171514547 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1495.4643096477512 | grad_norm=31.3849 | step=6.8469171514547 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=1495.4638751858156 | grad_norm=2.2495e-04 | step=6.8469171514547 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.047064s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=4 | final_cost=1495.4639 | elapsed=0.047s | final_gradient_norm=1.9542e-08


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.190621s




INFO:nlsq.streaming.large_dataset:Progress: 4/10 chunks (40.0%) - ETA: 6.3s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1548.8779116884373 | grad_norm=4087.4922 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1501.0145838423405 | grad_norm=67.5141 | step=7.139806277273369 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1500.944402671505 | grad_norm=1.6093 | step=7.139806277273369 | nfev=3


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.031015s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=3 | final_cost=1500.9444 | elapsed=0.031s | final_gradient_norm=1.1405e-04


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.166231s




INFO:nlsq.streaming.large_dataset:Progress: 5/10 chunks (50.0%) - ETA: 4.5s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1507.8138945069043 | grad_norm=2157.5293 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1501.6487467451545 | grad_norm=128.4823 | step=7.52650396877716 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1501.622584520537 | grad_norm=1.5783 | step=7.52650396877716 | nfev=3


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.031754s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=3 | final_cost=1501.6226 | elapsed=0.032s | final_gradient_norm=1.4718e-05


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.140544s




INFO:nlsq.streaming.large_dataset:Progress: 6/10 chunks (60.0%) - ETA: 3.2s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1515.5330438426763 | grad_norm=5138.3682 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1502.8459800688092 | grad_norm=152.3110 | step=7.297412504815711 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1502.8343220293823 | grad_norm=0.1459 | step=7.297412504815711 | nfev=3


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.031596s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=3 | final_cost=1502.8343 | elapsed=0.032s | final_gradient_norm=3.0556e-08


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.168682s




INFO:nlsq.streaming.large_dataset:Progress: 7/10 chunks (70.0%) - ETA: 2.1s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1509.02418426668 | grad_norm=3864.8224 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1500.1419436743008 | grad_norm=183.0366 | step=7.209229094484592 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1500.1225517680039 | grad_norm=0.7556 | step=7.209229094484592 | nfev=3


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.041694s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=3 | final_cost=1500.1226 | elapsed=0.042s | final_gradient_norm=1.9371e-06


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.156132s




INFO:nlsq.streaming.large_dataset:Progress: 8/10 chunks (80.0%) - ETA: 1.3s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1503.546021537445 | grad_norm=566.3477 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1502.4914504805793 | grad_norm=20.6827 | step=0.4921713924271676 | nfev=4


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1502.4902337845315 | grad_norm=26.7032 | step=0.1230428481067919 | nfev=6


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=1502.489158271236 | grad_norm=26.6393 | step=0.1230428481067919 | nfev=7


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=4 | cost=1502.4881157199243 | grad_norm=25.8222 | step=0.1230428481067919 | nfev=8


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=5 | cost=1502.4871447805156 | grad_norm=25.0261 | step=0.1230428481067919 | nfev=9


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=6 | cost=1502.4862400686147 | grad_norm=24.2637 | step=0.1230428481067919 | nfev=10


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=7 | cost=1502.4853960150583 | grad_norm=23.5337 | step=0.1230428481067919 | nfev=11


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=8 | cost=1502.4846076057995 | grad_norm=22.8343 | step=0.1230428481067919 | nfev=12


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=9 | cost=1502.4838703239207 | grad_norm=22.1641 | step=0.1230428481067919 | nfev=13


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=10 | cost=1502.4831800926086 | grad_norm=21.5216 | step=0.1230428481067919 | nfev=14


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=11 | cost=1502.4825332254773 | grad_norm=20.9054 | step=0.1230428481067919 | nfev=15


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=12 | cost=1502.4819263832192 | grad_norm=20.3142 | step=0.1230428481067919 | nfev=16


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=13 | cost=1502.4813565356912 | grad_norm=19.7468 | step=0.1230428481067919 | nfev=17


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=14 | cost=1502.4808209286825 | grad_norm=19.2021 | step=0.1230428481067919 | nfev=18


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=15 | cost=1502.480317054719 | grad_norm=18.6788 | step=0.1230428481067919 | nfev=19


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=16 | cost=1502.4798426273671 | grad_norm=18.1760 | step=0.1230428481067919 | nfev=20


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=17 | cost=1502.4793955585635 | grad_norm=17.6927 | step=0.1230428481067919 | nfev=21


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=18 | cost=1502.4789739385833 | grad_norm=17.2279 | step=0.1230428481067919 | nfev=22


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=19 | cost=1502.4785760182967 | grad_norm=16.7808 | step=0.1230428481067919 | nfev=23


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=20 | cost=1502.4782001934332 | grad_norm=16.3505 | step=0.1230428481067919 | nfev=24


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=21 | cost=1502.4778449905903 | grad_norm=15.9362 | step=0.1230428481067919 | nfev=25


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=22 | cost=1502.4775090547823 | grad_norm=15.5372 | step=0.1230428481067919 | nfev=26


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=23 | cost=1502.4771911383305 | grad_norm=15.1527 | step=0.1230428481067919 | nfev=27


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=24 | cost=1502.47689009094 | grad_norm=14.7821 | step=0.1230428481067919 | nfev=28


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=25 | cost=1502.4766048508227 | grad_norm=14.4247 | step=0.1230428481067919 | nfev=29


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=26 | cost=1502.4763344367357 | grad_norm=14.0800 | step=0.1230428481067919 | nfev=30


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=27 | cost=1502.4760779408398 | grad_norm=13.7474 | step=0.1230428481067919 | nfev=31


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=28 | cost=1502.4758345222779 | grad_norm=13.4263 | step=0.1230428481067919 | nfev=32


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=29 | cost=1502.4756034013913 | grad_norm=13.1162 | step=0.1230428481067919 | nfev=33


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=30 | cost=1502.475383854509 | grad_norm=12.8167 | step=0.1230428481067919 | nfev=34


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=31 | cost=1502.4751752092404 | grad_norm=12.5273 | step=0.1230428481067919 | nfev=35


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=32 | cost=1502.4749768402203 | grad_norm=12.2475 | step=0.1230428481067919 | nfev=36


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=33 | cost=1502.4747881652563 | grad_norm=11.9769 | step=0.1230428481067919 | nfev=37


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=34 | cost=1502.4746086418386 | grad_norm=11.7151 | step=0.1230428481067919 | nfev=38


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=35 | cost=1502.4744377639686 | grad_norm=11.4618 | step=0.1230428481067919 | nfev=39


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=36 | cost=1502.4742750592811 | grad_norm=11.2167 | step=0.1230428481067919 | nfev=40


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=37 | cost=1502.4741200864219 | grad_norm=10.9793 | step=0.1230428481067919 | nfev=41


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=38 | cost=1502.4739724326623 | grad_norm=10.7493 | step=0.1230428481067919 | nfev=42


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=39 | cost=1502.4738317117212 | grad_norm=10.5265 | step=0.1230428481067919 | nfev=43


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=40 | cost=1502.4736975617786 | grad_norm=10.3106 | step=0.1230428481067919 | nfev=44


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=41 | cost=1502.473569643658 | grad_norm=10.1013 | step=0.1230428481067919 | nfev=45


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=42 | cost=1502.473447639165 | grad_norm=9.8982 | step=0.1230428481067919 | nfev=46


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=43 | cost=1502.4733312495641 | grad_norm=9.7013 | step=0.1230428481067919 | nfev=47


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=44 | cost=1502.4732201941836 | grad_norm=9.5102 | step=0.1230428481067919 | nfev=48


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=45 | cost=1502.4731142091332 | grad_norm=9.3247 | step=0.1230428481067919 | nfev=49


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=46 | cost=1502.4730130461282 | grad_norm=9.1446 | step=0.1230428481067919 | nfev=50


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=47 | cost=1502.4729164714054 | grad_norm=8.9697 | step=0.1230428481067919 | nfev=51


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=48 | cost=1502.4728242647277 | grad_norm=8.7998 | step=0.1230428481067919 | nfev=52


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=49 | cost=1502.4727362184658 | grad_norm=8.6347 | step=0.1230428481067919 | nfev=53


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=50 | cost=1502.472652136751 | grad_norm=8.4743 | step=0.1230428481067919 | nfev=54


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=51 | cost=1502.4725718346951 | grad_norm=8.3183 | step=0.1230428481067919 | nfev=55


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=52 | cost=1502.4724951376666 | grad_norm=8.1666 | step=0.1230428481067919 | nfev=56


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=53 | cost=1502.4724218806239 | grad_norm=8.0190 | step=0.1230428481067919 | nfev=57


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=54 | cost=1502.472351907499 | grad_norm=7.8755 | step=0.1230428481067919 | nfev=58


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=55 | cost=1502.4722850706235 | grad_norm=7.7358 | step=0.1230428481067919 | nfev=59


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=56 | cost=1502.4722212301997 | grad_norm=7.5998 | step=0.1230428481067919 | nfev=60


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=57 | cost=1502.472160253808 | grad_norm=7.4674 | step=0.1230428481067919 | nfev=61


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=58 | cost=1502.4721020159514 | grad_norm=7.3385 | step=0.1230428481067919 | nfev=62


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=59 | cost=1502.4720463976305 | grad_norm=7.2129 | step=0.1230428481067919 | nfev=63


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=60 | cost=1502.4719932859489 | grad_norm=7.0905 | step=0.1230428481067919 | nfev=64


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=61 | cost=1502.4719425737483 | grad_norm=6.9713 | step=0.1230428481067919 | nfev=65


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=62 | cost=1502.4718941592646 | grad_norm=6.8550 | step=0.1230428481067919 | nfev=66


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=63 | cost=1502.4718479458115 | grad_norm=6.7417 | step=0.1230428481067919 | nfev=67


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=64 | cost=1502.471803841483 | grad_norm=6.6312 | step=0.1230428481067919 | nfev=68


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=65 | cost=1502.4717617588765 | grad_norm=6.5235 | step=0.1230428481067919 | nfev=69


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=66 | cost=1502.4717216148356 | grad_norm=6.4183 | step=0.1230428481067919 | nfev=70


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=67 | cost=1502.4716833302073 | grad_norm=6.3157 | step=0.1230428481067919 | nfev=71


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=68 | cost=1502.4716468296176 | grad_norm=6.2156 | step=0.1230428481067919 | nfev=72


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=69 | cost=1502.4716120412597 | grad_norm=6.1179 | step=0.1230428481067919 | nfev=73


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=70 | cost=1502.4715788966973 | grad_norm=6.0225 | step=0.1230428481067919 | nfev=74


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=71 | cost=1502.4715473306794 | grad_norm=5.9293 | step=0.1230428481067919 | nfev=75


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=72 | cost=1502.4715172809663 | grad_norm=5.8384 | step=0.1230428481067919 | nfev=76


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=73 | cost=1502.4714886881684 | grad_norm=5.7495 | step=0.1230428481067919 | nfev=77


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=74 | cost=1502.4714614955928 | grad_norm=5.6626 | step=0.1230428481067919 | nfev=78


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=75 | cost=1502.4714356491013 | grad_norm=5.5778 | step=0.1230428481067919 | nfev=79


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=76 | cost=1502.4714110969744 | grad_norm=5.4948 | step=0.1230428481067919 | nfev=80


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=77 | cost=1502.4713877897877 | grad_norm=5.4137 | step=0.1230428481067919 | nfev=81


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=78 | cost=1502.4713656802905 | grad_norm=5.3345 | step=0.1230428481067919 | nfev=82


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=79 | cost=1502.4713447232962 | grad_norm=5.2570 | step=0.1230428481067919 | nfev=83


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=80 | cost=1502.4713248755756 | grad_norm=5.1811 | step=0.1230428481067919 | nfev=84


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=81 | cost=1502.4713060957602 | grad_norm=5.1070 | step=0.1230428481067919 | nfev=85


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=82 | cost=1502.4712883442467 | grad_norm=5.0344 | step=0.1230428481067919 | nfev=86


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=83 | cost=1502.4712715831106 | grad_norm=4.9634 | step=0.1230428481067919 | nfev=87


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=84 | cost=1502.471255776023 | grad_norm=4.8939 | step=0.1230428481067919 | nfev=88


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=1.014815s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=85 | final_cost=1502.4712 | elapsed=1.015s | final_gradient_norm=4.8259


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=1.159747s




INFO:nlsq.streaming.large_dataset:Progress: 9/10 chunks (90.0%) - ETA: 0.7s


INFO:nlsq.curve_fit:Starting curve fit n_params=4 | n_data_points=300000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=4 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=4 | n_residuals=300000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1514.353261169696 | grad_norm=1976.2995 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=1508.0804640920915 | grad_norm=901.7379 | step=7.840722896396632 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=1506.1122845726627 | grad_norm=61.4589 | step=3.9203614481983164 | nfev=5


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=1506.1049672289269 | grad_norm=18.5893 | step=1.9601807240991587 | nfev=7


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=4 | cost=1506.1042625997925 | grad_norm=4.3341 | step=0.9800903620495792 | nfev=9


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=5 | cost=1506.104223528566 | grad_norm=1.0420 | step=0.49004518102478956 | nfev=11


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.091631s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=6 | final_cost=1506.1042 | elapsed=0.092s | final_gradient_norm=0.2555


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.240165s




INFO:nlsq.streaming.large_dataset:Progress: 10/10 chunks (100.0%) - ETA: 0.0s


INFO:nlsq.streaming.large_dataset:Chunked fit completed with 100.0% success rate



✅ curve_fit_large completed in 7.07 seconds
Fitted parameters: [22.91920679  3.25963575  1.8053707   0.49727729]
Absolute errors: [1.79192068e+01 1.74036425e+00 3.05370696e-01 2.72270809e-03]
Relative errors: [358.38413589  34.80728503  20.35804638   0.54454162]%
Parameter uncertainties (std): [6.15079165 0.810103   0.16448884 0.13548504]
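
The error figures above follow from simple arithmetic. The sketch below reproduces them in plain Python. The true parameter values `[5.0, 5.0, 1.5, 0.5]` are an assumption: they are inferred by dividing each printed absolute error by its relative error, not read from the data-generation cell.

```python
# Fitted values copied from the curve_fit_large output above.
fitted = [22.91920679, 3.25963575, 1.8053707, 0.49727729]

# Assumption: true parameters inferred from the printed absolute and
# relative errors (abs_error / rel_error), not taken from the notebook.
true_params = [5.0, 5.0, 1.5, 0.5]

# Absolute error is the distance from the true value; relative error
# expresses that distance as a percentage of the true value.
abs_errors = [abs(f - t) for f, t in zip(fitted, true_params)]
rel_errors_pct = [100.0 * e / abs(t) for e, t in zip(abs_errors, true_params)]

for i, (a, r) in enumerate(zip(abs_errors, rel_errors_pct)):
    print(f"param {i}: abs error = {a:.4f}, rel error = {r:.2f}%")
```

Under that assumption, only the last parameter lands within 1% of its true value in this run, so the 100% chunk success rate reported in the log should not be read as a statement about parameter accuracy.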


## 8. Performance Comparison

Let's compare different fitting approaches across various dataset sizes to understand when each strategy is most effective.

In [12]:
def compare_approaches():
    """Compare different fitting approaches."""
    print("\n" + "=" * 60)
    print("PERFORMANCE COMPARISON")
    print("=" * 60)

    # Test different dataset sizes
    sizes = [10_000, 100_000, 500_000]
    print(f"\n{'Size':>10} {'Time (s)':>12} {'Memory (GB)':>12} {'Strategy':>20}")
    print("-" * 55)

    for n in sizes:
        # Generate data
        np.random.seed(42)
        x = np.linspace(0, 10, n)
        y = 2.0 * np.exp(-0.5 * x) + 0.3 + np.random.normal(0, 0.05, n)

        # Get memory estimate
        stats = estimate_memory_requirements(n, 3)

        # Determine strategy
        if stats.n_chunks == 1:
            strategy = "Single chunk"
        else:
            strategy = f"Chunked ({stats.n_chunks} chunks)"

        # Time the fit
        start = time.time()
        result = fit_large_dataset(
            exponential_decay,
            x,
            y,
            p0=[2.5, 0.6, 0.2],
            memory_limit_gb=0.5,  # Small limit to test chunking
            show_progress=False,
        )
        elapsed = time.time() - start

        print(
            f"{n:10,} {elapsed:12.3f} {stats.total_memory_estimate_gb:12.3f} {strategy:>20}"
        )


# Run comparison
compare_approaches()

INFO:nlsq.streaming.large_dataset:Dataset analysis for 10,000 points, 3 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 146.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.00 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 10,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 1


INFO:nlsq.streaming.large_dataset:Fitting dataset in single chunk


INFO:nlsq.curve_fit:Starting curve fit n_params=3 | n_data_points=10000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=3 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08



PERFORMANCE COMPARISON

      Size     Time (s)  Memory (GB)             Strategy
-------------------------------------------------------


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=3 | n_residuals=10000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=102.1852640386412 | grad_norm=815.4136 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=12.997801321879155 | grad_norm=64.7095 | step=2.5787593916455256 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=12.582564007334419 | grad_norm=2.4612 | step=2.5787593916455256 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=12.582194053186823 | grad_norm=0.0016 | step=2.5787593916455256 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.403822s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=4 | final_cost=12.5822 | elapsed=0.404s | final_gradient_norm=9.3893e-06


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.720249s




INFO:nlsq.streaming.large_dataset:Dataset analysis for 100,000 points, 3 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 146.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.01 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 100,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 1


INFO:nlsq.streaming.large_dataset:Fitting dataset in single chunk


INFO:nlsq.curve_fit:Starting curve fit n_params=3 | n_data_points=100000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=3 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


    10,000        0.878        0.001         Single chunk


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=3 | n_residuals=100000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=1028.8495536379035 | grad_norm=8171.7031 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=129.49613338098197 | grad_norm=660.2263 | step=2.5787593916455256 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=125.22896456720858 | grad_norm=26.2061 | step=2.5787593916455256 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=125.2250395456617 | grad_norm=0.0055 | step=2.5787593916455256 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.278939s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=4 | final_cost=125.2250 | elapsed=0.279s | final_gradient_norm=7.8406e-07


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=0.548261s




INFO:nlsq.streaming.large_dataset:Dataset analysis for 500,000 points, 3 parameters:


INFO:nlsq.streaming.large_dataset:  Estimated memory per point: 146.0 bytes


INFO:nlsq.streaming.large_dataset:  Total memory estimate: 0.07 GB


INFO:nlsq.streaming.large_dataset:  Recommended chunk size: 500,000


INFO:nlsq.streaming.large_dataset:  Number of chunks: 1


INFO:nlsq.streaming.large_dataset:Fitting dataset in single chunk


INFO:nlsq.curve_fit:Starting curve fit n_params=3 | n_data_points=500000 | method=trf | solver=auto | batch_size=None | has_bounds=False | dynamic_sizing=False


INFO:nlsq.least_squares:Starting least squares optimization method=trf | n_params=3 | loss=linear | ftol=1.0000e-08 | xtol=1.0000e-08 | gtol=1.0000e-08


   100,000        0.698        0.014         Single chunk


INFO:nlsq.optimizer.trf:Starting TRF optimization (no bounds) n_params=3 | n_residuals=500000 | max_nfev=None


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=0 | cost=5138.772247981322 | grad_norm=4.0806e+04 | nfev=1


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=1 | cost=647.0026303186478 | grad_norm=3300.6215 | step=2.5787593916455256 | nfev=2


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=2 | cost=625.6042862483894 | grad_norm=131.2255 | step=2.5787593916455256 | nfev=3


PERFORMANCE:nlsq.optimizer.trf:Iteration iter=3 | cost=625.5845138795642 | grad_norm=0.0272 | step=2.5787593916455256 | nfev=4


PERFORMANCE:nlsq.least_squares:Timer: optimization elapsed=0.628233s


INFO:nlsq.least_squares:Convergence reason=`ftol` termination condition is satisfied. | iterations=4 | final_cost=625.5845 | elapsed=0.628s | final_gradient_norm=3.5590e-08


PERFORMANCE:nlsq.curve_fit:Timer: curve_fit elapsed=1.073064s




   500,000        1.151        0.068         Single chunk


## 🔑 Key Takeaways

1. **Memory estimation first:** Always use `estimate_memory_requirements()` before fitting large datasets to predict memory usage and avoid crashes.
2. **Automatic is best:** Use `curve_fit_large()` for automatic optimization. It selects the strategy for you (single-pass, chunked, or streaming).
3. **Chunking for large data:** Chunked processing works well when the dataset is larger than RAM but can be processed in batches. Achieves <1% error for well-conditioned problems.
4. **Streaming for unlimited:** Use streaming optimization when the dataset exceeds available memory or is continuously generated. Processes 100% of data with zero accuracy loss.
5. **Context managers for flexibility:** Use `memory_context()` and `large_dataset_context()` for temporary configuration changes without affecting global settings.
6. **Monitor progress:** Enable `show_progress=True` for long-running fits to track optimization progress in real time.
7. **Algorithm selection matters:** Use `auto_select_algorithm()` to automatically choose the best optimization algorithm for your specific problem.

---

## ⚠️ Common Pitfalls

**Pitfall 1: Not checking memory requirements**

- **Symptom:** Out of memory errors, system crashes, or extremely slow performance
- **Cause:** Dataset too large for available RAM, not using chunking/streaming
- **Solution:** Always call `estimate_memory_requirements()` first to understand memory needs

```python
# ✅ Correct approach
stats = estimate_memory_requirements(n_points, n_params)
if stats.n_chunks > 1:
    # Use chunking or streaming
    result = fit_large_dataset(func, x, y, memory_limit_gb=2.0)
```

**Pitfall 2: Using streaming when chunking is sufficient**

- **Symptom:** Slower performance than necessary
- **Cause:** Streaming uses mini-batch gradient descent, which is slower than direct optimization
- **Solution:** Chunking is faster when data fits in memory (even if split into chunks)

```python
# Choose based on memory requirements
stats = estimate_memory_requirements(n_points, n_params)
if stats.total_memory_estimate_gb < available_ram_gb:
    # Use chunking (faster)
    result = fit_large_dataset(func, x, y, memory_limit_gb=available_ram_gb)
else:
    # Use streaming (handles unlimited data)
    config = LDMemoryConfig(use_streaming=True)
    fitter = LargeDatasetFitter(config=config)
```

**Pitfall 3: Forgetting to restore configuration**

- **Symptom:** Global settings changed unexpectedly, affecting subsequent fits
- **Cause:** Manually changing config without restoring
- **Solution:** Use context managers to automatically restore settings

```python
# ❌ Wrong approach
configure_for_large_datasets(memory_limit_gb=1.0)
# ... do work ...
# (forgot to restore original settings)

# ✅ Correct approach
with memory_context(MemoryConfig(memory_limit_gb=1.0)):
    ...  # do work
# Settings automatically restored here
```

**Pitfall 4: Not monitoring long-running fits**

- **Symptom:** Fits appear frozen, no feedback on progress
- **Cause:** Not enabling progress reporting
- **Solution:** Use `show_progress=True` for datasets >100K points

```python
# ✅ Always use progress reporting for large datasets
result = fit_large_dataset(
    func, x, y,
    p0=initial_guess,
    show_progress=True,  # Get real-time updates
)
```

---

## 💡 Best Practices

1. **Start with memory estimation**
   - Call `estimate_memory_requirements()` before fitting
   - Plan your strategy based on the results
   - Set appropriate `memory_limit_gb` for your system
2. **Use automatic functions when possible**
   - `curve_fit_large()` handles most cases automatically
   - `fit_large_dataset()` provides explicit control when needed
   - Let NLSQ choose the optimal strategy
3. **Enable progress reporting**
   - Use `show_progress=True` for datasets >100K points
   - Monitor optimization progress for long-running fits
   - Helps identify convergence issues early
4. **Choose the right approach**
   - **Small (<100K):** Regular `curve_fit()` is sufficient
   - **Medium (100K-10M):** Use `curve_fit_large()` with chunking
   - **Large (>10M):** Consider streaming optimization
   - **Unlimited:** Always use streaming
5. **Use context managers**
   - Temporary configuration changes with automatic restoration
   - Safe for nested operations
   - Prevents global state pollution
6. **Leverage algorithm selection**
   - Use `auto_select_algorithm()` for complex models
   - Let NLSQ choose optimal tolerance and algorithm
   - Improves convergence for difficult problems
7. **Monitor memory usage**
   - Check system memory before starting
   - Leave headroom (20-30%) for other processes
   - Use mixed precision fallback for memory-constrained systems

---
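The dataset-size tiers under "Choose the right approach" can be written down as a small lookup. This is a minimal sketch, not part of NLSQ: `recommend_approach` and its return strings are hypothetical, and treating exactly 100K and 10M points as "medium" is an assumption, since the list leaves the boundaries open.

```python
from typing import Optional

SMALL_MAX = 100_000        # below this, plain curve_fit() is suggested
MEDIUM_MAX = 10_000_000    # up to this, curve_fit_large() with chunking


def recommend_approach(n_points: Optional[int]) -> str:
    """Map a dataset size to the approach named in the Best Practices list.

    Hypothetical helper. Pass None for an unbounded or continuously
    generated dataset.
    """
    if n_points is None:
        return "streaming"
    if n_points < SMALL_MAX:
        return "curve_fit"
    if n_points <= MEDIUM_MAX:
        return "curve_fit_large (chunked)"
    return "consider streaming"


for n in (50_000, 2_000_000, 50_000_000, None):
    print(n, "->", recommend_approach(n))
```

Point count alone is a rough guide. The memory estimate also depends on the number of parameters, so `estimate_memory_requirements()` remains the better basis for the final choice.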

## 📊 Performance Considerations

**Memory usage:**

- **Single-pass:** Requires `n_points × n_params × 8 bytes` for Jacobian
- **Chunked:** Memory divided by number of chunks
- **Streaming:** Constant memory regardless of dataset size
- **Trade-off:** Memory vs accuracy (chunking has <1% error, streaming has 0% error)

**Computational cost:**

- **Time complexity:** O(n × m) where n = points, m = parameters
- **JAX compilation:** First fit is slow (~1-5s), subsequent fits are fast
- **GPU acceleration:** 150-270x speedup for large datasets (>1M points)
- **Chunking overhead:** Minimal (<5%) for well-conditioned problems

**Scaling behavior:**

- **Linear scaling:** Fit time scales linearly with dataset size
- **GPU advantage:** Increases with dataset size (more parallelism)
- **Memory scaling:** O(n × m) for Jacobian matrix
- **Chunking efficiency:** >95% accuracy retention for most problems

**Trade-offs:**

| Approach | Speed | Memory | Accuracy | Best For |
|----------|-------|--------|----------|----------|
| Single-pass | Fastest | High | 100% | Fits in RAM |
| Chunked | Fast | Medium | >99% | Larger than RAM |
| Streaming | Moderate | Low | 100% | Unlimited size |

**Optimization tips:**

1. Use GPU when available (automatic in JAX)
2. Set `memory_limit_gb` to 70-80% of available RAM
3. Enable mixed precision fallback for memory-constrained systems
4. Use `auto_select_algorithm()` for complex models
5. Reuse `CurveFit` objects to avoid recompilation

---
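As a worked example of the `n_points × n_params × 8 bytes` rule, the sketch below sizes the Jacobian for the 500,000-point, 3-parameter case from the comparison run and sets it beside the 146.0 bytes per point that NLSQ logged for the same problem. The helper `jacobian_bytes` is hypothetical, and reading the logged "GB" as binary gigabytes (1024³ bytes) is an assumption.

```python
GIB = 1024 ** 3  # assumption: the logged "GB" figures are binary gigabytes


def jacobian_bytes(n_points: int, n_params: int, bytes_per_value: int = 8) -> int:
    """Size of a dense float64 Jacobian: n_points x n_params x 8 bytes."""
    return n_points * n_params * bytes_per_value


n_points, n_params = 500_000, 3

jac_gib = jacobian_bytes(n_points, n_params) / GIB

# The comparison log reports 146.0 bytes per point for 3 parameters,
# which covers more than the Jacobian alone.
logged_total_gib = n_points * 146.0 / GIB

print(f"Jacobian only:   {jac_gib:.3f} GiB")
print(f"Logged estimate: {logged_total_gib:.3f} GiB")
```

The logged per-point figure is about six times the Jacobian-only figure (146 bytes against 24), so the Jacobian formula is best treated as a lower bound when planning `memory_limit_gb`.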

## ❓ Common Questions

**Q: How do I know if I need chunking vs streaming?**

A: Use `estimate_memory_requirements()`. If `n_chunks > 1` but the total memory estimate is less than your available RAM, use chunking (faster). If dataset exceeds available memory, use streaming (handles unlimited data).

**Q: What's the accuracy trade-off with chunking?**

A: NLSQ's advanced chunking algorithm (exponential moving average) achieves <1% error for well-conditioned problems. For ill-conditioned problems or when accuracy is critical, use streaming for zero accuracy loss.

**Q: Why is my first fit slow?**

A: JAX compiles functions on first use (JIT compilation). Subsequent fits with the same function signature reuse the compiled code and run 100-300x faster.

**Q: Can I use large dataset features on a GPU?**

A: Yes! JAX automatically uses GPU when available. Large dataset features work seamlessly on both CPU and GPU, with GPU providing additional 2-5x speedup.

**Q: What if my dataset doesn't fit in RAM at all?**

A: Use streaming optimization with `LDMemoryConfig(use_streaming=True)`. Streaming processes data in batches and can handle unlimited dataset sizes with zero accuracy loss.

**Q: How do I monitor long-running fits?**

A: Set `show_progress=True` when calling `fit_large_dataset()` or `curve_fit_large()`. This provides real-time progress updates showing iteration count and current objective value.

**Q: Should I always use `curve_fit_large()` instead of `curve_fit()`?**

A: For small datasets (<100K points), regular `curve_fit()` is simpler and equally fast. Use `curve_fit_large()` when you have >100K points or want automatic dataset size detection.

[Complete FAQ](../../docs/faq.md)

---

## 🔗 Related Resources

**Build on this knowledge:**

- [GPU Optimization Deep Dive](../03_advanced/gpu_optimization_deep_dive.ipynb) - Maximize GPU performance
- [Performance Optimization Demo](performance_optimization_demo.ipynb) - General optimization strategies
- [Streaming Tutorials](../06_streaming/) - Production streaming workflows

**Alternative approaches:**

- [NLSQ Quickstart](../01_getting_started/nlsq_quickstart.ipynb) - For small datasets (<100K points)
- [Custom Algorithms Advanced](../03_advanced/custom_algorithms_advanced.ipynb) - When standard algorithms don't converge

**Feature demos:**

- [Callbacks Demo](../05_feature_demos/callbacks_demo.ipynb) - Monitor optimization progress
- [Enhanced Error Messages](../05_feature_demos/enhanced_error_messages_demo.ipynb) - Debug fitting issues

**References:**

- [API Documentation - Large Dataset Functions](https://nlsq.readthedocs.io/en/latest/api.html#large-dataset-fitting)
- [Memory Management Guide](https://nlsq.readthedocs.io/en/latest/guides/memory.html)
- [Performance Benchmarks](https://nlsq.readthedocs.io/en/latest/benchmarks.html)

---

## 📚 Technical Glossary

**Chunking:** Dividing a large dataset into smaller batches that fit in memory, processing each batch separately, and combining results using an exponential moving average algorithm.

**Streaming optimization:** Processing data in sequential batches using mini-batch gradient descent. Handles unlimited dataset sizes with zero accuracy loss.

**Memory estimation:** Predicting memory requirements before fitting by calculating data array sizes, Jacobian matrix size, and JAX compilation overhead.

**Exponential moving average (EMA):** Algorithm used in chunking to combine gradients from different chunks with decaying weights, achieving <1% error for well-conditioned problems.

**JIT compilation:** Just-In-Time compilation by JAX that converts Python functions to optimized machine code on first use. Subsequent calls reuse the compiled code for 100-300x speedup.

**Context manager:** Python construct (`with` statement) that automatically manages resource setup and cleanup, used for temporary configuration changes.

**Well-conditioned problem:** Optimization problem where the objective function is smooth, has a clear minimum, and small parameter changes lead to proportional objective changes.

**Ill-conditioned problem:** Optimization problem with steep gradients, multiple local minima, or high sensitivity to parameter changes. Benefits from streaming (zero accuracy loss) over chunking.

**Auto-detection:** NLSQ feature that automatically detects dataset size and chooses optimal processing strategy (single-pass, chunked, or streaming).

**Mixed precision fallback:** Memory optimization technique that uses float32 instead of float64 when memory is constrained, trading slight accuracy for 50% memory reduction.

[Complete glossary](../../docs/glossary.md)
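
To make the exponential moving average entry concrete, here is a minimal sketch of an EMA update applied to per-chunk parameter estimates. It illustrates the general technique only: the notebook does not show NLSQ's actual combination rule, so the function `ema_combine`, the choice to average parameter estimates, and the weight `alpha=0.3` are all assumptions.

```python
from typing import List, Sequence


def ema_combine(
    chunk_estimates: Sequence[Sequence[float]], alpha: float = 0.3
) -> List[float]:
    """Fold per-chunk parameter estimates into one running estimate.

    Each new chunk contributes weight `alpha`; the running estimate
    built from earlier chunks decays by a factor of (1 - alpha) per step.
    Illustrative only, not NLSQ's implementation.
    """
    if not chunk_estimates:
        raise ValueError("need at least one chunk estimate")
    combined = list(chunk_estimates[0])
    for estimate in chunk_estimates[1:]:
        combined = [
            (1.0 - alpha) * c + alpha * e for c, e in zip(combined, estimate)
        ]
    return combined


# Hypothetical per-chunk estimates of two parameters.
chunks = [[2.1, 0.48], [1.9, 0.52], [2.0, 0.50]]
print(ema_combine(chunks))
```

Because later chunks are weighted more heavily than earlier ones, the combined value depends on chunk order, which is one reason a chunked result can differ from a single-pass fit over the same data.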