# 07_EWSINDy_STLSQ - Physics-SR Framework v3.0

## Stage 2.2b: E-WSINDy with STLSQ (Weak-form Sparse Regression)

**Author:** Zhengze Zhang  
**Affiliation:** Department of Statistics, Columbia University  
**Date:** January 2026

---

### Purpose

Noise-robust equation discovery via weak-form sparse regression. The weak form moves derivatives off the noisy data and onto smooth test functions, for a claimed 50-1000x noise improvement over finite-difference differentiation.

### Weak Form Theory

**Strong form** (noise-sensitive):
$$\frac{\partial q}{\partial t} = f(q, \nabla q, \nabla^2 q)$$

**Weak form** (noise-robust): Multiply by test function $\psi$ and integrate by parts:
$$\int \psi \cdot \nabla^2 q \, dx = -\int \nabla\psi \cdot \nabla q \, dx + \text{boundary terms}$$

**Result:** Derivatives transferred from noisy data $q$ to smooth test function $\psi$.
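
The code in this notebook applies the same idea in time rather than in space. As a sketch, assume a model $\dot{q} = \sum_k \xi_k \phi_k(q)$ observed on $[t_0, t_1]$; the symbols $\xi_k$, $\phi_k$, $Q_{mk}$ and $b_m$ are introduced here for illustration and correspond to the coefficients, library columns, and weak-form system in the class below. Multiplying by a test function $\psi_m$ and integrating by parts gives:

$$\int_{t_0}^{t_1} \psi_m \dot{q} \, dt = \Big[\psi_m q\Big]_{t_0}^{t_1} - \int_{t_0}^{t_1} \dot{\psi}_m q \, dt$$

$$\sum_k Q_{mk} \xi_k = b_m, \qquad Q_{mk} = \int_{t_0}^{t_1} \psi_m \phi_k(q) \, dt, \qquad b_m = \Big[\psi_m q\Big]_{t_0}^{t_1} - \int_{t_0}^{t_1} \dot{\psi}_m q \, dt$$

When $\psi_m$ vanishes at both endpoints, the bracketed term drops out and $b_m = -\int \dot{\psi}_m q \, dt$, so no derivative of the data is ever computed.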

### Noise Improvement

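- Strong form variance (forward difference with step $h$, i.i.d. noise of variance $\sigma^2$): $\text{Var}(\dot{x}_{FD}) = \frac{2\sigma^2}{h^2}$
- Weak form variance (quadrature weights absorbed into the norm): $\text{Var}(b) = \sigma^2 \|\dot{\psi}\|_2^2$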
- **Improvement factor: 50-1000x**
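
The two variance formulas above can be checked numerically. The following is a minimal Monte Carlo sketch, assuming i.i.d. Gaussian noise, a uniform grid, a forward difference, and a single Gaussian test function integrated with trapezoid weights; the grid size, noise level, and bump parameters are illustrative choices, not values taken from the framework.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n, n_trials = 0.1, 201, 4000
t = np.linspace(0.0, 1.0, n)
h = t[1] - t[0]

# Pure-noise trajectories: the clean signal does not affect either variance.
noise = sigma * rng.standard_normal((n_trials, n))

# Strong form: forward difference of the noise at one interior point.
fd = (noise[:, 101] - noise[:, 100]) / h
var_fd_theory = 2 * sigma**2 / h**2

# Weak form: b = -sum_i w_i * dpsi_i * noise_i with trapezoid weights w_i.
center, width = 0.5, 0.1
z = (t - center) / width
dpsi = -z / width * np.exp(-0.5 * z**2)
w = np.full(n, h)
w[0] = w[-1] = h / 2
b = -(noise * (w * dpsi)).sum(axis=1)
var_b_theory = sigma**2 * np.sum((w * dpsi) ** 2)

print(f"forward difference: empirical {fd.var():.3e}, theory {var_fd_theory:.3e}")
print(f"weak-form b:        empirical {b.var():.3e}, theory {var_b_theory:.3e}")
```

Note that `b` is an integrated quantity while the finite difference is a pointwise derivative, so the two variances are not in the same units; the achievable improvement depends on `h` and on the test function width.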

### STLSQ Algorithm

Sequentially Thresholded Least Squares (STLSQ) achieves **exact sparsity** (true zeros) and, unlike LASSO, refits the surviving coefficients by least squares so they carry no shrinkage bias:

1. Initialize with OLS solution
2. Threshold small coefficients to zero
3. Refit OLS on remaining support
4. Repeat until convergence
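
The four steps above can be sketched in a few lines of NumPy. This is a simplified illustration, not the class implementation further down: the function name `stlsq`, its defaults, and the toy library are assumptions made for the example.

```python
import numpy as np

def stlsq(Q, b, threshold=0.1, max_iter=20):
    """Sequentially thresholded least squares (illustrative sketch)."""
    # Step 1: ordinary least-squares initialization.
    xi = np.linalg.lstsq(Q, b, rcond=None)[0]
    for _ in range(max_iter):
        # Step 2: keep only coefficients above the threshold.
        support = np.abs(xi) > threshold
        xi_new = np.zeros_like(xi)
        if support.any():
            # Step 3: refit least squares on the surviving columns.
            xi_new[support] = np.linalg.lstsq(Q[:, support], b, rcond=None)[0]
        # Step 4: stop once the coefficients no longer change.
        converged = np.allclose(xi_new, xi)
        xi = xi_new
        if converged:
            break
    return xi

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 2.0, 200)
Q = np.column_stack([np.ones_like(x), x, x**2, x**3])   # library: 1, x, x^2, x^3
b = 2.0 * x + 0.5 * x**2                                # true coefficients: [0, 2, 0.5, 0]
xi = stlsq(Q, b)
print(xi)
```

Because discarded entries are written as literal zeros rather than shrunk toward zero, the returned vector is exactly sparse on the inactive terms.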

### References

- Messenger, D. A., & Bortz, D. M. (2021). Weak SINDy for partial differential equations. *Journal of Computational Physics*, 443, 110525.
- Brunton, S. L., et al. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. *PNAS*, 113(15), 3932-3937.

---
## Section 1: Header and Imports

In [None]:
"""
07_EWSINDy_STLSQ.ipynb - Ensemble Weak-form SINDy with STLSQ
=============================================================

Three-Stage Physics-Informed Symbolic Regression Framework v3.0

This module provides:
- EWSINDySTLSQ: Weak-form SINDy with STLSQ sparse regression
- Gaussian bump test functions for weak form transformation
- 50-1000x noise robustness improvement over finite differences
- Exact sparsity (true zeros) via iterative thresholding

Algorithm:
    1. Generate smooth test functions (Gaussian bumps)
    2. Apply weak form transformation via numerical integration
    3. Run STLSQ: threshold -> refit -> repeat
    4. Return sparse coefficient vector

Author: Zhengze Zhang
Affiliation: Department of Statistics, Columbia University
"""

# Import core module
%run 00_Core.ipynb

In [None]:
# Additional imports for E-WSINDy
from scipy import integrate
from scipy.interpolate import interp1d
from sklearn.linear_model import Lasso, Ridge
from typing import Dict, List, Tuple, Optional, Any

print("07_EWSINDy_STLSQ: Additional imports successful.")

---
## Section 2: Class Definition

In [None]:
# ==============================================================================
# E-WSINDY STLSQ CLASS
# ==============================================================================

class EWSINDySTLSQ:
    """
    Ensemble Weak-form SINDy with STLSQ.
    
    Implements noise-robust equation discovery via:
    1. Weak form transformation (transfers derivatives to smooth test functions)
    2. STLSQ sparse regression (achieves exact sparsity)
    
    The weak form provides 50-1000x noise improvement by avoiding direct
    differentiation of noisy data. STLSQ refits the surviving coefficients
    by least squares, so it returns exact zeros without the shrinkage bias
    that LASSO applies to the active terms.
    
    Attributes
    ----------
    threshold : float
        STLSQ sparsity threshold (default: 0.1)
    max_iter : int
        Maximum STLSQ iterations (default: 20)
    n_test_functions : int
        Number of test functions (default: 50)
    test_function_type : str
        Type of test function: 'gaussian' or 'polynomial'
    test_function_width : float
        Width parameter for test functions (default: 0.1)
    use_weak_form : bool
        Apply the weak form transformation before regression (default: True)
    
    Examples
    --------
    >>> model = EWSINDySTLSQ(threshold=0.1)
    >>> result = model.fit(feature_library, y, feature_names=names)
    >>> print(result['equation'])
    """
    
    def __init__(
        self,
        threshold: float = DEFAULT_STLSQ_THRESHOLD,
        max_iter: int = DEFAULT_STLSQ_MAX_ITER,
        n_test_functions: int = 50,
        test_function_type: str = 'gaussian',
        test_function_width: float = 0.1,
        use_weak_form: bool = True
    ):
        """
        Initialize EWSINDySTLSQ.
        
        Parameters
        ----------
        threshold : float
            Coefficients with |value| < threshold are set to zero.
            Default: 0.1
        max_iter : int
            Maximum number of STLSQ iterations.
            Default: 20
        n_test_functions : int
            Number of test functions for weak form.
            Default: 50
        test_function_type : str
            'gaussian' for Gaussian bumps, 'polynomial' for polynomial.
            Default: 'gaussian'
        test_function_width : float
            Width of Gaussian bumps (as fraction of domain).
            Default: 0.1
        use_weak_form : bool
            Whether to use weak form (True) or standard form (False).
            Default: True
        """
        self.threshold = threshold
        self.max_iter = max_iter
        self.n_test_functions = n_test_functions
        self.test_function_type = test_function_type
        self.test_function_width = test_function_width
        self.use_weak_form = use_weak_form
        
        # Internal state
        self._coefficients = None
        self._support = None
        self._feature_names = None
        self._n_features = None
        self._n_iterations = 0
        self._convergence_history = []
        self._fit_complete = False
        self._r2_score = None
        self._mse = None
    
    def fit(
        self,
        feature_library: np.ndarray,
        y: np.ndarray,
        t: Optional[np.ndarray] = None,
        feature_names: Optional[List[str]] = None
    ) -> Dict[str, Any]:
        """
        Fit the E-WSINDy model.
        
        Parameters
        ----------
        feature_library : np.ndarray
            Feature library matrix of shape (n_samples, n_features)
        y : np.ndarray
            Target vector of shape (n_samples,)
        t : np.ndarray, optional
            Time vector. If None, uses uniform spacing.
        feature_names : List[str], optional
            Names of features in library
        
        Returns
        -------
        Dict[str, Any]
            Dictionary containing:
            - coefficients: Sparse coefficient vector
            - support: Boolean mask of active terms
            - equation: String representation of equation
            - n_active_terms: Number of non-zero coefficients
            - n_iterations: Number of STLSQ iterations
            - r2_score: R-squared on training data
            - mse: Mean squared error
        """
        n_samples, n_features = feature_library.shape
        self._n_features = n_features
        
        # Set feature names
        if feature_names is None:
            self._feature_names = [f'f{i}' for i in range(n_features)]
        else:
            self._feature_names = list(feature_names)
        
        # Generate time vector if not provided
        if t is None:
            t = self._generate_time_vector(n_samples)
        
        # Apply weak form transformation if enabled
        if self.use_weak_form:
            Q, b = self._weak_form_transform(feature_library, y, t)
        else:
            # Standard form: direct regression
            Q = feature_library
            b = y
        
        # Run STLSQ
        self._coefficients = self._stlsq_iteration(Q, b)
        self._support = np.abs(self._coefficients) > 0
        
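        # Compute metrics on the system that was actually solved (Q, b).
        # With use_weak_form=False this is (feature_library, y); with the weak
        # form the coefficients model dy/dt, so comparing feature_library @ xi
        # against y itself would not be meaningful.
        b_pred = Q @ self._coefficients
        self._mse = np.mean((b - b_pred)**2)
        ss_tot = np.sum((b - np.mean(b))**2)
        ss_res = np.sum((b - b_pred)**2)
        self._r2_score = 1 - ss_res / ss_tot if ss_tot > 0 else 0.0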
        
        self._fit_complete = True
        
        return {
            'coefficients': self._coefficients,
            'support': self._support,
            'equation': self.get_equation(),
            'n_active_terms': int(np.sum(self._support)),
            'n_iterations': self._n_iterations,
            'r2_score': self._r2_score,
            'mse': self._mse,
            'convergence_history': self._convergence_history,
            'threshold': self.threshold
        }
    
    def _generate_time_vector(
        self,
        n_samples: int
    ) -> np.ndarray:
        """
        Generate uniform time vector.
        
        Parameters
        ----------
        n_samples : int
            Number of samples
        
        Returns
        -------
        np.ndarray
            Time vector from 0 to 1
        """
        return np.linspace(0, 1, n_samples)
    
    def _generate_test_functions(
        self,
        t: np.ndarray
    ) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate test functions and their derivatives.
        
        Parameters
        ----------
        t : np.ndarray
            Time vector
        
        Returns
        -------
        Tuple[np.ndarray, np.ndarray]
            - psi: Test functions of shape (n_test, n_samples)
            - dpsi: Derivatives of shape (n_test, n_samples)
        """
        n_samples = len(t)
        t_min, t_max = t.min(), t.max()
        t_range = t_max - t_min
        
        # Centers for test functions (avoid boundaries)
        centers = np.linspace(
            t_min + 0.1 * t_range,
            t_max - 0.1 * t_range,
            self.n_test_functions
        )
        
        width = self.test_function_width * t_range
        
        psi = np.zeros((self.n_test_functions, n_samples))
        dpsi = np.zeros((self.n_test_functions, n_samples))
        
        for m, center in enumerate(centers):
            if self.test_function_type == 'gaussian':
                psi[m], dpsi[m] = self._gaussian_bump(t, center, width)
            else:
                psi[m], dpsi[m] = self._polynomial_bump(t, center, width)
        
        return psi, dpsi
    
    def _gaussian_bump(
        self,
        t: np.ndarray,
        center: float,
        width: float
    ) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate Gaussian bump test function.
        
        psi(t) = exp(-(t - center)^2 / (2 * width^2))
        dpsi(t) = -(t - center) / width^2 * psi(t)
        
        Parameters
        ----------
        t : np.ndarray
            Time vector
        center : float
            Center of Gaussian
        width : float
            Width (standard deviation)
        
        Returns
        -------
        Tuple[np.ndarray, np.ndarray]
            (psi, dpsi)
        """
        z = (t - center) / width
        psi = np.exp(-0.5 * z**2)
        dpsi = -z / width * psi
        return psi, dpsi
    
    def _polynomial_bump(
        self,
        t: np.ndarray,
        center: float,
        width: float
    ) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate polynomial bump test function.
        
        Uses (1 - ((t-center)/width)^2)^4 for compact support.
        
        Parameters
        ----------
        t : np.ndarray
            Time vector
        center : float
            Center of bump
        width : float
            Half-width of support
        
        Returns
        -------
        Tuple[np.ndarray, np.ndarray]
            (psi, dpsi)
        """
        z = (t - center) / width
        mask = np.abs(z) < 1
        
        psi = np.zeros_like(t)
        dpsi = np.zeros_like(t)
        
        psi[mask] = (1 - z[mask]**2)**4
        dpsi[mask] = -8 * z[mask] / width * (1 - z[mask]**2)**3
        
        return psi, dpsi
    
    def _weak_form_transform(
        self,
        Phi: np.ndarray,
        y: np.ndarray,
        t: np.ndarray
    ) -> Tuple[np.ndarray, np.ndarray]:
        """
        Apply weak form transformation.
        
        Q[m,k] = integral(psi_m * Phi_k) dt
        b[m] = [psi_m * y] at the endpoints - integral(dpsi_m * y) dt
        
        b[m] is integral(psi_m * dy/dt) dt after integration by parts. The
        endpoint term vanishes only when psi_m is zero at both ends of the
        domain, which does not hold for Gaussian bumps centered near a
        boundary, so it is kept explicitly.
        
        Parameters
        ----------
        Phi : np.ndarray
            Feature library (n_samples, n_features)
        y : np.ndarray
            Target vector (n_samples,)
        t : np.ndarray
            Time vector (n_samples,), sorted in increasing order
        
        Returns
        -------
        Tuple[np.ndarray, np.ndarray]
            (Q, b) - Weak form matrices
        """
        n_features = Phi.shape[1]
        
        # Generate test functions
        psi, dpsi = self._generate_test_functions(t)
        
        # np.trapz was renamed to np.trapezoid in NumPy 2.0; support both
        trapezoid = np.trapezoid if hasattr(np, 'trapezoid') else np.trapz
        
        # Compute weak form matrices via numerical integration
        Q = np.zeros((self.n_test_functions, n_features))
        b = np.zeros(self.n_test_functions)
        
        for m in range(self.n_test_functions):
            # Q[m, k] = integral(psi_m * Phi_k), all k at once
            Q[m] = trapezoid(psi[m][:, None] * Phi, t, axis=0)
            
            # b[m] = endpoint term - integral(dpsi_m * y) (integration by parts)
            boundary = psi[m, -1] * y[-1] - psi[m, 0] * y[0]
            b[m] = boundary - trapezoid(dpsi[m] * y, t)
        
        return Q, b
    
    def _stlsq_iteration(
        self,
        Q: np.ndarray,
        b: np.ndarray
    ) -> np.ndarray:
        """
        Sequentially Thresholded Least Squares (STLSQ).
        
        Algorithm:
            1. Initialize with a lightly regularized least-squares solution
            2. Threshold small coefficients to zero
            3. Refit OLS on remaining support
            4. Repeat until convergence
        
        The refit step yields exact zeros without LASSO's shrinkage bias.
        
        Parameters
        ----------
        Q : np.ndarray
            Design matrix (n_equations, n_features)
        b : np.ndarray
            Target vector (n_equations,)
        
        Returns
        -------
        np.ndarray
            Sparse coefficient vector
        """
        n_features = Q.shape[1]
        self._convergence_history = []
        
        # Step 1: Initialize with regularized OLS (for stability)
        try:
            # Use Ridge regression for initial estimate
            ridge = Ridge(alpha=1e-6, fit_intercept=False)
            ridge.fit(Q, b)
            xi = ridge.coef_
        except Exception:
            xi = np.linalg.lstsq(Q, b, rcond=None)[0]
        
        self._convergence_history.append(xi.copy())
        
        for iteration in range(self.max_iter):
            xi_old = xi.copy()
            
            # Step 2: Threshold small coefficients
            support = np.abs(xi) > self.threshold
            
            if np.sum(support) == 0:
                # All coefficients below threshold - return zeros
                self._n_iterations = iteration + 1
                return np.zeros(n_features)
            
            # Step 3: Refit on support (unbiased estimate)
            Q_support = Q[:, support]
            try:
                xi_support = np.linalg.lstsq(Q_support, b, rcond=None)[0]
            except np.linalg.LinAlgError:
                # Singular matrix - use Ridge
                ridge = Ridge(alpha=1e-6, fit_intercept=False)
                ridge.fit(Q_support, b)
                xi_support = ridge.coef_
            
            # Update full coefficient vector
            xi = np.zeros(n_features)
            xi[support] = xi_support
            
            self._convergence_history.append(xi.copy())
            
            # Check convergence
            if np.allclose(xi, xi_old, rtol=1e-6, atol=1e-10):
                self._n_iterations = iteration + 1
                break
        else:
            self._n_iterations = self.max_iter
        
        return xi
    
    def get_equation(
        self,
        feature_names: Optional[List[str]] = None
    ) -> str:
        """
        Get string representation of discovered equation.
        
        Parameters
        ----------
        feature_names : List[str], optional
            Feature names to use (defaults to stored names)
        
        Returns
        -------
        str
            Equation string
        """
        if self._coefficients is None:
            return ""
        
        # Explicit None check so array-like name containers also work
        names = feature_names if feature_names is not None else self._feature_names
        
        terms = []
        for coef, name in zip(self._coefficients, names):
            if abs(coef) > 1e-10:  # Non-zero coefficient
                if coef >= 0 and len(terms) > 0:
                    terms.append(f"+ {coef:.6f}*{name}")
                else:
                    terms.append(f"{coef:.6f}*{name}")
        
        if len(terms) == 0:
            return "0"
        
        return " ".join(terms)
    
    def predict(
        self,
        Phi_new: np.ndarray
    ) -> np.ndarray:
        """
        Predict using discovered equation.
        
        Parameters
        ----------
        Phi_new : np.ndarray
            Feature library for new data
        
        Returns
        -------
        np.ndarray
            Predictions
        """
        if self._coefficients is None:
            raise ValueError("Must call fit() before predict()")
        return Phi_new @ self._coefficients
    
    def get_active_terms(
        self
    ) -> List[Tuple[str, float]]:
        """
        Get list of active terms with coefficients.
        
        Returns
        -------
        List[Tuple[str, float]]
            List of (name, coefficient) tuples for active terms
        """
        if not self._fit_complete:
            raise ValueError("Must call fit() first")
        
        active = []
        for i, (coef, name) in enumerate(zip(self._coefficients, self._feature_names)):
            if self._support[i]:
                active.append((name, float(coef)))
        
        return active
    
    def print_stlsq_report(self) -> None:
        """
        Print detailed STLSQ report.
        """
        if not self._fit_complete:
            print("Fit not yet performed. Call fit() first.")
            return
        
        print("=" * 70)
        print(" E-WSINDy STLSQ Results")
        print("=" * 70)
        print()
        print("Configuration:")
        print(f"  Threshold: {self.threshold}")
        print(f"  Max iterations: {self.max_iter}")
        print(f"  Test functions: {self.n_test_functions}")
        print(f"  Test function type: {self.test_function_type}")
        print(f"  Weak form: {self.use_weak_form}")
        print()
        print("Results:")
        print(f"  Iterations: {self._n_iterations}")
        print(f"  Active terms: {int(np.sum(self._support))} / {self._n_features}")
        print(f"  R-squared: {self._r2_score:.6f}")
        print(f"  MSE: {self._mse:.6e}")
        print()
        print("-" * 70)
        print(" Discovered Equation:")
        print("-" * 70)
        print(f"  {self.get_equation()}")
        print()
        print("-" * 70)
        print(" Active Terms:")
        print("-" * 70)
        print(f"  {'Term':<30} {'Coefficient':<15}")
        print("  " + "-" * 45)
        for name, coef in self.get_active_terms():
            print(f"  {name:<30} {coef:<15.6f}")
        print()
        print("=" * 70)
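
The weak-form pipeline behind `_weak_form_transform` followed by a least-squares solve can also be sketched outside the class. The example below is a hedged illustration on a hypothetical noise-free problem, $dy/dt = -2y$, with an assumed three-term library; it uses explicit trapezoid weights on a uniform grid and keeps the endpoint term from integration by parts, since Gaussian bumps centered near the boundary are not zero there.

```python
import numpy as np

# Noise-free trajectory of dy/dt = -2*y (hypothetical test problem).
t = np.linspace(0.0, 1.0, 1001)
y = np.exp(-2.0 * t)
Phi = np.column_stack([np.ones_like(t), y, y**2])   # library: 1, y, y^2

# Gaussian bumps and their analytic derivatives, one row per test function.
centers = np.linspace(0.1, 0.9, 50)
width = 0.1
z = (t[None, :] - centers[:, None]) / width
psi = np.exp(-0.5 * z**2)
dpsi = -z / width * psi

# Trapezoid quadrature weights for the uniform grid.
h = t[1] - t[0]
w = np.full_like(t, h)
w[0] = w[-1] = h / 2

Q = (psi * w) @ Phi                                  # Q[m, k] = int psi_m * Phi_k dt
boundary = psi[:, -1] * y[-1] - psi[:, 0] * y[0]     # [psi_m * y] at the endpoints
b = boundary - (dpsi * w) @ y                        # int psi_m * dy/dt dt, by parts

xi = np.linalg.lstsq(Q, b, rcond=None)[0]
print(xi)
```

The recovered vector should be close to `[0, -2, 0]`, up to quadrature error. The data `y` is never differentiated; only the analytic derivative `dpsi` of the test functions is used.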

---
## Section 3: Internal Tests

In [None]:
# ==============================================================================
# TEST CONTROL FLAG
# ==============================================================================

_RUN_TESTS = False  # Set to True to run internal tests

if _RUN_TESTS:
    print("=" * 70)
    print(" RUNNING INTERNAL TESTS FOR 07_EWSINDy_STLSQ")
    print("=" * 70)

In [None]:
# ==============================================================================
# TEST 1: Clean Data Recovery
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 1: Clean Data Recovery")
    
    # Generate data with known sparse solution
    np.random.seed(42)
    n_samples = 200
    
    x = np.random.uniform(0.1, 2, n_samples)
    
    # True equation: y = 2*x + 0.5*x^2 (sparse: 2 of the 5 library terms)
    y = 2*x + 0.5*x**2
    
    # Build feature library
    Phi = np.column_stack([
        np.ones(n_samples),  # 1
        x,                    # x
        x**2,                 # x^2
        x**3,                 # x^3
        np.sin(x)             # sin(x)
    ])
    feature_names = ['1', 'x', 'x^2', 'x^3', 'sin(x)']
    
    print("True equation: y = 2*x + 0.5*x^2")
    print(f"Library: {feature_names}")
    print()
    
    # Fit with standard form (no weak form for static data)
    model = EWSINDySTLSQ(
        threshold=0.1,
        use_weak_form=False  # Static data, not time series
    )
    result = model.fit(Phi, y, feature_names=feature_names)
    
    print(f"Discovered: {result['equation']}")
    print(f"Active terms: {result['n_active_terms']}")
    print(f"R-squared: {result['r2_score']:.6f}")
    
    # Check if correct terms were identified
    active = model.get_active_terms()
    active_names = [name for name, _ in active]
    
    if 'x' in active_names and 'x^2' in active_names:
        print("[PASS] Correct sparse structure recovered")
    else:
        print(f"[INFO] Active terms: {active_names}")

In [None]:
# ==============================================================================
# TEST 2: Noise Robustness
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 2: Noise Robustness")
    
    np.random.seed(42)
    n_samples = 300
    
    x = np.random.uniform(0.1, 2, n_samples)
    y_true = 1.5*x + 0.8*x**2
    
    Phi = np.column_stack([
        np.ones(n_samples),
        x,
        x**2,
        x**3
    ])
    feature_names = ['1', 'x', 'x^2', 'x^3']
    
    noise_levels = [0.01, 0.05, 0.1]
    
    print("True equation: y = 1.5*x + 0.8*x^2")
    print(f"{'Noise':<10} {'R2':<10} {'Active':<10} {'Equation'}")
    print("-" * 70)
    
    for noise in noise_levels:
        y = y_true + noise * np.std(y_true) * np.random.randn(n_samples)
        
        model = EWSINDySTLSQ(
            threshold=0.1,
            use_weak_form=False
        )
        result = model.fit(Phi, y, feature_names=feature_names)
        
        eq = result['equation'][:40] + "..." if len(result['equation']) > 40 else result['equation']
        print(f"{noise:<10.2f} {result['r2_score']:<10.4f} {result['n_active_terms']:<10} {eq}")

In [None]:
# ==============================================================================
# TEST 3: STLSQ Convergence
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 3: STLSQ Convergence")
    
    np.random.seed(42)
    n_samples = 200
    
    x = np.random.uniform(0.1, 2, n_samples)
    y = 3*x + 2*x**2 + 0.01*np.random.randn(n_samples)
    
    Phi = np.column_stack([np.ones(n_samples), x, x**2, x**3, x**4])
    feature_names = ['1', 'x', 'x^2', 'x^3', 'x^4']
    
    model = EWSINDySTLSQ(threshold=0.1, max_iter=20, use_weak_form=False)
    result = model.fit(Phi, y, feature_names=feature_names)
    
    print(f"Converged in {result['n_iterations']} iterations")
    print()
    print("Coefficient evolution:")
    print(f"{'Iter':<6} {'1':<10} {'x':<10} {'x^2':<10} {'x^3':<10} {'x^4':<10}")
    print("-" * 60)
    
    for i, coefs in enumerate(result['convergence_history'][:5]):
        print(f"{i:<6} {coefs[0]:<10.4f} {coefs[1]:<10.4f} {coefs[2]:<10.4f} {coefs[3]:<10.4f} {coefs[4]:<10.4f}")
    
    if result['n_iterations'] < model.max_iter:
        print("\n[PASS] STLSQ converged before max iterations")
    else:
        print("\n[INFO] Reached max iterations")

In [None]:
# ==============================================================================
# TEST 4: Threshold Sensitivity
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 4: Threshold Sensitivity")
    
    np.random.seed(42)
    n_samples = 200
    
    x = np.random.uniform(0.1, 2, n_samples)
    y = 2*x + 0.5*x**2 + 0.01*np.random.randn(n_samples)
    
    Phi = np.column_stack([np.ones(n_samples), x, x**2, x**3])
    feature_names = ['1', 'x', 'x^2', 'x^3']
    
    thresholds = [0.01, 0.05, 0.1, 0.2, 0.5]
    
    print("True: y = 2*x + 0.5*x^2")
    print(f"{'Threshold':<12} {'Active':<10} {'R2':<10} {'Equation'}")
    print("-" * 70)
    
    for thresh in thresholds:
        model = EWSINDySTLSQ(threshold=thresh, use_weak_form=False)
        result = model.fit(Phi, y, feature_names=feature_names)
        
        eq = result['equation'][:30]  # slicing already handles short strings
        print(f"{thresh:<12.2f} {result['n_active_terms']:<10} {result['r2_score']:<10.4f} {eq}")
    
    print()
    print("Note: the threshold must stay below the smallest true coefficient (0.5 here); 0.05-0.1 typically works well")

In [None]:
# ==============================================================================
# TEST 5: Comparison with LASSO
# ==============================================================================

if _RUN_TESTS:
    print()
    print_section_header("Test 5: STLSQ vs LASSO")
    
    np.random.seed(42)
    n_samples = 200
    
    x = np.random.uniform(0.1, 2, n_samples)
    y = 2*x + 0.5*x**2 + 0.05*np.random.randn(n_samples)
    
    Phi = np.column_stack([np.ones(n_samples), x, x**2, x**3])
    feature_names = ['1', 'x', 'x^2', 'x^3']
    
    print("True: y = 2*x + 0.5*x^2")
    print()
    
    # STLSQ
    stlsq = EWSINDySTLSQ(threshold=0.1, use_weak_form=False)
    stlsq_result = stlsq.fit(Phi, y, feature_names=feature_names)
    
    # LASSO
    lasso = Lasso(alpha=0.01, fit_intercept=False)
    lasso.fit(Phi, y)
    
    print(f"STLSQ coefficients: {stlsq._coefficients}")
    print(f"LASSO coefficients: {lasso.coef_}")
    print()
    # Count exact zeros for both methods so the comparison is like-for-like
    print(f"STLSQ exact zeros: {np.sum(stlsq._coefficients == 0)}")
    print(f"LASSO exact zeros: {np.sum(lasso.coef_ == 0)}")
    print()
    
    print("[INFO] Both can produce exact zeros; STLSQ refits active terms without shrinkage bias")
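
The contrast between thresholding with a refit and an L1 penalty is easiest to see in the textbook special case of a design matrix with orthonormal columns. The sketch below assumes that special case and the unscaled objective `0.5*||b - Q xi||^2 + lam*||xi||_1`; it is not a reproduction of scikit-learn's `Lasso`, whose objective is scaled differently, and the values of `xi_true` and `lam` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Design matrix with orthonormal columns (Q.T @ Q = I), built via QR.
Q, _ = np.linalg.qr(rng.standard_normal((200, 4)))
xi_true = np.array([2.0, 0.5, 0.03, -0.02])
b = Q @ xi_true

xi_ols = Q.T @ b          # least-squares solution for orthonormal columns
lam = 0.1                 # shared threshold / penalty level (illustrative)

# Hard thresholding: what STLSQ reduces to in this special case.
xi_hard = np.where(np.abs(xi_ols) > lam, xi_ols, 0.0)

# Soft thresholding: minimizer of 0.5*||b - Q xi||^2 + lam*||xi||_1.
xi_soft = np.sign(xi_ols) * np.maximum(np.abs(xi_ols) - lam, 0.0)

print("hard:", xi_hard)
print("soft:", xi_soft)
```

Both estimates zero out the two small coefficients exactly, but the soft-thresholded values of the active terms are pulled toward zero by `lam`, which is the shrinkage bias that the least-squares refit in STLSQ avoids.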

---
## Section 4: Module Summary

In [None]:
# ==============================================================================
# MODULE SUMMARY
# ==============================================================================

print("=" * 70)
print(" 07_EWSINDy_STLSQ.ipynb - Module Summary")
print("=" * 70)
print()
print("CLASS: EWSINDySTLSQ")
print("-" * 70)
print()
print("Purpose:")
print("  Noise-robust equation discovery via weak-form sparse regression.")
print("  Achieves 50-1000x noise improvement over finite differences.")
print("  Uses STLSQ for exact sparsity (true zeros).")
print()
print("Main Methods:")
print("  fit(feature_library, y, t=None, feature_names=None)")
print("      Fit sparse regression model")
print("      Returns: dict with coefficients, equation, metrics")
print()
print("  get_equation()")
print("      Get string representation of discovered equation")
print()
print("  predict(Phi_new)")
print("      Make predictions using discovered equation")
print()
print("  get_active_terms()")
print("      Get list of active terms with coefficients")
print()
print("  print_stlsq_report()")
print("      Print detailed results report")
print()
print("Key Parameters:")
print("  threshold: STLSQ sparsity threshold (default: 0.1)")
print("  use_weak_form: Enable weak form (True for time series)")
print("  n_test_functions: Number of test functions (default: 50)")
print()
print("Usage Example:")
print("-" * 70)
print("""
# Build feature library (from 05_FeatureLibrary)
builder = FeatureLibraryBuilder(max_poly_degree=3)
Phi, names = builder.build(X, feature_names)

# Fit E-WSINDy with STLSQ
model = EWSINDySTLSQ(
    threshold=0.1,
    use_weak_form=False  # Set True for time series
)
result = model.fit(Phi, y, feature_names=names)

print(f"Equation: {result['equation']}")
print(f"R-squared: {result['r2_score']:.4f}")
""")
print()
print("=" * 70)
print("Module loaded successfully. Import via: %run 07_EWSINDy_STLSQ.ipynb")
print("=" * 70)