# Numerical integration and extinction rates in large random ecosystems

Requirements
+ matplotlib
+ numpy
+ scipy
+ scikit-learn
+ requests


**A note on cloud-hosted notebooks.** If you are running a notebook on a cloud provider, such as Google Colab or CodeOcean, remember to save your work frequently. Cloud notebooks occasionally restart after a crash, a fixed session duration, or prolonged inactivity, which requires you to re-run your code.

<!-- [Click here to open this notebook in Colab](https://colab.research.google.com/github/williamgilpin/cphy/blob/main/hw/lotka_volterra.ipynb) -->
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/williamgilpin/cphy/blob/main/hw/pendulum_sindy.ipynb)



In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

<!-- <img src="../resources/lynx_tom_bech_ccby3.jpeg" style="max-width:40%; height:auto;"> -->

<!-- *Image from Tom Bech, CC BY 3.0 <https://creativecommons.org/licenses/by/3.0>, via Wikimedia Commons* -->

## Learning equations from data

The growing field of scientific machine learning aims to discover interpretable and predictive models of physical systems directly from observations.
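
One way to make this concrete is the sparse-regression formulation that the `SINDyLasso` class in this notebook is built around. As a sketch, let $X$ be the matrix of measured states, $\Theta(X)$ the library of candidate terms evaluated on the data, and $\Xi$ the coefficient matrix. The symbols $n$ (number of samples), $\dot{X}_k$ (the time derivative of state $k$), and $\xi_k$ (column $k$ of $\Xi$) are notation introduced here for illustration.

$$
\dot{X} = \Theta(X)\,\Xi,
\qquad
\xi_k = \arg\min_{\xi}\; \frac{1}{2n}\,\bigl\lVert \dot{X}_k - \Theta(X)\,\xi \bigr\rVert_2^2 + \alpha\,\lVert \xi \rVert_1
$$

Here $\alpha$ is the L1 regularization strength, and the $1/(2n)$ scaling follows the objective used by scikit-learn's `Lasso`. Larger values of $\alpha$ drive more coefficients to exactly zero, which leaves fewer active terms in each discovered equation.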



### To Do

*Please complete the following tasks and answer the included questions. You can edit a Markdown cell in Jupyter by double-clicking on it. To return the cell to its formatted form, press `[Shift]+[Enter]`.*

1. Implement SINDy with a library of candidate terms.

```
    Your Answer: complete the code below
```

2. Try varying the size of the candidate library.

```
    Your Answer: complete the code below
```

3. Try modifying your candidate library to include a term that describes the damping of the pendulum.

```
    Your Answer: 
```

4.  The phenomenon you are observing is known as multicollinearity, and it occurs when there is a degeneracy among candidate models for a given dataset.

```
    Your Answer: 
```
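
The sketch below shows one way multicollinearity can appear in a candidate library. It is an illustrative assumption, not necessarily the case the question above has in mind: for small-angle data, a `sin(theta)` column is almost a copy of the `theta` column. The sample range and the helper name `normalized_cond` are hypothetical choices made for this example.

```python
import numpy as np

# Small-angle samples (illustrative range; not taken from the assignment data)
theta = np.linspace(-0.1, 0.1, 200)

def normalized_cond(library):
    """Condition number after scaling each column to unit norm."""
    return np.linalg.cond(library / np.linalg.norm(library, axis=0))

# Two candidate libraries with two columns each
lib_poly = np.column_stack([theta, theta**3])               # distinct shapes
lib_degenerate = np.column_stack([theta, np.sin(theta)])    # sin(theta) ~ theta

cond_poly = normalized_cond(lib_poly)
cond_degenerate = normalized_cond(lib_degenerate)
print(f"condition number with [theta, theta^3]:     {cond_poly:.3g}")
print(f"condition number with [theta, sin(theta)]:  {cond_degenerate:.3g}")
```

A large condition number means that many different coefficient combinations fit the data almost equally well, so a sparse regression may assign weight to either column, or split it between them, depending on noise and regularization.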

In [None]:
import numpy as np
from itertools import combinations_with_replacement
from sklearn.linear_model import Lasso

class SINDyLasso:
    """
    Minimal SINDy using scikit-learn's Lasso for sparse regression.

    Args:
        poly_order (int): Maximum total degree of polynomial features (>=1).
        alpha (float): L1 regularization strength passed to sklearn.linear_model.Lasso.
        include_bias (bool): Include constant feature (1).
        max_iter (int): Max iterations for Lasso solver.
        tol (float): Tolerance for Lasso convergence.

    Attributes:
        coef_ (ndarray): Coefficient matrix (n_features x n_states).
        feature_names_ (list[str]): Names of library features, length n_features.
        poly_order (int)
        alpha (float)
        include_bias (bool)
        max_iter (int)
        tol (float)
    """
    def __init__(self, poly_order=3, alpha=1e-2, include_bias=True, max_iter=10000, tol=1e-6):
        self.poly_order = int(poly_order)
        self.alpha = float(alpha)
        self.include_bias = bool(include_bias)
        self.max_iter = int(max_iter)
        self.tol = float(tol)
        self.coef_ = None
        self.feature_names_ = None

    def _poly_library(self, X):
        """Polynomial library up to poly_order (with optional bias)."""
        n, d = X.shape
        cols, names = [], []
        if self.include_bias:
            cols.append(np.ones((n, 1)))
            names.append("1")
        cols.append(X)
        names += [f"x{i+1}" for i in range(d)]
        for deg in range(2, self.poly_order + 1):
            for idxs in combinations_with_replacement(range(d), deg):
                cols.append(np.prod(X[:, idxs], axis=1, keepdims=True))
                names.append("*".join([f"x{j+1}" for j in idxs]))
        Theta = np.hstack(cols)
        return Theta, names

    def _finite_difference(self, X, t):
        """Use np.gradient with uniform dt; second-order edges."""
        t = np.asarray(t)
        if t.ndim != 1 or len(t) != len(X):
            raise ValueError("t must be 1D and match X length.")
        dt = np.diff(t)
        if not np.allclose(dt, dt[0]):
            raise ValueError("This minimal version requires uniform sampling in t.")
        dXdt = np.gradient(np.asarray(X, float), dt[0], axis=0, edge_order=2)
        return dXdt

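    def fit(self, X, t):
        """
        Fit sparse RHS x_dot = Θ(X) Ξ via independent Lasso regressions per state.

        Args:
            X (ndarray): (n_samples, n_states) time series.
            t (ndarray): (n_samples,) uniformly spaced sample times.

        Returns:
            self
        """
        X = np.asarray(X, float)

        # Estimate time derivatives from the sampled trajectory
        dXdt = self._finite_difference(X, t)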
        Theta, names = self._poly_library(X)
        n_states = X.shape[1]
        Xi = np.zeros((Theta.shape[1], n_states))
        for k in range(n_states):
            model = Lasso(alpha=self.alpha, fit_intercept=False, max_iter=self.max_iter, tol=self.tol)
            model.fit(Theta, dXdt[:, k])
            Xi[:, k] = model.coef_

        self.coef_ = Xi
        self.feature_names_ = names
        return self

    def predict_rhs(self, X):
        """
        Args:
            X (ndarray): (n_samples, n_states)

        Returns:
            ndarray: (n_samples, n_states) predicted time derivatives.
        """
        if self.coef_ is None:
            raise RuntimeError("Call fit() first.")
        Theta, _ = self._poly_library(np.asarray(X, float))
        return Theta @ self.coef_

    def print_model(self):
        """
        Returns:
            list[str]: Human-readable RHS for each state.
        """
        if self.coef_ is None:
            raise RuntimeError("Call fit() first.")
        eqs = []
        for k in range(self.coef_.shape[1]):
            terms = [f"{c:.6g}*{n}" for c, n in zip(self.coef_[:, k], self.feature_names_) if abs(c) > 0]
            rhs = " + ".join(terms) if terms else "0"
            eqs.append(f"dx{k+1}/dt = {rhs}")
        return eqs


# Example usage
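
The following is a self-contained sketch of the same workflow on synthetic data, assuming a damped small-angle pendulum with an illustrative damping value of 0.1. It does not call the class above: to run with only NumPy, it swaps Lasso for sequentially thresholded least squares, and the helper name `stlsq`, the threshold, and the time grid are all hypothetical choices made for this example.

```python
import numpy as np
from itertools import combinations_with_replacement

# Synthetic data from the closed-form solution of
#   x1' = x2,  x2' = -x1 - gamma * x2   (gamma = 0.1 is an assumed value)
t = np.linspace(0.0, 20.0, 2001)
gamma = 0.1
omega = np.sqrt(1.0 - (gamma / 2) ** 2)
decay = np.exp(-gamma * t / 2)
x1 = decay * np.cos(omega * t)
x2 = decay * (-(gamma / 2) * np.cos(omega * t) - omega * np.sin(omega * t))
X = np.column_stack([x1, x2])

# Finite-difference estimate of the time derivatives (uniform spacing)
dXdt = np.gradient(X, t[1] - t[0], axis=0, edge_order=2)

# Polynomial library up to degree 2: 1, x1, x2, x1*x1, x1*x2, x2*x2
cols, names = [np.ones(len(t))], ["1"]
for deg in (1, 2):
    for idxs in combinations_with_replacement(range(X.shape[1]), deg):
        cols.append(np.prod(X[:, list(idxs)], axis=1))
        names.append("*".join(f"x{j+1}" for j in idxs))
Theta = np.column_stack(cols)

def stlsq(Theta, dXdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares (used here in place of Lasso)."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0                      # prune small coefficients
        for k in range(dXdt.shape[1]):
            active = ~small[:, k]
            if active.any():                 # refit on the surviving terms
                Xi[active, k] = np.linalg.lstsq(
                    Theta[:, active], dXdt[:, k], rcond=None
                )[0]
    return Xi

Xi = stlsq(Theta, dXdt)
for k in range(Xi.shape[1]):
    terms = [f"{c:.3g}*{n}" for c, n in zip(Xi[:, k], names) if c != 0]
    print(f"dx{k+1}/dt = " + (" + ".join(terms) if terms else "0"))
```

Because the true dynamics are linear and lie inside the library, the quadratic and constant terms should be pruned, and the damping shows up as the `x2` coefficient in the second equation.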


### Test and use your code

+ You don't need to write any code below. These cells only confirm that everything is working and let you experiment with your implementation.
+ If you are working from a local fork of the entire course, then you already have access to the solutions. In this case, run `git pull` to make sure that you are up to date (save your work first).
+ If you are working from a single downloaded notebook, or in Google Colab, then you will need to download the solutions file from the course repository. The lines below will do this for you.

In [None]:
import os
import requests

# Create the "solutions" directory if it does not already exist
if not os.path.exists('solutions'):
    os.makedirs('solutions')
else:
    print('Directory "solutions" already exists. Skipping creation.')

# Download each solution file and save it under its own name, so that
# the second download does not overwrite the first
urls = [
    'https://raw.githubusercontent.com/williamgilpin/cphy/main/hw/solutions/allencahn_spectral.py',
    'https://raw.githubusercontent.com/williamgilpin/cphy/main/hw/solutions/allencahn.py',
]
for url in urls:
    response = requests.get(url)
    response.raise_for_status()  # stop here rather than save an error page
    file_path = os.path.join('solutions', os.path.basename(url))
    with open(file_path, 'wb') as file:
        file.write(response.content)
    print(f'File saved to {file_path}')