# Getting started
Kooplearn is a machine learning library designed for a specific purpose: to implement state-of-the-art algorithms for learning Koopman/Transfer operators of dynamical systems from data. For a user-friendly introduction to the operator-theoretic perspective on dynamical systems, check out [this blog post [TODO]]().

In this article we will go through a typical training and inference pipeline in `kooplearn`, experimenting with the _noisy logistic map_, a one-dimensional dynamical system defined by the relation

$$
    x_{t + 1} = (4x_{t}(1 - x_{t}) + \xi_{t}) \mod 1.
$$

Here $\xi_{t}$ is just a noise term with density $\propto \cos^{N}(x)$, $N$ being an even integer and $x \in [-0.5, 0.5]$. {footcite:t}`Kostic2022` reported a full characterization of the transfer operator of the noisy logistic map. In particular, the transfer operator has rank $N + 1$, it is _non-normal_, and its eigenvalues and eigenfunctions can be computed with arbitrary precision. In `kooplearn` we provide an implementation of the noisy logistic map in {class}`kooplearn.datasets.LogisticMap`.
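
To make the recursion above concrete, here is a minimal NumPy sketch that simulates it directly. It is not how {class}`kooplearn.datasets.LogisticMap` is implemented: the helper names `sample_noise` and `simulate` are hypothetical, the noise density is taken exactly as written above, and the noise is drawn by plain rejection sampling from a uniform proposal.

```python
import numpy as np

def sample_noise(rng, N, size):
    """Rejection-sample noise with density proportional to cos(x)**N on [-0.5, 0.5].

    Hypothetical helper: the density is bounded by 1 (attained at x = 0), so a
    uniform proposal is accepted with probability cos(x)**N.
    """
    samples = np.empty(0)
    while samples.size < size:
        proposal = rng.uniform(-0.5, 0.5, size=size)
        accept = rng.uniform(0.0, 1.0, size=size) < np.cos(proposal) ** N
        samples = np.concatenate([samples, proposal[accept]])
    return samples[:size]

def simulate(x0, steps, N=20, seed=0):
    """Iterate x_{t+1} = (4 x_t (1 - x_t) + xi_t) mod 1, starting from x0."""
    rng = np.random.default_rng(seed)
    noise = sample_noise(rng, N, steps)
    trajectory = np.empty(steps + 1)
    trajectory[0] = x0  # the initial condition is stored as the first point
    for t in range(steps):
        x = trajectory[t]
        trajectory[t + 1] = (4.0 * x * (1.0 - x) + noise[t]) % 1.0
    return trajectory

trajectory = simulate(x0=0.5, steps=1000)
print(trajectory.shape)
```

The returned array holds `steps + 1` points because the initial condition is kept as the first entry.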

## Generating and splitting the data

```python
from kooplearn.datasets import LogisticMap

# Number of samples for each data split
train_samples = 1000
val_samples = 200
test_samples = 1000

logmap = LogisticMap(N=20, rng_seed=0)  # Set rng_seed for reproducibility

initial_condition = 0.5  # Initial condition x_0 from which sampling starts

datasets = {
    'train': logmap.sample(initial_condition, train_samples),
    'val': logmap.sample(initial_condition, val_samples),
    'test': logmap.sample(initial_condition, test_samples)
}

for split, ds in datasets.items():
    print(f"{split.capitalize()} split has shape {ds.shape}")
```

We now have a numpy array of shape `(split_samples + 1, 1)` for each split, stored in the `datasets` dictionary. Notice that although we requested `split_samples` points, the `sample` function returned one extra sample. This is no accident: the "extra" sample is the initial condition, and calling `logmap.sample` appends `split_samples` fresh points to it.

## Learning the transfer operator on a dictionary of functions

We recall that the transfer operator $\mathcal{T}$ is a mathematical object that maps any function $f$ to its expected value after one step of the dynamics:

$$
(\mathcal{T}f)(x) := \mathbb{E}[f(X_{t + 1}) \mid X_t = x].
$$

A popular class of methods to learn $\mathcal{T}$ from data follows from the observation that the action of $\mathcal{T}$ on the span of a _finite_ number of functions $(\phi_{i})_{i = 1}^{d}$ is described by a $d\times d$ matrix, which can be learned from data via standard regression algorithms. If the chosen set of functions $(\phi_{i})_{i = 1}^{d}$ is rich enough, we should have a decent approximation of the transfer operator. We refer to {footcite:t}`Kostic2022` for a mathematically rigorous exposition of these topics. 
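
As an illustration of the idea, the sketch below estimates the $d \times d$ matrix by ordinary least squares on a toy problem. It is only a sketch under stated assumptions, not the estimator used by `kooplearn`: the toy linear system, the dictionary `phi` and all variable names are hypothetical, and no regularization or rank reduction is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trajectory of the linear system x_{t+1} = 0.9 * x_t + small Gaussian noise
n_steps = 500
x = np.empty(n_steps + 1)
x[0] = 1.0
for t in range(n_steps):
    x[t + 1] = 0.9 * x[t] + 0.01 * rng.standard_normal()

def phi(points):
    """Hypothetical dictionary with d = 2 functions: phi_1(x) = 1, phi_2(x) = x."""
    return np.stack([np.ones_like(points), points], axis=-1)

Phi_X = phi(x[:-1])  # features at time t,     shape (n_steps, 2)
Phi_Y = phi(x[1:])   # features at time t + 1, shape (n_steps, 2)

# Least-squares solution of Phi_X @ K ~= Phi_Y.
# Column j of K holds the coefficients of (T phi_j) in the dictionary.
K, *_ = np.linalg.lstsq(Phi_X, Phi_Y, rcond=None)
print(K)
```

For this toy system, $\mathbb{E}[X_{t+1} \mid X_t = x] = 0.9x$ and constants are mapped to themselves, so `K` should be close to the diagonal matrix with entries $1$ and $0.9$.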

This strategy is implemented in the {class}`kooplearn.models.ExtendedDMD` model, which requires as input the functions $(\phi_{i})_{i = 1}^{d}$ specified as a feature map $x \mapsto  (\phi_{i}(x))_{i = 1}^{d}$. The choice of feature map is highly problem-dependent, and `kooplearn` exposes an abstract base class {class}`kooplearn.abc.FeatureMap` as a blueprint. For the noisy logistic map a sensible option is a set of orthogonal polynomials, such as the [Chebyshev polynomials of the first kind](https://en.wikipedia.org/wiki/Chebyshev_polynomials). Let's implement that by subclassing {class}`kooplearn.abc.FeatureMap`:

```python
import numpy as np
import scipy.special
from kooplearn.abc import FeatureMap

class ChebyshevPoly(FeatureMap):
    def __init__(self, max_order: int = 10):
        # Polynomials of order 0, ..., max_order - 1 are used (max_order excluded)
        self.max_order = max_order

    def __call__(self, data: np.ndarray):
        # Map the data from [0, 1] to [-1, 1], the interval on which
        # the Chebyshev polynomials are defined
        x = 2 * data - 1
        # Evaluate each polynomial and stack the results along the last axis
        phi = np.concatenate(
            [scipy.special.eval_chebyt(n, x) for n in range(self.max_order)], axis=-1
        )
        return phi
```

Let's instantiate the feature map and test how it acts on one of our data splits.

```python
feature_map = ChebyshevPoly(max_order=4)

for split, ds in datasets.items():
    print(f"The featurized {split} split has shape {feature_map(ds).shape}")
```

```
The featurized train split has shape (1001, 4)
The featurized val split has shape (201, 4)
The featurized test split has shape (1001, 4)
```
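
As a sanity check on what such a feature map computes, the following standalone sketch builds the same kind of feature matrix with NumPy's `numpy.polynomial.chebyshev.chebvander` and compares it against the closed-form expressions of the first four Chebyshev polynomials. The small `data` array is a hypothetical stand-in for one of the splits, and `chebvander` is used here only as an independent cross-check, not as part of `kooplearn`.

```python
import numpy as np
from numpy.polynomial import chebyshev

# Hypothetical stand-in for a data split: 5 points in [0, 1], shape (5, 1)
data = np.linspace(0.0, 1.0, 5).reshape(-1, 1)

# Rescale from [0, 1] to [-1, 1], as done in the feature map
x = 2 * data[:, 0] - 1

max_order = 4
# chebvander(x, deg) returns T_0(x), ..., T_deg(x) along the last axis
phi = chebyshev.chebvander(x, max_order - 1)

# Closed forms: T_0 = 1, T_1 = x, T_2 = 2x^2 - 1, T_3 = 4x^3 - 3x
closed_form = np.stack(
    [np.ones_like(x), x, 2 * x**2 - 1, 4 * x**3 - 3 * x], axis=-1
)

print(phi.shape)
print(np.allclose(phi, closed_form))
```

Each row of `phi` is the featurization of one data point, so the number of columns equals `max_order`, matching the shapes reported for the splits.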


```{footbibliography}
```