# Getting started
Kooplearn is a machine learning library designed for a specific purpose: implementing state-of-the-art algorithms to learn Koopman/Transfer operators of dynamical systems from data. For a user-friendly introduction to the operator-theoretic perspective on dynamical systems, check out [this blog post [TODO]]().

In this article we will go through a typical training and inference pipeline in `kooplearn`, experimenting with the _noisy logistic map_, a one-dimensional dynamical system defined by the relation

$$
    x_{t + 1} = (4x_{t}(1 - x_{t}) + \xi_{t}) \mod 1.
$$

Here $\xi_{t}$ is just a noise term with density $\propto \cos^{N}(x)$, $N$ being an even integer and $x \in [-0.5, 0.5]$. {footcite:t}`Kostic2022` reported a full characterization of the transfer operator of the noisy logistic map. In particular, the transfer operator has rank $N + 1$, it is _non-normal_, and its eigenvalues and eigenfunctions can be computed with arbitrary precision. In `kooplearn` we provide an implementation of the noisy logistic map in {class}`kooplearn.datasets.LogisticMap`.
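To make the definition concrete, here is a minimal NumPy sketch of the recursion above. It is not the implementation used by {class}`kooplearn.datasets.LogisticMap`: the helper names `sample_noise` and `simulate_noisy_logistic_map` are hypothetical, and the noise is drawn by plain rejection sampling from the density written in the text, which is just one simple way to do it.

```python
import numpy as np


def sample_noise(rng, N, size):
    """Draw `size` samples with density proportional to cos(x)**N on [-0.5, 0.5].

    Rejection sampling with a uniform proposal: cos(x)**N is at most 1 (at x = 0),
    so a proposal x is accepted with probability cos(x)**N.
    """
    out = np.empty(size)
    filled = 0
    while filled < size:
        proposal = rng.uniform(-0.5, 0.5, size=size)
        accept = rng.uniform(0.0, 1.0, size=size) < np.cos(proposal) ** N
        accepted = proposal[accept][: size - filled]
        out[filled : filled + accepted.size] = accepted
        filled += accepted.size
    return out


def simulate_noisy_logistic_map(x0, n_steps, N=20, seed=0):
    """Iterate x_{t+1} = (4 x_t (1 - x_t) + xi_t) mod 1, starting from x0."""
    rng = np.random.default_rng(seed)
    noise = sample_noise(rng, N, n_steps)
    trajectory = np.empty(n_steps + 1)
    trajectory[0] = x0
    for t in range(n_steps):
        x = trajectory[t]
        trajectory[t + 1] = (4.0 * x * (1.0 - x) + noise[t]) % 1.0
    return trajectory


trajectory = simulate_noisy_logistic_map(x0=0.5, n_steps=100)
```

The returned array contains the initial condition followed by `n_steps` new points, so it has `n_steps + 1` entries.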

## Generating and splitting the data

In [1]:
from kooplearn.datasets import LogisticMap

# Defining the number of samples for each data split
train_samples = 1000 
val_samples = 200
test_samples = 1000

logmap = LogisticMap(N=20, rng_seed=0)  # Setting the rng_seed for reproducibility

initial_condition = 0.5 # Setting the initial condition x_0 to start sampling the map

datasets = {
    'train': logmap.sample(initial_condition, train_samples),
    'val': logmap.sample(initial_condition, val_samples),
    'test': logmap.sample(initial_condition, test_samples)
}

for split, ds in datasets.items():
    print(f"{split.capitalize()} split has shape {ds.shape}")

We now have a NumPy array of shape `(split_samples + 1, 1)` for each split, stored in the `datasets` dictionary. Notice that although we requested `split_samples` samples, the `sample` function returned one extra. This is no accident: the "extra" sample is the initial condition, and calling `logmap.sample` appended `split_samples` new points to it.
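A trajectory with `n + 1` points contains exactly `n` one-step transitions $(x_{t}, x_{t + 1})$, which is what one-step methods are ultimately fitted on. The sketch below shows that pairing with plain NumPy slicing on a stand-in array of the same layout; it only illustrates the idea, and `kooplearn` may handle this step internally with its own utilities.

```python
import numpy as np

# Stand-in for one split: 5 requested samples plus the initial condition,
# stored as a column of shape (n + 1, 1) like the arrays in `datasets`.
trajectory = np.arange(6, dtype=float).reshape(-1, 1)

X = trajectory[:-1]  # states x_0, ..., x_{n-1}
Y = trajectory[1:]   # their one-step successors x_1, ..., x_n

print(X.shape, Y.shape)
```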

## Learning the transfer operator on a dictionary of functions

We recall that the transfer operator $\mathcal{T}$ is a mathematical object that maps any function $f$ to its expected value after one step of the dynamics, given that the system was in state $x$:

$$
(\mathcal{T}f)(x) := \mathbb{E}[f(X_{t + 1}) \mid X_{t} = x].
$$

A popular class of methods to learn $\mathcal{T}$ from data follows from the simple observation that the action of $\mathcal{T}$ on the span of a _finite_ number of functions $(\phi_{i})_{i = 1}^{d}$ is described by a $d\times d$ matrix, which can be learned from data. If the chosen dictionary of functions $(\phi_{i})_{i = 1}^{d}$ is rich enough, we will have a decent approximation of the transfer operator. We refer to {footcite:t}`Kostic2022` for a mathematically rigorous exposition of these topics. 
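As a rough illustration of how such a $d \times d$ matrix can be estimated, the sketch below solves the least-squares problem $\Phi(Y) \approx \Phi(X) G$, where the rows of $\Phi(X)$ and $\Phi(Y)$ are the dictionary evaluated at the states and at their one-step successors. This is a bare-bones version of the idea, not the estimator implemented in `kooplearn`. The helper names are hypothetical, and the data come from a toy deterministic system $x_{t + 1} = 0.5\,x_{t}$, chosen only because its answer is known in closed form: the monomials $1, x, x^{2}$ are mapped to $1, 0.5\,x, 0.25\,x^{2}$.

```python
import numpy as np


def monomial_dictionary(x, d):
    """Evaluate the monomials x^0, ..., x^(d-1) on a column of states of shape (n, 1)."""
    return np.concatenate([x**i for i in range(d)], axis=1)


def fit_dictionary_matrix(X, Y, dictionary):
    """Least-squares estimate of G such that dictionary(Y) ~ dictionary(X) @ G."""
    phi_X = dictionary(X)
    phi_Y = dictionary(Y)
    G, *_ = np.linalg.lstsq(phi_X, phi_Y, rcond=None)
    return G


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 1))  # sampled states
Y = 0.5 * X                                # toy dynamics: x_{t+1} = 0.5 * x_t

G = fit_dictionary_matrix(X, Y, lambda x: monomial_dictionary(x, 3))
eigenvalues = np.sort(np.linalg.eigvals(G).real)
print(eigenvalues)
```

Column $j$ of `G` holds the coefficients of $\mathcal{T}\phi_{j}$ in the dictionary, so for this toy system `G` is close to a diagonal matrix with entries $1$, $0.5$ and $0.25$.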

This strategy is implemented in the {class}`kooplearn.models.ExtendedDMD` model, which requires at least a dictionary of functions $(\phi_{i})_{i = 1}^{d}$ in order to be trained. For the noisy logistic map, a sensible choice is a family of orthogonal polynomials up to order $d$.

In [None]:
from kooplearn.models import ExtendedDMD
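For illustration, one possible dictionary of orthogonal polynomials can be built with NumPy alone. The sketch below uses Chebyshev polynomials, which are orthogonal on $[-1, 1]$, after rescaling the state from $[0, 1]$. The choice of Chebyshev polynomials and the helper name `chebyshev_dictionary` are assumptions made for this example, and how a dictionary is handed to `ExtendedDMD` is deliberately not shown here.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander


def chebyshev_dictionary(x, degree):
    """Evaluate the Chebyshev polynomials T_0, ..., T_degree on states in [0, 1].

    Accepts an array of shape (n, 1) or (n,) and returns features of shape (n, degree + 1).
    """
    x = np.asarray(x, dtype=float).reshape(-1)
    # Rescale from [0, 1] to [-1, 1], the interval where these polynomials are orthogonal.
    return chebvander(2.0 * x - 1.0, degree)


states = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
features = chebyshev_dictionary(states, degree=3)
print(features.shape)
```

Each row of `features` is the dictionary evaluated at one state, which is the layout used when estimating the matrix that represents the transfer operator on the span of the dictionary.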

```{footbibliography}
```