# PhD-teach-PhD: Approaching science with AI

**An introduction to scientific machine learning**

Guilherme Zagatti      
PhD candidate, NUS ISEP/IDS   
<gzagatti@u.nus.edu>   

# Outline

1. Motivation

    Dynamical systems describe changes we observe

2. Julia

    Intro to the language and ecosystem
    
3. A formal introduction to ODEs and SciML

    The initial value problem
        
    Fitting data to ODEs, classical approaches
    
4. NeuralODE

    The initial value problem reviewed
    
5. Exercises

6. References

# Motivation

**Dynamical systems describe changes we observe**

## "It is then by cause that we define time" (Poincaré 1903)

How do we learn about cause and effect if not by observing how things evolve?

A **dynamical system** is an evolution rule that defines a trajectory as a map (Meiss 2007):

<br/>

$$
    \text{time} \mapsto \text{set of states}
$$

<br/>

As a mathematical discipline, the study of dynamical systems originated at the end of the 19th century, spearheaded by [Henri Poincaré](https://plato.stanford.edu/entries/poincare/), who also developed an extensive philosophy of science.

## Physics: simple harmonic oscillator

Describes the movement of an object with **mass $m$ attached to a spring under force $F$**.

The force follows **Hooke's law**, that is, $F = -kx$: it is proportional to the displacement $x$ from equilibrium, with spring constant $k$, and it pulls the mass back toward equilibrium.

<div style="display: flex;">
<div style="flex: 0 0 250px; margin-right: 10px;">
<img src="./assets/mass-spring.png" style="height: 250px;">
</div>
<div style="display: flex; align-items: center;">
$$
F = ma \Rightarrow
\frac{d^2 x}{dt^2} =-\frac{kx}{m}
$$
</div>
</div>
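
As a quick check on the equation above, the motion can be written in closed form. The symbols $\omega = \sqrt{k/m}$, $x_0$ (initial position) and $v_0$ (initial velocity) are notation introduced here for convenience, not taken from the slides:

$$
x(t) = x_0 \cos(\omega t) + \frac{v_0}{\omega} \sin(\omega t),
\qquad
\frac{d^2 x}{dt^2} = -\omega^2 x(t) = -\frac{k x}{m}
$$

Differentiating $x(t)$ twice returns $-\omega^2 x(t)$, so the mass oscillates forever with period $2\pi/\omega$.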

## Ecology: Lotka-Volterra

Describes the **dynamics of two interacting species**, the prey $x$ and the predator $y$. Predator and prey come into contact with one another in proportion to the abundance of each species in the environment.

The prey is born at rate $\alpha$. The predator kills the prey at rate $\beta$.

The predator needs to eat the prey to grow, which it does at rate $\delta$. The predator dies of natural causes at rate $\gamma$.

<div style="display: flex;">
<div style="flex: 0 0 250px; margin-right: 10px;">
<img src="./assets/fox-rabbit.png" style="height: 200px;">
</div>
<div style="display: flex; align-items: center;">
$$
\begin{aligned}
\frac{dx}{dt} &= \alpha x - \beta xy \\
\frac{dy}{dt} & = \delta xy - \gamma y
\end{aligned}
$$
</div>
</div>
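
A minimal numerical sketch of the predator-prey equations, assuming illustrative parameter values and initial populations that do not come from the slides. It uses a hand-written classical Runge-Kutta (RK4) step rather than a library solver, and the helper names (`lotka_volterra`, `rk4_step`, `simulate`, `invariant`) are hypothetical. The quantity $\delta x - \gamma \ln x + \beta y - \alpha \ln y$ is constant along exact solutions, so it serves as a sanity check on the integration.

```python
import math

def lotka_volterra(state, alpha, beta, delta, gamma):
    """Right-hand side of the predator-prey equations."""
    x, y = state
    return (alpha * x - beta * x * y, delta * x * y - gamma * y)

def rk4_step(f, state, dt):
    """Advance `state` by one classical Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

# Parameter values below are illustrative assumptions, not from the slides.
def simulate(x0, y0, alpha=1.0, beta=0.5, delta=0.2, gamma=0.6, dt=0.01, steps=3000):
    """Return the list of (prey, predator) states visited from (x0, y0)."""
    f = lambda s: lotka_volterra(s, alpha, beta, delta, gamma)
    trajectory = [(x0, y0)]
    for _ in range(steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory

def invariant(state, alpha=1.0, beta=0.5, delta=0.2, gamma=0.6):
    """Quantity conserved by the exact dynamics."""
    x, y = state
    return delta * x - gamma * math.log(x) + beta * y - alpha * math.log(y)

trajectory = simulate(x0=4.0, y0=2.0)
drift = abs(invariant(trajectory[-1]) - invariant(trajectory[0]))
```

A small `drift` indicates that the computed populations stay on a closed cycle, as the exact solution does.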

## Epidemiology: SIR

The susceptible-infected-recovered (SIR) model describes the **evolution of an infectious disease**.

Susceptible and infected individuals come into contact with one another in proportion to the share of each group in the population. The disease is then transmitted at rate $\beta$. An infected person recovers at rate $\gamma$.

<div style="display: flex;">
<div style="flex: 0 0 250px; margin-right: 10px;">
<img src="./assets/sneeze.png" style="width: 250px">
</div>
<div style="display: flex; align-items: center;">
$$
\begin{aligned}
\frac{ds}{dt} &= - \beta si \\
\frac{di}{dt} & = \beta si - \gamma i \\
\frac{dr}{dt} &= \gamma i
\end{aligned}
$$
</div>
</div>
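
Two consequences can be read directly off the SIR equations. Summing the three equations shows that the total $s + i + r$ never changes, and factoring the second one shows when the number of infected grows (assuming $i > 0$):

$$
\begin{aligned}
\frac{d(s + i + r)}{dt} &= -\beta s i + (\beta s i - \gamma i) + \gamma i = 0 \\
\frac{di}{dt} &= (\beta s - \gamma)\, i > 0 \iff s > \frac{\gamma}{\beta}
\end{aligned}
$$

In words, an outbreak grows only while the susceptible share exceeds $\gamma / \beta$.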

## Economics: Solow growth model

The production function $f(\cdot)$ describes **economic output as a function of capital $k$**.

The growth of capital is determined by the savings rate $s$ and the depreciation rate of capital $\delta$.

<div style="column-count: 2; column-gap: 10px; display: flex;">
<div style="flex: 0 0 250px; margin-right: 10px;">
<img src="./assets/growth.png" style="width: 250px;">
</div>
<div style="display: flex; align-items: center;">
$$
\frac{dk}{dt} = s f(k) - \delta k
$$
</div>
</div>
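
A small sketch of how capital evolves under this equation. The slides leave $f$ unspecified, so the code assumes a Cobb-Douglas form $f(k) = k^a$ with $a = 0.3$; this choice, the parameter values, and the function names are illustrative assumptions. Under that assumption, capital stops changing where $s k^a = \delta k$, that is, at $k^* = (s/\delta)^{1/(1-a)}$.

```python
def solow_path(k0, s=0.25, delta=0.05, a=0.3, dt=0.1, steps=5000):
    """Forward-Euler path of dk/dt = s*f(k) - delta*k with f(k) = k**a (assumed)."""
    k = k0
    path = [k]
    for _ in range(steps):
        k += dt * (s * k**a - delta * k)
        path.append(k)
    return path

def steady_state(s=0.25, delta=0.05, a=0.3):
    """Capital level at which saving exactly offsets depreciation."""
    return (s / delta) ** (1.0 / (1.0 - a))

path = solow_path(k0=1.0)
k_star = steady_state()
gap = abs(path[-1] - k_star)  # distance from the steady state at the end of the run
```

Starting below $k^*$, capital accumulates quickly at first and then levels off as depreciation catches up with saving.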

## A common thread

All of the models presented describe how the **variables of interest** change as a **function of time** given a set of parameters.

Let our variables of interest be $x = (x_1 \; x_2 \; \dots \; x_n) \in \mathbb{R}^n$ and let $f: \mathbb{R}^n \to \mathbb{R}^n$.

We can express any **generic model** as follows:

$$
\dot{x} = \frac{d x}{dt} = f(x)
$$

A key question is whether such a model has a **unique solution**, that is:

> Given a set of parameters and an initial condition, can we determine the state of the system at any point in the future?
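
Numerically, this question becomes a recipe: start from the initial condition and repeatedly follow $f$ for a short time. The sketch below uses forward Euler, the simplest such scheme, and is not meant as a production solver. The function name `euler` and the test model $\dot{x} = -x$ are illustrative choices, not part of the slides.

```python
import math

def euler(f, x0, dt, steps):
    """Approximate the solution of dx/dt = f(x) from x(0) = x0 with forward Euler."""
    x = list(x0)
    trajectory = [tuple(x)]
    for _ in range(steps):
        dx = f(x)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(tuple(x))
    return trajectory

# Illustrative model (an assumption): exponential decay, dx/dt = -x.
decay = lambda x: [-xi for xi in x]

trajectory = euler(decay, x0=[1.0], dt=0.001, steps=1000)
x_at_1 = trajectory[-1][0]               # numerical state at t = 1
error = abs(x_at_1 - math.exp(-1.0))     # the exact solution is exp(-t)
```

The same `f` and the same `x0` always produce the same trajectory, which is the numerical counterpart of the uniqueness question raised above.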

## What is in a function?

Notice that we placed **no restrictions on $f(.)$**. So, it could really be anything.

> In practical terms, can we develop a method that is able to list every possible way in which an element from a set $X$ can be mapped to a single element of a set $Y$?

If the sets $X$ and $Y$ are small, we can simply list all possible functions:

<div style="display: flex;">
<div style="flex: 0 0 50%; align-items: center; margin-right: 10px;">
$$
\begin{gathered}
\vdots \\
f_i(x) = \begin{cases}
D \text{, if } x = 1 \\
C \text{, if } x \in \{2, 3\}
\end{cases} \\
\vdots
\end{gathered}
$$
</div>
<div style="flex: 0 0 50%; align-items: center">
<img src="./assets/injection.svg" style="width: 150px;">
</div>
</div>

However, **this does not scale**. Most spaces are continuous and infinite.
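
To make the counting concrete, the sketch below enumerates every function between two small finite sets. The sets $X = \{1, 2, 3\}$ and $Y = \{C, D\}$ are assumed here for illustration, and `all_functions` is a hypothetical helper name. Each function is stored as a dictionary, and there are $|Y|^{|X|}$ of them.

```python
from itertools import product

def all_functions(domain, codomain):
    """Return every function from `domain` to `codomain`, each as a dict."""
    return [
        dict(zip(domain, images))
        for images in product(codomain, repeat=len(domain))
    ]

X = [1, 2, 3]       # assumed domain
Y = ["C", "D"]      # assumed codomain
functions = all_functions(X, Y)
count = len(functions)  # equals len(Y) ** len(X)
```

With 10 elements in each set the count is already $10^{10}$, which is why explicit enumeration breaks down well before the sets become infinite.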

## Linear models

We can consider additional strategies. 

For instance, we can restrict ourselves to **linear functions**, in which the variables change in proportion to the current state.

$$
\dot{x} = f(x) = A x
$$

The simple harmonic oscillator presented in the beginning is a member of this class of functions.
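
To see why, introduce the velocity $v$ (notation added here) as a second state variable next to the position $x$. The second-order spring equation then becomes a first-order linear system with a constant matrix $A$:

$$
\frac{d}{dt}
\begin{pmatrix} x \\ v \end{pmatrix}
=
\begin{pmatrix} 0 & 1 \\ -k/m & 0 \end{pmatrix}
\begin{pmatrix} x \\ v \end{pmatrix}
$$

The first row states that position changes with velocity, and the second row is Hooke's law divided by the mass.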

Again, this is **very restrictive**.

## Classical function generators

There are other alternatives: we could consider the class of **polynomial functions**.

We know from [Taylor's theorem](https://en.wikipedia.org/wiki/Taylor%27s_theorem#Generalizations_of_Taylor's_theorem) that a $k$-times differentiable function $f$ can be approximated around a point $x_0$ with a polynomial of degree $k$:

$$
\dot{x} = f(x) \approx f(x_0) + \sum_{j=1}^{k} \frac{d^j f}{d x^j}(x_0) \, \frac{(x - x_0)^j}{j!}
$$

The advantage of the Taylor series is that we can **describe $f(\cdot)$ according to its derivatives**, that is, the way it changes in a small neighborhood.

When $k = 1$, the approximation is a linear (affine) function of $x$.
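
As an illustration, the sketch below builds Taylor polynomials of $\sin$ around $x_0 = 0$ and measures how the approximation error at one point changes with the degree. The choice of $\sin$, the evaluation point $x = 0.5$, and the name `taylor_sin` are assumptions made for this example.

```python
import math

def taylor_sin(x, degree, x0=0.0):
    """Taylor polynomial of sin around x0, truncated at the given degree."""
    # The j-th derivative of sin cycles through sin, cos, -sin, -cos.
    derivatives = [math.sin(x0), math.cos(x0), -math.sin(x0), -math.cos(x0)]
    return sum(
        derivatives[j % 4] * (x - x0) ** j / math.factorial(j)
        for j in range(degree + 1)
    )

x = 0.5
# Absolute error of the degree-1, degree-3 and degree-5 approximations at x.
errors = {k: abs(taylor_sin(x, k) - math.sin(x)) for k in (1, 3, 5)}
```

Close to $x_0$ each extra pair of terms cuts the error sharply; far from $x_0$ many more terms are needed, which is the practical limit of a local expansion.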


## Cyclical decompositions

Alternatively, we can represent any [square-integrable function](https://en.wikipedia.org/wiki/Square-integrable_function), that is, a function with a finite norm, using the [Fourier series](https://en.wikipedia.org/wiki/Fourier_series):

$$
\dot{x} = \sum_{n \in \mathbb{Z}} \langle f, e_n \rangle \, e_n(x)
$$

where $\langle \cdot \, , \, \cdot \rangle$ is the inner product of two functions and $e_n(x) = e^{inx}$.

The Fourier series allows us to describe **changes in terms of cycles**.

The smaller $|n|$, the lower the frequency of the cycle.
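
The sketch below computes a truncated Fourier series numerically. It assumes the inner product $\langle f, e_n \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} dx$, approximated with a midpoint rule, and uses $f(x) = x^2$ as an arbitrary test function. The helper names `inner_product` and `partial_sum` are hypothetical.

```python
import cmath
import math

def inner_product(f, n, samples=2000):
    """Approximate <f, e_n> = (1 / 2pi) * integral of f(x) exp(-i n x) on [-pi, pi]."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        x = -math.pi + (j + 0.5) * h  # midpoint of the j-th subinterval
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

def partial_sum(f, x, max_n):
    """Fourier series of f truncated to the frequencies |n| <= max_n."""
    series = sum(
        inner_product(f, n) * cmath.exp(1j * n * x)
        for n in range(-max_n, max_n + 1)
    )
    return series.real

f = lambda x: x * x  # illustrative test function on [-pi, pi]
# Error at x = 1 when keeping frequencies up to 1, 5 and 20.
errors = [abs(partial_sum(f, 1.0, N) - f(1.0)) for N in (1, 5, 20)]
```

For the three truncation levels chosen here the error shrinks as higher frequencies are added: the low-frequency terms capture the overall shape, and the high-frequency terms fill in the detail.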

## Finding laws from data

We now have at least three general methods for generating arbitrary functions.

**It should then be easy to discover laws from data**. A simple linear regression will help us find the parameters that best fit the data:

$$
\hat{\alpha} = 
$$
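
One way to make this concrete, as a sketch only: assume the simplest linear law $\dot{x} = \alpha x$ with a single parameter, estimate $\dot{x}$ from the data by finite differences, and regress it on $x$. The closed form used below, $\hat{\alpha} = \sum_j x_j \dot{x}_j / \sum_j x_j^2$, is ordinary least squares through the origin. The model, the synthetic data, and the name `estimate_rate` are assumptions for illustration, not the estimator intended by the slides.

```python
import math

def estimate_rate(times, values):
    """Least-squares estimate of alpha in dx/dt = alpha * x from sampled data."""
    # Approximate the derivative with centred finite differences.
    xs, dxs = [], []
    for j in range(1, len(values) - 1):
        xs.append(values[j])
        dxs.append((values[j + 1] - values[j - 1]) / (times[j + 1] - times[j - 1]))
    # Regression through the origin: alpha_hat = sum(x * dx) / sum(x * x).
    return sum(x * dx for x, dx in zip(xs, dxs)) / sum(x * x for x in xs)

# Synthetic, noise-free observations generated with alpha = -0.5 (illustrative).
times = [0.01 * j for j in range(500)]
values = [2.0 * math.exp(-0.5 * t) for t in times]
alpha_hat = estimate_rate(times, values)
```

With clean, densely sampled data the estimate lands very close to the value used to generate it; noisy or sparse data make the finite-difference step much less reliable.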



## From generic functions to black boxes

## Describing nature with models

<p>"<em>It is not sufficient for each elementary phenomenon to obey simple laws, all those to be combined must obey the same law as well.</em>"<br/>Poincaré 1902</p>

A lot of ingenuity and observation is required to develop models that are both **consistent with reality and existing theory**.

Scientific machine learning is a **set of tools for automated model discovery** to support the development of models from data.
    
<div style="display: flex;">
<div style="flex: 0 0 250px;">
<img src="./assets/khun.png" style="height: 150px;">
</div>
<div style="display: flex; align-items: center;">
<p>According to Thomas Kuhn, science is a combination of <b>marginal revolutions and paradigm shifts</b>.</p>
</div>
</div>

## What about stochastic models?

# Julia

**Intro to the language and ecosystem**

## The <img src="./assets/julia-logo.svg" style="display: inline; vertical-align: text-top; width: 36pt; "></img> programming language

**Fast**: designed for high performance, with JIT-compiled code.

**Reproducible environment**: `Manifest.toml` records the exact package versions needed to reproduce an environment, like the one used in this notebook.

**General**: from data wrangling through data analytics to data reporting.

**Dynamic**: feels like Python.

**Composable**: built around multiple dispatch and functional programming, not quite like Python.

**Open source**: MIT license and an active developer community in the scientific machine learning field.

Visit the [Julia website](https://julialang.org) to get started.

## Installation

To follow the course, make sure to install the latest stable version of `Julia` from its [official page](https://julialang.org/downloads/). 

Then, clone this repository:

```
> git clone git@github.com:gzagatti/phdteachphd-sciml.git
```

After downloading the repository, activate its Julia environment and install its dependencies:

```
> cd phdteachphd-sciml/
> julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> ]
(@v1.6) pkg> activate .
  Activating environment at `./phdteachphd-sciml/Project.toml`
(@v1.6) pkg> instantiate
  Resolving package versions...
  ...
```


## Interacting with Julia

<div style="column-count: 2; column-gap: 5px; display: flex;">
<div style="display: flex; align-items: center;">
<ol>
<li>as a scripting language <pre><code>> julia my-code.jl</code></pre></li>
<li>from the terminal using the built-in REPL <pre><code>julia> 1 + 1</code></pre></li>
<li>from a Jupyter notebook using <code>IJulia</code> <pre><code>julia> using IJulia; jupyterlab();</code></pre></li>
<li>from <a href="https://www.julia-vscode.org/">Julia for VSCode</a>, a powerful IDE for interactive computing similar to RStudio</li>
</ol>
</div>
<div style="display: flex; width: 600px;">
<img src="./assets/vscode.png" style="display: block; margin:auto;">
</div>
</div>

## Ecosystem

Julia comes with a built-in pac

# ODEs and SciML

**The initial value problem, classical approaches for fitting data to ODEs**

# NeuralODE

**The initial value problem reviewed, fitting data to ODEs with machine learning**

# Exercises

# References

Meiss 2007