<a href="https://colab.research.google.com/github/RCortez25/Scientific-Machine-Learning/blob/main/Differential_equations/Lotka_Volterra_UDE.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction
---

# Code walkthrough
---

## Ground truth
---

In [None]:
using JLD, Lux, DifferentialEquations, Optimization, OptimizationOptimJL
using Random, Plots, ComponentArrays, ModelingToolkit

random_number_generator = Random.default_rng()
Random.seed!(random_number_generator, 42)

#------ Generate ground-truth data
@parameters α β δ γ
@independent_variables t
@variables x(t) y(t)
Dt = Differential(t)

eqs = [
    Dt(x) ~ α*x - β*x*y,
    Dt(y) ~ -δ*y + γ*x*y
]

@named system = ODESystem(eqs, t, [x, y], [α, β, δ, γ])
simplified = structural_simplify(system)

N_days = 25

parameter_map = Dict(α => 1.0, β => 0.02, δ => 0.5, γ => 0.02)
initial_conditions = Dict(x => 20, y => 10)
timespan = (0.0, N_days)

This cell sets up the generation of ground-truth data with the true Lotka-Volterra system of equations. Note that in this case the values of the parameters

`α => 1.0, β => 0.02, δ => 0.5, γ => 0.02`

were chosen for pedagogical reasons, in order to have a "nice" system.
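One way to see what these values imply is to compute the coexistence equilibrium, the point with $x, y > 0$ where both derivatives vanish: $x^* = \delta/\gamma$ and $y^* = \alpha/\beta$. The following is a minimal sketch in plain Julia; the helper name `coexistence_equilibrium` is hypothetical and not part of the notebook.

```julia
# Hypothetical helper: nontrivial fixed point of
#   dx/dt = α*x - β*x*y,   dy/dt = -δ*y + γ*x*y
coexistence_equilibrium(α, β, δ, γ) = (x = δ / γ, y = α / β)

# Parameter values used in the notebook
eq = coexistence_equilibrium(1.0, 0.02, 0.5, 0.02)
println(eq)
```

With the notebook's values this gives $x^* = 25$ and $y^* = 50$. The initial condition $(20, 10)$ lies away from this point, so the populations oscillate around it rather than staying constant.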

In [None]:
problem = ODEProblem(
    simplified,
    merge(initial_conditions, parameter_map),
    timespan
)

solution = solve(problem; saveat = 0.1)
solution = Array(solution) # solution[1,:] = y, solution[2,:] = x

x_ground_truth = solution[2, :]
y_ground_truth = solution[1, :]

Here, the ODE problem is built as `problem` and solved with the solver's default settings (no specific integrator is chosen), saving the state every 0.1 time units.

The important thing to note is that `structural_simplify` does not necessarily keep the variables in the order in which they were declared: here, instead of `(x,y)`, the solution array is ordered as `(y,x)`. That is why, to retrieve the `x` values, we have to use the second index, as in

`x_ground_truth = solution[2, :]`

and the first index for `y`

`y_ground_truth = solution[1, :]`
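An alternative that does not depend on the internal ordering is to index the solution object by the symbolic variables before converting it to an `Array`. The sketch below rebuilds the same system so that it is self-contained; it assumes the same package versions and call signatures as the cells above, and the names `lv`, `lv_simplified`, `values_map`, `lv_problem`, `lv_solution`, `x_by_name` and `y_by_name` are ours.

```julia
using ModelingToolkit, DifferentialEquations

@parameters α β δ γ
@independent_variables t
@variables x(t) y(t)
Dt = Differential(t)

eqs = [Dt(x) ~ α*x - β*x*y,
       Dt(y) ~ -δ*y + γ*x*y]
@named lv = ODESystem(eqs, t, [x, y], [α, β, δ, γ])
lv_simplified = structural_simplify(lv)

# Same values and call signature as the notebook cells above (assumption:
# the installed ModelingToolkit version accepts a single merged map)
values_map = Dict(x => 20.0, y => 10.0, α => 1.0, β => 0.02, δ => 0.5, γ => 0.02)
lv_problem = ODEProblem(lv_simplified, values_map, (0.0, 25.0))
lv_solution = solve(lv_problem; saveat = 0.1)

# Index by symbolic variable instead of by row number
x_by_name = lv_solution[x]
y_by_name = lv_solution[y]
```

Indexing this way keeps working even if the order of the variables inside the simplified system changes.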

## Universal Differential Equation
---

In [None]:
# A
NN1 = Lux.Chain(
    Lux.Dense(2, 32, relu),
    Lux.Dense(32, 1, softplus)
)
parameters_1, state_1 = Lux.setup(random_number_generator, NN1)

NN2 = Lux.Chain(
    Lux.Dense(2, 32, relu),
    Lux.Dense(32, 1, softplus)
)
parameters_2, state_2 = Lux.setup(random_number_generator, NN2)

# B
initial_parameters = (layer_1 = parameters_1, layer_2 = parameters_2)
initial_parameters = ComponentArray(initial_parameters)

**A** - In this case, we model the interaction terms $\beta xy$ and $\gamma xy$ of the LV system. That is why we create two neural networks, `NN1` and `NN2`, to replace these two terms.

Each NN takes 2 inputs, $(x,y)$, and outputs 1 number, its approximation of $\beta xy$ or $\gamma xy$, respectively. Note also that the output layer uses the `softplus` activation to guarantee positive outputs, as we want the interaction terms to be positive. The softplus function is defined as

$$\operatorname{softplus}(x)=\ln(1+e^x)$$

That is, its output is always positive: large negative inputs are mapped to values very close to 0, while large positive inputs are left almost unchanged.
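The behaviour of softplus can be checked numerically with a few sample inputs. This is a small sketch in plain Julia; `softplus_demo` is a hypothetical name chosen to avoid clashing with the `softplus` that Lux provides.

```julia
# log1p(exp(x)) evaluates ln(1 + e^x)
softplus_demo(x) = log1p(exp(x))

# Evaluate at negative, zero, and positive inputs
for x in (-10.0, -1.0, 0.0, 1.0, 10.0)
    println("softplus(", x, ") = ", softplus_demo(x))
end
```

The printed values are all positive, approach 0 for negative inputs, and approach the input itself for large positive inputs.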

We initialize each NN with `Lux.setup`, which returns its initial (random) weights and biases, stored in `parameters_1` and `parameters_2`, together with the layer states `state_1` and `state_2`.

Each network has

*   Dense(2→32): weights 32×2=64, bias 32. Total: 96 parameters
*   Dense(32→1): weights 1×32=32, bias 1. Total: 33 parameters

that is, 129 trainable parameters per network, for a total of 129×2=258 trainable parameters.
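The count above can be reproduced with a couple of lines of arithmetic. The helper `dense_parameter_count` below is hypothetical and only encodes "weights plus biases" for a dense layer.

```julia
# Hypothetical helper: parameter count of a dense layer with bias
dense_parameter_count(n_in, n_out) = n_in * n_out + n_out

# Dense(2 => 32) followed by Dense(32 => 1), as in NN1 and NN2
per_network = dense_parameter_count(2, 32) + dense_parameter_count(32, 1)
total = 2 * per_network
println((per_network = per_network, total = total))
```

In the notebook itself, `Lux.parameterlength(NN1)` should report the same per-network count.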

**B** - We store the initial parameters of both networks in a single `ComponentArray`, `initial_parameters`, which behaves as a flat vector of length 258. This vector keeps the named subfields `layer_1` and `layer_2`, which hold each network's parameters.

We will use this vector as the starting point when solving the UDE and when training.
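To illustrate how a `ComponentArray` combines flat storage with named access, here is a toy sketch. The numbers and the name `toy` are made up for illustration and are unrelated to the notebook's networks.

```julia
using ComponentArrays

# Toy parameters (not the notebook's networks): two named blocks
toy = ComponentArray(layer_1 = (weight = [1.0 2.0; 3.0 4.0], bias = [5.0, 6.0]),
                     layer_2 = (weight = [7.0 8.0], bias = [9.0]))

println(length(toy))        # total number of stored entries
println(toy.layer_2.bias)   # named access to a sub-block
flat = getdata(toy)         # the underlying plain vector
println(typeof(flat))
```

An optimizer can treat the whole object as one vector, while model code can still refer to each block by name.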

In [None]:
function derivative_predictions(du, u, p, t)
    # A
    (x, y) = u

    # B
    # Broadcast Float32 over the state so the input matches the networks' Float32 weights
    output1, updated_state1 = Lux.apply(NN1, Float32.([x, y]), p.p1, state_1)
    output2, updated_state2 = Lux.apply(NN2, Float32.([x, y]), p.p2, state_2)

    # C
    beta_xy = only(output1)
    gamma_xy = only(output2)

    # D
    du[1] = p.α * x - beta_xy
    du[2] = -p.δ * y + gamma_xy
end

This function computes the derivatives of the LV system, that is, the whole right-hand side of

$$
\begin{align}
\dot{x} &=\alpha x-\beta xy \\
\dot{y} &=-\delta y+\gamma xy
\end{align}
$$

The function is defined **in-place**: instead of returning a new array, it writes the derivatives into the preallocated `du`, given the current state `u`, parameters `p`, and time `t`.
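The in-place convention can be illustrated with the fully known LV right-hand side, without any neural network. This is a sketch in plain Julia; `lotka_volterra!`, `p_true`, `u0` and `du` are illustrative names, and the parameter values are the ground-truth ones from the first section.

```julia
# In-place right-hand side with the known interaction terms (no neural networks)
function lotka_volterra!(du, u, p, t)
    x, y = u
    du[1] = p.α * x - p.β * x * y
    du[2] = -p.δ * y + p.γ * x * y
    return nothing
end

p_true = (α = 1.0, β = 0.02, δ = 0.5, γ = 0.02)
u0 = [20.0, 10.0]
du = zeros(2)                          # preallocated output buffer
lotka_volterra!(du, u0, p_true, 0.0)   # fills du, returns nothing
println(du)
```

After the call, `du` holds the derivatives at `u0`. The UDE version has the same shape, with the two interaction terms replaced by network outputs.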

**A** - This splits the current state vector `u` into $x$ and $y$.

**B** - Runs a forward pass of the NNs defined above on the current state $(x,y)$, using the portion `p1` of the parameter vector `p` for `NN1`, and `p2` for `NN2`. Each call returns two objects

*   `output1`: the output of the NN forward pass, a one-element vector
*   `updated_state1`: the updated layer state, which matters for stateful layers such as BatchNorm. The `Dense` layers used here are stateless, so in this example we don't use it.

The same applies for NN2.

Note that we convert the input with the broadcast `Float32.` to match the datatype of the NNs' weights, which is `Float32` as well.
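The dot in `Float32.` matters: it converts element by element, whereas calling `Float32` on a whole vector is an error. A minimal sketch in plain Julia, where `state`, `converted`, `literal` and `failed` are illustrative names:

```julia
state = [20.0, 10.0]             # Float64 state, as an ODE solver would pass it

converted = Float32.(state)      # broadcast: converts each element
literal = Float32[20.0, 10.0]    # typed array literal with the same contents

println(eltype(converted))

# Without the dot, Float32(state) is applied to the whole vector and throws
failed = try
    Float32(state)
    false
catch err
    err isa MethodError
end
println(failed)
```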

**C** - Extract the only element of each of the vectors `output1` and `output2`. These are single-element vectors, e.g. `output1=[2]`, so selecting the only element gives back the scalar `only(output1)=2`.

Note that `output1` corresponds to the term $\beta xy$ whereas `output2` corresponds to $\gamma xy$.
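The function `only` is part of Base Julia and also acts as a safety check, since it fails when the collection does not hold exactly one element. A short sketch, where `output_demo` is a stand-in for a network output and not taken from the notebook:

```julia
output_demo = Float32[2.0]    # stands in for a one-element network output
value = only(output_demo)     # extracts the single element as a scalar

# only throws if the collection does not have exactly one element
threw = try
    only(Float32[1.0, 2.0])
    false
catch err
    err isa ArgumentError
end
println((value, threw))
```

If a network were accidentally built with more than one output, `only` would raise an error instead of silently picking the first entry.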

**D** -