`docs/src/prob_and_solve.md`

# Problem Definition and Solution

As can be seen from the [introduction examples](@id Quickstart), [DataDrivenDiffEq.jl](https://github.com/SciML/DataDrivenDiffEq.jl) tries to structure the workflow in a similar fashion to other [SciML](https://sciml.ai/) packages by defining a [`DataDrivenProblem`](@ref), dispatching on the `solve` command to return a [`DataDrivenSolution`](@ref).
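
A minimal end-to-end sketch of this workflow, assuming only a matrix of measurements `X` (the data here is a random placeholder):

```julia
using DataDrivenDiffEq

X = rand(2, 50)                     # hypothetical measurements (states × snapshots)
prob = DiscreteDataDrivenProblem(X) # define the problem from the data
res = solve(prob, DMDSVD())         # solve it, here via dynamic mode decomposition
system = result(res)                # the inferred system
```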

## Defining a Problem

Problems of identification, estimation or inference are defined by the data. This data contains at least measurements of the states `X`, which would be sufficient to describe a `DiscreteDataDrivenProblem` with unit time steps similar to the [first example on dynamic mode decomposition](@ref Linear-Systems-via-Dynamic-Mode-Decomposition). Of course we can extend this to include time points `t`, control signals `U` or a function describing those `u(x,p,t)`. Additionally, any parameters `p` known a priori can be included in the problem. In practice, this looks like:

```julia
problem = DiscreteDataDrivenProblem(X)
problem = DiscreteDataDrivenProblem(X, t, U, p = p)
problem = DiscreteDataDrivenProblem(X, t, (x,p,t)->u(x,p,t))
```

Similarly, a `ContinuousDataDrivenProblem` would need at least measurements and time derivatives (`X` and `DX`), or measurements, time information and a way to derive the time derivatives (`X`, `t` and a [Collocation](@ref) method). Again, this can be extended by including a control input as measurements or a function and possible parameters:

```julia
problem = ContinuousDataDrivenProblem(X, DX)
```

A `DirectDataDrivenProblem` maps the measurements `X` directly onto targets `Y`:

```julia
problem = DirectDataDrivenProblem(X, t, Y, p = p)
problem = DirectDataDrivenProblem(X, t, Y, (x,p,t)->u(x,p,t), p = p)
```
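
For the continuous case, the time derivatives do not have to be measured directly. As noted above, a [Collocation](@ref) method can be passed instead, so that `DX` is estimated from `X` and `t`. A small sketch, assuming `GaussianKernel` is one of the provided collocation kernels:

```julia
# Estimate the derivatives from the measurements via kernel collocation
problem = ContinuousDataDrivenProblem(X, t, GaussianKernel())
```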

## Solving a Problem

Next up, we choose a method to `solve` the [`DataDrivenProblem`](@ref). Depending on the input arguments and the type of problem, the function will return a result derived via [`Koopman`](@ref) or [`Sparse Optimization`](@ref) methods. A [`Basis`](@ref) used for lifting the measurements can be provided, as well as options controlling e.g. rounding, normalization or the progress bar, depending on the inference method. Possible options are provided [below](@ref optional_arguments):

```julia
# Use a Koopman based inference
res = solve(problem, DMDSVD(), kwargs...)
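# Use a sparse identification based inference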
res = solve(problem, basis, STLSQ(), kwargs...)
```

The [`DataDrivenSolution`](@ref) `res` contains the `result`, which is the inferred system represented as a [`Basis`](@ref), and `metrics`, a `NamedTuple` containing different metrics such as the L2 error of the inferred system on the provided data and the [`AICC`](@ref). These can be accessed via:

```julia
# The inferred system
system = result(res)
# The metrics
m = metrics(res)
```

For Koopman based inference, a [`Basis`](@ref) for lifting the measurements can be passed in as well:

```julia
res = solve(problem, basis, DMDSVD(), kwargs...)
```

If control signals are present, they get processed according to [this paper](https://epubs.siam.org/doi/abs/10.1137/15M1013857?mobileUi=0) for dynamic mode decomposition and [as described here](https://epubs.siam.org/doi/pdf/10.1137/16M1062296) for extended dynamic mode decomposition assuming a linear relationship on the operator.

Possible keyword arguments include:
+ `B` a linear mapping known a priori which maps the control signals onto the lifted states
+ `digits` controls the digits / rounding used for deriving the system equations (`digits = 1` would round `10.02` to `10.0`)
+ `operator_only` returns a `NamedTuple` containing the operator, input and output mapping and matrices used for updating the operator as described [here](https://arxiv.org/pdf/1406.7187.pdf)
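
For example, these options can be combined in a single call (a sketch; `B` is a hypothetical, a priori known input mapping):

```julia
# Known linear input mapping, two-digit rounding and raw operator output
res = solve(problem, DMDSVD(), B = B, digits = 2, operator_only = true)
```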

!!! info
    If `eval_expression` is set to `true`, the returned result of the Koopman based inference will not contain a parametrized equation, but rather use the numeric values of the operator/generator.

SINDy based algorithms can be called like:

```julia
res = solve(problem, basis, STLSQ(), kwargs...)
```

For implicit problems, the implicit variables defining equations ``f(u_t, u, p, t) = 0`` can be passed in:

```julia
res = solve(problem, basis, ImplicitOptimizer(), implicits, kwargs...)
```

This indicates that we do not have coupling between two implicitly defined variables unless coupling is explicitly included. To elaborate, consider the following:

```julia
basis = Basis([x, y, z], [x,y,z])
res = solve(problem, basis, ImplicitOptimizer(), [x,y])
```

With this basis, no candidate function couples the implicit variables `x` and `y`. Extending the basis by a coupling term changes this:

```julia
basis = Basis([x, y, z, x*y], [x,y,z])
res = solve(problem, basis, ImplicitOptimizer(), [x,y])
```

This would allow solutions of the form `x = y*x + z` or `y = y*x + z` to occur while suppressing `x = y + z`, because `y*x` includes both `x` and `y`, so the function gets included in the evaluation.

Possible keyword arguments include:
+ `normalize` normalizes the data matrix ``\Theta`` such that each row (corresponding to a candidate function) has a 2-norm of `1.0`
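
For instance, a normalized sparse regression could be set up like this (a sketch):

```julia
# Normalize the candidate matrix before the sparse regression
res = solve(problem, basis, STLSQ(), normalize = true)
```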

`docs/src/quickstart.md`

```@example 4
system = result(res)
println(system) # hide
```

The [`DataDrivenSolution`](@ref) contains an explicit result which is a [`Koopman`](@ref), defining all necessary information, e.g. the associated operator (which corresponds to our matrix ``A``).

```@example 4
Matrix(system)
```

In general, we can skip the expensive process of deriving a callable symbolic system and only return the operator with its associated mappings:

```@example 4
res = solve(prob, DMDSVD(), digits = 1, operator_only = true)
```

Where `K` is the associated operator given as its eigendecomposition, `B` is a possible mapping of inputs onto the states, `C` is the linear mapping from the lifted observables back onto the original states and `Q` and `P` are used for updating the operator.
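
Assuming the returned `NamedTuple` carries the field names used above (an assumption for illustration), its components can be inspected directly:

```julia
# Field names assumed from the description above
K = res.K           # the operator, given as its eigendecomposition
B = res.B           # mapping of the inputs onto the states
C = res.C           # mapping from the lifted observables back onto the states
P, Q = res.P, res.Q # matrices used for updating the operator
```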

## Nonlinear System with Extended Dynamic Mode Decomposition

Similarly, we can use the [Extended Dynamic Mode Decomposition](https://link.springer.com/article/10.1007/s00332-015-9258-5) via a nonlinear [`Basis`](@ref) of observables. Here, we will look at a rather [famous example](https://arxiv.org/pdf/1510.03007.pdf) with a finite dimensional solution.

```@example 3
using DataDrivenDiffEq
savefig("EDMD_Example_1.png") # hide
```
![](EDMD_Example_1.png)

Since we are dealing with a continuous system in time, we define the associated [`DataDrivenProblem`](@ref) accordingly, using the measured states `X`, their derivatives `DX` and the time `t`.

```@example 3
prob = ContinuousDataDrivenProblem(solution)
```

## Nonlinear Systems - Sparse Identification of Nonlinear Dynamics

To find the underlying system without any [`Algorithms`](@ref koopman_algorithms) related to Koopman operator theory, we can use [Sparse Identification of Nonlinear Dynamics](https://www.pnas.org/content/113/15/3932) - SINDy for short. As the name suggests, it finds the sparsest basis of functions which build the observed trajectory. Again, we will start with a nonlinear system:

```@example 1
using DataDrivenDiffEq
using Plots
using Random
using Symbolics: scalarize

Random.seed!(1111) # For the noise

# Create a nonlinear pendulum
function pendulum(u, p, t)
    # The body is elided in this diff view; a minimal damped pendulum serves
    # as an illustrative stand-in.
    x = u[2]
    y = -9.81sin(u[1]) - 0.3u[2]
    return [x; y]
end
```

!!! info
    A more detailed description of the result can be printed via `print(res, Val{true})`, which also includes the discovered equations and parameter values.

Where the resulting [`DataDrivenSolution`](@ref) stores information about the inferred model and the parameters:

```@example 1
system = result(res);
println(system)
```

The following is another example of how to use the [`ImplicitOptimizer`](@ref), taken from the [original paper](https://royalsocietypublishing.org/doi/10.1098/rspa.2020.0279).

As always, we start by creating a corresponding dataset:

```@example 5
using DataDrivenDiffEq
savefig("SINDy_Example_Data_3.png") # hide
```
![](SINDy_Example_Data_3.png)

Next, we define a sufficient [`Basis`](@ref):

```@example 5
using Symbolics: scalarize
basis = Basis(implicits, [u; du], controls = x, iv = t);
println(basis) # hide
```

We solve the problem by varying over a sufficient set of thresholds for the associated optimizer.
Additionally, we activate the `scale_coefficients` option for the [`ImplicitOptimizer`](@ref), which helps to find sparse equations by normalizing the resulting coefficient matrix after each suboptimization.

To evaluate the Pareto optimal solution, we use the functions `f` and `g`, which can be passed as keyword arguments into the `solve` function. `f` is a function with different signatures for different optimizers, but returns the ``L_0`` norm of the coefficients and the ``L_2`` error of the current model. `g` takes this vector and projects it down onto a scalar, using the ``L_2`` norm by default. Here, however, we want to use the `AIC` of the output of `f`. Additionally, we only want results with two or more active coefficients, so we modify `g` accordingly.
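
A minimal sketch of such a projection, assuming `f` returns a vector `x` with `x[1]` the number of active coefficients (the ``L_0`` norm) and `x[2]` the ``L_2`` error, and with `n` denoting the number of data points (both names are illustrative):

```julia
# Reject models with fewer than two active terms,
# otherwise score them by an AIC-like value
g(x) = x[1] <= 1 ? Inf : 2 * x[1] + n * log(x[2]^2 / n)
```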

```@example 5
λ = [1e-4;5e-4;1e-3;2e-3;3e-3;4e-3;5e-3;6e-3;7e-3;8e-3;9e-3;1e-2;2e-2;3e-2;4e-2;5e-2;
    # remaining thresholds elided in this diff view
    ]
```

`docs/src/symbolic_regression.md`

Using [sparse regression](@ref sparse_optimization) limits the discovery to a generalized linear model.

## SymbolicRegression

!!! warning
    This feature requires the explicit loading of [SymbolicRegression.jl](https://github.com/MilesCranmer/SymbolicRegression.jl) in addition to `DataDrivenDiffEq`. It will _only_ be usable if loaded like:
    ```julia
    using DataDrivenDiffEq
    using SymbolicRegression
    ```
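
The interface then mirrors the other solvers. A sketch, assuming `EQSearch` is constructed from the set of allowed operations (the exact options are elided in this diff view):

```julia
# Hypothetical usage: restrict the search to these operations and solve
alg = EQSearch([+, *, sin, exp])
res = solve(prob, alg)
```
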
## OccamNet

!!! warning
    This feature requires the explicit loading of [Flux.jl](https://fluxml.ai/) in addition to `DataDrivenDiffEq`. It will _only_ be usable if loaded like:
    ```julia
    using DataDrivenDiffEq
    using Flux
    ```

As introduced in [Interpretable Neuroevolutionary Models for Learning Non-Differentiable Functions and Programs
](https://arxiv.org/abs/2007.10784), `OccamNet` is a special form of symbolic regression which takes a probabilistic approach to equation discovery, using a feedforward multilayer neural network. In contrast to normal architectures, each layer's weights reflect the probability of which inputs to use. Additionally, a set of activation functions is used instead of a single function. Similar to simulated annealing, a temperature is included to control the exploration of possible functions.

`DataDrivenDiffEq` offers two main interfaces to `OccamNet`: a `Flux` based API with `Flux.train!` and a `solve(...)` function.

Consider the following example, where we want to discover a vector valued function:

```@example occamnet_flux
using DataDrivenDiffEq
f(x) = [sin(π*x[2]+x[1]); exp(x[2])]
Y = hcat(map(f, eachcol(X))...)
```

Next, we define our network:

```@example occamnet_flux
net = OccamNet(2, 2, 3, Function[sin, +, *, exp], skip = true, constants = Float64[π])
```

Where `2, 2, 3` refer to the input dimension, the output dimension and the number of layers _without the output layer_. We also define that each layer uses the functions `sin, +, *, exp` as activations and uses `π` as a constant, which gets concatenated to the input data. Additionally, `skip` indicates the usage of skip connections, which allow the output of each layer to be passed onto the output layer directly.

To train the network over `100` epochs using `ADAM`, we type:
```@example occamnet_flux
Flux.train!(net, X, Y, ADAM(1e-2), 100, routes = 100, nbest = 3)
```

Under the hood, we select `routes` possible routes through the network, based on the probability reflected by the [`ProbabilityLayer`](@ref) forming the network. From these, we take the `nbest` candidates and train the parameters of the network, meaning we increase the probability of those routes.

Let's have a look at some possible equations after the initial training. We can use `rand` to sample a route through the network, compute the output probability with `probability` and transform it into analytical equations by simply using `ModelingToolkit`'s variables as input. The call `net(x, route)` uses the route to compute just the element on this path.

```@example occamnet_flux
for i in 1:10
    # Sample a route and print the resulting equation with its probability;
    # the exact printing in the original example is elided in this diff view.
    route = rand(net)
    println(net(x, route), " with probability ", probability(net, route))
end
```

The network is quite certain about the equation now, which is in fact our unknown mapping. To extract the solution with the highest probability, we set the temperature of the underlying distribution to a very low value. In the limit of `t → 0` we approach a Dirac distribution, hence extracting the most likely terms.

```@example occamnet_flux
set_temp!(net, 0.01)
```

Within `solve`, the network is generated using the information provided by the [DataDrivenProblem](@ref) in the form of states, controls and independent variables, as well as the specified options. The network is then trained, and the equation with the highest probability is extracted by setting the temperature as above. After computing additional metrics, a [DataDrivenSolution](@ref) is returned, where the equations are transformed into a [`Basis`](@ref) usable with `ModelingToolkit`.

The metrics can be accessed via:

```@example occamnet_flux
metrics(res)
```

and the resulting [`Basis`](@ref) by:

```@example occamnet_flux
result(res)
```