
Conversation


@AlCap23 commented Jan 27, 2022

Adds more insight into the problem via show and print. Allows naming a problem.
Asserts whether a problem is applicable to a basis.
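
A minimal sketch of how this might look; the name keyword and the printed summary are assumptions for illustration, not confirmed API:

# Hypothetical: give the problem a name at construction time
prob = ContinuousDataDrivenProblem(X, t, name = :Pendulum)
# The extended show/print would then summarize the named problem,
# e.g. its dimensions and number of samples
println(prob)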

TODO: Subsampling of the problem


@AlCap23 commented Jan 29, 2022

Current example

using DataDrivenDiffEq
using LinearAlgebra
using ModelingToolkit
using OrdinaryDiffEq
using Plots
using Statistics
using StatsBase
using Random

# Candidate basis: polynomials up to degree 5 plus sine and cosine terms
@variables u[1:2]
basis = Basis([polynomial_basis(u, 5); sin.(u); cos.(u)], u)

# Ground truth: damped pendulum with an additional cosine term
function pendulum(u, p, t)
    x = u[2]
    y = -9.81sin(u[1]) - 0.1u[2]^3 - 0.2 * cos(u[1])
    return [x; y]
end

u0 = [0.99π; -1.0]
tspan = (0.0, 20.0)
dt = 0.1
prob = ODEProblem(pendulum, u0, tspan)
sol = solve(prob, Tsit5(), saveat = dt)

# Corrupt the measurements with Gaussian noise
X = Array(sol)
t = sol.t
Random.seed!(1234)
X = X .+ 1e-1 * randn(size(X))

# Smooth the data and estimate derivatives via a Gaussian kernel
dd_prob_noisy = ContinuousDataDrivenProblem(X, t, GaussianKernel())

# 80/20 train/test split with repeated, shuffled batches of at least 40 samples
sampler = DataSampler(Split(ratio = 0.8), Batcher(n = 4, shuffle = true, repeated = true, batchsize_min = 40))
# Sweep the ADMM sparsity threshold over a range of candidates
opt = ADMM(1e-3:1e-3:5e-1)
sol = solve(dd_prob_noisy, basis, opt, maxiter = 1000, by = :stat, progress = true, denoise = true, normalize = true, sampler = sampler)
sol

Output of the result and parameters

julia> println(result(sol))
Model ##Basis#413 with 2 equations
States : u[1] u[2]
Parameters : 7
Independent variable: t
Equations
Differential(t)(u[1]) = p₁*u[2]
Differential(t)(u[2]) = p₂*u[2] + p₃*(u[2]^3) + p₄*(u[2]^5) + p₅*sin(u[1]) + p₆*cos(u[1]) + p₇*cos(u[2])

julia> parameters(sol)
7-element Vector{Measurements.Measurement{Float64}}:
   0.958 ± 0.051
  -0.075 ± 0.096
  -0.072 ± 0.049
 -0.0025 ± 0.005
   -8.76 ± 0.31
    -0.1 ± 0.2
    -0.2 ± 0.14

The DataSampler consists of two major components, which can also be used individually: Split defines a train/test split, and Batcher creates batches of the training data. Both take random seeds (these still need to be exposed in the API) to ensure reproducibility.
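
Used on their own, the pieces look like this; a sketch assuming only the constructors and keywords shown above:

# Only a train/test split, here with 90 % training data
sampler_split = DataSampler(Split(ratio = 0.9))

# Only repeated, shuffled batching of the data
sampler_batch = DataSampler(Batcher(n = 10, shuffle = true, repeated = true))

sol_split = solve(dd_prob_noisy, basis, opt, sampler = sampler_split)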

The keyword by selects how the best result is chosen:

  • :min returns the solution with the best test error
  • :kfold returns the solution with the best average error over all training sets
  • :stat returns the mean ± std over all solutions

Any other value falls back to :min; a usage sketch follows below.
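
For example, to select by the average error over all folds (keywords as in the example above):

sol_kfold = solve(dd_prob_noisy, basis, opt, maxiter = 1000, by = :kfold, sampler = sampler)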

Additionally, the original output of SINDy now contains all the best sparsity parameters, which can be retrieved via output(sol).λ


@AlCap23 commented Feb 4, 2022

I still need to adapt the docs, but tests are green locally.

For this I'll open a separate PR, since I want to switch to Literate.jl for the examples; that would also let us generate notebooks as easy entry points.


@AlCap23 commented Feb 4, 2022

This is the same script.

[Screenshot 2022-02-04 at 17:34:31: output of the same script]

@AlCap23 marked this pull request as ready for review February 5, 2022 06:08

@AlCap23 commented Feb 5, 2022

I've flagged the test as broken for now. There is no evident reason to me why the result is good on v1.7 and off on v1.6.
I'll look into this once this is partially done, but maybe we should hold off on releasing for now.

Edit: I've checked some of the possible sources, but found no clear cause of the problem right now. Since using more samples seems to work out on both versions, I've adapted the tests slightly. I am still confused as to why this is an issue; there could have been scaling issues and minor bugs related to implicit identification which played in our favour here.

@AlCap23 merged commit 1983e8b into master Feb 15, 2022