Commit

Merge pull request #113 from acfr/112-docs-for-observer-example-are-outdated

Updated random seeds and observer documentation
nic-barbara committed Aug 2, 2023
2 parents d4aa178 + 322e1a8 commit 4c26155
Showing 12 changed files with 250 additions and 213 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/documentation.yml
@@ -16,7 +16,7 @@ jobs:
       - uses: actions/checkout@v2
       - uses: julia-actions/setup-julia@v1
         with:
-          version: '1.6'
+          version: '1'
       - name: Install dependencies
         run: |
           julia --project=docs/ -e '
4 changes: 2 additions & 2 deletions README.md
@@ -23,7 +23,7 @@ using Random
 using RobustNeuralNetworks
 
 # Setup
-rng = MersenneTwister(42)
+rng = Xoshiro(42)
 batches = 10
 nu, nx, nv, ny = 4, 2, 20, 1
 
@@ -44,7 +44,7 @@ println(round.(y1;digits=2))
 The output should be:
 
 ```julia
-[1.38 0.56 0.89 2.11 2.14 0.89 1.63 0.44 1.24 1.26]
+[-1.49 0.75 1.34 -0.23 -0.84 0.38 0.79 -0.1 0.72 0.54]
 ```
 
 ## Citing the Package
10 changes: 5 additions & 5 deletions docs/src/examples/box_obsv.md
@@ -81,7 +81,7 @@ Xt = X[1:end-1]
 Xn = X[2:end]
 y = gd.(Xt)
 ```
-With that done, we store the data for training, shuffling it so there is the data is not in simulation order.
+With that done, we store the data for training, shuffling it so there is no bias in the training towards earlier timesteps.
 ```julia
 observer_data = [[ut; yt] for (ut,yt) in zip(u, y)]
 indx = shuffle(rng, 1:length(observer_data))
@@ -95,7 +95,7 @@ Since we need our model to be a contracting dynamical system, the obvious choice
 ```julia
 using RobustNeuralNetworks
 
-nv = 100
+nv = 200
 nu = size(observer_data[1], 1)
 ny = nx
 model_ps = ContractingRENParams{Float64}(nu, nx, nv, ny; output_map=false, rng)
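Aside: the collapsed lines below this hunk presumably wrap the parameterisation in a trainable model, as the package's other examples do. A minimal sketch, assuming the `DiffREN` wrapper is the one used here:

```julia
# Assumed step (not shown in this hunk): DiffREN re-computes the explicit
# model from the direct parameters on every forward call, so it can be
# trained directly with Flux
model = DiffREN(model_ps)
```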
@@ -119,7 +119,7 @@ We've written a function to train the observer that decreases the learning rate
 using Flux
 using Printf
 
-function train_observer!(model, data; epochs=100, lr=1e-3, min_lr=1e-4)
+function train_observer!(model, data; epochs=50, lr=1e-3, min_lr=1e-6)
 
     opt_state = Flux.setup(Adam(lr), model)
     mean_loss = [1e5]
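The body of `train_observer!` is collapsed in this view. A minimal sketch of a loop matching the description above (drop the learning rate when the mean loss plateaus, stop once it falls below `min_lr`); the `loss(model, u, xn)` function and the `data` iterator are assumptions carried over from the surrounding example:

```julia
using Statistics

function train_observer!(model, data; epochs=50, lr=1e-3, min_lr=1e-6)

    opt_state = Flux.setup(Adam(lr), model)
    mean_loss = [1e5]
    for epoch in 1:epochs
        batch_loss = []
        for (u, xn) in data
            # One-step-ahead prediction loss and gradient; loss() is assumed
            # to be defined earlier in the example
            train_loss, ∇J = Flux.withgradient(loss, model, u, xn)
            Flux.update!(opt_state, model, ∇J[1])
            push!(batch_loss, train_loss)
        end
        @printf "Epoch: %d, Loss: %.4g, lr: %.2g\n" epoch mean(batch_loss) lr

        # Drop the learning rate whenever the loss stops improving
        if mean(batch_loss) >= mean_loss[end]
            lr /= 2
            Flux.adjust!(opt_state, lr)
            (lr < min_lr) && break
        end
        push!(mean_loss, mean(batch_loss))
    end
    return mean_loss
end
```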
@@ -151,7 +151,7 @@ tloss = train_observer!(model, data)
 Now that we've trained the REN observer to minimise the one-step-ahead prediction error, let's see if the observer error actually does converge to zero. First, we'll need some test data.
 ```julia
 batches = 50
-ts_test = 1:Int(10/dt)
+ts_test = 1:Int(20/dt)
 u_test = fill(zeros(1, batches), length(ts_test))
 x_test = fill(zeros(nx,batches), length(ts_test))
 x_test[1] = 0.2*(2*rand(rng, nx, batches) .-1)
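The rollout that follows is also collapsed. One way the trained observer might be simulated on this test data; the `observer_inputs` array (measurements stacked with the known inputs, as in training) is an assumption:

```julia
# Sketch: roll the observer forward from an initial state estimate,
# feeding it the measured input/output sequence at each timestep
function rollout(model, x0, inputs)
    x = x0
    x̂ = [x0]
    for u in inputs
        x, _ = model(x, u)   # an AbstractREN call returns (next state, output)
        push!(x̂, x)
    end
    return x̂
end

x̂0 = zeros(nx, batches)                  # observer starts from the zero state
x̂ = rollout(model, x̂0, observer_inputs)
```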
@@ -194,7 +194,7 @@ function plot_results(x, x̂, ts)
     fig = Figure(resolution = (800, 400))
     ga = fig[1,1] = GridLayout()
 
-    ax1 = Axis(ga[1,1], xlabel="Time (s)", ylabel="Position (m)", title="Actual")
+    ax1 = Axis(ga[1,1], xlabel="Time (s)", ylabel="Position (m)", title="States")
     ax2 = Axis(ga[1,2], xlabel="Time (s)", ylabel="Position (m)", title="Observer Error")
     ax3 = Axis(ga[2,1], xlabel="Time (s)", ylabel="Velocity (m/s)")
     ax4 = Axis(ga[2,2], xlabel="Time (s)", ylabel="Velocity (m/s)")
6 changes: 3 additions & 3 deletions docs/src/examples/lbdn_curvefit.md
@@ -40,7 +40,7 @@ using Random
 using RobustNeuralNetworks
 
 # Random seed for consistency
-rng = MersenneTwister(42)
+rng = Xoshiro(0)
 
 # Model specification
 nu = 1 # Number of inputs
@@ -151,8 +151,8 @@ ŷ = map(x -> model([x])[1], xs)
 
 # Plot
 lines!(xs, ys, label = "Data")
-lines!(xs, ybest, label = "Maximum slope = 10.0")
-lines!(xs, ŷ, label = "LBDN: slope = $(round(Empirical_Lipschitz; digits=2))")
+lines!(xs, ybest, label = "Max. slope = 10.0")
+lines!(xs, ŷ, label = "LBDN slope = $(round(Empirical_Lipschitz; digits=2))")
 axislegend(ax, position=:lt)
 save("lbdn_curve_fit.svg", f1)
 ```
6 changes: 3 additions & 3 deletions docs/src/introduction/getting_started.md
@@ -17,7 +17,7 @@ using Random
 using RobustNeuralNetworks
 
 # Setup
-rng = MersenneTwister(42)
+rng = Xoshiro(42)
 batches = 10
 nu, nx, nv, ny = 4, 10, 20, 1
 γ = 1
@@ -38,7 +38,7 @@ println(round.(y1; digits=2))
 # output
 
-[0.98 1.24 0.86 1.93 1.08 1.19 1.23 1.4 0.95 0.65]
+[1.06 1.13 0.95 0.93 1.03 0.78 0.75 1.42 0.89 1.44]
 ```
 
 For detailed examples of training models from `RobustNeuralNetworks.jl`, we recommend starting with [Fitting a Curve with LBDN](@ref) and working through the subsequent examples.
@@ -56,7 +56,7 @@ using RobustNeuralNetworks
 Let's set a random seed and define our batch size and some hyperparameters. For this example, we'll build a Lipschitz-bounded REN with 4 inputs, 1 output, 10 states, 20 neurons, and a Lipschitz bound of `γ = 1`.
 
 ```@example walkthrough
-rng = MersenneTwister(42)
+rng = Xoshiro(42)
 batches = 10
 γ = 1
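The remainder of this walkthrough hunk is collapsed. Based on the hyperparameters quoted in the prose, the elided construction is presumably along these lines (a sketch; the exact constructor arguments are assumptions):

```julia
# Sketch: a Lipschitz-bounded REN with the sizes quoted above
nu, nx, nv, ny = 4, 10, 20, 1
model_ps = LipschitzRENParams{Float64}(nu, nx, nv, ny, γ; rng)
model = REN(model_ps)
```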
375 changes: 207 additions & 168 deletions examples/results/lbdn-curvefit/lbdn_curve_fit.svg
2 changes: 1 addition & 1 deletion examples/src/contracting_ren.jl
@@ -10,7 +10,7 @@ using RobustNeuralNetworks
 
 rng = MersenneTwister(42)
 
-# Create a contracting REN with just its state as an output
+# Create a contracting REN with just its state as an output, slow dynamics
 nu, nx, nv, ny = 1, 1, 10, 1
 ren_ps = ContractingRENParams{Float64}(nu, nx, nv, ny; output_map=false, rng, init=:cholesky)
 ren = REN(ren_ps)
38 changes: 18 additions & 20 deletions examples/src/lbdn_curvefit.jl
@@ -11,7 +11,16 @@ using Random
 using RobustNeuralNetworks
 
 # Random seed for consistency
-rng = MersenneTwister(42)
+rng = Xoshiro(0)
+
+# Function to estimate
+f(x) = x < 0 ? 0 : 1
+
+# Training data
+dx = 0.01
+xs = -0.3:dx:0.3
+ys = f.(xs)
+data = zip(xs,ys)
 
 # Model specification
 nu = 1 # Number of inputs
@@ -23,15 +32,6 @@ nh = fill(16,4) # 4 hidden layers, each with 16 neurons
 model_ps = DenseLBDNParams{Float64}(nu, nh, ny, γ; rng)
 model = DiffLBDN(model_ps)
 
-# Function to estimate
-f(x) = x < 0 ? 0 : 1
-
-# Training data
-dx = 0.01
-xs = -0.3:dx:0.3
-ys = f.(xs)
-data = zip(xs,ys)
-
 # Loss function
 loss(model,x,y) = Flux.mse(model([x]),[y])
@@ -48,16 +48,14 @@ function progress(model, iter, xs, ys, dx)
 end
 
 # Define hyperparameters
-num_epochs = [400, 200]
-lrs = [2e-4, 5e-5]
+num_epochs = 300
+lr = 2e-4
 
 # Train with the Adam optimiser
-for k in eachindex(lrs)
-    opt_state = Flux.setup(Adam(lrs[k]), model)
-    for i in 1:num_epochs[k]
-        Flux.train!(loss, model, data, opt_state)
-        (i % 50 == 0) && progress(model, i, xs, ys, dx)
-    end
+opt_state = Flux.setup(Adam(lr), model)
+for i in 1:num_epochs
+    Flux.train!(loss, model, data, opt_state)
+    (i % 50 == 0) && progress(model, i, xs, ys, dx)
 end
 
 # Print out lower-bound on Lipschitz constant
@@ -73,8 +71,8 @@ ybest = get_best.(xs)
 ŷ = map(x -> model([x])[1], xs)
 
 lines!(xs, ys, label = "Data")
-lines!(xs, ybest, label = "Maximum slope = 10.0")
-lines!(xs, ŷ, label = "LBDN: slope = $(round(Empirical_Lipschitz; digits=2))")
+lines!(xs, ybest, label = "Slope restriction = 10.0")
+lines!(xs, ŷ, label = "LBDN slope = $(round(Empirical_Lipschitz; digits=2))")
 axislegend(ax, position=:lt)
 display(fig)
 save("../results/lbdn-curvefit/lbdn_curve_fit.svg", fig)
4 changes: 2 additions & 2 deletions src/Wrappers/LBDN/lbdn.jl
@@ -42,7 +42,7 @@ using Random
 using RobustNeuralNetworks
 
 # Setup
-rng = MersenneTwister(42)
+rng = Xoshiro(42)
 batches = 10
 γ = 20.0
@@ -61,7 +61,7 @@ println(round.(y; digits=2))
 # output
 
-[-0.69 -1.89 -9.68 3.47 -11.65 -4.48 -4.53 3.61 1.37 -0.68]
+[-1.11 -1.01 -0.07 -2.25 -4.22 -1.76 -3.82 -1.13 -11.85 -3.01]
 ```
 """
 function (m::AbstractLBDN)(u::AbstractVecOrMat)
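The middle of this docstring example is collapsed. It presumably constructs and evaluates a dense LBDN along these lines (the layer sizes here are placeholders, not the docstring's actual values):

```julia
# Sketch: build an LBDN with Lipschitz bound γ and evaluate it on a batch
nu, nh, ny = 2, [10, 5], 4          # placeholder sizes
model_ps = DenseLBDNParams{Float64}(nu, nh, ny, γ; rng)
model = LBDN(model_ps)

u = randn(rng, nu, batches)
y = model(u)
```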
8 changes: 4 additions & 4 deletions src/Wrappers/LBDN/sandwich_fc.jl
@@ -32,13 +32,13 @@ A non-expansive layer is a layer with a Lipschitz bound of exactly 1. This layer
 We can build a dense LBDN directly using `SandwichFC` layers. The model structure is described in Equation 8 of [Wang & Manchester (2023)](https://proceedings.mlr.press/v202/wang23v.html).
 
-```julia
+```jldoctest
 using Flux
 using Random
 using RobustNeuralNetworks
 
 # Random seed for consistency
-rng = MersenneTwister(42)
+rng = Xoshiro(42)
 
 # Model specification
 nu = 1 # Number of inputs
@@ -56,14 +56,14 @@ model = Flux.Chain(
 )
 
 # Evaluate on dummy inputs
-u = 10*randn(nu, 10)
+u = 10*randn(rng, nu, 10)
 y = model(u)
 
 println(round.(y;digits=2))
 
 # output
 
-[5.66 2.45 3.98 2.59 0.75 6.14 0.89 5.43 4.11 4.65]
+[3.62 4.74 3.58 8.75 3.64 3.0 0.73 1.16 1.0 1.73]
 ```
 
 See also [`DenseLBDNParams`](@ref), [`DiffLBDN`](@ref).
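The `Flux.Chain` itself is collapsed in the second hunk. A sketch of how such a model might be assembled from `SandwichFC` layers, with the Lipschitz bound split across two scaling maps as in Equation 8; the sizes and keyword arguments are assumptions:

```julia
# Sketch: a dense LBDN built by hand from 1-Lipschitz sandwich layers
γ = 5                    # placeholder Lipschitz bound
nh, ny = 10, 1           # placeholder sizes
model = Flux.Chain(
    (x) -> (√γ * x),
    SandwichFC(nu => nh, relu; T=Float64, rng),
    SandwichFC(nh => nh, relu; T=Float64, rng),
    (x) -> (√γ * x),
    SandwichFC(nh => ny; output_layer=true, T=Float64, rng),
)
```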
4 changes: 2 additions & 2 deletions src/Wrappers/REN/ren.jl
@@ -43,7 +43,7 @@ using Random
 using RobustNeuralNetworks
 
 # Setup
-rng = MersenneTwister(42)
+rng = Xoshiro(42)
 batches = 10
 nu, nx, nv, ny = 4, 2, 20, 1
@@ -62,7 +62,7 @@ println(round.(y1;digits=2))
 # output
 
-[1.38 0.56 0.89 2.11 2.14 0.89 1.63 0.44 1.24 1.26]
+[-1.49 0.75 1.34 -0.23 -0.84 0.38 0.79 -0.1 0.72 0.54]
 ```
 
 See also [`REN`](@ref), [`WrapREN`](@ref), and [`DiffREN`](@ref).
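The construction and evaluation between these two hunks is collapsed. A sketch of the likely elided steps; the constructor and `init_states` helper follow the package's other examples, so treat this as an assumption rather than the docstring's exact code:

```julia
# Sketch: build a contracting REN and evaluate one step of its dynamics
ren_ps = ContractingRENParams{Float64}(nu, nx, nv, ny; rng)
ren = REN(ren_ps)

x0 = init_states(ren, batches)     # zero initial state of the right size
u0 = randn(rng, nu, batches)
x1, y1 = ren(x0, u0)
```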
4 changes: 2 additions & 2 deletions src/Wrappers/REN/wrap_ren.jl
@@ -29,7 +29,7 @@
 using RobustNeuralNetworks
 
 # Setup
-rng = MersenneTwister(42)
+rng = Xoshiro(42)
 batches = 10
 nu, nx, nv, ny = 4, 10, 20, 2
@@ -56,7 +56,7 @@ println(round(ren.explicit.B2[10];digits=4))
 # output
 
-0.0051
+-0.0034
 ```
 
 See also [`AbstractREN`](@ref), [`REN`](@ref), and [`DiffREN`](@ref).
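For context on the printed value: `WrapREN` computes its explicit parameters once, so they must be refreshed by hand whenever the direct parameters change. A sketch of the likely elided step; the parameter tweak shown is illustrative only:

```julia
# Sketch: perturb a direct parameter, then rebuild the explicit model so
# the change shows up in ren.explicit
ren.params.direct.B2 .*= rand(rng, size(ren.params.direct.B2)...)
update_explicit!(ren)

println(round(ren.explicit.B2[10]; digits=4))
```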
