# DS4DS Homework Exercise Sheet 6

In [1]:
using OrdinaryDiffEq
using MAT
using LinearAlgebra

### Task 1: OLS for the nonlinear pendulum - (5 points)

We once again consider the nonlinear, frictionless pendulum that we have already seen in Exercise 02. The state-space representation of the pendulum is given by

$$
\begin{align}
    \mathbf{x}(t) = \begin{bmatrix}\theta(t)\\ \frac{\mathrm{d}\theta(t)}{\mathrm{d}t} \end{bmatrix}
\end{align}
$$

and

$$
\begin{align}
    \frac{\mathrm{d}\mathbf{x}(t)}{\mathrm{d}t} = \begin{bmatrix}x_2(t) \\ -\frac{g}{L} \sin{(x_1(t))} \end{bmatrix}
\end{align}
$$

where $g = 9.81 \frac{\mathrm{m}}{\mathrm{s}^2}$ is the magnitude of the gravitational field, $L = 1 \mathrm{m}$ is the length of the rod, $\theta$ is the angle from the vertical axis, and $\frac{\mathrm{d}\mathbf{\theta}(t)}{\mathrm{d}t}$ is the angular velocity.

In [2]:
# Model parameters defined as constants, i.e., they can be also used within function calls
const g = 9.81; # gravity constant
const L = 1.0; # length of pendulum

**a)** Now we want to study the system using data. To this end, implement the function `Tsit5_ode_solver` and the function `pendulum!` (which contains the right-hand side of the ODE) outlined below. Within your solution, you are to use the `Tsit5()`-ODE Solver from the `OrdinaryDiffEq`-package. The state trajectory must be specified at the times $t=0, \Delta t, 2 \cdot \Delta t \dots$. **- (1 point)**

In [3]:
# Define the ODE model shown above as a function which can be passed to a solver

function pendulum!(dx, x, p, t)
    """Consider the examples at https://github.com/SciML/OrdinaryDiffEq.jl or examples from the lecture for guidance."""

    #--- YOUR CODE STARTS HERE ---#
    dx[1] = x[2]
    dx[2] = -(g / L) * sin(x[1])
    #--- YOUR CODE ENDS HERE ---#

end


function Tsit5_ode_solver(ode_function, x0, tspan, dt)
    """Defines the ODE problem and solves it.
    
    Args:
        ode_function: The ODE defined as a function that can be passed to the solver
        x0: Initial state of the system
        tspan: time-span in which the system should be evaluated (This is a tuple with two values (t_0, t_e))
        dt: The length of each of the timesteps
    
    Returns:
        t: A vector containing the time for the corresponding elements in the state trajectory
        x: State trajectory as a Matrix in the form (n_states, N) where N is the length of the trajectory
    """
# 27
    #--- YOUR CODE STARTS HERE ---#
    t_0, t_e = tspan
    #N = Int((t_e - t_0) / dt) + 1
    #t = collect((0:(N-1)) .* dt)
    prob = ODEProblem(pendulum!, x0, tspan)
    sol = solve(prob, Tsit5(), dt=dt, adaptive=false)
    t = sol.t
    x = transpose(stack(sol.u, dims=1))
    #--- YOUR CODE ENDS HERE ---#

    return t, x
end

Tsit5_ode_solver (generic function with 1 method)

In [4]:
@assert isa(pendulum!, Function)
@assert isa(Tsit5_ode_solver, Function)


# test shapes with dummy data
t, x = Tsit5_ode_solver(pendulum!, [0, 0], (0, 4), 0.01)
@assert size(t) == (401,)
@assert size(x) == (2, 401)


In [5]:
# please leave this cell as it is


**b)** Use the functions implemented above to simulate the system on the time horizon $[t_0, t_e] = [0 \mathrm{s}, 2.5 \mathrm{s}]$ with $\Delta t = 0.01 \mathrm{s}$ and create trajectory data for different initial conditions for $\mathbf{x}_0$

- i.)  $ \,\, \mathbf{x}_{0,i} = \begin{bmatrix}\frac{\pi}{9}\\ 0 \end{bmatrix}$
- ii.) $ \, \mathbf{x}_{0,ii} = \begin{bmatrix}\frac{\pi}{4}\\ 0 \end{bmatrix}$
- iii.) $ \mathbf{x}_{0,iii} = \begin{bmatrix}\frac{\pi}{2}\\ 0 \end{bmatrix}$.

Find the period $T$ with which the system oscillates for the different initial conditions. To this end, find the first point $t$ when $\theta(t) \approx \theta_0$ (take the point from your discrete trajectory that is closest to this value with regard to the 2-norm of the error). Save your results for $T$ in the variables $T_i$, $T_{ii}$ and $T_{iii}$ respectively. **- (1 point)**

In [6]:
function get_T_estimate(x0)

    #--- YOUR CODE STARTS HERE ---#
    error_tol = 0.05
    t, x = Tsit5_ode_solver(pendulum!, x0, tspan, dt)
    min = 10000.
    T = -1
    for i = 5:length(t)
        diff = sqrt((x0[1] - x[1,i])^2 + (x0[2] - x[2,i])^2)
        if diff <= error_tol
            min = diff
            T = i*dt
            break
        end
    end
    #--- YOUR CODE ENDS HERE ---#

    return T, x
end


# store your time settings in these variables
tspan = NaN
dt = NaN


# store the estimated oscillation periods for the initial conditions i), ii), iii)
T_i = NaN
T_ii = NaN
T_iii = NaN

# store the trajectories for the initial conditions i), ii), iii)
x_i = NaN
x_ii = NaN
x_iii = NaN

#--- YOUR CODE STARTS HERE ---#
tspan = (0, 2.5)
dt = 0.01

x0_i = [pi/9, 0]
x0_ii = [pi/4, 0]
x0_iii = [pi/2, 0]

T_i, x_i = get_T_estimate(x0_i)
T_ii, x_ii = get_T_estimate(x0_ii)
T_iii, x_iii = get_T_estimate(x0_iii)
#--- YOUR CODE ENDS HERE ---#

(2.38, [1.5707963267948966 1.570305826798857 … 1.4976140920459031 1.485147769699737; 0.0 -0.0980999976398097 … -1.197728747541993 -1.2955195618889237])

In [7]:
for T in [T_i, T_ii, T_iii]
    
    @assert isa(T, Number) "Ensure that T is a scalar value."
    
    @assert 0 < T <= 2.5 "Ensure that T is in the accepted value range."
    @assert isapprox(T / 0.01 - round(T / 0.01), 0, rtol=0, atol=1e-7) "Ensure that T is a multiple of the step size."
end

for x in [x_i, x_ii, x_iii]
   @assert size(x) == (2, 251) "Ensure that the state trajectory has the correct shape."
end


In [8]:
# please leave this cell as it is


**c)** Use the OLS approach to identify the matrix $\mathbf{\tilde{A}}$ of an approximated linear system of the form $\mathbf{x}[k+1] = \mathbf{\tilde{A}} \mathbf{x}[k]$. Use your trajectories from a) and identify one matrix per trajectory. **- (0.5 points)**

In [9]:
function estimate_A_tilde_OLS(x)

    #--- YOUR CODE STARTS HERE ---#
    Y = x[:, 2:end]
    Z = x[:, 1:end-1]
    A_tilde = inv(Z * Z') * Z * Y'
    #--- YOUR CODE ENDS HERE ---#

    return A_tilde
end

A_tilde_i = NaN
A_tilde_ii = NaN
A_tilde_iii = NaN

#--- YOUR CODE STARTS HERE ---#
A_tilde_i = estimate_A_tilde_OLS(x_i)
A_tilde_ii = estimate_A_tilde_OLS(x_ii)
A_tilde_iii = estimate_A_tilde_OLS(x_iii)
#--- YOUR CODE ENDS HERE ---#

2×2 Matrix{Float64}:
 0.999651    -0.069871
 0.00999864   0.99961

In [10]:
for A_tilde in [A_tilde_i, A_tilde_ii, A_tilde_iii]
    @assert size(A_tilde) == (2,2)
end


In [11]:
# please leave this cell as it is


**d)** Approximate the eigenfrequencies $\omega$ of the system for the three initial conditions given above using the $\mathbf{\tilde{A}}$ matrices. Then use these frequencies to approximate the oscillating periods $\tilde{T}$ of the learned linear systems. **- (1 point)**

In [12]:
function calculate_frequency(A_tilde)

    #--- YOUR CODE STARTS HERE ---#
    omega = abs.(imag(log.(eigvals(A_tilde)) / dt))[1]
    #--- YOUR CODE ENDS HERE ---#

    return omega
end

omega_i = NaN
omega_ii = NaN
omega_iii = NaN

T_tilde_i = NaN
T_tilde_ii = NaN
T_tilde_iii = NaN

#--- YOUR CODE STARTS HERE ---#
omega_i = calculate_frequency(A_tilde_i)
omega_ii = calculate_frequency(A_tilde_ii)
omega_iii = calculate_frequency(A_tilde_iii)

for i = 1:N
    count_i = 0
    if abs(sin(omega_i * i*dt)) < 0.01
        T_tilde_i = i*dt
        count_i += 1
        if count_i == 2
            break
        end
    end
end

for i = 1:N
    count_ii = 0
    if abs(sin(omega_ii * i*dt)) < 0.01
        T_tilde_ii = i*dt
        count_ii += 1
        if count_ii == 2
            break
        end
    end
end

for i = 1:N
    count_iii = 0
    if abs(sin(omega_iii * i*dt)) < 0.01
        T_tilde_iii = i*dt
        count_iii += 1
        if count_iii == 2
            break
        end
    end
end
#--- YOUR CODE ENDS HERE ---#

UndefVarError: UndefVarError: `N` not defined in `Main`
Suggestion: check for spelling errors or missing imports.

In [13]:
for omega in [omega_i, omega_ii, omega_iii]
    @assert isa(omega, Number) "The frequency is a scalar."
    @assert omega > 0 "The frequency is positive."
end


In [14]:
for T in [T_tilde_i, T_tilde_ii, T_tilde_iii]
    @assert isa(T, Number) "The oscillation period is a scalar."
    @assert T > 0 "The oscillation period is positive."
end


AssertionError: AssertionError: The oscillation period is positive.

In [15]:
# please leave this cell as it is


**e)** Determine the relative error between the estimates for the oscillating periods from subtask b) and d). Consider the value from b) to be the truth. **- (0.5 points)**

In [16]:
rel_error_i = NaN
rel_error_ii = NaN
rel_error_iii = NaN

#--- YOUR CODE STARTS HERE ---#
rel_error_i = (T_tilde_i - T_i) / T_i
rel_error_ii = (T_tilde_ii - T_ii) / T_ii
rel_error_iii = (T_tilde_iii - T_iii) / T_iii
#--- YOUR CODE ENDS HERE ---#

NaN

In [17]:
# please leave this cell as it is


**f)** Use the estimated matrices from c) to predict the system behavior for the three different initial conditions over the same timespan as before. Compute the MSE between the true trajectory and the linear approximation. **- (1 point)**

The MSE is defined as

$$
\begin{align}
    \mathrm{MSE}(\begin{bmatrix}\mathbf{x}[1] & \dots & \mathbf{x}[N]\end{bmatrix}, \begin{bmatrix}\mathbf{\tilde{x}}[1] & \dots & \mathbf{\tilde{x}}[N]\end{bmatrix}) 
    = \frac{1}{N} \sum_{k=1}^N \| \mathbf{x}[k] - \mathbf{\tilde{x}}[k] \|_2^2
\end{align}.
$$

In [18]:
trajectory_MSE_i = NaN
trajectory_MSE_ii = NaN
trajectory_MSE_iii = NaN

#--- YOUR CODE STARTS HERE ---#
x_tilde_i = A_tilde_i * [[pi/9, 0] x_i[:,1:end-1]]
x_tilde_ii = A_tilde_ii * [[pi/4, 0] x_ii[:,1:end-1]]
x_tilde_iii = A_tilde_iii * [[pi/2, 0] x_iii[:,1:end-1]]

trajectory_MSE_i = 0
trajectory_MSE_ii = 0
trajectory_MSE_iii = 0

N = size(x_tilde_i)[2]

for i = 1:N
    trajectory_MSE_i += sqrt((x_i[1, i] - x_tilde_i[1, i])^2 + (x_i[2, i] - x_tilde_i[2, i])^2)
    trajectory_MSE_ii += sqrt((x_ii[1, i] - x_tilde_ii[1, i])^2 + (x_ii[2, i] - x_tilde_ii[2, i])^2)
    trajectory_MSE_iii += sqrt((x_iii[1, i] - x_tilde_iii[1, i])^2 + (x_iii[2, i] - x_tilde_iii[2, i])^2)
end

trajectory_MSE_i /= N
trajectory_MSE_ii /= N
trajectory_MSE_iii /= N
#--- YOUR CODE ENDS HERE ---#

0.23505113510143033

In [19]:
for trajectory_MSE in [trajectory_MSE_i, trajectory_MSE_ii, trajectory_MSE_iii]
    @assert size(trajectory_MSE) == ()  "The trajectory MSE is a scalar."
end


## 2) DMD of the flow past a cylinder - (5 points)

Perform a DMD analysis on the vortex shedding data set. This phenomenon occurs frequently in nature, when a fluid hits a bluff body (see, e.g., the following NASA video on cloud patterns: https://www.youtube.com/watch?v=SawKLWT1bDA). The data set consists of 50 snapshots (with $\Delta t = 0.1$) of the absolute velocity of a fluid entering from the left and flowing around a cylinder. Each snapshot contains velocity measurements on a $449 x 199$ grid. In vectorized form, this leads to a data matrix $Z \in \mathbb{R}^{89351 \times 50}$.

In [20]:
# load the data
data = matread("VortexShedding.mat")
data = data["Z"];

You may use this code to produce an example plot of the initial state:
```
display(heatmap(reshape(data[:, 1], 199, 449), aspect_ratio=1, color=:auto, title="Initial state"))
```
**Please remove your plotting commands before submisson!**

<img src="./initial_state.png" alt="initial_state" width="800"/>

Additionally, if you want to take a closer look at the data, there is an animation of the data as a gif in the exercise folder.

### Task a) - (1 point)
Decompose the data matrix into the two time-shifted versions $Z$ and $Z’$ and compute a singular value decomposition of $Z$. Store the 10 largest singular values $\sigma$ in a separate array.

In [21]:
largest10singularValues = NaN

#--- YOUR CODE STARTS HERE ---#
Z_base = data[:, 1:end-1]
Z_shift = data[:, 2:end]
U, S, V = svd(Z_base)
largest10singularValues = S[1:10]
#--- YOUR CODE ENDS HERE ---#

10-element Vector{Float32}:
 3389.3948
  338.71133
  330.96008
  115.889626
  113.969765
   38.848377
   38.727436
   29.540396
   28.96133
   10.872071

In [22]:
@assert length(largest10singularValues) == 10
@assert size(Z_base,2) == size(data,2)-1
@assert size(Z_shift,2) == size(data,2)-1


### Task b) - (2 x 0.5 point)
How many singular vectors are at least required to reconstruct 1) 90% and 2) 99% of the original information? (_Hint: Take a look at the singular values_)

In [23]:
r_90 = nothing
r_99 = nothing

#--- YOUR CODE STARTS HERE ---#
total = sum(S)
current = 0
r_90 = 1
r_99 = 1
for i = 1:length(S)
    current += S[i]
    if current / total < 0.9
        r_90 += 1
    end
    if current / total < 0.99
        r_99 += 1
    end
end
#--- YOUR CODE ENDS HERE ---#

In [24]:
@assert isa(r_90, Number)


In [25]:
@assert isa(r_99, Number)


### Task c) - (1 point)
Identify a linear model with system matrix $\tilde{A}$ from the data via dynamic mode decomposition. Set the rank of the reduced DMD method to $r=20$.

In [45]:
tilde_A = nothing

#--- YOUR CODE STARTS HERE ---#
r = 20
tilde_A = U[:,1:r]' * Z_shift * V[:,1:r] * diagm(1 ./ S[1:r])
#--- YOUR CODE ENDS HERE ---#

20×20 Matrix{Float32}:
  1.0          -0.000223817  -7.70178f-5   …  -0.000262729   0.00284551
  0.000167809   0.976911     -0.210735        -0.00397781    6.78352f-5
  0.000155628   0.202028      0.980057         0.00318481    0.000203506
  0.000138228  -0.000570483   0.00220297      -4.14952f-5   -0.000925333
 -6.42544f-5    0.00155736   -0.000552807     -0.000314875   0.0109769
  7.58721f-5   -0.000699936   0.000647382  …   0.0258125     0.000388565
  1.29193f-5    0.000481143   0.000596152     -7.23526f-5    0.00016572
 -7.53532f-5    0.000518309  -0.000976336      0.000138132  -0.0127684
  1.14032f-5   -0.000297764  -2.01185f-5      -0.00128613   -0.00469888
  2.79618f-5   -0.000250253   0.000317489      0.0576721     0.000350779
 -1.92936f-5    8.94697f-5   -0.000279946  …  -0.00539892   -0.000809709
 -2.54222f-5    0.000196995  -0.000311928      0.0019799    -0.019552
 -7.66561f-6    9.4723f-6    -0.000123599     -0.00199021   -0.0385133
 -8.46772f-6    7.43607f-5   -9.81109f-5 

In [46]:
@assert isa(tilde_A, Matrix)
@assert size(tilde_A) == (20, 20)


### Task d) - (2 x 0.5 point)
Sort the eigenvalues of $\tilde{A}$ according to their frequency in ascending order (starting with 0). Which is the lowest frequency $\omega$ in the system dynamics (aside from the stationary mode with $\lambda = 0$)? In addition, calculate the corresponding period $T$ with which the system oscillates.

In [111]:
lowest_freq = nothing
T = nothing

#--- YOUR CODE STARTS HERE ---#
μ = eigvals(tilde_A)
P = eigvecs(tilde_A)

λ = log.(μ) / dt
ω = imag.(λ)

i_sort = sortperm(ω, by=abs)

ω = ω[i_sort]
λ = λ[i_sort]
μ = μ[i_sort]
P = P[:, i_sort]

Φ = Z_shift * V[:, 1:r] * diagm(1 ./ S[1:r]) * real.(P)

lowest_freq = abs(ω[2])
count = 0
for i = 1:N
    if abs(sin(abs(ω[2]) * i * dt)) < 0.1
        count += 1
        if count == 2
            T = i*dt
            break
        end
    end
end
#--- YOUR CODE ENDS HERE ---#

In [112]:
# Public test
@assert isa(lowest_freq,Number)
@assert isa(T,Number)


In [None]:
# please leave this cell as it is


### Task e) - (1 point)
Use the reduced model for prediction via the “project-map-lift” approach. To this end:
1. Project the initial state onto the $r$-dimensional subspace,
2. Simulate the low-dimensional system for 20 time steps,
3. Lift the final state last state to the original space,
4. Calculate the error (using the 2-norm) between the the reduced order model solution and the original data at that time step.

In [138]:
error = nothing
#--- YOUR CODE STARTS HERE ---#
initial_projection = U[:,1:20]' * Z_base[:,1]
trajectory = zeros(20, 20)
trajectory[:, 1] = initial_projection
for i = 2:20
    trajectory[:, i] = tilde_A * trajectory[:, i-1]
end
reprojection = U[:,1:20] * trajectory[:,20]
error = norm(Z_base[:,20] - reprojection)
#--- YOUR CODE ENDS HERE ---#

0.1829502167109741

In [139]:
@assert isa(error,Number)
