# Plan for development directions for VE DM project

## Current state of play

Now that we have the most basic scenario working, let us outline what has been achieved so far and which directions remain unexplored.

On the one hand, DeepMoD has successfully found the correct terms and coefficients for:

- a stress input

- a first order problem with second order terms present

- with relatively few data points (500)

- with a thresholding condition of scaled < 0.15, which has worked so far

- with a decay time roughly of the same size as the time period of the input data (~15 seconds)

- with the input data in the form of a single frequency sinc curve

- with the equilibrium spring arm weighted down compared to the viscous arm

On the other hand, we have not explored how well DeepMoD will cope with:

- Having two (or more) decay terms present

- Having decay terms present whose time constants are not similar to the time period of the curve.

- Having decay terms present that are weighted more or less strongly relative to the other terms (the E_mods constitute the weighting)

- Having noise present and, if so, differing degrees of it.

- Having a sampling rate that is less frequent than the oscillation rate of the data

- Having a sampling rate that is less frequent than the frequency of the model decay constants.

- simpler input data forms (does sinc beat sine for same frequency?)

- more complex input data forms (multiple frequencies - I'm thinking Fourier style addition of frequencies...)

- having third and higher order terms present even with only a first order problem.

- universal differences in time scales (both decay constants of model and input data kept the same relative to each other)

- Having more terms that need to be thresholded out of existence

In addition to this, I have been testing how DeepMoD handles different time decays in a 1st order problem setting. For a sinc curve with time period of ~13 seconds, DeepMoD has successfully run and found terms and coeffs for problems with time decays of 5 - 15 seconds but not outside of that. Amplitude of sinc is always 1.

## Prioritisation

First: independently understand the effects of varying any one parameter at a time. This will give a good understanding of the current limitations of the implementation and give clues about how best to improve where the limitations are most severe.

Second: use the insight gained to see if I can improve the performance.

I will approach the directions in this order:

1. Vary decay constant (some preliminary experimentation already done)

2. Increase the number of higher order terms to be eliminated during thresholding.

3. Trying 2nd and higher order problems with easy constants

4. Vary equilibrium spring constant

5. Vary decay weighting modulus

6. Adding in different levels of noise

7. Changing the universal time range for the data to be active in. Keep all constants affecting time-dependent performance the same relative to each other.

8. (without random subsampling) having very low sampling rates

9. Trying alternative functional forms.

## General themes of investigation

All testing will build off a working base context with only one parameter varied.

This base set up will involve:

- Stress always as the input (analytically described) variable

- A sinc curve of rotational frequency 1, and so time period of $2\pi$ and max amplitude 1 as input

- A first order Kelvin Model with data generated using Boltzmann superposition integrals, with model parameters of $E_0 = 5$, $E_1 = 1$ and $\eta_1 = 2\pi$ (therefore $\tau_1 = 2\pi$).

- No noise

- 1000 randomly sampled points from a data set originally synthesised with 5000 points.

- Data generated for 20 seconds. This means an average sampling rate of 50 data points / second after sub-sampling.

- Will run with a lr of 0.001

- Will run for 30001 epochs before thresholding. I will then simply run for 5000 more epochs, as the constants settle very quickly by this point if the right ones have been chosen.

- I will use a network with 4 layers of 30 neurons.

- I will keep the default lambda of $10^{-5}$

- I will force all coeffs to always be initialised at positive values.

- I will ask DeepMoD to calculate up to second order derivatives, so that it has the opportunity to prove it can eliminate them.
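
As a rough illustration, the base-case data could be synthesised along the following lines. This is only a sketch under stated assumptions: it assumes the first order Kelvin model has the creep compliance $J(t) = 1/E_0 + (1/E_1)(1 - e^{-t/\tau_1})$, and the function names `sinc_stress` and `kelvin_strain` are made up for illustration rather than taken from the existing notebook.

```python
import numpy as np

# Base-case values from the list above.
E_0, E_1, ETA_1 = 5.0, 1.0, 2 * np.pi
T_END, N_FULL, N_SUB = 20.0, 5000, 1000


def sinc_stress(t, omega=1.0):
    """Unnormalised sinc, sin(omega*t) / (omega*t), equal to 1 at t = 0."""
    # np.sinc(x) computes sin(pi*x) / (pi*x), hence the division by pi.
    return np.sinc(omega * t / np.pi)


def kelvin_strain(t, stress, e_0, e_branches, eta_branches):
    """Strain via the Boltzmann superposition integral.

    ASSUMPTION: creep compliance
        J(t) = 1/E_0 + sum_i (1/E_i) * (1 - exp(-t / tau_i)),  tau_i = eta_i / E_i,
    which gives
        strain(t) = stress(t)/E_0
                    + sum_i (1/eta_i) * int_0^t exp(-(t - s)/tau_i) * stress(s) ds.
    """
    strain = stress / e_0  # instantaneous elastic part (new array)
    dt = np.diff(t)
    for e_i, eta_i in zip(e_branches, eta_branches):
        tau_i = eta_i / e_i
        for k in range(1, len(t)):
            # Integrand on the grid up to t[k], integrated with the trapezium rule.
            f = np.exp(-(t[k] - t[: k + 1]) / tau_i) * stress[: k + 1] / eta_i
            strain[k] += np.sum(0.5 * (f[1:] + f[:-1]) * dt[:k])
    return strain


rng = np.random.default_rng(0)  # fixed seed only so the sketch is repeatable
t_full = np.linspace(0.0, T_END, N_FULL)
stress_full = sinc_stress(t_full)
strain_full = kelvin_strain(t_full, stress_full, E_0, [E_1], [ETA_1])

# Random subsample of 1000 of the 5000 points, kept in time order.
idx = np.sort(rng.choice(N_FULL, size=N_SUB, replace=False))
t_sub, stress_sub, strain_sub = t_full[idx], stress_full[idx], strain_full[idx]
```

The direct double loop is quadratic in the number of points, which is tolerable at 5000 points but would want replacing if the full data set grew much larger.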

I will implement the below rough protocol:

1. Loop through the variation of a single parameter.

2. Do this 3 times for each condition to check for consistency or sensitivity to random initialisation (differentiating weak and strong success).
    (is this needed?)

3. Record the error compared to the correct coefficients in each case + other data (see comments).

4. Graph success robustness to see where the limitations are currently.
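
The protocol above might be organised roughly as follows. This is a sketch, not the existing notebook flow: `run_fn` is a hypothetical callable standing in for a full DeepMoD training run, and `dummy_run` returns made-up numbers purely so that the loop can be executed.

```python
import numpy as np


def run_sweep(param_name, values, run_fn, target_coeffs, n_repeats=3):
    """Vary one parameter, repeat each condition, and record coefficient errors.

    run_fn(param_name, value, seed) is a HYPOTHETICAL stand-in for one DeepMoD
    run; it must return the coefficients found, laid out like target_coeffs.
    """
    target = np.asarray(target_coeffs, dtype=float)
    records = []
    for value in values:
        for repeat in range(n_repeats):
            found = np.asarray(run_fn(param_name, value, seed=repeat), dtype=float)
            records.append({
                "parameter": param_name,
                "value": float(value),
                "repeat": repeat,
                "found_coeffs": found,
                "mean_abs_error": float(np.mean(np.abs(found - target))),
            })
    return records


def dummy_run(param_name, value, seed):
    """Placeholder returning made-up coefficients (NOT real DeepMoD output)."""
    rng = np.random.default_rng(seed)
    return np.array([1.0, 2.0, 3.0]) + 0.01 * rng.standard_normal(3)


records = run_sweep("tau_1", [np.pi, 2 * np.pi], dummy_run, [1.0, 2.0, 3.0])
```

Using the repeat index as the seed keeps each repeat reproducible while still varying the initialisation between repeats.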

## Outline

### 1.

Vary decay constant from $16\pi$ to $\pi/4$ (8x higher to 8x lower than the time period), halving each time.
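
For reference, the values in this sweep can be generated as below. The helper name `halving_sweep` is just an illustrative choice.

```python
import numpy as np


def halving_sweep(start, stop):
    """Values start, start/2, start/4, ... down to stop (inclusive)."""
    n_steps = int(round(np.log2(start / stop))) + 1
    return start / 2.0 ** np.arange(n_steps)


taus = halving_sweep(16 * np.pi, np.pi / 4)
```

With these bounds the sweep has seven values, and the reference value of $2\pi$ sits in the middle.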

### 2.

Increase the number of extra derivatives in the library from up to 2nd order to up to 3rd order and then 4th order, all still for a 1st order actual problem.

### 3.

I have done some experimentation with branches that are similar to each other and noticed empirically as well as from the mathematical treatment that branches that have the same decay constant are indistinguishable from a single branch with a single decay constant.
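
A short way to see the mathematical side of this, assuming the creep compliance takes the generalised Kelvin form (that form is an assumption here, not something taken from the project code): for two branches sharing one decay constant $\tau$,

$
J(t) = \frac{1}{E_0} + \frac{1}{E_1}\left(1 - e^{-t/\tau}\right) + \frac{1}{E_2}\left(1 - e^{-t/\tau}\right) = \frac{1}{E_0} + \left(\frac{1}{E_1} + \frac{1}{E_2}\right)\left(1 - e^{-t/\tau}\right)
$

which is exactly the compliance of a single branch with $E' = E_1 E_2 / (E_1 + E_2)$ and $\eta' = \tau E'$, so the data alone cannot separate the two.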

I will test both a 2nd order problem and a 3rd order one. I do not anticipate any chance of success beyond that for now.

I will try a 2nd branch with parameters:

$
E_2 = 1, \eta_2 = 4\pi, (\tau_2 = 4\pi)
$

$
E_2 = 1, \eta_2 = 8\pi, (\tau_2 = 8\pi)
$

$
E_2 = 1, \eta_2 = \pi, (\tau_2 = \pi)
$

$
E_2 = 1, \eta_2 = \pi/2, (\tau_2 = \pi/2)
$


I have kept $E_2 = E_1$ as I want both decays to be equally weighted in this test. It is difficult to predict the interaction between the model parameters in the two branches to give DeepMoD the best shot, but this simple scenario seems a good place to start.

Regarding the 3rd branch, I will mostly use the same options, but I will see how this second branch goes to decide what will be appropriate.

### 4.

Vary equilibrium modulus from 40 to $5/8$ (8x higher to 8x lower than reference), halving each time.

### 5.

Vary viscosity associated modulus from 8 to $1/8$ (8x higher to 8x lower than reference), halving each time.

### 6.

I will use noise scaled to different percentages of the standard deviation of the data.

These percentages can be investigated in 'money steps', ie:

1%, 2%, 5%, 10%, 20%, 50%.
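
A minimal sketch of how such noise could be added is below. It assumes zero-mean Gaussian noise, which is not stated above, and the sine curve is only a placeholder signal rather than project data.

```python
import numpy as np


def add_noise(signal, percent, rng):
    """Add zero-mean Gaussian noise with std equal to percent% of the signal's std.

    ASSUMPTION: Gaussian noise; the plan only fixes the scaling, not the distribution.
    """
    sigma = percent / 100.0 * np.std(signal)
    return signal + rng.normal(0.0, sigma, size=signal.shape)


NOISE_LEVELS = [1, 2, 5, 10, 20, 50]  # percent of the data's standard deviation
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 20.0, 1000))  # placeholder signal only
noisy = {p: add_noise(clean, p, rng) for p in NOISE_LEVELS}
```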

### 7.

The system seemed to struggle when I dropped both the time period of the sinc curve and the decay constant in the single branch to $\pi/10$. Motivated by this, I would like to understand the range in which the constants can be found. The only thing I can think of that could cause this is that the absolute magnitude of the derivatives became larger, and this could have made scaling tricky.

The time periods I will examine will be from $\pi/4$ to $16\pi$, 8x higher and lower than the known working condition, and I will double each step.

### 8.

I will define the sampling rates to test with reference to the characteristic time period of the decay and of the oscillation (which are for now the same), $2\pi$.

Therefore I will try a sampling rate of $16/\pi$ /s to $1/\pi$ /s (Nyquist frequency limit), halving each time.

For the full time range of 20 seconds, this will correspond to 102, 51, 26, 13 and 7 data points: each is half of the previous count, always rounded up.
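
The arithmetic behind these counts can be checked with a few lines:

```python
import math

T_END = 20.0  # seconds of data
rates = [16 / math.pi / 2 ** k for k in range(5)]      # 16/pi down to 1/pi per second
counts = [math.ceil(rate * T_END) for rate in rates]   # always round up
print(counts)  # [102, 51, 26, 13, 7]
```

The lowest rate, $1/\pi$ per second, is twice the signal frequency of $1/(2\pi)$ Hz, which is why it marks the Nyquist limit.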

### 9.

Finally, I will test:

- a pseudo-sinc function where the amplitude decay is modified

- a sine function of the same rotational frequency (1)

- summation of 3 exponential decays at decay constants of $\pi$, $2\pi$ (matching initial time period) and $4\pi$, weighted equally. This is an initial guess; these values may change depending on the capabilities of DeepMoD observed in earlier tests. The point, however, is to see which is the more tractable problem for DeepMoD, so hopefully the performance will differ from the earlier tests anyway.

- summation of 3 sine curves, with time periods matching the above decay constants.

- summation of 3 sinc curves, also matching.

- continuous distribution of frequencies to produce wave packet.
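
Some of these input forms are simple enough to sketch already. The function names are illustrative, and the equal weights of 1/3 (so that the sinc and exponential sums start at 1, like the base case) are an assumption. The pseudo-sinc and wave packet are left out because their exact form is not yet decided.

```python
import numpy as np

# Decay constants / time periods named in the list above.
SCALES = np.array([np.pi, 2 * np.pi, 4 * np.pi])


def sinc(t, omega=1.0):
    """Unnormalised sinc, sin(omega*t) / (omega*t), equal to 1 at t = 0."""
    return np.sinc(omega * t / np.pi)


def sine(t, omega=1.0):
    return np.sin(omega * t)


def sum_of_exponentials(t, taus=SCALES):
    """Equally weighted decays exp(-t / tau). ASSUMPTION: weights of 1/len(taus)."""
    return np.mean([np.exp(-t / tau) for tau in taus], axis=0)


def sum_of_sines(t, periods=SCALES):
    """Equally weighted sines with the given time periods."""
    return np.mean([sine(t, 2 * np.pi / p) for p in periods], axis=0)


def sum_of_sincs(t, periods=SCALES):
    """Equally weighted sincs with the given time periods."""
    return np.mean([sinc(t, 2 * np.pi / p) for p in periods], axis=0)


t = np.linspace(0.0, 20.0, 201)
inputs = {
    "sine": sine(t),
    "sum_exp": sum_of_exponentials(t),
    "sum_sine": sum_of_sines(t),
    "sum_sinc": sum_of_sincs(t),
}
```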

## Comments

I already have a notebook with the flow for varying (I think) at least most of these parameters and running DeepMoD, but:

- I will need to think of and design a flow for saving the relevant results of each test. These results should include (do you agree?):
    - The target coeffs
    - The actual coeffs arrived at
    - the sparsity mask
    - The average error in the coefficients
    - a binary decision on success or failure.
    - the network prediction data (strain)
    - the network target data (strain)
    - the time series data and stress series data
    - the conditions of the test, stating the varied parameter and its value.
    - The repeat (all tests done 3 times)
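
One possible shape for such a record is sketched below. The class, field and file names are illustrative only and not an existing schema, and it assumes the found coefficients are stored in the same layout as the target coefficients. Arrays go into an `.npz` file and the scalar metadata into a small JSON file next to it.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path

import numpy as np


@dataclass
class RunRecord:
    """One DeepMoD test. Field names are illustrative, not an existing schema."""
    varied_parameter: str
    value: float
    repeat: int
    success: bool
    target_coeffs: np.ndarray
    found_coeffs: np.ndarray  # ASSUMPTION: same layout as target_coeffs
    sparsity_mask: np.ndarray
    time: np.ndarray
    stress: np.ndarray
    strain_target: np.ndarray
    strain_prediction: np.ndarray

    def mean_coeff_error(self):
        return float(np.mean(np.abs(self.found_coeffs - self.target_coeffs)))


def save_record(record, directory):
    """Write arrays to <stem>.npz and scalar metadata to <stem>.json."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    stem = f"{record.varied_parameter}_{record.value:g}_rep{record.repeat}"
    arrays = {k: v for k, v in vars(record).items() if isinstance(v, np.ndarray)}
    meta = {
        "varied_parameter": record.varied_parameter,
        "value": float(record.value),
        "repeat": int(record.repeat),
        "success": bool(record.success),
        "mean_coeff_error": record.mean_coeff_error(),
    }
    np.savez(directory / f"{stem}.npz", **arrays)
    (directory / f"{stem}.json").write_text(json.dumps(meta, indent=2))
    return directory / stem


# Usage with placeholder arrays (not real results), written to a temporary folder.
t = np.linspace(0.0, 20.0, 5)
record = RunRecord("tau_1", 2 * np.pi, 0, True,
                   np.array([1.0, 2.0]), np.array([1.1, 1.9]),
                   np.array([True, True]), t, np.sinc(t / np.pi),
                   np.zeros_like(t), np.zeros_like(t))
with tempfile.TemporaryDirectory() as tmp:
    stem = save_record(record, tmp)
    meta = json.loads(Path(f"{stem}.json").read_text())
```

Keeping one pair of files per run, named by the varied parameter, its value and the repeat, should make it straightforward to collect everything for the bar graphs later.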


- I will perhaps be able to produce bar graphs of the key data points, ie, success and accuracy for each varied parameter. I imagine this will not take very long to replicate once I figure out a nice way of doing it that is consistent.

- This is going to take a long time. It will take time to:
    - run all of the DeepMoD tests, at up to 35000 epochs each.
    - design the system for saving all the relevant data in a format that makes sense.

Hopefully the graphing doesn't count as a time cost as I can do that whilst it is all running, but I will need to understand how I am going to save the data before I start.