# Common Algorithm Developments

## Content

1. [Speedup: Trials to improve speed [2024-02-28]](#log_moor_vAlgo_1)
2. [Drag and added mass for the line [2024-03-01]](#log_moor_vAlgo_2)
3. [Speedup: TransientSemiLinearFEOperator, Gridap_Update, evaluate_cache, Interpolations [2024-03-22]](#log_moor_vAlgo_3)
4. [Speedup: Break down evaluate_cache [2024-03-26]](#log_moor_vAlgo_4)
5. [CellField for Wave-Vel along the mooring line [2024-04-03]](#log_moor_vAlgo_5)
6. [GeneralizedAlpha2: Value of rhoInf [2024-09-02]](#log_moor_vAlgo_6)


## Attempting

- Empty


## List of Work

### List of Features to implement

- [x] Spring bed
- [x] Spring bed damping
- [ ] Separate out weight and buoyancy field
- [x] Drag: self
- [x] Drag: current
- [x] Drag: wave
- [ ] Fairlead motion: with wave-particle
- [ ] Wave-elevation profile output
- [x] Modular


---
---

<a id = 'log_moor_vAlgo_6' />

## GeneralizedAlpha2: Value of rhoInf [2024-09-02]

- GenAlpha is unconditionally stable for linear problems for any rhoInf in [0, 1]
- rhoInf = 1.0: Midpoint
    - No numerical dissipation: can diverge due to undamped high-frequency content
- rhoInf = 0.0: Fully implicit
    - Asymptotic annihilation: highly dissipative
    - Modes with period T < 10*Δt are dissipated. The following plot is from [Reference](https://miaodi.github.io/finite%20element%20method/newmark-generalized/)
- rhoInf = 0.4
    - Used in OrcaFlex implicit
 
|  |
| :--- |
| <img width="100%" src="./img_vAlgo/C06_GA_dissipation.png" /> |
| **Plot of dissipation** for rhoInf = 0.0 |
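
For reference, the standard Chung-Hulbert mapping from rhoInf to the generalized-α parameters of a second-order system can be sketched as below. The function name `genalpha2_params` is hypothetical, and it is an assumption that the solver used in this log follows exactly this convention.

```julia
# Chung-Hulbert generalized-α parameters for a second-order system,
# written as a function of the high-frequency spectral radius rhoInf ∈ [0, 1].
# Hypothetical helper for illustration, not part of the mooring code.
function genalpha2_params(rhoInf::Real)
    0 <= rhoInf <= 1 || throw(ArgumentError("rhoInf must lie in [0, 1]"))
    αf = rhoInf / (1 + rhoInf)            # weight on the old state in the force terms
    αm = (2 * rhoInf - 1) / (1 + rhoInf)  # weight on the old acceleration in the inertia term
    γ = 1 / 2 - αm + αf                   # Newmark γ giving second-order accuracy
    β = (1 - αm + αf)^2 / 4               # Newmark β giving maximal high-frequency damping
    return (αm = αm, αf = αf, γ = γ, β = β)
end

# Print the parameters for the rhoInf values compared in this entry
for rhoInf in (0.0, 0.2, 0.4, 1.0)
    println("rhoInf = ", rhoInf, "  ", genalpha2_params(rhoInf))
end
```

With rhoInf = 1 this gives γ = 1/2 and β = 1/4 (no numerical dissipation), while rhoInf = 0 gives γ = 3/2 and β = 1 (asymptotic annihilation).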

In our model, anything other than rhoInf = 0 seems to cause issues.
The following is the simulation with the bed and a catenary line.
The noise could be from the bed, because the straight-line cases run fine with rhoInf = 1.

|  |
| :--- |
| <img width="100%" src="./img_vAlgo/C06_GATest_anim1.gif" /> |
| **Comparison for a line with bedSpring and bedDamp** |

|  |
| :--- |
| <img width="100%" src="./img_vAlgo/C06_GATest_anim2.gif" /> |
| **Plot of ETang Magnitude** (Black) rhoInf = 0, (Green) rhoInf = 0.2, (Brown) rhoInf = 0.4, (Red) rhoInf = 1.0 |




---

---
---

<a id = 'log_moor_vAlgo_5' />

## CellField for Wave-Vel along the mooring line [2024-04-03]

- `FEFunction` is a type of `CellField`
- `CellState` requires a `CellField`
- So for the wave velocity along the line, we will linearise the problem: use waveVel(n) at time-step n when evaluating the solution at t(n+1).
    - This is an acceptable assumption because
        1. I don't think this will affect stability, as the wave velocity is only used here for adding damping in the solution
- So the plan is
    - Create a `CellState` which returns the WaveVel vector at all the quadrature points.
    - Update the `CellState` every time-step based on the presently known solution.
    - I was thinking of doing it using `FEFunction`.
        - But `FEFunction` is a type of `CellField`. It will contain info about the `FESpace` and other things, which is not necessary for now.
        - Instead, `CellField(fnc, \Omega)` will create an array of operations.
        - The length of the array is the number of elements in `\Omega`.
        - Each entry of this array contains the call to `fnc` in that element.
        - Using this `CellField` we can then evaluate the `CellState`, thus returning the value at all quadrature points.
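
The lagged update described above can be sketched in plain Julia, without Gridap, as follows. The nested array `state` is only a stand-in for the `CellState`. The deep-water Airy velocity, the parameters `A`, `k`, `ω`, and the names `wave_vel` and `update_wave_state!` are all assumptions made for this illustration.

```julia
# Deep-water Airy wave particle velocity (u, w) at a point x = (x, z), z <= 0.
# The wave theory and the parameters A, k, ω are assumptions for this sketch.
function wave_vel(x::NTuple{2,Float64}, t::Float64; A = 1.0, k = 0.1, ω = 1.0)
    phase = k * x[1] - ω * t
    decay = exp(k * x[2])
    return (A * ω * decay * cos(phase), A * ω * decay * sin(phase))
end

# Overwrite the stored state with the wave velocity at the current
# quadrature-point positions xq[cell][q], all evaluated at time t.
function update_wave_state!(state, xq, t)
    for c in eachindex(xq), q in eachindex(xq[c])
        state[c][q] = wave_vel(xq[c][q], t)
    end
    return state
end

# Two cells with two quadrature points each (hypothetical positions)
xq = [[(0.0, 0.0), (1.0, -1.0)], [(2.0, -2.0), (3.0, -3.0)]]
state = [fill((0.0, 0.0), length(cell)) for cell in xq]

dt = 0.1
for n in 0:2
    tn = n * dt
    update_wave_state!(state, xq, tn)   # wave velocity frozen at t(n)
    # ... solve for the line position at t(n+1) using `state`, then update xq ...
end
```

Because `state` is filled before the solve, the wave velocity enters the step for t(n+1) as known data, which is the linearisation the notes describe.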

---
---

<a id = 'log_moor_vAlgo_4' />

## Speedup: Break down evaluate_cache [2024-03-26]


- So usually we do interpolation at probes as follows

```
xNew = X + uh
xNewPrb = xNew.(rPrb)
```

- This is simple and easy. However, it's very slow.
- In the previous update [link](#log_moor_vAlgo_3), I had switched to the evaluate_cache() approach, which sped this up by 1.03x.
- The evaluate_cache() code is as follows

```
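xNew = X + uh
# Build the evaluation cache for the probe points rPrb
cache_xNew = Gridap.Arrays.return_cache(xNew, rPrb)
# Evaluate xNew at the probes using the cache
xNewPrb = evaluate!(cache_xNew, xNew, rPrb)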
```

- However, there is another level of optimisation.
- This `cache_xNew` actually consists of two caches => `(sub_cache1, sub_cache2) = cache_xNew`
- Of these, `sub_cache1` contains the KDTree, which never changes as long as you probe at the same points.
- Hence, we can save time by not rebuilding `sub_cache1` every time-step.
- Moreover, this `sub_cache1` can also be used for interpolating other values, such as stress, at the same points, saving even more time.
- Finally, although `sub_cache2` still has to be assembled for each OperationCellField, we can re-use certain terms within it too.
- Therefore, we can save some time by opening up the `return_cache()` function and re-using the terms that need only a one-time calculation or have reusable parts.

One-time calculation
```
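xNew = X + uh
# Split the cache: save_cache1 holds the KDTree search data, save_cache2 the field caches
save_cache1, save_cache2 = Gridap.Arrays.return_cache(xNew, rPrb)
# The second entry of save_cache2 is the field cache, reused in every time-step
save_f_cache = save_cache2[2]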
```

Reusable parts in each time-step
```
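xNew = X + uh
# Rebuild only the parts of the second cache that depend on the current xNew
cell_f = get_array(xNew)
cell_f_cache = array_cache(cell_f)
cache2 = cell_f_cache, save_f_cache, cell_f, xNew
# Reuse the saved KDTree cache and evaluate at the probes
cache_xNew = (save_cache1, cache2)
xNewPrb = evaluate!(cache_xNew, xNew, rPrb)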
```

This entire edit gives us a **1.057x speedup**.


---
---

<a id = 'log_moor_vAlgo_3' />

## Speedup: TransientSemiLinearFEOperator, Gridap_Update, evaluate_cache, Interpolations [2024-03-22]

### Gridap_Update

- There was a recent update to the Gridap main branch on 2024-03-19
- This merged the rk-solvers branch into the main branch
- There was a big restructuring of certain Jacobian calculations
- After the update, the mooring code was **1.35x faster**.
    - This is one of the single biggest gains for us.
    - Thanks to the developers for the update

|  |
| :--- |
| <img width="100%" src="./img_vAlgo/C03_gridapUpdate_1p35x_faster.png" /> |
| **Old Gridap (Blue). After Gridap update (orange)** |


### Interpolations and evaluate!(cache)

- **1.045x speedup** after using interpolation at the fairlead and the cache for sigma
- Main gain is from evaluate_cache, which now takes 69-120 ms per time-step instead of 200 ms

### TransientSemiLinearFEOperator

- This is particularly useful when the mass is constant
- **1.28x speedup** after doing this
    - Cannot do this when including the added-mass.
    
|  |
| :--- |
| <img width="100%" src="./img_vAlgo/C03_transSemiLinOp_1p288x_faster.png" /> |
| **TransientFEOperator (Blue). TransientSemiLinearFEOperator (Orange). Top: Time in seconds. Bottom: Number of iterations. x-axis is the time-instant t (s).** |

### TransientQuasiLinearFEOperator

- This would be useful when the residual is linear in dtt(u) but the mass coefficient depends on u, I think.

---

---
---

<a id = 'log_moor_vAlgo_2' />

## Drag and added mass for the line [2024-03-01]

The next changes add the drag term.


C6) Add normal drag term
    
- I used `CellState` wherever possible
- One possible inefficiency is `sΛ = (t1m2.^0.5) / T1m_cs` because this involves calculation of a sqrt


_Checkpoint_5:_ 268s <br> 
After C6, 268s for 50 time-steps for the case, including drag and julia -O0. <br>
Checkpoint_5 (3.7x) **slower** than Checkpoint_4 


C7) Add normal and axial drag terms
    
- Compute both in one fnc `drag_\GammaX` and add them
- Possibly slow coz we are adding two terms
- But calculating them together avoids recalculation of a number of terms.
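
The trade-off in C7 can be illustrated with a plain-Julia sketch of a Morison-type drag split into normal and tangential parts. This is not the `drag_\GammaX` function from the code, which is not shown here. The drag formula, the coefficient values and the name `drag_both` are assumptions for illustration only.

```julia
using LinearAlgebra

# Morison-type drag per unit length on a line segment, split into normal and
# tangential parts. All coefficient values and names are illustrative assumptions.
function drag_both(vrel, tvec; ρw = 1025.0, D = 0.1, Cdn = 1.2, Cdt = 0.05)
    that = tvec / norm(tvec)        # unit tangent, computed once
    vt = dot(vrel, that) * that     # tangential part of the relative velocity
    vn = vrel - vt                  # normal part, reuses vt instead of recomputing it
    Fn = 0.5 * ρw * Cdn * D * norm(vn) * vn
    Ft = 0.5 * ρw * Cdt * π * D * norm(vt) * vt
    return Fn, Ft
end

vrel = [1.0, 1.0]   # fluid velocity minus line velocity
tvec = [2.0, 0.0]   # line tangent (not necessarily unit length)
Fn, Ft = drag_both(vrel, tvec)
println("normal drag     = ", Fn)
println("tangential drag = ", Ft)
```

Computing both parts in one function lets the unit tangent and the velocity split be shared, which is the saving noted in C7. Splitting into two functions, as in C8, would repeat those steps.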


_Checkpoint_6:_ 507s <br> 
After C7, 507s for 50 time-steps for the case, including drag and julia -O0. <br>
Checkpoint_6 (1.9x) **slower** than Checkpoint_5


C8) Add normal and axial drag terms
    
- Compute in individual fncs `drag_n_\GammaX` and `drag_t_\GammaX`
- Possibly slow coz we are recomputing quantities
- But it avoids adding the terms before the integral


_Checkpoint_7:_ 442s <br> 
After C8, 442s for 50 time-steps for the case, including drag and julia -O0. <br>
Checkpoint_7 (1.65x) **slower** than Checkpoint_5 <br>
Checkpoint_7 (1.15x) faster than Checkpoint_6

Checkpoint_7 with `julia -O3` is 373s. It's only (1.18x) faster.

---

### Speedup by switching off drag on the part of the line on the bed [NOT IMPLEMENTED]

- On adding the drag terms, the convergence and the iteration stepping get slower.
- Switching them off using the tanh actually slows the sims down a bit. Don't understand why.
    - Tried a gentler dying of the drag terms. Still slows down the code a bit.
    - So not doing this
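
For context, a tanh switch of the kind mentioned above could look like the following minimal sketch. The exact form used in the trials is not recorded here, so the function `drag_switch`, the bed level `zbed` and the width `δ` are assumptions.

```julia
# Smooth switch in the vertical coordinate z: tends to 0 well below the bed
# level, equals 0.5 at the bed level and tends to 1 well above it.
# A larger δ gives a gentler transition. Hypothetical form for illustration.
drag_switch(z, zbed; δ = 0.1) = 0.5 * (1 + tanh((z - zbed) / δ))

# The drag term would be multiplied by this factor
for z in (-0.5, 0.0, 0.1, 0.5)
    println("z = ", z, "  switch = ", drag_switch(z, 0.0))
end
```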
    
---

---
---

<a id = 'log_moor_vAlgo_1' />

## Speedup: Trials to improve speed [2024-02-28]

- We are trying to speed up the computations by pre-computing the constant values at the quadrature points and saving them
- This avoids repeatedly computing and interpolating the constant quantities.
- https://gridap.github.io/Tutorials/dev/pages/t010_isotropic_damage/#Main-function-1


_Checkpoint_0:_  465s <br>
465s for 50 time-steps for the case without drag and julia -O0.


C1) Cell State

- Created `CellState` for the constant quantities.
    - `JJ_cs = (J \odot J)^0.5`: Done
        

C2) Sum of individual terms in the dynamic res (resD), instead of integrating the sum

- Noticed while developing Bsnq that `\int( fnc1() + fnc2() )d\Omega` is slower than `\int( fnc1() )d\Omega + \int( fnc2() )d\Omega`
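
The two forms compared in C2 give the same integral, so the choice only affects run time. A minimal Gridap sketch of the equivalence on a toy 1D mesh is given below. The mesh, the fields `f1` and `f2` and the quadrature degree are assumptions unrelated to the mooring model, and the timing difference itself is the observation noted above, not something this sketch measures.

```julia
using Gridap

model = CartesianDiscreteModel((0, 1), (10,))   # 1D mesh of the unit interval
Ω = Triangulation(model)
dΩ = Measure(Ω, 2)                              # quadrature exact up to degree 2

f1 = CellField(x -> x[1], Ω)
f2 = CellField(x -> x[1]^2, Ω)

I_sum   = sum(∫(f1 + f2)dΩ)            # integrate the sum of the terms
I_split = sum(∫(f1)dΩ + ∫(f2)dΩ)       # sum the individual integrals

println("integral of sum = ", I_sum, ", sum of integrals = ", I_split)
```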


_Checkpoint_1:_ 315s <br> 
After C1 and C2, 315s for 50 time-steps for the case without drag and julia -O0. <br>
Checkpoint_1 (1.5x) faster than Checkpoint_0


C3) Cell State

- `P_cs`


_Checkpoint_2:_ 240s <br> 
After C3, 240s for 50 time-steps for the case without drag and julia -O0. <br>
Checkpoint_2 (1.3x) faster than Checkpoint_1 <br>
Checkpoint_2 (1.9x) faster than Checkpoint_0


C4) Cell State

- `QTrans_cs`


_Checkpoint_3:_ 80s <br> 
After C4, 80s for 50 time-steps for the case without drag and julia -O0. <br>
Checkpoint_3 (3x) faster than Checkpoint_2 <br>
Checkpoint_3 (5.8x) faster than Checkpoint_0


C5) Cell state in the trial function term

- Modified `\Nabla X_Dir(ψu)` to `(∇(ψu)' ⋅ QTrans_cs)`


_Checkpoint_4:_ 72s <br> 
After C5, 72s for 50 time-steps for the case without drag and julia -O0. <br>
Checkpoint_4 (1.1x) faster than Checkpoint_3 <br>
Checkpoint_4 (6.5x) faster than Checkpoint_0
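
The speedup factors quoted at each checkpoint are ratios of the wall-clock times. A small sketch for recomputing them from the recorded timings:

```julia
# Wall-clock times (s) for 50 time-steps, copied from Checkpoint_0 to Checkpoint_4
times = [465.0, 315.0, 240.0, 80.0, 72.0]

for i in 2:length(times)
    step  = times[i - 1] / times[i]   # gain over the previous checkpoint
    total = times[1] / times[i]       # gain over Checkpoint_0
    println("Checkpoint_", i - 1, ": ", round(step, digits = 2), "x over previous, ",
            round(total, digits = 2), "x over Checkpoint_0")
end
```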

---

### Comparison of results - Whalin Shoal

Empty

---

---
---

## References

