# Experiments

## 1-D Diffusion

This section describes experiments on the 1-D diffusion problem, including cases with time-varying boundary conditions.

The BFGS minimization algorithm was the fastest and most stable of the options available in the `scipy.optimize` package. I tried all of the other algorithms, and each succeeded to a greater or lesser extent, but based on the results here I will use BFGS going forward. [BFGS](https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html) is used here with the analytical gradient rather than a numerically estimated gradient.
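As a minimal sketch of what supplying the analytical gradient looks like, the snippet below calls `scipy.optimize.minimize` with `method="BFGS"` and a `jac` callable. The quadratic `objective` and its `gradient` are hypothetical stand-ins, not the network objective function used in these notebooks.

```python
import numpy as np
from scipy.optimize import minimize


def objective(p):
    # Hypothetical stand-in objective: a quadratic bowl with its minimum at p = 1.
    return 0.5 * np.sum((p - 1.0) ** 2)


def gradient(p):
    # Analytical gradient of the stand-in objective above.
    return p - 1.0


# Random starting parameters with a fixed seed, for repeatability.
rng = np.random.default_rng(0)
p0 = rng.uniform(-1.0, 1.0, size=5)

# Passing jac= makes BFGS use the analytical gradient;
# omitting it makes SciPy estimate the gradient by finite differences.
res = minimize(objective, p0, method="BFGS", jac=gradient)
print(res.message, res.nit)
```

If `jac` is omitted, each gradient evaluation costs additional objective evaluations, which is one plausible reason the analytical version is faster.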

In all cases, `nx = nt = 11` was used, with the random number seed of 0 used in most cases.
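A grid of this size can be built as sketched below. This assumes evenly spaced points on `0 <= x <= 1` and `0 <= t <= 1`, which is consistent with the first time step of `t=0.1` mentioned later; the variable names are illustrative and may not match the notebooks.

```python
import numpy as np

nx = nt = 11

# Evenly spaced training coordinates (assumed domain: [0, 1] in both x and t).
x = np.linspace(0.0, 1.0, nx)
t = np.linspace(0.0, 1.0, nt)

# All (x, t) pairs as an (nx*nt, 2) array of training points.
X, T = np.meshgrid(x, t, indexing="ij")
train_points = np.column_stack([X.ravel(), T.ravel()])

print(train_points.shape)
```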

Unless otherwise stated, the diffusion coefficient was `D=1`.

All analytical solutions were obtained using Mathematica. The solutions were typically infinite Fourier series, so for this work each series was truncated (somewhat arbitrarily) after 100 terms.
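To illustrate the truncation, here is a sketch for a textbook problem that is not one of the cases below: `dY/dt = D*d2Y/dx2` with `Y(0,t)=Y(1,t)=0` and `Y(x,0)=1`, whose sine-series coefficients are `4/(n*pi)` for odd `n`. The function name and the choice of problem are assumptions made only for illustration.

```python
import numpy as np


def truncated_sine_series(x, t, D=1.0, n_terms=100):
    """Truncated Fourier sine series for an illustrative diffusion problem.

    Problem (assumed for illustration): Y(0,t) = Y(1,t) = 0, Y(x,0) = 1.
    Only odd modes contribute, so n_terms counts the nonzero terms kept.
    """
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(n_terms):
        n = 2 * k + 1                      # odd mode number
        b_n = 4.0 / (n * np.pi)            # sine coefficient of Y(x,0) = 1
        decay = np.exp(-D * (n * np.pi) ** 2 * t)
        total = total + b_n * np.sin(n * np.pi * x) * decay
    return total


x = np.linspace(0.0, 1.0, 11)
print(truncated_sine_series(x, 0.1))
```

For `t > 0` the high-order terms are damped very quickly by the exponential factor, so the truncation matters mostly near `t=0`.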

All timings were measured in Jupyter notebooks on my Mac.

### Set 1: Stability checks

#### [diff1d_0_BFGS.ipynb](/notebooks/experiments/diff1d_0_BFGS.ipynb)

This is the simplest possible check. The initial profile is flat at `Y=0` from `x=0` to `x=1`. The BC at `x=0` and `x=1` are fixed at 0. Therefore, the profile should not change with time, and should remain flat at `Y=0`.

The initial run used the BFGS algorithm with defaults for all parameters. The default convergence criterion was `gtol=1e-5`, where `gtol` is the threshold on the infinity norm (the maximum absolute value) of the gradient of the objective function. The absolute error was O(1e-6). Since the analytical solution was 0, the relative error was undefined. Interestingly, the error was highest (about 3e-6) at the first time step (`t=0.1`), then oscillated about 0 as time proceeded.

I then decreased the value of `gtol` to determine how precise the final results could be. I examined the results with `gtol=1e-6` (absolute error O(1e-6)), `1e-8` (absolute error O(6e-8)), `1e-10` (absolute error O(1e-9)), `1e-12` (absolute error O(1e-9)), and `1e-14` (terminated due to precision loss, but the final result had absolute error O(1e-9)).
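A sweep of this kind can be sketched as follows. The objective here is SciPy's built-in Rosenbrock test function (`rosen`, `rosen_der`), used purely as a stand-in for the network objective, so the iteration counts and messages it produces say nothing about the diffusion runs.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# Stand-in problem: 2-D Rosenbrock function with its analytical gradient.
p0 = np.array([-1.2, 1.0])

results = {}
for gtol in (1e-5, 1e-6, 1e-8, 1e-10, 1e-12, 1e-14):
    res = minimize(rosen, p0, method="BFGS", jac=rosen_der,
                   options={"gtol": gtol})
    # Infinity norm of the final gradient: the quantity compared against gtol.
    grad_inf_norm = np.max(np.abs(res.jac))
    results[gtol] = (res.nit, grad_inf_norm, res.message)

for gtol, (nit, gnorm, message) in results.items():
    print(f"gtol={gtol:.0e}  iterations={nit}  |grad|_inf={gnorm:.2e}  {message}")
```

Checking `res.message` for each run is a simple way to separate true convergence from termination due to precision loss.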

Execution time ranged from 18 s to 584 s.

#### [diff1d_1_BFGS.ipynb](/notebooks/experiments/diff1d_1_BFGS.ipynb)

This case was similar to [diff1d_0_BFGS.ipynb](/notebooks/experiments/diff1d_0_BFGS.ipynb), but the profile was fixed at `Y=1`. The trained network parameters and error behavior for this case were identical to those for the [diff1d_0_BFGS.ipynb](/notebooks/experiments/diff1d_0_BFGS.ipynb) case.

Execution time ranged from 17 s to 537 s.

#### [diff1d_flat_BFGS.ipynb](/notebooks/experiments/diff1d_flat_BFGS)

This case was similar to [diff1d_0_BFGS.ipynb](/notebooks/experiments/diff1d_0_BFGS.ipynb), but the profile was fixed at `Y=0.5`. The trained network parameters and error behavior for this case were identical to those for the [diff1d_0_BFGS.ipynb](/notebooks/experiments/diff1d_0_BFGS.ipynb) case.

Execution time ranged from 16 s to 555 s.

#### [diff1d_rampup_BFGS.ipynb](diff1d_rampup_BFGS.ipynb)

In this case, the initial profile was a linear ramp from `Y(0,0)=0` to `Y(1,0)=1`. The BC were fixed at 0 for `x=0`, 1 for `x=1`, and so the profile should not change with time. The trained network parameters and error behavior for this case were identical to those for the [diff1d_0_BFGS.ipynb](/notebooks/experiments/diff1d_0_BFGS.ipynb) case.

Execution time ranged from 16 s to 593 s.

#### [diff1d_rampdown_BFGS.ipynb](diff1d_rampdown_BFGS.ipynb)

In this case, the initial profile was a linear ramp from `Y(0,0)=1` to `Y(1,0)=0`. The BC were fixed at 1 for `x=0`, 0 for `x=1`, and so the profile should not change with time. The trained network parameters and error behavior for this case were identical to those for the [diff1d_0_BFGS.ipynb](/notebooks/experiments/diff1d_0_BFGS.ipynb) case.

Execution time ranged from 16 s to 514 s.

### Set 2: Nonlinear, static BC cases

#### [diff1d_sine_BFGS.ipynb](/notebooks/experiments/diff1d_sine_BFGS.ipynb)

This case was not run successfully due to overflow exceptions.

#### [diff1d_flat+sine_BFGS.ipynb](/notebooks/experiments/diff1d_flat+sine_BFGS.ipynb)

This case used a starting profile consisting of a half-sine wave with amplitude of 0.5, based at `Y=0.5`:

`Y(x,0) = 0.5*(1 + sin(pi*x))`

The boundary conditions at `x=0` and 1 were fixed at 0.5. The expected behavior was a decay of the profile to become flat at `Y=0.5`. A different random number seed (3) was used, since the original value of 0 resulted in overflow exceptions.
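For reference, this particular case has a closed-form solution if the governing equation is taken to be `dY/dt = D*d2Y/dx2` (an assumption, since the equation is not written out in this section): the single sine mode decays exponentially toward the flat profile. The function name below is illustrative.

```python
import numpy as np


def flat_plus_sine_exact(x, t, D=1.0):
    """Exact solution for Y(x,0) = 0.5*(1 + sin(pi*x)), Y(0,t) = Y(1,t) = 0.5.

    Assumes the diffusion equation dY/dt = D * d2Y/dx2.
    """
    return 0.5 + 0.5 * np.sin(np.pi * x) * np.exp(-D * np.pi ** 2 * t)


x = np.linspace(0.0, 1.0, 11)
for t in (0.0, 0.1, 0.4, 1.0):
    # Maximum deviation from the flat profile Y = 0.5 at each time.
    print(t, np.max(np.abs(flat_plus_sine_exact(x, t) - 0.5)))
```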

The initial case using BFGS defaults ended when the maximum iteration count of 8000 was reached. In these incomplete results, as in previous cases, the error decreased rapidly over the first few time steps. The relative error was initially as high as 0.025, but by `t=0.4` it was O(1e-3). The maximum iteration count was then increased to 16000, using the same default `gtol` of 1e-5. In this case, the training terminated due to precision loss. These incomplete results were very similar to those of the first run, which is not surprising given that the precision loss forced termination after 8146 iterations, compared with 8000 iterations in the first run.

Execution time ranged from 1459 s to 1483 s.

#### [diff1d_flat+sine_D1e-1_BFGS.ipynb](/notebooks/experiments/diff1d_flat+sine_D1e-1_BFGS.ipynb)

This case is the same as the previous case, except that a diffusion coefficient of `D=0.1` was used. Since the large-time solutions did not approach 0 as closely, the training was more stable. Relative error for the default `gtol=1e-5` was O(1e-5), decreasing to O(1e-6) for `gtol=1e-6` and O(1e-7) for `gtol=1e-8`. At `gtol=1e-10` the iteration limit was raised to 32000 to ensure convergence; the error was somewhat smaller but still O(1e-7), despite early termination due to precision loss.
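One reading of the stability difference is that the sine perturbation decays far less by `t=1` when `D=0.1`. The short calculation below assumes the perturbation amplitude follows `0.5*exp(-D*pi**2*t)`, which holds if the governing equation is `dY/dt = D*d2Y/dx2`; that assumption is mine, not stated in the notes.

```python
import numpy as np

t_final = 1.0
amplitude = {}
for D in (1.0, 0.1):
    # Amplitude of the sine perturbation about Y = 0.5 at the final time.
    amplitude[D] = 0.5 * np.exp(-D * np.pi ** 2 * t_final)
    print(f"D={D}: amplitude at t={t_final} is {amplitude[D]:.3e}")
```

With `D=1` the remaining amplitude is below 1e-4, while with `D=0.1` it is still above 0.1, so the network is fitting values several orders of magnitude larger at late times.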

Execution time ranged from 35 s to 5298 s.

### Set 3: Time-varying BC cases

#### [diff1d_increase_BFGS.ipynb](/notebooks/experiments/diff1d_increase_BFGS.ipynb)

The initial profile was flat at `Y=0`. The BC at `x=1` was fixed at 0, while the BC at `x=0` increased at an accelerating rate:

`dY/dt @ x=0 = 0.5*a*t**2`, with `a=1`

This BC was selected after repeated failures to obtain reasonable solutions with a linear increase at `x=0`. Presumably this problem is similar to the error decay over time in previous cases, and is probably caused by the discontinuity in the higher-order derivatives of the boundary condition function at x=0.
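One way to make this argument concrete is to compare time derivatives at the corner `(x,t)=(0,0)`. The sketch below assumes the governing equation is $\partial Y/\partial t = D\,\partial^2 Y/\partial x^2$ and that `Y(0,0)=0`; neither is written out in this section. For a flat initial profile, every spatial derivative vanishes at `t=0`, so the equation requires

$$
\left.\frac{\partial^k Y}{\partial t^k}\right|_{(0,0)} = D^k \left.\frac{\partial^{2k} Y}{\partial x^{2k}}\right|_{(0,0)} = 0, \qquad k = 1, 2, 3, \ldots
$$

A linear ramp $Y(0,t) = a t$ gives $\partial Y/\partial t = a \neq 0$, so it conflicts with the initial profile already in the first derivative. For the accelerating BC used here,

$$
\frac{dY}{dt}\bigg|_{x=0} = \tfrac{1}{2} a t^2, \qquad
\frac{d^2Y}{dt^2}\bigg|_{x=0} = a t, \qquad
\frac{d^3Y}{dt^3}\bigg|_{x=0} = a ,
$$

the first and second time derivatives vanish at `t=0` and the first mismatch is pushed out to the third derivative, which may be why this BC was easier to train against.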

With the exception of the first time step, the absolute error was O(1e-5). The error decay after the first time step was much faster than observed in previous cases. The error improved to O(1e-6) for `gtol=1e-6`, and did not improve significantly as `gtol` was tightened further, since the calculations terminated due to precision loss.

Execution time ranged from 189 s to 2328 s.

#### [diff1d_decrease_BFGS.ipynb](/notebooks/experiments/diff1d_decrease_BFGS.ipynb)

This case was analogous to the previous case, but the profile started flat at `Y=1` and decreased at `x=0`. The error was again O(1e-5) in the default case, with little or no improvement as gtol was refined and the maximum iteration count was increased.

Execution time ranged from 509 s to 5298 s.

### Summary

Initial thoughts:

The stability check cases ensured that the code did not introduce spurious behavior. I found it interesting that the 5 linear cases all resulted in exactly the same values for the network parameters. Since no dynamic behavior was expected, and all cases started with the same random seed value, perhaps this should not be surprising.

The static BC cases are now (finally) exhibiting the correct quantitative behavior. Earlier attempts at these cases behaved correctly qualitatively, but were quantitatively poor. The problem was traced to an error in the form of the trial solution used by the neural network. Once that error was corrected, the quantitative behavior was much improved.
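For readers unfamiliar with trial solutions, the sketch below shows one common Lagaris-style construction for Dirichlet BC in `x` plus an initial condition in `t`. It is not necessarily the form used in these notebooks: `trial_solution`, `f0`, `f1`, `g`, and `net` are all hypothetical names, and `net` is a placeholder for the network output.

```python
import numpy as np


def trial_solution(x, t, net, f0, f1, g):
    """Lagaris-style trial form that satisfies the BC and IC by construction.

    f0(t), f1(t): boundary values at x=0 and x=1; g(x): initial profile.
    Requires f0(0) == g(0) and f1(0) == g(1) for consistency at the corners.
    """
    # Boundary term: matches f0 at x=0, f1 at x=1, and g at t=0.
    A = (1 - x) * f0(t) + x * f1(t) + g(x) - ((1 - x) * g(0.0) + x * g(1.0))
    # Network term: the factor x*(1-x)*t vanishes on x=0, x=1, and t=0.
    return A + x * (1 - x) * t * net(x, t)


# Example using the flat+sine case, with an arbitrary placeholder "network".
f0 = lambda t: 0.5 + 0.0 * t
f1 = lambda t: 0.5 + 0.0 * t
g = lambda x: 0.5 * (1 + np.sin(np.pi * x))
net = lambda x, t: np.tanh(x + t)

x = np.linspace(0.0, 1.0, 11)
print(trial_solution(x, 0.0, net, f0, f1, g))
```

Because the boundary and initial values are built in, an error in this form shifts the whole solution regardless of how well the network trains, which fits the symptom of qualitatively correct but quantitatively poor results.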

Of course, the interesting part here is the cases with dynamic BC. These cases, like the static BC cases, showed a rapid decay in the solution error over the first few time steps. As mentioned above, I believe this is caused by the discontinuity in the higher-order derivatives of the BC function. If I recall correctly, similar effects are observed in finite-difference solutions of this type.

### TODO

* Perform new set of runs with increased number of training points. Will this reduce the effect of the error decay in the early time steps?
* Investigate oscillatory behavior with time. Will the spatial oscillation frequency increase for `t > 1`?
* Add code that shows problem with linear ramp boundary condition.
* Enhance the code to 2-D, and then 3-D, diffusion problems. The main code difference will be in the trial function, and the increased number of training points - the actual training code and computation of the objective function will remain the same as in the 1-D case.