# Technical note: Logging

A Myokit model defines a function

\begin{equation}
\dot{y}(t)=f\left(y(t),u(t),t, p\right)
\end{equation}

Here $y$ is the system state, $\dot{y}$ contains the state derivatives and $t$ is the time variable.
External inputs to the system are given as $u\left(t\right)$.
Common entries in $u$ are a dimensionless pacing variable and a diffusion current from neighbouring cells.
Parameters, physical constants and other time-invariant values are lumped together in $p$.
During the course of a simulation these values will not change, so for the rest of this document we'll simply write

\begin{equation}
\dot{y}=f(y,u,t)
\end{equation}

## Intermediate variables and notation

For a typical cell model, $f$ is calculated in two parts:

1. A number of currents and fluxes are calculated based on the state
2. Time-derivatives are calculated based on these _intermediary variables_

We can write such a system as:

\begin{align}
i(t)       &= f_1(y, u, t) \\
\dot{y}(t) &= f_2\left(i(y, u, t), y, u, t \right) \\
           &= f(y,u,t)
\end{align}

However, for some models, we have a slightly more complicated case where some of the intermediary variables at time $t$ _depend on some of the time-derivatives at time_ $t$.

We can write this more general system by assuming a right-hand-side function $g$, which has two outputs:

\begin{equation}
\left(\dot{y}(t), i(t)\right) = g\left(y(t),u(t),t\right)
\end{equation}

Another notation that is sometimes used is to view $i$ as the result of some separate _output function_ $h$:

\begin{align}
\dot{y}(t) &= f\left(y(t), u(t), t\right) \\
i(t)       &= h\left(\dot{y}(t), y(t), u(t), t\right)
\end{align}

This can be useful for mathematical analysis, where the existence of $i$ is a bit of an afterthought.

For implementation, we will want to avoid duplicate calculations and prefer a notation where $\dot{y}$ explicitly depends on $i$.
Alternatively, we can view it as a programming issue, and treat $i$ as a _by-product_ of evaluating $f$.
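
As a minimal sketch of the by-product view, the hypothetical `rhs` function below evaluates a toy one-state model: it first calculates an intermediary current and then the derivative that depends on it, and returns both. The model, its constants, and all names are assumptions made for illustration; this is not Myokit code.

```python
def rhs(y, u, t):
    """Toy right-hand side g: returns (dy/dt, i) for a single state y.

    Hypothetical model: a leak current plus a pacing stimulus.
    """
    # Assumed constants (these would be part of p)
    g_leak, e_leak, amplitude = 0.1, -80.0, -20.0
    # Step 1: intermediary variable, calculated from the state and the input
    i_total = g_leak * (y - e_leak) + amplitude * u
    # Step 2: derivative, calculated from the intermediary variable
    dydt = -i_total
    return dydt, i_total


dydt, i_total = rhs(y=-80.0, u=1.0, t=0.0)
```

Returning both values from a single call avoids evaluating the currents twice when they are needed for logging.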

## Logging simulations

At any time $t$, for a known set of parameters $p$, a full and minimal description of the (time-variant parts of the) system can be given as:

\begin{equation}
\left\langle y\left(t\right),u\left(t\right),t\right\rangle
\end{equation}

Any simulation method must be capable of logging these values at any visited time $t$.
We'll call logging these three values _basic logging_.
In addition, methods may choose to offer logging of intermediary variables and/or derivatives.
Finally, it is often useful to decouple logging times from ODE solving times, in which case interpolation may be necessary.
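
To illustrate what decoupling logging times from solver times involves, the sketch below linearly interpolates a solver's output onto a set of requested logging times. Linear interpolation and the function name `interpolate_log` are assumptions chosen for simplicity; a real solver may provide its own, higher-order, interpolation.

```python
def interpolate_log(times, values, log_times):
    """Linearly interpolate solver output onto requested logging times.

    ``times`` and ``values`` are the points visited by the solver, with
    strictly increasing times. ``log_times`` must be sorted, and every entry
    must lie within [times[0], times[-1]].
    """
    logged = []
    k = 0
    for t in log_times:
        # Advance to the solver interval [times[k], times[k + 1]] containing t
        while k < len(times) - 2 and times[k + 1] < t:
            k += 1
        t0, t1 = times[k], times[k + 1]
        w = (t - t0) / (t1 - t0)
        logged.append((1 - w) * values[k] + w * values[k + 1])
    return logged


logged = interpolate_log([0.0, 1.0, 4.0], [0.0, 2.0, 8.0], [0.5, 2.5, 4.0])
print(logged)  # [1.0, 5.0, 8.0]
```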

## Order of operations, ideal scenario

We consider the case where, at the start of a simulation,
\begin{eqnarray}
t & = & t_{min}\\
u & = & u\left(t_{min}\right)\\
y & = & y\left(t_{min}\right)=y_{0}
\end{eqnarray}

and nothing else is known.
If only basic logging is provided, a log entry for $t_{min}$ can be written at this time.
During a simulation, at each step we do three things:

1. Call the rhs, thereby obtaining
\begin{eqnarray*}
i & = & i\left(t\right)\\
\dot{y} & = & \dot{y}\left(t\right)
\end{eqnarray*}

2. If a log entry should be made at time $t$, now is the time to do it
3. Perform the update:
   1. Calculate $\Delta t$
   2. Calculate $y\left(t+\Delta t\right)$
   3. Calculate $u\left(t+\Delta t\right)$
   4. Update the time to $t+\Delta t$
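
The order of operations above can be sketched with a forward Euler loop. The toy model, the pacing function, and all names below are hypothetical and serve only to show that, when logging happens between the rhs call and the update, every logged value refers to the same time $t$.

```python
def rhs(y, u, t):
    """Hypothetical toy model: returns (dy/dt, i)."""
    i = 0.5 * y - u       # intermediary variable
    return -i, i          # the derivative depends on the intermediary variable


def pacing(t):
    """Hypothetical pacing input: 1 during the first time unit, else 0."""
    return 1.0 if t < 1.0 else 0.0


def simulate(y0, t_min, t_max, dt_max=0.25):
    t, y, u = t_min, y0, pacing(t_min)
    log = {'t': [], 'y': [], 'u': [], 'i': [], 'dot_y': []}
    while t < t_max:
        # 1. Call the rhs to obtain i(t) and dot_y(t)
        dot_y, i = rhs(y, u, t)
        # 2. Log: all entries now correspond to the same time t
        log['t'].append(t)
        log['y'].append(y)
        log['u'].append(u)
        log['i'].append(i)
        log['dot_y'].append(dot_y)
        # 3. Update: step size, state, input, and finally the time
        dt = min(dt_max, t_max - t)
        y = y + dt * dot_y
        u = pacing(t + dt)
        t = t + dt
    return log


log = simulate(y0=1.0, t_min=0.0, t_max=1.0)
```

Because this sketch only logs at the start of each step, the final time `t_max` is not included in the log.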


## CVODE Simulation

In the CVODE-based simulation, there is no control over the points $\left\langle y\left(t\right),t\right\rangle$ at which the solver evaluates $f$ when moving to the next position.
This has two consequences:

### 1. External inputs are not re-calculated

When CVODE explores the state space around $\left\langle y\left(t\right),u\left(t\right),t\right\rangle $, it can vary $t$ and $y\left(t\right)$ but $u\left(t\right)$ stays fixed.
This is a result of how Myokit calculates these external functions.
However, this is acceptable for the following reasons:

1. The main entry in $u$ is the discontinuous pacing current (diffusion is not supported in CVODE simulations).
Whenever the pacing changes, the solver is reset to the exact time of the change and re-initialised in the new situation.
2. Other entries in $u$ (system time and number of evaluations) are provided for logging only and should not be used to calculate parts of the RHS!



### 2. By-products of calling `rhs()` are unreliable

Since we don't know where $f$ was last evaluated, we don't know which derivatives and intermediary variables are currently in memory.
To perform accurate logging of these values an extra call to the rhs function is needed.
This is implemented in Myokit's CVODES simulation, resulting in a few extra calls to `rhs()` (if and only if derivatives or intermediary variables are being logged).
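
The problem can be illustrated with a much simpler solver. The sketch below uses an explicit midpoint step as a stand-in (an assumption for illustration, not CVODE's algorithm): like CVODE, it evaluates the rhs at points other than the accepted solution, so the cached by-products are stale until an extra call is made. The `Model` class and its attributes are hypothetical.

```python
class Model:
    """Hypothetical toy model that caches the by-products of its last rhs call."""

    def __init__(self):
        self.last_t = None   # time of the most recent rhs evaluation
        self.last_i = None   # intermediary variable from that evaluation

    def rhs(self, y, t):
        i = 2.0 * y          # intermediary variable
        self.last_t, self.last_i = t, i
        return -i            # derivative


def midpoint_step(model, y, t, dt):
    """One explicit midpoint step: evaluates the rhs at t and at t + dt / 2."""
    k1 = model.rhs(y, t)
    k2 = model.rhs(y + 0.5 * dt * k1, t + 0.5 * dt)
    return y + dt * k2


model = Model()
y, t, dt = 1.0, 0.0, 0.25
y = midpoint_step(model, y, t, dt)
t = t + dt

# The cached by-products belong to the trial point at t = 0.125, not to t = 0.25
stale_t, stale_i = model.last_t, model.last_i

# An extra rhs call at the accepted point (y, t) refreshes the by-products
model.rhs(y, t)
fresh_t, fresh_i = model.last_t, model.last_i
```

Here `stale_i` and `fresh_i` differ, which is why logging the cached values without the extra call would be inaccurate.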


## OpenCL Simulation

The OpenCL simulation mostly uses a fixed step size, but will occasionally make shorter steps to hit a logging point or a change in the pacing signal exactly.

To save memory and memory access time, derivatives are calculated in local memory (on the device) and used to update the state immediately.
Thus, after a call to the RHS defined by the kernel we have

\begin{equation}
t,u\left(t\right),y\left(t+\Delta t\right)
\end{equation}

Derivatives are not currently logged in the OpenCL simulation, but intermediary variables can be made available so that we obtain

\begin{equation}
t,u\left(t\right),y\left(t+\Delta t\right),i\left(t\right)
\end{equation}

on the device, after each call to the RHS.

To implement correct logging, updates in the OpenCL simulations proceed with the following steps:

- Check if we need to log at time $t$, store in `need_to_log`
- Determine the $\Delta t$ for the next step
- Calculate diffusion at time $t$ on the device
- Do we `need_to_log`? Then download the state at time $t$ from the device
- Update the states on the device to $t+\Delta t$ (which updates the intermediary variables on the device to $t$)
- Do we `need_to_log`? Then download the intermediary and diffusion variables from the device.
  Write state, intermediary and diffusion variables to log.
- Update time to $t+\Delta t$
- Update pacing signal to $t+\Delta t$
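
The steps above can be sketched in plain Python, with a hypothetical `Device` class standing in for the OpenCL device and a toy single-cell model. All names, the model, and the step sizes are assumptions made for illustration. The important detail is that the state is copied _before_ the fused update, while the intermediary variables are read _after_ it.

```python
class Device:
    """Hypothetical stand-in for the OpenCL device, simulating a single cell."""

    def __init__(self, y0):
        self.y = y0          # state at the current time
        self.i = None        # intermediary variable, set by update()
        self.diff = None     # diffusion current, set by diffusion()

    def diffusion(self):
        self.diff = 0.0      # a single cell has no neighbours

    def update(self, u, dt):
        # Fused kernel: computes i(t), then overwrites y(t) with y(t + dt)
        self.i = 0.5 * self.y - u - self.diff
        self.y = self.y - dt * self.i


def pacing(t):
    """Hypothetical pacing signal: 1 during the first time unit, else 0."""
    return 1.0 if t < 1.0 else 0.0


def simulate(y0, t_min, t_max, dt_default=0.4, log_interval=0.5):
    device = Device(y0)
    t, u = t_min, pacing(t_min)
    t_next_log = t_min
    log = []
    while t < t_max:
        # Check if we need to log at time t
        need_to_log = t >= t_next_log
        if need_to_log:
            t_next_log += log_interval
        # Determine the step, shortened to hit the next logging point exactly
        t_new = min(t + dt_default, t_next_log, t_max)
        dt = t_new - t
        device.diffusion()
        if need_to_log:
            y_t = device.y                 # "download" y(t) before the update
        device.update(u, dt)               # device now holds y(t + dt) and i(t)
        if need_to_log:
            log.append((t, u, y_t, device.i, device.diff))
        # Update the time and the pacing signal
        t = t_new
        u = pacing(t)
    return log


log = simulate(y0=1.0, t_min=0.0, t_max=1.0)
```

With the assumed default step of 0.4 and logging interval of 0.5, the second step is shortened to 0.1 so that a log entry can be made at exactly `t = 0.5`.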

## Timing

Timing of logging steps is important.
There are two conflicting use cases:

1. A user runs a single simulation from t0=0 to t1=1000. 
   Next, a plot is made of these results.
   This requires t0=0 and t1=1000 to both be included in the log.
2. A user runs a simulation from t0=0 to t1=1000. 
   Then a simulation from t0=1000 to t1=2000.
   The result should be a log from t=0 to t=2000 without duplicate values.
   1. This can be done by passing log1 in as the log argument to the second simulation.
   2. This can be done by logging in two separate logs and then joining them.

For case 1, it would be ideal to always log the initial and final step.
For case 2 it would be better to log half-open intervals (so include t0 but not t1).

Use case 1 is very common.
The most common version of use case 1 is a single cell simulation with variable steps, simulating exactly one beat.
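With variable steps, the step size becomes large towards the end of the beat, so that the last step before the end of a 1000 ms simulation may be taken somewhere in the range 800-900 ms.
If the final point were not logged, the log (and any plot made from it) would stop there.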

However, half-open intervals avoid all problems in use case 2 (while use case 1 is simply less pretty) and are the de facto standard in computing (for-loops, Python ranges, etc.).
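
As a small illustration of why half-open intervals join cleanly, consider logging times generated with Python's `range`, which includes the start but excludes the end. The 250 ms logging interval is an assumption chosen for the example.

```python
first = list(range(0, 1000, 250))       # [0, 250, 500, 750]
second = list(range(1000, 2000, 250))   # [1000, 1250, 1500, 1750]
joined = first + second

# No duplicate time points, and identical to one uninterrupted run
assert len(joined) == len(set(joined))
assert joined == list(range(0, 2000, 250))
```

With closed intervals, both lists would contain the point 1000, and one copy would have to be removed when joining.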

Decision: Split the cases for variable- and fixed-interval logging.
For variable-interval logging, log full intervals except when appending.
This is slightly awkward for the situation where logs are appended _outside_ of the simulation, but methods that can deal with variable-interval logs can most likely deal with this too.


### Logging with variable time-steps

When logging with variable time-steps, a log entry is inserted at the following stages:

1. At the start of each simulation, unless appending to an existing log.
2. After each step, including the last.
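
The two rules can be sketched as follows. The `simulate` function, its signature, and the doubling step sizes are hypothetical placeholders for a real variable-step solver; only the placement of the logging calls is the point.

```python
def simulate(t0, t1, y0, log=None):
    """Hypothetical variable-step run that follows the two logging rules."""
    appending = log is not None
    if log is None:
        log = {'t': [], 'y': []}
    t, y = t0, y0
    # Rule 1: log the starting point, unless appending to an existing log
    if not appending:
        log['t'].append(t)
        log['y'].append(y)
    dt = 100.0
    while t < t1:
        dt = min(dt, t1 - t)
        y = y * 0.5            # placeholder for a real solver step
        t = t + dt
        dt = dt * 2            # mimic steps that widen towards the end
        # Rule 2: log after each step, including the last
        log['t'].append(t)
        log['y'].append(y)
    return log, y


# A single run logs the full interval, including both end points
log, y = simulate(0.0, 1000.0, 1.0)
first_run_times = list(log['t'])   # [0.0, 100.0, 300.0, 700.0, 1000.0]

# A second run appends to the same log without duplicating t = 1000
log, y = simulate(1000.0, 2000.0, y, log=log)
```

After the second run the log covers 0 to 2000 and contains the point at 1000 exactly once.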


### Logging with fixed intervals

When logging with fixed log intervals, a log entry is inserted at the following stages:

1. At the start of each simulation.
2. \#TODO