# Parameter management
Depending on which parameters are free or kept fixed, a simple fit model can answer very different questions. For example, let's assume we want to fit a powerlaw MSD with localization error:
\begin{equation}
\text{MSD}(\Delta t) = \sigma^2 + \Gamma(\Delta t)^\alpha\,.
\end{equation}

Even this simple model raises several questions:

 + Should the localization error be the same for all dimensions or vary independently? For example, for 3D data acquired by z-stacking, we would expect the error along the $z$ axis to be significantly higher than in $x$ and $y$. So maybe we should constrain $\sigma_x$ and $\sigma_y$ to be equal, but keep $\sigma_z$ independent?
 + Maybe the localization error is known externally, so we can just fix it to a constant.
 + If we fix the exponent $\alpha$, the fit simply measures the associated (anomalous) diffusion constant. Notably, for $\alpha = 1$ this reduces to the optimal estimator introduced in (Vestergaard, 2014).
 + In an anisotropic medium it is conceivable that the prefactor $\Gamma$ differs along the Cartesian directions; can we account for that possibility, but keep $\alpha$ the same for all dimensions?

All of these cases fall under the same framework of a powerlaw fit, but introduce different constraints between the parameters. Introducing and keeping track of these constraints is the topic of this tutorial.

To illustrate how this is done in ``bayesmsd``, we set up an example fit. As in the [Quickstart](00_intro.ipynb), we use ``lib.NPXFit`` on synthetic data for normal diffusion with standard normal steps, but here in 3D:

In [None]:
import numpy as np
import bayesmsd

In [None]:
np.random.seed(509354708)
data = np.cumsum(np.random.normal(size=(20, 100, 3)), axis=1) # 20 trajectories, 100 frames, 3D
fit = bayesmsd.lib.NPXFit(data, ss_order=1)
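
As a sanity check on this kind of synthetic data, we can estimate the MSD directly. A random walk with unit-variance steps should give an MSD of roughly $\Delta t$ per dimension (that is, $\Gamma \approx 1$, $\alpha \approx 1$, and no localization error), so about $3\Delta t$ when summed over three dimensions. The sketch below uses plain ``numpy``; the helper ``empirical_msd`` and the seed are illustrative assumptions and not part of ``bayesmsd``.

```python
import numpy as np

def empirical_msd(trajectories, max_lag):
    """Time- and ensemble-averaged MSD, summed over spatial dimensions.

    trajectories: array of shape (N, T, d); returns an array of length max_lag + 1.
    (Hypothetical helper for illustration only.)
    """
    msd = np.zeros(max_lag + 1)
    for lag in range(1, max_lag + 1):
        # displacements over `lag` frames, for all trajectories and start times
        disp = trajectories[:, lag:] - trajectories[:, :-lag]
        msd[lag] = np.mean(np.sum(disp**2, axis=-1))
    return msd

rng = np.random.default_rng(0)  # arbitrary seed, independent of the tutorial data
demo_data = np.cumsum(rng.normal(size=(20, 100, 3)), axis=1)
demo_msd = empirical_msd(demo_data, max_lag=10)
print(demo_msd[1:4])  # expected to be close to 3, 6, 9 up to sampling noise
```

Such a direct estimate is only a rough check; it does not account for the correlations between overlapping displacements.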

## Getting an overview over fit parameters
To achieve full flexibility, ``NPXFit`` has a whole lot of parameters:

 + $\sigma^2$ (localization error),
 + $\Gamma$ (MSD prefactor), and
 + $\alpha$ (anomalous scaling);
 + all of the above independently for each spatial dimension.

For our example data set in 3D, this results in a total of 9 fit parameters:

In [None]:
for name in fit.parameters:
    print(name)

Of course, most of the time we do not need or want this much flexibility. So by default ``NPXFit`` constrains $\Gamma$ and $\alpha$ to be the same across dimensions. This is achieved internally by simply fixing everything to the values for dimension 0:
```py
fit.parameters['log(Γ) (dim 0)'].fix_to = None
fit.parameters['log(Γ) (dim 1)'].fix_to = 'log(Γ) (dim 0)'
fit.parameters['log(Γ) (dim 2)'].fix_to = 'log(Γ) (dim 0)'
... # etc.

fit.parameters['α (dim 0)'].fix_to = None
fit.parameters['α (dim 1)'].fix_to = 'α (dim 0)'
fit.parameters['α (dim 2)'].fix_to = 'α (dim 0)'
... # etc.
```
This leaves only five independent fit parameters: $\sigma_x$, $\sigma_y$, $\sigma_z$, $\Gamma$, and $\alpha$. We can check this:

In [None]:
print(fit.independent_parameters())

Note that since $\Gamma$ and $\alpha$ are simply fixed to the values for dimension 0, the remaining independent parameters still carry the ``'(dim 0)'`` suffix. This serves as a good reminder of what is happening behind the scenes: ``'log(Γ) (dim 0)'`` is the (logarithm of the) MSD prefactor for each individual dimension. So when plotting the final MSD (which sums over dimensions) we will see a prefactor of ``d*Γ``, where ``d`` is the number of spatial dimensions.
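
The factor of ``d`` can be verified with a few lines of ``numpy``. This is a sketch under the model equation from the introduction; the function ``powerlaw_msd`` and all parameter values are made up for illustration and are not taken from ``bayesmsd``.

```python
import numpy as np

def powerlaw_msd(dt, sigma2, Gamma, alpha):
    """Single-dimension MSD following the model above: sigma^2 + Gamma * dt^alpha."""
    return sigma2 + Gamma * dt**alpha

dt = np.arange(1, 11, dtype=float)
sigma2_per_dim = [0.1, 0.2, 0.5]  # independent localization errors (made-up values)
Gamma, alpha = 1.3, 0.7           # shared across dimensions (made-up values)

# Summing the per-dimension MSDs ...
total = sum(powerlaw_msd(dt, s2, Gamma, alpha) for s2 in sigma2_per_dim)
# ... gives the summed localization error plus a prefactor of d*Gamma with d = 3
expected = sum(sigma2_per_dim) + 3 * Gamma * dt**alpha
assert np.allclose(total, expected)
```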

## Manually fixing parameters
Using the ``fix_to`` attribute, we can tie parameters together however we see fit:

In [None]:
print("Independent parameters in different scenarios")
print("---------------------------------------------")

# Ex 1: same localization error across dimensions
fit.parameters['log(σ²) (dim 1)'].fix_to = 'log(σ²) (dim 0)'
fit.parameters['log(σ²) (dim 2)'].fix_to = 'log(σ²) (dim 0)'
print("Ex 1:", fit.independent_parameters())

# Ex 2: known localization error: σ_x = 4, σ_y = 3, σ_z = 7
fit.parameters['log(σ²) (dim 0)'].fix_to = np.log(4**2)
fit.parameters['log(σ²) (dim 1)'].fix_to = np.log(3**2)
fit.parameters['log(σ²) (dim 2)'].fix_to = np.log(7**2)
print("Ex 2:", fit.independent_parameters())

# Ex 3: make Γ_z independent from Γ_x and Γ_y
# We keep the fixed localization error from Ex 2.
fit.parameters['log(Γ) (dim 2)'].fix_to = None
print("Ex 3:", fit.independent_parameters())

# Finally: reset
fit.parameters['log(σ²) (dim 0)'].fix_to = None
fit.parameters['log(σ²) (dim 1)'].fix_to = None
fit.parameters['log(σ²) (dim 2)'].fix_to = None
fit.parameters[ 'log(Γ) (dim 0)'].fix_to = None
fit.parameters[ 'log(Γ) (dim 1)'].fix_to = 'log(Γ) (dim 0)'
fit.parameters[ 'log(Γ) (dim 2)'].fix_to = 'log(Γ) (dim 0)'
print("End :", fit.independent_parameters())
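
To make the three kinds of ``fix_to`` values used above concrete (``None``, the name of another parameter, or a numerical constant), here is a toy model of how such a specification could be resolved. It is a simplified sketch and not the actual ``bayesmsd`` implementation; the functions ``independent_names`` and ``resolve`` and the parameter names are hypothetical, and the sketch assumes that a parameter is only ever tied to a free parameter.

```python
def independent_names(fix_to):
    """Names of free parameters, given a dict mapping name -> fix_to specification.

    None means free; a number or another parameter's name means fixed.
    """
    return [name for name, spec in fix_to.items() if spec is None]

def resolve(fix_to, free_values):
    """Fill in the values of all parameters from the free ones."""
    values = dict(free_values)
    for name, spec in fix_to.items():
        if isinstance(spec, str):
            values[name] = free_values[spec]  # tied to another (free) parameter
        elif spec is not None:
            values[name] = float(spec)        # fixed to a constant
    return values

# Toy specification: 'a (dim 1)' follows 'a (dim 0)', and 'b' is a known constant
toy = {'a (dim 0)': None, 'a (dim 1)': 'a (dim 0)', 'b': 2.0}
toy_values = resolve(toy, {'a (dim 0)': 0.5})
print(independent_names(toy), toy_values)
```

The optimizer then only ever sees the free parameters; everything else is derived from them before the likelihood is evaluated.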

## Bounds
Each parameter in ``fit.parameters`` has ``bounds`` that constrain it to a specific interval on the real line. Bounds may be infinite (``np.inf`` or ``-np.inf``):

In [None]:
for name in fit.parameters:
    print(f"{name:>15s} : {fit.parameters[name].bounds}")

## Advanced: functional constraints
We saw above how ``Parameter.fix_to`` is used to fix parameters to numerical constants or the value of other parameters. What about more complicated constraints?

There are two types of constraints: equality constraints of the form $f(p_1, p_2, ...) = 0$ and inequality constraints of the form $f(p_1, p_2, ...) > 0$. The treatment for these is somewhat different.

Equality constraints must be solvable for one of the parameters and can then be implemented by setting ``Parameter.fix_to`` to a callable. As an example, let us assume that we know (for some reason) the total variance $\sigma^2\equiv\sigma_x^2 + \sigma_y^2 + \sigma_z^2$ of the localization error, but are completely ignorant as to the individual dimensions. We would therefore like to enforce that all the error variances sum to $\sigma^2$, without fixing any of them to a constant (or to each other). This can be reformulated as
\begin{equation}
\sigma_x^2 = \sigma^2 - \sigma_y^2 - \sigma_z^2\,,
\end{equation}
which we then implement as

In [None]:
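def sx_fixer(parameters, total_error=5):
    # total_error is the total variance σ² = σx² + σy² + σz²
    sy2 = np.exp(parameters['log(σ²) (dim 1)'])
    sz2 = np.exp(parameters['log(σ²) (dim 2)'])
    # np.abs keeps the logarithm defined even if sy2 + sz2 exceeds total_error
    return np.log(np.abs(total_error - sy2 - sz2))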

fit.parameters['log(σ²) (dim 0)'].fix_to = sx_fixer
fit.parameters['log(σ²) (dim 1)'].fix_to = None
fit.parameters['log(σ²) (dim 2)'].fix_to = None
print(fit.independent_parameters())

So now $\sigma_x$ is fixed in terms of $\sigma_y$ and $\sigma_z$. But what if $\sigma_y$ becomes very large, such that $\sigma^2 - \sigma_y^2 - \sigma_z^2 < 0$? Clearly we cannot allow this, so in fact we need a second constraint, this time an inequality: $\sigma_y^2 + \sigma_z^2 < \sigma^2$.
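
Both situations can be checked numerically by calling the fixer function by hand. The snippet below repeats ``sx_fixer`` so that it runs standalone, and passes a plain ``dict`` in place of the parameter values; that the callable can be exercised this way outside of a fit is an assumption made for illustration.

```python
import numpy as np

def sx_fixer(parameters, total_error=5):
    sy2 = np.exp(parameters['log(σ²) (dim 1)'])
    sz2 = np.exp(parameters['log(σ²) (dim 2)'])
    return np.log(np.abs(total_error - sy2 - sz2))

# Feasible case: σy² + σz² = 2.5 is below the total variance of 5
params_ok = {'log(σ²) (dim 1)': np.log(1.0), 'log(σ²) (dim 2)': np.log(1.5)}
sum_ok = np.exp(sx_fixer(params_ok)) + 1.0 + 1.5

# Infeasible case: σy² + σz² = 7 exceeds the total variance of 5.
# The absolute value keeps the logarithm defined, but the sum is no longer 5.
params_bad = {'log(σ²) (dim 1)': np.log(4.0), 'log(σ²) (dim 2)': np.log(3.0)}
sum_bad = np.exp(sx_fixer(params_bad)) + 4.0 + 3.0

print(f"feasible: {sum_ok:.3f}, infeasible: {sum_bad:.3f}")
```

In the second case the equality constraint is silently violated, which is why the inequality has to be enforced separately.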

Inequality constraints are tricky to enforce numerically, since they introduce sharp cuts in parameter space: as long as $f(p_1, p_2, ...) > 0$, everything is perfectly fine; but as soon as $f$ switches sign we are in an infeasible region. If the boundary $f(p_1, p_2, ...) = 0$ is irregularly shaped, solving the problem precisely can become very costly. ``bayesmsd`` takes the less precise but more affordable route of introducing a smooth penalty function across the boundary, such that we get three regimes:

 + $f < 0$ is infeasible, i.e. the likelihood function might not even be well-defined for these parameter points
 + $f > 1$ is "perfectly fine", i.e. no penalization will be applied
 + $f \in [0, 1]$ is "still fine-ish": in this regime we subtract a smooth penalty function from the actual likelihood function, thus discouraging the fit from exploring these regions (and coming closer to $f < 0$). Specifically, the penalization is given by:
\begin{equation}
\log L \gets \log L - \exp\left[\cot(\pi f)\right] \,,
\end{equation}
 which smoothly crosses over to 0 at $f = 1$ and diverges for $f \searrow 0$.
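
The shape of this penalty is easy to inspect numerically. The following is a minimal sketch of the formula above, not the ``bayesmsd`` source; the function name ``penalty`` and the choice to return ``np.inf`` in the infeasible region are assumptions made for illustration.

```python
import numpy as np

def penalty(f):
    """Penalty exp(cot(pi*f)) for 0 < f < 1; zero for f >= 1; infinite for f <= 0."""
    if f <= 0:
        return np.inf  # infeasible region (assumed handling)
    if f >= 1:
        return 0.0     # no penalization
    return float(np.exp(1.0 / np.tan(np.pi * f)))

for f in [0.01, 0.25, 0.5, 0.75, 0.99, 1.0]:
    print(f"f = {f:4.2f} : penalty = {penalty(f):.3e}")
```

The penalty is of order one around $f = 0.5$, negligible close to $f = 1$, and grows extremely fast as $f$ approaches 0.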
 
By rescaling the constraint function appropriately, we can tune the crossover width to our liking. In our example of fixed total localization error it might be enough to have the penalization kick in if $\sigma^2 - \sigma_y^2 - \sigma_z^2 < 10^{-5}$:

In [None]:
def sy_sz_constraint(parameters, total_error=5):
    sy2 = np.exp(parameters['log(σ²) (dim 1)'])
    sz2 = np.exp(parameters['log(σ²) (dim 2)'])
    return (total_error - sy2 - sz2) / 1e-5

fit.constraints.append(sy_sz_constraint)
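
To see what the rescaling does, we can classify a few parameter points by hand. This sketch mirrors the arithmetic of the constraint function above with explicit arguments; the helpers ``scaled_margin`` and ``regime`` and the example values are hypothetical.

```python
def scaled_margin(total, sy2, sz2, width=1e-5):
    """Constraint value f: remaining variance budget in units of the crossover width."""
    return (total - sy2 - sz2) / width

def regime(f):
    """Classify a constraint value according to the three regimes described above."""
    if f < 0:
        return "infeasible"
    if f > 1:
        return "unpenalized"
    return "penalized"

# Remaining budget 0.1 is far above the width 1e-5: no penalty
far = regime(scaled_margin(5.0, 2.0, 2.9))
# Remaining budget 5e-6 is half the width: inside the crossover region
close = regime(scaled_margin(5.0, 2.0, 3.0 - 5e-6))
# Budget exceeded: infeasible
beyond = regime(scaled_margin(5.0, 2.0, 3.1))
print(far, close, beyond)
```

A smaller ``width`` lets the fit approach the boundary more closely before it is penalized, at the cost of a steeper penalty.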

We have now implemented an equality and an inequality constraint that together enforce $\sigma_x^2 + \sigma_y^2 + \sigma_z^2 = \sigma^2 = \text{const.}$ Let's run the fit and look at the output:

In [None]:
res = fit.run(show_progress=True)

# Print parameters
for name in ['log(σ²) (dim 0)',
             'log(σ²) (dim 1)',
             'log(σ²) (dim 2)',
             'log(Γ) (dim 0)',
             'α (dim 0)',
            ]:
    print(f"{name:>15s} = {res['params'][name]:.5f}")

# Check constraint
print()
sum_xyz = np.sum([np.exp(res['params'][f"log(σ²) (dim {d})"]) for d in range(3)])
print(f"σx² + σy² + σz² = {sum_xyz:.5f}")

Indeed, the final result satisfies the constraint; so this seems to have worked!

Note that this example was mainly meant to illustrate the use of equality and inequality constraints; it might not be particularly realistic. Specifically, the notable asymmetry in the inferred localization errors is due to a spontaneous symmetry breaking mechanism: since the data are symmetric across dimensions, we would presumably get a similarly "good" fit upon permuting the localization errors between dimensions. But since they have to add up to a constant, at least one of the localization errors has to be large.

A more relevant use case for an inequality constraint is ``bayesmsd.lib.SplineFit`` (which is covered in more detail elsewhere): for an arbitrarily drawn MSD curve, the associated covariance matrix is not necessarily positive definite, so we have to check this by hand. The constraint function in this case is implemented in the abstract base class as ``Fit.Cpositive_constraint``, such that it is available to any derived class that might want to use it.
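
To illustrate why such a check is needed, the sketch below builds the covariance matrix of unit-lag increments implied by a one-dimensional MSD curve and tests it for positive definiteness. It assumes a process with stationary increments, for which the increment autocovariance is $\gamma(k) = \frac{1}{2}\left[\text{MSD}(k+1) + \text{MSD}(|k-1|) - 2\,\text{MSD}(k)\right]$. It is not how ``Fit.Cpositive_constraint`` is implemented; the helper names are hypothetical.

```python
import numpy as np

def increment_covariance(msd):
    """Toeplitz covariance of unit-lag increments implied by an MSD curve.

    msd[k] is the MSD at lag k, with msd[0] == 0.
    """
    msd = np.asarray(msd, dtype=float)
    k = np.arange(len(msd) - 1)
    gamma = 0.5 * (msd[k + 1] + msd[np.abs(k - 1)] - 2 * msd[k])
    lag = np.abs(k[:, None] - k[None, :])  # |i - j| for all pairs of increments
    return gamma[lag]

def is_positive_definite(C):
    """True if the smallest eigenvalue of the symmetric matrix C is positive."""
    return bool(np.linalg.eigvalsh(C).min() > 0)

lags = np.arange(21, dtype=float)
diffusive = is_positive_definite(increment_covariance(lags))          # MSD = Δt
superballistic = is_positive_definite(increment_covariance(lags**3))  # MSD = Δt³
print(diffusive, superballistic)
```

A linear MSD yields the identity matrix, which is a valid covariance, whereas a curve growing like $\Delta t^3$ does not correspond to any valid covariance matrix.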