# Transforming cost functions using MinimallyDisruptiveCurves.jl

The premise of MinimallyDisruptiveCurves.jl is to
> move as far away as possible from the initial parameters, while keeping model behaviour as similar as possible.

(in other words, while keeping a cost function as low as possible).

The words *as far away as possible* imply some metric on the space of parameters. Many key workflows involve manipulating this metric, for instance by

- fixing/freeing parameters that the minimally disruptive curve can change
- looking at relative, rather than absolute changes in parameters
- biasing parameters to have a larger/smaller influence on the metric, so that minimally disruptive curves are encouraged to align with them (or discouraged from doing so)
- something custom.

Each of these corresponds to a **transformation** of parameter space. The easiest way to do all of these things is by **reparameterising** the cost function $C(\theta)$. We take a transformation of parameter space: $T(\theta)$, and a new cost function $D$ satisfying
$$ D[T(\theta)] = C(\theta). $$

**The purpose of this notebook is to show you an easy way to make/perform these reparameterisations.**

The (slight) complication is that cost functions compatible with MinimallyDisruptiveCurves.jl must be **differentiable**. That is, they have two methods:
```
## method 1
function cost(𝜃)
    ...
    return cost
end

## method 2
function cost(𝜃, grad_template)
    ...
    grad_template[:] = ∇C    # mutate to get gradient wrt parameters

    return cost
end
```

So we want an easy way to apply composable transformations to cost functions while recomputing the gradient automatically. The `TransformationStructure` type provides exactly that.
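To make the gradient bookkeeping concrete, here is a minimal hand-rolled sketch of a reparameterisation. It assumes the transformed cost is $D(\phi) = C(T^{-1}(\phi))$, so that the chain rule gives $\nabla D(\phi) = J_{T^{-1}}(\phi)^\top \nabla C(\theta)$ with $\theta = T^{-1}(\phi)$. The helper `manual_reparameterise`, its Jacobian argument `Jinv`, and the example cost `Q` are illustrative names and not part of MinimallyDisruptiveCurves.jl; the package's own implementation may differ.

```julia
using LinearAlgebra   # for Diagonal

# Hypothetical helper (not part of MinimallyDisruptiveCurves.jl):
# given a two-method cost C, the inverse transformation Tinv, and a function
# Jinv returning the Jacobian of Tinv, build D(φ) = C(Tinv(φ)) with both methods.
function manual_reparameterise(C, Tinv, Jinv)
    D(φ) = C(Tinv(φ))
    function D(φ, g)
        θ = Tinv(φ)
        gθ = similar(θ)
        c = C(θ, gθ)            # cost and gradient wrt the original parameters
        g[:] = Jinv(φ)' * gθ    # chain rule: ∇D(φ) = J_Tinv(φ)ᵀ ∇C(θ)
        return c
    end
    return D
end

# Illustrative quadratic cost with a hand-written gradient.
target = [5.0, 6.0, 7.0]
Q(θ) = sum(abs2, θ .- target)
function Q(θ, g)
    g[:] = 2 .* (θ .- target)
    return Q(θ)
end

# Log transformation: φ = log(θ), so Tinv(φ) = exp(φ), with a diagonal Jacobian.
Tinv_log(φ) = exp.(φ)
Jinv_log(φ) = Diagonal(exp.(φ))

D_log = manual_reparameterise(Q, Tinv_log, Jinv_log)
φ = log.([1.0, 2.0, 3.0])
g_log = zeros(3)
cost_log = D_log(φ, g_log)   # g_log now holds the gradient wrt φ
```

Writing the Jacobian by hand quickly becomes tedious and error-prone for anything beyond elementwise transformations, which is the motivation for automating this step.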

First let's make an arbitrary cost function. We'll keep it very simple for didactic purposes. 

In [3]:
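θ₀ = [5.,6.,7.]
C(θ) = sum( @. (θ - θ₀)^2)    # method 1: returns the cost only

# method 2: returns the cost and writes the gradient into g
function C(θ, g)
    g[:] = @. 2(θ-θ₀)
    return C(θ)
end

g = deepcopy(θ₀)    # buffer that will be overwritten with the gradient
@show C(θ₀, g)
@show g;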


C(θ₀, g) = 0.0
g = [0.0, 0.0, 0.0]
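
The evaluation above is made at the minimum, where the gradient vanishes, so it cannot reveal a mistake in a hand-written gradient. A minimal sketch of a finite-difference check at another point follows. `fd_gradient` and `θ_test` are hypothetical names, not part of the package, and the cost is redefined (in an equivalent form) so that the block runs on its own.

```julia
θ₀ = [5.0, 6.0, 7.0]
C(θ) = sum(abs2, θ .- θ₀)
function C(θ, g)
    g[:] = 2 .* (θ .- θ₀)
    return C(θ)
end

# Hypothetical helper: central finite-difference estimate of the gradient of f at θ.
function fd_gradient(f, θ; h = 1e-6)
    g = similar(θ)
    for i in eachindex(θ)
        e = zeros(length(θ))
        e[i] = h                                  # perturb one coordinate at a time
        g[i] = (f(θ .+ e) - f(θ .- e)) / (2h)
    end
    return g
end

θ_test = [1.0, 2.0, 3.0]          # a point away from the minimum
g_analytic = similar(θ_test)
C(θ_test, g_analytic)             # fills g_analytic via the two-argument method
g_numeric = fd_gradient(C, θ_test)
@show maximum(abs.(g_analytic .- g_numeric));
```

A discrepancy much larger than the finite-difference error would suggest that the two methods of the cost function disagree.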


Next let's build a transformation structure:

In [4]:
T(p) = @. p + 1
Tinv(p) = @. p - 1

using MinimallyDisruptiveCurves
Trf = TransformationStructure("simple_transform", T, Tinv)

TransformationStructure{typeof(T),typeof(Tinv)}("simple_transform", T, Tinv)

and reparameterise the cost function with it...

In [6]:
D, θnew = transform_cost(C, θ₀, Trf)

(DiffCost{MinimallyDisruptiveCurves.var"#new_cost#35"{typeof(C),TransformationStructure{typeof(T),typeof(Tinv)}},MinimallyDisruptiveCurves.var"#new_cost2#36"{typeof(C),TransformationStructure{typeof(T),typeof(Tinv)},typeof(ForwardDiff.jacobian)}}(MinimallyDisruptiveCurves.var"#new_cost#35"{typeof(C),TransformationStructure{typeof(T),typeof(Tinv)}}(C, TransformationStructure{typeof(T),typeof(Tinv)}("simple_transform", T, Tinv)), MinimallyDisruptiveCurves.var"#new_cost2#36"{typeof(C),TransformationStructure{typeof(T),typeof(Tinv)},typeof(ForwardDiff.jacobian)}(C, TransformationStructure{typeof(T),typeof(Tinv)}("simple_transform", T, Tinv), ForwardDiff.jacobian)), [6.0, 7.0, 8.0])

In [10]:
@show θnew
@show D(θnew, g)
@show g;

θnew = [6.0, 7.0, 8.0]
D(θnew, g) = 0.0
g = [0.0, 0.0, 0.0]
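
As a sanity check on this output, the shift transformation can be worked through by hand. Writing $\phi = T(\theta) = \theta + 1$ for the new coordinates (the symbol $\phi$ is introduced here for illustration), the defining relation $D[T(\theta)] = C(\theta)$ gives

$$ D(\phi) = C(\phi - 1) = \sum_i (\phi_i - 1 - \theta_{0,i})^2, \qquad \nabla D(\phi) = 2(\phi - 1 - \theta_0). $$

At $\phi = \theta_0 + 1 = (6, 7, 8)$, which is `θnew`, both the cost and the gradient vanish, in agreement with the values shown above.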


Done! We have also included some ready-made `TransformationStructure`s to get you started:

In [14]:
LA  = logabs_transform(θ₀);

This one maps each parameter to the logarithm of its absolute value. It is useful if you want to quantify relative (rather than absolute) changes to parameters.
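
The link between logarithms and relative changes can be illustrated with a small standalone sketch. The functions `T_logabs` and `Tinv_logabs` below are hypothetical stand-ins written for this illustration. They assume that no parameter is zero or changes sign, and that the signs recorded at the starting point are restored on the way back. How `logabs_transform` handles signs internally is not shown in this notebook, so treat that as an assumption.

```julia
# Hypothetical starting point with one negative entry, to show why signs matter.
θ_init = [5.0, -6.0, 7.0]
signs = sign.(θ_init)

T_logabs(θ) = log.(abs.(θ))
Tinv_logabs(φ) = signs .* exp.(φ)    # restore the signs recorded at θ_init

φ_init = T_logabs(θ_init)
θ_doubled = Tinv_logabs(φ_init .+ log(2))   # the same step of log(2) in every coordinate
@show θ_doubled ./ θ_init;
```

An equal step in each transformed coordinate multiplies every original parameter by the same factor, regardless of its magnitude, which is why distances in the transformed space measure relative change.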

In [17]:
idxs = [1,3]
biases = [400.,1/400]
BA = bias_transform(θnew, idxs, biases );

This one applies the transformation
$$ \theta_i \to b_i \theta_i $$
for each index $i$ in `idxs`, where $b_i$ is the corresponding entry of `biases`.
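
The effect of such a scaling on the metric can be sketched without the package. `T_bias`, `Tinv_bias`, `dist`, and the variable names below are hypothetical and written only for this illustration; they implement the scaling formula directly and are not the package's `bias_transform`.

```julia
bias_idxs = [1, 3]
bias_vals = [400.0, 1 / 400]

# Scale the selected entries by their biases; leave the others untouched.
function T_bias(θ)
    φ = copy(θ)
    φ[bias_idxs] .*= bias_vals
    return φ
end

# Undo the scaling.
function Tinv_bias(φ)
    θ = copy(φ)
    θ[bias_idxs] ./= bias_vals
    return θ
end

θ_a = [5.0, 6.0, 7.0]
θ_b = θ_a .+ [1.0, 0.0, 0.0]   # unit step along parameter 1
θ_c = θ_a .+ [0.0, 0.0, 1.0]   # unit step along parameter 3

dist(x, y) = sqrt(sum(abs2, x .- y))   # Euclidean distance
@show dist(T_bias(θ_a), T_bias(θ_b))   # step along parameter 1, measured in the new coordinates
@show dist(T_bias(θ_a), T_bias(θ_c));  # step along parameter 3, measured in the new coordinates
```

The same unit step in the original parameters is stretched or shrunk by the bias when measured in the transformed space, which is how the scaling changes each parameter's influence on the metric.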

In [20]:
FA = fix_params(θ₀, idxs);          # fix params[idxs]; all other parameters stay free
OFA = only_free_params(θ₀, idxs);   # the opposite: only params[idxs] stay free, all others are fixed
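
Conceptually, fixing parameters means the curve only sees a shorter vector of free parameters, while the cost function still receives the full vector. A minimal standalone sketch of that idea, with hypothetical names (`T_fix`, `Tinv_fix`, `θ_full`) that are not part of the package and with no claim about how `fix_params` is implemented:

```julia
θ_full = [5.0, 6.0, 7.0]
fixed_idxs = [1, 3]
free_idxs = setdiff(eachindex(θ_full), fixed_idxs)   # indices the curve may change

# Forward map: keep only the free entries.
T_fix(θ) = θ[free_idxs]

# Backward map: rebuild the full vector, holding fixed entries at their values in θ_full.
function Tinv_fix(φ)
    θ = copy(θ_full)
    θ[free_idxs] = φ
    return θ
end

@show T_fix(θ_full)
@show Tinv_fix([10.0]);
```

Note that `Tinv_fix(T_fix(θ))` recovers `θ` only when the fixed entries of `θ` equal those stored in `θ_full`, which is presumably why these constructors take the initial parameters as an argument.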

Any of these transformations can be composed and applied to any cost function compatible with MinimallyDisruptiveCurves.jl. And, as we showed, you can make your own.
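
At the level of plain functions, composing two transformations means applying the forward maps in order and the inverse maps in reverse order. A minimal sketch using Julia's function composition operator `∘` follows; all names are hypothetical and chosen for illustration.

```julia
T_shift(θ) = θ .+ 1
Tinv_shift(φ) = φ .- 1

T_log(θ) = log.(θ)        # assumes strictly positive inputs
Tinv_log(φ) = exp.(φ)

# Shift first, then take logs; invert by undoing the log first, then the shift.
T_both = T_log ∘ T_shift
Tinv_both = Tinv_shift ∘ Tinv_log

θ_start = [5.0, 6.0, 7.0]
@show Tinv_both(T_both(θ_start)) ≈ θ_start;
```

A pair such as `T_both` and `Tinv_both` could then presumably be passed to the `TransformationStructure` constructor in the same way as `T` and `Tinv` earlier, though that usage is an assumption rather than something demonstrated in this notebook.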