# The Stepsize of DSEA+

DSEA+ extends the original DSEA with an adaptively chosen step size between iterations. This tutorial assumes you are already familiar with the getting-started notebook at `doc/01-getting-started.ipynb`.

In [1]:
using CherenkovDeconvolution
using ScikitLearn, MLDataUtils, Random
using Discretizers: encode, CategoricalDiscretizer

# load the example data, encode labels with integers
X, y_labels, _ = load_iris()
y = encode(CategoricalDiscretizer(y_labels), y_labels)

# split the data into training and observed data sets
Random.seed!(42) # make split reproducible
(X_train, y_train), (X_data, y_data) = splitobs(shuffleobs((X', y), obsdim = 1), obsdim = 1)

# prepare the arguments for all deconvolution methods
@sk_import naive_bayes : GaussianNB # naive Bayes for DSEA
binning = TreeBinning(3); # up to 3 clusters, used by the RUN-based step size below

## Adaptive Step Size

The adaptive step size is specified through the `stepsize` keyword argument of DSEA. This argument expects a `CherenkovDeconvolution.Stepsize` object, for which CherenkovDeconvolution.jl provides several implementations.

The most important implementation, `RunStepsize`, uses the objective function of regularized unfolding (RUN) to determine the step size adaptively. We further specify `epsilon`, the minimum chi-square distance between the estimates of consecutive iterations. Convergence is assumed if the distance drops below this threshold.

In this example, convergence is assumed immediately because the input and output distributions of the first iteration are approximately equal.

In [2]:
stepsize = RunStepsize(binning; decay=true)
dsea = DSEA(GaussianNB(); K=100, epsilon=1e-6, stepsize=stepsize)
f_dsea = deconvolve(dsea, X_data, X_train, y_train)

┌ Info: DSEA iteration 1/100 uses alpha = 2.8284110834738245e-13 (chi2s = 2.237160223400213e-28)
└ @ CherenkovDeconvolution.Methods /home/bunse/.julia/dev/CherenkovDeconvolution/src/methods/dsea.jl:169
┌ Info: DSEA convergence assumed from chi2s = 2.237160223400213e-28 < epsilon = 1.0e-6
└ @ CherenkovDeconvolution.Methods /home/bunse/.julia/dev/CherenkovDeconvolution/src/methods/dsea.jl:174


3-element Array{Float64,1}:
 0.3333333333333333
 0.3333333333333394
 0.3333333333333272
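
The stopping rule described above can be sketched in a few lines. This is a minimal illustration that assumes one common symmetric form of the chi-square distance, 2 Σ (aᵢ - bᵢ)² / (aᵢ + bᵢ). The names `chi2_symmetric` and `has_converged` are made up for this example and are not part of CherenkovDeconvolution.jl, whose exact definition and normalization may differ.

```julia
# Illustrative only: a symmetric chi-square distance between two histograms.
# The definition used inside CherenkovDeconvolution.jl may differ.
function chi2_symmetric(a::AbstractVector{<:Real}, b::AbstractVector{<:Real})
    length(a) == length(b) || throw(DimensionMismatch("histograms differ in length"))
    d = 0.0
    for (ai, bi) in zip(a, b)
        s = ai + bi
        s > 0 && (d += (ai - bi)^2 / s) # skip bins that are empty in both histograms
    end
    return 2d
end

# Convergence is assumed once the distance drops below epsilon.
has_converged(f_prev, f_next; epsilon=1e-6) = chi2_symmetric(f_prev, f_next) < epsilon

f_prev = fill(1/3, 3) # a uniform estimate over three classes (chosen for illustration)
f_next = [0.3333333333333333, 0.3333333333333394, 0.3333333333333272] # estimate printed above
has_converged(f_prev, f_next) # true
```

Two estimates that differ only in the last few digits are far below `epsilon = 1e-6`, which is why a single iteration can already satisfy the criterion.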

## Other Step Sizes

Another adaptive step size, based on a least-square objective, is the `LsqStepsize`. Two decaying step sizes can be obtained with `ExpDecayStepsize` and `MulDecayStepsize`. There is also a `ConstantStepsize`.
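
As a rough sketch, the other step sizes can be passed to DSEA in the same way as `RunStepsize`. The constructor signatures are taken from the docstrings under Further Documentation. The parameter values (`0.5`, `K=10`) are arbitrary choices for illustration, and the snippet assumes that the setup cell `In [1]` has been run.

```julia
# Assumes the setup cell has been run, so that GaussianNB, binning,
# X_data, X_train, and y_train are defined.
stepsizes = [
    "constant"  => ConstantStepsize(0.5), # the same step size in every iteration
    "exp decay" => ExpDecayStepsize(0.5), # a * eta^(k-1) with the default a = 1.0
    "mul decay" => MulDecayStepsize(0.5), # a * k^(eta-1) with the default a = 1.0
    "lsq"       => LsqStepsize(binning),  # adaptive, least-squares objective
]

for (name, stepsize) in stepsizes
    dsea = DSEA(GaussianNB(); K=10, epsilon=1e-6, stepsize=stepsize)
    f = deconvolve(dsea, X_data, X_train, y_train)
    println(name, ": ", f)
end
```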

If you want to implement additional step size strategies, you only need to implement the `Stepsizes.value` method for a custom `Stepsize` type.
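
A custom strategy might look like the following sketch, which follows the signature `value(s, k, p, f, a)` from the `Stepsizes.value` docstring under Further Documentation. The type `HarmonicStepsize` is hypothetical and not part of the package. The concrete argument types and the module path `CherenkovDeconvolution.Stepsizes` are assumptions that may need adjusting for your installed version.

```julia
using CherenkovDeconvolution

# Hypothetical example type; not part of CherenkovDeconvolution.jl.
struct HarmonicStepsize <: CherenkovDeconvolution.Stepsize
    a::Float64 # step size of the first iteration
end

# Extend the library's `value` function so that iteration k uses a / k.
# The search direction p, the previous estimate f, and the previous step
# size a are ignored here. The argument types are an assumption.
function CherenkovDeconvolution.Stepsizes.value(
        s::HarmonicStepsize,
        k::Int,
        p::Vector{Float64},
        f::Vector{Float64},
        a::Float64
    )
    return s.a / k
end
```

An instance such as `HarmonicStepsize(1.0)` could then be passed through the `stepsize` keyword, in the same way as `RunStepsize` in cell `In [2]`.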

## Further Documentation

In [3]:
?Stepsizes.value

```
value(s, k, p, f, a)
```

Use the `Stepsize` object `s` to compute a step size for iteration number `k` with the search direction `p`, the previous estimate `f`, and the previous step size `a`.

**See also:** `ConstantStepsize`, `RunStepsize`, `LsqStepsize`, `ExpDecayStepsize`, `MulDecayStepsize`.


In [4]:
?RunStepsize

search: RunStepsize



```
RunStepsize(binning; kwargs...)
```

Adapt the step size by maximizing the likelihood of the next estimate in the search direction of the current iteration, much like in the `RUN` deconvolution method.

**Keyword arguments:**

  * `decay = false` specifies whether `a_k+1 <= a_k` is enforced so that step sizes never increase.
  * `tau = 0.0` determines the regularisation strength.
  * `warn = false` specifies whether warnings should be emitted for debugging purposes.


In [5]:
?LsqStepsize

search: LsqStepsize



```
LsqStepsize(binning; kwargs...)
```

Adapt the step size by solving a least squares objective in the search direction of the current iteration.

**Keyword arguments:**

  * `decay = false` specifies whether `a_k+1 <= a_k` is enforced so that step sizes never increase.
  * `tau = 0.0` determines the regularisation strength.
  * `warn = false` specifies whether warnings should be emitted for debugging purposes.


In [6]:
?ExpDecayStepsize

search: ExpDecayStepsize



```
ExpDecayStepsize(eta, a=1.0)
```

Reduce the first stepsize `a` by `eta` in each iteration:

```
value(ExpDecayStepsize(eta, a), k, ...) == a * eta^(k-1)
```


In [7]:
?MulDecayStepsize

search: MulDecayStepsize



```
MulDecayStepsize(eta, a=1.0)
```

Reduce the first stepsize `a` by `eta` in each iteration:

```
value(MulDecayStepsize(eta, a), k, ...) == a * k^(eta-1)
```
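
To get a feeling for how the two decay schedules differ, the formulas from the `ExpDecayStepsize` and `MulDecayStepsize` docstrings can be evaluated directly. The helper names `exp_decay` and `mul_decay` are made up for this sketch, which does not call the library itself. The value `eta = 0.5` is an arbitrary choice.

```julia
# Plain transcriptions of the two docstring formulas; illustrative helpers only.
exp_decay(eta, k; a=1.0) = a * eta^(k - 1) # ExpDecayStepsize: a * eta^(k-1)
mul_decay(eta, k; a=1.0) = a * k^(eta - 1) # MulDecayStepsize: a * k^(eta-1)

# Print the first five step sizes of both schedules for eta = 0.5.
for k in 1:5
    println("k = ", k, ": exp = ", exp_decay(0.5, k), ", mul = ", mul_decay(0.5, k))
end
```

Both schedules start at `a` for `k = 1`. With `eta = 0.5`, the exponential schedule halves the step size in every iteration, while the multiplicative schedule shrinks like `1 / sqrt(k)` and therefore decays more slowly.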


In [8]:
?ConstantStepsize

search: ConstantStepsize



```
ConstantStepsize(alpha)
```

Choose the constant step size `alpha` in every iteration.


In [9]:
?DEFAULT_STEPSIZE

search: DEFAULT_STEPSIZE



```
const DEFAULT_STEPSIZE = ConstantStepsize(1.0)
```

The default stepsize in all deconvolution methods.
