`ml_derivatives` is a Python module for the computation of high-order derivatives (up to order 4 in the current implementation). Two appealing features of the module are:

1. **Automatic tuning**: 
   
    The `ml_derivatives` module automatically tunes the trade-off between the bandwidth of the time-series and the level of noise affecting the measurements. This seems to be absent from the currently available alternatives, which always come with tuning parameters that have to be provided by the user.
   
1. **Confidence interval computation**:  

    The `ml_derivatives` module (optionally) provides an estimate of a confidence interval that contains the true values of the derivatives with high probability. This can be of great value when the estimated derivatives are used to build a model involving high-order derivatives. Indeed, regions where the confidence interval is large can be de-selected so that they are excluded from the so-called *training dataset*.
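
As a minimal sketch of the de-selection idea mentioned above, the snippet below keeps only the samples whose confidence interval is narrow relative to the overall scale of the estimate. The arrays `d_hat` and `ci_half_width`, the function name, and the threshold rule are all illustrative assumptions; they are not outputs or options of `ml_derivatives`.

```python
import numpy as np

def select_reliable_samples(d_hat, ci_half_width, rel_tol=0.2):
    """Return a boolean mask of samples whose confidence interval is narrow.

    d_hat         : estimated derivative values (hypothetical input)
    ci_half_width : half-width of the confidence interval at each sample
                    (hypothetical input)
    rel_tol       : maximum accepted half-width, expressed as a fraction of
                    the typical magnitude of d_hat (assumed rule)
    """
    d_hat = np.asarray(d_hat, dtype=float)
    ci_half_width = np.asarray(ci_half_width, dtype=float)
    # Use the median absolute value as a robust scale of the estimate.
    scale = np.median(np.abs(d_hat))
    return ci_half_width <= rel_tol * scale

# Toy data: the third and fifth samples have a wide confidence interval.
d_hat = np.array([1.0, -2.0, 1.5, -1.0, 2.0])
ci_half_width = np.array([0.1, 0.2, 1.0, 0.1, 0.9])

mask = select_reliable_samples(d_hat, ci_half_width)
training_values = d_hat[mask]   # samples kept for the training dataset
print(mask)
print(training_values)
```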

::: {.callout-tip collapse="true" title='Typical comparison results with available alternatives (unfold to see)'}

![Typical comparison results with available alternatives](images/comparison.png){fig-align="center" width="80%"}

:::

::: {.callout-tip collapse="true" title='Example of derivatives reconstruction with confidence intervals'}

::: {.panel-tabset}

### Noise 5% / 1.94Hz


![Example of derivatives reconstruction with confidence intervals](images/derivatives_reconstruction_1.png){fig-align="center" width="80%"}


### Noise 5% / 12.54Hz

![Example of derivatives reconstruction with confidence intervals](images/derivatives_reconstruction_2.png){fig-align="center" width="80%"}

### Noise 10% / 4.22Hz

![Example of derivatives reconstruction with confidence intervals](images/derivatives_reconstruction_3.png){fig-align="center" width="80%"}

:::

:::

For a complete description of the algorithm, see the reference paper [cited below](#citing).

Here, only the main elements are briefly explained for an easy and quick use of the module.

## Installation 
```default
pip install ml_derivatives
```

## Problem statement {#problem}
We are given a sequence `yn` of `N` values representing noisy measurements of a physical signal `y`, acquired with a **fixed acquisition period** `dt` over a time interval $I$. We want to compute an estimate of the `d`-th derivative of the underlying signal over the same time interval, namely,

$$
\dfrac{d^d}{dt^d}\Bigl[y\Bigr]\Biggl\vert_I = \texttt{estimate\_derivative(yn, dt, d, ...)}
$${#eq-problem}
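
To illustrate why this problem is delicate, the sketch below applies plain finite differences (`numpy.gradient`) to a noisy sine wave. This is a naive baseline for comparison only, not the method used by `ml_derivatives`; the test signal, the noise level, and all variable names are assumptions chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N, dt, omega = 5001, 0.02, 5.0
t = dt * np.arange(N)
y = np.sin(omega * t)                           # noise-free signal
yn = (1 + 0.05 * rng.standard_normal(N)) * y    # 5% multiplicative noise

# Exact derivatives of the noise-free signal, used as reference.
d1_true = omega * np.cos(omega * t)
d2_true = -omega**2 * np.sin(omega * t)

# Naive estimates: repeated finite differences on the noisy data.
d1_naive = np.gradient(yn, dt)
d2_naive = np.gradient(d1_naive, dt)

def rms(v):
    """Root-mean-square value of an array."""
    return float(np.sqrt(np.mean(v**2)))

err_y = rms(yn - y)
err_d1 = rms(d1_naive - d1_true)
err_d2 = rms(d2_naive - d2_true)

print(f"RMS error on y    : {err_y:.3g}")
print(f"RMS error on dy   : {err_d1:.3g}")
print(f"RMS error on d2y  : {err_d2:.3g}")
```

Each differentiation divides the noise by a quantity of the order of `dt`, so the error grows quickly with the derivative order. This is the difficulty that a dedicated estimator has to overcome.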

### Limitations {#limitations}

::: {.callout-warning title="Minimum number of `N`=50 points required."} 

As the derivatives estimator involves an estimation of the bandwidth of the signal, and since the precision of this estimation degrades when the number of points is small, the current implementation of the module requires a minimum number of points `N`=50.

:::


::: {.callout-warning title="Maximum bandwidth for a given `dt`."} 

Obviously, given the acquisition period `dt`, there is a limitation on the bandwidth of the signal for which derivatives can be computed. For this reason, it is not advisable to attempt estimation beyond the following maximum pulsation: 
$$
\omega_\text{max}= \dfrac{2\pi}{5\times \texttt{dt}}
$$

:::
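
As a quick numerical check of the bound above, the following sketch evaluates the maximum pulsation for a given acquisition period. The helper name `max_pulsation` is hypothetical and is not part of the module.

```python
import math

def max_pulsation(dt):
    """Maximum advisable pulsation (rad/s) for an acquisition period dt (s)."""
    return 2 * math.pi / (5 * dt)

dt = 0.02
omega_max = max_pulsation(dt)
f_max = omega_max / (2 * math.pi)   # same bound expressed in Hz: 1 / (5 * dt)

print(f"omega_max = {omega_max:.2f} rad/s, f_max = {f_max:.2f} Hz")
# omega_max = 62.83 rad/s, f_max = 10.00 Hz
```

In other words, the bound asks for at least five samples per period of the fastest component of the signal.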

::: {.callout-warning title="`ml_derivatives` is slower than standard filters although incomparably more precise"} 

It is important to underline that the counterpart of the above nice features is a processing time that is higher than that of a point-wise filter, although it remains reasonably small. See below for more details regarding standard computation times using the `ml_derivatives` module.
:::

::: {.callout-warning title='The length of the time-series is limited to 10000'}

Since the estimation is based on some stored information, the length of the sequence `yn` used as input argument in (@eq-problem) is limited to 10000 for the sake of memory. If derivatives of a longer sequence are required, please decompose the signal into multiple segments.
:::
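
One possible way to decompose a long signal is sketched below. The helper `split_into_segments` is hypothetical (it is not provided by `ml_derivatives`); it only uses the two limits stated in this section, namely at most 10000 and at least 50 points per sequence.

```python
import math
import numpy as np

MAX_LEN = 10_000   # maximum series length stated above
MIN_LEN = 50       # minimum number of points stated in the limitations

def split_into_segments(yn, max_len=MAX_LEN, min_len=MIN_LEN):
    """Split yn into contiguous segments of nearly equal length <= max_len."""
    yn = np.asarray(yn)
    if len(yn) < min_len:
        raise ValueError(f"at least {min_len} points are required")
    n_segments = math.ceil(len(yn) / max_len)
    # array_split balances the lengths, which avoids a very short last segment.
    return np.array_split(yn, n_segments)

yn_long = np.random.randn(25_001)
segments = split_into_segments(yn_long)
print([len(s) for s in segments])
# [8334, 8334, 8333]
```

Each segment can then be processed separately and the results concatenated. Estimates close to the segment boundaries may deserve extra care; using overlapping segments is one possible refinement that is not shown here.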

```python
import numpy as np
from ml_derivatives.mld import Derivator
import plotly.graph_objects as go
from plotly.subplots import make_subplots

nt = 5001
omega = 5.0
dt = 0.02

der = Derivator()
t, Y = der.generate_a_time_series(nt, omega, dt)
yn = (1+0.05 * np.random.randn(len(Y[0]))) * Y[0]
```

:::{.embed}
<iframe src="images/generated_plots.html" width="100%" height="500"></iframe>
:::


## The `solve` method {#solve}

The `solve` method is an instance method of the class `Pol` that minimizes, maximizes, or finds a root of the polynomial instance calling it.

The call for the `solve` method takes the following form:

```python

solution, cpu = pol.solve(x0,...)   

# see the list of arguments below
# with their default values if any.
```

### Input arguments 
The table below describes the input arguments of the `solve` method.

:::{.tbl-caption}
#### Input arguments of the `solve` method.
| **Parameter**     | **Description**      | **Default**|
|---|---------------|----:|
| `x0` |  The initial guess for the solver. This is a vector of dimension `nx`. Notice that when several starting points are used (`Ntrials`>1 as explained below), the next initial guesses are randomly sampled in the admissible domain defined by `xmin` and `xmax`. | --|
| `xmin`| The vector of lower bounds of the decision variables| -- |
| `xmax`| The vector of upper bounds of the decision variables| -- |
| `Ntrials`| The number of different starting points used in order to enhance the avoidance of local minima.| 1 |

Table: Input arguments for the `solve` method of the class `Pol`.
:::
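
The multi-start behaviour described for `x0` and `Ntrials` can be pictured with the following sketch. It is only an illustration of the idea: the uniform sampling and the helper name `initial_guesses` are assumptions, not the documented internals of `solve`.

```python
import numpy as np

def initial_guesses(x0, xmin, xmax, Ntrials=1, seed=None):
    """Return an (Ntrials, nx) array of starting points.

    The first row is the user-provided x0. The remaining rows are drawn
    at random in the box [xmin, xmax] (uniform sampling is an assumption).
    """
    rng = np.random.default_rng(seed)
    x0, xmin, xmax = (np.asarray(v, dtype=float) for v in (x0, xmin, xmax))
    extra = rng.uniform(xmin, xmax, size=(Ntrials - 1, x0.size))
    return np.vstack([x0[None, :], extra])

nx = 3
x0 = np.zeros(nx)
xmin = -1 * np.ones(nx)
xmax = 2 * np.ones(nx)

guesses = initial_guesses(x0, xmin, xmax, Ntrials=6, seed=0)
print(guesses.shape)
```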

### Output arguments 

:::{.tbl-caption}
#### Output arguments of the `solve` method.
| **Parameter** | **Description** |
|---|----------------|
| `solution`| A Python `namedtuple` object containing the solution provided by the `solve` method. It has the following fields: <br> <br> - `x`: the best solution found <br> - `f`: the corresponding best value <br> <br> Therefore, the best solution and the best value can be obtained through `solution.x` and `solution.f`.|
| `cpu` | The computation time needed to perform the computations. |

:::
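
For readers unfamiliar with `namedtuple`, the short sketch below shows how such an object is accessed. The container name `Solution` is hypothetical; only the field names `x` and `f` come from the table above, and the values are copied from the maximization example of this page.

```python
from collections import namedtuple

# Hypothetical container: only the fields `x` and `f` are documented above.
Solution = namedtuple("Solution", ["x", "f"])

solution = Solution(x=[-1.0, 2.0, 0.0], f=16.0)

print(solution.x)            # access by field name
print(solution.f)
x_best, f_best = solution    # a namedtuple also unpacks like a regular tuple
```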

## Examples of use {#example}

The following script gives an example of a call that asks for the maximization of the polynomial defined earlier, then prints the results so obtained:

```python
nx = 3
x0 = np.zeros(nx)
ntrials = 6
ngrid = 1000
xmin = -1*np.ones(nx)
xmax = 2*np.ones(nx)

solution, cpu = pol.solve(x0=x0, 
                          xmin=xmin, 
                          xmax=xmax, 
                          ngrid=ngrid, 
                          Ntrials=ntrials, 
                          psi=lambda v:-v
                          )
                          
print(f'xopt = {solution.x}')
print(f'fopt = {solution.f}')
print(f'computation time = {cpu}')

>> xopt = [-1.  2.  0.]
>> fopt = 16.0
>> computation time = 0.0046999454498291016
```

Changing the argument `psi` to `psi=lambda v:abs(v)` asks the solver to find a root of the polynomial and hence leads to the following results:

```python
>> xopt = [-0.996997    0.58858859  0.63963964]
>> fopt = -9.305087356087371e-05
>> computation time = 0.003011941909790039
```

Finally, using the default definition of `psi` makes `solve` look for a minimum of the polynomial, leading to:

```python 
>> xopt = [-1. -1.  2.]
>> fopt = -6.0
>> computation time = 0.005150318145751953
```
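
The three calls above differ only in `psi`, which can be read as a map applied to the polynomial value before minimization. The sketch below illustrates this reading with a brute-force search on a one-dimensional grid. It is a toy under stated assumptions, not the algorithm implemented by `solve`; the function `best_on_grid` and the test polynomial are hypothetical.

```python
import numpy as np

def best_on_grid(p, grid, psi=lambda v: v):
    """Return the grid point minimizing psi(p(x)) and the value p(x) there."""
    values = p(grid)
    i = int(np.argmin(psi(values)))
    return grid[i], values[i]

p = lambda x: x**2 - 1                 # toy polynomial with roots at -1 and +1
grid = np.linspace(-2.0, 2.0, 401)

x_min, f_min = best_on_grid(p, grid)                             # minimize p
x_max, f_max = best_on_grid(p, grid, psi=lambda v: -v)           # maximize p
x_root, f_root = best_on_grid(p, grid, psi=lambda v: abs(v))     # root of p

print(f"minimum : x = {x_min:.2f}, p(x) = {f_min:.2f}")
print(f"maximum : x = {x_max:.2f}, p(x) = {f_max:.2f}")
print(f"root    : x = {x_root:.2f}, p(x) = {f_root:.2e}")
```

As in the outputs shown above, the reported value is that of the polynomial itself and not of `psi` applied to it.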

## Citing ml_derivatives {#citing}


```bibtex
@misc{alamir2025reconstructinghighderivativesnoisy,
      title={On reconstructing high derivatives of noisy time-series with confidence intervals}, 
      author={Mazen Alamir},
      year={2025},
      eprint={2503.05222},
      archivePrefix={arXiv},
      primaryClass={eess.SY},
      url={https://arxiv.org/abs/2503.05222}, 
}
```

::: {.callout-note}
The above reference contains a detailed description of the algorithm together with an extensive comparison with the best available alternatives. It also explains the general targeted scope of the module, which mainly focuses on extracting high-order derivatives from noisy time-series with the aim of building learning datasets that contain *virtual sensor*-like columns representing derivatives of different orders of the raw columns coming from the ground measurements.
:::