In [1]:
import sys
sys.path.append('..')
from lagmat import val_to_cont, cont_to_val, Cont

import numpy as np

Generate some random positive values $x>0$.
Assume `x` is a time series with 3 variables (columns) and 7 observations (rows).
The oldest observation is in the first row.

In [2]:
x = np.random.normal(size=(7,3)) * 5 + 75
x

array([[78.0877671 , 79.30726046, 85.34555657],
       [74.90328579, 79.54796297, 81.53223818],
       [75.653042  , 77.70831033, 77.36530076],
       [76.49884867, 77.93786108, 85.03403969],
       [72.65049201, 75.53753939, 77.4555681 ],
       [70.34303524, 69.40404179, 73.71343856],
       [73.56140499, 74.38698975, 71.91527053]])

## To Continuous Returns

The continuous return, also called the (exponential) growth rate or log difference, between $x_t$ and the previous observation $x_{t-1}$ is

$$
r_t
= \log\left(x_t\right) - \log\left(x_{t-1}\right) 
= \log\left(\frac{x_t}{x_{t-1}}\right)
$$
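
The formula above can be checked by hand with plain NumPy. The following is a minimal sketch, not the `lagmat` implementation; the array `x_demo` and the choice to pad the first row with `NaN` are assumptions made for illustration (the first observation has no predecessor).

```python
import numpy as np

# Hypothetical toy data: 3 observations (rows), 2 variables (columns)
x_demo = np.array([[100.0, 50.0],
                   [110.0, 45.0],
                   [121.0, 45.0]])

# r_t = log(x_t) - log(x_{t-1}), computed column by column
log_x = np.log(x_demo)
r_demo = np.full_like(log_x, np.nan)   # first row has no previous observation
r_demo[1:] = np.diff(log_x, axis=0)    # log differences along the time axis
print(r_demo)
```

The same numbers result from `np.log(x_demo[1:] / x_demo[:-1])`, which is the ratio form of the formula.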


In [3]:
ret = val_to_cont(x)
ret

array([[        nan,         nan,         nan],
       [-0.04163565,  0.00303047, -0.04570988],
       [ 0.00995989, -0.02339794, -0.05246013],
       [ 0.01111804,  0.00294965,  0.09451327],
       [-0.05161553, -0.03128211, -0.09334719],
       [-0.03227639, -0.08468464, -0.04951933],
       [ 0.04473672,  0.06933595, -0.0246965 ]])

## Compounding
The relation between two observations is

$$
x_t = x_{t-1} \, e^{r_t}
$$

or, applying this relation recursively over multiple time steps,

$$
x_T = x_0 \prod_{t=1}^T e^{r_t} = x_0 \, \exp\left(\sum_{t=1}^T r_t\right)
$$
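
Compounding over several steps amounts to a cumulative sum of the log returns followed by an exponential, scaled by the starting value. The sketch below illustrates this with plain NumPy; the names `x0`, `r`, `levels` and `path` and the numbers are assumptions for illustration, not part of `lagmat`.

```python
import numpy as np

# Hypothetical starting values and log returns (3 steps, 2 variables)
x0 = np.array([100.0, 50.0])
r = np.array([[ 0.10, -0.05],
              [ 0.02,  0.00],
              [-0.03,  0.04]])

# x_T = x_0 * exp(r_1 + ... + r_T) for every T = 1, 2, 3
levels = x0 * np.exp(np.cumsum(r, axis=0))

# Prepend x_0 to obtain the full path of values
path = np.vstack([x0, levels])
print(path)
```

Each row of `path` divided by the previous row equals `np.exp(r)`, which is the one-step relation $x_t = x_{t-1} \, e^{r_t}$.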

The user must provide the initial values $x_0$.
You could use the first row `x[0,:]` from the original dataset.

In [4]:
cont_to_val(ret, initial=x[0,:])

array([[78.0877671 , 79.30726046, 85.34555657],
       [74.90328579, 79.54796297, 81.53223818],
       [75.653042  , 77.70831033, 77.36530076],
       [76.49884867, 77.93786108, 85.03403969],
       [72.65049201, 75.53753939, 77.4555681 ],
       [70.34303524, 69.40404179, 73.71343856],
       [73.56140499, 74.38698975, 71.91527053]])

## Index to 1
However, an initial value of 1 is often used to compare multiple time series.
This is called "indexed to 1" or "indexed to 100 percent".
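
Compounding the log returns from an initial value of 1 should give the same result as dividing every observation by the first row. The sketch below checks this with plain NumPy; `prices`, `indexed` and `rebuilt` are hypothetical names, and the seeded random data only mimics the shape of `x`.

```python
import numpy as np

rng = np.random.default_rng(0)                 # seeded for reproducibility
prices = rng.normal(size=(7, 3)) * 5 + 75      # 7 observations, 3 variables

# Direct rebasing: every column starts at 1
indexed = prices / prices[0, :]

# Same result via log returns compounded from an initial value of 1
log_ret = np.diff(np.log(prices), axis=0)
rebuilt = np.vstack([np.ones(prices.shape[1]),
                     np.exp(np.cumsum(log_ret, axis=0))])

print(np.allclose(indexed, rebuilt))
```

Multiplying `indexed` by 100 gives the "indexed to 100 percent" variant.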

In [5]:
cont_to_val(ret, initial=1)

array([[1.        , 1.        , 1.        ],
       [0.9592192 , 1.00303506, 0.95531908],
       [0.96882066, 0.97983854, 0.90649477],
       [0.97965215, 0.98273299, 0.99634993],
       [0.93036969, 0.95246689, 0.90755244],
       [0.90082016, 0.87512847, 0.86370564],
       [0.94203494, 0.93795939, 0.84263638]])

## sklearn API
* Set `Cont(initial=value)` if `inverse_transform` should always use certain initial values. You can temporarily override this behavior with `inverse_transform(initial=othervalue)`.
* If no initial values are specified, e.g. `Cont()`, then `Cont().fit(X)` will store the first row `X[0,:]` as the initial values. Again, you can temporarily override these initial values with `inverse_transform(initial=othervalue)`.
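
The rules for initial values described above can be sketched with a small stand-in class. `LogReturnSketch` is a hypothetical class written for illustration only; it is not the `lagmat` implementation, and the stored attribute name `initial_` is an assumption.

```python
import numpy as np

class LogReturnSketch:
    """Hypothetical stand-in for illustration, NOT the lagmat class."""

    def __init__(self, initial=None):
        self.initial = initial

    def fit(self, X):
        # Without user-supplied initial values, remember the first row of X
        if self.initial is None:
            self.initial_ = np.asarray(X, dtype=float)[0, :].copy()
        else:
            self.initial_ = np.asarray(self.initial, dtype=float)
        return self

    def transform(self, X):
        logx = np.log(np.asarray(X, dtype=float))
        Z = np.full_like(logx, np.nan)     # first row has no predecessor
        Z[1:] = np.diff(logx, axis=0)
        return Z

    def inverse_transform(self, Z, initial=None):
        # A per-call `initial` temporarily takes precedence over the stored one
        start = self.initial_ if initial is None else np.asarray(initial, dtype=float)
        Z = np.asarray(Z, dtype=float)
        out = np.empty_like(Z)
        out[0] = start
        out[1:] = start * np.exp(np.cumsum(Z[1:], axis=0))  # first row of Z is ignored
        return out

X_demo = np.array([[100.0, 50.0], [110.0, 45.0], [121.0, 45.0]])
model = LogReturnSketch().fit(X_demo)
Z_demo = model.transform(X_demo)
restored = model.inverse_transform(Z_demo)              # uses stored first row
rebased = model.inverse_transform(Z_demo, initial=100)  # temporary override
```

In this sketch `restored` reproduces `X_demo`, while `rebased` starts every column at 100.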

`transform`, `inverse_transform`

* It is assumed that neither `X` nor `Z` has any missing values, i.e. `NaN`.
* Suggested approaches: a) apply "previous tick" interpolation before `transform`, b) impute `0.0` before `inverse_transform`.
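
The two suggested approaches can be sketched with plain NumPy. The helper `ffill_rows` and the arrays below are hypothetical names introduced for illustration; they are not part of `lagmat`, and this helper leaves leading `NaN` values untouched.

```python
import numpy as np

def ffill_rows(a):
    """Previous-tick fill along the time axis (rows); leading NaNs stay NaN."""
    a = np.asarray(a, dtype=float)
    rows = np.arange(a.shape[0])[:, None]
    # Row index of each valid entry, 0 where the entry is missing
    last_valid = np.where(np.isnan(a), 0, rows)
    # Running maximum = index of the most recent valid row per column
    last_valid = np.maximum.accumulate(last_valid, axis=0)
    return np.take_along_axis(a, last_valid, axis=0)

# a) fill gaps in the values before computing returns
prices_gap = np.array([[100.0,   50.0],
                       [np.nan,  55.0],
                       [110.0, np.nan],
                       [121.0,   60.0]])
prices_filled = ffill_rows(prices_gap)

# b) treat a missing return as "no change" before compounding
z_gap = np.array([[0.10, np.nan],
                  [np.nan, 0.02]])
z_filled = np.where(np.isnan(z_gap), 0.0, z_gap)
```

A return of `0.0` corresponds to $e^0 = 1$, so the imputed step leaves the compounded value unchanged, which matches what a previous-tick fill does on the value side.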


In [6]:
obj = Cont()
obj.fit(x)
z = obj.transform(x)
z

array([[        nan,         nan,         nan],
       [-0.04163565,  0.00303047, -0.04570988],
       [ 0.00995989, -0.02339794, -0.05246013],
       [ 0.01111804,  0.00294965,  0.09451327],
       [-0.05161553, -0.03128211, -0.09334719],
       [-0.03227639, -0.08468464, -0.04951933],
       [ 0.04473672,  0.06933595, -0.0246965 ]])

In [7]:
obj.inverse_transform(z)

array([[78.0877671 , 79.30726046, 85.34555657],
       [74.90328579, 79.54796297, 81.53223818],
       [75.653042  , 77.70831033, 77.36530076],
       [76.49884867, 77.93786108, 85.03403969],
       [72.65049201, 75.53753939, 77.4555681 ],
       [70.34303524, 69.40404179, 73.71343856],
       [73.56140499, 74.38698975, 71.91527053]])

In [8]:
obj.inverse_transform(z, initial=100)

array([[100.        , 100.        , 100.        ],
       [ 95.92192039, 100.30350627,  95.53190752],
       [ 96.88206592,  97.98385404,  90.6494771 ],
       [ 97.96521466,  98.27329885,  99.63499343],
       [ 93.03696943,  95.24668859,  90.75524398],
       [ 90.08201649,  87.51284736,  86.37056399],
       [ 94.20349399,  93.79593914,  84.26363764]])