Commit

Merge branch 'master' into fix/unit8co#1101
hrzn committed Aug 7, 2022
2 parents 6f989af + 6ec5e72 commit 4f53c86
Showing 3 changed files with 32 additions and 20 deletions.
42 changes: 27 additions & 15 deletions darts/dataprocessing/dtw/dtw.py
@@ -2,9 +2,12 @@
 from typing import Callable, Union

 import numpy as np
+import pandas as pd
+import xarray as xr

 from darts import TimeSeries
 from darts.logging import get_logger, raise_if, raise_if_not
+from darts.timeseries import DIMS

 from .cost_matrix import CostMatrix
 from .window import CRWindow, NoWindow, Window
@@ -203,25 +206,34 @@ def warped(self) -> (TimeSeries, TimeSeries):
             Two new TimeSeries instances of the same length, indexed by pd.RangeIndex.
         """

-        series1 = self.series1
-        series2 = self.series2
-
-        xa1 = series1.data_array(copy=False)
-        xa2 = series2.data_array(copy=False)
-
+        xa1 = self.series1.data_array(copy=False)
+        xa2 = self.series2.data_array(copy=False)
         path = self.path()

-        warped_series1 = xa1[path[:, 0]]
-        warped_series2 = xa2[path[:, 1]]
-
-        time_dim1 = series1._time_dim
-        time_dim2 = series2._time_dim
+        values1, values2 = xa1.values[path[:, 0]], xa2.values[path[:, 1]]
+
+        # We set a RangeIndex for both series:
+        warped_series1 = xr.DataArray(
+            data=values1,
+            dims=xa1.dims,
+            coords={
+                self.series1._time_dim: pd.RangeIndex(values1.shape[0]),
+                DIMS[1]: xa1.coords[DIMS[1]],
+            },
+            attrs=xa1.attrs,
+        )

-        range_index = True
+        warped_series2 = xr.DataArray(
+            data=values2,
+            dims=xa2.dims,
+            coords={
+                self.series2._time_dim: pd.RangeIndex(values2.shape[0]),
+                DIMS[1]: xa2.coords[DIMS[1]],
+            },
+            attrs=xa2.attrs,
+        )

-        if range_index:
-            warped_series1 = warped_series1.reset_index(dims_or_levels=time_dim1)
-            warped_series2 = warped_series2.reset_index(dims_or_levels=time_dim2)
+        time_dim1, time_dim2 = self.series1._time_dim, self.series2._time_dim

         # todo: prevent time information being lost after warping
         # Applying time index from series1 to series2 (take_dates = True) is disabled for consistency reasons
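The change to `warped()` above indexes the raw values along the warp path and then builds a new `DataArray` whose time coordinate is a fresh `pd.RangeIndex`. The following is a minimal standalone sketch of that idea outside Darts. The dimension names, the toy array, and the hand-written `path_idx` are assumptions made for illustration; in the library the path comes from `self.path()` and the component dimension name from `DIMS[1]`.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical dimension names chosen for this sketch.
TIME_DIM, COMP_DIM = "time", "component"

# Toy series with 4 time steps, 1 component and 1 sample, indexed by dates.
xa = xr.DataArray(
    data=np.arange(4.0).reshape(4, 1, 1),
    dims=(TIME_DIM, COMP_DIM, "sample"),
    coords={
        TIME_DIM: pd.date_range("2022-01-01", periods=4, freq="D"),
        COMP_DIM: pd.Index(["value"]),
    },
)

# Hand-written column of a warp path: time step 1 is visited twice.
path_idx = np.array([0, 1, 1, 2, 3])

# Index the raw NumPy values along the path, then attach a RangeIndex
# in place of the original dates.
values = xa.values[path_idx]
warped = xr.DataArray(
    data=values,
    dims=xa.dims,
    coords={
        TIME_DIM: pd.RangeIndex(values.shape[0]),
        COMP_DIM: xa.coords[COMP_DIM],
    },
    attrs=xa.attrs,
)
```

Because a warp path can repeat time steps, the original dates would no longer form a valid, strictly increasing time index; rebuilding the array with a `RangeIndex` sidesteps that, at the cost of dropping the time information (as the `todo` comment in the diff notes).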
8 changes: 4 additions & 4 deletions docs/userguide/timeseries.md
@@ -33,7 +33,7 @@ In addition, some models can work on *multiple time series*, meaning that they c

 * **Example of a multivariate series:** The blood pressure and heart rate of a single patient over time (one multivariate series with 2 components).

-* **Example of multiple series:** The blood pressure and heart rate of multiple patients; potentially measured at different times for different patients (one univariate series per patient).
+* **Example of multiple series:** The blood pressure and heart rate of multiple patients; potentially measured at different times for different patients (one multivariate series with 2 components per patient).


 ### Should I use a multivariate series or multiple series for my problem?
@@ -50,9 +50,9 @@ In Darts, probabilistic forecasts are represented by drawing Monte Carlo samples
 ## Creating `TimeSeries`
 `TimeSeries` objects can be created using factory methods, for example:

-* [TimeSeries.from_dataframe()](https://unit8co.github.io/darts/generated_api/darts.timeseries.html#darts.timeseries.TimeSeries.from_dataframe) can create `TimeSeries` from a Pandas Dataframe having one or several columns representing values (several columns would correspond to a multivariate series).
+* [TimeSeries.from_dataframe()](https://unit8co.github.io/darts/generated_api/darts.timeseries.html#darts.timeseries.TimeSeries.from_dataframe) can create `TimeSeries` from a Pandas Dataframe having one or several columns representing values (columns correspond to components, and several columns would correspond to a multivariate series).

-* [TimeSeries.from_values()](https://unit8co.github.io/darts/generated_api/darts.timeseries.html#darts.timeseries.TimeSeries.from_values) can create `TimeSeries` from a 2-D or 3-D NumPy array. It will generate an integer-based time index (of type `pandas.RangeIndex`). 2-D corresponds to deterministic (potentially multivariate) series, and 3-D to stochastic series.
+* [TimeSeries.from_values()](https://unit8co.github.io/darts/generated_api/darts.timeseries.html#darts.timeseries.TimeSeries.from_values) can create `TimeSeries` from a 1-D, 2-D or 3-D NumPy array. It will generate an integer-based time index (of type `pandas.RangeIndex`). 1-D corresponds to univariate deterministic series, 2-D to multivariate deterministic series, and 3-D to multivariate stochastic series.

 * [TimeSeries.from_times_and_values()](https://unit8co.github.io/darts/generated_api/darts.timeseries.html#darts.timeseries.TimeSeries.from_times_and_values) is similar to `TimeSeries.from_values()` but also accepts a time index.
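
As a rough illustration of how the 1-D, 2-D and 3-D inputs mentioned for `from_values()` relate to the *(time, component, sample)* layout this guide describes, here is a NumPy-only sketch. The helper `to_time_component_sample` is hypothetical and is not how Darts implements `from_values()`; it only shows the shape convention, assuming missing trailing axes are treated as size 1.

```python
import numpy as np


def to_time_component_sample(values: np.ndarray) -> np.ndarray:
    """Hypothetical helper: pad an array with trailing size-1 axes up to 3-D."""
    if values.ndim < 1 or values.ndim > 3:
        raise ValueError("expected a 1-D, 2-D or 3-D array")
    # Append singleton axes until the array has exactly three dimensions.
    return values.reshape(values.shape + (1,) * (3 - values.ndim))


# 10 time steps, one component, one sample: univariate and deterministic.
univariate = to_time_component_sample(np.zeros(10))
# 10 time steps, two components, one sample: multivariate and deterministic.
multivariate = to_time_component_sample(np.zeros((10, 2)))
# 10 time steps, two components, 100 samples: multivariate and stochastic.
stochastic = to_time_component_sample(np.zeros((10, 2, 100)))
```

Under this convention, a component axis larger than 1 marks a multivariate series and a sample axis larger than 1 marks a stochastic one.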

@@ -67,7 +67,7 @@ my_multivariate_series = concatenate([series1, series2, ...], axis=1)
 produces a multivariate series from some series that share the same time axis.

 ## Implementation
-Behind the scenes, `TimeSeries` is wrapping around a 3-dimensional `xarray.DataArray` object. The dimensions are *(time, component, sample)*, where the size of the *component* dimension is larger than 1 for multivariate series and the size of the *sample* dimension is larger than 1 for stochastic series. The `DataArray` is itself backed by a a 3-dimensional NumPy array, and it has a time index (either `pandas.DatetimeIndex` or `pandas.RangeIndex`) on the *time* dimension and another `pandas.Index` on the *component* (or "columns") dimension. `TimeSeries` is intended to be immutable.
+Behind the scenes, `TimeSeries` is wrapping around a 3-dimensional `xarray.DataArray` object. The dimensions are *(time, component, sample)*, where the size of the *component* dimension is larger than 1 for multivariate series and the size of the *sample* dimension is larger than 1 for stochastic series. The `DataArray` is itself backed by a 3-dimensional NumPy array, and it has a time index (either `pandas.DatetimeIndex` or `pandas.RangeIndex`) on the *time* dimension and another `pandas.Index` on the *component* (or "columns") dimension. `TimeSeries` is intended to be immutable and most operations return new `TimeSeries` objects.

 ## Exporting data from a `TimeSeries`
 `TimeSeries` objects offer a few ways to export the data, for example:
2 changes: 1 addition & 1 deletion requirements/core.txt
@@ -12,7 +12,7 @@ prophet>=1.1
 requests>=2.22.0
 scikit-learn>=1.0.1
 scipy>=1.3.2
-statsforecast>=0.5.2
+statsforecast==0.6.0
 statsmodels>=0.13.0
 tbats>=1.1.0
 tqdm>=4.60.0
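The `requirements/core.txt` change replaces a lower bound (`>=0.5.2`) with an exact pin (`==0.6.0`). The sketch below illustrates the practical difference with a deliberately simplified matcher. The helpers `parse` and `satisfies` and the candidate version strings are hypothetical, and the comparison only handles plain dotted integer versions; it does not implement the full rules pip applies to version specifiers.

```python
def parse(version: str) -> tuple:
    """Turn a plain dotted version such as '0.6.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed: str, spec: str) -> bool:
    """Check a single '>=' or '==' requirement against an installed version."""
    if spec.startswith(">="):
        return parse(installed) >= parse(spec[2:])
    if spec.startswith("=="):
        return parse(installed) == parse(spec[2:])
    raise ValueError(f"unsupported specifier: {spec}")


# Illustrative candidate versions, not a list of actual releases.
candidates = ["0.5.2", "0.6.0", "0.7.0"]

# The old lower bound accepts every candidate at or above 0.5.2.
old_matches = [v for v in candidates if satisfies(v, ">=0.5.2")]
# The new exact pin accepts only 0.6.0.
new_matches = [v for v in candidates if satisfies(v, "==0.6.0")]
```

An exact pin trades automatic upgrades for reproducibility: installs resolve to the one pinned version until the pin is changed by hand.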
