Commit

Merge branch 'issue-515' of https://github.com/tinkoff-ai/etna into issue-515
Ama16 committed Mar 9, 2022
2 parents 3375261 + 37d50e0 commit 66bd488
Showing 61 changed files with 1,947 additions and 1,322 deletions.
43 changes: 0 additions & 43 deletions .github/workflows/docker-stable.yml

This file was deleted.

34 changes: 34 additions & 0 deletions .github/workflows/publish.yml
Original file line number Diff line number Diff line change
@@ -84,3 +84,37 @@ jobs:
        env:
          NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}
          NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}

  docker-build-and-push:
    needs: publish
    runs-on: ubuntu-latest

    strategy:
      matrix:
        dockerfile:
          - {"name": etna-cpu, "path": docker/etna-cpu/Dockerfile}
          - {"name": etna-cuda-10.2, "path": docker/etna-cuda-10.2/Dockerfile}
          - {"name": etna-cuda-11.1, "path": docker/etna-cuda-11.1/Dockerfile}

    steps:
      - uses: actions/checkout@v2

      - name: Build image
        run: |
          cd $( dirname ${{ matrix.dockerfile.path }})
          VERSION=$(echo "${{ github.ref }}" | sed -e 's,.*/\(.*\),\1,')
          sed -i "s#etna\[all\]#etna\[all\]==$VERSION#g" requirements.txt
          cat requirements.txt
          docker build . --tag image
      - name: Log into registry
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin

      - name: Push image
        run: |
          IMAGE_ID=ghcr.io/${{ github.repository }}/${{ matrix.dockerfile.name }}
          VERSION=$(echo "${{ github.ref }}" | sed -e 's,.*/\(.*\),\1,')
          echo IMAGE_ID=$IMAGE_ID
          echo VERSION=$VERSION
          docker tag image $IMAGE_ID:$VERSION
          docker push $IMAGE_ID:$VERSION
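
The shell steps in the `docker-build-and-push` job derive a version from `github.ref`, pin `etna[all]` to it in `requirements.txt`, and build the image name. A minimal Python sketch of the same string handling may make the `sed` expressions easier to review. The function names and the ref `refs/tags/1.7.0` are illustrative assumptions, not part of the workflow.

```python
def version_from_ref(ref: str) -> str:
    """Keep everything after the last slash, like sed 's,.*/\\(.*\\),\\1,'."""
    return ref.rsplit("/", 1)[-1]


def pin_etna(requirements: str, version: str) -> str:
    """Pin every occurrence of etna[all] to the given version."""
    return requirements.replace("etna[all]", f"etna[all]=={version}")


def image_reference(repository: str, name: str, version: str) -> str:
    """Build the ghcr.io reference that the push step tags and pushes."""
    return f"ghcr.io/{repository}/{name}:{version}"


# Hypothetical inputs: a tag ref and a two-line requirements file.
version = version_from_ref("refs/tags/1.7.0")
pinned = pin_etna("etna[all]\nnumpy\n", version)
image = image_reference("tinkoff-ai/etna", "etna-cpu", version)
```

Note that for a ref without a slash both the `sed` expression and `rsplit` leave the string unchanged, so the job presumably relies on being triggered from a tag or release ref.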
20 changes: 17 additions & 3 deletions CHANGELOG.md
@@ -16,10 +16,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add plot_time_series_with_change_points function ([#534](https://github.com/tinkoff-ai/etna/pull/534))
- Add plot_trend ([#565](https://github.com/tinkoff-ai/etna/pull/565))
- Add find_change_points function ([#521](https://github.com/tinkoff-ai/etna/pull/521))
-
- Add option `day_number_in_year` to DateFlagsTransform ([#552](https://github.com/tinkoff-ai/etna/pull/552))
- Add plot_residuals ([#539](https://github.com/tinkoff-ai/etna/pull/539))
-
- Create `PerSegmentBaseModel`, `PerSegmentPredictionIntervalModel` ([#537](https://github.com/tinkoff-ai/etna/pull/537))
- Create `MultiSegmentModel` ([#551](https://github.com/tinkoff-ai/etna/pull/551))
- Create `EnsembleMixin` ([#574](https://github.com/tinkoff-ai/etna/pull/574))
-
- Add option `season_number` to DateFlagsTransform ([#567](https://github.com/tinkoff-ai/etna/pull/567))
-
- Add stl_plot ([#575](https://github.com/tinkoff-ai/etna/pull/575))
- Add community section to README.md ([#580](https://github.com/tinkoff-ai/etna/pull/580))
- Create `AbstractPipeline` ([#573](https://github.com/tinkoff-ai/etna/pull/573))
-
### Changed
- Change the way `ProphetModel` works with regressors ([#383](https://github.com/tinkoff-ai/etna/pull/383))
@@ -34,16 +42,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Update CONTRIBUTING.md ([#536](https://github.com/tinkoff-ai/etna/pull/536))
-
- Rename `_CatBoostModel`, `_HoltWintersModel`, `_SklearnModel` ([#543](https://github.com/tinkoff-ai/etna/pull/543))
-
- Add logging to TSDataset.make_future, log repr of transform instead of class name ([#555](https://github.com/tinkoff-ai/etna/pull/555))
- Rename `_SARIMAXModel` and `_ProphetModel`, make `SARIMAXModel` and `ProphetModel` inherit from `PerSegmentPredictionIntervalModel` ([#549](https://github.com/tinkoff-ai/etna/pull/549))
-
- Update get_started section in README ([#569](https://github.com/tinkoff-ai/etna/pull/569))
- Make detrending polynomial ([#566](https://github.com/tinkoff-ai/etna/pull/566))
- Update documentation about transforms that generate regressors, update examples with them ([#572](https://github.com/tinkoff-ai/etna/pull/572))
-
- Make `LabelEncoderTransform` and `OneHotEncoderTransform` multi-segment ([#554](https://github.com/tinkoff-ai/etna/pull/554))
### Fixed
- Fix `TSDataset._update_regressors` logic removing the regressors ([#489](https://github.com/tinkoff-ai/etna/pull/489))
- Fix `TSDataset.info`, `TSDataset.describe` methods ([#519](https://github.com/tinkoff-ai/etna/pull/519))
- Fix regressors handling for `OneHotEncoderTransform` and `HolidayTransform` ([#518](https://github.com/tinkoff-ai/etna/pull/518))
- Fix wandb summary issue with custom plots ([#535](https://github.com/tinkoff-ai/etna/pull/535))
-
-
-
- Fix import Literal in plotters ([#558](https://github.com/tinkoff-ai/etna/pull/558))
-
-
-
117 changes: 90 additions & 27 deletions README.md
@@ -42,7 +42,87 @@ The library started as an internal product in our company -
we use it in 10+ projects now, so we often release updates.
Contributions are welcome - check our [Contribution Guide](https://github.com/tinkoff-ai/etna/blob/master/CONTRIBUTING.md).

## Get started

Let's load and prepare the data.
```python
import pandas as pd
from etna.datasets import TSDataset

# Read the data
df = pd.read_csv("examples/data/example_dataset.csv")

# Create a TSDataset
df = TSDataset.to_dataset(df)
ts = TSDataset(df, freq="D")

# Choose a horizon
HORIZON = 14

# Make train/test split
train_ts, test_ts = ts.train_test_split(test_size=HORIZON)
```

Define transformations and model:
```python
from etna.models import CatBoostModelMultiSegment
from etna.transforms import DateFlagsTransform
from etna.transforms import DensityOutliersTransform
from etna.transforms import FourierTransform
from etna.transforms import LagTransform
from etna.transforms import LinearTrendTransform
from etna.transforms import MeanTransform
from etna.transforms import SegmentEncoderTransform
from etna.transforms import TimeSeriesImputerTransform
from etna.transforms import TrendTransform

# Prepare transforms
transforms = [
    DensityOutliersTransform(in_column="target", distance_coef=3.0),
    TimeSeriesImputerTransform(in_column="target", strategy="forward_fill"),
    LinearTrendTransform(in_column="target"),
    TrendTransform(in_column="target", out_column="trend"),
    LagTransform(in_column="target", lags=list(range(HORIZON, 122)), out_column="target_lag"),
    DateFlagsTransform(week_number_in_month=True, out_column="date_flag"),
    FourierTransform(period=360.25, order=6, out_column="fourier"),
    SegmentEncoderTransform(),
    MeanTransform(in_column=f"target_lag_{HORIZON}", window=12, seasonality=7),
    MeanTransform(in_column=f"target_lag_{HORIZON}", window=7),
]

# Prepare model
model = CatBoostModelMultiSegment()
```
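
The lags in the example start at `HORIZON`. The README does not say why, but a likely reason (an assumption here) is that when all `HORIZON` steps are forecast in one pass, only target values at least `HORIZON` steps old are known for every forecast timestamp. A plain-Python sketch with a hypothetical helper `lagged_value` and made-up data:

```python
from typing import Optional, Sequence

HORIZON = 3  # hypothetical horizon, smaller than the README's 14 for readability


def lagged_value(history: Sequence[float], steps_ahead: int, lag: int) -> Optional[float]:
    """Return the target `lag` steps before a forecast timestamp, or None if unknown.

    `steps_ahead` counts from the last observed point: 1 is the first forecast step.
    """
    index = len(history) - 1 + steps_ahead - lag
    if index < 0 or index >= len(history):
        return None  # the value lies in the future or before the start of history
    return history[index]


history = [1.0, 2.0, 3.0, 4.0, 5.0]

# A lag equal to the horizon is available at every forecast step.
full = [lagged_value(history, s, HORIZON) for s in range(1, HORIZON + 1)]

# A lag of 1 is only available at the first forecast step.
short = [lagged_value(history, s, 1) for s in range(1, HORIZON + 1)]
```

Here `full` contains a value for each of the three forecast steps, while `short` is missing values from the second step onward.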

Fit `Pipeline` and make a prediction:
```python
from etna.pipeline import Pipeline

# Create and fit the pipeline
pipeline = Pipeline(model=model, transforms=transforms, horizon=HORIZON)
pipeline.fit(train_ts)

# Make a forecast
forecast_ts = pipeline.forecast()
```

Let's plot the results:
```python
from etna.analysis import plot_forecast

plot_forecast(forecast_ts=forecast_ts, test_ts=test_ts, train_ts=train_ts, n_train_samples=50)
```

![](examples/assets/readme/get_started.png)

Print the metric value across the segments:
```python
from etna.metrics import SMAPE

metric = SMAPE(mode="per-segment")
metric_value = metric(y_true=test_ts, y_pred=forecast_ts)
print(metric_value)
# {'segment_b': 3.3017151519000967, 'segment_c': 5.270557433427279, 'segment_a': 5.272811627335398, 'segment_d': 4.689085450895735}
```
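
For intuition about these numbers, here is a plain-Python sketch of the textbook SMAPE formula, computed per segment and then macro-averaged. It is an illustration under assumptions: it is not confirmed here to match `etna.metrics.SMAPE` line for line, and the segment names and values are made up.

```python
from statistics import mean
from typing import Dict, List, Sequence


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE in percent; undefined when a true/predicted pair is both zero."""
    return 100 * mean(2 * abs(p - t) / (abs(t) + abs(p)) for t, p in zip(y_true, y_pred))


# Hypothetical per-segment true values and forecasts.
y_true: Dict[str, List[float]] = {"segment_a": [1.0], "segment_b": [2.0]}
y_pred: Dict[str, List[float]] = {"segment_a": [3.0], "segment_b": [2.0]}

# One value per segment, then a single macro average over segments.
per_segment = {name: smape(y_true[name], y_pred[name]) for name in y_true}
macro = mean(list(per_segment.values()))
```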

## Installation

@@ -79,35 +159,10 @@ For example, `etna.models.ProphetModel` needs `prophet` extension and can't be u

ETNA supports configuration files: the library checks that all the specified packages are installed before the script starts, not at runtime.

To set up a configuration for your project, create a `.etna` file at the project's root. To see the available options, look at [`Settings`](https://github.com/tinkoff-ai/etna/blob/master/etna/settings.py#L68). There is an [example](https://github.com/tinkoff-ai/etna/tree/master/examples/configs/.etna) of a configuration file.

## Get started
Here's some example code for a quick start.
```python
import pandas as pd
from etna.datasets.tsdataset import TSDataset
from etna.models import ProphetModel
from etna.pipeline import Pipeline

# Read the data
df = pd.read_csv("examples/data/example_dataset.csv")

# Create a TSDataset
df = TSDataset.to_dataset(df)
ts = TSDataset(df, freq="D")

# Choose a horizon
HORIZON = 8

# Fit the pipeline
pipeline = Pipeline(model=ProphetModel(), horizon=HORIZON)
pipeline.fit(ts)

# Make the forecast
forecast_ts = pipeline.forecast()
```
To set up a configuration for your project, create a `.etna` file at the project's root. To see the available options, look at [`Settings`](https://github.com/tinkoff-ai/etna/blob/master/etna/settings.py#L68). There is an [example](https://github.com/tinkoff-ai/etna/tree/master/examples/configs/.etna) of a configuration file.

## Tutorials

We have also prepared a set of tutorials for an easy introduction:

| Notebook | Interactive launch |
@@ -121,8 +176,14 @@ We have also prepared a set of tutorials for an easy introduction:
| [Ensembles](https://github.com/tinkoff-ai/etna/tree/master/examples/ensembles.ipynb) | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/tinkoff-ai/etna/master?filepath=examples/ensembles.ipynb) |

## Documentation

ETNA documentation is available [here](https://etna-docs.netlify.app/).

## Community

To ask questions or discuss the library, you can join our [Telegram chat](https://t.me/etna_support).
The [Discussions section](https://github.com/tinkoff-ai/etna/discussions) on GitHub is also open for this purpose.

## Resources

- [Forecasting with ETNA: Fast and Furious](https://medium.com/its-tinkoff/forecasting-with-etna-fast-and-furious-1b58e1453809) on Medium
@@ -134,6 +195,7 @@ ETNA documentation is available [here](https://etna-docs.netlify.app/).
## Acknowledgments

### ETNA.Team

[Andrey Alekseev](https://github.com/iKintosh),
[Nikita Barinov](https://github.com/diadorer),
[Dmitriy Bunin](https://github.com/Mr-Geekman),
@@ -148,6 +210,7 @@ ETNA documentation is available [here](https://etna-docs.netlify.app/).
[Julia Shenshina](https://github.com/julia-shenshina)

### ETNA.Contributors

[Artem Levashov](https://github.com/soft1q),
[Aleksey Podkidyshev](https://github.com/alekseyen),
[Carlosbogo](https://github.com/Carlosbogo)
1 change: 1 addition & 0 deletions etna/analysis/__init__.py
@@ -3,6 +3,7 @@
from etna.analysis.eda_utils import distribution_plot
from etna.analysis.eda_utils import sample_acf_plot
from etna.analysis.eda_utils import sample_pacf_plot
from etna.analysis.eda_utils import stl_plot
from etna.analysis.feature_relevance.relevance import ModelRelevanceTable
from etna.analysis.feature_relevance.relevance import RelevanceTable
from etna.analysis.feature_relevance.relevance import StatisticsRelevanceTable
75 changes: 75 additions & 0 deletions etna/analysis/eda_utils.py
@@ -2,16 +2,21 @@
import warnings
from itertools import combinations
from typing import TYPE_CHECKING
from typing import Any
from typing import Dict
from typing import List
from typing import Optional
from typing import Sequence
from typing import Tuple

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import statsmodels.api as sm
from matplotlib.ticker import MaxNLocator
from statsmodels.graphics import utils
from statsmodels.tsa.seasonal import STL

if TYPE_CHECKING:
from etna.datasets import TSDataset
@@ -221,3 +226,73 @@ def distribution_plot(
sns.boxplot(data=df_slice.sort_values(by="segment"), y="z", x="segment", ax=ax[i], fliersize=False)
ax[i].set_title(f"{period}")
i += 1


def stl_plot(
    ts: "TSDataset",
    in_column: str = "target",
    period: Optional[int] = None,
    segments: Optional[List[str]] = None,
    columns_num: int = 2,
    figsize: Tuple[int, int] = (10, 10),
    plot_kwargs: Optional[Dict[str, Any]] = None,
    stl_kwargs: Optional[Dict[str, Any]] = None,
):
    """Plot STL decomposition for segments.

    Parameters
    ----------
    ts:
        dataset with timeseries data
    in_column:
        name of the column to decompose
    period:
        length of seasonality; if None, statsmodels tries to infer it from the index
    segments:
        segments to plot; all segments are used if not given
    columns_num:
        number of columns in subplots
    figsize:
        size of the figure per subplot with one segment in inches
    plot_kwargs:
        dictionary with parameters for plotting, `matplotlib.axes.Axes.plot` is used
    stl_kwargs:
        dictionary with parameters for STL decomposition, `statsmodels.tsa.seasonal.STL` is used
    """
    if plot_kwargs is None:
        plot_kwargs = {}
    if stl_kwargs is None:
        stl_kwargs = {}
    if not segments:
        segments = sorted(ts.segments)

    segments_number = len(segments)
    columns_num = min(columns_num, len(segments))
    rows_num = math.ceil(segments_number / columns_num)

    figsize = (figsize[0] * columns_num, figsize[1] * rows_num)
    fig = plt.figure(figsize=figsize, constrained_layout=True)
    # squeeze=False keeps a 2D array even for a single segment, so `.flat` always works
    subfigs = fig.subfigures(rows_num, columns_num, squeeze=False)

    df = ts.to_pandas()
    for i, segment in enumerate(segments):
        segment_df = df.loc[:, pd.IndexSlice[segment, :]][segment]
        # drop leading and trailing NaNs before decomposition
        segment_df = segment_df[segment_df.first_valid_index() : segment_df.last_valid_index()]
        decompose_result = STL(endog=segment_df[in_column], period=period, **stl_kwargs).fit()

        # start plotting
        subfigs.flat[i].suptitle(segment)
        axs = subfigs.flat[i].subplots(4, 1, sharex=True)

        # plot observed
        axs.flat[0].plot(segment_df.index, decompose_result.observed, **plot_kwargs)
        axs.flat[0].set_ylabel("Observed")

        # plot trend
        axs.flat[1].plot(segment_df.index, decompose_result.trend, **plot_kwargs)
        axs.flat[1].set_ylabel("Trend")

        # plot seasonal
        axs.flat[2].plot(segment_df.index, decompose_result.seasonal, **plot_kwargs)
        axs.flat[2].set_ylabel("Seasonal")

        # plot residuals
        axs.flat[3].plot(segment_df.index, decompose_result.resid, **plot_kwargs)
        axs.flat[3].set_ylabel("Residual")
        axs.flat[3].tick_params("x", rotation=45)
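
The layout arithmetic in `stl_plot` is easy to check in isolation: the column count is capped by the number of segments, the row count is rounded up, and `figsize` is treated as a per-segment size. A small sketch of that calculation follows; the helper name `subplot_grid` is hypothetical and does not exist in the library.

```python
import math
from typing import Tuple


def subplot_grid(
    segments_number: int, columns_num: int, figsize: Tuple[int, int]
) -> Tuple[int, int, Tuple[int, int]]:
    """Return (rows, columns, total figure size) for a grid with one cell per segment."""
    columns = min(columns_num, segments_number)  # never more columns than segments
    rows = math.ceil(segments_number / columns)  # last row may be partially filled
    return rows, columns, (figsize[0] * columns, figsize[1] * rows)


# Five segments in two columns need three rows; one segment collapses to a 1x1 grid.
five_segments = subplot_grid(5, 2, (10, 10))
one_segment = subplot_grid(1, 2, (10, 10))
```

The single-segment case is the one worth testing against the plotting code, since a 1x1 grid is where array-style access to the subfigures is most likely to behave differently.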