Add plot_residuals #539

Merged
merged 9 commits into from Feb 18, 2022

Mr-Geekman
Contributor

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

See #514.

Related Issue

#514.

Closing issues

Closes #514.

@Mr-Geekman Mr-Geekman added the enhancement New feature or request label Feb 16, 2022
@Mr-Geekman Mr-Geekman self-assigned this Feb 16, 2022
@Mr-Geekman
Contributor Author

Example of usage:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from etna.pipeline import Pipeline
from etna.metrics import MAE
from etna.models import LinearPerSegmentModel
from etna.transforms import LagTransform
from etna.datasets import TSDataset
from etna.analysis import plot_residuals, plot_backtest


def multitrend_df() -> pd.DataFrame:
    """Generate a one-segment pd.DataFrame with multiple linear trends."""
    df = pd.DataFrame({"timestamp": pd.date_range("2020-01-01", "2021-05-31")})
    ns = [100, 150, 80, 187]
    ks = [0.4, -0.3, 0.8, -0.6]
    x = np.zeros(shape=(len(df)))
    left = 0
    right = 0
    for i, (n, k) in enumerate(zip(ns, ks)):
        right += n
        x[left:right] = np.arange(0, n, 1) * k
        for _n, _k in zip(ns[:i], ks[:i]):
            x[left:right] += _n * _k
        left = right
    df["target"] = x
    df["segment"] = "segment_1"
    df = TSDataset.to_dataset(df=df)
    return df


df = multitrend_df()
df.loc[:, pd.IndexSlice["segment_2", "target"]] = np.log(df.loc[:, pd.IndexSlice["segment_1", "target"]] + 100)
df.loc[:, pd.IndexSlice["segment_3", "target"]] = np.log(df.loc[:, pd.IndexSlice["segment_1", "target"]] + 1000)
ts = TSDataset(df=df, freq="D")
transforms = [LagTransform(in_column="target", out_column="lags", lags=[5, 6, 7, 8, 9, 10])]
pipeline = Pipeline(model=LinearPerSegmentModel(), transforms=transforms, horizon=5)
metrics, forecast_df, info = pipeline.backtest(ts=ts, metrics=[MAE()], n_folds=5)

plot_backtest(forecast_df=forecast_df, ts=ts)
plt.savefig("backtest.png")

plot_residuals(forecast_df=forecast_df, ts=ts, feature="timestamp")
plt.savefig("image_timestamp.png")

plot_residuals(forecast_df=forecast_df, ts=ts, transforms=transforms, feature="lags_5")
plt.savefig("image_lags.png")

Backtest plot: [image: backtest]

Plot against timestamp: [image: image_timestamp]

Plot against lag feature: [image: image_lags]
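
For readers unfamiliar with residual plots, here is a minimal sketch of the quantity being plotted. It is not etna's implementation: the helper name `residuals_vs_feature` is hypothetical, the inputs are plain single-segment pandas Series rather than a `TSDataset`, and the sign convention (actual minus forecast) is an assumption. The lag feature is imitated with `Series.shift(5)`, by analogy with the `lags_5` column used above.

```python
import pandas as pd


def residuals_vs_feature(actual: pd.Series, forecast: pd.Series, feature: pd.Series) -> pd.DataFrame:
    """Pair each residual (actual - forecast) with a feature value.

    Hypothetical helper: all inputs are Series for one segment, indexed by timestamp.
    Only timestamps present in the forecast are kept.
    """
    resid = (actual.loc[forecast.index] - forecast).rename("residual")
    feat = feature.loc[forecast.index].rename("feature")
    return pd.concat([feat, resid], axis=1)


# Toy data: 8 daily points, with forecasts for the last 3 of them.
idx = pd.date_range("2021-01-01", periods=8, freq="D")
actual = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], index=idx)
forecast = pd.Series([6.5, 7.0, 7.5], index=idx[5:])
lag_5 = actual.shift(5)  # value observed 5 steps earlier

pairs = residuals_vs_feature(actual, forecast, lag_5)
print(pairs)
```

Scattering `pairs["residual"]` against `pairs["feature"]` (or against the timestamp index) gives the kind of picture shown above; a visible trend in such a plot suggests the model is missing structure related to that feature.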

@martins0n martins0n self-requested a review February 17, 2022 07:49
martins0n
martins0n previously approved these changes Feb 18, 2022
Contributor

@martins0n martins0n left a comment

👍

@martins0n martins0n enabled auto-merge (squash) February 18, 2022 08:09
@codecov-commenter

Codecov Report

Merging #539 (adadd9d) into master (95a988c) will decrease coverage by 32.86%.
The diff coverage is 15.90%.

@@             Coverage Diff             @@
##           master     #539       +/-   ##
===========================================
- Coverage   87.17%   54.30%   -32.87%     
===========================================
  Files         118      118               
  Lines        5777     5797       +20     
===========================================
- Hits         5036     3148     -1888     
- Misses        741     2649     +1908     
Impacted Files                                Coverage Δ
etna/analysis/plotters.py                     12.12% <13.95%> (+1.50%) ⬆️
etna/analysis/__init__.py                     100.00% <100.00%> (ø)
etna/commands/__init__.py                     0.00% <0.00%> (-100.00%) ⬇️
etna/commands/backtest_command.py             0.00% <0.00%> (-96.43%) ⬇️
etna/commands/forecast_command.py             0.00% <0.00%> (-92.00%) ⬇️
etna/commands/__main__.py                     0.00% <0.00%> (-87.50%) ⬇️
etna/commands/resolvers.py                    0.00% <0.00%> (-80.00%) ⬇️
etna/analysis/outliers/density_outliers.py    22.44% <0.00%> (-75.52%) ⬇️
etna/datasets/datasets_generation.py          26.47% <0.00%> (-73.53%) ⬇️
etna/transforms/timestamp/time_flags.py       27.02% <0.00%> (-72.98%) ⬇️
... and 68 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 95a988c...adadd9d.
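
The totals in the Codecov diff above can be cross-checked by hand. The sketch below assumes that the reported coverage is simply hits divided by lines (the report shows no partial hits); `coverage_pct` is a hypothetical helper, not part of any Codecov tooling.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Line coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Figures copied from the coverage diff in the report.
before = coverage_pct(hits=5036, lines=5777)  # master (95a988c)
after = coverage_pct(hits=3148, lines=5797)   # PR head (adadd9d)
delta = after - before

print(f"{before:.2f}% -> {after:.2f}% ({delta:+.2f})")
```

Under this assumption the two percentages match the report, and the difference rounds to 32.87 points, as in the diff table. The 32.86% quoted in the headline differs in the last digit, presumably because of truncation rather than rounding.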

@martins0n martins0n merged commit ab902ba into master Feb 18, 2022
@iKintosh iKintosh deleted the issue-514 branch March 22, 2022 08:38
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Residuals Plot
3 participants