Add plot_holidays #624

Merged
merged 5 commits into from Mar 29, 2022

Conversation

Mr-Geekman
Contributor

@Mr-Geekman Mr-Geekman commented Mar 23, 2022

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

See #586.

Related Issue

#586.

Closing issues

Closes #586.

@Mr-Geekman Mr-Geekman added the enhancement New feature or request label Mar 23, 2022
@Mr-Geekman Mr-Geekman self-assigned this Mar 23, 2022
@Mr-Geekman
Contributor Author

Example script:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from etna.analysis import plot_holidays
from etna.datasets import TSDataset


def main():
    rng = np.random.default_rng(42)

    df = pd.read_csv("examples/data/example_dataset.csv", parse_dates=["timestamp"])
    ts = TSDataset(df=TSDataset.to_dataset(df), freq="D")

    # holidays given as an ISO country code, daily data
    plot_holidays(ts=ts, holidays="RU")
    plt.savefig("holidays_plain_iso")

    # holidays given as a DataFrame indexed by timestamp, one column per holiday
    timestamp = pd.date_range("2020-01-01", periods=100, freq="D")
    df = pd.DataFrame({"timestamp": timestamp, "target": rng.normal(size=len(timestamp)), "segment": "segment_0"})
    ts = TSDataset(df=TSDataset.to_dataset(df), freq="D")
    holidays_df = pd.DataFrame(
        {
            "timestamp": timestamp,
            "New Year": [1.0] * 10 + [0.0] * 90,
            "Some Holiday": [0.0] * 11 + [1.0] * 4 + [0.0] * 85,
            "Single Holiday 1": [0.0] * 20 + [1.0] + [0.0] * 79,
            "Single Holiday 2": [0.0] * 50 + [1.0] + [0.0] * 49,
            "Single Holiday 3": [0.0] * 80 + [1.0] + [0.0] * 19,
        }
    )
    holidays_df = holidays_df.set_index("timestamp")
    plot_holidays(ts=ts, holidays=holidays_df)
    plt.savefig("holidays_plain_df")

    # ISO country code on hourly data (holidays have to be resampled to hours)
    timestamp = pd.date_range("2020-01-01", periods=3000, freq="H")
    df = pd.DataFrame({"timestamp": timestamp, "target": rng.normal(size=len(timestamp)), "segment": "segment_0"})
    ts = TSDataset(df=TSDataset.to_dataset(df), freq="H")
    plot_holidays(ts=ts, holidays="RU")
    plt.savefig("holidays_resample_iso")


if __name__ == "__main__":
    main()
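
The indicator columns in the script above are written out by hand as `[0.0] * a + [1.0] * b + [0.0] * c`. As a minimal sketch, the same columns could be generated from (start, length) pairs. The helper names `holiday_indicator` and `holiday_columns` are hypothetical and are not part of etna:

```python
from typing import Dict, List, Tuple


def holiday_indicator(n_periods: int, start: int, length: int = 1) -> List[float]:
    """Return n_periods floats: 1.0 on positions [start, start + length), else 0.0."""
    if start < 0 or length < 1 or start + length > n_periods:
        raise ValueError("holiday span must lie inside the series")
    return [0.0] * start + [1.0] * length + [0.0] * (n_periods - start - length)


def holiday_columns(n_periods: int, spans: Dict[str, Tuple[int, int]]) -> Dict[str, List[float]]:
    """Map each holiday name to its indicator column; spans holds (start, length)."""
    return {
        name: holiday_indicator(n_periods, start, length)
        for name, (start, length) in spans.items()
    }


# Same spans as three of the columns in the example script (100 daily points).
columns = holiday_columns(
    100,
    {
        "New Year": (0, 10),
        "Some Holiday": (11, 4),
        "Single Holiday 1": (20, 1),
    },
)
```

The resulting dict can be combined with the timestamp column in the same way as in the script, for example `pd.DataFrame({"timestamp": timestamp, **columns}).set_index("timestamp")`.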

holidays_plain_iso: [figure: holidays_plain_iso]

holidays_plain_df: [figure: holidays_plain_df]

holidays_resample_iso: [figure: holidays_resample_iso]
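
For the hourly case, holidays that are defined per day have to be mapped onto sub-daily timestamps. The sketch below shows one plausible way to do that with the standard library only. It is an illustration of the idea, not necessarily how `plot_holidays` does it internally; the function name `mark_holiday_timestamps` and the sample dates are hypothetical:

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List, Set


def mark_holiday_timestamps(timestamps: Iterable[datetime], holiday_dates: Set[date]) -> List[float]:
    """Return 1.0 for every timestamp whose calendar date is a holiday, else 0.0."""
    return [1.0 if ts.date() in holiday_dates else 0.0 for ts in timestamps]


# Three days of hourly timestamps; the first two days are treated as holidays
# purely for illustration.
start = datetime(2020, 1, 1)
hourly = [start + timedelta(hours=i) for i in range(72)]
flags = mark_holiday_timestamps(hourly, {date(2020, 1, 1), date(2020, 1, 2)})
```

Every hour of a holiday date gets the flag, so a one-day holiday becomes a 24-point run in the hourly series.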

@codecov-commenter
codecov-commenter commented Mar 25, 2022

Codecov Report

Merging #624 (dc3f05a) into master (fb53641) will decrease coverage by 31.40%.
The diff coverage is 12.96%.

@@             Coverage Diff             @@
##           master     #624       +/-   ##
===========================================
- Coverage   84.60%   53.19%   -31.41%     
===========================================
  Files         118      118               
  Lines        6033     6087       +54     
===========================================
- Hits         5104     3238     -1866     
- Misses        929     2849     +1920     
Impacted Files                              Coverage Δ
etna/analysis/plotters.py                   10.99% <11.32%> (-11.15%) ⬇️
etna/analysis/__init__.py                   100.00% <100.00%> (ø)
etna/commands/__init__.py                   0.00% <0.00%> (-100.00%) ⬇️
etna/commands/backtest_command.py           0.00% <0.00%> (-96.43%) ⬇️
etna/commands/forecast_command.py           0.00% <0.00%> (-92.00%) ⬇️
etna/commands/__main__.py                   0.00% <0.00%> (-87.50%) ⬇️
etna/commands/resolvers.py                  0.00% <0.00%> (-80.00%) ⬇️
etna/analysis/outliers/density_outliers.py  22.44% <0.00%> (-75.52%) ⬇️
etna/datasets/datasets_generation.py        27.02% <0.00%> (-72.98%) ⬇️
etna/transforms/timestamp/time_flags.py     27.02% <0.00%> (-72.98%) ⬇️
... and 68 more
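
The headline percentages in the report can be reproduced from its Hits and Misses rows. The sketch below assumes coverage is computed as hits / (hits + misses), which matches the Lines row here (5104 + 929 = 6033 and 3238 + 2849 = 6087). Truncating to two decimals is also an assumption; it happens to match the displayed values. The function names are hypothetical:

```python
import math


def coverage_percent(hits: int, misses: int) -> float:
    """Coverage as a percentage, assuming lines = hits + misses."""
    return 100.0 * hits / (hits + misses)


def truncate2(value: float) -> float:
    """Truncate (not round) a positive value to two decimals."""
    return math.floor(value * 100) / 100


base = coverage_percent(5104, 929)   # master
head = coverage_percent(3238, 2849)  # this pull request
drop = base - head

print(truncate2(base), truncate2(head), truncate2(drop))
```

Under these assumptions the unrounded drop truncates to the 31.40% quoted in the summary sentence, while subtracting the two displayed percentages gives the 31.41 shown in the diff table.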

@Mr-Geekman
Contributor Author

Example with intersecting holidays (see the script above and replace holidays_df there). Holidays:

holidays_df = pd.DataFrame(
    {
        "timestamp": timestamp,
        "New Year": [1.0] * 10 + [0.0] * 90,
        "Some Holiday": [0.0] * 11 + [1.0] * 4 + [0.0] * 85,
        "Single Holiday 1": [0.0] * 20 + [1.0] + [0.0] * 79,
        "Single Holiday 2": [0.0] * 50 + [1.0] + [0.0] * 49,
        "Single Holiday 3": [0.0] * 80 + [1.0] + [0.0] * 19,
    }
)

Result:
[figure: holidays_plain_df]
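
When preparing a holidays frame for a case like this, it can help to check which positions are actually covered by more than one holiday column. A minimal sketch, assuming the columns are available as plain lists of 0.0/1.0 values; the helper `overlapping_positions` and the sample columns are hypothetical:

```python
from typing import Dict, List


def overlapping_positions(columns: Dict[str, List[float]]) -> Dict[int, List[str]]:
    """Return {position: [holiday names]} for positions where 2+ holidays are active."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all holiday columns must have the same length")
    n_periods = lengths.pop() if lengths else 0

    overlaps: Dict[int, List[str]] = {}
    for position in range(n_periods):
        active = [name for name, values in columns.items() if values[position] > 0]
        if len(active) > 1:
            overlaps[position] = active
    return overlaps


# Illustrative columns: the two holidays share position 2 only.
example = {
    "Holiday A": [1.0, 1.0, 1.0, 0.0, 0.0],
    "Holiday B": [0.0, 0.0, 1.0, 1.0, 0.0],
}
overlaps = overlapping_positions(example)
```

With a DataFrame like `holidays_df`, the input dict could be obtained with `holidays_df.to_dict(orient="list")` after setting the timestamp index.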

@julia-shenshina julia-shenshina merged commit 0a7851d into master Mar 29, 2022
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create Holidays plot
3 participants