Add `plot_periodogram` #606

Mr-Geekman · 2022-03-16T12:17:32Z

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

Did you read the contribution guide?
Did you update the docs? We use Numpy format for all the methods and classes.
Did you write any new necessary tests?
Did you update the CHANGELOG?

Type of Change

Examples / docs / tutorials / contributors update
Bug fix (non-breaking change which fixes an issue)
Improvement (non-breaking change which improves an existing feature)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

See #593.

Related Issue

#593.

Closing issues

Closes #593.

Mr-Geekman · 2022-03-16T12:20:25Z

Script with demo:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from etna.analysis import plot_periodogram
from etna.datasets import TSDataset


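def main():
    # Load the example dataset in long format and convert it to the wide format
    df = pd.read_csv("examples/data/example_dataset.csv", parse_dates=["timestamp"])
    df_wide = TSDataset.to_dataset(df)
    # Put NaNs at the start of the first segment to exercise the handling of NaNs at the edges
    df_wide.iloc[:3, 0] = np.nan
    ts = TSDataset(df=df_wide, freq="D")

    # One periodogram per segment
    plot_periodogram(ts=ts, period=365.25, amplitude_aggregation_mode="per-segment")
    plt.savefig("periodogram_per_segment")

    # A single periodogram with amplitudes averaged over the segments
    plot_periodogram(
        ts=ts, period=365.25, amplitude_aggregation_mode="mean", periodogram_params=dict(scaling="spectrum")
    )
    plt.savefig("periodogram_mean")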


if __name__ == "__main__":
    main()

periodogram_per_segment:

periodogram_mean:

codecov-commenter · 2022-03-16T12:59:05Z

Codecov Report

Merging #606 (f95302d) into master (7dd9448) will decrease coverage by 32.19%.
The diff coverage is 6.25%.

@@             Coverage Diff             @@
##           master     #606       +/-   ##
===========================================
- Coverage   85.14%   52.95%   -32.20%     
===========================================
  Files         118      118               
  Lines        5884     5932       +48     
===========================================
- Hits         5010     3141     -1869     
- Misses        874     2791     +1917

| Impacted Files | Coverage Δ | |
|---|---|---|
| etna/analysis/plotters.py | `10.95% <4.25%> (-6.59%)` | ⬇️ |
| etna/analysis/__init__.py | `100.00% <100.00%> (ø)` | |
| etna/commands/__init__.py | `0.00% <0.00%> (-100.00%)` | ⬇️ |
| etna/commands/backtest_command.py | `0.00% <0.00%> (-96.43%)` | ⬇️ |
| etna/commands/forecast_command.py | `0.00% <0.00%> (-92.00%)` | ⬇️ |
| etna/commands/__main__.py | `0.00% <0.00%> (-87.50%)` | ⬇️ |
| etna/commands/resolvers.py | `0.00% <0.00%> (-80.00%)` | ⬇️ |
| etna/analysis/outliers/density_outliers.py | `22.44% <0.00%> (-75.52%)` | ⬇️ |
| etna/datasets/datasets_generation.py | `26.47% <0.00%> (-73.53%)` | ⬇️ |
| etna/transforms/timestamp/time_flags.py | `27.02% <0.00%> (-72.98%)` | ⬇️ |
| ... and 68 more | | |

Ama16 · 2022-03-17T08:23:49Z

etna/analysis/plotters.py

+    columns_num: int = 2,
+    figsize: Tuple[int, int] = (10, 5),
+):
+    """Plot the periodogram to determine the optimal order parameter for `etna.transforms.FourierTransform`.


Maybe add a reference to `scipy.signal.periodogram`?

OK, I'll add it, but I'll keep the mention of FourierTransform. The task creator said that this plot is useful precisely for determining the order for FourierTransform.
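
To illustrate why a periodogram helps with choosing the order, here is a minimal sketch on synthetic data. It assumes that `period` is passed to `scipy.signal.periodogram` as the sampling frequency `fs` (an assumption about the implementation, not something confirmed in this thread), so that a peak at frequency `k` corresponds to the `k`-th harmonic of the period. The signal and the variable names are made up for the example.

```python
import numpy as np
from scipy.signal import periodogram

period = 7.0  # assumed: weekly seasonality in daily data
n = 7 * 52  # a whole number of periods, so the peaks fall exactly on frequency bins
t = np.arange(n)

# Synthetic signal built from the 1st and the 3rd harmonics of the weekly period
x = np.sin(2 * np.pi * 1 * t / period) + 0.5 * np.sin(2 * np.pi * 3 * t / period)

# With fs=period the frequency axis is measured in cycles per period
frequencies, spectrum = periodogram(x, fs=period)

# Frequencies of the two strongest peaks, in ascending order
top_frequencies = np.sort(frequencies[np.argsort(spectrum)[-2:]])
print(top_frequencies)
```

In this toy case the highest frequency with a noticeable peak is 3, which suggests that harmonics up to the third are needed to describe the seasonality.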

Ama16 · 2022-03-17T08:26:54Z

etna/analysis/plotters.py

+    ts:
+        TSDataset with timeseries data
+    period:
+        the period of the seasonality to capture in frequency units of time series, it should be >= 2


Isn't it too difficult?

What is the alternative? We defined this parameter the same way as in FourierTransform.
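
One way to read the parameter: a small sketch, again assuming that `period` plays the role of the sampling frequency, so that a frequency `f` on the x-axis corresponds to a cycle of `period / f` timestamps. The helper `cycle_length` is hypothetical and exists only for this illustration.

```python
def cycle_length(frequency: float, period: float) -> float:
    """Return the length of one cycle, in timestamps, for a frequency on the x-axis."""
    return period / frequency


period = 365.25  # yearly seasonality for daily data, as in the demo script

# Frequency 1 is one cycle per year, 12 is roughly monthly, 52 is roughly weekly
lengths = {
    name: cycle_length(frequency, period)
    for name, frequency in {"annual": 1, "quarterly": 4, "monthly": 12, "weekly": 52}.items()
}
print(lengths)
```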

Ama16 · 2022-03-17T08:33:30Z

etna/analysis/plotters.py

+            segment_df = df.loc[:, pd.IndexSlice[segment, "target"]]
+            segment_df = segment_df[segment_df.first_valid_index() : segment_df.last_valid_index()]
+            if segment_df.isna().any():
+                raise ValueError(f"Periodogram can't be calculated on segment with NaNs inside: {segment}")


If NaNs exist but we are going to cut all of them later, is it right to raise an error?

I don't really understand the problem. If we have NaNs at the edges, we cut them off. NaNs in the middle make the whole result NaN. Cutting them out before applying the periodogram isn't reasonable, because that would break the frequencies and seasonalities.

Ama16 · 2022-03-17T08:36:46Z

etna/analysis/plotters.py

+            frequencies_segments.append(frequencies)
+            spectrums_segments.append(spectrum)
+
+        frequencies = frequencies_segments[0]


Why did we create the frequencies_segments array if we need only one value from it?

I think this implementation is simple. The other implementations that came to my mind require writing an `if` for a particular iteration, or overwriting the same value on every iteration, which could be taken for a bug. The current implementation looks very simple.
Do you know any simple alternatives?
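
One possible alternative, sketched under the assumption that all segments have the same length after trimming (the thread does not guarantee this): `scipy.signal.periodogram` accepts an `axis` argument, so the segments can be stacked into a 2D array and processed in one call that returns a single frequency array. The array `targets` is synthetic.

```python
import numpy as np
from scipy.signal import periodogram

rng = np.random.default_rng(0)
period = 7.0

# Rows are segments, columns are timestamps; equal lengths are assumed
targets = rng.normal(size=(3, 70))

# One call for all segments: a single frequency array and one spectrum per row
frequencies, spectrums = periodogram(targets, fs=period, axis=-1)

# Aggregation over segments, analogous to the "mean" mode
mean_spectrum = spectrums.mean(axis=0)
print(frequencies.shape, spectrums.shape, mean_spectrum.shape)
```

If the segments can differ in length, this does not apply directly, and collecting per-segment results in lists remains the simpler option.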

Ama16 · 2022-03-17T08:39:57Z

etna/analysis/plotters.py

+        _, ax = plt.subplots(figsize=figsize, constrained_layout=True)
+        ax.step(frequencies, spectrum)  # type: ignore
+        ax.set_xscale("log")  # type: ignore
+        ax.set_title("Periodogram")  # type: ignore


Maybe it is worth labeling the x-axis in this case?

Reasonable, I'll add it.
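
A sketch of what the labeled plot could look like. The label texts are assumptions chosen for the illustration, not taken from the PR, and the data is synthetic.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import periodogram

x = np.sin(2 * np.pi * np.arange(140) / 7)
frequencies, spectrum = periodogram(x, fs=7.0)

_, ax = plt.subplots(figsize=(10, 5), constrained_layout=True)
# The zero frequency is skipped here because it cannot be shown on a log scale
ax.step(frequencies[1:], spectrum[1:])
ax.set_xscale("log")
ax.set_xlabel("Frequency")  # assumed label text
ax.set_ylabel("Power spectral density")  # assumed label text
ax.set_title("Periodogram")
```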

Mr-Geekman · 2022-03-18T09:56:14Z

Script is the same.

periodogram_per_segment:

periodogram_mean:

Add

2590cfd

Mr-Geekman added the enhancement New feature or request label Mar 16, 2022

Mr-Geekman self-assigned this Mar 16, 2022

Update changelog

505ed1b

Merge remote-tracking branch 'origin/master' into issue-593

554c3f0

# Conflicts: # CHANGELOG.md

martins0n requested a review from Ama16 March 17, 2022 07:38

Ama16 reviewed Mar 17, 2022

d.a.bunin added 3 commits March 18, 2022 12:36

Merge remote-tracking branch 'origin/master' into issue-593

baa8107

Fix changelog

d204eea

Update docstring about period parameter, update docstring of the function, add xlabel, ylabel

f95302d

Ama16 approved these changes Mar 21, 2022

Mr-Geekman merged commit 3097a83 into master Mar 21, 2022

Mr-Geekman deleted the issue-593 branch March 21, 2022 10:31
