Fix: Improve TimeSeries.__getitem__ frequency inference #2152

Merged


DavidKleindienst
Contributor

@DavidKleindienst DavidKleindienst commented Jan 10, 2024

Fixes #2089.

Summary

Currently, when a TimeSeries is indexed, the frequency of the resulting TimeSeries is inferred from its dates.
However, we often have enough information to infer the frequency more reliably, in particular in the following cases:

  • The __getitem__ argument is a pd.DatetimeIndex with a set .freq attribute
  • We index with a slice
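
To illustrate why inferring from dates alone can be unreliable, here is a minimal sketch in plain pandas. It is only a stand-in: it assumes that `pd.infer_freq` behaves similarly to the date-based inference described above, which is not stated in this PR.

```python
import pandas as pd

# A business-day index (Mon-Fri); 2024-01-01 is a Monday.
full = pd.bdate_range("2024-01-01", periods=10)

# Take Mon-Wed and drop the stored frequency, so only the dates remain.
subset = pd.DatetimeIndex(full[:3].values)

original_freq = full.freqstr           # frequency the index was built with
inferred_freq = pd.infer_freq(subset)  # frequency guessed from the dates only

# With fewer than three dates, pandas cannot infer a frequency at all.
try:
    pd.infer_freq(subset[:2])
    too_short_fails = False
except ValueError:
    too_short_fails = True

print(original_freq, inferred_freq, too_short_fails)
```

Three consecutive weekdays are indistinguishable from a daily series, so the guess differs from the frequency the index was built with.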

This PR contains the following improvements to frequency inference in the TimeSeries.__getitem__ method:

  • When a pd.DatetimeIndex with a set .freq attribute is used as the __getitem__ argument, that attribute becomes the frequency of the new TimeSeries
  • When a slice without a step parameter is used as the __getitem__ argument, the frequency of the original TimeSeries is kept
  • When a slice with a step parameter is used as the __getitem__ argument, the new frequency is the original TimeSeries frequency multiplied by step
  • A unit test for frequency inference is added

As a result, the frequency is inferred from the dates of the new TimeSeries only when a pd.DatetimeIndex without a set .freq attribute is used as the __getitem__ argument, or when indexing by a single int or pd.Timestamp value.
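
The rules above can be sketched as a small standalone function. This is an illustration only, not the code in darts/timeseries.py: the helper name `freq_after_getitem` is hypothetical, it uses plain pandas offsets, and it assumes a positive slice step.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def freq_after_getitem(original_freq, key):
    """Hypothetical helper: return the frequency implied by `key`,
    or None when it would have to be inferred from the resulting dates."""
    base = to_offset(original_freq)

    # A DatetimeIndex that carries its own frequency: use it directly.
    if isinstance(key, pd.DatetimeIndex) and key.freq is not None:
        return key.freq

    # A slice: keep the original frequency, scaled by the step if one is given.
    # Assumption: the step is a positive integer.
    if isinstance(key, slice):
        step = 1 if key.step is None else key.step
        return base * step

    # Anything else (single int or pd.Timestamp, DatetimeIndex without .freq):
    # no shortcut is available.
    return None


every_third_day = pd.date_range("2024-01-01", periods=4, freq="3D")
no_freq = pd.DatetimeIndex(["2024-01-01", "2024-01-03"])

print(freq_after_getitem("D", slice(0, 5)))           # original frequency
print(freq_after_getitem("D", slice(None, None, 2)))  # step times original
print(freq_after_getitem("D", every_third_day))       # frequency of the key
print(freq_after_getitem("D", no_freq))               # None: infer from dates
```

Returning None for the fallback cases keeps the decision of how to infer from dates with the caller.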

RangeIndexed TimeSeries are not affected by this PR.

Other Information

@codecov-commenter

codecov-commenter commented Jan 10, 2024

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (de4afd1) 93.91% compared to head (d8b5eae) 93.88%.

Files                 Patch %   Lines
darts/timeseries.py   77.27%    5 Missing ⚠️


Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2152      +/-   ##
==========================================
- Coverage   93.91%   93.88%   -0.04%     
==========================================
  Files         135      135              
  Lines       13321    13335      +14     
==========================================
+ Hits        12511    12520       +9     
- Misses        810      815       +5     


Collaborator

@dennisbader dennisbader left a comment


Thanks for this @DavidKleindienst, looks great 🚀

I made some minor adaptations so the frequency inference applies identically to integer-indexed TimeSeries.
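
For an integer index, the analogous rule can be illustrated with a plain pandas RangeIndex, where the step plays the role of the frequency. This is a sketch for orientation, not the code added in this PR.

```python
import pandas as pd

# An integer index with step 2: 0, 2, 4, 6, 8, 10
index = pd.RangeIndex(start=0, stop=12, step=2)

# Slicing with step 3 keeps every third entry: 0, 6
sliced = index[::3]

# The new step is the slice step times the original step.
new_step = sliced.step

print(list(sliced), new_step)
```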

@dennisbader
Collaborator

@DavidKleindienst, you can ignore the failing unit tests for now. It is a problem with one of the dataset sources, which they're fixing now.

@dennisbader dennisbader merged commit 962fd78 into unit8co:master Jan 15, 2024
9 checks passed
Development

Successfully merging this pull request may close these issues.

[BUG] TimeSeries slicing silently changes the TimeSeries' frequency
3 participants