♻️ refactor indexing + ✂️ decouple stride & window + ✨ support segment idxs #71

Merged
merged 51 commits into from Oct 11, 2022

jvdd
Member

@jvdd jvdd commented Jun 23, 2022

♻️ Refactor indexing

  • 🐛 fix bug with vectorized=True for strided rolling
    • vectorized support for single feature windows
    • vectorized support for empty feature windows
    • test the above
  • ♻️ refactor the strided window segmentation (& indexing)
    • add include_final_window argument to FeatureCollection .calculate (& StridedRolling)
    • update docs
    • test the above
    • remove support for TimeSequenceStridedRolling
      => Decided to not do this in this PR. We will leave this for another PR.
  • 🙈 undo breaking change from 🐛 fix bug with bound_method + ✨ new integrations #62 (to make code backwards compatible)
    • revert the default window_idx argument in FeatureCollection.calculate to "end"
  • 🤖 extend test matrix & update dependencies
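
As an illustration of what the `include_final_window` and `window_idx` arguments control, here is a minimal pure-Python sketch of strided window segmentation. The helpers `strided_windows` and `window_labels` are hypothetical names introduced for this example; they are not the tsflex implementation, and the exact boundary handling inside `StridedRolling` may differ.

```python
def strided_windows(start, end, window, stride, include_final_window=False):
    """Segment the half-open range [start, end) into (begin, end) window bounds.

    Hypothetical helper, not part of tsflex. Assumption: with
    include_final_window=True one extra, possibly incomplete, window is
    emitted when data remains after the last full window position.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    bounds = []
    begin = start
    # Emit every window that fits completely inside [start, end).
    while begin + window <= end:
        bounds.append((begin, begin + window))
        begin += stride
    # Optionally emit the final window, which extends past the data.
    if include_final_window and begin < end:
        bounds.append((begin, begin + window))
    return bounds


def window_labels(bounds, window_idx="end"):
    """Pick the output index of each window: its begin, middle, or end."""
    if window_idx == "begin":
        return [b for b, _ in bounds]
    if window_idx == "middle":
        return [b + (e - b) / 2 for b, e in bounds]
    if window_idx == "end":
        return [e for _, e in bounds]
    raise ValueError(f"unknown window_idx: {window_idx!r}")


full = strided_windows(0, 10, window=4, stride=3)
with_final = strided_windows(0, 10, window=4, stride=3, include_final_window=True)
print(full)
print(with_final)
print(window_labels(full, "end"))
```

With `window_idx="end"` each output row is labelled by the end of its window, which is the default this PR reverts to.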

✂️ Decouple stride

We (@jonasvdd, @emield12, and @jvdd) believe that the stride should not be tightly coupled to a FeatureDescriptor. Therefore, to make tsflex more flexible (:wink:), we make the stride argument optional for FeatureDescriptor and MultipleFeatureDescriptor and add the option to pass your stride(s) to the FeatureCollection.calculate method.

  • 👀 externally visible changes
    • make stride optional in FeatureDescriptor
    • add stride argument to FeatureCollection.calculate
    • FeatureDescriptor / FeatureCollection.calculate should accept multiple strides
      • StridedRolling should accept multiple strides
      • test the above
    • remove stride from output column name
      • update reduce method
      • update reduce method tests
    • update the logging to handle multiple strides
      • extend logging tests
  • 📦 internal changes
    • change stride -> strides: which is either a list of stride sizes (float or pd.Timedelta) or None (in StridedRolling and StridedRollingFactory)
    • identify feature descriptors (FD) based on their window - output names
      • set intersection after StridedRolling search sorted
        TODO: needs to be optimized
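
The decoupling above can be sketched in plain Python. Two things are assumptions here rather than confirmed tsflex behaviour: that a stride passed at calculate time takes precedence over a stride stored on the descriptor, and that multiple strides are combined by taking the sorted union of the window start positions each stride generates. The names `resolve_strides` and `window_starts` are hypothetical.

```python
def resolve_strides(descriptor_stride=None, calculate_stride=None):
    """Return the list of stride sizes to use (hypothetical helper).

    Assumption: a stride given at calculate time overrides the one that
    was (optionally) stored on the feature descriptor.
    """
    chosen = calculate_stride if calculate_stride is not None else descriptor_stride
    if chosen is None:
        raise ValueError("no stride given on the descriptor or at calculate time")
    if not isinstance(chosen, (list, tuple)):
        chosen = [chosen]
    # Deduplicate so that a repeated stride does not change the result.
    return sorted(set(chosen))


def window_starts(start, end, strides):
    """Sorted union of the start positions generated by each stride."""
    starts = set()
    for stride in strides:
        if stride <= 0:
            raise ValueError("stride must be positive")
        position = start
        while position < end:
            starts.add(position)
            position += stride
    return sorted(starts)


strides = resolve_strides(descriptor_stride=5, calculate_stride=[3, 5])
print(strides)
print(window_starts(0, 10, strides))
```

Because the start positions no longer map to a single stride, the stride cannot be part of the output column name, which is consistent with identifying feature descriptors by their window and output names only.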

✨ Support setpoints

  • support setpoints
  • test the above
    => Note: we allow setpoints with different timezones, as their np.datetime64 conversion allows comparison.
  • trim range if not in data
    => Decided to not do this! (as this is a somewhat ambiguous operation)
    As using segment indexes is already an advanced operation, it is the user's responsibility to either trim the segment indexes or make their features robust.
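
To make the "no trimming" decision concrete, here is a minimal sketch of slicing data by user-provided segment start and end indexes. It uses `bisect` from the standard library as a stand-in for a sorted search; `slice_segments` and `robust_mean` are hypothetical names, and treating each segment as the half-open interval [start, end) is an assumption of this example.

```python
from bisect import bisect_left


def slice_segments(index, values, starts, ends):
    """Return the values that fall in each half-open segment [start, end).

    `index` must be sorted. Segments that lie partly or fully outside the
    data are NOT trimmed; they simply yield shorter (or empty) slices.
    """
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    segments = []
    for seg_start, seg_end in zip(starts, ends):
        lo = bisect_left(index, seg_start)
        hi = bisect_left(index, seg_end)
        segments.append(values[lo:hi])
    return segments


def robust_mean(segment):
    """A feature function that tolerates empty segments."""
    return sum(segment) / len(segment) if segment else float("nan")


index = [0, 1, 2, 3, 4, 5]
values = [10, 11, 12, 13, 14, 15]
# The last segment lies entirely outside the data and yields an empty slice.
segments = slice_segments(index, values, starts=[1, 4, 8], ends=[3, 9, 12])
print(segments)
print([robust_mean(s) for s in segments])
```

A feature function that divides by the segment length without a guard would raise on the empty slice, which is why robustness is left to the user.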

🙈 other stuff

  • 🐛 fix bug with features.logger that does not handle numeric window & strides
    • improve parsing for window and stride values in _parse_logging_execution_to_df
    • test the above
  • 👀 other minor stuff
    • support offline data load
    • warn the user with a RuntimeWarning when the index of the data (passed to FeatureCollection.calculate) is not monotonically increasing
    • test the above
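
The monotonicity warning can be sketched as follows. This is a stdlib-only illustration, not the check used in tsflex; the function name `check_monotonic_increasing` and the warning message are assumptions, and "monotonically increasing" is taken to allow repeated values.

```python
import warnings


def check_monotonic_increasing(index, name="data"):
    """Warn (without raising) when `index` is not monotonically increasing.

    Returns True when the index is monotonic, False otherwise.
    """
    is_monotonic = all(a <= b for a, b in zip(index, index[1:]))
    if not is_monotonic:
        warnings.warn(
            f"The index of {name} is not monotonically increasing.",
            RuntimeWarning,
        )
    return is_monotonic


check_monotonic_increasing([1, 2, 2, 5])  # silent
check_monotonic_increasing([1, 3, 2])     # emits a RuntimeWarning
```

A warning rather than an exception lets the calculation proceed while still flagging that a sorted search over the index may give unexpected windows.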

@codecov-commenter

codecov-commenter commented Jun 23, 2022

Codecov Report

Merging #71 (e684a09) into main (468f2b3) will increase coverage by 0.16%.
The diff coverage is 99.59%.

@@            Coverage Diff             @@
##             main      #71      +/-   ##
==========================================
+ Coverage   97.73%   97.89%   +0.16%     
==========================================
  Files          23       22       -1     
  Lines        1106     1238     +132     
==========================================
+ Hits         1081     1212     +131     
- Misses         25       26       +1     
Impacted Files                                          Coverage Δ
tsflex/features/function_wrapper.py                     100.00% <ø> (ø)
tsflex/utils/data.py                                    92.15% <83.33%> (-1.18%) ⬇️
tsflex/features/feature.py                              97.91% <100.00%> (+0.29%) ⬆️
tsflex/features/feature_collection.py                   99.52% <100.00%> (-0.48%) ⬇️
tsflex/features/integrations.py                         98.48% <100.00%> (-1.52%) ⬇️
tsflex/features/logger.py                               100.00% <100.00%> (ø)
tsflex/features/segmenter/strided_rolling.py            98.29% <100.00%> (+2.79%) ⬆️
...flex/features/segmenter/strided_rolling_factory.py   100.00% <100.00%> (ø)
tsflex/features/utils.py                                100.00% <100.00%> (ø)
tsflex/utils/attribute_parsing.py                       100.00% <100.00%> (ø)
... and 5 more


@jvdd jvdd requested a review from jonasvdd July 7, 2022 10:57
@jvdd jvdd requested a review from emield12 July 13, 2022 20:24
Member

@jonasvdd jonasvdd left a comment

Looks good to me, just need to think sometime later about the TimeSequenceStridedRolling

tsflex/features/feature_collection.py (outdated, resolved)
tests/test_strided_rolling.py (outdated, resolved)
tests/test_features_feature_collection.py (outdated, resolved)
✨ decouple stride + support setpoints
@jvdd changed the title from "♻️ refactor indexing" to "♻️ refactor indexing + ✂️ decouple stride + ✨ support setpoints" on Aug 6, 2022
@jvdd
Member Author

jvdd commented Aug 6, 2022

FYI: all tests that fail now are due to TimeIndexSampleStridedRolling, which I did not update accordingly in this PR => I believe we should decide on whether we include the support for this or drop it... @jonasvdd @emield12

@jvdd
Member Author

jvdd commented Aug 19, 2022

The macOS-3.10 test is failing due to problems with installing matrixprofile (blue-yonder/tsfresh#937).

@jvdd changed the title from "♻️ refactor indexing + ✂️ decouple stride + ✨ support setpoints" to "♻️ refactor indexing + ✂️ decouple stride & window + ✨ support segment ixs" on Aug 20, 2022
@jvdd changed the title from "♻️ refactor indexing + ✂️ decouple stride & window + ✨ support segment ixs" to "♻️ refactor indexing + ✂️ decouple stride & window + ✨ support segment idxs" on Aug 20, 2022
@jonasvdd
Member

is this ready for a re-review? @jvdd

@arturdaraujo

Waiting for this awesome PR to be merged! A lot of cool stuff!

tsflex/utils/data.py (outdated, resolved)
tsflex/utils/data.py (outdated, resolved)
tsflex/features/segmenter/strided_rolling.py (outdated, resolved)
tsflex/features/segmenter/strided_rolling.py (outdated, resolved)
tsflex/features/segmenter/strided_rolling.py (resolved)
tsflex/features/segmenter/strided_rolling.py (outdated, resolved)
tsflex/features/segmenter/strided_rolling.py (resolved)
tsflex/features/feature_collection.py (outdated, resolved)
tsflex/features/feature_collection.py (outdated, resolved)
tsflex/features/feature_collection.py (resolved)
@jvdd
Member Author

jvdd commented Oct 11, 2022

Tests on Windows are failing because of a poetry installation issue (snok/install-poetry#94).

@jvdd jvdd merged commit 9787fc0 into main Oct 11, 2022
@jvdd jvdd deleted the refactoring branch January 29, 2023 10:47

4 participants