
ScalerTransform: micro and macro mode #61

Merged
merged 7 commits into master from ETNA-745 on Sep 17, 2021

Conversation

@alex-hse-repository (Collaborator) commented Sep 17, 2021

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (checklist)

  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contribution guide?
  • Did you check the code style with make lint (poetry install -E style)?
  • Did you make sure to update the docs? We use the NumPy format for all methods and classes.
  • Did you write any new necessary tests?
  • Did you check that your code passes the unit tests (pytest tests/)?
  • Did you add your new functionality to the docs?
  • Did you update the CHANGELOG?

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

  • Add micro and macro mode for ScalerTransform (usage sketch below)
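
A minimal usage sketch of the new parameter. This is a sketch only: the toy data is made up, and the import paths and TSDataset construction follow the etna API around the time of this PR and may have changed since.

import numpy as np
import pandas as pd

from etna.datasets import TSDataset
from etna.transforms import StandardScalerTransform

# toy long-format data with two segments on very different scales
df = pd.DataFrame(
    {
        "timestamp": list(pd.date_range("2021-01-01", periods=30)) * 2,
        "segment": ["segment_a"] * 30 + ["segment_b"] * 30,
        "target": np.concatenate([np.arange(30.0), np.arange(30.0) * 100]),
    }
)
ts = TSDataset(TSDataset.to_dataset(df), freq="D")

# "per-segment" (default): a separate scaler is fit for every segment
per_segment_scaler = StandardScalerTransform(in_column="target", mode="per-segment")

# "macro": one scaler is fit on the values of all segments glued together
macro_scaler = StandardScalerTransform(in_column="target", mode="macro")

ts.fit_transform([macro_scaler])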

Related Issue

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Closing issues

Put closes #XXXX in your comment to auto-close the issue that your PR fixes (if any).

mode:
"macro" or "per-segment", the way to transform features over segments.
If "macro", transforms features globally, gluing the corresponding columns of all segments.
If "per-segment", transforms features for each segment separately.
Contributor:

Maybe we should add a Raises block here.

Collaborator Author (@alex-hse-repository), Sep 17, 2021:

We don't have a Raises block in the similar case in Metric.

Collaborator Author:

Done

Raises
------
ValueError:
if incorrect strategy given
Contributor:

strategy -> mode

Raises
------
ValueError:
if incorrect strategy given
Contributor:

strategy -> mode
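
For clarity, after the strategy -> mode rename the docstring block presumably reads (reconstructed from the comment, not copied from the diff):

Raises
------
ValueError:
    if incorrect mode given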

Collaborator Author:

Done

@codecov-commenter commented Sep 17, 2021

Codecov Report

Merging #61 (6514f31) into master (26d88da) will increase coverage by 0.15%.
The diff coverage is 92.85%.


@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
+ Coverage   90.33%   90.48%   +0.15%     
==========================================
  Files          45       45              
  Lines        1976     2008      +32     
==========================================
+ Hits         1785     1817      +32     
  Misses        191      191              
Impacted Files Coverage Δ
etna/transforms/sklearn.py 95.65% <91.89%> (-4.35%) ⬇️
etna/transforms/power.py 100.00% <100.00%> (ø)
etna/transforms/scalers.py 100.00% <100.00%> (ø)
etna/model_selection/backtest.py 97.41% <0.00%> (-0.03%) ⬇️
etna/models/sklearn.py 100.00% <0.00%> (ø)
etna/transforms/log.py 100.00% <0.00%> (ø)
etna/models/catboost.py 100.00% <0.00%> (ø)
etna/transforms/lags.py 100.00% <0.00%> (ø)
etna/transforms/imputation.py 100.00% <0.00%> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 26d88da...6514f31. Read the comment docs.

Mr-Geekman previously approved these changes Sep 17, 2021

@martins0n (Contributor) left a comment:

We should add the same for power transforms.

@@ -23,6 +24,7 @@ def __init__(
inplace: bool = True,
with_mean: bool = True,
with_std: bool = True,
mode: str = TransformMode.per_segment,
Contributor:

The type should be TransformMode.

Collaborator Author:

Here we pass the string, not the TransformMode.

Collaborator Author:

Done

Contributor:

>>> from etna.transforms.sklearn import TransformMode
>>> TransformMode.per_segment
<TransformMode.per_segment: 'per-segment'>
>>> type(TransformMode.per_segment)
<enum 'TransformMode'>

?

Collaborator Author:

There should be "per-segment" instead of TransformMode.per_segment.
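
A standalone sketch of why the plain string is accepted, assuming TransformMode subclasses str (a common pattern for this kind of enum; the actual definition in the PR may differ):

from enum import Enum


class TransformMode(str, Enum):
    """Hand-written re-creation for illustration only."""

    macro = "macro"
    per_segment = "per-segment"


# the member and the plain string compare equal, so callers can pass either
assert TransformMode.per_segment == "per-segment"
# and the enum can be constructed from the string for validation
assert TransformMode("per-segment") is TransformMode.per_segment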

macro = "macro"
per_segment = "per-segment"

@classmethod
Contributor:

We should remove this.
Maybe we will add this later with a mixin for every enum.

Collaborator Author:

Done

segments = sorted(set(df.columns.get_level_values("segment")))
x = df.loc[:, pd.IndexSlice[:, self.in_column]]
x = pd.concat([x[segment] for segment in segments]).values
return x
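
To illustrate what this reshape produces, here is a self-contained toy recreation (hypothetical data; in_column is hardcoded to "target" here):

import numpy as np
import pandas as pd

# wide etna-style frame: columns are a (segment, feature) MultiIndex
idx = pd.date_range("2021-01-01", periods=3)
columns = pd.MultiIndex.from_product(
    [["segment_a", "segment_b"], ["target"]], names=["segment", "feature"]
)
df = pd.DataFrame(np.arange(6).reshape(3, 2), index=idx, columns=columns)

segments = sorted(set(df.columns.get_level_values("segment")))
x = df.loc[:, pd.IndexSlice[:, "target"]]
x = pd.concat([x[segment] for segment in segments]).values

print(type(x), x.shape)  # <class 'numpy.ndarray'> (6, 1)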
Contributor:

x is a numpy array, isn't it?

Collaborator Author:

Done

x = self._reshape(df)
transformed = self.transformer.inverse_transform(X=x)
transformed = self._inverse_reshape(df, transformed)

Contributor:

It's not great to end the chain without an else statement -- this is a kind of pattern matching, and we should raise an exception or otherwise handle the remaining cases.
The probability of self.mode not matching any enum value is close to zero, but it can happen: for example, if you change mode by hand in tests, you could pass anything.
So we should raise a ValueError in the else block.
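
A sketch of the suggested pattern inside the method body (only _reshape, mode and TransformMode come from this PR; the per-segment branch and the message text are illustrative):

if self.mode == TransformMode.per_segment:
    x = df.loc[:, pd.IndexSlice[:, self.in_column]].values
elif self.mode == TransformMode.macro:
    x = self._reshape(df)
else:
    # should be unreachable, but fail loudly if mode was set to something unexpected
    raise ValueError(f"Unknown mode: {self.mode}; expected 'per-segment' or 'macro'.")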

Collaborator Author:

Done

@martins0n (Contributor) left a comment:

👍

@martins0n merged commit ada6523 into master on Sep 17, 2021
@martins0n deleted the ETNA-745 branch on September 17, 2021 at 13:45
Mr-Geekman pushed a commit that referenced this pull request Sep 20, 2021
* Add micro and macro mode for ScalerTransform

* Add macro/micro mode for power transformers


Co-authored-by: a.p.chikov <a.p.chikov@macbook-a.p.chikov>
julia-shenshina pushed a commit that referenced this pull request Sep 22, 2021
* Add cap and floor to prophet regressor columns

* Refactor if-block

* Add changelog (#60)

* Create CHANGELOG.md
* Add version 1.0.0
* add unreleased changes

* LinearTrendTransform bug fix (#49)

* Fix tsdataset getitem method (#25)

* fix tsdataset getitem method

* fix style

* add test for all indexes

* fix imports

* add Nan info

* fix typo

Co-authored-by: an.alekseev <an.alekseev@tinkoff.ru>

* ScalerTransform: micro and macro mode (#61)

* Add micro and macro mode for ScalerTransform

* Add macro/micro mode for power transformers


Co-authored-by: a.p.chikov <a.p.chikov@macbook-a.p.chikov>

* Revert "Add cap and floor to prophet regressor columns"

This reverts commit 8c60bba.

* Add processing regressor_cap and regressor_floor features, add notes in documentation about it

* Update changelog

* Fix missing add regressor

* Fix wrong code, edit test

* Reformat code

Co-authored-by: Andrey Alekseev <ilekseev@gmail.com>
Co-authored-by: alex-hse-repository <55380696+alex-hse-repository@users.noreply.github.com>
Co-authored-by: an.alekseev <an.alekseev@tinkoff.ru>
Co-authored-by: a.p.chikov <a.p.chikov@macbook-a.p.chikov>