Implement TS2VecModel #253

egoriyaa · 2024-02-27T02:01:23Z

Before submitting (must do checklist)

Did you read the contribution guide?
Did you update the docs? We use NumPy format for all the methods and classes.
Did you write any new necessary tests?
Did you update the CHANGELOG?

Proposed Changes

Closing issues

closes #245

github-actions · 2024-02-27T02:06:09Z

🚀 Deployed on https://deploy-preview-253--etna-docs.netlify.app

etna/libs/ts2vec/ts2vec.py

etna/transforms/embeddings/models/ts2vec.py

tests/test_transforms/test_embeddings/test_models/utils.py

etna/libs/ts2vec/__init__.py

egoriyaa · 2024-02-27T02:32:49Z

Can I change anything in libs/, or should I leave everything as it was in the original source code?

For example, unused imports and type annotations.

codecov · 2024-02-27T02:42:44Z

Codecov Report

Attention: Patch coverage is 75.63218%, with 106 lines in your changes missing coverage. Please review.

Project coverage is 88.64%. Comparing base (aa9890a) to head (fa1e821).

| Files | Patch % | Lines |
|---|---|---|
| etna/libs/ts2vec/ts2vec.py | 66.26% | 56 Missing ⚠️ |
| etna/libs/ts2vec/utils.py | 56.89% | 25 Missing ⚠️ |
| etna/libs/ts2vec/encoder.py | 65.45% | 19 Missing ⚠️ |
| etna/transforms/embeddings/models/base.py | 82.35% | 3 Missing ⚠️ |
| etna/libs/ts2vec/losses.py | 95.55% | 2 Missing ⚠️ |
| etna/libs/ts2vec/dilated_conv.py | 97.14% | 1 Missing ⚠️ |

Additional details and impacted files

@@              Coverage Diff               @@
##           embeddings     #253      +/-   ##
==============================================
- Coverage       89.07%   88.64%   -0.43%     
==============================================
  Files             199      208       +9     
  Lines           13249    13684     +435     
==============================================
+ Hits            11801    12130     +329     
- Misses           1448     1554     +106


etna/transforms/embeddings/models/base.py

etna/transforms/embeddings/models/ts2vec.py

tests/test_transforms/test_embeddings/test_models/test_ts2vec.py

etna/transforms/embeddings/models/ts2vec.py

d-a-bunin · 2024-02-28T07:06:41Z

etna/transforms/embeddings/models/ts2vec.py

+        )
+
+    def _prepare_data(self, df: pd.DataFrame) -> np.ndarray:
+        """Prepare data for the embedding model."""


We should probably clarify what this transformation does. E.g. in the comments or in the docstring.

etna/transforms/embeddings/models/ts2vec.py

egoriyaa · 2024-02-29T07:50:09Z

etna/transforms/embeddings/models/ts2vec.py

+        self.embedding_model.fit(train_data=x, n_epochs=self.n_epochs, n_iters=self.n_iters, verbose=self.verbose)
+        return self
+
+    def encode_segment(


I decided to pass the encode parameters here.
The advantage is that we can call the encode methods several times with different parameters on a single fitted model.
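The trade-off described above (encoding parameters passed per call rather than fixed in the constructor) can be illustrated with a toy stand-in. `ToyEmbeddingModel` and its `window` parameter are hypothetical and are not part of etna or TS2Vec; this is only a sketch of one fitted object serving several encode calls.

```python
import numpy as np


class ToyEmbeddingModel:
    """Hypothetical stand-in for an embedding model; not the etna API."""

    def __init__(self, output_dims: int = 2, seed: int = 0):
        self.output_dims = output_dims
        self.seed = seed
        self._weights = None

    def fit(self, x: np.ndarray) -> "ToyEmbeddingModel":
        # x has shape (n_segments, n_timestamps, input_dims).
        rng = np.random.default_rng(self.seed)
        self._weights = rng.normal(size=(x.shape[2], self.output_dims))
        return self

    def encode_segment(self, x: np.ndarray, window: int) -> np.ndarray:
        # Encoding parameters arrive per call, so they are not frozen at construction time.
        if self._weights is None:
            raise ValueError("Model is not fitted")
        tail = x[:, -window:, :]  # last `window` timestamps of every segment
        return tail.mean(axis=1) @ self._weights


x = np.arange(24, dtype=float).reshape(2, 6, 2)
model = ToyEmbeddingModel(output_dims=3).fit(x)

# One fitted model, two encodings with different parameters.
short_view = model.encode_segment(x, window=2)
long_view = model.encode_segment(x, window=6)
```

Both results have shape `(n_segments, output_dims)`; only the call-time parameter differs between them.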

egoriyaa · 2024-02-29T07:51:03Z

etna/transforms/embeddings/models/ts2vec.py

+            data=x,
+            mask=mask,
+            encoding_window=encoding_window,
+            causal=True,


@Ama16 is that correct for encode_window?

If we want to generate embeddings without leaks, and prevent the user from generating them with leaks, then yes.
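To make the leak concern concrete, here is a minimal sketch that uses rolling means as a stand-in for an encoder. The two helper functions are hypothetical and unrelated to the TS2Vec internals; they only show what "causal" buys: features for past timestamps do not change when future values change.

```python
import numpy as np


def causal_window_mean(x: np.ndarray, window: int) -> np.ndarray:
    """Feature at step t uses only x[t - window + 1 : t + 1] (no future values)."""
    out = np.empty(len(x), dtype=float)
    for t in range(len(x)):
        start = max(0, t - window + 1)
        out[t] = x[start : t + 1].mean()
    return out


def centered_window_mean(x: np.ndarray, window: int) -> np.ndarray:
    """Feature at step t also looks `window // 2` steps ahead, so it leaks the future."""
    half = window // 2
    out = np.empty(len(x), dtype=float)
    for t in range(len(x)):
        start = max(0, t - half)
        out[t] = x[start : t + half + 1].mean()
    return out


series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
changed_future = series.copy()
changed_future[4:] = 100.0  # alter only the last two points

# Compare the features of the first four steps before and after the change.
causal_same = np.array_equal(
    causal_window_mean(series, 3)[:4], causal_window_mean(changed_future, 3)[:4]
)
centered_same = np.array_equal(
    centered_window_mean(series, 3)[:4], centered_window_mean(changed_future, 3)[:4]
)
```

Here `causal_same` is `True` and `centered_same` is `False`: the centered feature at step 3 reads step 4, which is exactly the kind of leak that forcing `causal=True` is meant to rule out.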

etna/transforms/embeddings/models/ts2vec.py

d-a-bunin · 2024-02-29T09:17:05Z

tests/test_transforms/test_embeddings/test_models/test_ts2vec.py

+
+
+@pytest.mark.parametrize("output_dims", [2, 3])
+def test_full_series_equal_values(simple_ts_with_exog, output_dims):


I think checking that values are the same is a part of the format. Or, as I suggested above, we shouldn't make a copy of the same data.

Ama16 · 2024-02-29T10:40:22Z

tests/test_transforms/test_embeddings/test_models/test_ts2vec.py

+
+
+@pytest.mark.parametrize("output_dims", [2, 3])
+def test_full_series_equal_values(simple_ts_with_exog, output_dims):


What are you testing in this test? That all values are equal?
Then why is there a function that cuts off the NaNs? On the contrary, you need to check that it generates the same values with the NaNs included.
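One possible way to keep the NaNs in the check is a NaN-aware comparison instead of trimming. The arrays below are made-up embeddings, not output of the model under review; the sketch only shows the NumPy comparison behavior.

```python
import numpy as np

# Hypothetical embeddings with shape (n_timestamps, output_dims); the first row is NaN,
# as could happen when a segment starts later than the others.
embeddings = np.array([[np.nan, np.nan], [0.5, -1.2], [0.5, -1.2]])
expected = embeddings.copy()

# Plain `==` treats NaN as unequal to itself, which is what tempts one to trim NaNs first.
naive_equal = bool((embeddings == expected).all())

# NaN-aware comparisons keep the NaN rows inside the check.
nan_aware_equal = np.array_equal(embeddings, expected, equal_nan=True)
np.testing.assert_array_equal(embeddings, expected)  # NaNs in matching positions pass
```

With this approach the test also asserts that NaNs appear in the same positions, which a trimming helper would hide.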

etna/transforms/embeddings/models/base.py

d-a-bunin · 2024-03-04T07:45:07Z

etna/transforms/embeddings/models/ts2vec.py

+        n_timestamps = len(df.index)
+        n_segments = df.columns.get_level_values("segment").nunique()
+        df = df.sort_index(axis=1)
+        x = df.values.reshape((n_timestamps, n_segments, self.input_dims)).transpose(1, 0, 2)


What happens if df contains features other than the target? Do we want to consider such a case?

Let's assume that the common interface is a Transform, where the in_column attribute is used.
We can add a note about this case here.
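For readers following the thread, the reshape in the hunk above can be traced on a tiny frame. This is a sketch under assumptions: the column level name `"feature"`, the segment names, and the values are made up for illustration, and `input_dims` is a plain variable rather than the model attribute.

```python
import numpy as np
import pandas as pd

# Toy wide-format frame: 3 timestamps, 2 segments, 2 features per segment.
# Column order is deliberately unsorted.
columns = pd.MultiIndex.from_tuples(
    [("b", "target"), ("b", "exog"), ("a", "target"), ("a", "exog")],
    names=["segment", "feature"],
)
df = pd.DataFrame(
    [[10, 11, 20, 21], [12, 13, 22, 23], [14, 15, 24, 25]],
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
    columns=columns,
)

input_dims = 2  # number of features per segment
n_timestamps = len(df.index)
n_segments = df.columns.get_level_values("segment").nunique()

# Sorting groups the columns by segment, then by feature name.
df = df.sort_index(axis=1)

# (n_timestamps, n_segments * input_dims) -> (n_segments, n_timestamps, input_dims)
x = df.values.reshape((n_timestamps, n_segments, input_dims)).transpose(1, 0, 2)
```

After sorting, `x[0]` holds segment `a` and `x[1]` holds segment `b`, with features in sorted name order (`exog`, then `target`). If the number of features per segment differed from `input_dims`, the `reshape` call would raise a `ValueError`, which is one reason the case raised in this thread deserves a note.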

d-a-bunin · 2024-03-04T07:46:34Z

etna/transforms/embeddings/models/ts2vec.py

+
+        Notes
+        -----
+        Model works with the index sorted in the alphabetic order. Thus, output embeddings correspond to the segments,


I don't really think it should be in a Notes section. It is really important for the user, not some implementation detail. There are two options:

1. Return segments in their order in df.
2. Write it in the docstring's extended description: Output embeddings correspond to the segments, sorted in alphabetical order.

@Ama16 why is there such a condition?

The same as above
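The first option (returning embeddings in the order the segments appear in df) could be sketched as a simple row permutation. The segment names and embedding values below are hypothetical; the sketch assumes the model emits one row per segment in sorted order.

```python
import numpy as np

# Segment order as it appears in the user's frame (hypothetical names).
original_order = ["store_b", "store_c", "store_a"]

# Order the model sees after sorting, and toy embeddings produced in that order.
sorted_order = sorted(original_order)
embeddings_sorted = np.array([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]])  # rows: a, b, c

# Map each original segment to its row in the sorted output, then reorder.
row_of = {segment: i for i, segment in enumerate(sorted_order)}
restore = [row_of[segment] for segment in original_order]
embeddings_original = embeddings_sorted[restore]
```

Row `i` of `embeddings_original` then belongs to `original_order[i]`, so the caller does not need to know about the internal sorting.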

d-a-bunin · 2024-03-04T07:46:46Z

etna/transforms/embeddings/models/ts2vec.py

+
+        Notes
+        -----
+        Model works with the index sorted in the alphabetic order. Thus, output embeddings correspond to the segments,


Same about this.

The same as above

* Implement TS2VecModel (#253)
* add ts2vec model
* delete unnecessary utils
* add multiscale mode
* revert to common encode in model class
* lints
* reformat save method, add _is_fitted attr
* fix embeddings shapes
* fix
* one more fix
* pass numpy array to fit
* add tests checking nans in embeddings
* update changelog

---------

Co-authored-by: Egor Baturin <egoriyaa@github.com>

* Implement EmbeddingSegmentTransform and EmbeddingWindowTransform (#265)
* add transforms
* update changelog
* fix ts2vec tests
* fix
* update rst, encoding_params
* fix
* fix
* fix
* fix docstring
* add training_params
* add freeze method
* fix inference tests
* lints
* fix lisence
* fix lisence, fix docs
* fix quotes

---------

Co-authored-by: Egor Baturin <egoriyaa@github.com>

* Implement TSTCC (#294)
* add tstcc
* add einops package
* remove pd.testing in inference tests
* fix
* add verbose param, refactor logging, fix warning
* fix logging loss
* add changelog
* catch torch warning
* fix
* catch nn.Conv1d warning

---------

Co-authored-by: Egor Baturin <egoriyaa@github.com>

* lints
* fix
* Add tutorial how to work with embedding models (#304)
* fix tstcc
* move lr param from __init__ to fit
* add tutorial
* fix notebook
* update changelog
* fix changelog
* lints
* fix notebook
* update readme
* fix readme
* fix readme
* write comment in libs/ts2vec/ts2vec.py
* fix notebook
* remove multiscale option in ts2vec
* lints
* fix notebook

---------

Co-authored-by: Egor Baturin <egoriyaa@github.com>

* fix atol in inference tests
* downgrade poetry

---------

Co-authored-by: Egor Baturin <egoriyaa@github.com>

add ts2vec model

98e2f79

egoriyaa added this to the Embeddings milestone Feb 27, 2024

egoriyaa requested review from Ama16 and d-a-bunin February 27, 2024 02:01

egoriyaa self-assigned this Feb 27, 2024

github-actions bot temporarily deployed to pull request February 27, 2024 02:06 Inactive

delete unnecessary utils

f00ef07

egoriyaa commented Feb 27, 2024

etna/libs/ts2vec/ts2vec.py (outdated)

egoriyaa commented Feb 27, 2024

etna/libs/ts2vec/ts2vec.py (outdated)

egoriyaa commented Feb 27, 2024

etna/transforms/embeddings/models/ts2vec.py (outdated)

github-actions bot temporarily deployed to pull request February 27, 2024 02:20 Inactive

egoriyaa commented Feb 27, 2024

etna/transforms/embeddings/models/ts2vec.py (outdated)

egoriyaa commented Feb 27, 2024

etna/transforms/embeddings/models/ts2vec.py (outdated)

egoriyaa commented Feb 27, 2024

tests/test_transforms/test_embeddings/test_models/utils.py (outdated)

egoriyaa commented Feb 27, 2024

etna/libs/ts2vec/__init__.py

add multiscale mode

b042330

github-actions bot temporarily deployed to pull request February 27, 2024 14:59 Inactive

Ama16 reviewed Feb 28, 2024

etna/transforms/embeddings/models/base.py

d-a-bunin requested changes Feb 28, 2024

Egor Baturin added 2 commits February 29, 2024 10:12

revert to common encode in model class

86e2e29

lints

e8c00d0

github-actions bot temporarily deployed to pull request February 29, 2024 07:24 Inactive

egoriyaa commented Feb 29, 2024

reformat save method, add _is_fitted attr

d9e0b3b

egoriyaa requested review from Ama16 and d-a-bunin February 29, 2024 08:49

github-actions bot temporarily deployed to pull request February 29, 2024 08:51 Inactive

d-a-bunin requested changes Feb 29, 2024

Ama16 reviewed Feb 29, 2024

Egor Baturin added 3 commits March 4, 2024 09:00

fix embeddings shapes

39bb49d

fix

ae11d75

one more fix

3503533

egoriyaa requested review from Ama16 and d-a-bunin March 4, 2024 06:04

github-actions bot temporarily deployed to pull request March 4, 2024 06:48 Inactive

d-a-bunin reviewed Mar 4, 2024

pass numpy array to fit

48c1a94

egoriyaa requested a review from d-a-bunin March 5, 2024 11:30

github-actions bot temporarily deployed to pull request March 5, 2024 11:33 Inactive

add tests checking nans in embeddings

0835da9

github-actions bot temporarily deployed to pull request March 5, 2024 16:26 Inactive

update changelog

fa1e821

github-actions bot temporarily deployed to pull request March 6, 2024 07:54 Inactive

d-a-bunin approved these changes Mar 6, 2024

egoriyaa merged commit 61d8077 into embeddings Mar 6, 2024
16 checks passed

d-a-bunin mentioned this pull request Mar 7, 2024

Implement TS2VecModel #245

Closed
