
[DO NOT MERGE] Experimental PR to demonstrate TimeXer model usage with v2 design #1830

Draft
wants to merge 47 commits into base: main
Conversation

@PranavBhatP (Contributor)

Description

This PR demonstrates compatibility of TimeXer with the existing D1 implementation. It does not reflect the final implementation or design of the model. Its implementation has been slightly tweaked to also work with the D2 layer (EncoderDecoderDataModule), but with some changes to the data module. BaseModel is also being inherited from; this will be replaced with 'BaseTSLibModel' in future commits, along with a dedicated D2 layer for tslib models only.
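The planned inheritance change can be sketched roughly as follows. This is a minimal illustration only: the class bodies are placeholders, and everything except the names `BaseModel`, `BaseTSLibModel`, and `TimeXer` (which appear in the description above) is a hypothetical assumption, not the actual pytorch-forecasting code.

```python
# Hypothetical sketch of the inheritance change described in the PR
# description. Method bodies are illustrative placeholders only.

class BaseModel:
    """Stand-in for the v2 base model that TimeXer currently inherits from."""

    def __init__(self, metadata: dict):
        self.metadata = metadata


class BaseTSLibModel(BaseModel):
    """Planned tslib-specific base class (to be introduced in future
    commits alongside a dedicated tslib D2 layer, per the description)."""

    def __init__(self, metadata: dict):
        super().__init__(metadata)
        # tslib models share a (context_length, prediction_length)
        # windowing convention; a tslib base class can normalise it once.
        self.context_length = metadata["context_length"]
        self.prediction_length = metadata["prediction_length"]


class TimeXer(BaseTSLibModel):
    """TimeXer would then inherit from BaseTSLibModel instead of BaseModel."""


model = TimeXer({"context_length": 96, "prediction_length": 24})
```

The point of the intermediate class is that tslib-wide conventions live in one place, so each ported tslib model only implements its own architecture.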

Checklist

  • Linked issues (if existing)
  • Amended changelog for large changes (and added myself there as contributor)
  • Added/modified tests
  • Used pre-commit hooks when committing to ensure that code is compliant with hooks. Install hooks with `pre-commit install`.
    To run hooks independent of commit, execute `pre-commit run --all-files`

ReviewNB bot

Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter notebooks.

@PranavBhatP (Contributor, Author)

FYI @phoeenniixx @fkiraly

@phoeenniixx (Contributor) commented May 12, 2025

Hi @PranavBhatP, thanks for the PR!
I wanted to ask one thing: you said today that the current implementation of data_module needs to be changed for the thuml implementation. May I ask what changes you are referring to?
(except adding target, as this point is already on the wishlist that we are going to incorporate in future iterations)

And how does the implementation change between v1 and v2?

@PranavBhatP (Contributor, Author) commented May 13, 2025

@phoeenniixx, currently I do not have a comprehensive design for the D2 layer, but here are some reasons why we need a separate D2 for tslib.

you said today that the current implementation of data_module needs to be changed for thuml implementation. May I ask what changes you are talking about?

  1. One of the main reasons is that the existing data_module is specific to encoder-decoder architectures, and many models in tslib aren't based on this. Models like TiDE and TSMixer (link for both), for example, are MLP-based, and this would require a separate data module with different mechanisms for things like windowing, feature handling, metadata, etc.

  2. The reason why I had chosen TimeXer was your existing EncoderDecoderDataModule, which works well with TimeXer, considering that it is natively a transformer-based encoder-decoder architecture with attention. It is working well so far in this PR, which is a good sign.

  3. Going forward, I think tslib might need one base D2 layer for all kinds of models, with specific implementations on top of it for encoder-decoder models, MLP models, etc. This is open to discussion.

  4. There are plans to add FMs (foundation models) into PTF (something like IBM's TinyTimeMixer). At that point, your current data module will be very helpful since it is tailored specially for such models.
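The layering proposed in point 3 above can be sketched like this. Every class and method name here is a hypothetical stand-in invented for illustration (only `EncoderDecoderDataModule`, `TiDE`, `TSMixer`, and `TimeXer` are named in the discussion); the actual D2 design is explicitly still open.

```python
# Illustrative sketch: one base D2 layer for all tslib models, with
# architecture-specific data modules on top. All names are hypothetical,
# not the actual pytorch-forecasting API.

class TslibBaseDataModule:
    """Shared D2 logic: sliding (context, prediction) windows over a series."""

    def __init__(self, series, context_length, prediction_length):
        self.series = series
        self.context_length = context_length
        self.prediction_length = prediction_length

    def windows(self):
        """Yield (context, target) pairs by sliding one window at a time."""
        total = self.context_length + self.prediction_length
        for start in range(len(self.series) - total + 1):
            ctx = self.series[start : start + self.context_length]
            tgt = self.series[start + self.context_length : start + total]
            yield ctx, tgt


class EncoderDecoderTslibDataModule(TslibBaseDataModule):
    """For encoder-decoder models (e.g. TimeXer): keeps encoder and decoder
    inputs separate, mirroring the existing EncoderDecoderDataModule."""

    def windows(self):
        for ctx, tgt in super().windows():
            yield {"encoder_cont": ctx, "decoder_cont": tgt}


class MLPTslibDataModule(TslibBaseDataModule):
    """For MLP-based models (e.g. TiDE, TSMixer): one flat input window."""

    def windows(self):
        for ctx, tgt in super().windows():
            yield {"input": ctx, "target": tgt}
```

The base class owns windowing once; the subclasses only decide how a window is packaged for their model family, which is the split point 1 argues for.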

@fkiraly (Collaborator) commented May 13, 2025

some reasons why we need a separate D2 for tslib.

Yes, the observation that different models come with their specific loaders was the prime reason why we split D1 and D2, where D2 would be specific to models, and D1 would be unified/generic.

So we should keep D1, but D2 probably needs to be completely new. I think this is a good exercise to check the validity of our design! I anticipate we will learn a lot through this.

PS: your code formatting checks are failing.


codecov bot commented May 14, 2025

Codecov Report

Attention: Patch coverage is 25.16411% with 342 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@5685c59). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pytorch_forecasting/models/timexer/sub_modules.py | 21.51% | 124 Missing ⚠️ |
| ...els/temporal_fusion_transformer/tft_version_two.py | 0.00% | 89 Missing ⚠️ |
| pytorch_forecasting/models/timexer/_timexer.py | 19.10% | 72 Missing ⚠️ |
| ...rch_forecasting/models/base/base_model_refactor.py | 23.28% | 56 Missing ⚠️ |
| pytorch_forecasting/data/data_module.py | 87.50% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1830   +/-   ##
=======================================
  Coverage        ?   81.92%           
=======================================
  Files           ?       56           
  Lines           ?     6030           
  Branches        ?        0           
=======================================
  Hits            ?     4940           
  Misses          ?     1090           
  Partials        ?        0           
| Flag | Coverage Δ |
|---|---|
| cpu | 81.92% <25.16%> (?) |
| pytest | 81.92% <25.16%> (?) |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@phoeenniixx (Contributor)

Hi @PranavBhatP, I would suggest merging main into this branch so that I can fetch it for tests :)

@PranavBhatP (Contributor, Author)

You can use it now @phoeenniixx.

Comment on lines +48 to +49

    context_length: int,
    prediction_length: int,
@phoeenniixx (Contributor) May 31, 2025

Sorry, I am not very familiar with tslib nomenclature, so I just want to ask: can context_length and prediction_length be mapped to any of the ptf-v2 metadata keys, like max_encoder_length etc., or are they totally different?

@PranavBhatP (Contributor, Author)

context_length and prediction_length serve the same function as max_encoder_length and max_decoder_length.

@phoeenniixx (Contributor)

Then wouldn't it be a good idea to use data_module.metadata to initialise them, rather than making the user input these values again? (as they entered the same info into the data_module as well)

@PranavBhatP (Contributor, Author) Jun 1, 2025

This is an old implementation from the v1 PR for TimeXer. I tried to make minimal breaking changes when creating this PR, so it still had the context and prediction length from v1. For the completely new D2 design, refer to this.

In the v1 design, we override the context_length and prediction_length parameters inside from_dataset (again, a v1 feature). The same can be done here during init; it is just missing. I assumed this change wasn't required since this PR isn't going to be merged, so it was unnecessary.
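The metadata-driven init discussed in this thread could look roughly like the sketch below. This is an assumption-laden illustration: the metadata key names (`max_encoder_length`, `max_prediction_length`) and both class bodies are hypothetical stand-ins, not the actual ptf-v2 API; only the mapping of context_length/prediction_length onto encoder/decoder lengths comes from the thread.

```python
# Sketch of initialising the model from data_module.metadata instead of
# asking the user for the same lengths twice. All names are hypothetical
# stand-ins for the real ptf-v2 classes and metadata keys.

class EncoderDecoderDataModule:
    """Minimal stand-in exposing a metadata dict like a v2 D2 layer."""

    def __init__(self, max_encoder_length: int, max_prediction_length: int):
        self.metadata = {
            "max_encoder_length": max_encoder_length,
            "max_prediction_length": max_prediction_length,
        }


class TimeXer:
    """TimeXer initialised from metadata, mirroring v1's from_dataset override."""

    def __init__(self, metadata: dict):
        # context_length / prediction_length serve the same function as
        # max_encoder_length / max_decoder_length, so map them directly
        # instead of taking them as separate constructor arguments.
        self.context_length = metadata["max_encoder_length"]
        self.prediction_length = metadata["max_prediction_length"]


dm = EncoderDecoderDataModule(max_encoder_length=96, max_prediction_length=24)
model = TimeXer(metadata=dm.metadata)
```

The user then states the window sizes exactly once, on the data module, and the model stays consistent with it by construction.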

@phoeenniixx (Contributor)

FYI @PranavBhatP, as TimeXer is not compatible with EncoderDecoderDataModule, we need to decide how to proceed (see comments: #1868 (comment), #1868 (comment)) and add tslib models to ptf-v2.

@fkiraly fkiraly moved this from PR in progress to Todo in May - Sep 2025 mentee projects Jun 6, 2025