Fix(informer): Correct tensor shape for input_size=1 #38780

Closed

Flink-ddd
Contributor

What does this PR do?
This PR fixes a RuntimeError that occurs in InformerModel when config.input_size is set to 1.

When input_size=1, the loc and scale tensors calculated by the scaler retained an extra dimension (e.g., shape [B, 1, 1] instead of [B, 1]). This incorrect shape caused a dimension mismatch error in the later expand() call that creates expanded_static_feat.

This fix applies .squeeze(-1) to both the loc and scale tensors to ensure they have the correct dimensionality before being used. This resolves the crash and allows the model to run correctly with univariate time series data.
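
The shape problem described above can be reproduced in isolation. The following is a minimal sketch in plain PyTorch, not the library's actual code: the helper `build_expanded_static_feat`, the batch size, and the sequence length are hypothetical, and the tensors are filled with placeholder values. It only assumes that the static features are built by concatenating loc and scale and then expanded along a time axis.

```python
import torch

batch_size, seq_len = 4, 10  # arbitrary values chosen for illustration

# Assumed faulty shapes: [B, 1, 1] instead of the expected [B, 1].
loc = torch.zeros(batch_size, 1, 1)
scale = torch.ones(batch_size, 1, 1)


def build_expanded_static_feat(loc, scale, seq_len):
    """Hypothetical helper: concatenate loc/scale features, then expand over time."""
    static_feat = torch.cat((loc.abs().log1p(), scale.log()), dim=1)
    # Expects static_feat to be 2-D ([B, F]) so the result is [B, seq_len, F].
    return static_feat.unsqueeze(1).expand(-1, seq_len, -1)


# With the extra dimension, expand() receives 3 sizes for a 4-D tensor and fails.
try:
    build_expanded_static_feat(loc, scale, seq_len)
except RuntimeError as err:
    print("expand failed:", err)

# Squeezing the trailing dimension restores the expected [B, 1] inputs.
fixed = build_expanded_static_feat(loc.squeeze(-1), scale.squeeze(-1), seq_len)
print(fixed.shape)
```

With the squeezed inputs, the result has shape [B, seq_len, 2], one feature for loc and one for scale.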

Fixes #38745

When InformerConfig's input_size is set to 1, the loc and scale
tensors retain an extra dimension, causing a RuntimeError.

This commit fixes the issue by overriding
and applying .squeeze(1) to both tensors to ensure correct
dimensionality for all input_size cases.

Fixes huggingface#38745
@Rocketknight1
Member

Hi @Flink-ddd, thanks for the PR! However, it creates quite a large change in the modeling file by adding a whole separate create_network_inputs function.

Can you try either:

  1. Make the change in the original create_network_inputs function that the others are inheriting from
  2. Add the new function in InformerModel rather than InformerEncoder, so that it only modifies the existing function instead of creating a whole new one?

@Flink-ddd
Contributor Author

Flink-ddd commented Jun 17, 2025

Hi @Rocketknight1, thank you so much for taking the time to review and for the great suggestion!

I completely agree that fixing the bug in the parent 'TimeSeriesTransformerModel' is a much more elegant solution than overriding the entire function. It's cleaner and avoids code duplication.
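
As a rough illustration of why a fix in the parent avoids duplication, here is a minimal pure-Python sketch. The classes `ParentModel`, `ChildModelA`, and `ChildModelB` and the helper `_drop_extra_level` are hypothetical stand-ins, not the real transformers classes, and nested lists stand in for tensors.

```python
class ParentModel:
    """Hypothetical stand-in for a shared parent model class."""

    def create_network_inputs(self, loc, scale):
        # The fix lives here once: drop a trailing singleton level if present.
        return self._drop_extra_level(loc), self._drop_extra_level(scale)

    @staticmethod
    def _drop_extra_level(rows):
        # [[x], [y]] -> [x, y]; already-flat input is returned unchanged.
        return [r[0] if isinstance(r, list) and len(r) == 1 else r for r in rows]


class ChildModelA(ParentModel):
    """Inherits the corrected method without overriding it."""


class ChildModelB(ParentModel):
    """Also inherits the corrected method."""


# Both subclasses pick up the parent's fix with no code of their own.
print(ChildModelA().create_network_inputs([[1.0], [2.0]], [[3.0], [4.0]]))
print(ChildModelB().create_network_inputs([1.0], [2.0]))
```

Because neither subclass overrides the method, any later change to the parent applies to both automatically.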

I will go ahead and close this PR, create a new branch to apply the fix directly to the parent class, and then open a new PR for review.

Thanks again for your guidance!

@Flink-ddd Flink-ddd closed this Jun 17, 2025
Successfully merging this pull request may close these issues.

[Bug][InformerForPredict] The shape will cause a problem