
[BUG] Device inconsistency in MQF2DistributionLoss raising: RuntimeError: Expected all tensors to be on the same device #1916


Open
wants to merge 4 commits into main

Conversation

@fnhirwa (Member) commented Jul 10, 2025

Fixes #1182

In the current implementation, the picnn is initialized during class construction, so it stays on its default device and is not updated when the model is moved to another device.

  • Added a device movement method to() to ensure that picnn is moved along with the loss function.
  • Added automatic device synchronization in map_x_to_distribution to ensure that picnn is on the same device as the input tensor (see the sketch below).
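
A minimal sketch of the idea, not the actual implementation: the class skeleton, the nn.Linear stand-in for picnn, and the returned value are illustrative assumptions; only the picnn attribute, the to() override, and map_x_to_distribution come from the description above.

```python
import torch
from torch import nn


class MQF2LossDeviceSketch:
    """Illustrative stand-in showing the two device-sync hooks."""

    def __init__(self, hidden_size: int = 4):
        # picnn is built at construction time, so it starts on the default
        # device (CPU) and is not moved automatically afterwards.
        self.picnn = nn.Linear(hidden_size, hidden_size)  # stand-in for the real picnn

    def to(self, device: torch.device) -> "MQF2LossDeviceSketch":
        # Explicitly relocate picnn whenever the loss itself is moved.
        self.picnn = self.picnn.to(device)
        return self

    def map_x_to_distribution(self, x: torch.Tensor) -> torch.Tensor:
        # Automatic sync: if the prediction tensor arrives on a different
        # device (e.g. CUDA or MPS), move picnn to match before using it.
        picnn_device = next(self.picnn.parameters()).device
        if picnn_device != x.device:
            self.picnn = self.picnn.to(x.device)
        return self.picnn(x)
```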

Also added tests that mock the accelerators at a high level to verify device synchronization within this class.
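
A simplified version of such a device-synchronization check: the real tests mock the accelerators, whereas this sketch just skips unavailable ones; MQF2LossDeviceSketch is the illustrative class above, not the library class.

```python
import pytest
import torch


@pytest.mark.parametrize("device_str", ["cpu", "cuda", "mps"])
def test_picnn_follows_loss_device(device_str):
    # Skip accelerators not available on this machine; the PR's tests mock
    # them at a higher level instead of requiring real hardware.
    if device_str == "cuda" and not torch.cuda.is_available():
        pytest.skip("CUDA not available")
    if device_str == "mps" and not torch.backends.mps.is_available():
        pytest.skip("MPS not available")

    loss = MQF2LossDeviceSketch()
    loss.to(torch.device(device_str))

    # picnn must end up on the same device the loss was moved to.
    assert next(loss.picnn.parameters()).device.type == device_str
```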


codecov bot commented Jul 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (main@e6a3ea7). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1916   +/-   ##
=======================================
  Coverage        ?   86.37%           
=======================================
  Files           ?       96           
  Lines           ?     7807           
  Branches        ?        0           
=======================================
  Hits            ?     6743           
  Misses          ?     1064           
  Partials        ?        0           
Flag     Coverage Δ
cpu      86.37% <100.00%> (?)
pytest   86.37% <100.00%> (?)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@fnhirwa fnhirwa marked this pull request as ready for review July 10, 2025 13:56
@fkiraly fkiraly added the bug Something isn't working label Jul 10, 2025
@fkiraly fkiraly moved this to Under review in Bugfixing - pytorch-forecasting Jul 10, 2025
@fkiraly (Collaborator) commented Jul 13, 2025

Requesting review from @PranavBhatP - is this loss currently tested via the framework?

@PranavBhatP (Contributor) commented Jul 13, 2025

No, not in the present state of main, because running a CI test on this metric requires soft dependencies. That is the only reason I can think of for excluding this metric from the current test_metrics.py.
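
(For context, soft dependencies like this are usually handled in pytest with an import skip; a minimal sketch, assuming cpflows is the optional package behind this metric:)

```python
import pytest

# Skip these tests when the optional dependency backing MQF2DistributionLoss
# is missing ("cpflows" is assumed here as the package name), so that CI runs
# without the soft dependency do not fail.
pytest.importorskip("cpflows")

from pytorch_forecasting.metrics import MQF2DistributionLoss  # noqa: E402
```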

@PranavBhatP (Contributor) commented Jul 13, 2025

I see that this PR handles the soft dependency, so I think it's a good start! Will review this soon.

Labels
bug (Something isn't working), module:metrics
Projects
Development

Successfully merging this pull request may close these issues.

[BUG] MQF2DistributionLoss - RuntimeError: Expected all tensors to be on the same device
3 participants