
Conversation

@DiweiSun (Contributor) commented Oct 27, 2025

This PR enables CI for test/quantization on the XPU device.

pytorch-bot commented Oct 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3249

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit a9043e2 with merge base ba3ac9f:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label Oct 27, 2025
@liangan1 added the ciflow/xpu label Oct 27, 2025
pytorch-bot commented Oct 27, 2025

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot bot removed the ciflow/xpu label Oct 27, 2025
@liangan1 added the topic: improvement and ciflow/xpu labels Oct 27, 2025
@pytorch-bot bot removed the ciflow/xpu label Oct 28, 2025
@xiaowangintel added the ciflow/xpu label Nov 4, 2025
@pytorch-bot bot removed the ciflow/xpu label Nov 4, 2025
@liangan1 added the ciflow/xpu label Nov 4, 2025
pytorch-bot commented Nov 4, 2025

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot bot removed the ciflow/xpu label Nov 4, 2025
@liangan1 requested a review from jerryzh168 November 5, 2025 01:16
@liangan1 added the ciflow/xpu label Nov 5, 2025
return devices


def auto_detect_device():
@jerryzh168 (Contributor) commented Nov 6, 2025

Does this include CUDA?

Collaborator replied:

Yes.


torch.manual_seed(0)

_DEVICE = auto_detect_device()
Contributor commented:

auto_detect_device seems to change what we want to test; I think previously we only wanted to test on CUDA. Can you preserve this?

Collaborator replied:

We have refined the auto_detect_device function; CPU will not be included.


class TestGPTQ(TestCase):
@unittest.skip("skipping until we get checkpoints for gpt-fast")
@unittest.skipIf(not torch.cuda.is_available(), "Need CUDA available")
Contributor commented:

Just change this to torch.accelerator.is_available()?

Collaborator replied:

Done.
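The suggested decorator change might look like this sketch (the test method name is hypothetical, and the hasattr guard is an added assumption so the module still imports on torch builds that predate torch.accelerator):

```python
import unittest
import torch

# Guarded check: evaluates to False on torch builds without the
# torch.accelerator module (assumption, not part of the PR).
_HAS_ACCELERATOR = hasattr(torch, "accelerator") and torch.accelerator.is_available()

class TestGPTQ(unittest.TestCase):
    @unittest.skip("skipping until we get checkpoints for gpt-fast")
    @unittest.skipIf(not _HAS_ACCELERATOR, "Need CUDA or XPU available")
    def test_gptq(self):  # hypothetical test name
        ...
```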



class TestMultiTensorFlow(TestCase):
@unittest.skipIf(not torch.cuda.is_available(), "Need CUDA available")
Contributor commented:

We don't want to expand the test to CPU, I think.

Collaborator replied:

Done.

@pytorch-bot bot removed the ciflow/xpu label Nov 6, 2025
@zxd1997066 force-pushed the molly/quantization_ut branch from cfe14e2 to 7b5d2c4 on November 6, 2025 02:42
@liangan1 added the xpu and ciflow/xpu labels Nov 6, 2025
pytorch-bot commented Nov 6, 2025

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot bot removed the ciflow/xpu label Nov 6, 2025
@liangan1 changed the title from "Molly/quantization ut" to "[Intel XPU] Enable test/quantization UTs on XPU" Nov 6, 2025
@liangan1 requested a review from jerryzh168 November 6, 2025 02:55
@liangan1 self-requested a review November 6, 2025 02:57
@liangan1 added the ciflow/xpu label Nov 6, 2025
  m2.load_state_dict(state_dict)
  m2 = m2.to(device="cuda")
- example_inputs = map(lambda x: x.cuda(), example_inputs)
+ example_inputs = map(lambda x: x.to(_DEVICE), example_inputs)
Contributor commented:

So when on CPU, _DEVICE will be None now; what happens when we do x.to(None)? I think we don't want auto-detect here, since in L267 it is converting model m2 to "cuda"?

Collaborator replied:

Changed to m2.to(_DEVICE). As the test is skipped if not torch.accelerator.is_available(), this case will be skipped on CPU.
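For reference, Tensor.to(None) is a no-op with respect to device (a None device means "keep the current device"), so the call itself would not fail on a CPU-only machine; a quick check:

```python
import torch

t = torch.randn(4)   # CPU tensor
u = t.to(None)       # device=None: no-op, the tensor stays on CPU
assert u.device.type == "cpu"
assert torch.equal(t, u)
```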

def test_get_group_qparams_symmetric_memory(self):
"""Check the memory usage of the op"""
- weight = torch.randn(1024, 1024).to(device="cuda")
+ weight = torch.randn(1024, 1024).to(device=_DEVICE)
Contributor commented:

This test also has a skip-if-no-CUDA guard, so it is still only going to run on CUDA, right? The change doesn't seem to have any effect right now.

Collaborator replied:

Changed skip-if-no-CUDA to skip if not torch.accelerator.is_available().

Contributor commented:

I think it might make sense to change all the skip-if-no-CUDA guards in this file to skip if not torch.accelerator.is_available().

Collaborator replied:

test_module_fqn_to_config_regex_basic, test_module_fqn_to_config_regex_fullmatch, test_module_fqn_to_config_regex_precedence, and test_module_fqn_to_config_regex_precedence2 are not ready for XPU; the others are changed to skip if not torch.accelerator.is_available().
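One way to apply this file-wide while keeping the not-yet-ready regex tests CUDA-only is a pair of module-level decorators (the decorator names and the hasattr guard are hypothetical, not from the PR):

```python
import unittest
import torch

_HAS_ACCELERATOR = hasattr(torch, "accelerator") and torch.accelerator.is_available()

# Applied to tests that pass on both CUDA and XPU.
skip_if_no_accelerator = unittest.skipIf(not _HAS_ACCELERATOR, "Need CUDA or XPU")

# Kept for the regex-config tests that are not ready on XPU yet.
skip_if_no_cuda = unittest.skipIf(not torch.cuda.is_available(), "Need CUDA")
```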

@pytorch-bot bot removed the ciflow/xpu label Nov 7, 2025
@liangan1 requested a review from jerryzh168 November 7, 2025 06:57
@liangan1 added the ciflow/xpu label Nov 7, 2025

Labels

ciflow/xpu, CLA Signed, topic: improvement, xpu


5 participants