@vkuzo vkuzo commented Dec 4, 2025

Stack from ghstack (oldest at bottom):

Summary:

Enables `copy_` support for the `torch.float4_e2m1fn_x2` dtype. This is
useful when slicing a tensor along dim 1 and then calling `.contiguous()`,
which can happen in vLLM and therefore should be supported.
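
To illustrate why this case needs a copy, here is a minimal pure-Python sketch of a strided 2-D view. It is a model only, not PyTorch's implementation: the helper names (`is_contiguous`, `slice_dim1`, `contiguous`) are hypothetical, and it assumes each `torch.float4_e2m1fn_x2` element is a single byte holding two 4-bit values, so copying an element amounts to copying one byte. Slicing along dim 1 keeps the parent's row stride, so the view is no longer row-major dense and making it contiguous has to move data.

```python
# Pure-Python model of a packed FP4 tensor view; NOT PyTorch's implementation.
# Assumption: one element == one byte (two packed 4-bit values).

def is_contiguous(shape, strides):
    """Row-major contiguity check for a 2-D view."""
    _, cols = shape
    return strides == (cols, 1)

def slice_dim1(shape, strides, offset, start, stop):
    """Describe view[:, start:stop]; only metadata changes, no data moves."""
    rows, _ = shape
    return (rows, stop - start), strides, offset + start * strides[1]

def contiguous(storage, shape, strides, offset):
    """Materialise a view into fresh row-major storage, one byte per element."""
    rows, cols = shape
    out = bytearray(rows * cols)
    for r in range(rows):
        for c in range(cols):
            out[r * cols + c] = storage[offset + r * strides[0] + c * strides[1]]
    return bytes(out)

# A 4x8 "tensor" of packed bytes, laid out row-major.
storage = bytes(range(4 * 8))
shape, strides, offset = (4, 8), (8, 1), 0

# Keep the first 4 columns: shape shrinks, but the row stride stays 8.
v_shape, v_strides, v_offset = slice_dim1(shape, strides, offset, 0, 4)
view_is_contig = is_contiguous(v_shape, v_strides)

# Making the view dense requires an element-wise (here byte-wise) copy.
dense = contiguous(storage, v_shape, v_strides, v_offset)
```

In PyTorch terms this corresponds to `x[:, :k].contiguous()`: when the sliced view is not already dense, producing the contiguous result requires copying elements of that dtype.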

Test Plan:

pytest test/quantization/core/experimental/test_floatx.py -s -k test_float4_e2m1fn_x2

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01

pytorch-bot bot commented Dec 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/169595

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (7 Unrelated Failures)

As of commit 257cef6 with merge base 685ba6b:

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the `module: cpu` (CPU specific problem, e.g., perf, algorithm) and `release notes: quantization` (release notes category) labels Dec 4, 2025
vkuzo added a commit that referenced this pull request Dec 4, 2025
ghstack-source-id: 4db38fd
Pull Request resolved: #169595
vkuzo added a commit that referenced this pull request Dec 4, 2025
ghstack-source-id: d1315b2
Pull Request resolved: #169595
vkuzo added a commit to pytorch/ao that referenced this pull request Dec 4, 2025
Summary:

Uses the `torch.float4_e2m1fn_x2` dtype in mxfp4 and nvfp4 torchao
tensors.

Requires pytorch/pytorch#169595, so we need
to wait for the next PyTorch branch cut.

Note: nvfp4 models in vLLM currently hit an error
https://gist.github.com/vkuzo/e1407ee68c9ebb8d0f67478aedd81b96
both before and after this PR, so more debugging is needed before
landing this.

Test Plan:

```
CUDA_VISIBLE_DEVICES=6 time pytest test/prototype/mx_formats/ -s
```

ghstack-source-id: ee5d3a0
ghstack-comment-id: 3614417747
Pull-Request: #3440
vkuzo commented Dec 5, 2025

@pytorchbot merge -f

pytorch-bot bot commented Dec 5, 2025

❌ 🤖 pytorchbot command failed:

@pytorchbot merge: error: argument -f/--force: expected one argument

usage: @pytorchbot merge [-f MESSAGE | -i] [-ic] [-r [{viable/strict,main}]]

Try @pytorchbot --help for more info.

vkuzo commented Dec 5, 2025

@pytorchbot merge -f "failures unrelated"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here
