@vkuzo vkuzo commented Sep 26, 2025

Summary:

  1. Enroll MXTensor in the existing vLLM slice and copy test, and make it pass
    by moving to TorchAOBaseTensor's copy implementation.
  2. Add a test for vLLM narrow, and make it pass by fixing an incorrect slice
    implementation. The new test may be useful for other tensors; they can
    opt in via separate PRs.

Test Plan:

pytest test/prototype/mx_formats/ -s -x

This PR also enables running mxfp4 weight-only Qwen MoE models in vLLM.
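
For readers unfamiliar with why narrow is easy to get wrong on a block-scaled tensor, here is a rough sketch of the index arithmetic involved. It is not the code from this PR. It assumes a hypothetical layout in which fp4 values are packed two per byte along the last dimension and one scale is stored per block of 32 elements along that same dimension. The names `narrow_ranges` and `NarrowRanges` are made up for illustration.

```python
from typing import NamedTuple

BLOCK_SIZE = 32      # MX formats share one scale per block of 32 elements
ELEMS_PER_BYTE = 2   # assumption: two fp4 values packed into each byte


class NarrowRanges(NamedTuple):
    data_start: int    # offset into the packed storage, in bytes
    data_length: int   # length in the packed storage, in bytes
    scale_start: int   # offset into the scale storage, in blocks
    scale_length: int  # length in the scale storage, in blocks


def narrow_ranges(start: int, length: int) -> NarrowRanges:
    """Map narrow(dim=-1, start, length), given in logical elements,
    to the matching ranges of the packed data and of the scales."""
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    if start % BLOCK_SIZE or length % BLOCK_SIZE:
        # A range that cuts through a block would split one shared scale.
        raise ValueError("narrow must be aligned to the scale block size")
    return NarrowRanges(
        data_start=start // ELEMS_PER_BYTE,
        data_length=length // ELEMS_PER_BYTE,
        scale_start=start // BLOCK_SIZE,
        scale_length=length // BLOCK_SIZE,
    )


# 128 logical elements starting at element 64
print(narrow_ranges(64, 128))
# NarrowRanges(data_start=32, data_length=64, scale_start=2, scale_length=4)
```

The point of the sketch is that the data and the scales need different offsets for the same logical range. A slice implementation that applies the logical indices unchanged to either storage would select the wrong region.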

pytorch-bot bot commented Sep 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3081

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit f962851 with merge base a53a4db:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Sep 26, 2025
vkuzo added a commit that referenced this pull request Sep 26, 2025
ghstack-source-id: e79dfc3
ghstack-comment-id: 3339065416
Pull Request resolved: #3081
vkuzo added a commit that referenced this pull request Sep 26, 2025
ghstack-source-id: bd1c041
ghstack-comment-id: 3339065416
Pull Request resolved: #3081
vkuzo added a commit that referenced this pull request Sep 26, 2025
ghstack-source-id: 609a88e
ghstack-comment-id: 3339065416
Pull Request resolved: #3081
@vkuzo vkuzo changed the base branch from gh/vkuzo/127/head to main September 27, 2025 00:35
vkuzo added a commit that referenced this pull request Sep 27, 2025
ghstack-source-id: 609a88e
ghstack-comment-id: 3339065416
Pull Request resolved: #3081
@vkuzo vkuzo merged commit de92bdc into main Sep 29, 2025
50 checks passed
Labels

CLA Signed, topic: improvement
