Delete rowwise_scaled_linear_cutlass kernels and APIs#3723

Merged
jerryzh168 merged 1 commit into main from gh/jerryzh168/26/head
Jan 27, 2026

@jerryzh168 jerryzh168 commented Jan 26, 2026

Stack from ghstack (oldest at bottom):

Summary:
Deleted the Int4DynamicActivationInt4WeightConfig top-level API, the CutlassInt4PackedLayout option of Int8DynamicActivationInt4WeightConfig, and the related kernels, since these are not used.

BC breaking note:
We are removing Int4DynamicActivationInt4WeightConfig and the CutlassInt4PackedLayout option of Int8DynamicActivationInt4WeightConfig.

0.15.0

config = Int8DynamicActivationInt4WeightConfig(
  group_size=None,
  mapping_type=MappingType.SYMMETRIC,
  act_mapping_type=MappingType.SYMMETRIC,
  layout=CutlassInt4PackedLayout(),
)
quantize_(model, config)

config = Int4DynamicActivationInt4WeightConfig()
quantize_(model, config)

0.16.0
Both configs are dropped. Please use torchao <= 0.15.0 if you still need them.
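
For downstream code that has to run against both sides of this change, a version guard along the following lines may help. This is only a sketch: `release_tuple` and `has_removed_configs` are hypothetical helper names, not part of torchao, and it assumes the installed distribution is named `torchao` and that its version string starts with numeric `major.minor[.patch]` components.

```python
import re
from importlib import metadata

# Last release that still ships the removed configs, per the BC note above.
LAST_VERSION_WITH_CONFIGS = (0, 15, 0)


def release_tuple(version: str) -> tuple:
    """Extract leading release numbers, e.g. '0.15.0+cu126' -> (0, 15, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def has_removed_configs(version: str) -> bool:
    """True if this version is at or below the last one with the configs."""
    return release_tuple(version) <= LAST_VERSION_WITH_CONFIGS


def installed_torchao_has_removed_configs() -> bool:
    """Check the installed torchao, treating 'not installed' as False."""
    try:
        return has_removed_configs(metadata.version("torchao"))
    except metadata.PackageNotFoundError:
        return False
```

Callers can branch on `installed_torchao_has_removed_configs()` before building either config and fall back to a different quantization config otherwise. Local version suffixes and dev tags are ignored by the parser, so a pre-release build is compared by its release numbers only.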

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Summary:
TODO:

Files Deleted:

1. CUDA source files:
  - torchao/csrc/cuda/rowwise_scaled_linear_cutlass/rowwise_scaled_linear_cutlass.cuh
  - torchao/csrc/cuda/rowwise_scaled_linear_cutlass/rowwise_scaled_linear_cutlass_s4s4.cu
  - torchao/csrc/cuda/rowwise_scaled_linear_cutlass/rowwise_scaled_linear_cutlass_s8s4.cu
  - torchao/csrc/cuda/rowwise_scaled_linear_cutlass/README.md
2. Python layout files:
  - torchao/prototype/dtypes/uintx/cutlass_int4_packed_layout.py
  - torchao/dtypes/uintx/cutlass_int4_packed_layout.py
3. Test and benchmark files:
  - test/test_ops_rowwise_scaled_linear_cutlass.py
  - benchmarks/benchmark_rowwise_scaled_linear_cutlass.py

Files Modified:

1. torchao/ops.py - Removed lib.define for rowwise_scaled_linear_cutlass_s8s4 and rowwise_scaled_linear_cutlass_s4s4, and their function implementations
2. torchao/csrc/README.md - Removed reference to cutlass naming convention
3. torchao/dtypes/affine_quantized_tensor_ops.py - Removed imports and dispatch registrations for cutlass int4
4. torchao/prototype/dtypes/__init__.py and torchao/prototype/dtypes/uintx/__init__.py - Removed CutlassInt4PackedLayout exports
5. torchao/dtypes/__init__.py - Removed CutlassInt4PackedLayout import and export
6. torchao/quantization/__init__.py and torchao/quantization/quant_api.py - Removed CutlassInt4PackedLayout and Int4DynamicActivationInt4WeightConfig exports
7. torchao/prototype/quantization/quant_api.py - Removed Int4DynamicActivationInt4WeightConfig class and updated Int8DynamicActivationInt4WeightConfig to remove CutlassInt4PackedLayout handling
8. Test files (test/dtypes/test_affine_quantized.py, test/dtypes/test_uintx.py, test/quantization/test_quant_api.py, test/core/test_config.py, test/integration/test_vllm.py) - Removed all references to CutlassInt4PackedLayout and Int4DynamicActivationInt4WeightConfig
9. benchmarks/microbenchmarks/utils.py - Removed int8adq-int4w-symm quantization option

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]

pytorch-bot bot commented Jan 26, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3723

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 832e40e with merge base 79372e7:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 added a commit that referenced this pull request Jan 26, 2026
ghstack-source-id: 44b139f
Pull Request resolved: #3723
@meta-cla meta-cla bot added the CLA Signed label Jan 26, 2026
@jerryzh168 jerryzh168 added the module: deprecation label Jan 26, 2026
@andrewor14 andrewor14 added the module: bc-breaking label Jan 26, 2026

@andrewor14 andrewor14 left a comment

Looks great. Also cc @jcaip

@jerryzh168 jerryzh168 changed the base branch from gh/jerryzh168/26/base to main January 26, 2026 23:10
@jerryzh168 jerryzh168 merged commit 9b9d558 into main Jan 27, 2026
37 of 39 checks passed