Fix input processing for convolution #1554

Cecilwang · 2025-04-21T07:42:10Z

Thanks for the previous support for convolution layers. In this PR, I’ve added fixes for two specific cases:

1. Conv1D and unfold parameter requirement: The unfold operation expects 2D parameters, so for Conv1D layers, both the input tensor and the parameters need to be expanded by one dimension to match the expected shape.
2. Grouped convolutions: Fixed the input tensor shape for convolutions that use the groups parameter, so that the input is correctly formatted and grouped during computation.
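
For readers who want to see the two cases concretely, here is a minimal sketch of the idea. It is not the code from this PR. `conv1d_via_unfold` is a hypothetical helper written only for illustration, and it assumes integer padding, zero padding mode, and no bias. It adds a dummy height dimension so that `torch.nn.functional.unfold` (which accepts only 4-D input) can be applied to a Conv1d input, then splits the unfolded patches per group.

```python
import torch
import torch.nn.functional as F


def conv1d_via_unfold(x: torch.Tensor, conv: torch.nn.Conv1d) -> torch.Tensor:
    """Recompute conv(x) with F.unfold (hypothetical helper, bias ignored)."""
    n, g = x.shape[0], conv.groups
    # F.unfold accepts only 4-D input, so add a dummy height of 1: (N, C, 1, L).
    # The 1-D kernel, dilation, padding and stride are expanded the same way.
    patches = F.unfold(
        x.unsqueeze(2),
        kernel_size=(1, conv.kernel_size[0]),
        dilation=(1, conv.dilation[0]),
        padding=(0, conv.padding[0]),  # assumes integer padding, not "same"/"valid"
        stride=(1, conv.stride[0]),
    )  # shape (N, C * k, L_out), channel-major
    # Grouped case: split patch rows and weight rows into one block per group.
    patches = patches.reshape(n, g, -1, patches.shape[-1])       # (N, G, C/G * k, L_out)
    weight = conv.weight.reshape(g, conv.out_channels // g, -1)  # (G, O/G, C/G * k)
    out = torch.einsum("gok,ngkl->ngol", weight, patches)        # (N, G, O/G, L_out)
    return out.reshape(n, conv.out_channels, -1)


torch.manual_seed(0)
x = torch.randn(2, 4, 10)
plain = torch.nn.Conv1d(4, 6, kernel_size=3, padding=1, bias=False)
grouped = torch.nn.Conv1d(4, 6, kernel_size=3, padding=1, groups=2, bias=False)

with torch.no_grad():
    plain_ok = torch.allclose(conv1d_via_unfold(x, plain), plain(x), atol=1e-5)
    grouped_ok = torch.allclose(conv1d_via_unfold(x, grouped), grouped(x), atol=1e-5)
```

With `groups=1` the reshape produces a single block, so the same path covers ordinary and grouped layers. Comparing against the layer's own forward pass is a quick way to check that the unfolded input is laid out as intended.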

ref:
https://pytorch.org/docs/stable/generated/torch.nn.Unfold.html
https://discuss.pytorch.org/t/conv1d-implementation-using-torch-nn-functional-unfold/109643/3

Signed-off-by: Sixue(Cecil) Wang <cecilwang@preferred.jp>

Qubitium · 2025-04-21T08:16:20Z

@Cecilwang Do you know of an open-source model currently on HF that would trigger this code? I want to add test_conv1d.py tests to CI so we can check for regressions in the future. I know you are currently applying this to a private model, but I was wondering if you know of a non-private model that we can run a simple quant/inference on?

Otherwise, this PR is great and ready to go!

Cecilwang · 2025-04-22T00:32:50Z

state-spaces/mamba-130m-hf is a MambaForCausalLM whose slow_forward function might trigger this PR's logic. You can try it without installing causal_conv1d.

https://github.com/huggingface/transformers/blob/main/src/transformers/models/mamba/modeling_mamba.py#L54-L61
https://github.com/huggingface/transformers/blob/main/src/transformers/models/mamba/modeling_mamba.py#L326-L328
https://github.com/huggingface/transformers/blob/fee1190601b5d04ec6d3f7f58fd22788d7f3236d/src/transformers/models/mamba/modeling_mamba.py#L274

Fix input processing for convolution

183256a

Signed-off-by: Sixue(Cecil) Wang <cecilwang@preferred.jp>

Cecilwang mentioned this pull request Apr 21, 2025

Support torch.nn.Conv1D? #1479

Closed

Qubitium merged commit 31051a5 into ModelCloud:main Apr 22, 2025
1 check passed
