@jiqing-feng jiqing-feng commented Nov 6, 2025

After this change, the torch fused op can be combined with whole-model torch.compile. It can bring a 4x speed-up when input_size > 128 on Intel Xeon CPUs.

@jiqing-feng jiqing-feng requested a review from Qubitium November 6, 2025 05:34
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Qubitium commented Nov 6, 2025

@jiqing-feng Wow! Awesome. Thank you!

@Qubitium Qubitium merged commit 9d524ef into ModelCloud:main Nov 6, 2025
1 check passed