
[gguf] Refactor __torch_function__ to avoid unnecessary computation #11551


Merged
merged 4 commits into from
May 15, 2025


anijain2305
Contributor

This helps with torch.compile compilation latency. Avoiding unnecessary computation should also lead to a slightly improved eager latency.
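The diff itself is not shown in this conversation, so the following is only a rough, torch-free sketch of one way a dispatch hook can avoid unnecessary computation: deferring a metadata scan until the result actually needs to be re-wrapped. Every name here (`Tagged`, `find_quant_type`, `dispatch_eager`, `dispatch_lazy`, `SCAN_CALLS`) is a hypothetical stand-in, a plain `list` plays the role of a tensor result, and none of this is taken from the diffusers code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


@dataclass
class Tagged:
    """Hypothetical stand-in for a tensor subclass that carries metadata."""
    value: Any
    quant_type: Optional[str] = None


# Counts how often the argument scan runs, so the saving is observable.
SCAN_CALLS = {"count": 0}


def find_quant_type(args: Sequence[Any]) -> Optional[str]:
    """Return the tag of the first Tagged argument (or first element of a list)."""
    SCAN_CALLS["count"] += 1
    for arg in args:
        if isinstance(arg, (list, tuple)) and arg and isinstance(arg[0], Tagged):
            return arg[0].quant_type
        if isinstance(arg, Tagged):
            return arg.quant_type
    return None


def unwrap(x: Any) -> Any:
    """Strip the wrapper from a top-level Tagged argument."""
    return x.value if isinstance(x, Tagged) else x


def dispatch_eager(func: Callable, args: Sequence[Any]) -> Any:
    """Scans the arguments on every call, even if the tag is never used."""
    quant_type = find_quant_type(args)
    result = func(*[unwrap(a) for a in args])
    if isinstance(result, list):  # only list results are re-wrapped
        return Tagged(result, quant_type)
    return result


def dispatch_lazy(func: Callable, args: Sequence[Any]) -> Any:
    """Scans the arguments only on the branch that needs the tag."""
    result = func(*[unwrap(a) for a in args])
    if isinstance(result, list):
        return Tagged(result, find_quant_type(args))
    return result


x = Tagged([3, 1, 2], quant_type="Q4_0")
dispatch_lazy(len, (x,))     # plain int result, so the scan is skipped
dispatch_lazy(sorted, (x,))  # list result, so it is re-wrapped with the tag
```

Both variants return the same values; the lazy one simply skips the scan for calls whose results are not re-wrapped. Whether this matches the actual refactor in this PR is an assumption.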


@anijain2305
Contributor Author

cc @sayakpaul

Member

@sayakpaul sayakpaul left a comment


Nice! Thanks for this. Do you want to also include the speedups you obtained with this patch?

@sayakpaul
Member

Along with this, do we think using regional compilation (cc: huggingface/accelerate#3529) could also benefit the compilation latency?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@anijain2305
Contributor Author

I am working through a stack of PRs to reduce compilation time and will post an update once the stack lands. Overall, the compile time is roughly 280 seconds, and I have shaved off roughly 30 seconds so far.

Regional compilation will definitely benefit this model. @StrongerXi has the latest numbers.

It seems the workflow needs approval?

@sayakpaul
Member

@bot /style

Contributor

Style fixes have been applied. View the workflow run here.

@StrongerXi

Oh yeah, regional compilation would speed things up massively; when I tested a while back, compile time went from 300s to 30s. Might be worth offering a similar API in diffusers and transformers?

@anijain2305
Contributor Author

@DN6 a gentle ping in case this slipped through the cracks.

@sayakpaul
Member

> Oh yeah regional compilation would speed things up massively, when I tested a while back it went from 300s to 30s. Might be worth offering a similar api in diffusers and transformers?

@StrongerXi #11556

@DN6 DN6 merged commit 3a6caba into huggingface:main May 15, 2025
12 checks passed
@DN6 DN6 added the roadmap Add to current release roadmap label Jun 5, 2025
@DN6 DN6 moved this from In Progress to Done in Diffusers Roadmap 0.35 Jun 5, 2025
5 participants