qa: more lazy loading #45599

Draft
tarekziade wants to merge 5 commits into main from tarekziade-more-lazy-loading

Conversation

@tarekziade
Collaborator

What does this PR do?

Per #44273 there are a few spots where we can do more lazy loading to speed up import transformers without complexifying the code too much

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@tarekziade
Collaborator Author

The public API of transformers.utils forces me to be very aggressive in hub.py -- I am not sure it's worth the added complexity.

The current patch gets it down to 270 imports and 0.5 seconds on my M5.

@LysandreJik WDYT?

I feel like doing this can be quite fragile and painful to maintain.

@tarekziade tarekziade marked this pull request as draft April 23, 2026 09:17
@tarekziade tarekziade requested a review from LysandreJik April 23, 2026 09:17
@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45599&sha=3c10a2

@LysandreJik
Member

@albertvillanova can you confirm whether installing this branch results in dramatic speedups on your machine? Not sure we want to guard numpy imports like this either.
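For context, "guarding" an import usually means moving it from module level into the function that needs it. A generic sketch of the pattern (with the stdlib `statistics` module standing in for numpy, so the snippet has no third-party dependency):

```python
def column_mean(values):
    # Deferred ("guarded") import: the dependency is loaded on the
    # first call instead of at module import time, so callers that
    # never reach this function don't pay for it.
    import statistics
    return statistics.mean(values)
```

One downside of the pattern is that a missing dependency only surfaces at call time, and the guard has to be repeated in every function that uses the module.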

@albertvillanova
Member

Thanks for the ping, @LysandreJik. I'm checking it right now.

@albertvillanova
Member

albertvillanova commented Apr 28, 2026

Yes, I confirm I see a gain in speed.

After:

time python -c "import transformers"

real	0m2.325s
user	0m0.518s
sys	0m0.150s

Before:

time python -c "import transformers"

real	0m2.836s
user	0m1.775s
sys	0m0.138s

@albertvillanova
Member

On another machine:

Now:

time python -c "import transformers"

real	0m1.330s
user	0m1.230s
sys	0m0.100s

Before:

time python -c "import transformers"

real	0m1.819s
user	0m3.204s
sys	0m0.176s
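For anyone reproducing these numbers, `python -X importtime -c "import transformers"` prints a per-module breakdown of where the time goes. A rough, self-contained version of the wall-clock measurement itself (with `decimal` standing in for transformers so it runs without the library installed):

```python
import subprocess
import sys
import time

# Time a cold import in a fresh interpreter, roughly what
# `time python -c "import <module>"` measures (interpreter startup
# included). "decimal" stands in for transformers here.
start = time.perf_counter()
subprocess.run([sys.executable, "-c", "import decimal"], check=True)
elapsed = time.perf_counter() - start
```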

@tarekziade
Collaborator Author

@albertvillanova thanks for the tests!

Could you share a bit more about the use case where startup time is critical here?
In some CLI / serving setups, transformers can be lazy-loaded, so it’d be great to understand how this fits your scenario.
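One common shape for that kind of CLI-side lazy loading is to import the heavy module only after argument parsing has picked a subcommand. A hypothetical dispatcher sketch (stdlib modules stand in for heavy implementation modules):

```python
import argparse
import importlib

# Hypothetical mapping from subcommand to the module implementing it.
# The module is imported only after parsing, so e.g. `mycli --help`
# never pays the heavy import cost.
COMMANDS = {
    "train": "json",     # stand-in for a heavy implementation module
    "eval": "decimal",   # stand-in for a heavy implementation module
}

def main(argv):
    parser = argparse.ArgumentParser(prog="mycli")
    parser.add_argument("command", choices=sorted(COMMANDS))
    args = parser.parse_args(argv)
    impl = importlib.import_module(COMMANDS[args.command])
    return impl.__name__
```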

@albertvillanova
Member

Lazy loading in transformers would be an ideal solution, but I think it is currently broken in places, e.g. when importing anything from transformers.utils.

Indeed, my use case is improving the startup latency of the trl CLI. It imports transformers and this was a significant contributor to the overall startup time.

As a result, we had to introduce some workarounds in trl to avoid importing some transformers functions at CLI startup. See, e.g.:

If lazy loading in transformers is fixed and becomes more reliable, we could eventually revert these changes and simplify the current setup.

@tarekziade
Collaborator Author

tarekziade commented Apr 28, 2026

Transformers is a large codebase, so importing some modules will inevitably pull in dependencies. Lazy loading can help, but it also adds complexity because we want to keep APIs backward compatible (like what's exposed in transformers.utils), so there’s a tradeoff between startup speed and maintainability. And in practice, once you really use Transformers, you usually end up needing things like NumPy or Torch anyway.

For the TRL CLI, I totally understand the goal. That’s exactly why I worked on making startup faster and added a regression test to keep it that way. But I think we’re reaching a point where pushing this further makes some modules harder to maintain, so I’d rather defer more of that laziness to downstream projects, similar to what you did.

That said, I wouldn’t say Transformers lazy loading is “broken”; that feels a bit strong to me. It may be that some imports still have more overhead than expected.

What would be an acceptable import-time overhead for your use case?
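The kind of regression test mentioned above might look roughly like this; the budget and the module name are illustrative stand-ins, not the actual test in the repository:

```python
import subprocess
import sys

def count_new_modules(module_name):
    # Run the import in a fresh interpreter and count how many entries
    # it adds to sys.modules.
    code = (
        "import sys; before = set(sys.modules); "
        f"import {module_name}; "
        "print(len(set(sys.modules) - before))"
    )
    out = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout)

def test_import_budget():
    # Hypothetical budget: fail if the import count regresses past it.
    assert count_new_modules("decimal") <= 50
```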
