Speed create_import_structure up with os.scandir() #44253

Merged
Rocketknight1 merged 6 commits into main from faster_create_import_structure_from_path
Mar 10, 2026

@Rocketknight1
Member

@Rocketknight1 Rocketknight1 commented Feb 24, 2026

create_import_structure_from_path does some redundant os calls, so I'm experimenting with changes to see if we can speed up loading a lot.

Related to #44246

@Rocketknight1 Rocketknight1 marked this pull request as ready for review February 24, 2026 13:04
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1
Member Author

cc @LysandreJik while you're working on this! I don't think it addresses the contributor's issue but it might be some free speedup regardless.

The tl;dr is that the existing code does two wasteful things. Firstly, it doesn't skip convert_*.py files in the module dir, which are never relevant. Those files are excluded in release versions but are still present if people install from main.
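
A minimal sketch of the first point, not the actual diff from this PR. The helper names `is_conversion_script` and `relevant_files` and the sample filenames are hypothetical. The only assumption taken from the comment above is that conversion scripts follow the convert_*.py naming pattern.

```python
from fnmatch import fnmatchcase


def is_conversion_script(filename: str) -> bool:
    """Return True for filenames matching the convert_*.py pattern."""
    # fnmatchcase is used so matching is case-sensitive on every platform.
    return fnmatchcase(filename, "convert_*.py")


def relevant_files(filenames):
    """Drop conversion scripts, keeping the original order of the rest."""
    return [name for name in filenames if not is_conversion_script(name)]
```

Filtering on the filename alone means a skipped file is never opened or parsed.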

Secondly, the old approach of os.listdir(module_path) followed by os.path.isdir() requires multiple system calls per model. os.scandir is much faster because the OS typically returns the file-type information we need in the directory listing call itself, so we avoid thousands of stat system calls. The speedup may not be huge, but there's no downside, so it's probably worth merging!
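
To illustrate the second point, here is a rough side-by-side of the two patterns. It is a sketch, not the code changed in this PR: the function names `list_subdirs_listdir` and `list_subdirs_scandir` are made up for the example, and the real function does more than list subdirectories.

```python
import os


def list_subdirs_listdir(path):
    """Old pattern: one listing call, then a separate stat-style check per entry."""
    return sorted(
        name
        for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


def list_subdirs_scandir(path):
    """New pattern: DirEntry.is_dir() can usually answer from the listing itself."""
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())
```

Both functions return the same names. The difference is that `os.scandir` yields `os.DirEntry` objects that cache the file-type information when the operating system provides it during the listing, so `is_dir()` often needs no additional system call. Symbolic links and filesystems that do not report the entry type can still trigger a stat call.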

@Rocketknight1 Rocketknight1 force-pushed the faster_create_import_structure_from_path branch from e1a72ac to e966399 Compare March 2, 2026 18:01
@Rocketknight1
Member Author

(Failing test is unrelated and goes on my list of flaky tests to fix)

Member

@LysandreJik LysandreJik left a comment

Thank you @Rocketknight1 !

@Rocketknight1 Rocketknight1 added this pull request to the merge queue Mar 10, 2026
Merged via the queue into main with commit 652f2f7 Mar 10, 2026
29 checks passed
@Rocketknight1 Rocketknight1 deleted the faster_create_import_structure_from_path branch March 10, 2026 12:49