[PERF] Don't spawn so many compilers (3/2) (19m -> 250k) #15030

Open
wants to merge 2 commits into
base: master
from

blyxyas
Member

@blyxyas blyxyas commented Jun 10, 2025

Optimize needless_doctest_main, make it short-circuit, make sure that we don't spin up a new compiler on EVERY code block.


The old implementation created a new compiler, new parser, new thread, new SessionGlobals, new everything for each code block, regardless of whether the block contained fn main() or anything else relevant.

On callgrind, this appears to reduce the cycle count by about 6.7242% (which turns out to be a 38 million instruction difference, great!). Benchmarked on bumpalo-3.16.0. Also on bumpalo, we spawn 78 fewer threads. In some benchmarks, this moves SessionGlobals::new from being the single most time-consuming function to one that is not even in the top 500.

Also, populate the test files.

changelog:[needless_doctest_main]: Avoid spawning so many threads in unnecessary circumstances

@blyxyas blyxyas added the performance-project For issues and PRs related to the Clippy Performance Project label Jun 10, 2025
@rustbot
Collaborator

rustbot commented Jun 10, 2025

r? @y21

rustbot has assigned @y21.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Jun 10, 2025
Avoid creating so many SessionGlobals
@blyxyas blyxyas force-pushed the optimize-session-globals branch from c26ab0d to af51bb8 Compare June 13, 2025 19:03
//
// Also, as we only check for attribute names and don't do macro expansion,
// we can check only for #[test]
if !(text.contains("fn main") || text.contains("#[test]")) {
Member

hm, if someone writes fn  main() (with extra whitespace between fn and main) instead of fn main(), will this cause false negatives?

Member Author

Fixed (and improved the filtering!). Now we check for instances of fn (two times) and for main in general.

Member

@y21 y21 left a comment

That's a nice optimisation, but I have one question

// Also, as we only check for attribute names and don't do macro expansion,
// we can check only for #[test]

if !((text.contains(" main") && text.splitn(2, "fn ").nth(2).is_none()) || text.contains("#[test]")) {
Member

The splitn condition is a bit confusing, I can't tell what it's checking for. splitn(2) limits the iterator to 2 items, so isn't nth(2) always None?

Is this meant to be splitn(3, ..).nth(2).is_none() to check that there are at most two occurrences of fn? But even then, I don't see why this is needed

Member Author

Oops, what an oversight: that splitn(2) should actually be splitn(3). I'll add tests with 3 and 4 functions.

This is necessary because in the actual check_code_sample we only care whether the code block has exactly one function and that function is main(). (This was the behaviour even before this PR.)

If there's more than one function, fn main would be relevant as a separate entity so we cannot report it as useless.
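
The splitn behaviour discussed in this thread can be checked with a small sketch. The helper names `splitn2_check` and `splitn3_check` are hypothetical and exist only for illustration. `str::splitn(n, pat)` yields at most `n` pieces, so with `n = 2` the index `2` can never exist. With `n = 3`, a third piece exists only when "fn " occurs at least twice, so in this sketch `is_none()` holds when "fn " occurs at most once, which lines up with the "one function" requirement described above.

```rust
/// Mirrors the condition as first written in the diff. `splitn(2, ..)` yields
/// at most two pieces (indices 0 and 1), so `nth(2)` is always `None` and
/// this returns true for every input.
fn splitn2_check(text: &str) -> bool {
    text.splitn(2, "fn ").nth(2).is_none()
}

/// The corrected form. A third piece (index 2) exists only when "fn " occurs
/// at least twice, so this returns true when "fn " occurs at most once.
fn splitn3_check(text: &str) -> bool {
    text.splitn(3, "fn ").nth(2).is_none()
}
```

Note that this only counts textual occurrences of "fn " and knows nothing about comments or string literals, so it is a coarse filter rather than a parse.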

Member Author

I've simplified the filtering again, at the risk of being over-scoped. This is still a pretty heavy optimization, just with some more flexibility. It also accounts for all the possible whitespace between fn and main.
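
One way such whitespace-tolerant filtering could look is sketched below. This is an assumption for illustration only: the function name `contains_fn_main` is hypothetical, and the filter actually used in the PR is not shown in this conversation. The sketch accepts any run of whitespace between `fn` and `main` and is deliberately over-inclusive, since a false positive here only costs one extra full check.

```rust
/// Hypothetical whitespace-tolerant filter (not the PR's actual code):
/// returns true if `fn`, followed by at least one whitespace character,
/// followed by text starting with `main`, appears anywhere in `text`.
fn contains_fn_main(text: &str) -> bool {
    text.match_indices("fn").any(|(idx, _)| {
        // "fn" is two ASCII bytes, so `idx + 2` is always a char boundary.
        let rest = &text[idx + 2..];
        let trimmed = rest.trim_start();
        // Require at least one whitespace character between `fn` and `main`.
        // Over-scoped on purpose: `fn mainly()` also matches.
        trimmed.len() < rest.len() && trimmed.starts_with("main")
    })
}
```

Blocks that pass this filter would still go through the full check, so over-matching affects only speed, not which code blocks end up being reported.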

@blyxyas blyxyas force-pushed the optimize-session-globals branch from 75e694d to 3c60c42 Compare June 19, 2025 15:15
Labels
performance-project For issues and PRs related to the Clippy Performance Project S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
4 participants