Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make CheckUnused not slow. #20321

Merged
merged 10 commits into from
May 7, 2024
Merged

Conversation

sjrd
Copy link
Member

@sjrd sjrd commented May 2, 2024

It doesn't mean that it's fast yet, but it is already a significant step in that direction. In particular, this goes in the direction of addressing #19671.

The most important commit is "Simplify the logic for checking unused imports.", whose commit message follows:

Instead of dealing with entire tpd.Imports at the end of the scope, we eagerly flatten them into individual ImportSelectors. We store them along with some data, including a mutable flag for whether a selector has been used.

This allows to dramatically simplify isInImport, as well as more aggressively cache the resolution of selectors. We also get rid of the IdentityHashMap.

The algorithm is still O(n*m) where n is the number of imports in a scope, and m the number of references found in that scope. It is not entirely clear to me whether the previous logic was already O(n*m) or worse (it may have included an additional p factor for the number of possible selections from a given qualifier).

Regardless, it is already quite a bit faster than before, thanks to smaller constant factors.

@sjrd sjrd force-pushed the make-check-unused-not-slow branch 5 times, most recently from 933d8fa to 9ad7463 Compare May 3, 2024 15:43
@sjrd sjrd changed the title WiP Make CheckUnused not slow. Make CheckUnused not slow. May 3, 2024
@sjrd sjrd marked this pull request as ready for review May 3, 2024 15:51
@sjrd sjrd requested a review from KacperFKorban May 3, 2024 15:53
sjrd added 8 commits May 6, 2024 13:51
It is used for every single tree in `CheckUnused`, so this is worth
it.
It is not efficient when the results are always used exactly once.
`Tree`s have structural equality. Even if `==` should be able to
exit quickly either because of `eq` or an early difference, sets
systematically call `hashCode`, which is going to recurse into the
entire structure.
It is pointless to sort a list before converting it into a Set.
@sjrd sjrd force-pushed the make-check-unused-not-slow branch from 9ad7463 to 6587ab4 Compare May 6, 2024 12:24
@sjrd sjrd requested a review from noti0na1 May 6, 2024 12:25
@sjrd
Copy link
Member Author

sjrd commented May 6, 2024

Rebased on top of #20163. @noti0na1 perhaps you could review this, since you changed isInImport last before me?

@noti0na1
Copy link
Member

noti0na1 commented May 6, 2024

Sure, I will review this today.

Copy link
Member

@KacperFKorban KacperFKorban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactor looks good to me. I added some minor suggestions that might make the code easier to read.

Sorry for taking so long with the review, the end of last week was a public holiday in Poland.

val selector = selData.selector

if !selector.isWildcard then
if !altName.forall(explicitName => selector.rename == explicitName.toTermName) then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be changed to altName.exists(explicitName => selector.rename != explicitName.toTermName)? (It would read like the comment then)

// If the symbol is accessible in this scope without an import, do not register it for unused import analysis
val notForImport1 =
notForImport
|| (!name.exists(_.toTermName != sym.name.toTermName) && sym.isAccessibleAsIdent)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to isInImport: can this first part be name.forall(_.toTermName == sym.name.toTermName)? (Less negation should make it simpler)

if !isConstructorOfSynth(sym) && !doNotRegister(sym) then
if sym.isConstructor && sym.exists then
registerUsed(sym.owner, None) // constructor are "implicitly" imported with the class
def registerUsed(sym: Symbol, name: Option[Name], notForImport: Boolean = false, isDerived: Boolean = false)(using Context): Unit =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the logic harder to follow because of the not in notForImport, can we flip the value of this flag to make it mean "possibly form an import"?

sjrd added 2 commits May 7, 2024 09:52
Instead of dealing with entire `tpd.Import`s at the end of the
scope, we eagerly flatten them into individual `ImportSelector`s.
We store them along with some data, including a mutable flag for
whether a selector has been used.

This allows to dramatically simplify `isInImport`, as well as more
aggressively cache the resolution of selectors. We also get rid of
the `IdentityHashMap`.

The algorithm is still O(n*m) where n is the number of imports in
a scope, and m the number of references found in that scope. It is
not entirely clear to me whether the previous logic was already
O(n*m) or worse (it may have included an additional p factor for
the number of possible selections from a given qualifier).

Regardless, it is already quite a bit faster than before, thanks
to smaller constant factors.
That test does not rely on any information dependent on the import
selectors. It only relies on information at the use site. Eagerly
checking it means we do not put as many symbols into the
`usedInScope` set, which is good because it is one of the
complexity factors of the unused-imports analysis.
@sjrd sjrd force-pushed the make-check-unused-not-slow branch from 6587ab4 to 8553bfc Compare May 7, 2024 07:56
@sjrd sjrd enabled auto-merge May 7, 2024 07:56
Copy link
Member

@noti0na1 noti0na1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, glad to see the logic can be simplified significantly.

@sjrd sjrd merged commit 360d473 into scala:main May 7, 2024
19 checks passed
@sjrd sjrd deleted the make-check-unused-not-slow branch May 7, 2024 11:11
@kyri-petrou
Copy link

@sjrd thank you very much for your work on this. You're going to make a lot of people happy with this!

Out of curiosity, do you happen to know what release will these changes be included in?

@sjrd
Copy link
Member Author

sjrd commented May 9, 2024

3.5.0-RC1

@Kordyjan Kordyjan added this to the 3.5.0 milestone May 10, 2024
WojciechMazur added a commit that referenced this pull request Jul 6, 2024
Backports #20321 to the LTS branch.

PR submitted by the release tooling.
[skip ci]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants