
SemanticChecks: Avoid repeated type resolution of [ordered] #17328

Merged

Conversation

@IISResetMe (Collaborator) commented May 12, 2022

PR Summary

Avoid unnecessary attempts to resolve [ordered]'s underlying type (which isn't there) in post-parse checks

PR Context

As highlighted in #17308, parsing the cast expression [ordered]@{} is exactly as slow as parsing [NonExistingType]@{}!

Most of this overhead is incurred during post-parse analysis conducted by the SemanticChecks visitor, when repeated attempts are made to resolve the underlying runtime type represented by [ordered]. Since [ordered] is a special attribute and not really a type literal, the type resolution fails, which in itself is a relatively costly (and non-cacheable) operation.

This PR makes no functional changes to the flow of analysis, but reduces calls to TypeName.GetReflectionType() in situations where we've already assessed that the target type name is [ordered].

This alone cuts the wall-clock time for parsing [ordered]@{} by 45-50% compared to [nonexisting]@{}:
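The early-exit idea can be sketched in a short, language-agnostic way. The Python below is purely illustrative; the real change lives in the C# SemanticChecks visitor, and every name in this sketch (resolve_type, check_cast_target, the fake assembly list) is invented for illustration:

```python
# Illustrative sketch (not the actual PowerShell C# code): a resolver whose
# failures are expensive because every "assembly" must be scanned, plus an
# early exit for a known special name like [ordered].

KNOWN_TYPES = {"int": int, "str": str}          # stand-in for resolvable types
ASSEMBLIES = [dict() for _ in range(500)]       # stand-in for loaded assemblies

lookups = {"scans": 0}

def resolve_type(name):
    """Failure is costly: every assembly is enumerated before giving up."""
    if name in KNOWN_TYPES:
        return KNOWN_TYPES[name]
    for assembly in ASSEMBLIES:                 # cascading, non-cacheable scan
        lookups["scans"] += 1
        if name in assembly:
            return assembly[name]
    return None

def check_cast_target(name):
    # Early exit: 'ordered' is a special attribute, not a real type literal,
    # so there is nothing to resolve -- skip the expensive lookup entirely.
    if name.lower() == "ordered":
        return None
    return resolve_type(name)

check_cast_target("ordered")
print(lookups["scans"])      # 0 -- no assemblies scanned for [ordered]
check_cast_target("nonexisting")
print(lookups["scans"])      # 500 -- full scan for a truly unknown name
```

The sketch shows why the fix pays off: the cost of a failed lookup scales with the number of loaded assemblies, so skipping the lookup for a name that is known never to resolve removes the whole scan.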


PR Checklist

Type resolution failure is costly, since it triggers repeated attempts
"upwards" until the type resolver has enumerated every single assembly
in the hosting app domain, and failures are non-cacheable

In the special case of `[ordered]` as the LHS of a cast expression, we
currently trigger such a cascading failure in multiple places where
it's entirely unnecessary.

This commit seeks to reduce the overhead incurred by the parser in
these cases.

Initial testing shows a ~45% reduction in wall-clock time for parsing
12,000 `[ordered]@{}` expressions in a single script file.
@iSazonov (Collaborator) commented

If I understand correctly, the intention was to optimize all non-resolvable type names:

To the 2nd question, the same performance degradation would happen to another non-resolvable type name in the script, so it would be nice to resolve this perf issue in the root -- not looking up a non-resolvable type name multiple times during parse would provide a significant boost to performance.

@IISResetMe (Collaborator, Author) commented

I must have misunderstood @daxian-dbw

@IISResetMe IISResetMe closed this May 13, 2022
@daxian-dbw daxian-dbw reopened this May 13, 2022
@daxian-dbw (Member) commented May 13, 2022

@IISResetMe No, you are not 😄 Resolving the root issue will of course be the ultimate goal, but that will require more time and may still result in behavior changes. Also, it's possible that no one will pick up that work (or even the work to make [ordered] a real type accelerator, because that also involves changing the parser and compiler) for a long time.

The short-term fix you discovered will largely mitigate the immediate problem right away with very minimal changes. I think it's reasonable to take it now.

@IISResetMe (Collaborator, Author) commented May 13, 2022

Gotcha, thanks @daxian-dbw!

There are a couple of other mitigation ideas that came up while discussing the issue earlier today:

  • Early exit from TypeName.GetReflectionType() with a special-case heuristic for [ordered], just like the ones we already have in the parser and compiler
    • Fixes the [ordered] bug
  • Refactoring SemanticChecks to aggressively cache type lookups (should be safe, since semantic analysis is a self-contained phase)
    • This will benefit all type-resolution failure cases, not just [ordered]

I think we should be able to implement both short-term with minimal behavior change (beyond improved performance, of course). I'll play around with it and see if it works :)

@daxian-dbw (Member) left a comment


Please add a few tests for [ordered]:

  1. Define an Ordered type via Add-Type.
  2. Check that [ordered]@{a = 1; b = 2} still returns OrderedDictionary.
  3. Check that [ordered]<other-expr> still fails with a parsing error.
  4. Check that <some-value> -is [ordered] and <some-value> -as [ordered] continue to treat [ordered] as a normal type name, which resolves to the type we defined.

@daxian-dbw (Member) commented May 13, 2022

There are a couple of other mitigation ideas that came up while discussing the issue earlier today:

Please use a separate PR for further optimization, especially if the target is to solve the perf issue regarding non-resolvable type in general.

If the ideas are to further optimize how [ordered] is handled in the parser with its current semantics, then I suggest you think about optimization for non-resolvable types in general in the parsing phase; also, think about making [ordered] a true type accelerator to OrderedDictionary (basically addressing all comments from the Engine WG 😄).

But, just want to call out: don't feel obligated to address the root issues, and it's not urgent at all!

@daxian-dbw (Member) left a comment


Looks good! Please add some tests based on the comment I left above. They are helpful to make sure this quick fix doesn't break the current behavior.

@pull-request-quantifier-deprecated

This PR has 21 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!


Quantification details

Label      : Extra Small
Size       : +21 -0
Percentile : 8.4%

Total files changed: 2

Change summary by file extension:
.cs : +1 -0
.ps1 : +20 -0

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.


@daxian-dbw daxian-dbw marked this pull request as ready for review May 17, 2022 16:51
@daxian-dbw daxian-dbw merged commit afe99fc into PowerShell:master May 17, 2022
@ghost commented May 23, 2022

🎉 v7.3.0-preview.4 has been released, which incorporates this pull request. 🎉
