
SemanticChecks: Avoid repeated type resolution of [ordered] #17328

Merged

Conversation

@IISResetMe (Collaborator) commented May 12, 2022

PR Summary

Avoid unnecessary attempts to resolve [ordered]'s underlying type (which isn't there) in post-parse checks

PR Context

As highlighted in #17308, parsing the cast expression [ordered]@{} is exactly as slow as parsing [NonExistingType]@{}!

Most of this overhead is incurred during post-parse analysis conducted by the SemanticChecks visitor, when repeated attempts are made to resolve the underlying runtime type represented by [ordered]. Since [ordered] is a special attribute and not really a type literal, the type resolution fails, which in itself is a relatively costly (and non-cacheable) operation.

This PR makes no functional changes to the flow of analysis, but reduces calls to TypeName.GetReflectionType() in situations where we've already assessed that the target type name is [ordered].

This alone cuts the wall-clock time for parsing [ordered]@{} by 45-50% compared to [nonexisting]@{}:
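The early-exit idea can be sketched in a short, language-agnostic way. The Python below is purely illustrative; the real change lives in the C# SemanticChecks visitor, and every name in this sketch (resolve_type, check_cast_target, the fake assembly list) is invented for illustration:

```python
# Illustrative sketch (not the actual PowerShell C# code): a resolver whose
# failures are expensive because every "assembly" must be scanned, plus an
# early exit for a known special name like [ordered].

KNOWN_TYPES = {"int": int, "str": str}          # stand-in for resolvable types
ASSEMBLIES = [dict() for _ in range(500)]       # stand-in for loaded assemblies

lookups = {"scans": 0}

def resolve_type(name):
    """Failure is costly: every assembly is enumerated before giving up."""
    if name in KNOWN_TYPES:
        return KNOWN_TYPES[name]
    for assembly in ASSEMBLIES:                 # cascading, non-cacheable scan
        lookups["scans"] += 1
        if name in assembly:
            return assembly[name]
    return None

def check_cast_target(name):
    # Early exit: 'ordered' is a special attribute, not a real type literal,
    # so there is nothing to resolve -- skip the expensive lookup entirely.
    if name.lower() == "ordered":
        return None
    return resolve_type(name)

check_cast_target("ordered")
print(lookups["scans"])      # 0 -- no assemblies scanned for [ordered]
check_cast_target("nonexisting")
print(lookups["scans"])      # 500 -- full scan for a truly unknown name
```

The sketch shows why the fix pays off: the cost of a failed lookup scales with the number of loaded assemblies, so skipping the lookup for a name that is known never to resolve removes the whole scan.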


PR Checklist

Type resolution failure is costly, since it triggers repeated attempts
"upwards" until the type resolver has enumerated every single assembly
in the hosting app domain, and failures are non-cacheable

In the special case of `[ordered]` as the LHS of a cast expression, we
currently trigger such a cascading failure in multiple places where
it's entirely unnecessary.

This commit seeks to reduce the overhead incurred by the parser in
these cases.

Initial testing shows a ~45% reduction in wall-clock time for parsing
12,000 `[ordered]@{}` expressions in a single script file.
@iSazonov (Collaborator) commented

If I understand correctly, the intention was to optimize all non-resolvable type names:

To the 2nd question, the same performance degradation would happen to another non-resolvable type name in the script, so it would be nice to resolve this perf issue in the root -- not looking up a non-resolvable type name multiple times during parse would provide a significant boost to performance.

@IISResetMe (Collaborator, Author) commented

I must have misunderstood @daxian-dbw

@IISResetMe IISResetMe closed this May 13, 2022
@daxian-dbw daxian-dbw reopened this May 13, 2022
@daxian-dbw (Member) commented May 13, 2022

@IISResetMe No, you are not 😄 Resolving the root issue will of course be the ultimate goal, but that will require more time and may still result in behavior changes. Also, it's possible that no one will pick up that work (or even the work to make [ordered] a real type accelerator, because that also involves changing the parser and compiler) for a long time.

The short-term fix you discovered will largely mitigate the immediate problem right away with very minimal changes. I think it's reasonable to take it now.

@IISResetMe (Collaborator, Author) commented May 13, 2022

Gotcha, thanks @daxian-dbw!

There are a couple of other mitigation ideas that came up while discussing the issue earlier today:

  • Early exit from TypeName.GetReflectionType() with a special-case heuristic for [ordered], just like the ones we already have in the parser and compiler
    • Fixes the [ordered] bug
  • Refactoring SemanticChecks to aggressively cache type lookups (should be safe, since semantic analysis is a self-contained phase)
    • This will benefit all type-resolution failure cases, not just [ordered]

I think we should be able to implement both short-term with minimal behavior change (beyond improved performance, of course). I'll play around with it and see if it works :)

@daxian-dbw (Member) left a comment


Please add a few tests for [ordered]:

  1. Define an Ordered type via Add-Type.
  2. Check that [ordered]@{a = 1; b = 2} still returns OrderedDictionary.
  3. Check that [ordered]<other-expr> still fails with a parsing error.
  4. Check that <some-value> -is [ordered] and <some-value> -as [ordered] continue to treat [ordered] as a normal type name, which resolves to the type we defined.

@daxian-dbw (Member) commented May 13, 2022

There are a couple of other mitigation ideas that came up while discussing the issue earlier today:

Please use a separate PR for further optimization, especially if the target is to solve the perf issue regarding non-resolvable type in general.

If the ideas are to further optimize how [ordered] is handled in the parser with its current semantics, then I suggest you think about optimization for non-resolvable types in general in the parsing phase; also, think about making [ordered] a true type accelerator to OrderedDictionary (basically addressing all comments from the Engine WG 😄).

But, just want to call out: don't feel obligated to address the root issues, and it's not urgent at all!

@daxian-dbw (Member) left a comment


Looks good! Please add some tests based on the comment I left above. They are helpful to make sure this quick fix doesn't break the current behavior.

@pull-request-quantifier-deprecated

This PR has 21 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!


Quantification details

Label      : Extra Small
Size       : +21 -0
Percentile : 8.4%

Total files changed: 2

Change summary by file extension:
.cs : +1 -0
.ps1 : +20 -0

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.


@daxian-dbw daxian-dbw marked this pull request as ready for review May 17, 2022 16:51
@daxian-dbw daxian-dbw merged commit afe99fc into PowerShell:master May 17, 2022
@ghost commented May 23, 2022

🎉 v7.3.0-preview.4 has been released, which incorporates this pull request. 🎉
