-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Proposal] Abstract interpretation algorithm for optimization of tier 2 uops #615
Comments
I see that you are proposing a DSL change like this:
I'm not sure what syntax you are proposing for types that allows
(Cross-ref: #611) |
Continuing on that DSL proposal, why And where is |
It seem like a good idea to have syntax for hard-coding values of metadata flags. We will probably have more use cases for this, like irregular bytecodes where the derivation logic isn't working well. |
In your GDoc about reading the paper you write
I do not understand what is happening here. I saw this in the YouTube video and didn't get it there either. How to they get from
to recognizing the fixpoint? |
That's another argument for |
I cannot claim to understand this part fully as well (I did not pay too much attention to it as we won't need to do this for CPython unless we plan to do loop invariant code motion as well). However, here's what I understand, At location
Hmm this does look cleaner. I don't mind doing this first, then if we have too many flags, use the list method.
I plan to use the abstract interpreter to create the symbolic expression nodes. The symbolic expressions will use tagged unions (just a field set in a struct indicating its real type). So that can be read as "construct a tagged union of type |
This is for convenience during the analysis phase, where we want to check |
Oh, interesting. It seems we have somewhat conflicting goals for the |
Regarding my question about loops and your answer, I think I get it now. At the point ( The crux here is that we don't write out the Watching a bit more of the video, what strikes me is that the abstract interpretation (if there are loops) may actually iterate an indefinite number of times, until the fixpoints are all found. I think in our case things will be simpler, because (at least initially) we only attempt to optimize one superblock at a time. If this superblock contains It does concern me a bit that the talk has an Open Question on whether their approach can outperform traditional approaches. Although in our case we only need to be faster than PEP 659. :-) Trivia question: what does the "square U" symbol mean? It is used in some slide titles. |
@Fidget-Spinner Another question: How do your current plans compare to the plans that Mark wrote up back in March or so, gathered under the label epic-tier2-optimizer? |
From the paper: "the join operation ⊔ℓ, which merges abstract stores at control-flow join points such as ℓ3".
They are complementary. Our proposal concerns only the "superblock optimization:" partial evaluation and specialization passes. It will generate the same code, if not better code than just traditional partial evaluation. The other parts about superblock management, creation, etc. are unaffected. |
Hi, @Fidget-Spinner told me that you were discussing ideas from my paper here. I am very happy that these ideas could find a widespread use! I took the liberty of jumping in the conversation to introduce myself and add some comments; don't hesitate to ask me if you have further questions about the paper!
Your understanding is correct.
The main potential performance problem in my opinion is the worst-case complexity when you have to do a fixpoint iteration. The traditional approach is more "pessimistic" and thus converges faster (but would be less precise in that it would initially introduce more phi nodes). It seems to me that your loop-free "superblocks" should not suffer from this potential performance issue. |
Done. We have a abstract-interpretation based optimizer on stack bytecode now. |
Sub-issue of #559
Please see our proposal for optimizations to tier 2 uops https://docs.google.com/presentation/d/1jyqyO-OxF53NzPYZqkTzfCfCbzX2BC4Vo-BP5Da3zdk/
Proposed optimizations to be performed:
The text was updated successfully, but these errors were encountered: