-
-
Notifications
You must be signed in to change notification settings - Fork 923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple assignment has too much overhead #2916
Comments
This would explain many of the regressions I'm seeing on my benchmark suite - a lot of them use masgn. |
1.7 could detect the first form but not the second form, but they're both pretty common. |
I made a patch for this simple form a week or two ago and it did optimize well but at that point we(subbu) noticed LocalOptimizationPass was no longer executing at all. With LOP re-enabled it was able to eliminate the array. My optimization was very specific to Masgn and LOP was optimizing this generically. The neat thing about the patch is it reduced specialized instrs to a series of Copy(). The unneat thing is that patch + LOP decreased perf by 10x (don't know why). I will try and get a clean patch for my change. I recently squashed masgn and masgn19 to one class so this is conflicting atm. |
* Speeds up fannkuch-redux by 10% or so.
* Thus far, IR builder was retrieving values from the masgn array one at a time and assigning them right away. This interleaving got in the way of local optimizations whenever the assignment involved a dataflow barrier (attr assigns in a[1],a[2] = a[2],a[1]) * By retrieving all values from the masgn array, the individual retrievals can be short-circuited in lot more scenarios. * The next barrier to fully optimizing this seems to be RAW hazards introduced by temp var optimization that aggressively reuses temps. In the next patch, I'll investigate running local opts before running temp var opt.
Multiple assignment has taken a perf hit in 9k compared to 1.7, and there's other opportunities to go beyond 1.7.
The main issue is that it's always producing the RHS array for some forms, and simply not eliminating it for other forms.
This case should optimize to simple moves. I believe it doesn't because DCE is not running:
This case, where ary is already an array, should not have to dup the array. I believe we dup it now preemptively because it may be used as the return value of the expression. However this won't go away with DCE because ary is opaque (assume it's opaque below).
These cases affect performance on e.g. https://github.com/YorickPeterse/oga because it uses masgn in the second way above. Many other libraries use these forms too.
The text was updated successfully, but these errors were encountered: