New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix target register false dependency of lzcnt, tzcnt, and popcnt #19772

Merged
merged 1 commit into from Aug 31, 2018

Conversation

Projects
None yet
5 participants
@fiigii
Copy link
Collaborator

fiigii commented Aug 30, 2018

@CarolEidt
Copy link
Member

CarolEidt left a comment

I think that doing this unconditionally (rather than attempting to add support for identifying specific microarchitecture targets) is the right thing do do.

@CarolEidt

This comment has been minimized.

Copy link
Member

CarolEidt commented Aug 30, 2018

This raises an important question about optimizing for one class of targets while potentially penalizing another class.

I think it's important to minimize, to the extent possible, the matrix of supported targets. We are already down a relatively costly path of supporting a large number of target ISA's, and I think we must be extremely careful before broadening it further.

In a case like this, where we can avoid a significant penalty on one class of targets by adding a very small penalty on another class, I think that's the right tradeoff. There may be cases where the cost of the mitigation is higher, and at that point we may want to consider whether it's worth broadening the target matrix to address it, but for this case it seems pretty low cost.

@dotnet/jit-contrib - Any other opinions?

@4creators

This comment has been minimized.

Copy link
Collaborator

4creators commented Aug 31, 2018

IMHO @CarolEidt is right that this particular issue is most probably not worth expanding codegen target matrix, however, it may be inevitable in future once we expand optimizations in tier1 or perhaps add even higher optimization tiers. The areas where this may have large performance impact which could depend on targeted processor type are optimizations exploiting instruction level parallelism which heavily relies on given processor type i.e. Arm53 and Arm72 will have different instruction latencies and throughputs despite implementing the same ISA.

@fiigii

This comment has been minimized.

Copy link
Collaborator Author

fiigii commented Aug 31, 2018

Can we merge this PR? The two arm tests get stuck but not related to this change.

@CarolEidt CarolEidt merged commit 4d031d6 into dotnet:master Aug 31, 2018

27 of 29 checks passed

Windows_NT arm Cross Debug Innerloop Build Triggered.
Details
Windows_NT arm64 Cross Debug Innerloop Build Triggered.
Details
CROSS Check Build finished.
Details
CentOS7.1 x64 Checked Innerloop Build and Test Build finished.
Details
CentOS7.1 x64 Debug Innerloop Build Build finished.
Details
Linux-musl x64 Debug Build Build finished.
Details
OSX10.12 x64 Checked Innerloop Build and Test Build finished.
Details
Tizen armel Cross Checked Innerloop Build and Test Build finished.
Details
Ubuntu arm Cross Checked Innerloop Build and Test Build finished.
Details
Ubuntu arm Cross Checked no_tiered_compilation_innerloop Build and Test Build finished.
Details
Ubuntu arm64 Cross Debug Innerloop Build Build finished.
Details
Ubuntu x64 Checked CoreFX Tests Build finished.
Details
Ubuntu x64 Checked Innerloop Build and Test Build finished.
Details
Ubuntu x64 Checked Innerloop Build and Test (Jit - TieredCompilation=0) Build finished.
Details
Ubuntu x64 Formatting Build finished.
Details
WIP ready for review
Details
Windows_NT x64 Checked CoreFX Tests Build finished.
Details
Windows_NT x64 Checked Innerloop Build and Test Build finished.
Details
Windows_NT x64 Checked Innerloop Build and Test (Jit - TieredCompilation=0) Build finished.
Details
Windows_NT x64 Formatting Build finished.
Details
Windows_NT x64 Release CoreFX Tests Build finished.
Details
Windows_NT x64 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
Windows_NT x64 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
Windows_NT x86 Checked Innerloop Build and Test Build finished.
Details
Windows_NT x86 Checked Innerloop Build and Test (Jit - TieredCompilation=0) Build finished.
Details
Windows_NT x86 Release Innerloop Build and Test Build finished.
Details
Windows_NT x86 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
Windows_NT x86 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Details
license/cla All CLA requirements met.
Details

@fiigii fiigii deleted the fiigii:insxor branch Aug 31, 2018

@pentp

This comment has been minimized.

Copy link
Collaborator

pentp commented Sep 1, 2018

Does this solution unconditionally preclude lzcnt/tzcnt use with targetReg == sourceReg?
Wouldn't it be better to conditionally insert the xor at codegen time if targetReg isn't any of the sourceRegs and not change anything in LSRA?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment