[RyuJIT/ARM32] low performance compared to amd64 investigation, data memory barrier usage. #13482
Labels: arch-arm32, area-CodeGen-coreclr, JitUntriaged, tenet-performance
The initial thread was started by @alpencolt in #12361.
The initial performance test results were generated by @alpencolt:
https://gist.github.com/alpencolt/0580af0be86e49bb9d89508dabcd8615
During the arm32 performance investigation we found that one source of performance degradation is data memory barrier usage. Note that on arm32 we use a barrier for volatile variables, and it is also present in the atomic memory access functions.
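As a rough illustration of where such barriers come from at the C level, here is a minimal sketch using the GCC `__sync` builtins. The helper names `atomic_increment` and `publish` are hypothetical and are not taken from the runtime sources. GCC documents the `__sync` builtins as full barriers, and on ARMv7 a compiler typically lowers a full barrier to a `dmb` instruction, so every call below is expected to pay that cost. This sketch does not reproduce the disassembly discussed in this issue.

```c
#include <stdint.h>

/* Hypothetical helper: atomically add 1 to *counter and return the new value.
 * __sync_val_compare_and_swap returns the value that was stored in *counter
 * before the operation; the swap succeeded only if that equals old_val. */
static uint32_t atomic_increment(volatile uint32_t *counter)
{
    uint32_t old_val, new_val;
    do {
        old_val = *counter;
        new_val = old_val + 1u;
    } while (__sync_val_compare_and_swap(counter, old_val, new_val) != old_val);
    return new_val;
}

/* Hypothetical helper: write the payload, then set a ready flag.
 * __sync_synchronize() is a full memory barrier, so the payload store
 * cannot be reordered after the flag store. */
static void publish(uint32_t *data, volatile uint32_t *ready, uint32_t value)
{
    *data = value;
    __sync_synchronize();
    *ready = 1u;
}
```

In a hot loop, such as a concurrent collection updating a counter, each iteration goes through the barrier, which is the kind of overhead the measurements in this issue are trying to isolate.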
For example, the implementation of
__sync_val_compare_and_swap(value, comp_val, new_val)
for armv7 looks like:
while for arm64 we have:
We also compared test results with and without the COMPlus_JitNoMemoryBarriers flag set.
For example:
https://github.com/dotnet/performance/tree/master/src/benchmarks/micro/corefx/System.Collections/Concurrent
System.Collections.Concurrent.Count
Results running with COMPlus_JitNoMemoryBarriers = "":
Results running with COMPlus_JitNoMemoryBarriers = 1:
System.Collections.Concurrent.IsEmpty
Results running with COMPlus_JitNoMemoryBarriers = "":
Results running with COMPlus_JitNoMemoryBarriers = 1:
category:cq
theme:barriers
skill-level:expert
cost:medium