[X64] [MichalPetryka] Implement Interlocked for small types. #229

Open
MihuBot opened this issue Jan 24, 2024 · 2 comments

Comments

MihuBot commented Jan 24, 2024

Build completed in 2 hours 58 minutes.
dotnet/runtime#92974

CoreLib diffs

Found 2 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 6796331
Total bytes of diff: 6796443
Total bytes of delta: 112 (0.00 % of base)
    diff is a regression.


Total byte diff includes 112 bytes from reconciling methods
	Base had    0 unique methods,        0 unique bytes
	Diff had    8 unique methods,      112 unique bytes

Top file regressions (bytes):
         112 : System.Private.CoreLib.dasm (0.00 % of base)

1 total files with Code Size differences (0 improved, 1 regressed), 0 unchanged.

Top method regressions (bytes):
          19 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,byte,byte):byte (FullOpts) (0 base, 1 diff methods)
          18 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ushort,ushort):ushort (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,short,short):short (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ubyte,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
          14 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,byte):byte (FullOpts) (0 base, 1 diff methods)
          12 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ushort):ushort (FullOpts) (0 base, 1 diff methods)
          10 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
           9 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,short):short (FullOpts) (0 base, 1 diff methods)

Top method regressions (percentages):
          19 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,byte,byte):byte (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,short,short):short (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ubyte,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
          18 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ushort,ushort):ushort (FullOpts) (0 base, 1 diff methods)
          14 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,byte):byte (FullOpts) (0 base, 1 diff methods)
           9 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,short):short (FullOpts) (0 base, 1 diff methods)
          10 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
          12 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ushort):ushort (FullOpts) (0 base, 1 diff methods)

8 total methods with Code Size differences (0 improved, 8 regressed), 55632 unchanged.
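
As a quick sanity check on the CoreLib summary above, the per-method sizes can be added up and compared with the reported totals. This is a minimal sketch: the numbers are copied from the report, while the dictionary and variable names are illustrative only.

```python
# Per-method code sizes in bytes, copied from "Top method regressions (bytes)".
corelib_regressions = {
    "CompareExchange(byref,byte,byte):byte": 19,
    "CompareExchange(byref,ushort,ushort):ushort": 18,
    "CompareExchange(byref,short,short):short": 15,
    "CompareExchange(byref,ubyte,ubyte):ubyte": 15,
    "Exchange(byref,byte):byte": 14,
    "Exchange(byref,ushort):ushort": 12,
    "Exchange(byref,ubyte):ubyte": 10,
    "Exchange(byref,short):short": 9,
}

base_bytes = 6796331  # "Total bytes of base"
diff_bytes = 6796443  # "Total bytes of diff"

total_delta = sum(corelib_regressions.values())
percent_of_base = round(100 * total_delta / base_bytes, 2)

print(total_delta)              # sum of the eight new methods
print(diff_bytes - base_bytes)  # delta implied by the two totals
print(percent_of_base)          # rounds to 0.0, matching "0.00 % of base"
```

All eight methods are new in the diff (0 bytes in the base), which is why each one is listed as "Infinity of base".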

--------------------------------------------------------------------------------

Frameworks diffs

Diffs
Found 259 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 38179345
Total bytes of diff: 38179457
Total bytes of delta: 112 (0.00 % of base)
Total relative delta: -1.00
    diff is a regression.
    relative diff is an improvement.


Total byte diff includes 483 bytes from reconciling methods
	Base had    0 unique methods,        0 unique bytes
	Diff had    9 unique methods,      483 unique bytes

Top file regressions (bytes):
         112 : System.Private.CoreLib.dasm (0.00 % of base)

1 total files with Code Size differences (0 improved, 1 regressed), 255 unchanged.

Top method regressions (bytes):
         371 (Infinity of base) : System.Net.Sockets.dasm - 
          19 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,byte,byte):byte (FullOpts) (0 base, 1 diff methods)
          18 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ushort,ushort):ushort (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,short,short):short (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ubyte,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
          14 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,byte):byte (FullOpts) (0 base, 1 diff methods)
          12 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ushort):ushort (FullOpts) (0 base, 1 diff methods)
          10 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
           9 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,short):short (FullOpts) (0 base, 1 diff methods)

Top method improvements (bytes):
        -371 (-100.00 % of base) : System.Net.Sockets.dasm - System.Net.Sockets.SocketAsyncEngine:EventLoop():this (FullOpts)

Top method regressions (percentages):
         371 (Infinity of base) : System.Net.Sockets.dasm - 
          19 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,byte,byte):byte (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,short,short):short (FullOpts) (0 base, 1 diff methods)
          15 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ubyte,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
          18 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:CompareExchange(byref,ushort,ushort):ushort (FullOpts) (0 base, 1 diff methods)
          14 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,byte):byte (FullOpts) (0 base, 1 diff methods)
           9 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,short):short (FullOpts) (0 base, 1 diff methods)
          10 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ubyte):ubyte (FullOpts) (0 base, 1 diff methods)
          12 (Infinity of base) : System.Private.CoreLib.dasm - System.Threading.Interlocked:Exchange(byref,ushort):ushort (FullOpts) (0 base, 1 diff methods)

Top method improvements (percentages):
        -371 (-100.00 % of base) : System.Net.Sockets.dasm - System.Net.Sockets.SocketAsyncEngine:EventLoop():this (FullOpts)

10 total methods with Code Size differences (1 improved, 9 regressed), 237452 unchanged.
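
The Frameworks numbers can be reconciled the same way. The sketch below assumes that the 371-byte System.Net.Sockets regression and the 371-byte `EventLoop` improvement offset each other; that is a reading of the figures, not something the report states. Variable names are illustrative.

```python
# Byte counts copied from the Frameworks summary above.
interlocked_bytes = [19, 18, 15, 15, 14, 12, 10, 9]  # new Interlocked methods
sockets_regression = 371    # System.Net.Sockets.dasm entry with no method name
sockets_improvement = -371  # SocketAsyncEngine:EventLoop():this

base_bytes = 38179345
diff_bytes = 38179457

unique_diff_bytes = sum(interlocked_bytes) + sockets_regression
net_delta = unique_diff_bytes + sockets_improvement

print(unique_diff_bytes)        # compare with "Diff had 9 unique methods, 483 unique bytes"
print(net_delta)                # compare with "Total bytes of delta: 112"
print(diff_bytes - base_bytes)  # delta implied by the two totals
```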

--------------------------------------------------------------------------------

Artifacts:

MihuBot commented Jan 24, 2024

Top method regressions

19 (Infinity of base) - System.Threading.Interlocked:CompareExchange(byref,byte,byte):byte
+; Assembly listing for method System.Threading.Interlocked:CompareExchange(byref,byte,byte):byte (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )    byte  ->  rsi         single-def
+;  V02 arg2         [V02,T02] (  3,  3   )    byte  ->  rdx         single-def
+;# V03 OutArgs      [V03    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M18274_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M18274_IG02:
+       movzx    rcx, sil
+       movzx    rax, dl
+       lock     
+       cmpxchg  byte  ptr [rdi], cl
+       movzx    rax, al
+       movsx    rax, al
+						;; size=18 bbWeight=1 PerfScore 19.00
+G_M18274_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 19, prolog size 0, PerfScore 20.00, instruction count 7, allocated bytes for code 19 (MethodHash=86f4b89d) for method System.Threading.Interlocked:CompareExchange(byref,byte,byte):byte (FullOpts)
+; ============================================================
18 (Infinity of base) - System.Threading.Interlocked:CompareExchange(byref,ushort,ushort):ushort
+; Assembly listing for method System.Threading.Interlocked:CompareExchange(byref,ushort,ushort):ushort (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )  ushort  ->  rsi         single-def
+;  V02 arg2         [V02,T02] (  3,  3   )  ushort  ->  rdx         single-def
+;# V03 OutArgs      [V03    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M17455_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M17455_IG02:
+       movsx    rcx, si
+       movsx    rax, dx
+       lock     
+       cmpxchg  word  ptr [rdi], cx
+       cwde     
+       movzx    rax, ax
+						;; size=17 bbWeight=1 PerfScore 19.00
+G_M17455_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 18, prolog size 0, PerfScore 20.00, instruction count 7, allocated bytes for code 19 (MethodHash=a89abbd0) for method System.Threading.Interlocked:CompareExchange(byref,ushort,ushort):ushort (FullOpts)
+; ============================================================
15 (Infinity of base) - System.Threading.Interlocked:CompareExchange(byref,short,short):short
+; Assembly listing for method System.Threading.Interlocked:CompareExchange(byref,short,short):short (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )   short  ->  rsi         single-def
+;  V02 arg2         [V02,T02] (  3,  3   )   short  ->  rdx         single-def
+;# V03 OutArgs      [V03    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M48186_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M48186_IG02:
+       movsx    rcx, si
+       movsx    rax, dx
+       lock     
+       cmpxchg  word  ptr [rdi], cx
+       cwde     
+						;; size=14 bbWeight=1 PerfScore 18.75
+G_M48186_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 15, prolog size 0, PerfScore 19.75, instruction count 6, allocated bytes for code 16 (MethodHash=a4c743c5) for method System.Threading.Interlocked:CompareExchange(byref,short,short):short (FullOpts)
+; ============================================================
15 (Infinity of base) - System.Threading.Interlocked:CompareExchange(byref,ubyte,ubyte):ubyte
+; Assembly listing for method System.Threading.Interlocked:CompareExchange(byref,ubyte,ubyte):ubyte (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )   ubyte  ->  rsi         single-def
+;  V02 arg2         [V02,T02] (  3,  3   )   ubyte  ->  rdx         single-def
+;# V03 OutArgs      [V03    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M32567_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M32567_IG02:
+       movzx    rcx, sil
+       movzx    rax, dl
+       lock     
+       cmpxchg  byte  ptr [rdi], cl
+       movzx    rax, al
+						;; size=14 bbWeight=1 PerfScore 18.75
+G_M32567_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 15, prolog size 0, PerfScore 19.75, instruction count 6, allocated bytes for code 15 (MethodHash=fc5680c8) for method System.Threading.Interlocked:CompareExchange(byref,ubyte,ubyte):ubyte (FullOpts)
+; ============================================================
14 (Infinity of base) - System.Threading.Interlocked:Exchange(byref,byte):byte
+; Assembly listing for method System.Threading.Interlocked:Exchange(byref,byte):byte (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )    byte  ->  rsi         single-def
+;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M16771_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M16771_IG02:
+       movzx    rax, sil
+       xchg     byte  ptr [rdi], al
+       movzx    rax, al
+       movsx    rax, al
+						;; size=13 bbWeight=1 PerfScore 20.75
+G_M16771_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 14, prolog size 0, PerfScore 21.75, instruction count 5, allocated bytes for code 14 (MethodHash=5298be7c) for method System.Threading.Interlocked:Exchange(byref,byte):byte (FullOpts)
+; ============================================================
12 (Infinity of base) - System.Threading.Interlocked:Exchange(byref,ushort):ushort
+; Assembly listing for method System.Threading.Interlocked:Exchange(byref,ushort):ushort (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )  ushort  ->  rsi         single-def
+;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M62339_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M62339_IG02:
+       movsx    rax, si
+       xchg     word  ptr [rdi], ax
+       cwde     
+       movzx    rax, ax
+						;; size=11 bbWeight=1 PerfScore 20.75
+G_M62339_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 12, prolog size 0, PerfScore 21.75, instruction count 5, allocated bytes for code 12 (MethodHash=23160c7c) for method System.Threading.Interlocked:Exchange(byref,ushort):ushort (FullOpts)
+; ============================================================
10 (Infinity of base) - System.Threading.Interlocked:Exchange(byref,ubyte):ubyte
+; Assembly listing for method System.Threading.Interlocked:Exchange(byref,ubyte):ubyte (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )   ubyte  ->  rsi         single-def
+;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M61699_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M61699_IG02:
+       movzx    rax, sil
+       xchg     byte  ptr [rdi], al
+       movzx    rax, al
+						;; size=9 bbWeight=1 PerfScore 20.50
+G_M61699_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 10, prolog size 0, PerfScore 21.50, instruction count 4, allocated bytes for code 10 (MethodHash=ae9b0efc) for method System.Threading.Interlocked:Exchange(byref,ubyte):ubyte (FullOpts)
+; ============================================================
9 (Infinity of base) - System.Threading.Interlocked:Exchange(byref,short):short
+; Assembly listing for method System.Threading.Interlocked:Exchange(byref,short):short (FullOpts)
+; Emitting BLENDED_CODE for X64 with AVX - Unix
+; FullOpts code
+; optimized code
+; rsp based frame
+; partially interruptible
+; No PGO data
+; Final local variable assignments
+;
+;  V00 arg0         [V00,T00] (  3,  3   )   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T01] (  3,  3   )   short  ->  rsi         single-def
+;# V02 OutArgs      [V02    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M6691_IG01:
+						;; size=0 bbWeight=1 PerfScore 0.00
+G_M6691_IG02:
+       movsx    rax, si
+       xchg     word  ptr [rdi], ax
+       cwde     
+						;; size=8 bbWeight=1 PerfScore 20.50
+G_M6691_IG03:
+       ret      
+						;; size=1 bbWeight=1 PerfScore 1.00
+
+; Total bytes of code 9, prolog size 0, PerfScore 21.50, instruction count 4, allocated bytes for code 9 (MethodHash=0111e5dc) for method System.Threading.Interlocked:Exchange(byref,short):short (FullOpts)
+; ============================================================
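
In the `byte` and `ushort` listings above, the result is extended twice before `ret` (`movzx rax, al` then `movsx rax, al`, or `cwde` then `movzx rax, ax`), which accounts for their extra bytes compared with the `ubyte` and `short` variants. The sketch below models only the register arithmetic, assuming standard x64 semantics for these instructions and assuming that `byte` in the JIT dump denotes a signed 8-bit value. It shows that the final extension alone determines the result; the helper names are hypothetical, and it says nothing about whether the JIT can drop the first instruction.

```python
def zero_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits and clear everything above them (models movzx)."""
    return value & ((1 << bits) - 1)


def sign_extend(value: int, bits: int, width: int = 64) -> int:
    """Copy bit `bits - 1` into the bits above it, up to `width` bits
    (models movsx / cwde). Returns an unsigned `width`-bit pattern."""
    low = value & ((1 << bits) - 1)
    if low & (1 << (bits - 1)):
        low -= 1 << bits
    return low & ((1 << width) - 1)


def signed_byte_tail(rax: int) -> int:
    """movzx rax, al ; movsx rax, al"""
    rax = zero_extend(rax, 8)
    return sign_extend(rax, 8)


def ushort_tail(rax: int) -> int:
    """cwde ; movzx rax, ax"""
    rax = sign_extend(rax, 16, width=32)  # cwde writes eax; upper half of rax is zeroed
    return zero_extend(rax, 16)


samples = [0x00, 0x7F, 0x80, 0xFF, 0x7FFF, 0x8000, 0xFFFF, 0x1234_5678_9ABC_DEF0]
for rax in samples:
    # The first extension never changes the low bits the second one reads.
    assert signed_byte_tail(rax) == sign_extend(rax, 8)
    assert ushort_tail(rax) == zero_extend(rax, 16)
```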

MihuBot commented Jan 24, 2024

@MichalPetryka
