item_cmp_type: code shortcut path #223

Closed
wants to merge 1 commit into from

Conversation

grooverdan

The common case for this function is a comparison of two values of the
same type, in which case the result is that same type. Because the
compiler sees an enum that can also hold INVALID_RESULT, it does not
derive the equality shortcut on its own. With this code change, the
compiled code goes from:

0000000000000000 <_Z13item_cmp_type11Item_resultS_>:
   0:	55                   	push   %rbp
   1:	89 f9                	mov    %edi,%ecx
   3:	31 c0                	xor    %eax,%eax
   5:	09 f1                	or     %esi,%ecx
   7:	48 89 e5             	mov    %rsp,%rbp
   a:	74 3d                	je     49 <_Z13item_cmp_type11Item_resultS_+0x49>
   c:	8d 47 fe             	lea    -0x2(%rdi),%eax
   f:	83 e0 fd             	and    $0xfffffffd,%eax
  12:	89 c2                	mov    %eax,%edx
  14:	8d 46 fe             	lea    -0x2(%rsi),%eax
  17:	83 e0 fd             	and    $0xfffffffd,%eax
  1a:	83 ff 02             	cmp    $0x2,%edi
  1d:	89 c1                	mov    %eax,%ecx
  1f:	75 0a                	jne    2b <_Z13item_cmp_type11Item_resultS_+0x2b>
  21:	83 fe 02             	cmp    $0x2,%esi
  24:	b8 02 00 00 00       	mov    $0x2,%eax
  29:	74 1e                	je     49 <_Z13item_cmp_type11Item_resultS_+0x49>
  2b:	83 ff 03             	cmp    $0x3,%edi
  2e:	74 20                	je     50 <_Z13item_cmp_type11Item_resultS_+0x50>
  30:	83 fe 03             	cmp    $0x3,%esi
  33:	74 1b                	je     50 <_Z13item_cmp_type11Item_resultS_+0x50>
  35:	85 d2                	test   %edx,%edx
  37:	b8 01 00 00 00       	mov    $0x1,%eax
  3c:	75 0b                	jne    49 <_Z13item_cmp_type11Item_resultS_+0x49>
  3e:	83 f9 01             	cmp    $0x1,%ecx
  41:	19 c0                	sbb    %eax,%eax
  43:	83 e0 03             	and    $0x3,%eax
  46:	83 c0 01             	add    $0x1,%eax
  49:	5d                   	pop    %rbp
  4a:	c3                   	retq
  4b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  50:	b8 03 00 00 00       	mov    $0x3,%eax
  55:	5d                   	pop    %rbp
  56:	c3                   	retq

to:

0000000000000000 <_Z13item_cmp_type11Item_resultS_>:
   0:	89 f8                	mov    %edi,%eax
   2:	39 f7                	cmp    %esi,%edi
   4:	74 2f                	je     35 <_Z13item_cmp_type11Item_resultS_+0x35>
   6:	83 ff 03             	cmp    $0x3,%edi
   9:	74 25                	je     30 <_Z13item_cmp_type11Item_resultS_+0x30>
   b:	83 fe 03             	cmp    $0x3,%esi
   e:	74 20                	je     30 <_Z13item_cmp_type11Item_resultS_+0x30>
  10:	8d 57 fe             	lea    -0x2(%rdi),%edx
  13:	b8 01 00 00 00       	mov    $0x1,%eax
  18:	83 e2 fd             	and    $0xfffffffd,%edx
  1b:	75 18                	jne    35 <_Z13item_cmp_type11Item_resultS_+0x35>
  1d:	83 ee 02             	sub    $0x2,%esi
  20:	83 e6 fd             	and    $0xfffffffd,%esi
  23:	83 fe 01             	cmp    $0x1,%esi
  26:	19 c0                	sbb    %eax,%eax
  28:	83 e0 03             	and    $0x3,%eax
  2b:	83 c0 01             	add    $0x1,%eax
  2e:	c3                   	retq
  2f:	90                   	nop
  30:	b8 03 00 00 00       	mov    $0x3,%eax
  35:	c3                   	retq

In addition to the shorter path, there are no stack operations,
and the common case, the branch at 4:, becomes easily predicted,
with 4 instructions usually executed.
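
The diff itself is not shown in this conversation, so the following is only a sketch of what the two listings appear to compute, reconstructed from the disassembly. The enumerator values (including INVALID_RESULT = -1) and the `_before`/`_after` suffixes are assumptions made for illustration, not names or values confirmed by the patch.

```cpp
#include <initializer_list>

// Enumerator values are assumptions inferred from the constants in the
// listings (0, 2, 3, and the 1 or 4 produced by the sbb/and/add tail).
enum Item_result {
  INVALID_RESULT = -1,  // assumed value; the text only says it exists
  STRING_RESULT = 0,
  REAL_RESULT = 1,
  INT_RESULT = 2,
  ROW_RESULT = 3,
  DECIMAL_RESULT = 4
};

// One reading of the "before" listing: two separate same-type checks.
Item_result item_cmp_type_before(Item_result a, Item_result b) {
  if (a == STRING_RESULT && b == STRING_RESULT) return STRING_RESULT;
  if (a == INT_RESULT && b == INT_RESULT) return INT_RESULT;
  if (a == ROW_RESULT || b == ROW_RESULT) return ROW_RESULT;
  if ((a == INT_RESULT || a == DECIMAL_RESULT) &&
      (b == INT_RESULT || b == DECIMAL_RESULT))
    return DECIMAL_RESULT;
  return REAL_RESULT;
}

// One reading of the "after" listing: a single equality test up front
// (the cmp/je pair at offsets 2 and 4) replaces both same-type checks.
Item_result item_cmp_type_after(Item_result a, Item_result b) {
  if (a == b) return a;
  if (a == ROW_RESULT || b == ROW_RESULT) return ROW_RESULT;
  if ((a == INT_RESULT || a == DECIMAL_RESULT) &&
      (b == INT_RESULT || b == DECIMAL_RESULT))
    return DECIMAL_RESULT;
  return REAL_RESULT;
}

// Checks that both versions agree on every pair of the five valid values.
bool same_for_valid_inputs() {
  for (Item_result a : {STRING_RESULT, REAL_RESULT, INT_RESULT,
                        ROW_RESULT, DECIMAL_RESULT})
    for (Item_result b : {STRING_RESULT, REAL_RESULT, INT_RESULT,
                          ROW_RESULT, DECIMAL_RESULT})
      if (item_cmp_type_before(a, b) != item_cmp_type_after(a, b))
        return false;
  return true;
}
```

Under these assumptions the two versions agree for every pair of valid values. They differ only when both arguments are INVALID_RESULT: the sketch of the old code falls through to REAL_RESULT, while the shortcut returns INVALID_RESULT.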
@grooverdan
Author

I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

@mysql-oca-bot

Hi, thank you for your contribution. Please confirm this code is submitted under the terms of the OCA (Oracle's Contribution Agreement) you have previously signed by cutting and pasting the following text as a comment:
"I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it."
Thanks

@mysql-oca-bot

Hi, thank you for your contribution. Your code has been assigned to an internal queue. Please follow
bug http://bugs.mysql.com/bug.php?id=92784 for updates.
Thanks
