bpf,x64: Use BMI2 for shifts #3726

Closed

kernel-patches-bot

Pull request for series with
subject: bpf,x64: Use BMI2 for shifts
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=680036

@kernel-patches-bot

Master branch: dbdea9b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=680036
version: 2

@kernel-patches-bot

Master branch: 230bf13
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=680036
version: 2

@kernel-patches-bot

Master branch: bec2171
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=680036
version: 2

@kernel-patches-bot

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=680036 expired. Closing PR.

@kernel-patches-bot

Master branch: 87dbdc2
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=681172
version: 3

@kernel-patches-bot

Master branch: aa55dfd
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=681172
version: 3

@kernel-patches-bot

Master branch: 8526f0d
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=681172
version: 3

@kernel-patches-bot

Master branch: 5ee35ab
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=681172
version: 3

@kernel-patches-bot

Master branch: 5a8921b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=681172
version: 3

@kernel-patches-bot

Master branch: 2ade1cd
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=681172
version: 3

@kernel-patches-bot

Master branch: 2efcf69
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=681172
version: 3

@kernel-patches-bot

Master branch: 62c69e8
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

Master branch: 79d878f
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

Master branch: 6c4e777
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

Master branch: e2ac2a0
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

Master branch: a526a3c
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

Master branch: 05ee658
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

Master branch: 7a698ed
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

Master branch: 01dea95
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

Kernel Patches Daemon and others added 4 commits October 19, 2022 12:12
x64 JIT produces redundant instructions when a shift operation's
destination register is BPF_REG_4/ecx and this patch removes them.

Specifically, when the dest reg is BPF_REG_4 but the src isn't, we
needn't push and pop ecx around the shift only to have it overwritten
by r11 immediately afterwards.

In the rare case when both dest and src registers are BPF_REG_4,
a single shift instruction is sufficient and we don't need the
two MOV instructions around the shift.

To summarize using shift left as an example, without patch:
-------------------------------------------------
            |   dst == ecx     |    dst != ecx
=================================================
src == ecx  |   mov r11, ecx   |    shl dst, cl
            |   shl r11, cl    |
            |   mov ecx, r11   |
-------------------------------------------------
src != ecx  |   mov r11, ecx   |    push ecx
            |   push ecx       |    mov ecx, src
            |   mov ecx, src   |    shl dst, cl
            |   shl r11, cl    |    pop ecx
            |   pop ecx        |
            |   mov ecx, r11   |
-------------------------------------------------

With patch:
-------------------------------------------------
            |   dst == ecx     |    dst != ecx
=================================================
src == ecx  |   shl ecx, cl    |    shl dst, cl
-------------------------------------------------
src != ecx  |   mov r11, ecx   |    push ecx
            |   mov ecx, src   |    mov ecx, src
            |   shl r11, cl    |    shl dst, cl
            |   mov ecx, r11   |    pop ecx
-------------------------------------------------
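The "with patch" table can be read as a three-way decision. The following is a minimal sketch of that decision as a standalone C model, not the kernel's JIT code: the type and function names (`struct insn`, `plan_shl`) are hypothetical, the register ids only follow x86-64 hardware numbering for illustration, and r11 is assumed to be a scratch register that never appears as a BPF operand.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative register ids (x86-64 numbering); names are hypothetical. */
enum reg { REG_RCX = 1, REG_RSI = 6, REG_RDI = 7, REG_R11 = 11 };

enum opcode { OP_MOV, OP_PUSH, OP_POP, OP_SHL_CL };

struct insn {
    enum opcode op;
    enum reg dst;   /* operand written (or pushed/popped) */
    enum reg src;   /* source for OP_MOV; rcx otherwise */
};

/*
 * Fill out[] with the sequence from the "with patch" table for
 * "dst <<= src" and return the number of instructions (1 or 4).
 */
size_t plan_shl(enum reg dst, enum reg src, struct insn out[4])
{
    size_t n = 0;

    assert(out != NULL);
    /* Assumption of this sketch: r11 is scratch-only. */
    assert(dst != REG_R11 && src != REG_R11);

    if (src == REG_RCX) {
        /* Count already in cl: one shift, even when dst is rcx too. */
        out[n++] = (struct insn){ OP_SHL_CL, dst, REG_RCX };
    } else if (dst == REG_RCX) {
        /* Shift a copy in r11 so rcx can hold the count; no push/pop. */
        out[n++] = (struct insn){ OP_MOV, REG_R11, REG_RCX };
        out[n++] = (struct insn){ OP_MOV, REG_RCX, src };
        out[n++] = (struct insn){ OP_SHL_CL, REG_R11, REG_RCX };
        out[n++] = (struct insn){ OP_MOV, REG_RCX, REG_R11 };
    } else {
        /* Neither operand is rcx: preserve rcx around the shift. */
        out[n++] = (struct insn){ OP_PUSH, REG_RCX, REG_RCX };
        out[n++] = (struct insn){ OP_MOV, REG_RCX, src };
        out[n++] = (struct insn){ OP_SHL_CL, dst, REG_RCX };
        out[n++] = (struct insn){ OP_POP, REG_RCX, REG_RCX };
    }
    return n;
}
```

Calling `plan_shl(REG_RCX, REG_RDI, seq)` yields the four-instruction r11 sequence, while `plan_shl(REG_RCX, REG_RCX, seq)` yields a single shift.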

Signed-off-by: Jie Meng <jmeng@fb.com>

BMI2 provides 3 shift instructions (shrx, sarx and shlx) that use VEX
encoding but target general purpose registers [1]. They allow the shift
count in any general purpose register and have the same performance as
non-BMI2 shift instructions [2].

Instead of shr/sar/shl that implicitly use %cl (lowest 8 bits of %rcx),
emit their more flexible alternatives provided in BMI2 when advantageous;
keep using the non-BMI2 instructions when the shift count is already in
BPF_REG_4/%rcx as the non-BMI2 instructions are shorter.

To summarize, when BMI2 is available:
-------------------------------------------------
            |   arbitrary dst
=================================================
src == ecx  |   shl dst, cl
-------------------------------------------------
src != ecx  |   shlx dst, dst, src
-------------------------------------------------

And no additional register shuffling is needed.
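The selection rule in the table above can be sketched as a small C helper. This is an illustrative model only; the enum and function names are hypothetical and the real JIT would query the CPU feature through kernel facilities rather than take a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three code shapes discussed above. */
enum shift_form {
    SHIFT_LEGACY_CL,       /* shl dst, cl                      */
    SHIFT_LEGACY_SHUFFLE,  /* push/mov or r11 shuffling + shl  */
    SHIFT_BMI2             /* shlx dst, dst, src               */
};

#define REG_RCX 1u  /* x86-64 hardware number of rcx */

enum shift_form choose_shift_form(bool has_bmi2, unsigned src_reg)
{
    assert(src_reg < 16);

    /* Count already in rcx: the legacy form is shorter, keep it. */
    if (src_reg == REG_RCX)
        return SHIFT_LEGACY_CL;

    /* Otherwise prefer the BMI2 form when the CPU supports it. */
    return has_bmi2 ? SHIFT_BMI2 : SHIFT_LEGACY_SHUFFLE;
}
```

Note that the destination register does not enter the decision once BMI2 is available, which matches the "arbitrary dst" column.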

A concrete example comparing non-BMI2 and BMI2 codegen.  To shift %rsi by
%rdi:

Without BMI2:

 ef3:   push   %rcx
        51
 ef4:   mov    %rdi,%rcx
        48 89 f9
 ef7:   shl    %cl,%rsi
        48 d3 e6
 efa:   pop    %rcx
        59

With BMI2:

 f0b:   shlx   %rdi,%rsi,%rsi
        c4 e2 c1 f7 f6
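
The five bytes above follow from the VEX encoding of shlx (VEX.LZ.66.0F38.W1 F7 /r), where ModRM.reg holds the destination, ModRM.rm the value to shift, and VEX.vvvv the inverted count register. Below is a minimal sketch of an encoder for the 64-bit register-to-register form; the function name `encode_shlx64` is hypothetical and this is not the helper used by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86-64 hardware register numbers (rax = 0 ... r15 = 15). */
enum { RCX = 1, RSI = 6, RDI = 7 };

/*
 * Encode "shlx dst, src, count" (Intel operand order) for 64-bit
 * registers. Writes 5 bytes to buf and returns the length.
 */
size_t encode_shlx64(uint8_t buf[5], unsigned dst, unsigned src,
                     unsigned count)
{
    assert(dst < 16 && src < 16 && count < 16);

    unsigned r = (dst >> 3) & 1;  /* high bit of ModRM.reg */
    unsigned b = (src >> 3) & 1;  /* high bit of ModRM.rm  */

    buf[0] = 0xC4;                /* 3-byte VEX escape */
    /* ~R, ~X (no index register, so 1), ~B, then map select 0F38. */
    buf[1] = (uint8_t)((!r << 7) | (1u << 6) | (!b << 5) | 0x02);
    /* W=1, inverted vvvv = count register, L=0, pp=01 (66 prefix). */
    buf[2] = (uint8_t)(0x80 | ((~count & 0xFu) << 3) | 0x01);
    buf[3] = 0xF7;                /* opcode */
    /* mod=11 (register direct), reg=dst, rm=src. */
    buf[4] = (uint8_t)(0xC0 | ((dst & 7) << 3) | (src & 7));
    return 5;
}
```

With `dst = RSI`, `src = RSI` and `count = RDI` the buffer holds `c4 e2 c1 f7 f6`, the same bytes as the listing, compared with 8 bytes for the push/mov/shl/pop sequence.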

[1] https://en.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set
[2] https://www.agner.org/optimize/instruction_tables.pdf

Signed-off-by: Jie Meng <jmeng@fb.com>

Current tests cover only shifts whose source operand (the shift count)
is an immediate; add a new test case that covers a register operand.

Signed-off-by: Jie Meng <jmeng@fb.com>
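
As a rough illustration of what a register-operand shift test has to check, here is a small reference model of 64-bit left, logical right and arithmetic right shifts by a count held in a register. It is a sketch, not the selftest added by this series: the function names are hypothetical, and it assumes the count is reduced to its low 6 bits, which is what x86-64 shift instructions do for 64-bit operands.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: counts are masked to 0..63, as x86-64 hardware does. */
uint64_t lsh64(uint64_t x, uint64_t count)
{
    return x << (count & 63);
}

uint64_t rsh64(uint64_t x, uint64_t count)
{
    return x >> (count & 63);
}

uint64_t arsh64(uint64_t x, uint64_t count)
{
    unsigned n = (unsigned)(count & 63);

    /* Portable arithmetic shift: avoid >> on a negative signed value. */
    if (x >> 63)
        return ~(~x >> n);
    return x >> n;
}

/* Example checks with the count supplied as a runtime value. */
void shift_model_selfcheck(void)
{
    volatile uint64_t count = 4;  /* volatile: keep it a runtime operand */

    assert(lsh64(0x1, count) == 0x10);
    assert(rsh64(0x10, count) == 0x1);
    assert(arsh64(UINT64_C(0xFFFFFFFFFFFFFF00), count) ==
           UINT64_C(0xFFFFFFFFFFFFFFF0));
}
```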
@kernel-patches-bot

Master branch: 81bfcc3
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=683811
version: 5

@kernel-patches-bot

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=683811 irrelevant now. Closing PR.

@kernel-patches-bot kernel-patches-bot deleted the series/680036=>bpf-next branch October 20, 2022 00:12