Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BPF optimizer: Optimize redundant loads. #9855

Merged
merged 1 commit into from
Jan 4, 2024
Merged

Conversation

copybara-service[bot]
Copy link

@copybara-service copybara-service bot commented Jan 3, 2024

BPF optimizer: Optimize redundant loads.

This is useful in conjunction cases when a single argument is compared
against multiple values, e.g. for fcntl(2) where args[1] may be F_GETFL,
F_SETFL, or F_GETFD. There is no need to reload the value of args[1]
into the register if it is already the same as the one that was loaded from
the previous check.

This helps with syscall rules where one specific argument may be one of many
specific values, especially ioctl rules (critical for KVM and nvproxy
performance), and the benchmarks seem to agree:

# Time to evaluate seccomp filter for a syscall:
# (Many syscalls for which the rules are not "this argument must be one of
# these values" show no change and are omitted from this list)
                                      │    before    │                   after                   │
                                      │    sec/op    │     sec/op      vs base                   │
SentrySystrap/Postgres/futex            62.39n ± 13%    53.06n ±  39%        ~ (p=0.184 n=19+20)
SentrySystrap/Postgres/nanosleep        237.7n ± 33%    119.3n ± 142%        ~ (p=0.089 n=12)
[...]
SentryKVM/Postgres/ioctl                60.88n ± 17%    56.63n ±  12%   -6.98% (p=0.026 n=20+19)
SentryKVM/Postgres/rt_sigreturn         34.67n ± 32%    29.33n ±  49%        ~ (p=0.443 n=20+18)
[...]
NVProxyIoctl/nvproxy/ioctl_3222292175   58.81n ±  8%    54.85n ±   8%   -6.73% (p=0.015 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3221767894   58.22n ±  5%    54.06n ±   6%   -7.14% (p=0.022 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3222292009   58.33n ±  8%    55.67n ±   3%        ~ (p=0.070 n=28+30)
NVProxyIoctl/nvproxy/ioctl_3223340586   61.57n ±  7%    58.56n ±   7%        ~ (p=0.059 n=27+29)
NVProxyIoctl/nvproxy/ioctl_3223340587   63.10n ±  9%    58.78n ±   7%        ~ (p=0.140 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3223864875   63.66n ±  7%    58.25n ±   3%   -8.50% (p=0.030 n=29)
NVProxyIoctl/nvproxy/ioctl_3224389163   64.12n ±  5%    60.96n ±   4%        ~ (p=0.114 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3223078452   63.90n ±  9%    58.25n ±   6%   -8.84% (p=0.014 n=29)
NVProxyIoctl/nvproxy/ioctl_3222816309   66.07n ± 10%    58.08n ±   5%  -12.10% (p=0.003 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3233302090   65.55n ±  7%    58.88n ±   8%  -10.17% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3224913486   65.62n ±  6%    61.10n ±   7%        ~ (p=0.072 n=28+29)
NVProxyIoctl/nvproxy/ioctl_3223340623   63.45n ±  7%    59.47n ±   6%   -6.26% (p=0.018 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3223864926   66.35n ±  8%    59.45n ±   5%  -10.41% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_805306369    62.98n ±  6%    56.26n ±   7%  -10.66% (p=0.006 n=29+30)
NVProxyIoctl/nvproxy/ioctl_75           69.09n ±  7%    64.87n ±   8%        ~ (p=0.083 n=29+28)
NVProxyIoctl/nvproxy/ioctl_805306370    72.30n ±  8%    57.81n ±   7%  -20.04% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_23           79.59n ±  5%    73.65n ±   4%   -7.46% (p=0.001 n=29)
NVProxyIoctl/nvproxy/ioctl_24           79.70n ±  4%    73.86n ±   3%   -7.33% (p=0.014 n=29+30)
NVProxyIoctl/nvproxy/ioctl_25           79.32n ±  4%    77.69n ±   6%        ~ (p=0.468 n=29)
NVProxyIoctl/nvproxy/ioctl_26           80.15n ±  5%    76.61n ±   6%        ~ (p=0.140 n=29+30)
NVProxyIoctl/nvproxy/ioctl_27           80.02n ±  5%    75.77n ±   7%   -5.31% (p=0.003 n=29+30)
NVProxyIoctl/nvproxy/ioctl_28           80.52n ±  6%    76.79n ±   3%   -4.63% (p=0.028 n=29+30)
NVProxyIoctl/nvproxy/ioctl_33           85.05n ±  8%    78.23n ±   5%   -8.02% (p=0.028 n=29+30)
NVProxyIoctl/nvproxy/ioctl_34           83.57n ±  7%    78.74n ±   4%   -5.77% (p=0.035 n=29+30)
NVProxyIoctl/nvproxy/ioctl_37           82.20n ±  6%    79.85n ±   7%        ~ (p=0.474 n=29+30)
NVProxyIoctl/nvproxy/ioctl_38           85.38n ±  6%    78.23n ±   6%   -8.38% (p=0.001 n=29+30)
NVProxyIoctl/nvproxy/ioctl_39           85.53n ±  4%    79.87n ±   6%   -6.62% (p=0.007 n=29+30)
NVProxyIoctl/nvproxy/ioctl_45           86.29n ±  4%    83.78n ±   8%        ~ (p=0.176 n=29+30)
NVProxyIoctl/nvproxy/ioctl_65           86.89n ±  4%    82.06n ±   5%   -5.56% (p=0.028 n=29+30)
NVProxyIoctl/nvproxy/ioctl_68           88.80n ±  6%    80.78n ±   7%   -9.03% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_72           90.13n ±  3%    82.77n ±   6%   -8.17% (p=0.005 n=29+30)
NVProxyIoctl/nvproxy/ioctl_73           87.23n ±  6%    81.86n ±   4%   -6.16% (p=0.039 n=29+30)
NVProxyIoctl/nvproxy/ioctl_21506        55.90n ±  7%    56.20n ±   6%        ~ (p=0.728 n=28+30)
NVProxyIoctl/nvproxy/ioctl_3224913447   57.26n ±  9%    55.15n ±   7%   -3.68% (p=0.028 n=27+30)
NVProxyIoctl/nvproxy-48                 69.44n ±  7%    64.64n ±   6%   -6.91% (p=0.002 n=10)
geomean                                 61.27n          56.73n          -7.41%

# Number of BPF instructions before optimizations:
                          │   before    │                after                 │
                          │  gen-instr  │  gen-instr   vs base                 │
SentrySystrap/Postgres-48   1.487k ± 0%   1.487k ± 0%       ~ (p=1.000 n=10) ¹
SentryKVM/Postgres-48       1.345k ± 0%   1.345k ± 0%       ~ (p=1.000 n=10) ¹
NVProxyIoctl/nvproxy-48     1.931k ± 0%   1.931k ± 0%       ~ (p=1.000 n=10) ¹
geomean                     1.569k        1.569k       +0.00%
¹ all samples are equal

# Number of BPF instructions after optimizations:
                          │   before   │               after                │
                          │ opt-instr  │ opt-instr   vs base                │
SentrySystrap/Postgres-48   382.0 ± 0%   355.0 ± 0%   -7.07% (p=0.000 n=10)
SentryKVM/Postgres-48       322.0 ± 0%   297.0 ± 0%   -7.76% (p=0.000 n=10)
NVProxyIoctl/nvproxy-48     473.0 ± 0%   409.0 ± 0%  -13.53% (p=0.000 n=10)
geomean                     387.5        350.7        -9.50%

# Ratio of post-optimizer #instructions vs pre-optimizer #instructions:
                          │      before       │                   after                   │
                          │ compression-ratio │ compression-ratio  vs base                │
SentrySystrap/Postgres-48          3.893 ± 0%          4.189 ± 0%   +7.60% (p=0.000 n=10)
SentryKVM/Postgres-48              4.177 ± 0%          4.529 ± 0%   +8.43% (p=0.000 n=10)
NVProxyIoctl/nvproxy-48            4.082 ± 0%          4.721 ± 0%  +15.65% (p=0.000 n=10)
geomean                            4.049               4.474       +10.50%

# Time to run optimizer:
                          │   before    │                after                │
                          │   opt-sec   │   opt-sec    vs base                │
SentrySystrap/Postgres-48   4.699m ± 3%   5.327m ± 4%  +13.34% (p=0.000 n=10)
SentryKVM/Postgres-48       4.130m ± 3%   4.651m ± 3%  +12.60% (p=0.000 n=10)
NVProxyIoctl/nvproxy-48     10.19m ± 3%   10.42m ± 1%   +2.26% (p=0.023 n=10)
geomean                     5.826m        6.367m        +9.28%

Note that this only applies in practice because we extract each 32-bit "half"
of each syscall argument into its own matcher rule. So the conjunction that
this optimizes for isn't exactly over a single argument, but over a single
32-bit half of an argument.

@copybara-service copybara-service bot added the exported Issue was exported automatically label Jan 3, 2024
@copybara-service copybara-service bot force-pushed the test/cl595244732 branch 3 times, most recently from 2d1c2d8 to 1633cd9 Compare January 4, 2024 02:40
This is useful in conjunction cases when a single argument is compared
against multiple values, e.g. for `fcntl(2)` where `args[1]` may be `F_GETFL`,
`F_SETFL`, or `F_GETFD`. There is no need to reload the value of `args[1]`
into the register if it is already the same as the one that was loaded from
the previous check.

This helps with syscall rules where one specific argument may be one of many
specific values, especially `ioctl` rules (critical for KVM and `nvproxy`
performance), and the benchmarks seem to agree:

```
# Time to evaluate seccomp filter for a syscall:
# (Many syscalls for which the rules are not "this argument must be one of
# these values" show no change and are omitted from this list)
                                      │    before    │                   after                   │
                                      │    sec/op    │     sec/op      vs base                   │
SentrySystrap/Postgres/futex            62.39n ± 13%    53.06n ±  39%        ~ (p=0.184 n=19+20)
SentrySystrap/Postgres/nanosleep        237.7n ± 33%    119.3n ± 142%        ~ (p=0.089 n=12)
[...]
SentryKVM/Postgres/ioctl                60.88n ± 17%    56.63n ±  12%   -6.98% (p=0.026 n=20+19)
SentryKVM/Postgres/rt_sigreturn         34.67n ± 32%    29.33n ±  49%        ~ (p=0.443 n=20+18)
[...]
NVProxyIoctl/nvproxy/ioctl_3222292175   58.81n ±  8%    54.85n ±   8%   -6.73% (p=0.015 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3221767894   58.22n ±  5%    54.06n ±   6%   -7.14% (p=0.022 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3222292009   58.33n ±  8%    55.67n ±   3%        ~ (p=0.070 n=28+30)
NVProxyIoctl/nvproxy/ioctl_3223340586   61.57n ±  7%    58.56n ±   7%        ~ (p=0.059 n=27+29)
NVProxyIoctl/nvproxy/ioctl_3223340587   63.10n ±  9%    58.78n ±   7%        ~ (p=0.140 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3223864875   63.66n ±  7%    58.25n ±   3%   -8.50% (p=0.030 n=29)
NVProxyIoctl/nvproxy/ioctl_3224389163   64.12n ±  5%    60.96n ±   4%        ~ (p=0.114 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3223078452   63.90n ±  9%    58.25n ±   6%   -8.84% (p=0.014 n=29)
NVProxyIoctl/nvproxy/ioctl_3222816309   66.07n ± 10%    58.08n ±   5%  -12.10% (p=0.003 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3233302090   65.55n ±  7%    58.88n ±   8%  -10.17% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3224913486   65.62n ±  6%    61.10n ±   7%        ~ (p=0.072 n=28+29)
NVProxyIoctl/nvproxy/ioctl_3223340623   63.45n ±  7%    59.47n ±   6%   -6.26% (p=0.018 n=29+30)
NVProxyIoctl/nvproxy/ioctl_3223864926   66.35n ±  8%    59.45n ±   5%  -10.41% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_805306369    62.98n ±  6%    56.26n ±   7%  -10.66% (p=0.006 n=29+30)
NVProxyIoctl/nvproxy/ioctl_75           69.09n ±  7%    64.87n ±   8%        ~ (p=0.083 n=29+28)
NVProxyIoctl/nvproxy/ioctl_805306370    72.30n ±  8%    57.81n ±   7%  -20.04% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_23           79.59n ±  5%    73.65n ±   4%   -7.46% (p=0.001 n=29)
NVProxyIoctl/nvproxy/ioctl_24           79.70n ±  4%    73.86n ±   3%   -7.33% (p=0.014 n=29+30)
NVProxyIoctl/nvproxy/ioctl_25           79.32n ±  4%    77.69n ±   6%        ~ (p=0.468 n=29)
NVProxyIoctl/nvproxy/ioctl_26           80.15n ±  5%    76.61n ±   6%        ~ (p=0.140 n=29+30)
NVProxyIoctl/nvproxy/ioctl_27           80.02n ±  5%    75.77n ±   7%   -5.31% (p=0.003 n=29+30)
NVProxyIoctl/nvproxy/ioctl_28           80.52n ±  6%    76.79n ±   3%   -4.63% (p=0.028 n=29+30)
NVProxyIoctl/nvproxy/ioctl_33           85.05n ±  8%    78.23n ±   5%   -8.02% (p=0.028 n=29+30)
NVProxyIoctl/nvproxy/ioctl_34           83.57n ±  7%    78.74n ±   4%   -5.77% (p=0.035 n=29+30)
NVProxyIoctl/nvproxy/ioctl_37           82.20n ±  6%    79.85n ±   7%        ~ (p=0.474 n=29+30)
NVProxyIoctl/nvproxy/ioctl_38           85.38n ±  6%    78.23n ±   6%   -8.38% (p=0.001 n=29+30)
NVProxyIoctl/nvproxy/ioctl_39           85.53n ±  4%    79.87n ±   6%   -6.62% (p=0.007 n=29+30)
NVProxyIoctl/nvproxy/ioctl_45           86.29n ±  4%    83.78n ±   8%        ~ (p=0.176 n=29+30)
NVProxyIoctl/nvproxy/ioctl_65           86.89n ±  4%    82.06n ±   5%   -5.56% (p=0.028 n=29+30)
NVProxyIoctl/nvproxy/ioctl_68           88.80n ±  6%    80.78n ±   7%   -9.03% (p=0.000 n=29+30)
NVProxyIoctl/nvproxy/ioctl_72           90.13n ±  3%    82.77n ±   6%   -8.17% (p=0.005 n=29+30)
NVProxyIoctl/nvproxy/ioctl_73           87.23n ±  6%    81.86n ±   4%   -6.16% (p=0.039 n=29+30)
NVProxyIoctl/nvproxy/ioctl_21506        55.90n ±  7%    56.20n ±   6%        ~ (p=0.728 n=28+30)
NVProxyIoctl/nvproxy/ioctl_3224913447   57.26n ±  9%    55.15n ±   7%   -3.68% (p=0.028 n=27+30)
NVProxyIoctl/nvproxy-48                 69.44n ±  7%    64.64n ±   6%   -6.91% (p=0.002 n=10)
geomean                                 61.27n          56.73n          -7.41%

# Number of BPF instructions before optimizations:
                          │   before    │                after                 │
                          │  gen-instr  │  gen-instr   vs base                 │
SentrySystrap/Postgres-48   1.487k ± 0%   1.487k ± 0%       ~ (p=1.000 n=10) ¹
SentryKVM/Postgres-48       1.345k ± 0%   1.345k ± 0%       ~ (p=1.000 n=10) ¹
NVProxyIoctl/nvproxy-48     1.931k ± 0%   1.931k ± 0%       ~ (p=1.000 n=10) ¹
geomean                     1.569k        1.569k       +0.00%
¹ all samples are equal

# Number of BPF instructions after optimizations:
                          │   before   │               after                │
                          │ opt-instr  │ opt-instr   vs base                │
SentrySystrap/Postgres-48   382.0 ± 0%   355.0 ± 0%   -7.07% (p=0.000 n=10)
SentryKVM/Postgres-48       322.0 ± 0%   297.0 ± 0%   -7.76% (p=0.000 n=10)
NVProxyIoctl/nvproxy-48     473.0 ± 0%   409.0 ± 0%  -13.53% (p=0.000 n=10)
geomean                     387.5        350.7        -9.50%

# Ratio of post-optimizer #instructions vs pre-optimizer #instructions:
                          │      before       │                   after                   │
                          │ compression-ratio │ compression-ratio  vs base                │
SentrySystrap/Postgres-48          3.893 ± 0%          4.189 ± 0%   +7.60% (p=0.000 n=10)
SentryKVM/Postgres-48              4.177 ± 0%          4.529 ± 0%   +8.43% (p=0.000 n=10)
NVProxyIoctl/nvproxy-48            4.082 ± 0%          4.721 ± 0%  +15.65% (p=0.000 n=10)
geomean                            4.049               4.474       +10.50%

# Time to run optimizer:
                          │   before    │                after                │
                          │   opt-sec   │   opt-sec    vs base                │
SentrySystrap/Postgres-48   4.699m ± 3%   5.327m ± 4%  +13.34% (p=0.000 n=10)
SentryKVM/Postgres-48       4.130m ± 3%   4.651m ± 3%  +12.60% (p=0.000 n=10)
NVProxyIoctl/nvproxy-48     10.19m ± 3%   10.42m ± 1%   +2.26% (p=0.023 n=10)
geomean                     5.826m        6.367m        +9.28%
```

Note that this only applies in practice because we extract each 32-bit "half"
of each syscall argument into its own matcher rule. So the conjunction that
this optimizes for isn't exactly over a single argument, but over a single
32-bit half of an argument.

PiperOrigin-RevId: 595563115
@copybara-service copybara-service bot merged commit c7e3ea3 into master Jan 4, 2024
@copybara-service copybara-service bot deleted the test/cl595244732 branch January 4, 2024 03:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exported Issue was exported automatically
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant