chore: Small CI fixes #3566

Merged
larseggert merged 1 commit into mozilla:main from larseggert:chore-ci-fixes
Apr 18, 2026

@larseggert
Collaborator

@larseggert larseggert commented Apr 17, 2026

  • Kill old processes hogging needed ports.
  • Fix pkgin.
  • Turn off qlog for benches (oops).
  • Generate a separate profile for each bench variant.

Copilot AI review requested due to automatic review settings April 17, 2026 15:06
Contributor

@github-actions github-actions Bot left a comment


Looks good overall — small, targeted CI fixes. Two inline comments follow.

Summary: The pkgin path fix for NetBSD and the stale-process cleanup for port 4433 are straightforward. The kill_port helper in perfcompare.py is clean. One formatting/style issue on the NetBSD line worth addressing.

Comment thread .github/actions/check-vm/action.yml
Comment thread .github/workflows/bench.yml
Contributor

Copilot AI left a comment


Pull request overview

This PR makes small CI reliability fixes by cleaning up leftover processes that can block required ports and by correcting NetBSD pkgin usage in the VM setup action.

Changes:

  • Kill stale processes holding port 4433 before running the bench workflow.
  • Add a kill_port() helper in perfcompare.py and invoke it during setup to avoid port conflicts.
  • Fix NetBSD package installation by calling pkgin via its full path and adding missing build tools.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| .github/workflows/bench.yml | Proactively frees port 4433 prior to running benchmarks on the self-hosted runner. |
| .github/scripts/perfcompare.py | Adds port cleanup in the perf-compare setup to avoid failures from stale listeners. |
| .github/actions/check-vm/action.yml | Fixes NetBSD VM dependency installation by using /usr/pkg/bin/pkgin and installing additional build tools. |

Comment thread .github/scripts/perfcompare.py
@codecov

codecov Bot commented Apr 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.42%. Comparing base (2862462) to head (2905bc9).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3566      +/-   ##
==========================================
- Coverage   94.55%   94.42%   -0.14%     
==========================================
  Files         128      133       +5     
  Lines       39736    40163     +427     
  Branches    39736    40163     +427     
==========================================
+ Hits        37573    37924     +351     
- Misses       1327     1393      +66     
- Partials      836      846      +10     
Flag Coverage Δ
freebsd 93.48% <ø> (-0.13%) ⬇️
linux 94.52% <ø> (+<0.01%) ⬆️
macos 94.47% <ø> (+<0.01%) ⬆️
windows 94.52% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
neqo-common 98.61% <ø> (ø)
neqo-crypto 87.08% <ø> (ø)
neqo-http3 93.92% <ø> (ø)
neqo-qpack 95.14% <ø> (ø)
neqo-transport 95.61% <ø> (+<0.01%) ⬆️
neqo-udp 84.90% <ø> (ø)
mtu 86.61% <ø> (ø)
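
The percentages in the coverage diff above can be cross-checked from the raw counts. The following is a minimal sketch, not Codecov's implementation: the function name `coverage_percent` is hypothetical, and truncating (rather than rounding) to two decimals is an assumption that merely happens to reproduce both reported values.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage in percent, truncated (not rounded) to two decimals.

    Truncation is an assumption chosen because it matches the figures
    in the report above; it is not taken from Codecov documentation.
    """
    return math.floor(hits / lines * 10_000) / 100


# Counts copied from the coverage diff (base = main, head = this PR).
base = coverage_percent(37573, 39736)
head = coverage_percent(37924, 40163)

# Hits, misses and partials should add up to the total line count.
base_total = 37573 + 1327 + 836
head_total = 37924 + 1393 + 846

print(f"base: {base}%, head: {head}%")
print(f"totals: {base_total}, {head_total}")
```

With these inputs the sketch yields 94.55% for the base and 94.42% for the head, and the three categories sum to the reported line counts (39736 and 40163).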

Contributor

@github-actions github-actions Bot left a comment


Clean CI maintenance PR. Four independent fixes, all straightforward.

The test-fixture/bench feature propagation is the most important change here. Without it, new_neqo_qlog() in test-fixture/src/lib.rs:553 was creating live qlog instances during benchmarks (the cfg!(feature = "bench") guard was never true for test-fixture), adding I/O noise to measurements. The fix correctly threads the feature through neqo-http3/bench and neqo-transport/bench. This does create a mutual feature cycle (neqo-http3[bench] ↔ test-fixture[bench] and neqo-transport[bench] ↔ test-fixture[bench]), but Cargo handles this correctly since the back-edge is through a dev-dependency.

The per-variant perf profiling loop in bench.yml is a nice improvement — individual flamegraphs per benchmark variant will be much more useful than a single combined profile. The fallback to whole-binary profiling for non-criterion benchmarks (e.g., min_bandwidth) is handled correctly.

The NetBSD ;; formatting issue noted by an earlier reviewer is still present on this commit.

Comment on lines +108 to +113
    for proto in ("udp", "tcp"):
        subprocess.run(
            ["sudo", "fuser", "-k", f"{port}/{proto}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
Contributor


Tip

subprocess.run defaults to check=False, so this works correctly. However, since fuser -k returns non-zero when no matching process exists, adding check=False explicitly would make the intent clearer — consistent with how sh() is defined elsewhere in this file.

(Re: Copilot's suggestion to add shutil.which guards and timeouts — I'd push back on that. This script only runs on the self-hosted benchmark runner where sudo, fuser, and passwordless sudo are guaranteed. Those guards add complexity for an environment that doesn't need them.)

Suggested change
-    for proto in ("udp", "tcp"):
-        subprocess.run(
-            ["sudo", "fuser", "-k", f"{port}/{proto}"],
-            stdout=subprocess.DEVNULL,
-            stderr=subprocess.DEVNULL,
-        )
+    for proto in ("udp", "tcp"):
+        subprocess.run(
+            ["sudo", "fuser", "-k", f"{port}/{proto}"],
+            stdout=subprocess.DEVNULL,
+            stderr=subprocess.DEVNULL,
+            check=False,
+        )
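
For context, a helper built around this loop might look roughly like the sketch below. Only the loop body and the name `kill_port` come from the PR; the `kill_port(port, run=...)` signature, the `fuser_commands` helper, and the injectable `run` parameter are assumptions added here so the logic can be exercised without `sudo` or `fuser` being present.

```python
import subprocess
from typing import Callable


def fuser_commands(port: int) -> list[list[str]]:
    """Build one `sudo fuser -k` command per protocol for the given port.

    Hypothetical helper: the PR inlines these command lists in the loop.
    """
    return [["sudo", "fuser", "-k", f"{port}/{proto}"] for proto in ("udp", "tcp")]


def kill_port(port: int, run: Callable[..., object] = subprocess.run) -> None:
    """Best-effort cleanup of processes still bound to `port`.

    `run` defaults to subprocess.run; the parameter exists only in this
    sketch so that tests can substitute a fake runner.
    """
    for cmd in fuser_commands(port):
        # check=False is explicit: fuser exits non-zero when nothing holds
        # the port, and that case must not raise.
        run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
```

On the benchmark runner this would be called as `kill_port(4433)` before the benchmarks start. Passing a recording function as `run` shows which commands would be issued without executing them.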

@github-actions
Contributor

Benchmark results

Significant performance differences relative to a1331f5.

transfer/walltime/pacing-false/varying-seeds: 💚 Performance has improved by -71.505%.
       time:   [23.089 ms 23.107 ms 23.127 ms]
       change: [-71.533% -71.505% -71.475] (p = 0.00 < 0.05)
       Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-true/varying-seeds: 💚 Performance has improved by -71.676%.
       time:   [23.422 ms 23.453 ms 23.503 ms]
       change: [-71.723% -71.676% -71.610] (p = 0.00 < 0.05)
       Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
5 (5.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-false/same-seed: 💚 Performance has improved by -71.459%.
       time:   [23.213 ms 23.241 ms 23.281 ms]
       change: [-71.504% -71.459% -71.405] (p = 0.00 < 0.05)
       Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-true/same-seed: 💚 Performance has improved by -71.283%.
       time:   [23.833 ms 23.852 ms 23.872 ms]
       change: [-71.347% -71.283% -71.236] (p = 0.00 < 0.05)
       Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
5 (5.00%) high mild
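
To put the roughly -71.5% figures in perspective, the baseline time they imply can be back-computed from the new time. This is a rough sketch that assumes the reported change is the relative change (new - old) / old and that uses only the point estimates; the function name `implied_baseline` is hypothetical.

```python
def implied_baseline(new_value: float, change_percent: float) -> float:
    """Baseline implied by a new value and a relative change in percent.

    Assumes change_percent = (new - old) / old * 100.
    """
    return new_value / (1.0 + change_percent / 100.0)


# Point estimates for transfer/walltime/pacing-false/varying-seeds.
new_ms = 23.107
change_percent = -71.505

baseline_ms = implied_baseline(new_ms, change_percent)
speedup = baseline_ms / new_ms

print(f"implied baseline: {baseline_ms:.2f} ms, speedup: {speedup:.2f}x")
```

Under these assumptions the baseline was about 81 ms per iteration, so the run on this PR is roughly 3.5 times faster. A change of that size is consistent with the "Turn off qlog for benches" fix, not with a change in transport behavior.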
All results
transfer/1-conn/1-100mb-resp (aka. Download): No change in performance detected.
       time:   [207.53 ms 207.87 ms 208.29 ms]
       thrpt:  [480.10 MiB/s 481.06 MiB/s 481.85 MiB/s]
change:
       time:   [-0.1756% +0.0447% +0.2784] (p = 0.71 > 0.05)
       thrpt:  [-0.2776% -0.0447% +0.1759]
       No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
transfer/1-conn/10_000-parallel-1b-resp (aka. RPS): Change within noise threshold.
       time:   [288.00 ms 289.66 ms 291.34 ms]
       thrpt:  [34.324 Kelem/s 34.523 Kelem/s 34.722 Kelem/s]
change:
       time:   [-2.0491% -1.3367% -0.5871] (p = 0.00 < 0.05)
       thrpt:  [+0.5906% +1.3548% +2.0920]
       Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/1-conn/1-1b-resp (aka. HPS): No change in performance detected.
       time:   [38.626 ms 38.787 ms 38.970 ms]
       thrpt:  [25.661   B/s 25.782   B/s 25.889   B/s]
change:
       time:   [-0.3990% +0.2171% +0.7923] (p = 0.50 > 0.05)
       thrpt:  [-0.7861% -0.2166% +0.4006]
       No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
transfer/1-conn/1-100mb-req (aka. Upload): Change within noise threshold.
       time:   [208.98 ms 209.34 ms 209.73 ms]
       thrpt:  [476.81 MiB/s 477.68 MiB/s 478.51 MiB/s]
change:
       time:   [+0.2296% +0.4689% +0.7234] (p = 0.00 < 0.05)
       thrpt:  [-0.7182% -0.4667% -0.2291]
       Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
streams/walltime/1-streams/each-1000-bytes: No change in performance detected.
       time:   [589.05 µs 591.07 µs 593.52 µs]
       change: [-0.5011% +0.0241% +0.5607] (p = 0.93 > 0.05)
       No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) high mild
7 (7.00%) high severe
streams/walltime/1000-streams/each-1-bytes: No change in performance detected.
       time:   [12.272 ms 12.290 ms 12.309 ms]
       change: [-0.4312% -0.2115% +0.0096] (p = 0.06 > 0.05)
       No change in performance detected.
streams/walltime/1000-streams/each-1000-bytes: Change within noise threshold.
       time:   [44.056 ms 44.133 ms 44.237 ms]
       change: [-0.5031% -0.2755% -0.0037] (p = 0.02 < 0.05)
       Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-false/varying-seeds: 💚 Performance has improved by -71.505%.
       time:   [23.089 ms 23.107 ms 23.127 ms]
       change: [-71.533% -71.505% -71.475] (p = 0.00 < 0.05)
       Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-true/varying-seeds: 💚 Performance has improved by -71.676%.
       time:   [23.422 ms 23.453 ms 23.503 ms]
       change: [-71.723% -71.676% -71.610] (p = 0.00 < 0.05)
       Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
5 (5.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-false/same-seed: 💚 Performance has improved by -71.459%.
       time:   [23.213 ms 23.241 ms 23.281 ms]
       change: [-71.504% -71.459% -71.405] (p = 0.00 < 0.05)
       Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-true/same-seed: 💚 Performance has improved by -71.283%.
       time:   [23.833 ms 23.852 ms 23.872 ms]
       change: [-71.347% -71.283% -71.236] (p = 0.00 < 0.05)
       Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
5 (5.00%) high mild

Download data for profiler.firefox.com or download performance comparison data.

@github-actions
Contributor

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to main at a1331f5.

neqo-pr as client

neqo-pr vs. aioquic: ⚠️C1
neqo-pr vs. go-x-net: BP BA
neqo-pr vs. haproxy: ⚠️C1 BP BA
neqo-pr vs. kwik: S L1 C1 BP BA
neqo-pr vs. lsquic: baseline result missing
neqo-pr vs. msquic: A L1 C1 ⚠️C2
neqo-pr vs. mvfst: A BP BA
neqo-pr vs. neqo: Z A
neqo-pr vs. nginx: 🚀L1 BP BA
neqo-pr vs. ngtcp2: CM
neqo-pr vs. picoquic: A
neqo-pr vs. quic-go: A
neqo-pr vs. quic-zig: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
neqo-pr vs. quiche: BP BA
neqo-pr vs. s2n-quic: ⚠️BA CM
neqo-pr vs. tquic: S BP BA
neqo-pr vs. xquic: A ⚠️C1

neqo-pr as server

aioquic vs. neqo-pr: CM
go-x-net vs. neqo-pr: CM
kwik vs. neqo-pr: BP BA CM
msquic vs. neqo-pr: CM
mvfst vs. neqo-pr: Z A L1 C1 CM
neqo vs. neqo-pr: A
openssl vs. neqo-pr: LR M A CM
quic-go vs. neqo-pr: CM
quic-zig vs. neqo-pr: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
quiche vs. neqo-pr: run cancelled after 20 min
quinn vs. neqo-pr: ⚠️C1 V2 CM
s2n-quic vs. neqo-pr: BA CM
tquic vs. neqo-pr: CM
xquic vs. neqo-pr: M CM
All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-pr as client

neqo-pr as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-pr as client

neqo-pr as server

@github-actions
Contributor

Client/server transfer results

Performance differences relative to a1331f5.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ baseline (ms) | Δ baseline (%) |
| --- | --- | --- | --- | --- | --- | --- |
| neqo-neqo-newreno-nopacing | 91.0 ± 2.9 | 84.7 | 99.6 | 351.6 ± 11.0 | 💔 0.8 | 0.9% |
| neqo-quiche-cubic | 189.7 ± 2.9 | 184.7 | 205.6 | 168.6 ± 11.0 | 💔 1.4 | 0.7% |

Table above only shows statistically significant changes. See all results below.
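
The MiB/s column can be related to the mean transfer time with a short calculation. The sketch below assumes throughput is simply the transfer size divided by the mean time; the bot may instead average per-run throughput, so the last digit can differ. The function name `throughput_mib_per_s` is hypothetical.

```python
MIB = 1024 * 1024  # bytes per MiB


def throughput_mib_per_s(size_bytes: int, mean_ms: float) -> float:
    """Throughput in MiB/s for a transfer of size_bytes taking mean_ms milliseconds."""
    return (size_bytes / MIB) / (mean_ms / 1000.0)


# 33554432 bytes is exactly 32 MiB; 91.0 ms is the mean for
# neqo-neqo-newreno-nopacing in the table above.
size_bytes = 33_554_432
result = throughput_mib_per_s(size_bytes, 91.0)

print(f"{size_bytes // MIB} MiB in 91.0 ms is about {result:.1f} MiB/s")
```

For the neqo-neqo-newreno-nopacing row this gives about 351.6 MiB/s, in line with the reported value.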

All results

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ baseline (ms) | Δ baseline (%) |
| --- | --- | --- | --- | --- | --- | --- |
| google-google-nopacing | 455.1 ± 1.7 | 450.0 | 458.7 | 70.3 ± 18.8 | | |
| google-neqo-cubic | 262.6 ± 2.8 | 257.7 | 273.7 | 121.8 ± 11.4 | 0.2 | 0.1% |
| msquic-msquic-nopacing | 130.7 ± 35.9 | 112.9 | 370.9 | 244.8 ± 0.9 | | |
| msquic-neqo-cubic | 156.6 ± 33.1 | 118.8 | 328.8 | 204.4 ± 1.0 | 1.3 | 0.9% |
| neqo-google-cubic | 759.0 ± 2.6 | 752.0 | 766.6 | 42.2 ± 12.3 | 0.4 | 0.1% |
| neqo-msquic-cubic | 149.7 ± 1.7 | 147.9 | 157.2 | 213.8 ± 18.8 | -0.1 | -0.0% |
| neqo-neqo-cubic | 90.5 ± 2.7 | 85.8 | 99.4 | 353.4 ± 11.9 | 0.3 | 0.3% |
| neqo-neqo-cubic-nopacing | 89.0 ± 2.6 | 84.6 | 98.0 | 359.4 ± 12.3 | -0.4 | -0.5% |
| neqo-neqo-newreno | 90.7 ± 2.3 | 86.7 | 97.4 | 352.9 ± 13.9 | 0.4 | 0.4% |
| neqo-neqo-newreno-nopacing | 91.0 ± 2.9 | 84.7 | 99.6 | 351.6 ± 11.0 | 💔 0.8 | 0.9% |
| neqo-quiche-cubic | 189.7 ± 2.9 | 184.7 | 205.6 | 168.6 ± 11.0 | 💔 1.4 | 0.7% |
| neqo-s2n-cubic | 214.5 ± 2.5 | 207.8 | 224.1 | 149.2 ± 12.8 | 0.2 | 0.1% |
| quiche-neqo-cubic | 173.4 ± 4.6 | 165.9 | 199.7 | 184.6 ± 7.0 | 0.7 | 0.4% |
| quiche-quiche-nopacing | 140.3 ± 3.7 | 134.4 | 158.6 | 228.2 ± 8.6 | | |
| s2n-neqo-cubic | 214.3 ± 3.4 | 208.0 | 222.1 | 149.3 ± 9.4 | 0.5 | 0.2% |
| s2n-s2n-nopacing | 292.5 ± 29.2 | 276.9 | 459.7 | 109.4 ± 1.1 | | |

Download data for profiler.firefox.com or download performance comparison data.

@larseggert larseggert merged commit c96fc52 into mozilla:main Apr 18, 2026
191 checks passed
@larseggert larseggert deleted the chore-ci-fixes branch April 18, 2026 14:12