tests(benchmarks): expire/persist coverage + group metadata by fcostaoliveira · Pull Request #9380 · RediSearch/RediSearch

fcostaoliveira · 2026-04-30T09:42:30Z

Summary

Extend the expire/persist benchmark group so PR #9356 (doc-expiration fast path) — and any future change to the expire_cmd / persist_cmd notification branches — can be evaluated for both improvements and regressions on the existing on-the-fly EC2 infrastructure.

Adds 4 new specs (all `oss-standalone` only)

Spec	Why it exists
`search-persist-doc-1000-seconds.yml`	`PERSIST` was previously unbenched; this exercises the PERSIST branch of `OnKeySpaceNotification` and the matching fast path.
`search-expire-doc-multi-index-10-milliseconds.yml`	Three FT indexes on the same prefix — amplifies the per-spec fan-out delta between full reindex and the metadata-only path.
`search-expire-doc-50-50-10-milliseconds.yml`	50/50 write/read ratio so PEXPIRE-driven work dominates the workload — clean signal well above the variance floor.
`search-expire-doc-json-10-milliseconds.yml`	JSON path coverage for `Document_LoadSchemaFieldJson` and the shared `GetKeyExpirationTime` helper.

All four:

- use a deterministic catch-all `FT.SEARCH <index> * NOCONTENT LIMIT 0 1` so query latency is tight and re-runs on the same branch produce stable throughput,
- run with `DEBUG SET-ACTIVE-EXPIRE 0` so the active-expire cycle never removes keys during the test (the fast path is exercised purely as a metadata update),
- use `-c 16 -t 4` (or `-c 32 -t 4` for the 10K JSON dataset) to sustain ≥1000 QPS.
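As a quick check of the load these flags generate: in memtier_benchmark, `-c` is the number of clients per thread and `-t` is the number of threads, so the total connection count is their product. The helper name below is illustrative only.

```python
def total_connections(clients_per_thread: int, threads: int) -> int:
    """memtier_benchmark opens `-c` connections on each of its `-t` threads."""
    return clients_per_thread * threads


print(total_connections(16, 4))  # -c 16 -t 4 -> 64
print(total_connections(32, 4))  # -c 32 -t 4 (10K JSON dataset) -> 128
```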

Adds metadata grouping (8 files)

Adds metadata.group: "expire-persist" and a per-spec use_case string to all eight expire/persist specs (the four new ones plus the four existing search-expire-{doc,numeric-field}-{10-milliseconds,1000-seconds}.yml). No behavioral changes to the existing specs — only metadata. The numeric-field specs serve as negative controls for the doc-expiration fast path (they exercise hexpire_cmd, which is not on the changed code path).
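A minimal sketch of what "selecting the group as a unit" could look like once the spec files are parsed. It assumes each spec is available as a dict with a top-level `metadata` mapping; the `use_case` values and the `select_group` helper are placeholders for illustration, not the real strings or a redisbench-admin API.

```python
from typing import Dict, List

EXPIRE_PERSIST_SPECS = [
    "search-expire-doc-10-milliseconds",
    "search-expire-doc-1000-seconds",
    "search-expire-numeric-field-10-milliseconds",
    "search-expire-numeric-field-1000-seconds",
    "search-persist-doc-1000-seconds",
    "search-expire-doc-multi-index-10-milliseconds",
    "search-expire-doc-50-50-10-milliseconds",
    "search-expire-doc-json-10-milliseconds",
]


def load_specs() -> Dict[str, dict]:
    """Stand-in for parsed YAML specs; use_case text is a placeholder."""
    specs = {
        name: {"metadata": {"group": "expire-persist", "use_case": "placeholder"}}
        for name in EXPIRE_PERSIST_SPECS
    }
    specs["search-geo"] = {}  # a spec without group metadata
    return specs


def select_group(specs: Dict[str, dict], group: str) -> List[str]:
    """Return the names of all specs whose metadata.group matches."""
    return sorted(
        name
        for name, spec in specs.items()
        if spec.get("metadata", {}).get("group") == group
    )


print(len(select_group(load_specs(), "expire-persist")))  # 8
```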

Why a separate PR from #9356

These specs are evaluation infrastructure — they need to land independently of the optimization being measured so baseline numbers can be produced from master before #9356 merges.

Test plan

- `redisbench-admin run-remote` against this branch on oss-standalone for each new spec: confirm ≥1000 QPS and <5% CV across 3 datapoints.
- `redisbench-admin compare` master-baseline vs PR #9356 head on the four doc-level specs (existing + new): should show improvement on the doc-expiration fast path.
- Same comparison on the two numeric-field specs: should stay flat (negative-control sanity check).
- `--enable-profilers` with `PROFILE=1` on search-expire-doc-multi-index-10-milliseconds to verify the per-spec write-lock pattern in `Indexes_UpdateMatchingDocExpiration` is not contended.
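The acceptance criterion in the first item (≥1000 QPS, <5% CV across 3 datapoints) can be sketched as below. This is only an illustration: it assumes CV means the sample standard deviation divided by the mean, and that the QPS floor applies to every datapoint; the test plan states neither. The datapoints are made up.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(samples: Sequence[float]) -> float:
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(samples) / statistics.mean(samples) * 100.0


def passes_acceptance(
    qps_samples: Sequence[float],
    min_qps: float = 1000.0,
    max_cv_pct: float = 5.0,
) -> bool:
    """True if every datapoint reaches min_qps and the CV is below max_cv_pct."""
    return (
        min(qps_samples) >= min_qps
        and coefficient_of_variation(qps_samples) < max_cv_pct
    )


# Hypothetical throughput readings from three runs of one spec.
runs = [1180.0, 1210.0, 1195.0]
print(f"CV = {coefficient_of_variation(runs):.2f}%")
print(passes_acceptance(runs))
```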

Note

Low Risk
Low risk: changes are confined to benchmark YAML specs (metadata and new workloads) with no production code impact.

Overview
Adds metadata.group: "expire-persist" and a descriptive use_case field to the existing expire benchmarks so they can be consistently grouped and understood in reporting.

Introduces four new oss-standalone benchmark specs to broaden expire/persist coverage: a PERSIST notification workload, a high write-ratio PEXPIRE workload, a multi-index fan-out PEXPIRE workload, and a JSON document PEXPIRE workload (all using deterministic FT.SEARCH patterns and disabled active expiration for stable signal).

^{Reviewed by Cursor Bugbot for commit 8d14d1a.}

Add four new oss-standalone-only benchmark specs targeting the doc-expiration fast path introduced in PR #9356:

- search-persist-doc-1000-seconds: covers the PERSIST keyspace-notification branch (previously unbenched).
- search-expire-doc-multi-index-10-milliseconds: three FT indexes on the same prefix to amplify the per-spec fan-out delta between full reindex and the new metadata-only path.
- search-expire-doc-50-50-10-milliseconds: high (50/50) write-ratio variant of the existing 5/95 PEXPIRE bench for clean signal above the m5/m7i variance floor.
- search-expire-doc-json-10-milliseconds: JSON variant covering the Document_LoadSchemaFieldJson side of the GetKeyExpirationTime helper.

All four use a deterministic catch-all FT.SEARCH query, disable active expiration, and bump per-thread connection count so they sustain >1000 QPS with low re-trigger variance.

Also tag the existing four expire specs (and the four new ones) with a shared metadata.group: "expire-persist" plus a per-spec use_case string so this evaluation group can be selected as a unit.

jit-ci · 2026-04-30T09:43:39Z

🛡️ Jit Security Scan Results

✅ No security findings were detected in this PR

^{Security scan by Jit}

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Reviewed by Cursor Bugbot for commit e64afd9.}

cursor · 2026-04-30T09:50:01Z

+clientconfig:
+  benchmark_type: "mixed"
+  tool: memtier_benchmark
+  arguments: "--test-time 180 -c 32 -t 4 --hide-histogram --key-prefix 'doc:single' --key-minimum 1 --key-maximum 10000 --command 'FT.SEARCH idx:single * NOCONTENT LIMIT 0 1' --command-ratio 95 --command 'PEXPIRE __key__ 10' --command-ratio 5"


JSON benchmark key-prefix missing trailing colon separator

High Severity

The --key-prefix 'doc:single' in memtier arguments generates keys like doc:single1, doc:single2, etc. However, the dataset loaded from the CSV almost certainly uses doc:single:N format (with a colon separator before the number), as evidenced by the test code in test_json_multi_numeric.py which consistently creates keys as doc:single:{N}. All other benchmarks in this PR use a colon-terminated prefix ('idx10:'), matching the idx10:N key format. The missing trailing colon means every PEXPIRE targets a non-existent key (returning 0), so no keyspace notification fires and the doc-expiration fast path is never exercised — completely defeating the benchmark's purpose.
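The mismatch described above can be illustrated with a short sketch. It relies on memtier_benchmark building each key as the prefix immediately followed by the key number, and it assumes (as the review comment does) that the dataset keys follow the `doc:single:N` layout; the helper names are illustrative.

```python
def memtier_key(prefix: str, n: int) -> str:
    """__key__ expands to the key prefix followed directly by the key number."""
    return f"{prefix}{n}"


# Assumed dataset layout (doc:single:N), per the review comment.
dataset_keys = {f"doc:single:{n}" for n in range(1, 10001)}


def hit_count(prefix: str, key_min: int = 1, key_max: int = 10000) -> int:
    """Number of generated keys that exist in the assumed dataset."""
    return sum(
        memtier_key(prefix, n) in dataset_keys
        for n in range(key_min, key_max + 1)
    )


print(hit_count("doc:single"))   # 0: every PEXPIRE would target a missing key
print(hit_count("doc:single:"))  # 10000: every PEXPIRE hits an existing document
```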

codecov · 2026-04-30T10:11:03Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.67%. Comparing base (fae9b71) to head (8d14d1a).
⚠️ Report is 46 commits behind head on master.

⚠️ Current head 8d14d1a differs from pull request most recent head 484bced

Please upload reports for the commit 484bced to get more accurate results.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #9380      +/-   ##
==========================================
+ Coverage   81.30%   81.67%   +0.36%     
==========================================
  Files         492      501       +9     
  Lines       66927    68114    +1187     
  Branches    23562    24625    +1063     
==========================================
+ Hits        54414    55630    +1216     
+ Misses      12274    12246      -28     
+ Partials      239      238       -1
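As a sanity check on the figures in the diff above, the sketch below recomputes the coverage percentages from the hit and line counts and confirms that hits, misses, and partials sum to the line totals. The dictionary layout is only for illustration.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Line coverage as a percentage."""
    return hits / lines * 100.0


base = {"hits": 54414, "misses": 12274, "partials": 239, "lines": 66927}
head = {"hits": 55630, "misses": 12246, "partials": 238, "lines": 68114}

for report in (base, head):
    # Every line is either hit, missed, or partially covered.
    assert report["hits"] + report["misses"] + report["partials"] == report["lines"]

base_pct = coverage_pct(base["hits"], base["lines"])
head_pct = coverage_pct(head["hits"], head["lines"])
print(f"{base_pct:.2f}% -> {head_pct:.2f}%")  # 81.30% -> 81.67%
```

The unrounded difference is about 0.368 percentage points, which the report shows as +0.36%.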

Flag	Coverage Δ
flow	`83.73% <ø> (+0.10%)`	⬆️
unit	`50.90% <ø> (+0.78%)`	⬆️

fcostaoliveira · 2026-04-30T10:17:36Z

Automated performance analysis summary

This comment was automatically generated given there is performance data available.

Environment:

Triggering env: circleci

Architecture: `x86_64` — branch-over-branch

Deployment: oss-standalone

In summary:

Detected a total of 23 stable tests between versions.
Detected a total of 8 highly unstable benchmarks (8 baseline).
Latency analysis confirmed regressions in 3 of the unstable tests:
- hybrid-arxiv-titles-384-angular-linear-numeric-vector: FT.HYBRID +21.1% ⚠️
- hybrid-arxiv-titles-384-angular-rrf-text-vector: FT.HYBRID +19.0% ⚠️
- search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-non-sortable: FT.SEARCH +28.5% 🔴
Detected a total of 14 improvements above the improvement water line.
Detected a total of 8 regressions below the regression water line 8.0%.

You can check a comparison in detail via the grafana link

Performance Improvements - Comparison between master and bench/expire-persist-coverage.

Time Period from a month ago. (environment used: oss-standalone)

Test Case	Baseline master (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
search-aggregate-post-filter-simple.yml	12666 +- 1.0% (20 datapoints)	17820	40.7%	IMPROVEMENT
ftsb-10K-enwiki_abstract-hashes-fulltext-sortby	1130 +- 3.1% (20 datapoints)	1514	34.0%	IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query	8167 +- 4.3% (20 datapoints)	10272	25.8%	IMPROVEMENT
search-numeric-sortby-optimize	424 +- 6.6% (20 datapoints)	534	25.8%	IMPROVEMENT
ftsb-1M-nyc_taxis-hashes-load	27683 +- 3.1% (20 datapoints)	34723	25.4%	IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query-non-sortable	1209 +- 5.8% (20 datapoints)	1479	22.3%	IMPROVEMENT
ftsb-10K-enwiki_abstract-hashes-term-prefix	9461 +- 2.2% (20 datapoints)	11221	18.6%	IMPROVEMENT
search-ftsb-1M-enwiki_abstract-hashes-gc	118 +- 9.9% (20 datapoints)	140	18.5%	waterline=9.9%. IMPROVEMENT
search-high-cardinality-negation-term-baseline	305 +- 1.1% (20 datapoints)	353	15.5%	IMPROVEMENT
search-ftsb-370K-docs-union-iterators-q4	34 +- 1.2% (20 datapoints)	39	14.8%	IMPROVEMENT
search-numeric	16244 +- 8.2% (20 datapoints)	18013	10.9%	waterline=8.2%. IMPROVEMENT
vecsim-arxiv-titles-384-angular-filters-m16-ef-128-numeric-filter	2625 +- 5.6% (20 datapoints)	2875	9.5%	IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query	4350 +- 5.8% (20 datapoints)	4727	8.7%	IMPROVEMENT
search-expire-numeric-field-10-milliseconds	14 +- 1.8% (20 datapoints)	15	8.3%	IMPROVEMENT
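The labels in the Note column appear to follow a simple thresholding rule. The sketch below is inferred from the rows in this report, not taken from the redisbench-admin source: it assumes the waterline is the larger of 8.0% and the baseline standard deviation, that baselines above 10% standard deviation are flagged UNSTABLE, and that changes under roughly 3% count as "No Change". All three thresholds and the function name are assumptions.

```python
def classify(
    change_pct: float,
    baseline_stddev_pct: float,
    base_waterline_pct: float = 8.0,   # stated in the summary
    unstable_stddev_pct: float = 10.0,  # assumed from the UNSTABLE rows
    noise_floor_pct: float = 3.0,       # assumed from the "No Change" rows
) -> str:
    """Approximate the Note label for a higher-is-better metric."""
    if baseline_stddev_pct > unstable_stddev_pct:
        return "UNSTABLE"
    # A noisy baseline raises the bar a change has to clear.
    waterline = max(base_waterline_pct, baseline_stddev_pct)
    direction = "IMPROVEMENT" if change_pct > 0 else "REGRESSION"
    if abs(change_pct) > waterline:
        return direction
    if abs(change_pct) > noise_floor_pct:
        return f"potential {direction}"
    return "No Change"


print(classify(40.7, 1.0))   # search-aggregate-post-filter-simple.yml
print(classify(18.5, 9.9))   # search-ftsb-1M-enwiki_abstract-hashes-gc
print(classify(-9.1, 9.3))   # search-numeric-optimize
print(classify(-12.4, 10.2)) # hybrid-arxiv-titles-384-angular-linear-numeric-vector
```

UNSTABLE rows additionally go through the server-side and client-side latency checks described in their notes, which this sketch does not model.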

Performance Regressions and Issues - Comparison between master and bench/expire-persist-coverage.

Time Period from a month ago. (environment used: oss-standalone)

Test Case	Baseline master (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
search-ftsb-arxiv-titles-384-angular-filters-m16-ef-128-json-load	5066 +- 5.0% (20 datapoints)	3679.0	-27.4%	REGRESSION
ftsb-10K-multivalue-numeric-json	7090 +- 6.7% (20 datapoints)	5227.0	-26.3%	REGRESSION
search-ftsb-10K-enwiki_abstract-hashes-term-withsuffix-trie	13769 +- 5.2% (20 datapoints)	10246.0	-25.6%	REGRESSION
search-numeric-sortby	16677 +- 9.1% (20 datapoints)	13293.0	-20.3%	waterline=9.1%. REGRESSION
ftsb-1M-nyc_taxis-ftadd-load	33137 +- 5.4% (20 datapoints)	27066.0	-18.3%	REGRESSION
hybrid-arxiv-titles-384-angular-linear-numeric-vector	1577 +- 10.2% UNSTABLE (20 datapoints)	1381.0	-12.4%	UNSTABLE (baseline high variance); server: FT.HYBRID p50 increased 21.1% (baseline CV=15.8%); client: Latency increased 14.2% (baseline CV=10.2%); confidence=LOW (FT.HYBRID baseline CV=12.4%; FT.HYBRID p99 +4.2% (stable baseline, minor change); CV=coefficient of variation (data stability: <30% stable, 30-50% moderate, >50% unstable))
search-ftsb-10K-enwiki_abstract-hashes-fulltext-aggregate-sortby-limit-0-100	5705 +- 4.2% (20 datapoints)	5002.0	-12.3%	REGRESSION
ftsb-10K-enwiki_pages-hashes-fulltext-mixed_simple-1word-query_write_1_to_read_20.yml	1681 +- 4.5% (20 datapoints)	1487.0	-11.5%	REGRESSION
hybrid-arxiv-titles-384-angular-rrf-text-vector	1590 +- 10.7% UNSTABLE (20 datapoints)	1418.0	-10.8%	UNSTABLE (baseline high variance); server: FT.HYBRID p50 increased 19.0% (baseline CV=15.8%); client: Latency increased 14.7% (baseline CV=10.8%); confidence=LOW (FT.HYBRID baseline CV=13.3%; FT.HYBRID p99 +5.3% (stable baseline, minor change); CV=coefficient of variation (data stability: <30% stable, 30-50% moderate, >50% unstable))
ftsb-1K-enwiki_abstract-hashes-term-contains	11258 +- 4.8% (20 datapoints)	10150.0	-9.8%	REGRESSION
search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-non-sortable	180 +- 11.6% UNSTABLE (20 datapoints)	174.0	-3.6%	UNSTABLE (baseline high variance); server: FT.SEARCH p50 increased 28.5% (baseline CV=13.2%); client: OverallQuantiles.allCommands.q50 increased 26.6% (baseline CV=12.8%)
hybrid-arxiv-titles-384-angular-rrf-tag-range	4.2 +- 12.5% UNSTABLE (20 datapoints)	4.1	-3.3%	UNSTABLE (baseline high variance); server: p50 latency stable; client: client latency stable; neither server nor client side confirms regression
search-high-cardinality-negation-term-comparison_union_all_other_terms	211 +- 14.7% UNSTABLE (20 datapoints)	207.0	-1.9%	UNSTABLE (baseline high variance); server: FT.SEARCH p50 decreased 28.0% (baseline CV=1.1%); client: client latency stable; neither server nor client side confirms regression
search-numeric-sortby-desc-optimize	524 +- 10.4% UNSTABLE (20 datapoints)	520.0	-0.7%	UNSTABLE (baseline high variance); server: FT.SEARCH p50 decreased 6.3% (baseline CV=7.0%); client: Latency decreased 5.4% (baseline CV=6.9%); neither server nor client side confirms regression
search-expire-doc-1000-seconds	17 +- 10.4% UNSTABLE (20 datapoints)	19.0	14.3%	UNSTABLE (baseline high variance); server: p50 latency stable; client: Latency decreased 20.2% (baseline CV=1.5%); neither server nor client side confirms regression
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query-non-sortable	34 +- 24.7% UNSTABLE (20 datapoints)	52.0	52.0%	UNSTABLE (baseline high variance); server: p50 latency stable; client: OverallQuantiles.allCommands.q50 decreased 6.1% (baseline CV=2.2%); neither server nor client side confirms regression

Tests with No Significant Changes (23 tests)

Test Case	Baseline master (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
ftsb-10K-enwiki_abstract-hashes-term-suffix	13949 +- 5.6% (20 datapoints)	13798.0	-1.1%	No Change
ftsb-10K-enwiki_abstract-hashes-term-suffix-withsuffixtrie	16311 +- 5.4% (20 datapoints)	15188.0	-6.9%	potential REGRESSION
ftsb-10K-enwiki_abstract-hashes-term-wildcard	14028 +- 5.7% (20 datapoints)	13060.0	-6.9%	potential REGRESSION
ftsb-10K-enwiki_pages-hashes-load	66491 +- 8.5% (20 datapoints)	61457.0	-7.6%	waterline=8.5%. potential REGRESSION
ftsb-10K-singlevalue-numeric-json	3803 +- 4.6% (20 datapoints)	3877.0	1.9%	No Change
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query	9408 +- 1.7% (20 datapoints)	9748.0	3.6%	potential IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-load	23668 +- 7.1% (20 datapoints)	23895.0	1.0%	No Change
hybrid-arxiv-titles-384-angular-linear-text-range	3.9 +- 9.1% (20 datapoints)	3.9	-1.3%	waterline=9.1%. No Change
search-expire-doc-10-milliseconds	17 +- 8.8% (20 datapoints)	16.0	-8.7%	waterline=8.8%. potential REGRESSION
search-expire-numeric-field-1000-seconds	14 +- 2.4% (20 datapoints)	14.0	3.5%	potential IMPROVEMENT
search-filtering-tag-numeric	4097 +- 9.4% (20 datapoints)	3785.0	-7.6%	waterline=9.4%. potential REGRESSION
search-filtering-tag-numeric-filter-pipeline	11269 +- 4.9% (20 datapoints)	11222.0	-0.4%	No Change
search-ftsb-10K-enwiki_abstract-hashes-fulltext-search-sortby-limit-0-100	5608 +- 4.2% (20 datapoints)	5564.0	-0.8%	No Change
search-ftsb-10K-enwiki_abstract-hashes-term-withoutsuffix-trie	13718 +- 5.6% (20 datapoints)	13152.0	-4.1%	potential REGRESSION
search-ftsb-1700K-docs-union-iterators-q3	32 +- 0.5% (20 datapoints)	35.0	7.5%	potential IMPROVEMENT
search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-one-indexed-field	13202 +- 7.6% (20 datapoints)	12499.0	-5.3%	potential REGRESSION
search-ftsb-5200K-docs-union-iterators-q1	4.2 +- 6.7% (20 datapoints)	3.9	-7.4%	potential REGRESSION
search-ftsb-5500K-docs-union-iterators-q2	5.6 +- 6.0% (20 datapoints)	5.8	2.3%	No Change
search-geo	3237 +- 4.1% (20 datapoints)	3225.0	-0.4%	No Change
search-numeric-optimize	8888 +- 9.3% (20 datapoints)	8075.0	-9.1%	waterline=9.3%. potential REGRESSION
search-numeric-sortby-desc	12400 +- 4.3% (20 datapoints)	13052.0	5.3%	potential IMPROVEMENT
vecsim-arxiv-titles-384-angular-filters-m16-ef-128-fulltext-filter	7391 +- 2.6% (20 datapoints)	7562.0	2.3%	No Change
vecsim-arxiv-titles-384-angular-filters-m16-ef-128-tag-filter	15211 +- 6.2% (20 datapoints)	14466.0	-4.9%	potential REGRESSION

Architecture: `aarch64` — branch-over-branch

Deployment: oss-standalone

In summary:

Detected a total of 40 stable tests between versions.
Detected a total of 2 highly unstable benchmarks (2 baseline).
Detected a total of 3 improvements above the improvement water line.
Detected a total of 1 regressions below the regression water line 8.0%.

You can check a comparison in detail via the grafana link

Performance Improvements - Comparison between master and bench/expire-persist-coverage.

Time Period from a month ago. (environment used: oss-standalone)

Test Case	Baseline master (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
search-ftsb-1M-enwiki_abstract-hashes-gc	118 +- 9.9% (20 datapoints)	157	32.8%	waterline=9.9%. IMPROVEMENT
search-numeric-sortby-optimize	424 +- 6.6% (20 datapoints)	488	14.9%	IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query-non-sortable	1209 +- 5.8% (20 datapoints)	1346	11.3%	IMPROVEMENT

Performance Regressions and Issues - Comparison between master and bench/expire-persist-coverage.

Time Period from a month ago. (environment used: oss-standalone)

Test Case	Baseline master (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
search-numeric-sortby-desc-optimize	492 +- 7.0% (20 datapoints)	443	-9.9%	REGRESSION
search-filtering-tag-numeric	3600 +- 10.6% UNSTABLE (20 datapoints)	4161	15.6%	UNSTABLE (baseline high variance); server: FT.AGGREGATE p50 decreased 13.5% (baseline CV=13.5%); client: Latency decreased 13.5% (baseline CV=9.8%); neither server nor client side confirms regression
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query-non-sortable	34 +- 24.7% UNSTABLE (20 datapoints)	46	34.5%	UNSTABLE (baseline high variance); server: p50 latency stable; client: client latency stable; neither server nor client side confirms regression

Tests with No Significant Changes (40 tests)

Test Case	Baseline master (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
ftsb-10K-enwiki_abstract-hashes-fulltext-sortby	1130 +- 3.1% (20 datapoints)	1165.0	3.1%	potential IMPROVEMENT
ftsb-10K-enwiki_abstract-hashes-term-prefix	9461 +- 2.2% (20 datapoints)	9893.0	4.6%	potential IMPROVEMENT
ftsb-10K-enwiki_abstract-hashes-term-suffix	10715 +- 1.2% (20 datapoints)	10929.0	2.0%	No Change
ftsb-10K-enwiki_abstract-hashes-term-suffix-withsuffixtrie	11541 +- 0.9% (20 datapoints)	11549.0	0.1%	No Change
ftsb-10K-enwiki_abstract-hashes-term-wildcard	10849 +- 1.2% (20 datapoints)	11098.0	2.3%	No Change
ftsb-10K-enwiki_pages-hashes-fulltext-mixed_simple-1word-query_write_1_to_read_20.yml	1681 +- 4.5% (20 datapoints)	1677.0	-0.2%	No Change
ftsb-10K-enwiki_pages-hashes-load	60396 +- 6.5% (20 datapoints)	61457.0	1.8%	No Change
ftsb-10K-multivalue-numeric-json	5110 +- 1.7% (20 datapoints)	5227.0	2.3%	No Change
ftsb-10K-singlevalue-numeric-json	3015 +- 0.9% (20 datapoints)	3014.0	-0.1%	No Change
ftsb-1K-enwiki_abstract-hashes-term-contains	9144 +- 1.7% (20 datapoints)	9328.0	2.0%	No Change
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query	4350 +- 5.8% (20 datapoints)	4490.0	3.2%	potential IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query	9408 +- 1.7% (20 datapoints)	9748.0	3.6%	potential IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query	8167 +- 4.3% (20 datapoints)	8332.0	2.0%	No Change
ftsb-1M-enwiki_abstract-hashes-load	23668 +- 7.1% (20 datapoints)	23895.0	1.0%	No Change
ftsb-1M-nyc_taxis-ftadd-load	25940 +- 3.0% (20 datapoints)	27066.0	4.3%	potential IMPROVEMENT
ftsb-1M-nyc_taxis-hashes-load	27683 +- 3.1% (20 datapoints)	28629.0	3.4%	potential IMPROVEMENT
search-aggregate-post-filter-simple.yml	12666 +- 1.0% (20 datapoints)	12723.0	0.5%	No Change
search-expire-doc-10-milliseconds	14 +- 2.2% (20 datapoints)	14.0	-1.0%	No Change
search-expire-doc-1000-seconds	14 +- 1.5% (20 datapoints)	14.0	-3.2%	potential REGRESSION
search-expire-numeric-field-10-milliseconds	14 +- 1.8% (20 datapoints)	14.0	-0.3%	No Change
search-expire-numeric-field-1000-seconds	14 +- 2.4% (20 datapoints)	14.0	3.5%	potential IMPROVEMENT
search-filtering-tag-numeric-filter-pipeline	9302 +- 0.9% (20 datapoints)	9397.0	1.0%	No Change
search-ftsb-10K-enwiki_abstract-hashes-fulltext-aggregate-sortby-limit-0-100	4968 +- 2.9% (20 datapoints)	5002.0	0.7%	No Change
search-ftsb-10K-enwiki_abstract-hashes-fulltext-search-sortby-limit-0-100	4897 +- 2.3% (20 datapoints)	5026.0	2.6%	No Change
search-ftsb-10K-enwiki_abstract-hashes-term-withoutsuffix-trie	10396 +- 1.2% (20 datapoints)	10375.0	-0.2%	No Change
search-ftsb-10K-enwiki_abstract-hashes-term-withsuffix-trie	10351 +- 1.2% (20 datapoints)	10246.0	-1.0%	No Change
search-ftsb-1700K-docs-union-iterators-q3	32 +- 0.5% (20 datapoints)	32.0	0.4%	No Change
search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-non-sortable	167 +- 7.1% (20 datapoints)	174.0	4.1%	potential IMPROVEMENT
search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-one-indexed-field	10339 +- 1.6% (20 datapoints)	10632.0	2.8%	No Change
search-ftsb-370K-docs-union-iterators-q4	34 +- 1.2% (20 datapoints)	34.0	-0.5%	No Change
search-ftsb-5200K-docs-union-iterators-q1	3.5 +- 1.8% (20 datapoints)	3.5	1.3%	No Change
search-ftsb-5500K-docs-union-iterators-q2	5.3 +- 3.2% (20 datapoints)	5.1	-4.6%	potential REGRESSION
search-ftsb-arxiv-titles-384-angular-filters-m16-ef-128-json-load	3552 +- 3.3% (20 datapoints)	3679.0	3.6%	potential IMPROVEMENT
search-geo	2649 +- 5.9% (20 datapoints)	2661.0	0.4%	No Change
search-high-cardinality-negation-term-baseline	305 +- 1.1% (20 datapoints)	299.0	-2.0%	No Change
search-high-cardinality-negation-term-comparison_union_all_other_terms	158 +- 1.2% (20 datapoints)	155.0	-1.8%	No Change
search-numeric	12380 +- 4.0% (20 datapoints)	12370.0	-0.1%	No Change
search-numeric-optimize	7418 +- 1.2% (20 datapoints)	7452.0	0.5%	No Change
search-numeric-sortby	13153 +- 4.0% (20 datapoints)	13293.0	1.1%	No Change
search-numeric-sortby-desc	12400 +- 4.3% (20 datapoints)	13052.0	5.3%	potential IMPROVEMENT

Cross-arch delta on `bench/expire-persist-coverage` (`x86_64` → `aarch64`)

Same commit (bench/expire-persist-coverage) compared across architectures. Positive deltas = aarch64 outperforms x86_64.
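A minimal sketch of the delta convention used in this section, with x86_64 as the baseline and aarch64 as the comparison for a higher-is-better throughput metric. The function name is illustrative, and the example numbers are the search-numeric medians from the regression table in this section. Displayed medians are rounded, so recomputing other rows this way can differ slightly from the reported percentage.

```python
def pct_change(baseline: float, comparison: float) -> float:
    """Relative change of comparison vs baseline, in percent."""
    return (comparison - baseline) / baseline * 100.0


# search-numeric: x86_64 median 17861, aarch64 median 12370.
delta = pct_change(17861, 12370)
print(f"{delta:.1f}%")  # -30.7%: negative, so x86_64 is ahead on this test
```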

In summary:

Detected a total of 17 stable tests between versions.
Detected a total of 6 highly unstable benchmarks (6 baseline).
Latency analysis confirmed regressions in 1 of the unstable tests:
- ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query: FT.SEARCH +19.3% 🔴
Detected a total of 1 improvements above the improvement water line.
Detected a total of 26 regressions below the regression water line 8.0%.

You can check a comparison in detail via the grafana link

Performance Improvements - Comparison between bench/expire-persist-coverage (x86_64, baseline) and bench/expire-persist-coverage (aarch64, comparison).

Time Period from a month ago. (environment used: oss-standalone)

Test Case	Baseline bench/expire-persist-coverage (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
search-ftsb-1M-enwiki_abstract-hashes-gc	140 +- 4.1% (4 datapoints)	157	11.8%	IMPROVEMENT

Performance Regressions and Issues - Comparison between bench/expire-persist-coverage (x86_64, baseline) and bench/expire-persist-coverage (aarch64, comparison).

Time Period from a month ago. (environment used: oss-standalone)

Test Case	Baseline bench/expire-persist-coverage (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
search-numeric	17861 +- 7.9% (4 datapoints)	12370.0	-30.7%	REGRESSION
search-expire-doc-1000-seconds	19 +- 12.4% UNSTABLE (3 datapoints)	14.0	-29.7%	UNSTABLE (baseline high variance); server: p50 latency stable; client: Latency increased 29.4% (baseline CV=8.3%); only client side confirms regression (server side stable) - insufficient evidence
search-aggregate-post-filter-simple.yml	17928 +- 3.6% (4 datapoints)	12723.0	-29.0%	REGRESSION
search-expire-doc-json-10-milliseconds	18368 +- 3.6% (4 datapoints)	13290.0	-27.6%	REGRESSION
ftsb-10K-enwiki_abstract-hashes-fulltext-sortby	1606 +- 5.7% (4 datapoints)	1165.0	-27.5%	REGRESSION
ftsb-10K-enwiki_abstract-hashes-term-suffix-withsuffixtrie	15873 +- 2.7% (4 datapoints)	11549.0	-27.2%	REGRESSION
search-high-cardinality-negation-term-comparison_union_all_other_terms	207 +- 6.4% (4 datapoints)	155.0	-25.1%	REGRESSION
search-expire-numeric-field-10-milliseconds	18 +- 10.6% UNSTABLE (3 datapoints)	14.0	-23.6%	UNSTABLE (baseline high variance); server: p50 latency stable; client: client latency stable; neither server nor client side confirms regression
search-numeric-sortby-desc-optimize	571 +- 8.2% (3 datapoints)	443.0	-22.4%	waterline=8.2%. REGRESSION
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query	5776 +- 11.1% UNSTABLE (4 datapoints)	4490.0	-22.3%	UNSTABLE (baseline high variance); server: FT.SEARCH p50 increased 19.3% (baseline CV=15.4%); client: OverallQuantiles.allCommands.q50 increased 28.0% (baseline CV=11.8%); confidence=HIGH (FT.SEARCH baseline CV=20.7%; FT.SEARCH p99 +22.6% (stable baseline); CV=coefficient of variation (data stability: <30% stable, 30-50% moderate, >50% unstable))
search-ftsb-10K-enwiki_abstract-hashes-term-withoutsuffix-trie	13361 +- 4.9% (4 datapoints)	10375.0	-22.3%	REGRESSION
ftsb-10K-enwiki_abstract-hashes-term-suffix	14031 +- 3.4% (4 datapoints)	10929.0	-22.1%	REGRESSION
search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-one-indexed-field	13535 +- 8.0% (4 datapoints)	10632.0	-21.5%	REGRESSION
ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query	10476 +- 5.3% (4 datapoints)	8332.0	-20.5%	REGRESSION
ftsb-10K-singlevalue-numeric-json	3742 +- 3.8% (4 datapoints)	3014.0	-19.5%	REGRESSION
ftsb-10K-enwiki_abstract-hashes-term-wildcard	13730 +- 3.0% (4 datapoints)	11098.0	-19.2%	REGRESSION
ftsb-1M-nyc_taxis-hashes-load	35412 +- 3.1% (4 datapoints)	28629.0	-19.2%	REGRESSION
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query-non-sortable	56 +- 28.0% UNSTABLE (4 datapoints)	46.0	-18.1%	UNSTABLE (baseline high variance); server: p50 latency stable; client: client latency stable; neither server nor client side confirms regression
search-geo	3232 +- 6.4% (4 datapoints)	2661.0	-17.7%	REGRESSION
search-filtering-tag-numeric-filter-pipeline	11349 +- 4.2% (4 datapoints)	9397.0	-17.2%	REGRESSION
ftsb-1K-enwiki_abstract-hashes-term-contains	10973 +- 6.1% (4 datapoints)	9328.0	-15.0%	REGRESSION
search-expire-doc-10-milliseconds	16 +- 3.0% (3 datapoints)	14.0	-14.4%	REGRESSION
search-ftsb-5200K-docs-union-iterators-q1	4.1 +- 3.1% (4 datapoints)	3.5	-14.0%	REGRESSION
search-ftsb-1700K-docs-union-iterators-q3	37 +- 4.4% (4 datapoints)	32.0	-13.8%	REGRESSION
search-high-cardinality-negation-term-baseline	347 +- 5.9% (4 datapoints)	299.0	-13.8%	REGRESSION
ftsb-10K-enwiki_abstract-hashes-term-prefix	11387 +- 7.0% (4 datapoints)	9893.0	-13.1%	REGRESSION
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query-non-sortable	1534 +- 14.0% UNSTABLE (4 datapoints)	1346.0	-12.3%	UNSTABLE (baseline high variance); server: p50 latency stable; client: OverallQuantiles.allCommands.q50 increased 16.8% (baseline CV=13.0%); only client side confirms regression (server side stable) - insufficient evidence
search-numeric-optimize	8473 +- 6.8% (3 datapoints)	7452.0	-12.1%	REGRESSION
search-ftsb-5500K-docs-union-iterators-q2	5.8 +- 2.8% (4 datapoints)	5.1	-11.9%	REGRESSION
search-ftsb-370K-docs-union-iterators-q4	37 +- 5.9% (4 datapoints)	34.0	-10.5%	REGRESSION
search-ftsb-10K-enwiki_abstract-hashes-fulltext-search-sortby-limit-0-100	5610 +- 2.9% (4 datapoints)	5026.0	-10.4%	REGRESSION
search-filtering-tag-numeric	3671 +- 11.2% UNSTABLE (4 datapoints)	4161.0	13.3%	UNSTABLE (baseline high variance); server: FT.AGGREGATE p50 decreased 11.4% (baseline CV=15.6%); client: Latency decreased 11.9% (baseline CV=10.3%); neither server nor client side confirms regression

Tests with No Significant Changes (17 tests)

Test Case	Baseline bench/expire-persist-coverage (median obs. +- std.dev)	Comparison bench/expire-persist-coverage (median obs. +- std.dev)	% change (higher-better)	Note
ftsb-10K-enwiki_pages-hashes-fulltext-mixed_simple-1word-query_write_1_to_read_20.yml	1750 +- 9.0% (4 datapoints)	1677	-4.1%	waterline=9.0%. potential REGRESSION
ftsb-10K-enwiki_pages-hashes-load	57930 +- 6.3% (4 datapoints)	61457	6.1%	potential IMPROVEMENT
ftsb-10K-multivalue-numeric-json	5065 +- 2.6% (4 datapoints)	5227	3.2%	potential IMPROVEMENT
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query	9566 +- 2.6% (4 datapoints)	9748	1.9%	No Change
ftsb-1M-enwiki_abstract-hashes-load	22821 +- 5.1% (4 datapoints)	23895	4.7%	potential IMPROVEMENT
ftsb-1M-nyc_taxis-ftadd-load	26382 +- 4.9% (4 datapoints)	27066	2.6%	No Change
search-expire-doc-50-50-10-milliseconds	50 +- 0.4% (3 datapoints)	50	0.3%	No Change
search-expire-doc-multi-index-10-milliseconds	32 +- 0.1% (4 datapoints)	32	-0.1%	No Change
search-expire-numeric-field-1000-seconds	14 +- 1.9% (3 datapoints)	14	1.8%	No Change
search-ftsb-10K-enwiki_abstract-hashes-fulltext-aggregate-sortby-limit-0-100	4934 +- 2.6% (4 datapoints)	5002	1.4%	No Change
search-ftsb-10K-enwiki_abstract-hashes-term-withsuffix-trie	10277 +- 0.6% (4 datapoints)	10246	-0.3%	No Change
search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-non-sortable	165 +- 5.1% (4 datapoints)	174	5.4%	potential IMPROVEMENT
search-ftsb-arxiv-titles-384-angular-filters-m16-ef-128-json-load	3638 +- 1.4% (4 datapoints)	3679	1.1%	No Change
search-numeric-sortby	12679 +- 4.7% (4 datapoints)	13293	4.8%	potential IMPROVEMENT
search-numeric-sortby-desc	12631 +- 4.6% (4 datapoints)	13052	3.3%	potential IMPROVEMENT
search-numeric-sortby-optimize	466 +- 8.9% (4 datapoints)	488	4.6%	waterline=8.9%. potential IMPROVEMENT
search-persist-doc-1000-seconds	35 +- 0.4% (3 datapoints)	35	0.1%	No Change

Brings the four new oss-standalone specs onto this branch so master baseline and this PR head exercise the same set, enabling direct master-vs-#9356 comparisons on the doc-expiration fast path:

- search-persist-doc-1000-seconds: PERSIST notification branch (previously unbenched)
- search-expire-doc-multi-index-10-milliseconds: 3-index fan-out signal amplifier
- search-expire-doc-50-50-10-milliseconds: 50/50 write-ratio variant
- search-expire-doc-json-10-milliseconds: JSON branch (Document_LoadSchemaFieldJson)

No metadata changes to the four pre-existing expire/numeric-field specs.

The four expire/persist benchmark specs added in this PR push to RedisTimeSeries via redisbench-admin's exporter, which forwards metadata.use_case as a TS.CREATE label value verbatim. The server-side LABELS parser rejects values containing parentheses, em-dashes, forward slashes paired with comparators, and similar punctuation — producing "TSDB: Couldn't parse LABELS" and aborting the headline Ops/sec timeseries push for the affected tests. Server-side metrics (commandstats / latencystats) push through a different path and made it; the headline Ops/sec series did not, leaving redisbench-admin compare with 0 comparison points for these specs. Replace the offending characters with plain ASCII while preserving the intent of each description. No code or workload change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Cherry-pick from joan-expire-not-fully-reindex covered the 4 specs that also live on PR #9356; this commit applies the same character-class sanitization to the 4 specs that only exist here (-10-milliseconds, -1000-seconds, -numeric-field-10-milliseconds, -numeric-field-1000-seconds). Same root cause: parens, em-dashes, forward slashes and commas in metadata.use_case make the RTS LABELS parser reject the TS.CREATE call and abort the headline Ops/sec push. No code or workload change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
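
The two commits above fix the use_case strings by hand. As an illustration only, the same character-class sanitization could be sketched as a small helper. The function name `sanitize_label_value` and the replacement map are assumptions: the commits say the offending characters were replaced with plain ASCII but do not say which substitutes were chosen, and whether every resulting character is accepted by the RTS LABELS parser is not verified here.

```python
import re

# Hypothetical replacement map (not taken from the commits).
_REPLACEMENTS = {
    "\u2014": "-",  # em-dash
    "\u2013": "-",  # en-dash
    "(": " ",
    ")": " ",
    "/": " ",
    ",": " ",
}


def sanitize_label_value(text: str) -> str:
    """Return an ASCII-only version of text with the listed punctuation replaced."""
    for bad, good in _REPLACEMENTS.items():
        text = text.replace(bad, good)
    # Drop any remaining non-ASCII characters.
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse the runs of whitespace left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


# Made-up description, used only to exercise the helper.
print(sanitize_label_value("PERSIST branch (fast path) \u2014 reads/writes, 50/50"))
# PERSIST branch fast path - reads writes 50 50
```

Running a check like this over every spec's metadata.use_case before pushing would flag strings that differ from their sanitized form, without touching the workload itself.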

CLAassistant · 2026-05-07T17:46:35Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ fcostaoliveira
❌ paulorsousa
You have signed the CLA already but the status is still pending? Let us recheck it.

sonarqubecloud · 2026-05-08T10:29:14Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions Bot added the size:M label Apr 30, 2026

cursor Bot reviewed Apr 30, 2026

fcostaoliveira added the action:run-benchmark label Apr 30, 2026

paulorsousa and others added 2 commits May 7, 2026 18:45

paulorsousa force-pushed the bench/expire-persist-coverage branch from 8d14d1a to 484bced Compare May 8, 2026 10:20

paulorsousa mentioned this pull request May 8, 2026

[MOD-14930] refactor expire handling without full reindexing #9356 (Open, 4 tasks)

fcostaoliveira wants to merge 3 commits into master from bench/expire-persist-coverage

cursor Bot review finding (Apr 30, 2026): JSON benchmark key-prefix missing trailing colon separator
