Fix incoming edges ghost memory leak (MOD-13761)#920
Conversation
Add three benchmarks to measure the performance and memory impact of the incoming edges `shrink_to_fit` fix:

1. DeleteZeroVectorsAsync - async deletion path (production default)
2. DeleteZeroVectorsInPlace - in-place deletion path (worst-case latency)
3. InsertZeroVectorsTimed - insertion path (heuristic pruning cost)

Stress scenario: 40K random + 50K zero vectors with the COSINE metric, which forces hub nodes with large incoming edge vectors. Each benchmark measures ghost memory (wasted capacity) before and after `shrink_to_fit`, with detailed stats (percentiles, top-10, mean).

Run with: `make benchmark BM_FILTER=bm-index-internals-incoming-edges`
🛡️ Jit Security Scan Results: ✅ No security findings were detected in this PR (security scan by Jit).
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@ Coverage Diff @@
##             main     #920      +/-   ##
==========================================
+ Coverage   96.98%   97.01%   +0.02%
==========================================
  Files         129      129
  Lines        7567     7572       +5
==========================================
+ Hits         7339     7346       +7
+ Misses        228      226       -2
```
Cursor Bugbot has reviewed your changes and found 3 potential issues.
There are 4 total unresolved issues (including 1 from previous review).
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
```cpp
// Sort for percentiles and top-10
std::vector<size_t> sorted_sizes(all_sizes);
std::vector<size_t> sorted_caps(all_caps);
std::sort(sorted_sizes.begin(), sorted_sizes.end());
std::sort(sorted_caps.begin(), sorted_caps.end());

// Percentile helper (nearest-rank method)
auto percentile = [](const std::vector<size_t> &sorted, double p) -> size_t {
    if (sorted.empty())
        return 0;
    size_t idx = static_cast<size_t>(p / 100.0 * sorted.size());
    if (idx >= sorted.size())
        idx = sorted.size() - 1;
    return sorted[idx];
};

size_t p50_size = percentile(sorted_sizes, 50);
size_t p90_size = percentile(sorted_sizes, 90);
size_t p99_size = percentile(sorted_sizes, 99);
size_t max_size = sorted_sizes.empty() ? 0 : sorted_sizes.back();

size_t p50_cap = percentile(sorted_caps, 50);
size_t p90_cap = percentile(sorted_caps, 90);
size_t p99_cap = percentile(sorted_caps, 99);
size_t max_cap = sorted_caps.empty() ? 0 : sorted_caps.back();

// --- Report index memory via benchmark counter ---
state.counters["index_memory"] = hnsw_->getAllocationSize();

// --- Print detailed distribution to stdout ---
std::cout << "\n=== Incoming Edges Stats"
          << (iteration >= 0 ? " (iter=" + std::to_string(iteration) + ")" : "")
          << " ===" << std::endl;
std::cout << "  Nodes: " << num_elements << "  Level entries: " << total_vectors
          << "  Non-empty: " << non_empty_count << std::endl;
std::cout << "  Wasted bytes: " << wasted_bytes << " (used=" << total_used_bytes
          << ", alloc=" << total_alloc_bytes << ")" << std::endl;
std::cout << "  Size - mean: " << mean_size << " p50: " << p50_size
          << " p90: " << p90_size << " p99: " << p99_size << " max: " << max_size
          << std::endl;
std::cout << "  Cap  - mean: " << mean_cap << " p50: " << p50_cap << " p90: " << p90_cap
          << " p99: " << p99_cap << " max: " << max_cap << std::endl;
```
Consider removing all of these stats and keeping only the most relevant stats to be output in state.counters
Example output:

```
--- Async iteration 0: After deletion (before shrink) ---
=== Incoming Edges Stats (iter=0) ===
  Nodes: 40000  Level entries: 42595  Non-empty: 14867
  Wasted bytes: 60188 (used=277196, alloc=337384)
  Size - mean: 1.62693 p50: 0 p90: 6 p99: 14 max: 43
  Cap  - mean: 1.98019 p50: 0 p90: 7 p99: 20 max: 50
  Top-10 by size: [43, 37, 34, 31, 31, 31, 31, 29, 29, 29]
  Top-10 by cap:  [50, 46, 46, 44, 44, 44, 43, 42, 42, 40]
--- Async iteration 0: After shrink (baseline) ---
=== Incoming Edges Stats (iter=0) ===
  Nodes: 40000  Level entries: 42595  Non-empty: 14867
  Wasted bytes: 0 (used=277196, alloc=277196)
  Size - mean: 1.62693 p50: 0 p90: 6 p99: 14 max: 43
  Cap  - mean: 1.62693 p50: 0 p90: 6 p99: 14 max: 43
  Top-10 by size: [43, 37, 34, 31, 31, 31, 31, 29, 29, 29]
  Top-10 by cap:  [43, 37, 34, 31, 31, 31, 31, 29, 29, 29]
```
This is a diagnostic benchmark for investigating memory issues, not a regression benchmark. When run manually, seeing the wasted bytes breakdown, node count, and top-10 by capacity directly in the console gives the full picture without needing a separate analysis script. state.counters only supports scalars and can't capture this.
I'll remove the size/cap distribution lines (mean, p50, p90, etc.) and add wasted_bytes to state.counters so the most important metric is also in the JSON output.
/backport

Git push to origin failed for 8.2 with exitcode 1

/backport

Git push to origin failed for 8.2 with exitcode 1
* Add incoming edges ghost memory benchmarks (MOD-13761)
* results before
* shrinking logic
* fix uncoditionally shrink
* add bm-index-internals-incoming-edges
* use 1 thread
* use ratio = 2, remove min
* remove results before
* rename to bm-hnsw-internals-incoming-edges

Problem
When vectors are deleted from an HNSW index, the
`incomingUnidirectionalEdges` vectors on neighboring nodes grow during insertion but never shrink after deletion. This causes "ghost memory" — allocated but unused capacity that accumulates over time. In stress scenarios (e.g. COSINE metric with hub nodes), incoming edge vectors can grow to thousands of entries and retain all that capacity after the edges are removed.
Fix

Add an amortized `shrink_to_fit()` inside `removeIncomingUnidirectionalEdgeIfExists()` in `graph_data.h`. The ratio threshold ensures amortized O(1) cost: `shrink_to_fit()` only fires when capacity exceeds 2× the actual size, avoiding reallocation churn on every removal.

Results
Measured with a stress scenario: 40K random + 50K zero vectors, COSINE metric, M=16, EF_C=200, 1 background thread.
Full benchmark details and raw results: Confluence page
Benchmarks added
This PR also adds a benchmark suite (`bm_incoming_edges`) to measure the fix: