Problem
Extends #227. When indexFile is called on a path that is already indexed, removeFile is called first to clear the old trigrams. But removeFile deletes the path_to_id entry and leaves the id_to_path slot behind as a dead entry; it does not compact the array.
On the next indexFile call, getOrCreateDocId misses in path_to_id (the entry was deleted) and appends a new slot to id_to_path. Each re-index therefore grows id_to_path.items.len by 1, so wasted memory grows as O(re-index count × files).
This is the same root cause as #227 but manifests even for a single file re-indexed repeatedly.
Failing Test
test "issue-247: TrigramIndex.id_to_path does not grow on re-index of same file" {
    var idx = TrigramIndex.init(testing.allocator);
    defer idx.deinit();
    const src = "fn alpha() void {} fn beta() void {} const X = 1;";
    var i: usize = 0;
    while (i < 5) : (i += 1) {
        try idx.indexFile("f.zig", src);
    }
    // Currently FAILS: id_to_path.items.len == 5 (grows by 1 per re-index).
    try testing.expectEqual(@as(usize, 1), idx.id_to_path.items.len);
}
Expected
After N re-indexes of the same file, id_to_path.items.len equals the number of unique files indexed, not the number of indexFile calls.
Fix
Option A: make removeFile also remove the id_to_path slot and compact (swap-remove), updating path_to_id for the moved entry.
Option B: make getOrCreateDocId reuse existing slots — check path_to_id AND id_to_path before allocating a new id.
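A rough sketch of Option A, to make the bookkeeping concrete. DocTable is a hypothetical stand-in that models only the two tables named above; it is not the project's TrigramIndex, and the field types (u32 ids, borrowed path slices, unmanaged std containers) are assumptions. The exact std container API also varies between Zig releases, so treat this as an illustration rather than a drop-in patch.

```zig
const std = @import("std");

// Hypothetical model of the two doc-id tables (assumed layout, not the real TrigramIndex).
const DocTable = struct {
    allocator: std.mem.Allocator,
    path_to_id: std.StringHashMapUnmanaged(u32) = .{},
    id_to_path: std.ArrayListUnmanaged([]const u8) = .{},

    fn deinit(self: *DocTable) void {
        self.path_to_id.deinit(self.allocator);
        self.id_to_path.deinit(self.allocator);
    }

    // Paths are borrowed here for brevity; a real index would likely own copies.
    fn getOrCreateDocId(self: *DocTable, path: []const u8) !u32 {
        if (self.path_to_id.get(path)) |id| return id;
        const id: u32 = @intCast(self.id_to_path.items.len);
        try self.id_to_path.append(self.allocator, path);
        errdefer _ = self.id_to_path.pop(); // keep both tables in sync if put fails
        try self.path_to_id.put(self.allocator, path, id);
        return id;
    }

    // Option A: drop the slot with swap-remove and re-point the entry that moved.
    fn removeFile(self: *DocTable, path: []const u8) void {
        const kv = self.path_to_id.fetchRemove(path) orelse return;
        const id = kv.value;
        _ = self.id_to_path.swapRemove(id);
        // If `id` was not the last slot, the former last path now lives at `id`.
        if (id < self.id_to_path.items.len) {
            const moved = self.id_to_path.items[id];
            self.path_to_id.getPtr(moved).?.* = id;
        }
    }
};

test "sketch: id_to_path stays compact across re-index" {
    var t = DocTable{ .allocator = std.testing.allocator };
    defer t.deinit();

    _ = try t.getOrCreateDocId("a.zig");
    var i: usize = 0;
    while (i < 5) : (i += 1) {
        t.removeFile("f.zig");
        _ = try t.getOrCreateDocId("f.zig");
    }
    // Two unique files, regardless of how many times f.zig was re-indexed.
    try std.testing.expectEqual(@as(usize, 2), t.id_to_path.items.len);

    // Removing a non-last entry moves the last one into its slot.
    t.removeFile("a.zig");
    try std.testing.expectEqual(@as(?u32, 0), t.path_to_id.get("f.zig"));
}
```

One consequence worth weighing before choosing Option A: swap-remove changes the id of the moved file. If the trigram postings store doc ids, any posting that referenced the moved file's old id would also have to be rewritten, which Option B avoids by keeping ids stable.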