Fix panics during highly concurrent tombstone cleanup #4411

Merged
merged 3 commits into from
Mar 8, 2024

abdelr
Contributor

@abdelr abdelr commented Mar 6, 2024

What's being changed:

If the entry point is marked as deleted and tombstone cleanup runs with a high concurrency level, the server panics.

fixes #4410
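
For readers unfamiliar with the failure mode, here is a minimal sketch of the kind of guard that avoids it. This is not the patch in this PR: `graph`, `node`, `liveEntrypoint`, `remove`, and `cleanupConcurrently` are hypothetical names on a toy structure, and the fallback strategy (pick the first live node) is an assumption made only for illustration. It shows concurrent workers asking for an entry point while that entry point is being removed, without ever dereferencing a nil slot.

```go
package main

import (
	"errors"
	"sync"
)

// node is a hypothetical stand-in for a graph vertex. A nil slot in
// graph.nodes models a vertex that cleanup has already removed.
type node struct {
	id        uint64
	tombstone bool
}

// graph is a toy index for illustration, not Weaviate's hnsw struct.
type graph struct {
	mu         sync.RWMutex
	nodes      []*node
	entryPoint uint64
}

var errNoEntrypoint = errors.New("no live entry point available")

// liveEntrypoint returns the configured entry point if it is still usable,
// otherwise the first live node. It returns an error instead of panicking
// when every node is gone.
func (g *graph) liveEntrypoint() (uint64, error) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	if ep := g.entryPoint; ep < uint64(len(g.nodes)) {
		if n := g.nodes[ep]; n != nil && !n.tombstone {
			return ep, nil
		}
	}
	for _, n := range g.nodes {
		if n != nil && !n.tombstone {
			return n.id, nil
		}
	}
	return 0, errNoEntrypoint
}

// remove clears a slot, as a cleanup worker might after reassigning neighbors.
func (g *graph) remove(id uint64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if id < uint64(len(g.nodes)) {
		g.nodes[id] = nil
	}
}

// cleanupConcurrently starts `workers` goroutines that each look up an entry
// point while the calling goroutine removes `victim`. It returns one error
// slot per worker.
func cleanupConcurrently(g *graph, workers int, victim uint64) []error {
	errs := make([]error, workers)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_, errs[i] = g.liveEntrypoint()
		}(i)
	}
	g.remove(victim)
	wg.Wait()
	return errs
}
```

The design choice in the sketch is to surface an error when no live node remains, so the caller can decide what to do, rather than letting a nil slot reach the search code.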

Review checklist

  • Documentation has been updated, if necessary. Link to changed documentation:
  • Chaos pipeline run or not necessary. Link to pipeline:
  • All new code is covered by tests where it is reasonable.
  • Performance tests have been run or not necessary.

@abdelr abdelr self-assigned this Mar 6, 2024
@rthiiyer82

2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | 
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | goroutine 994700 [running]:
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).searchLayerByVectorWithDistancer(0x4002236c80, {0x403898c700, 0x20, 0x20}, 0x0?, 0x1, 0x2, {0x0?, 0x0?}, {0x0, ...})
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/search.go:271 +0x7c8
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).findBestEntrypointForNode(0x4002236c80, 0x18a6b70?, 0x0, 0x403ba69980?, {0x403898c700, 0x20, 0x20}, {0x0?, 0x0?})
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/index.go:431 +0x1d8
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).reassignNeighbor(0x4002236c80, 0xa562, {0x18b2480, 0x4002a0e880}, 0x73616c43223a2273?)
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/delete.go:409 +0x41c
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).reassignNeighborsOf.func1()
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/delete.go:334 +0x144
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | golang.org/x/sync/errgroup.(*Group).Go.func1()
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/pkg/mod/golang.org/x/sync@v0.6.0/errgroup/errgroup.go:78 +0x58
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 994218
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/pkg/mod/golang.org/x/sync@v0.6.0/errgroup/errgroup.go:75 +0x98
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | panic: runtime error: invalid memory address or nil pointer dereference
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x9d07e8]
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | 
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | goroutine 994829 [running]:
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).searchLayerByVectorWithDistancer(0x4002236c80, {0x4010790900, 0x20, 0x20}, 0x0?, 0x1, 0x2, {0x0?, 0x0?}, {0x0, ...})
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/search.go:271 +0x7c8
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).findBestEntrypointForNode(0x4002236c80, 0x18a6b70?, 0x0, 0x404e871400?, {0x4010790900, 0x20, 0x20}, {0x0?, 0x0?})
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/index.go:431 +0x1d8
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).reassignNeighbor(0x4002236c80, 0xa473, {0x18b2480, 0x4002a0e880}, 0x31353631332e302c?)
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/delete.go:409 +0x41c
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw.(*hnsw).reassignNeighborsOf.func1()
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/src/github.com/weaviate/weaviate/adapters/repos/db/vector/hnsw/delete.go:334 +0x144
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  | golang.org/x/sync/errgroup.(*Group).Go.func1()
2024-03-07 11:30:34 weaviate-weaviate-node-1-1  |       /go/pkg/mod/golang.org/x/sync@v0.6.0/errgr

The panic above was observed on the PR branch.

@abdelr abdelr changed the title from "fix panic when removing entryPoint with high concurrent tombstones cleanups" to "fixes #4410" Mar 8, 2024
@abdelr abdelr force-pushed the fix_concurrent_tombstones_cleanup branch from 3a42c61 to ffb715c March 8, 2024 10:09

sonarcloud bot commented Mar 8, 2024

Quality Gate failed

Failed conditions
46.5% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

@parkerduckworth parkerduckworth changed the title from "fixes #4410" to "Fix panics during highly concurrent tombstone cleanup" Mar 8, 2024
@antas-marcin antas-marcin merged commit a73fd35 into stable/v1.24 Mar 8, 2024
33 of 35 checks passed
@antas-marcin antas-marcin deleted the fix_concurrent_tombstones_cleanup branch March 8, 2024 16:07
Development

Successfully merging this pull request may close these issues.

panic findNewLocalEntrypoint called on an empty hnsw graph with deletion concurrency > 1
5 participants