fix: DROP TABLE no longer leaks across checkpoint boundary (#293, #398)#399
Conversation
Issue alibaba#293 reported two bugs from DROP TABLE: 1. DROP TABLE x; CREATE TABLE x; SELECT returned stale data. 2. DROP TABLE x; CHECKPOINT crashed the server. Root cause: VertexTable/EdgeTable lifecycle was not separated from the on-disk checkpoint state. Re-CREATEing a same-name label re-mmapped the prior label's leftover tmp files, and Drop() left the in-memory smart pointers populated, breaking the next Dump(). Design constraints adopted: - Drop() at every layer releases in-memory state only; it never touches the filesystem. checkpoint_dir is owned exclusively by Dump(). This preserves the implicit crash-rollback semantics: a crash between DROP and the next CHECKPOINT re-reads the prior checkpoint and effectively undoes the DROP. - Open() is split into Open() (restore from checkpoint) and Initialize() (fresh table). Initialize() sweeps tmp_dir for files keyed on the same label name/triple before opening, which is what fixes the same-session DROP+CREATE stale-data path (schema reuses the numeric label_t for a same-name CREATE, but tmp filenames key on the string name). - Tombstone slots in vertex_tables_ keep the label_t -> vector index mapping stable across DROP. - DeleteVertexType / DeleteEdgeType call Drop() before erasing the edge table map entries, so the two paths are symmetric. The previously proposed virtual IDataContainer::Drop() and per-container Drop() overrides are not introduced: in kInMemory / kHugePagePreferred modes a container's path_ points directly at the checkpoint file, so a generic Drop()-deletes-path_ contract would silently wipe checkpoint state. Keeping the file-system exit out of the container interface removes the footgun entirely. Tests: - tests/storage/test_checkpoint.cc: DropTableCheckpointTest cases for drop+checkpoint, drop+reopen+recreate, and the edge-table mirrors. - tools/python_bind/tests/test_db_query.py: end-to-end coverage for DROP TABLE + recreate visibility.
f41bc1c to
2b9d507
Compare
There was a problem hiding this comment.
Pull request overview
This PR tightens the DROP TABLE vs CHECKPOINT contract to prevent dropped tables from leaking persisted state across a checkpoint boundary (fixing #293) and implements the refactor tracked in #398 by separating “restore from checkpoint” vs “fresh initialize” behavior for storage tables.
Changes:
- Split table bring-up into
Open()(restore from checkpoint) vsInitialize()(fresh table) for vertex/edge tables, with tmp-dir sweeping on re-create to avoid same-session stale mmaps. - Adjust graph open semantics to tolerate tombstoned (dropped) label slots while preserving
label_t→ slot mapping. - Add/expand regression coverage in both Python bindings and C++ storage checkpoint tests, including typed coverage over multiple memory levels.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/python_bind/tests/test_db_query.py | Adds Python regression tests for DROP+reCREATE and delete-all-then-reinsert visibility. |
| tests/storage/test_checkpoint.cc | Refactors checkpoint tests into typed suites (memory-level coverage) and adds DROP/CHECKPOINT regression scenarios. |
| src/storages/graph/vertex_table.cc | Adds Initialize() + tmp-file sweeping and routes Open()/Initialize() through a shared openImpl. |
| src/storages/graph/property_graph.cc | Uses Initialize() for CREATE paths, calls Drop() before erasing edge tables, and keeps tombstone slots on open. |
| src/storages/graph/edge_table.cc | Adds Initialize() + tmp sweeping, refactors open path, and introduces Close()/Drop(). |
| src/storages/csr/mutable_csr.cc | Makes CSR open tolerate missing snapshot directories by conditionally building the snapshot prefix. |
| include/neug/storages/graph/vertex_table.h | Adds default ctor + Initialize() + shared openImpl declaration. |
| include/neug/storages/graph/edge_table.h | Adds Initialize(), Close(), Drop(), and shared openImpl declaration. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
be6192f to
7551bc3
Compare
| meta_->dst_label_name, meta_->edge_label_name, | ||
| capacity_, table_idx_); | ||
| if (table_cap != capacity_.load()) { | ||
| THROW_INVALID_ARGUMENT_EXCEPTION( |
There was a problem hiding this comment.
is INVALID_ARGUMENT_EXCEPTION appropriate here
There was a problem hiding this comment.
Replace with Internal error
| // DROP). Do not make this function delete files; doing so requires writing | ||
| // DROP to the WAL first or that rollback invariant will break. | ||
| void EdgeTable::Drop() { | ||
| out_csr_.reset(); |
There was a problem hiding this comment.
is there a difference in usage between Drop and Close?
There was a problem hiding this comment.
It seems that for LFIndexer, Table, VertexTable, EdgeTable, the logic of drop and close are actually the same. Maybe we could unify two method into one.
bc4e3f9 to
99953c7
Compare
99953c7 to
d44a607
Compare
469c00f to
9a83b43
Compare
Summary
Fixes #293 and implements the refactor tracked in #398. Replaces the earlier PR #306 approach with a tighter Drop/checkpoint contract that keeps file-system mutation out of the container layer.
checkpoint_diris touched exclusively byDump(). This preserves the implicit crash-rollback semantics for DROP-before-CHECKPOINT.Open()split intoOpen()(restore from checkpoint) andInitialize()(fresh table). Replaces the leakybool load_from_checkpoint = trueparameter.Initialize()sweepstmp_dirfor files keyed on the same label name/triple before opening fresh containers. This is what unsticks the same-session DROP+CREATE stale-data path — schema reuses the numericlabel_tfor a same-name CREATE, but tmp filenames key on the string name.DeleteVertexTypeandDeleteEdgeTypecallDrop()before erasing, making the two paths symmetric.IDataContainer::Drop()is intentionally NOT added. InkInMemory/kHugePagePreferredmodes a container'spath_points at the checkpoint file, so a generic Drop-deletes-path contract would silently corrupt checkpoint state. See Refactor: tighten DROP TABLE / checkpoint isolation contract #398 for the full rationale.