Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix index corruption on reindex #170

Merged
merged 8 commits into from
Jan 23, 2023
Merged

Fix index corruption on reindex #170

merged 8 commits into from
Jan 23, 2023

Conversation

arkpar
Copy link
Member

@arkpar arkpar commented Jan 23, 2023

An interrupted reindexing process would result in duplicated index entries on the next attempt. The check for existing entries when reindexing is incorrect, as it checks for full key when only partial key is available.

Copy link
Collaborator

@cheme cheme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 didn't thought about this.

_ => Ok(PlanOutcome::Written),
let mut outcome = PlanOutcome::Written;
while let PlanOutcome::NeedReindex =
tables.index.write_insert_plan(key, address, None, log)?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure.
Looping is just in case there is some concurrent reindexing leading to a multiple reindex trigger (could happen , but probably rare in practice).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it should never happen realistically. Insertion into an empty index should always be successful. This code had a small bug. An entry in the value table was created twice if reindex was triggered.

}
Ok(false)
}

fn search_all_indexes<'a>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes me think search_all_indexes could be call search_index and search_index something a bit technical, so first idea would be to search all indexes.
But really not needed.

@@ -620,14 +642,20 @@ impl HashColumn {
} else if let Some(table) = reindex.queue.iter().find(|r| r.id == record.table) {
table.validate_plan(record.index, log)?;
} else {
if record.table.index_bits() < tables.index.id.index_bits() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember deriving ord the other day when looking at this, just realize it may have been plain wrong (logic to ordering between different index may just be bad).

src/db.rs Outdated
f: impl FnMut(IterState) -> bool,
) -> Result<()> {
/// Iterate a column and call a function for each value. Iteration order is unspecified. Note
/// that for hash columns the key is the hash of the original key.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add that this iteration skip the commit overlay (iirc this is on dbinner so user should not be calling it, but still would be good to note).
Could also say that this is only for Hash indexed column (btree is unimplemented, since we got iterator, but could actually probably be).

src/error.rs Show resolved Hide resolved
Copy link
Contributor

@Tpt Tpt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have nothing meaningful to add to @cheme review

@arkpar arkpar merged commit a6a0c90 into master Jan 23, 2023
@arkpar arkpar deleted the a-fix-reindex branch January 23, 2023 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants