Concurrent create dataset gives bad error, doesn't retry #2403

Open
wjones127 opened this issue May 29, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@wjones127
Contributor

If a user tries to append to the same dataset concurrently, one writer will fail with a confusing internal error:

Internal { message: "Commit conflict", location: Location { file: "rust/lance-table/src/io/commit.rs", line: 424, column: 27 } }

This is because commit_new_dataset doesn't catch CommitError::CommitConflict and retry:

write_manifest_file(
    object_store,
    commit_handler,
    base_path,
    &mut manifest,
    if indices.is_empty() {
        None
    } else {
        Some(indices.clone())
    },
    write_config,
)
.await?;

For comparison, in commit_transaction(), we have a retry loop and a nicer error message:

Err(crate::Error::CommitConflict {
    version: target_version,
    source: format!(
        "Failed to commit the transaction after {} retries.",
        commit_config.num_retries
    )
    .into(),
    location: location!(),
})
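
For readers unfamiliar with that code path, the general shape of such a retry loop might look roughly like the sketch below. It is a simplified, synchronous stand-in, not the actual commit_transaction() implementation: `AttemptError`, `CommitError`, and `commit_with_retries` are hypothetical names, and the "move to the next version on conflict" step is an assumption that glosses over reloading the dataset and checking the intervening transactions.

```rust
/// Hypothetical outcome of a single commit attempt.
#[derive(Debug, PartialEq)]
enum AttemptError {
    /// Another writer already claimed the target version.
    Conflict,
    /// Any other failure; this sketch never retries these.
    Other(String),
}

/// Hypothetical user-facing error, loosely modelled on the variant quoted above.
#[derive(Debug, PartialEq)]
enum CommitError {
    CommitConflict { version: u64, source: String },
    Other(String),
}

/// Call `attempt` up to `num_retries` times, moving to the next version
/// after each conflict. Returns the version that was committed.
fn commit_with_retries<F>(
    start_version: u64,
    num_retries: u32,
    mut attempt: F,
) -> Result<u64, CommitError>
where
    F: FnMut(u64) -> Result<(), AttemptError>,
{
    let mut target_version = start_version;
    for _ in 0..num_retries {
        match attempt(target_version) {
            Ok(()) => return Ok(target_version),
            // Assumption: on conflict we simply target the next version.
            Err(AttemptError::Conflict) => target_version += 1,
            Err(AttemptError::Other(msg)) => return Err(CommitError::Other(msg)),
        }
    }
    // All attempts conflicted: report a descriptive error, not an internal one.
    Err(CommitError::CommitConflict {
        version: target_version,
        source: format!(
            "Failed to commit the transaction after {} retries.",
            num_retries
        ),
    })
}
```

The point of the sketch is only that a conflict is treated as a recoverable condition inside the loop and surfaces as a descriptive error once the retries are exhausted.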

We should change this to see if the transactions are compatible and, if so, have commit_new_dataset load the dataset and call into commit_transaction.
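
A rough sketch of what that fallback could look like, using a toy in-memory store rather than the real object store and commit handler. Everything here is hypothetical: `Store`, `Operation`, `is_compatible`, and the function signatures are made up for illustration, and the compatibility rule (an append can be replayed on top of an existing dataset, a create cannot) is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical operations a writer might try to commit.
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    Create(Vec<String>),
    Append(Vec<String>),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum Error {
    CommitConflict { version: u64, source: String },
}

/// Toy in-memory store: manifest version -> committed operation.
#[derive(Default)]
struct Store {
    manifests: BTreeMap<u64, Operation>,
}

impl Store {
    /// Put-if-absent write; `false` means another writer got there first.
    fn try_write(&mut self, version: u64, op: &Operation) -> bool {
        if self.manifests.contains_key(&version) {
            return false;
        }
        self.manifests.insert(version, op.clone());
        true
    }

    /// Highest committed version, or 0 if the dataset does not exist yet.
    fn latest_version(&self) -> u64 {
        self.manifests.keys().next_back().copied().unwrap_or(0)
    }
}

/// Assumed rule: only appends can be applied on top of an existing dataset.
fn is_compatible(op: &Operation) -> bool {
    matches!(op, Operation::Append(_))
}

/// Stand-in for the existing commit path; the real one also retries.
fn commit_transaction(store: &mut Store, op: &Operation) -> Result<u64, Error> {
    let version = store.latest_version() + 1;
    if store.try_write(version, op) {
        Ok(version)
    } else {
        Err(Error::CommitConflict {
            version,
            source: "Another writer committed this version first.".to_string(),
        })
    }
}

/// Try to write version 1; on conflict, fall back to commit_transaction
/// when the operation is compatible with an already-created dataset.
fn commit_new_dataset(store: &mut Store, op: &Operation) -> Result<u64, Error> {
    if store.try_write(1, op) {
        return Ok(1);
    }
    if !is_compatible(op) {
        return Err(Error::CommitConflict {
            version: 1,
            source: "Dataset was created concurrently by another writer.".to_string(),
        });
    }
    commit_transaction(store, op)
}
```

In this sketch, a writer that loses the race for version 1 with an append lands on version 2, while a losing create gets a CommitConflict with a readable message.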

@wjones127 wjones127 added the bug Something isn't working label May 29, 2024