Compact with many threads could fail with commit conflict #2397

Open
chebbyChefNEQ opened this issue May 26, 2024 · 1 comment
Labels
bug Something isn't working rust Rust related tasks

Comments

@chebbyChefNEQ
Contributor

repro

very_large_dataset.optimize.compact_files(num_threads=a_lot)

chebbyChefNEQ added the bug and rust labels May 26, 2024

@wjones127
Contributor

I think the problem is that each rewrite task calls reserve_fragment_ids itself, which can cause contention on the commit if there are a lot of threads. Each task needs to do this so that it can construct the row id map. (We don't have accurate target row ids until the fragment ids are established.)

With stable row ids, we will be able to move the fragment id reservation to the end of the compaction, removing this contention.

For existing row ids, maybe we can do something more clever to make this possible.
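
To make the contention concrete, here is a toy model of it in Python. It is only a sketch under loud assumptions: `ToyManifest`, `try_reserve`, `simulate_per_task_reservation`, `reserve_once`, and the `max_attempts` retry limit are all hypothetical names invented for illustration, not Lance's actual commit or retry logic. The model assumes an optimistic commit that is rejected whenever another commit landed after the version the task read, and it replaces real threads with deterministic "rounds" in which every in-flight task has read the same version.

```python
from dataclasses import dataclass


class CommitConflict(Exception):
    """Raised when a commit is based on a stale version (toy model only)."""


@dataclass
class ToyManifest:
    # Hypothetical stand-in for a versioned dataset manifest.
    version: int = 0
    max_fragment_id: int = 0

    def try_reserve(self, read_version: int, count: int) -> int:
        """Optimistically reserve `count` fragment ids.

        Succeeds only if nobody has committed since `read_version`.
        Returns the first reserved id.
        """
        if read_version != self.version:
            raise CommitConflict(
                f"read version {read_version}, current is {self.version}"
            )
        first_id = self.max_fragment_id + 1
        self.max_fragment_id += count
        self.version += 1
        return first_id


def simulate_per_task_reservation(num_tasks: int, max_attempts: int) -> list[int]:
    """Each task reserves its own id; returns the tasks that gave up.

    In every round, all in-flight tasks have read the same version, so only
    the first commit in the round wins and the rest must retry.
    """
    manifest = ToyManifest()
    pending = list(range(num_tasks))
    attempts = {task: 0 for task in pending}
    failed = []
    while pending:
        read_version = manifest.version  # version seen by every in-flight task
        still_pending = []
        for task in pending:
            attempts[task] += 1
            try:
                manifest.try_reserve(read_version, count=1)
            except CommitConflict:
                if attempts[task] >= max_attempts:
                    failed.append(task)  # retries exhausted: commit conflict
                else:
                    still_pending.append(task)
        pending = still_pending
    return failed


def reserve_once(num_tasks: int) -> list[int]:
    """Reserve ids for all tasks in a single commit, so nothing can conflict."""
    manifest = ToyManifest()
    first_id = manifest.try_reserve(manifest.version, count=num_tasks)
    return list(range(first_id, first_id + num_tasks))


if __name__ == "__main__":
    print(simulate_per_task_reservation(num_tasks=8, max_attempts=3))
    print(reserve_once(num_tasks=8))
```

In this model only one task can win each round, so once the number of concurrent tasks exceeds the retry budget, some tasks necessarily run out of attempts. A single reservation has no competing commits at all. Which of these behaviours the real code shows depends on its actual retry policy, which the sketch does not try to reproduce.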

2 participants