Merge issue about large data part with lightweight delete #56728

Closed
jewelzqiu opened this issue Nov 14, 2023 · 4 comments · Fixed by #57648 or #58223

Comments

@jewelzqiu
Contributor

Use case

When a data part is larger than max_bytes_to_merge_at_max_space_in_pool, it is never selected for a merge, so rows removed by lightweight delete inside that part are never physically deleted unless OPTIMIZE FINAL is executed.

Describe the solution you'd like

When selecting parts to merge, use the estimated size of the remaining (non-deleted) rows,
bytes_on_disk * (rows_count - lwd_rows_count) / rows_count, as the source part size instead of bytes_on_disk.
We can enable this mode only when the ratio of deleted rows exceeds a certain value (maybe 25%).
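
To make the proposal concrete, here is a minimal Python sketch of the size estimate. It is not the ClickHouse implementation: the function name `estimated_merge_size` and the parameter `min_deleted_ratio` are hypothetical, and the 25% default is only the "maybe 25%" value floated above.

```python
def estimated_merge_size(bytes_on_disk: int,
                         rows_count: int,
                         lwd_rows_count: int,
                         min_deleted_ratio: float = 0.25) -> int:
    """Size to attribute to a source part when selecting parts to merge.

    Hypothetical helper: names and the 25% default are assumptions taken
    from the wording of the proposal, not from ClickHouse code.
    """
    # An empty part has no ratio to compute; fall back to the on-disk size.
    if rows_count <= 0:
        return bytes_on_disk

    deleted_ratio = lwd_rows_count / rows_count

    # The estimate is used only when the deleted ratio exceeds the threshold.
    if deleted_ratio <= min_deleted_ratio:
        return bytes_on_disk

    # bytes_on_disk * (rows_count - lwd_rows_count) / rows_count
    return bytes_on_disk * (rows_count - lwd_rows_count) // rows_count


GIB = 2 ** 30
merge_limit = 150 * GIB   # hypothetical merge size limit, for illustration only
part_size = 200 * GIB     # larger than the limit, so never merged today

estimate = estimated_merge_size(part_size, rows_count=1_000_000,
                                lwd_rows_count=600_000)
# With 60% of rows deleted the estimate is 80 GiB, which fits under the limit.
assert estimate == 80 * GIB and estimate <= merge_limit
```

With the estimate in place, a part whose on-disk size is above the limit can become eligible for a merge once enough of its rows are marked as deleted, and the merge then drops those rows.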

@lqhl

lqhl commented Nov 15, 2023

@alexey-milovidov We plan to implement this feature. What are your thoughts on it? Do you think it's a good solution?

@melvynator
Member

Adding @alesapin, as he may also have some thoughts on the topic.

@alexey-milovidov
Member

Yes, good idea - it will be appreciated.

@den-crane
Contributor

reverted #58097
