Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add a new ratio config instead of table_data_deserialized_data_bytes #14896

Merged
merged 4 commits into from Mar 11, 2024

Conversation

chienguo
Copy link
Contributor

@chienguo chienguo commented Mar 10, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Mar 10, 2024
src/query/config/src/config.rs Outdated Show resolved Hide resolved
src/query/config/src/inner.rs Outdated Show resolved Hide resolved
@dantengsky
Copy link
Member

dantengsky commented Mar 11, 2024

@chienguo

Thanks for your contribution!

There's a concern that hoping you can consider in this PR:

In some scenarios (such as testing and deploying on cloud services), retaining the table_data_deserialized_data_bytes setting alongside table_data_deserialized_memory_ratio would be more convenient.

Furthermore, if both settings are non-zero, the table_data_deserialized_data_bytes setting should have higher priority.

Similar handling in the settings for spilling may serve as a reference (however, in the scenario of this PR, there's no need to consider settings at the per_core level).

https://github.com/datafuselabs/databend/blob/345cba5aa73ba3e93f4c6372a75fb956812243e4/src/query/service/src/pipelines/builders/builder_sort.rs#L212C1-L239C5

cc @BohuTANG

@BohuTANG
Copy link
Member

@chienguo

Thanks for your contribution!

There's a concern that hoping you can consider in this PR:

In some scenarios (such as testing and deploying on cloud services), retaining the table_data_deserialized_data_bytes setting alongside table_data_deserialized_memory_ratio would be more convenient.

Furthermore, if both settings are non-zero, the table_data_deserialized_data_bytes setting should have higher priority.

Similar handling in the settings for spilling may serve as a reference (however, in the scenario of this PR, there's no need to consider settings at the perf_core level).

https://github.com/datafuselabs/databend/blob/345cba5aa73ba3e93f4c6372a75fb956812243e4/src/query/service/src/pipelines/builders/builder_sort.rs#L212C1-L239C5

cc @BohuTANG

@dantengsky Your suggestion looks great to me.
@chienguo Sorry for issue #14866 , it seems it wasn't clear enough, @dantengsky added clarity.

chienguo and others added 3 commits March 11, 2024 11:25
Co-authored-by: dantengsky <dantengsky@gmail.com>
Co-authored-by: dantengsky <dantengsky@gmail.com>
@chienguo
Copy link
Contributor Author

@dantengsky, already made the original one back, and have dealt with the priority

@dantengsky
Copy link
Member

@chienguo LGTM & thanks

@BohuTANG
Copy link
Member

@chienguo 👍 . Thanks for your contribution.

@BohuTANG BohuTANG merged commit 038d522 into datafuselabs:main Mar 11, 2024
71 checks passed
@chienguo
Copy link
Contributor Author

@BohuTANG, @dantengsky, thanks for your help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-feature this PR introduces a new feature to the codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants