chore: merge change sets on the dev branch #1423
Conversation
## Rationale When channel capacity < read_parallelism, we will pass 0 to the channel, which causes a panic: ``` 2023-12-05 20:31:32.974 ERRO [components/panic_ext/src/lib.rs:54] thread 'ceres-read' panicked 'mpsc bounded channel requires buffer > 0' at "analytic_engine/src/sst/parquet/async_reader.rs:736" ``` ## Detailed Changes - Ensure the channel size is non-zero ## Test Plan No need.
## Rationale Close apache#1178 ## Detailed Changes - Use a string to represent request_id, a UUID v4 random string like `575c02e1-cd92-4c35-a5f3-353781163e93` - https://docs.rs/uuid/1.6.1/uuid/struct.Uuid.html#method.new_v4 ## Test Plan Manually.
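A minimal sketch of the guard described above; the function name and the capacity formula are illustrative, not the actual `async_reader` code:

```rust
// Illustrative guard mirroring the fix: a bounded mpsc channel panics on
// capacity 0, so clamp the computed per-reader size to at least 1.
fn channel_capacity(total_capacity: usize, read_parallelism: usize) -> usize {
    (total_capacity / read_parallelism.max(1)).max(1)
}

fn main() {
    // Before the fix, capacity < read_parallelism yielded 0 and panicked.
    assert_eq!(channel_capacity(4, 8), 1);
    assert_eq!(channel_capacity(16, 4), 4);
    println!("ok");
}
```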
## Rationale Because the logic related to create table has been reconstructed in `horaemeta` apache/incubator-horaedb-meta#288, the results returned when reporting errors during create table have changed. Therefore, we need to correct the results of table creation in the integration test. ## Detailed Changes * Update create table result in integration test. ## Test Plan No need.
## Rationale In this PR apache#1344, the results of table creation in the integration test were updated, but it was found that the integration test failed to pass due to an extra blank row. Fix this problem. ## Detailed Changes * Fix create table result. ## Test Plan No need.
## Rationale ## Detailed Changes ## Test Plan
## Rationale <del>This issue happens quite a lot in CI, and developers can do nothing besides retry, which is very annoying. Since this test failure is caused by a race condition and there is no easy way to fix it, I suggest we disable it for now.</del> ## Detailed Changes Increase wait time to 10s. ## Test Plan CI
## Rationale The flush failure before table close may lead to shard close failure. However, such failure is tolerable because the unflushed can still be recovered during the following table open. ## Detailed Changes Ignore the flush failure before closing table. ## Test Plan Pass all the tests in the ci.
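The tolerance described above can be sketched as follows; the types and the error message are simplified stand-ins, not the actual engine code:

```rust
// Sketch (hypothetical types): treat a flush error during close as
// non-fatal, since unflushed data is recovered on the next table open.
#[derive(Debug)]
struct FlushError(String);

fn flush() -> Result<(), FlushError> {
    Err(FlushError("object store unavailable".to_string()))
}

fn close_table() -> bool {
    if let Err(e) = flush() {
        // Log and continue instead of failing the shard close.
        eprintln!("flush before close failed, ignored: {e:?}");
    }
    true // the close itself still succeeds
}

fn main() {
    assert!(close_table());
    println!("ok");
}
```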
…apache#1363) ## Rationale The metrics about the fetched data from the object storage helps estimate the load of query on one specific table, and the table's load distribution can help build a cluster topology with better load balance. ## Detailed Changes Collect metrics for bytes fetched from object storage. ## Test Plan Check the added metrics manually.
## Rationale Metrics about sst bytes fetched via `get_ranges` are not collected. ## Detailed Changes Collect the missing metrics. ## Test Plan Check the metrics after executing a query.
## Rationale ## Detailed Changes ## Test Plan Pass CI
… once (apache#1369) ## Rationale The stats about the number of bytes fetched from object store should not include the low-frequency reading, e.g. compaction because such stats are used to show the query load distribution across the tables. ## Detailed Changes Ignore collecting the fetched bytes stats in the low-frequency reading. ## Test Plan The ci's tests should pass.
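The filtering above can be sketched with a read-frequency flag; the enum and function names here are assumptions, not the actual API:

```rust
// Sketch with assumed names: only frequent (query-path) reads feed the
// per-table fetched-bytes stats; low-frequency reads such as compaction
// are excluded so the stats reflect query load only.
#[derive(PartialEq)]
enum ReadFrequency {
    Frequent,
    Once,
}

fn should_collect_fetched_bytes(freq: &ReadFrequency) -> bool {
    *freq == ReadFrequency::Frequent
}

fn main() {
    assert!(should_collect_fetched_bytes(&ReadFrequency::Frequent));
    assert!(!should_collect_fetched_bytes(&ReadFrequency::Once)); // e.g. compaction
    println!("ok");
}
```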
## Rationale HTTP API is mainly used for debugging, and should not be blocked. ## Detailed Changes ## Test Plan Pass CI
…che#1365) ## Rationale Close apache#1105 ## Detailed Changes - Reduce the parameters in the sst write path - Avoid building dictionary for massive unique column values ## Test Plan - New unit test for the changeset - Observe the metrics for disable/enable dictionary encoding
## Rationale The code for deciding whether to collect metrics according to the read frequency is duplicated. ## Detailed Changes Remove the duplicated code. ## Test Plan CI.
…pache#1372) ## Rationale The column value set has been summarized into the metadata if the column has low cardinality. With such information, the sampling for those columns can be skipped. ## Detailed Changes Skip sampling over the low-cardinality columns. ## Test Plan Updated the unit tests.
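A sketch of that skip condition; the parameter name is illustrative:

```rust
// Sketch: if the sst metadata already records the column's distinct-value
// set, sampling can be skipped. The Option models "value set summarized
// into meta or not"; names are assumptions, not the actual meta schema.
fn needs_sampling(distinct_count_in_meta: Option<usize>) -> bool {
    // A recorded value set means the column was low-cardinality: skip.
    distinct_count_in_meta.is_none()
}

fn main() {
    assert!(!needs_sampling(Some(12))); // low-cardinality column, skipped
    assert!(needs_sampling(None)); // no summary in meta, sample as before
    println!("ok");
}
```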
## Rationale apache#1003 tries to avoid frequent flush requests which may generate massive small ssts, but the write stall is also removed in the normal write path. ## Detailed Changes Introduce the `min_flush_interval` to avoid frequent flush requests and recover the write stall mechanism. ## Test Plan Add unit tests for the frequent flush check.
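The `min_flush_interval` check can be sketched like this; the struct and method names are assumptions, not the engine's actual flush scheduler:

```rust
use std::time::{Duration, Instant};

// Sketch of the frequent-flush check; field and method names are assumed.
struct FlushLimiter {
    last_flush: Option<Instant>,
    min_flush_interval: Duration,
}

impl FlushLimiter {
    fn allow_flush(&self, now: Instant) -> bool {
        match self.last_flush {
            None => true,
            Some(last) => now.duration_since(last) >= self.min_flush_interval,
        }
    }
}

fn main() {
    let now = Instant::now();
    let limiter = FlushLimiter {
        last_flush: Some(now),
        min_flush_interval: Duration::from_secs(60),
    };
    // A flush right after the previous one is rejected...
    assert!(!limiter.allow_flush(now + Duration::from_secs(1)));
    // ...but allowed once the interval has elapsed.
    assert!(limiter.allow_flush(now + Duration::from_secs(61)));
    println!("ok");
}
```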
## Rationale Some logs about query are verbose and some key logs about opening shard are missing. ## Detailed Changes Remove verbose logs and add missing key logs. ## Test Plan CI.
…le (apache#1307) ## Rationale Closes apache#1302 The pulled arrow record batches are ensured to include primary key columns; however, the pulled primary key columns are unused in queries on append-mode tables. This PR refactors the whole record-batch pulling path for readability and to avoid pulling primary key columns when they are unused. ## Detailed Changes + Refactor `RowProjector` to `RecordFetchingContext`, which holds just the needed information, and pass it to `ScanRequest` & `SstReadOptions` rather than the too-heavy `ProjectedSchema`. + Refactor `RecordBatchWithKey` to `FetchingRecordBatch`, which holds the primary key indexes on demand. ## Test Plan Tested by existing and newly added tests.
## Rationale Reduce two `match` statements to only one. ## Detailed Changes ## Test Plan CI
## Rationale When the data WAL is disabled, data is still encoded, which wastes CPU. ## Detailed Changes Skip encoding when the data WAL is disabled. ## Test Plan CI.
## Rationale Currently there are no error logs for the remote server, which makes it hard to debug. ## Detailed Changes ## Test Plan No need.
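The skip described above can be sketched as follows; the types are simplified stand-ins for the engine's row and payload types:

```rust
// Sketch: only pay the encoding cost when the data WAL will actually
// persist the payload. Types are simplified stand-ins.
fn encode_rows(rows: &[&str]) -> Vec<u8> {
    rows.join("\n").into_bytes()
}

fn maybe_encode_for_wal(wal_disabled: bool, rows: &[&str]) -> Option<Vec<u8>> {
    if wal_disabled {
        None // skip the CPU cost entirely
    } else {
        Some(encode_rows(rows))
    }
}

fn main() {
    assert!(maybe_encode_for_wal(true, &["r1", "r2"]).is_none());
    assert_eq!(maybe_encode_for_wal(false, &["r1"]).unwrap(), b"r1".to_vec());
    println!("ok");
}
```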
## Rationale See apache#1405 ## Detailed Changes Disable percentile functions ## Test Plan CI
## Rationale Currently, the analyze SQL cannot obtain detailed metrics of a partitioned table. ## Detailed Changes Return metrics of the partitioned table to the remote client and then collect metrics in the client. ## Test Plan - Existing tests - Add new integration tests for explain analyze
## Rationale Close apache#1299 ## Detailed Changes - Add a PriorityRuntime component, and use it in the read API - For a normal query, its plan is executed in the higher-priority runtime by default; when the executor decides the query is expensive, it spawns `stream.poll` in the lower-priority runtime. - For a distributed query, a priority field is added to the remote query request, so the receiver can decide which runtime to run on. ## Test Plan Newly added UT
## Rationale In apache#1260, we implemented distributed analyze, but for queries that are not analyze queries, metrics were still returned, which led to a decrease in query performance. This PR fixes that: metrics are not returned for normal queries. ## Detailed Changes - Add an is_analyze field to determine whether the query is an analyze query ## Test Plan Existing tests --------- Co-authored-by: jiacai2050 <dev@liujiacai.net>
## Rationale Queries with long time ranges usually cost too many resources, which affects the stability of the whole cluster. ## Detailed Changes - Support blocking queries by query range ## Test Plan Manually ```bash curl 0:5000/admin/block -H 'content-type: application/json' -d ' { "operation": "Set", "write_block_list": [], "read_block_list": [], "block_rules": [ {"type": "QueryRange", "content": "24h"} ] }' ```
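The runtime-routing decision above can be sketched as follows; the enum and the selection logic are illustrative, not the actual PriorityRuntime API:

```rust
// Sketch of the routing decision; names are assumptions.
#[derive(Debug, PartialEq)]
enum Priority {
    High,
    Low,
}

fn pick_runtime(is_expensive: bool, remote_hint: Option<Priority>) -> Priority {
    // A distributed query carries an explicit priority; otherwise queries
    // start on the high-priority runtime and are demoted when found expensive.
    match remote_hint {
        Some(p) => p,
        None if is_expensive => Priority::Low,
        None => Priority::High,
    }
}

fn main() {
    assert_eq!(pick_runtime(false, None), Priority::High);
    assert_eq!(pick_runtime(true, None), Priority::Low);
    assert_eq!(pick_runtime(false, Some(Priority::Low)), Priority::Low);
    println!("ok");
}
```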
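The `QueryRange` rule from the curl example above can be sketched like this; the parsing is a simplified stand-in that only understands an `<n>h` suffix:

```rust
// Sketch of the QueryRange block-rule check; real parsing would handle
// more duration units. Names are illustrative.
fn parse_hours(rule: &str) -> Option<u64> {
    rule.strip_suffix('h')?.parse().ok()
}

fn is_blocked(query_range_hours: u64, rule: &str) -> bool {
    match parse_hours(rule) {
        Some(max_hours) => query_range_hours > max_hours,
        None => false, // unparseable rule: do not block
    }
}

fn main() {
    assert!(is_blocked(48, "24h")); // 48h query range exceeds the 24h rule
    assert!(!is_blocked(12, "24h"));
    println!("ok");
}
```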
## Rationale ## Detailed Changes - Attach endpoint to remote error ## Test Plan CI
## Rationale See apache#1040 ## Detailed Changes - Try load page indexes ## Test Plan CI
## Rationale Refer to issue apache#1386: currently, if the status of the shard is abnormal, we cannot get any useful information from the error message `table not found`. ## Detailed Changes * Add `TableStatus` in `cluster`; it can be used to get the status of a table in the current cluster. * Add `SchemaWithCluster`, which wraps the schema inside the cluster, so that the state of the cluster and the schema can be combined. ## Test Plan Pass CI.
…pache#1271) ## Rationale Conversion from the row format in the memtable to the record batch in DataFusion has been found to be a CPU bottleneck in production. To reduce the CPU cost, this PR implements a layered memtable framework to support gradual conversion during the normal write path (and before flush). ## Detailed Changes + Implement the layered memtable framework + Integrate it into the write path ## Test Plan Tested by new unit and integration tests.
## Rationale ## Detailed Changes - Disable seq check when wal is disabled - Fix request id in remote query. ## Test Plan
…ery process (apache#1431) ## Rationale When there is a cache miss in the disk cache, it will 1. Fetch data from remote 2. Insert data into the cache, which incurs disk IO 3. Return the data for the query. We can move the second step to another thread to avoid blocking the normal query process. ## Detailed Changes - Make the disk write nonblocking - Block in tests explicitly, otherwise they throw the error below > Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context. ## Test Plan CI
## Rationale Make it compatible with old table options. ## Detailed Changes When `layered_memtable_opts` is not found in `TableOptions`, we disable `layered_memtable`. ## Test Plan Tested manually.
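The hand-off described above can be sketched with a background thread; the real fix would use a runtime task, and the cache-write helper here is hypothetical:

```rust
use std::thread;

// Sketch: hand the cache insert to a background thread so the query path
// can return the fetched bytes immediately. A std-only illustration; the
// disk-cache write itself is represented by the closure body.
fn fetch_with_cache_miss(data: Vec<u8>) -> Vec<u8> {
    let for_cache = data.clone();
    let handle = thread::spawn(move || {
        // write_to_disk_cache(for_cache) would go here (hypothetical helper)
        for_cache.len()
    });
    // In the fix the query does NOT wait on the write; we join here only
    // so the example is deterministic.
    let written = handle.join().unwrap();
    assert_eq!(written, data.len());
    data
}

fn main() {
    let out = fetch_with_cache_miss(vec![1, 2, 3]);
    assert_eq!(out, vec![1, 2, 3]);
    println!("ok");
}
```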
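The compatibility default can be sketched as follows; the struct is a simplified stand-in for the persisted option:

```rust
// Sketch: treat a missing `layered_memtable_opts` in old table options as
// "layered memtable disabled". Field names are assumptions.
struct LayeredMemtableOpts {
    enable: bool,
}

fn layered_memtable_enabled(opts: Option<&LayeredMemtableOpts>) -> bool {
    opts.map_or(false, |o| o.enable)
}

fn main() {
    // Old tables have no such option persisted: disabled for compatibility.
    assert!(!layered_memtable_enabled(None));
    assert!(layered_memtable_enabled(Some(&LayeredMemtableOpts { enable: true })));
    println!("ok");
}
```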
@jiacai2050 When resolving the conflicts, a field called …
LGTM
Shouldn't approve until we can choose not to squash merge.
LGTM
## Rationale
For #1319, some breaking changes were introduced, and for fast development the `dev` branch was chosen as the base branch for merging all the change sets. Now that #1319 has been fixed, the change sets can be merged into the new main branch.

## Detailed Changes
Resolve the conflicts and merge the change sets into the main branch.

## Test Plan
Should pass the CI and the manual compatibility test.