feat(storage): store all column descs in cell based table #3344

Merged
16 commits merged into main from bz/vnode-in-key-part-2 on Jun 21, 2022

Conversation


@BugenZhao BugenZhao commented Jun 20, 2022

What's changed and what's your intention?

This PR stores all column descs in the cell-based table and adds another field, output_ids, to control which columns are serialized and deserialized.
This is a prerequisite for supporting point gets and range scans with the vnode encoded in the key.
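
As an illustration only (this is not the PR's code), the idea can be sketched as a table that keeps descriptors for every column and resolves a list of output column ids into indices over those columns. `TableSketch`, the simplified `ColumnDesc`, and `project` are hypothetical names; the real `CellBasedTable` has more fields and different signatures.

```rust
/// Hypothetical, simplified stand-in for a column descriptor.
#[derive(Debug, Clone, PartialEq)]
pub struct ColumnDesc {
    pub column_id: i32,
    pub name: String,
}

/// Sketch of a table that knows all of its columns but outputs only some.
#[derive(Debug)]
pub struct TableSketch {
    /// Descriptors of all columns of the table.
    pub table_columns: Vec<ColumnDesc>,
    /// For each output column, its index into `table_columns`.
    pub output_indices: Vec<usize>,
}

impl TableSketch {
    /// Resolves `output_ids` against the full column list.
    /// Returns `None` if any requested column id does not exist.
    pub fn new(table_columns: Vec<ColumnDesc>, output_ids: &[i32]) -> Option<Self> {
        let output_indices = output_ids
            .iter()
            .map(|id| table_columns.iter().position(|c| c.column_id == *id))
            .collect::<Option<Vec<usize>>>()?;
        Some(Self {
            table_columns,
            output_indices,
        })
    }

    /// Projects a full row (one value per table column) onto the output columns.
    pub fn project<T: Clone>(&self, full_row: &[T]) -> Vec<T> {
        self.output_indices
            .iter()
            .map(|&i| full_row[i].clone())
            .collect()
    }
}
```

With columns whose ids are 10, 11, and 12 and output ids `[12, 10]`, the resolved indices in this sketch are `[2, 0]`, so only the third and first values of a row are emitted, in that order. Keeping the full column list around is what would let key-related columns be located even when they are not part of the output.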

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@BugenZhao BugenZhao changed the title from "feat(storage): store pk indices & full column descs in cell based table" to "feat(storage): store pk indices & full column descs in cell based table [WIP]" Jun 20, 2022
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@BugenZhao BugenZhao marked this pull request as ready for review June 21, 2022 06:42
@BugenZhao BugenZhao changed the title from "feat(storage): store pk indices & full column descs in cell based table [WIP]" to "feat(storage): store pk indices & full column descs in cell based table" Jun 21, 2022
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>

codecov bot commented Jun 21, 2022

Codecov Report

Merging #3344 (39709ec) into main (bc74ff4) will increase coverage by 0.00%.
The diff coverage is 72.22%.

@@           Coverage Diff           @@
##             main    #3344   +/-   ##
=======================================
  Coverage   73.63%   73.63%           
=======================================
  Files         760      760           
  Lines      104231   104285   +54     
=======================================
+ Hits        76752    76795   +43     
- Misses      27479    27490   +11     
Flag Coverage Δ
rust 73.63% <72.22%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/batch/src/executor/row_seq_scan.rs 19.25% <0.00%> (-4.60%) ⬇️
...ntend/src/optimizer/plan_node/stream_index_scan.rs 23.47% <0.00%> (+3.02%) ⬆️
src/stream/src/from_proto/batch_query.rs 0.00% <0.00%> (ø)
...ecutor/managed_state/top_n/top_n_bottom_n_state.rs 82.70% <50.00%> (ø)
src/storage/src/cell_based_row_serializer.rs 65.51% <62.50%> (+9.70%) ⬆️
src/storage/src/table/cell_based_table.rs 75.81% <98.24%> (+2.94%) ⬆️
src/common/src/catalog/physical_table.rs 48.14% <100.00%> (+18.14%) ⬆️
src/common/src/util/ordered/serde.rs 91.33% <100.00%> (+0.39%) ⬆️
...frontend/src/optimizer/plan_node/batch_seq_scan.rs 95.32% <100.00%> (-0.44%) ⬇️
...c/frontend/src/optimizer/plan_node/logical_scan.rs 100.00% <100.00%> (ø)
... and 15 more

@BugenZhao BugenZhao changed the title from "feat(storage): store pk indices & full column descs in cell based table" to "feat: store pk indices & full column descs in cell based table" Jun 21, 2022
@BugenZhao BugenZhao changed the title from "feat: store pk indices & full column descs in cell based table" to "feat: store pk indices & all column descs in cell based table" Jun 21, 2022
@BugenZhao BugenZhao changed the title from "feat: store pk indices & all column descs in cell based table" to "feat: store all column descs in cell based table" Jun 21, 2022
@BugenZhao BugenZhao changed the title from "feat: store all column descs in cell based table" to "feat(storage): store all column descs in cell based table" Jun 21, 2022
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@BugenZhao BugenZhao requested a review from xx01cyx June 21, 2022 07:58
@@ -52,7 +52,8 @@ message ColumnCatalog {

 message CellBasedTableDesc {
   uint32 table_id = 1;
-  repeated OrderedColumnDesc order_key = 2;
+  repeated ColumnDesc columns = 2;
+  repeated OrderedColumnDesc order_key = 3;
Contributor

I think we should also store pk and distribution key here?

Member Author

You are very forward-looking! Plan to do this in next PRs. cc @st1page

let order_types: Vec<OrderType> = pk_descs.iter().map(|desc| desc.order).collect();

let pk_indices = pk_descs
Contributor

Can we pass pk_indices from the frontend, just like top_n or join?

Member Author

You are very forward-looking! Plan to do this in next PRs. cc @st1page

Contributor

@skyzh skyzh Jun 21, 2022

🥵🥵 forward 🥵🥵 looking!
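
The quoted snippet is cut off, so the following is not the PR's code. It is a sketch of one way pk indices could be derived on the storage side by looking up each pk column's id in the full column list, which is the work the reviewer suggests moving to the frontend. The types here are simplified, hypothetical stand-ins, and `derive_pk_layout` is an invented name.

```rust
/// Hypothetical ordering direction for a pk column.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OrderType {
    Ascending,
    Descending,
}

/// Hypothetical, simplified column descriptor.
#[derive(Debug, Clone)]
pub struct ColumnDesc {
    pub column_id: i32,
}

/// Hypothetical, simplified pk column descriptor: a column plus its order.
#[derive(Debug, Clone)]
pub struct OrderedColumnDesc {
    pub column_desc: ColumnDesc,
    pub order: OrderType,
}

/// Derives the order types and the pk indices (positions in `table_columns`).
/// Returns `None` if a pk column is missing from `table_columns`.
pub fn derive_pk_layout(
    table_columns: &[ColumnDesc],
    pk_descs: &[OrderedColumnDesc],
) -> Option<(Vec<OrderType>, Vec<usize>)> {
    let order_types: Vec<OrderType> = pk_descs.iter().map(|desc| desc.order).collect();

    let pk_indices = pk_descs
        .iter()
        .map(|desc| {
            table_columns
                .iter()
                .position(|c| c.column_id == desc.column_desc.column_id)
        })
        .collect::<Option<Vec<usize>>>()?;

    Some((order_types, pk_indices))
}
```

If the frontend passed `pk_indices` directly, as the review comment proposes, the lookup by column id in this sketch would no longer be needed in storage.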

/// not required.
/// Indices of distribution keys for computing value meta. None if value meta is not required.
/// Note that the index is based on the all columns of the table, instead of the output ones.
// FIXME: revisit constructions and usages.
dist_key_indices: Option<Vec<usize>>,
Contributor

Which is better, Option<Vec<usize>> or Vec<usize>? 🤔

Member Author

Good catch. I think both are acceptable.
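
One way to look at the trade-off, as a sketch under assumptions rather than the PR's implementation: with `Option`, "value meta not required" (`None`) stays distinct from "required, but with no distribution key columns" (`Some` of an empty slice), whereas a plain `Vec<usize>` would have to overload the empty vector for both. The function `value_meta_hash` below is hypothetical and uses the standard library's `DefaultHasher` purely for illustration; it is not how the project computes vnodes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes the distribution key columns of `row`, if value meta is required.
///
/// `dist_key_indices` index into the full row (all table columns),
/// not into the output columns.
pub fn value_meta_hash(row: &[i64], dist_key_indices: Option<&[usize]>) -> Option<u64> {
    // `None` means value meta is not required, so nothing is computed.
    let indices = dist_key_indices?;

    let mut hasher = DefaultHasher::new();
    for &i in indices {
        row[i].hash(&mut hasher);
    }
    Some(hasher.finish())
}
```

In this sketch, two rows that agree on the distribution key columns produce the same hash regardless of their other columns, and passing `None` skips the computation entirely.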

Co-authored-by: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com>
Contributor

@wcy-fdu wcy-fdu left a comment

LGTM!
Will the next PR change Vnode encoding?

@BugenZhao BugenZhao added the mergify/can-merge label Jun 21, 2022
@BugenZhao
Member Author

> LGTM!
> Will the next PR change Vnode encoding?

Working hard on this!

@mergify mergify bot merged commit 78861e3 into main Jun 21, 2022
@mergify mergify bot deleted the bz/vnode-in-key-part-2 branch June 21, 2022 18:19
cnissnzg added a commit that referenced this pull request Jun 22, 2022
commit aac69d2
Author: William Wen <44139337+wenym1@users.noreply.github.com>
Date:   Wed Jun 22 17:23:31 2022 +0800

    fix(storage): fix ignoring delete record when getting from sst (#3405)

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit ebfbd0c
Author: Alex Chi <iskyzh@gmail.com>
Date:   Wed Jun 22 17:01:53 2022 +0800

    chore(storage): move read lock acquire out of lock (#3400)

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit c5b4fe2
Author: lmatz <lmatz823@gmail.com>
Date:   Wed Jun 22 01:49:23 2022 -0700

    fix(expr): trim (#3402)

commit 28cf5ff
Author: Wallace <bupt2013211450@gmail.com>
Date:   Wed Jun 22 16:29:03 2022 +0800

    feat(compaction): compress data with dynamic level (#3388)

    * do not compress data in high level

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit 7602285
Author: Lee Zong Yu <65748142+marvenlee2486@users.noreply.github.com>
Date:   Wed Jun 22 15:58:31 2022 +0800

    fix(grafana): Edit update.sh (#3399)

    * Edit update.sh

    * Edit format error

    * Add payload to gitignore

    * Revert risedev.yml modification

    * Fix wrong gitignore

    * Fix wrong gitignore

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 4e66ca3
Author: congyi <58715567+wcy-fdu@users.noreply.github.com>
Date:   Wed Jun 22 13:16:48 2022 +0800

    feat(docs): add docs for the relational table layer (#3313)

    * add doc for relational layer

commit 7c44a15
Author: Bugen Zhao <i@bugenzhao.com>
Date:   Wed Jun 22 12:51:59 2022 +0800

    fix: build failure of cell based table with release profile (#3395)

commit 93c62a3
Author: Li0k <yuli@singularity-data.com>
Date:   Wed Jun 22 11:50:18 2022 +0800

    chore(frontend): fix handle_with_properties ctx of explain (#3394)

commit 233ee5d
Author: Li0k <yuli@singularity-data.com>
Date:   Wed Jun 22 11:32:05 2022 +0800

    feat(frontend): catalog add properties for ttl (#3382)

    * feat(frontend): catlog add properties for materialized_view and materialized_source to support ttl

    * fix(frontend): fix test_runner

    * chore(frontend): unify handle_with_properties logic for handler

    * chore(frontend): explain introduce handle_with_properties to replace WithProperties

    * chore(sqlparser): remove unused logic of WithProperties

commit 78861e3
Author: Bugen Zhao <i@bugenzhao.com>
Date:   Wed Jun 22 02:19:37 2022 +0800

    feat(storage): store all column descs in cell based table (#3344)

    * add pk indices

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * extract mapping

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * distinguish parital table

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * refactor proto

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * fix clippy

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * fix output schema

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * store column ids in serializer

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * refine docs

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * fix plan test

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * refine docs

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * write check

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    * Apply suggestions from code review

    Co-authored-by: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com>

    * trigger ci

    Signed-off-by: Bugen Zhao <i@bugenzhao.com>

    Co-authored-by: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com>
    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit bc74ff4
Author: Alex Chi <iskyzh@gmail.com>
Date:   Tue Jun 21 22:59:22 2022 +0800

    feat(streaming): use tokio channels and better coop scheduling (#3374)

    * feat(streaming): use tokio channels and better coop scheduling

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    * more info when deploy

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    * remove manual coop scheduling

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    * fix

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

commit 4b33d95
Author: Huangjw <1223644280@qq.com>
Date:   Tue Jun 21 18:32:36 2022 +0800

    chore(ci): update ci image vesion and remove changelog.json (#3384)

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit ed1de15
Author: Alex Chi <iskyzh@gmail.com>
Date:   Tue Jun 21 18:19:02 2022 +0800

    feat(risedev): fix meta args and support healthcheck in docker (#3378)

    * feat(risedev): support healthcheck in docker

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    * fix

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    * fix

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    * fix

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    * pin version

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 5a7e115
Author: Alex Chi <iskyzh@gmail.com>
Date:   Tue Jun 21 18:05:56 2022 +0800

    chore(ci): manual CLA check (#3386)

    Signed-off-by: Alex Chi <iskyzh@gmail.com>

commit 2688491
Author: ZENOTME <43447882+ZENOTME@users.noreply.github.com>
Date:   Tue Jun 21 18:03:02 2022 +0800

    feat(pgwire): support row limit in extended query mode (#3354)

    * * add row_end flag in PgResponse
    * add values interface in PgResponse

    * * add result_cache
    * add execute interface

    * fix complile error of row_end

    * * split process_query_msg into process_query_msg_simple and process_query_response

    execute flow:

    simple mode:                        extended query mode:
    process_query_msg_simple              portal.execute()
        |   (response)             (response)  |
    process_query_response ---------------------
        |
    process_query_with_results

    * fix clippy

    * add process at Execute

    * add PortalSuspended msg

    * add row_limit process at process_query_with_results

    * fix small problem and add some comment

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit dc8631e
Author: Tao Wu <wutao@singularity-data.com>
Date:   Tue Jun 21 17:35:36 2022 +0800

    test: sqlsmith supports generating unary func (#3370)

commit cb0d7c1
Author: Bowen <36908971+BowenXiao1999@users.noreply.github.com>
Date:   Tue Jun 21 17:16:10 2022 +0800

    refactor: deparallel hash agg apply batch + remove Arc Mutex (#3377)

    * refactor: deparallel hash agg apply batch

    * remove Arc Mutex

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
mergify bot pushed a commit that referenced this pull request Jun 27, 2022
…3363)

* implement cascade && restrict by add relation map

* fmt

* implement cascade && restrict by add relation map

* fmt

* correct pre unit test error

* correct misc test error

* fix

* update proto && fix

* change grantor to granted by

* implement recursive privilege relations && fix

* Squashed commit of the following: commits aac69d2 through cb0d7c1, identical to the commit list given in full above

* add unit test

* fmt

* correct pre test

* some fix

Co-authored-by: August <pin@singularity-data.com>
Labels
mergify/can-merge, type/feature
4 participants