[DocDB] Pack inserted value #20713
Labels
area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), priority/medium (Medium priority issue)
Comments
spolitov added the area/docdb (YugabyteDB core features) and status/awaiting-triage (Issue awaiting triage) labels on Jan 20, 2024
yugabyte-ci added the kind/bug (This issue is a bug) and priority/medium (Medium priority issue) labels on Jan 20, 2024
Note: inserting 2 rows with 10 column values:
spolitov added a commit that referenced this issue on Jan 23, 2024:
Summary: When inserting a row with multiple values, we send every column as a separate protobuf with a complex structure and many duplicated fields. As a result, during bulk load we spend a significant amount of time parsing and analyzing those protobufs. Instead, we could pack all values in the postgres layer and then insert them directly into docdb.

Controlled by the newly added preview flag ysql_pack_inserted_value. Currently the row is packed using the v1 encoding, because v2 is not yet in a release state.

Performance comparison using PgSingleTServerTest.ScanWithPackedRow insert time:
master (fac37c6) - 30.08s
this diff - 25.30s

Bulk load comparison, 30M rows, using the following script:
```
drop table if exists test_table;
create extension if not exists pgcrypto;
create table test_table(k INT, v1 INT, v2 INT, v3 INT, v4 INT, v5 INT, PRIMARY KEY(k ASC));
do $$
begin
  for counter in 1..30000 loop
    if counter % 1000 = 0 then
      raise notice 'counter: %', (counter * 1000);
    end if;
    insert into test_table (
      select i + counter*1000,
             random()*1000000000, random()*1000000000, random()*1000000000,
             random()*1000000000, random()*1000000000
      from generate_series(1, 1000) i);
    commit;
  end loop;
end $$;
```
master - 14m51.963s
this diff - 12m19.418s

Jira: DB-9716
Test Plan: PgPackedInsertTest
Reviewers: tnayak, mbautin
Reviewed By: mbautin
Subscribers: yql, mbautin, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D31602
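As a rough local sanity check of the same pattern (a sketch, not the benchmark above: the table name and row count here are arbitrary, and the ysql_pack_inserted_value preview flag must be enabled on the cluster for the packed path to be exercised), a scaled-down variant of the script can be timed from ysqlsh:

```
-- Scaled-down, single-statement variant of the bulk-load script above
-- (arbitrary table name and size); \timing is a psql/ysqlsh meta-command.
\timing on
drop table if exists test_table_small;
create table test_table_small(k INT, v1 INT, v2 INT, v3 INT, v4 INT, v5 INT, PRIMARY KEY(k ASC));
insert into test_table_small
  select i,
         random()*1000000000, random()*1000000000, random()*1000000000,
         random()*1000000000, random()*1000000000
  from generate_series(1, 100000) i;
```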
spolitov added a commit that referenced this issue on Feb 22, 2024:
Summary: Currently, when inserting multiple rows into a single table, we generate independent operations. There is a lot of duplicated information in those operations, so we have to repeat the same steps while processing them, such as resolving the table, checking the schema version, etc., and we also have to allocate this repeated data on the sender side. This diff changes the protocol to use a single write operation in that case.

The feature is not completely ready, because it works only for inserts to the same tablet; this should be addressed in follow-up diffs. The recently added preview gflag ysql_pack_inserted_value can be used to enable/disable the feature.

Performance comparison using PgSingleTServerTest.ScanWithPackedRow:
```
this diff:
Insert full time - 17.66s
Insert TServer time - 11.37s

master (d3fca95, packed insert disabled):
Insert full time - 32.44s
Insert TServer time - 15.00s

master (d3fca95, packed insert enabled):
Insert full time - 26.49s
Insert TServer time - 13.38s
```

Jira: DB-9716
Test Plan: Jenkins
Reviewers: mbautin, tnayak
Reviewed By: mbautin, tnayak
Subscribers: yql, bogdan, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D31930
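To illustrate the same-tablet restriction mentioned above (a hedged sketch: the table names are made up, and actual tablet placement depends on the cluster's sharding and splitting settings):

```
-- Hypothetical illustration of the same-tablet case described above.
-- A new range-sharded table starts as a single tablet (absent explicit splitting),
-- so all rows of this multi-row insert target the same tablet and can be
-- combined into one write operation once the feature is enabled.
create table range_t(k INT, v INT, PRIMARY KEY(k ASC));
insert into range_t values (1, 10), (2, 20), (3, 30);

-- A hash-partitioned table spreads rows across multiple tablets by default,
-- so the same multi-row insert generally spans tablets and falls back to the
-- existing per-tablet operations.
create table hash_t(k INT PRIMARY KEY, v INT);
insert into hash_t values (1, 10), (2, 20), (3, 30);
```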
Jira Link: DB-9716
Description
When inserting a row with multiple values, we send every column as a separate protobuf with a complex structure and many duplicated fields.
As a result, during bulk load we spend a significant amount of time parsing and analyzing those protobufs.
Instead, we could pack all the values in the postgres layer and then insert them directly into docdb.
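For illustration (a minimal sketch with made-up table and column names), this is the kind of statement affected: without packing, each of the ten column values per row travels to DocDB as its own protobuf entry, while with packing the whole row would be serialized once in the postgres layer.

```
-- Hypothetical example: each of the ten column values below is currently sent
-- to DocDB as a separate protobuf entry; a packed insert would serialize the
-- whole row once on the postgres side instead.
create table wide_t(
  k INT, c1 INT, c2 INT, c3 INT, c4 INT,
  c5 INT, c6 INT, c7 INT, c8 INT, c9 INT,
  PRIMARY KEY(k ASC)
);
insert into wide_t values
  (1, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  (2, 1, 2, 3, 4, 5, 6, 7, 8, 9);
```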
Issue Type
kind/bug