[YSQL] Use lightweight protobuf for writing to tserver #11720

Closed
lnguyen-yugabyte opened this issue Mar 10, 2022 · 2 comments

lnguyen-yugabyte commented Mar 10, 2022

Description

Currently, when running a bulk insert / COPY command, about 15 percent of CPU time in the postgres backend is spent on memory allocation / deallocation.

     6.12%  postgres         libtcmalloc.so.4.5.6         [.] operator new[]
     5.42%  postgres         postgres                     [.] CopyReadLine
     2.48%  postgres         postgres                     [.] NextCopyFromRawFields
     2.46%  postgres         libtcmalloc.so.4.5.6         [.] operator delete[]
     2.32%  postgres         libyb_pggate.so              [.] yb::pggate::PgTableDesc::GetColumnInfo
     2.25%  postgres         libyb_pggate.so              [.] yb::pggate::PgApiImpl::NewConstant
     2.05%  postgres         postgres                     [.] nocachegetattr
     1.88%  postgres         libc-2.23.so                 [.] __memcpy_avx_unaligned
     1.78%  postgres         libtcmalloc.so.4.5.6         [.] tcmalloc::CentralFreeList::FetchFromOneSpansSafe
     1.48%  postgres         postgres                     [.] pg_verify_mbstr_len
     1.37%  postgres         libyb_common.so              [.] yb::QLType::Create
     1.35%  postgres         postgres                     [.] GetTablePrimaryKeyBms
     1.21%  rpc_tp_pggate_y  libtcmalloc.so.4.5.6         [.] operator delete[]
     1.11%  postgres         postgres                     [.] YBCExecuteInsertInternal
     1.10%  postgres         libyb_common_proto.so        [.] yb::QLValuePB::ByteSizeLong
     1.08%  rpc_tp_pggate_y  libtcmalloc.so.4.5.6         [.] tcmalloc::CentralFreeList::ReleaseToSpans
     1.03%  postgres         libyb_pggate.so              [.] yb::pggate::PgExpr::internal_type
     1.00%  postgres         libyb_pggate.so              [.] boost::unordered::detail::table<boost::unordered::detail::map<std::__1::allocator<std::__1::pair<yb::PgsqlExpressionPB* const, yb::pggate::PgExpr*> >, yb::PgsqlExpressionPB*, yb::pggate::PgExpr*, boost::hash<yb::PgsqlExpressionPB*>, std::__1::equal_to<yb::PgsqlExpressionPB*> > >::try_emplace_unique<yb::PgsqlExpressionPB* const&>
     0.98%  postgres         libtcmalloc.so.4.5.6         [.] tcmalloc::CentralFreeList::RemoveRange
     0.98%  postgres         libyb_common_proto.so        [.] yb::PgsqlExpressionPB::ByteSizeLong
     0.95%  pggate_ybclient  libcrypto.so.1.1             [.] _aesni_ctr32_ghash_6x
     0.92%  rpc_tp_pggate_y  libyb_common_proto.so        [.] yb::PgsqlWriteRequestPB::~PgsqlWriteRequestPB
     0.92%  postgres         libyb_common_proto.so        [.] yb::QLValuePB::clear_value

We could therefore see a good reduction in copy time by optimizing this allocation/deallocation. One idea is to use Protobuf's arena allocation: https://developers.google.com/protocol-buffers/docs/reference/arenas. This would optimize allocation and deallocation of the protobufs themselves.

We'd like to make this change for the copy/write path to the tserver and measure how much the copy time drops with it enabled.
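
To illustrate why an arena helps with the `operator new[]` / `operator delete[]` overhead in the profile above, here is a minimal bump-pointer arena sketch. It is an assumption-laden illustration only: `BumpArena`, `Allocate`, `Create`, and `block_count` are hypothetical names, and this is neither Protobuf's `google::protobuf::Arena` nor the YugabyteDB implementation. The point is that many small objects share a few large blocks and are all released together when the arena is destroyed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical bump-pointer arena, for illustration only.
// Objects are carved out of large blocks; nothing is freed individually.
// All memory is released at once when the arena goes out of scope.
class BumpArena {
 public:
  explicit BumpArena(std::size_t block_size = 4096) : block_size_(block_size) {}

  // Returns `size` bytes aligned to `align`, which must be a power of two.
  void* Allocate(std::size_t size,
                 std::size_t align = alignof(std::max_align_t)) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
    std::uintptr_t p = (cur_ + mask) & ~mask;
    if (blocks_.empty() || p + size > end_) {
      // The current block is exhausted: grab a new one that is large enough
      // for this request even after alignment padding.
      const std::size_t cap = std::max(block_size_, size + align);
      blocks_.push_back(std::make_unique<unsigned char[]>(cap));
      cur_ = reinterpret_cast<std::uintptr_t>(blocks_.back().get());
      end_ = cur_ + cap;
      p = (cur_ + mask) & ~mask;
    }
    cur_ = p + size;  // Bump the pointer; this is the whole "allocation".
    return reinterpret_cast<void*>(p);
  }

  // Constructs a T inside the arena. Destructors are never run, so this
  // sketch only accepts trivially destructible types.
  template <class T, class... Args>
  T* Create(Args&&... args) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "BumpArena never runs destructors");
    return new (Allocate(sizeof(T), alignof(T))) T(std::forward<Args>(args)...);
  }

  std::size_t block_count() const { return blocks_.size(); }

 private:
  std::size_t block_size_;
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
  std::uintptr_t cur_ = 0;  // Next free address in the current block.
  std::uintptr_t end_ = 0;  // One past the end of the current block.
};
```

With this sketch, building a request out of many small values costs one heap allocation per block rather than one per value, and teardown is a handful of block frees instead of a `delete` per object.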

lnguyen-yugabyte added the area/ysql Yugabyte SQL (YSQL) label Mar 10, 2022
lnguyen-yugabyte self-assigned this Mar 10, 2022
lnguyen-yugabyte added this to To do in Performance via automation Mar 10, 2022
spolitov added a commit that referenced this issue Mar 25, 2022
Summary:
We use Google protobuf for communication between nodes and between postgres and the tserver.
The native protobuf implementation allocates a lot of memory while generating requests.
This diff switches the Perform RPC to our lightweight protobuf implementation,
which uses an arena for allocation and tries to avoid allocations as much as possible.

Test Plan: Jenkins

Reviewers: lnguyen

Reviewed By: lnguyen

Subscribers: smishra, ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D16088
@lnguyen-yugabyte

Initial performance test (on copying 10M rows):

Baseline: 5:27.483
Lightweight protobuf: 5:05.505

(i.e. copy time reduced by about 6.7% compared to baseline).

spolitov added a commit that referenced this issue Mar 28, 2022
Summary: This diff adds arena to PgStatement and switches PgConstant to LWQLValuePB to avoid superfluous conversions.

Test Plan: Jenkins

Reviewers: lnguyen

Reviewed By: lnguyen

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D16220
lnguyen-yugabyte commented Mar 28, 2022

Arena on PgConstant: 4:43.265 (13.5% time reduction compared to baseline).
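
For anyone re-checking the percentages, the arithmetic behind the reported timings can be sketched as below. The helper names `ParseMinSec` and `ReductionPercent` are hypothetical, and the code assumes the timings are in `m:ss.mmm` form as written in this thread. Fed the three timings above, it gives roughly 6.7% for the lightweight protobuf run and 13.5% for the PgConstant arena run, both relative to the 5:27.483 baseline.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Parses a duration written as "m:ss.mmm" (assumed format) into seconds.
double ParseMinSec(const std::string& s) {
  int minutes = 0;
  double seconds = 0.0;
  const int matched = std::sscanf(s.c_str(), "%d:%lf", &minutes, &seconds);
  assert(matched == 2);  // Both fields must be present.
  (void)matched;         // Avoid an unused-variable warning with NDEBUG.
  return minutes * 60.0 + seconds;
}

// Relative time saved versus the baseline, in percent.
double ReductionPercent(double baseline_sec, double measured_sec) {
  return (baseline_sec - measured_sec) / baseline_sec * 100.0;
}
```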

spolitov added a commit that referenced this issue Mar 29, 2022
…protobufs

Summary:
Currently we can evaluate expressions represented as Google protobufs.
This diff adds the ability to evaluate expressions represented as lightweight protobufs.

Test Plan: Jenkins

Reviewers: bogdan, dmitry

Reviewed By: bogdan, dmitry

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D16221
Performance automation moved this from To do to Done May 4, 2022