Make copy fetcher more async #5362

erimatnor · 2023-02-24T17:40:15Z

Make the copy fetcher more asynchronous by separating the sending of the data request from the receiving of the response. This lets the async append node send the request to every data node before it starts reading the first response. This can massively improve performance, because a response isn't returned until the remote node has finished executing the query and is ready to return the first tuple; with all requests sent up front, the data nodes can execute their queries in parallel instead of one after another.
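To illustrate why the ordering matters, here is a minimal toy model of the two request patterns. It is not the TimescaleDB implementation: `DataNode`, `exec_time`, and both function names are hypothetical, network latency is ignored, and the data nodes are assumed to execute their queries independently of each other.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataNode:
    """Hypothetical stand-in for a remote data node."""
    name: str
    exec_time: float  # seconds from receiving the request to the first tuple


def first_tuples_send_and_wait(nodes: List[DataNode]) -> float:
    """Old pattern: send one request, wait for its first tuple, then move on."""
    clock = 0.0
    for node in nodes:
        sent_at = clock                   # request leaves only after the previous wait
        clock = sent_at + node.exec_time  # block until this node can respond
    return clock


def first_tuples_send_all_first(nodes: List[DataNode]) -> float:
    """New pattern: send every request up front, then read responses in order."""
    sent_at = 0.0  # all requests are sent before any response is read
    clock = 0.0
    for node in nodes:
        ready_at = sent_at + node.exec_time
        clock = max(clock, ready_at)      # wait only if the node is not ready yet
    return clock


if __name__ == "__main__":
    nodes = [DataNode("dn1", 2.0), DataNode("dn2", 3.0), DataNode("dn3", 1.0)]
    print("send and wait :", first_tuples_send_and_wait(nodes))
    print("send all first:", first_tuples_send_all_first(nodes))
```

Under these assumptions, the time until every node has delivered its first tuple falls from the sum of the per-node execution times to the largest of them.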

codecov · 2023-02-24T17:52:24Z

Codecov Report

Merging #5362 (e31bd33) into main (750e69e) will decrease coverage by 0.03%.
The diff coverage is 68.00%.

@@            Coverage Diff             @@
##             main    #5362      +/-   ##
==========================================
- Coverage   90.67%   90.64%   -0.03%     
==========================================
  Files         226      226              
  Lines       52524    52533       +9     
==========================================
- Hits        47626    47619       -7     
- Misses       4898     4914      +16

| Impacted Files | Coverage Δ | |
|---|---|---|
| tsl/src/remote/copy_fetcher.c | `85.22% <68.00%> (-1.79%)` | ⬇️ |
| src/loader/bgw_launcher.c | `89.51% <0.00%> (-2.55%)` | ⬇️ |
| src/loader/bgw_message_queue.c | `86.36% <0.00%> (-2.28%)` | ⬇️ |
| tsl/src/bgw_policy/job.c | `87.54% <0.00%> (-0.05%)` | ⬇️ |
| src/compat/compat.h | `96.61% <0.00%> (+6.13%)` | ⬆️ |


pmwkaa

Looks good


@justinozavala

This release contains bug fixes since the 2.10.0 release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#5364 Fix num_chunks inconsistency in hypertables view
* timescale#5362 Make copy fetcher more async
* timescale#5336 Use NameData and namestrcpy for names
* timescale#5317 Fix some incorrect memory handling
* timescale#5367 Rename columns in old-style continuous aggregates
* timescale#5343 Set PortalContext when starting job
* timescale#5360 Fix uninitialized bucket_info variable
* timescale#5367 Fix column name handling in old-style continuous aggregates
* timescale#5378 Fix multinode DML HA performance regression
* timescale#5384 Fix Hierarchical Continuous Aggregates chunk_interval_size
* timescale#5153 Fix concurrent locking with chunk_data_node table

**Thanks**
* @justinozavala for reporting an issue with PL/Python procedures in the background worker
* @Medvecrab for discovering an issue with copying NameData when forming heap tuples.
* @pushpeepkmonroe for discovering an issue in upgrading old-style continuous aggregates with renamed columns


@justinozavala

This release contains bug fixes since the 2.10.0 release. We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#5159 Support Continuous Aggregates names in hypertable_(detailed_)size
* timescale#5226 Fix concurrent locking with chunk_data_node table
* timescale#5317 Fix some incorrect memory handling
* timescale#5336 Use NameData and namestrcpy for names
* timescale#5343 Set PortalContext when starting job
* timescale#5360 Fix uninitialized bucket_info variable
* timescale#5362 Make copy fetcher more async
* timescale#5364 Fix num_chunks inconsistency in hypertables view
* timescale#5367 Fix column name handling in old-style continuous aggregates
* timescale#5378 Fix multinode DML HA performance regression
* timescale#5384 Fix Hierarchical Continuous Aggregates chunk_interval_size

**Thanks**
* @justinozavala for reporting an issue with PL/Python procedures in the background worker
* @Medvecrab for discovering an issue with copying NameData when forming heap tuples.
* @pushpeepkmonroe for discovering an issue in upgrading old-style continuous aggregates with renamed columns



erimatnor added the multinode label Feb 24, 2023

erimatnor requested a review from akuzm February 24, 2023 17:40

erimatnor self-assigned this Feb 24, 2023

erimatnor marked this pull request as ready for review February 27, 2023 10:58

erimatnor requested review from pmwkaa and nikkhils February 27, 2023 10:59

erimatnor force-pushed the async-copy-fetcher branch from ccc0946 to 7f051d5 Compare February 27, 2023 11:08

pmwkaa approved these changes Feb 28, 2023


akuzm approved these changes Feb 28, 2023


vineethapai added this to the TimescaleDB 2.10.1 milestone Mar 1, 2023

erimatnor force-pushed the async-copy-fetcher branch 3 times, most recently from af918a3 to 853ed85 Compare March 2, 2023 08:56

vineethapai removed this from the TimescaleDB 2.10.1 milestone Mar 2, 2023

erimatnor force-pushed the async-copy-fetcher branch from 853ed85 to e31bd33 Compare March 2, 2023 13:20

jfjoly added this to the TimescaleDB 2.11 milestone Mar 2, 2023

erimatnor enabled auto-merge (rebase) March 2, 2023 13:42

erimatnor merged commit 386d31b into timescale:main Mar 2, 2023

svenklemm mentioned this pull request Mar 6, 2023

Release 2.10.1 #5386

Merged

timescale-automation added the backported-2.10.x label Apr 14, 2023
