Make copy fetcher more async #5362

Merged
merged 1 commit into from Mar 2, 2023

erimatnor
Contributor

Make the copy fetcher more asynchronous by separating the sending of the request for data from the receiving of the response. With that split, the async append node can send the request to every data node before it starts reading the first response. This can improve performance substantially: a data node does not return its response until it has finished executing the query and is ready to return the first tuple, so sending all requests up front lets the data nodes execute their queries concurrently instead of one after another.
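
To illustrate why the ordering matters, here is a minimal sketch of the two strategies as a deterministic simulation. It is not the code in `tsl/src/remote/copy_fetcher.c`; the class `SimulatedDataNode`, its methods, and the two-second query time are hypothetical and exist only for illustration. Each simulated node starts executing when it receives a request and blocks the reader until its first tuple is ready.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimulatedDataNode:
    """Hypothetical stand-in for a remote data node (not a TimescaleDB API)."""
    name: str
    query_seconds: float              # time the node needs before its first tuple is ready
    ready_at: Optional[float] = None  # simulated time at which the response becomes available

    def send_request(self, now: float) -> None:
        # The node starts executing the query as soon as the request arrives.
        self.ready_at = now + self.query_seconds

    def receive_first_tuple(self, now: float) -> float:
        # Blocks until the node is ready; returns the simulated time afterwards.
        if self.ready_at is None:
            raise RuntimeError(f"no request was sent to {self.name}")
        return max(now, self.ready_at)


def fetch_coupled(nodes: List[SimulatedDataNode]) -> float:
    """Send and receive in one step per node: the waits add up."""
    now = 0.0
    for node in nodes:
        node.send_request(now)
        now = node.receive_first_tuple(now)
    return now


def fetch_decoupled(nodes: List[SimulatedDataNode]) -> float:
    """Send every request first, then read: the waits overlap."""
    now = 0.0
    for node in nodes:
        node.send_request(now)
    for node in nodes:
        now = node.receive_first_tuple(now)
    return now


def make_nodes() -> List[SimulatedDataNode]:
    # Three nodes, each assumed to need 2 simulated seconds per query.
    return [SimulatedDataNode(f"dn{i}", 2.0) for i in range(1, 4)]


if __name__ == "__main__":
    print(fetch_coupled(make_nodes()))    # 6.0
    print(fetch_decoupled(make_nodes()))  # 2.0
```

In this model the coupled strategy waits for the sum of the per-node query times, while the decoupled one waits only for the slowest node. Real gains depend on query cost, network latency, and the number of data nodes.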

@erimatnor erimatnor requested a review from akuzm February 24, 2023 17:40
@erimatnor erimatnor self-assigned this Feb 24, 2023

codecov bot commented Feb 24, 2023

Codecov Report

Merging #5362 (e31bd33) into main (750e69e) will decrease coverage by 0.03%.
The diff coverage is 68.00%.

@@            Coverage Diff             @@
##             main    #5362      +/-   ##
==========================================
- Coverage   90.67%   90.64%   -0.03%     
==========================================
  Files         226      226              
  Lines       52524    52533       +9     
==========================================
- Hits        47626    47619       -7     
- Misses       4898     4914      +16     
| Impacted Files | Coverage Δ |
|---|---|
| tsl/src/remote/copy_fetcher.c | 85.22% <68.00%> (-1.79%) ⬇️ |
| src/loader/bgw_launcher.c | 89.51% <0.00%> (-2.55%) ⬇️ |
| src/loader/bgw_message_queue.c | 86.36% <0.00%> (-2.28%) ⬇️ |
| tsl/src/bgw_policy/job.c | 87.54% <0.00%> (-0.05%) ⬇️ |
| src/compat/compat.h | 96.61% <0.00%> (+6.13%) ⬆️ |


@erimatnor erimatnor marked this pull request as ready for review February 27, 2023 10:58
Contributor

@pmwkaa pmwkaa left a comment

Looks good

@vineethapai vineethapai added this to the TimescaleDB 2.10.1 milestone Mar 1, 2023
@erimatnor erimatnor force-pushed the async-copy-fetcher branch 3 times, most recently from af918a3 to 853ed85 Compare March 2, 2023 08:56
@vineethapai vineethapai removed this from the TimescaleDB 2.10.1 milestone Mar 2, 2023
@jfjoly jfjoly added this to the TimescaleDB 2.11 milestone Mar 2, 2023
@erimatnor erimatnor enabled auto-merge (rebase) March 2, 2023 13:42
@erimatnor erimatnor merged commit 386d31b into timescale:main Mar 2, 2023
svenklemm added a commit to svenklemm/timescaledb that referenced this pull request Mar 6, 2023
This release contains bug fixes since the 2.10.0 release.
We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#5364 Fix num_chunks inconsistency in hypertables view
* timescale#5362 Make copy fetcher more async
* timescale#5336 Use NameData and namestrcpy for names
* timescale#5317 Fix some incorrect memory handling
* timescale#5367 Rename columns in old-style continuous aggregates
* timescale#5343 Set PortalContext when starting job
* timescale#5360 Fix uninitialized bucket_info variable
* timescale#5367 Fix column name handling in old-style continuous aggregates
* timescale#5378 Fix multinode DML HA performance regression
* timescale#5384 Fix Hierarchical Continuous Aggregates chunk_interval_size
* timescale#5153 Fix concurrent locking with chunk_data_node table

**Thanks**
* @justinozavala for reporting an issue with PL/Python procedures in the background worker
* @Medvecrab for discovering an issue with copying NameData when forming heap tuples.
* @pushpeepkmonroe for discovering an issue in upgrading old-style continuous aggregates with renamed columns
@svenklemm svenklemm mentioned this pull request Mar 6, 2023
svenklemm added a commit to svenklemm/timescaledb that referenced this pull request Mar 6, 2023
This release contains bug fixes since the 2.10.0 release.
We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* timescale#5159 Support Continuous Aggregates names in hypertable_(detailed_)size
* timescale#5226 Fix concurrent locking with chunk_data_node table
* timescale#5317 Fix some incorrect memory handling
* timescale#5336 Use NameData and namestrcpy for names
* timescale#5343 Set PortalContext when starting job
* timescale#5360 Fix uninitialized bucket_info variable
* timescale#5362 Make copy fetcher more async
* timescale#5364 Fix num_chunks inconsistency in hypertables view
* timescale#5367 Fix column name handling in old-style continuous aggregates
* timescale#5378 Fix multinode DML HA performance regression
* timescale#5384 Fix Hierarchical Continuous Aggregates chunk_interval_size

**Thanks**
* @justinozavala for reporting an issue with PL/Python procedures in the background worker
* @Medvecrab for discovering an issue with copying NameData when forming heap tuples.
* @pushpeepkmonroe for discovering an issue in upgrading old-style continuous aggregates with renamed columns
svenklemm added a commit that referenced this pull request Mar 7, 2023
This release contains bug fixes since the 2.10.0 release.
We recommend that you upgrade at the next available opportunity.

**Bugfixes**
* #5159 Support Continuous Aggregates names in hypertable_(detailed_)size
* #5226 Fix concurrent locking with chunk_data_node table
* #5317 Fix some incorrect memory handling
* #5336 Use NameData and namestrcpy for names
* #5343 Set PortalContext when starting job
* #5360 Fix uninitialized bucket_info variable
* #5362 Make copy fetcher more async
* #5364 Fix num_chunks inconsistency in hypertables view
* #5367 Fix column name handling in old-style continuous aggregates
* #5378 Fix multinode DML HA performance regression
* #5384 Fix Hierarchical Continuous Aggregates chunk_interval_size

**Thanks**
* @justinozavala for reporting an issue with PL/Python procedures in the background worker
* @Medvecrab for discovering an issue with copying NameData when forming heap tuples.
* @pushpeepkmonroe for discovering an issue in upgrading old-style continuous aggregates with renamed columns
svenklemm added a commit to svenklemm/timescaledb that referenced this pull request Mar 7, 2023