Fix partitionwise agg crash due to uninitialized memory #2874

Merged

erimatnor
Contributor

When building the partition information for hypertables, the memory
for partition bound information wasn't properly initialized. The
uninitialized memory could make the planner believe it could plan an
ordered append (PostgreSQL's built-in version) when the contents of
the memory happened to have the "right" value (specifically the
`strategy` char field of the partition bound info).

This change fixes the crash by properly initializing the partition
bound info so that PostgreSQL's ordered append plans are not triggered
on hypertables.

Due to the randomness of the uninitialized memory, it isn't possible
to reproduce the crash reliably and create a test for it. But the
crash was manually reproduced by initializing the `strategy` field of
the partition bound info to a value that triggered the crash in the
same place reported by the issue below.
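
As a rough illustration of the failure mode and the fix described above, here is a minimal, self-contained sketch. It is not the actual patch: `MockBoundInfo`, `mock_boundinfo_create()`, and `mock_can_use_ordered_append()` are hypothetical stand-ins, and `calloc()` plays the role that a zeroing allocator such as `palloc0()` would play inside a PostgreSQL extension. The strategy codes are chosen to resemble PostgreSQL's `PARTITION_STRATEGY_*` characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Strategy codes (assumed here to resemble PostgreSQL's PARTITION_STRATEGY_* chars). */
#define MOCK_STRATEGY_UNSET '\0'
#define MOCK_STRATEGY_LIST 'l'
#define MOCK_STRATEGY_RANGE 'r'

/* Hypothetical, simplified stand-in for the partition bound info. */
typedef struct MockBoundInfo
{
	char strategy; /* partitioning strategy code */
	int ndatums;   /* number of bound datums */
} MockBoundInfo;

/*
 * Allocate bound info with every field zeroed. A non-zeroing allocator
 * (malloc(), or palloc() in PostgreSQL) would leave `strategy` holding
 * whatever bytes were already in the allocation.
 */
static MockBoundInfo *
mock_boundinfo_create(void)
{
	return calloc(1, sizeof(MockBoundInfo));
}

/*
 * Hypothetical planner-side check: only a recognized strategy makes the
 * partitions eligible for an ordered append.
 */
static bool
mock_can_use_ordered_append(const MockBoundInfo *info)
{
	assert(info != NULL);

	switch (info->strategy)
	{
		case MOCK_STRATEGY_RANGE:
		case MOCK_STRATEGY_LIST:
			return true;
		default:
			return false;
	}
}
```

With the zeroed allocation, `mock_can_use_ordered_append()` returns false because `strategy` is `'\0'`. Setting `strategy` to `MOCK_STRATEGY_RANGE` by hand makes it return true, which mirrors how the crash was reproduced manually by forcing the field to a triggering value.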

Fixes #2873

@erimatnor erimatnor added the bug label Jan 28, 2021
@codecov
codecov bot commented Jan 28, 2021

Codecov Report

Merging #2874 (03ceb24) into master (126f1c8) will increase coverage by 0.13%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #2874      +/-   ##
==========================================
+ Coverage   90.07%   90.20%   +0.13%     
==========================================
  Files         212      212              
  Lines       34772    34725      -47     
==========================================
+ Hits        31322    31325       +3     
+ Misses       3450     3400      -50     
| Impacted Files | Coverage Δ | |
|---|---|---|
| tsl/src/nodes/gapfill/planner.c | 96.89% <ø> (-0.02%) | ⬇️ |
| src/plan_expand_hypertable.c | 94.32% <100.00%> (+0.06%) | ⬆️ |
| src/plan_partialize.c | 97.91% <100.00%> (+0.04%) | ⬆️ |
| src/planner.c | 93.54% <100.00%> (ø) | |
| tsl/src/nodes/decompress_chunk/decompress_chunk.c | 94.05% <100.00%> (+0.50%) | ⬆️ |
| tsl/src/nodes/decompress_chunk/qual_pushdown.c | 91.33% <100.00%> (+0.72%) | ⬆️ |
| src/loader/bgw_message_queue.c | 84.51% <0.00%> (-2.59%) | ⬇️ |
| src/import/planner.c | 70.30% <0.00%> (+11.12%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@erimatnor erimatnor marked this pull request as ready for review January 28, 2021 09:21
@erimatnor erimatnor requested a review from a team as a code owner January 28, 2021 09:21
@erimatnor erimatnor requested review from pmwkaa, svenklemm and gayyappan and removed request for a team January 28, 2021 09:21
@erimatnor erimatnor merged commit 0e86bbe into timescale:master Jan 28, 2021
@erimatnor erimatnor mentioned this pull request Jan 28, 2021
svenklemm added a commit to svenklemm/timescaledb that referenced this pull request Jan 28, 2021
This maintenance release contains bugfixes since the 2.0.0 release. We deem it
high priority for upgrading.

In particular the fixes contained in this maintenance release address issues
in continuous aggregates, compression, JOINs with hypertables and when
upgrading from previous versions.

**Bugfixes**
* timescale#2772 Always validate existing database and extension
* timescale#2780 Fix config enum entries for remote data fetcher
* timescale#2806 Add check for dropped chunk on update
* timescale#2828 Improve cagg watermark caching
* timescale#2842 Do not mark job as started when setting next_start field
* timescale#2845 Fix continuous aggregate privileges during upgrade
* timescale#2851 Fix nested loop joins that involve compressed chunks
* timescale#2860 Fix projection in ChunkAppend nodes
* timescale#2861 Remove compression stat update from update script
* timescale#2865 Apply volatile function quals at decompresschunk node
* timescale#2866 Avoid partitionwise planning of partialize_agg
* timescale#2868 Fix corruption in gapfill plan
* timescale#2874 Fix partitionwise agg crash due to uninitialized memory

**Thanks**
* @alex88 for reporting an issue with joined hypertables
* @brian-from-quantrocket for reporting an issue with extension update and dropped chunks
* @dhodyn for reporting an issue when joining compressed chunks
* @markatosi for reporting a segfault with partitionwise aggregates enabled
* @PhilippJust for reporting an issue with add_job and initial_start
* @sgorsh for reporting an issue when using pgAdmin on Windows
* @WarriorOfWire for reporting the bug with gapfill queries not being
  able to find pathkey item to sort
@erimatnor erimatnor deleted the fix-partition-bound-info-crash branch January 28, 2021 13:23
svenklemm added a commit that referenced this pull request Jan 28, 2021
This maintenance release contains bugfixes since the 2.0.0 release. We deem it
high priority for upgrading.

In particular the fixes contained in this maintenance release address issues
in continuous aggregates, compression, JOINs with hypertables and when
upgrading from previous versions.

**Bugfixes**
* #2772 Always validate existing database and extension
* #2780 Fix config enum entries for remote data fetcher
* #2806 Add check for dropped chunk on update
* #2828 Improve cagg watermark caching
* #2838 Fix catalog repair in update script
* #2842 Do not mark job as started when setting next_start field
* #2845 Fix continuous aggregate privileges during upgrade
* #2851 Fix nested loop joins that involve compressed chunks
* #2860 Fix projection in ChunkAppend nodes
* #2861 Remove compression stat update from update script
* #2865 Apply volatile function quals at decompresschunk node
* #2866 Avoid partitionwise planning of partialize_agg
* #2868 Fix corruption in gapfill plan
* #2874 Fix partitionwise agg crash due to uninitialized memory

**Thanks**
* @alex88 for reporting an issue with joined hypertables
* @brian-from-quantrocket for reporting an issue with extension update and dropped chunks
* @dhodyn for reporting an issue when joining compressed chunks
* @markatosi for reporting a segfault with partitionwise aggregates enabled
* @PhilippJust for reporting an issue with add_job and initial_start
* @sgorsh for reporting an issue when using pgAdmin on Windows
* @WarriorOfWire for reporting the bug with gapfill queries not being
  able to find pathkey item to sort
Successfully merging this pull request may close these issues.

Segmentation fault