Segfault when querying continuous aggregate #3248

Closed
svenklemm opened this issue May 19, 2021 · 4 comments · Fixed by #4219

Comments

@svenklemm
Member

Found by sqlsmith. The query is in frame 46 of the backtrace below.

#0  0x000055b78cf2caa7 in pg_detoast_datum (datum=0x3fdf98e09ef7c83f) at fmgr.c:1728
#1  0x000055b78cde8d20 in float8_avg (fcinfo=0x55b78f3b8040) at float.c:3058
#2  0x00007f613c190951 in tsl_finalize_agg_ffunc (fcinfo=0x7ffd9e290970) at /home/sven/projects/timescaledb/tsl/src/partialize_finalize.c:551
#3  0x00007f613c27420b in ts_finalize_agg_ffunc (fcinfo=0x7ffd9e290970) at /home/sven/projects/timescaledb/src/cross_module_fn.c:48
#4  0x000055b78cb5d1af in finalize_aggregate (aggstate=0x55b78f5dc2c8, peragg=0x55b790cd3290, pergroupstate=0x55b79054c7a8, resultVal=0x55b790cd3250, resultIsNull=0x55b790cd3270) at nodeAgg.c:1135
#5  0x000055b78cb5d894 in finalize_aggregates (aggstate=0x55b78f5dc2c8, peraggs=0x55b790cd3290, pergroup=0x55b79054c7a8) at nodeAgg.c:1369
#6  0x000055b78cb6041d in agg_retrieve_hash_table_in_memory (aggstate=0x55b78f5dc2c8) at nodeAgg.c:2879
#7  0x000055b78cb60145 in agg_retrieve_hash_table (aggstate=0x55b78f5dc2c8) at nodeAgg.c:2755
#8  0x000055b78cb5f1ad in ExecAgg (pstate=0x55b78f5dc2c8) at nodeAgg.c:2179
#9  0x000055b78cb90bdb in ExecProcNode (node=0x55b78f5dc2c8) at ../../../src/include/executor/executor.h:248
#10 0x000055b78cb90bfc in SubqueryNext (node=0x55b78f5dc118) at nodeSubqueryscan.c:53
#11 0x000055b78cb506cd in ExecScanFetch (node=0x55b78f5dc118, accessMtd=0x55b78cb90bdd <SubqueryNext>, recheckMtd=0x55b78cb90c06 <SubqueryRecheck>) at execScan.c:133
#12 0x000055b78cb5076e in ExecScan (node=0x55b78f5dc118, accessMtd=0x55b78cb90bdd <SubqueryNext>, recheckMtd=0x55b78cb90c06 <SubqueryRecheck>) at execScan.c:199
#13 0x000055b78cb90c54 in ExecSubqueryScan (pstate=0x55b78f5dc118) at nodeSubqueryscan.c:87
#14 0x000055b78cb65364 in ExecProcNode (node=0x55b78f5dc118) at ../../../src/include/executor/executor.h:248
#15 0x000055b78cb657e8 in ExecAppend (pstate=0x55b78f5dbe28) at nodeAppend.c:267
#16 0x000055b78cb8a1ca in ExecProcNode (node=0x55b78f5dbe28) at ../../../src/include/executor/executor.h:248
#17 0x000055b78cb8a46b in ExecResult (pstate=0x55b78f5dbc78) at nodeResult.c:115
#18 0x000055b78cb8caf1 in ExecProcNode (node=0x55b78f5dbc78) at ../../../src/include/executor/executor.h:248
#19 0x000055b78cb8cc40 in ExecSort (pstate=0x55b78f5dba60) at nodeSort.c:108
#20 0x000055b78cb94569 in ExecProcNode (node=0x55b78f5dba60) at ../../../src/include/executor/executor.h:248
#21 0x000055b78cb965c3 in begin_partition (winstate=0x55b78f5db518) at nodeWindowAgg.c:1112
#22 0x000055b78cb987bf in ExecWindowAgg (pstate=0x55b78f5db518) at nodeWindowAgg.c:2105
#23 0x000055b78cb7d8b7 in ExecProcNode (node=0x55b78f5db518) at ../../../src/include/executor/executor.h:248
#24 0x000055b78cb7db07 in ExecLimit (pstate=0x55b78f5db268) at nodeLimit.c:96
#25 0x000055b78cb8d644 in ExecProcNode (node=0x55b78f5db268) at ../../../src/include/executor/executor.h:248
#26 0x000055b78cb8df1c in ExecScanSubPlan (node=0x55b78f3a8610, econtext=0x55b790ce2a48, isNull=0x55b78f3a857d) at nodeSubplan.c:323
#27 0x000055b78cb8d8c7 in ExecSubPlan (node=0x55b78f3a8610, econtext=0x55b790ce2a48, isNull=0x55b78f3a857d) at nodeSubplan.c:89
#28 0x000055b78cb3b3ea in ExecEvalSubPlan (state=0x55b78f3a8578, op=0x55b78f3a9138, econtext=0x55b790ce2a48) at execExprInterp.c:3901
#29 0x000055b78cb3612b in ExecInterpExpr (state=0x55b78f3a8578, econtext=0x55b790ce2a48, isnull=0x7ffd9e29197f) at execExprInterp.c:1539
#30 0x000055b78cb502b8 in ExecEvalExprSwitchContext (state=0x55b78f3a8578, econtext=0x55b790ce2a48, isNull=0x7ffd9e29197f) at ../../../src/include/executor/executor.h:322
#31 0x000055b78cb503e8 in ExecQual (state=0x55b78f3a8578, econtext=0x55b790ce2a48) at ../../../src/include/executor/executor.h:391
#32 0x000055b78cb507d6 in ExecScan (node=0x55b790ce2930, accessMtd=0x55b78cb90bdd <SubqueryNext>, recheckMtd=0x55b78cb90c06 <SubqueryRecheck>) at execScan.c:227
#33 0x000055b78cb90c54 in ExecSubqueryScan (pstate=0x55b790ce2930) at nodeSubqueryscan.c:87
#34 0x000055b78cb65364 in ExecProcNode (node=0x55b790ce2930) at ../../../src/include/executor/executor.h:248
#35 0x000055b78cb657e8 in ExecAppend (pstate=0x55b790ce2640) at nodeAppend.c:267
#36 0x000055b78cb8a1ca in ExecProcNode (node=0x55b790ce2640) at ../../../src/include/executor/executor.h:248
#37 0x000055b78cb8a46b in ExecResult (pstate=0x55b790ce2490) at nodeResult.c:115
#38 0x000055b78cb7d8b7 in ExecProcNode (node=0x55b790ce2490) at ../../../src/include/executor/executor.h:248
#39 0x000055b78cb7dc97 in ExecLimit (pstate=0x55b78f5dd0d8) at nodeLimit.c:173
#40 0x000055b78cb40322 in ExecProcNode (node=0x55b78f5dd0d8) at ../../../src/include/executor/executor.h:248
#41 0x000055b78cb42f45 in ExecutePlan (estate=0x55b79055b970, planstate=0x55b78f5dd0d8, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x55b7905b0a50,
    execute_once=true) at execMain.c:1632
#42 0x000055b78cb409a7 in standard_ExecutorRun (queryDesc=0x55b78f268710, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:350
#43 0x000055b78cb407ba in ExecutorRun (queryDesc=0x55b78f268710, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:294
#44 0x000055b78cd89de6 in PortalRunSelect (portal=0x55b78f110ec0, forward=true, count=0, dest=0x55b7905b0a50) at pquery.c:912
#45 0x000055b78cd89a23 in PortalRun (portal=0x55b78f110ec0, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x55b7905b0a50, altdest=0x55b7905b0a50, qc=0x7ffd9e291ec0) at pquery.c:756
#46 0x000055b78cd8333a in exec_simple_query (
    query_string=0x55b78f0a4b50 "select  \n  ref_0.time_bucket as c0, \n  ref_0.avg as c1, \n  ref_0.time_bucket as c2, \n  (select pg_catalog.min(time_bucket) from public.metrics_cagg)\n     as c3, \n  ref_0.time_bucket as c4, \n  pg_catalog.pg_control_checkpoint() as c5, \n  ref_0.time_bucket as c6, \n  ref_0.avg as c7, \n  ref_0.time_bucket as c8, \n  ref_0.time_bucket as c9\nfrom \n  public.metrics_cagg as ref_0\nwhere EXISTS (\n  select  \n      ref_1.device as c0, \n      \n        pg_catalog.every(\n          cast(public.timescaledb_post_restore() as bool)) over (partition by ref_1.time_bucket,ref_1.device,ref_0.device order by ref_0.time_bucket) as c1, \n      ref_0.time_bucket as c2\n    from \n      public.metrics_cagg as ref_1\n    where ref_1.avg is NULL\n    limit 83)\nlimit 50") at postgres.c:1239
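
A note on frame #0: the `datum` passed to `pg_detoast_datum` is `0x3fdf98e09ef7c83f`, which does not resemble the other addresses in the trace (`0x55b7...`, `0x7f61...`, `0x7ffd...`). Assuming a 64-bit build where `float8` is passed by value, reinterpreting those bits as an IEEE 754 double gives a value a little below 0.5, which would be a plausible average. That suggests, though the trace alone does not prove it, that `float8_avg` was handed an already-computed result where it expected a pointer to its transition array. A small Python sketch of the reinterpretation (the helper name `bits_to_double` is hypothetical):

```python
import struct

def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE 754 double (same bytes, no numeric conversion)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

# Value of `datum` in frame #0 of the backtrace.
datum = 0x3FDF98E09EF7C83F
value = bits_to_double(datum)
print(f"{datum:#018x} reinterpreted as float8: {value!r}")
```
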
@svenklemm
Member Author

To reproduce:

CREATE TABLE metrics(filler_1 int, filler_2 int, filler_3 int, time timestamptz NOT NULL, device_id int, v0 int, v1 int, v2 float, v3 float);
CREATE INDEX ON metrics(time DESC);
CREATE INDEX ON metrics(device_id,time DESC);
SELECT create_hypertable('metrics','time',create_default_indexes:=false);

ALTER TABLE metrics DROP COLUMN filler_1;
INSERT INTO metrics(time,device_id,v0,v1,v2,v3) SELECT time, device_id, device_id+1,  device_id + 2, device_id + 0.5, NULL FROM generate_series('2000-01-01 0:00:00+0'::timestamptz,'2000-01-05 23:55:00+0','2m') gtime(time), generate_series(1,5,1) gdevice(device_id);
ALTER TABLE metrics DROP COLUMN filler_2;
INSERT INTO metrics(time,device_id,v0,v1,v2,v3) SELECT time, device_id, device_id-1, device_id + 2, device_id + 0.5, NULL FROM generate_series('2000-01-06 0:00:00+0'::timestamptz,'2000-01-12 23:55:00+0','2m') gtime(time), generate_series(1,5,1) gdevice(device_id);
ALTER TABLE metrics DROP COLUMN filler_3;
INSERT INTO metrics(time,device_id,v0,v1,v2,v3) SELECT time, device_id, device_id, device_id + 2, device_id + 0.5, NULL FROM generate_series('2000-01-13 0:00:00+0'::timestamptz,'2000-01-19 23:55:00+0','2m') gtime(time), generate_series(1,5,1) gdevice(device_id);
ANALYZE metrics;

create materialized view cagg1 WITH (timescaledb.continuous) AS SELECT time_bucket('1h',time), device_id, min(v0), max(v1), avg(v2) FROM metrics GROUP BY 1,2;
select from metrics as m, lateral(select m from cagg1 where avg is NULL limit 1) as lat;

fabriziomello self-assigned this Jan 4, 2022
@svenklemm
Member Author

I could minimize the reproducer a little bit more:

CREATE TABLE metrics(time timestamptz NOT NULL, device_id int, v2 float);
SELECT create_hypertable('metrics','time',create_default_indexes:=false);

INSERT INTO metrics(time,device_id,v2) SELECT time, device_id, random() FROM generate_series('2000-01-01 0:00:00+0'::timestamptz,'2000-01-05 23:55:00+0','2m') gtime(time), generate_series(1,5,1) gdevice(device_id);

create materialized view cagg1 WITH (timescaledb.continuous,timescaledb.materialized_only=true) AS SELECT time_bucket('1h',time), device_id, avg(v2) FROM metrics GROUP BY 1,2;
select from metrics as m, lateral(select m from cagg1 where avg is NULL limit 1) as lat;

fabriziomello linked a pull request Jan 11, 2022 that will close this issue
@fabriziomello
Contributor

fabriziomello commented Jan 12, 2022

After a lot of unsuccessful attempts to fix it, we decided to come back to this issue later.

@mkindahl
Contributor

mkindahl commented Apr 6, 2022

The error is triggered because `tsl_finalize_agg_ffunc` modifies the per-group state, and that state can be re-used when the function is called from `finalize_aggregate` in nodeAgg.c.
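
The failure mode can be modeled outside PostgreSQL. The following is a simplified Python sketch, not the TimescaleDB code: `GroupState`, `avg_final`, and `buggy_finalize` are hypothetical stand-ins for the per-group state, `float8_avg`, and the wrapper's behavior before the fix, and the `(count, sum)` pair is an assumed simplification of the real transition state. It assumes the executor may finalize the same group more than once.

```python
from dataclasses import dataclass

@dataclass
class GroupState:
    # Transition value for avg(): a (count, sum) pair.
    # After the buggy call below it holds a bare float instead.
    trans_value: object

def avg_final(trans_value):
    """Stand-in for float8_avg: expects a (count, sum) pair."""
    count, total = trans_value  # raises TypeError if handed a float
    return total / count if count else None

def buggy_finalize(state: GroupState):
    """Overwrites the shared per-group state with the final result."""
    state.trans_value = avg_final(state.trans_value)
    return state.trans_value

state = GroupState(trans_value=(4, 2.0))
first = buggy_finalize(state)   # works: the state is still a (count, sum) pair
try:
    buggy_finalize(state)       # the state is now a float, not a pair
    second_call_failed = False
except TypeError:
    second_call_failed = True
```

In Python the second call raises `TypeError`. C has no such check, so the final value would be treated as a pointer, which is consistent with the crash in `pg_detoast_datum`.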

mkindahl added a commit to mkindahl/timescaledb that referenced this issue Apr 6, 2022
The function `tsl_finalize_agg_ffunc` modified the aggregation state by
setting `trans_value` to the final result when computing the final
value. Since the state can be re-used several times, there could be
several calls to the finalization function, and the finalization
function would be confused when passed a final value instead of an
aggregation state transition value.

This commit fixes this by not modifying the `trans_value` when
computing the final value and instead just returns it (or the original
`trans_value` if there is no finalization function).

Fixes timescale#3248
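
The approach described in this commit message can be sketched as a small model. This is an illustrative Python example, not the actual patch: `GroupState`, `avg_final`, and `finalize` are hypothetical names, and the `(count, sum)` pair is an assumed simplification of the real transition state.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass
class GroupState:
    trans_value: Tuple[int, float]  # (count, sum); must survive finalization

def avg_final(trans_value):
    """Stand-in for the aggregate's finalization function."""
    count, total = trans_value
    return total / count if count else None

def finalize(state: GroupState, final_fn: Optional[Callable] = avg_final):
    """Return the final value without writing it back into the state."""
    if final_fn is None:
        # No finalization function: return the transition value unchanged.
        return state.trans_value
    return final_fn(state.trans_value)

state = GroupState(trans_value=(4, 2.0))
# Finalizing the same group several times now gives the same answer each time.
results = [finalize(state) for _ in range(3)]
```

Because `finalize` only reads `state.trans_value`, the state stays valid however many times the executor re-uses it.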
mkindahl added a commit that referenced this issue Apr 6, 2022
mkindahl mentioned this issue Apr 6, 2022
mkindahl added this to the TimescaleDB 2.6.1 milestone Apr 7, 2022
RafiaSabih pushed a commit to RafiaSabih/timescaledb that referenced this issue Apr 7, 2022
RafiaSabih pushed a commit to RafiaSabih/timescaledb that referenced this issue Apr 8, 2022
mkindahl added a commit to RafiaSabih/timescaledb that referenced this issue Apr 8, 2022
svenklemm pushed a commit that referenced this issue Apr 11, 2022