Server crashes when trying to execute SELECT unnest(shard_placement_rebalance_array(...); #7553

Open
saygoodbyye opened this issue Mar 7, 2024 · 1 comment

@saygoodbyye

Hello! The server crashes when I execute the following SQL script (test.sql). I built PostgreSQL from the REL_16_STABLE branch and Citus from the main branch.
Postgres build:

CFLAGS="-Og" ./configure \
        --enable-cassert \
        --enable-tap-tests \
        --enable-debug \
        --with-icu \
        --with-lz4 \
        --with-libxml \
        --with-openssl \
        --prefix=$DATA \
        --quiet

Citus build:

PG_CONFIG=$PG_CONFIG ./configure --without-lz4 --without-zstd

postgresql.conf:

shared_preload_libraries='citus'

test.sql:

CREATE OR REPLACE FUNCTION shard_placement_rebalance_array(
    worker_node_list json[],
    shard_placement_list json[],
    threshold float4 DEFAULT 0,
    max_shard_moves int DEFAULT 1000000,
    drain_only bool DEFAULT false,
    improvement_threshold float4 DEFAULT 0.5
)
RETURNS json[]
AS 'citus'
LANGUAGE C STRICT VOLATILE;

SELECT unnest(shard_placement_rebalance_array(
    ARRAY['{"node_name": "hostname1"}',
          '{"node_name": "hostname2", "capacity": 3}']::json[],
    ARRAY['{"hostname1":1, "nodename":"hostname1"}',
          '{"shardid":2, "nodename":"node_name"}',
          '{"shardid":3, "nodename":"hostname1", "cost": 2}']::json[]
));

backtrace:

#0  InitRebalanceState (functions=0x7ffcb1d2b5d0, shardPlacementList=<optimized out>, workerNodeList=0x55aacd6b1f70) at operations/shard_rebalancer.c:2567
#1  RebalancePlacementUpdates (workerNodeList=0x55aacd6b1f70, activeShardPlacementListList=activeShardPlacementListList@entry=0x55aacd6b1f10, threshold=0, maxShardMoves=maxShardMoves@entry=1000000, drainOnly=drainOnly@entry=false, 
    improvementThreshold=improvementThreshold@entry=0.5, functions=functions@entry=0x7ffcb1d2b5d0) at operations/shard_rebalancer.c:2433
#2  0x00007f5bdc2787bf in shard_placement_rebalance_array (fcinfo=<optimized out>) at test/shard_rebalancer.c:176
#3  0x000055aacc0214e0 in ExecInterpExpr (state=0x55aacd69f790, econtext=0x55aacd69ed70, isnull=<optimized out>) at execExprInterp.c:758
#4  0x000055aacc02caa4 in ExecEvalExpr (isNull=0x55aacd69fee0, econtext=0x55aacd69ed70, state=<optimized out>) at ../../../src/include/executor/executor.h:336
#5  ExecEvalFuncArgs (fcinfo=fcinfo@entry=0x55aacd69feb8, argList=0x55aacd69fe70, econtext=econtext@entry=0x55aacd69ed70) at execSRF.c:847
#6  0x000055aacc02d736 in ExecMakeFunctionResultSet (fcache=0x55aacd69f708, econtext=econtext@entry=0x55aacd69ed70, argContext=0x55aacd6aa990, isNull=0x55aacd69f6b0, isDone=isDone@entry=0x55aacd69f6f8) at execSRF.c:577
#7  0x000055aacc052198 in ExecProjectSRF (node=node@entry=0x55aacd69ec68, continuing=continuing@entry=false) at nodeProjectSet.c:183
#8  0x000055aacc05223c in ExecProjectSet (pstate=0x55aacd69ec68) at nodeProjectSet.c:107
#9  0x000055aacc024c22 in ExecProcNode (node=0x55aacd69ec68) at ../../../src/include/executor/executor.h:273
#10 ExecutePlan (execute_once=<optimized out>, dest=0x55aacd6a40f8, direction=-848693592, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x55aacd69ec68, estate=0x55aacd69ea50)
    at execMain.c:1670
#11 standard_ExecutorRun (queryDesc=queryDesc@entry=0x55aacd4c5590, direction=direction@entry=ForwardScanDirection, count=count@entry=0, execute_once=execute_once@entry=true) at execMain.c:365
#12 0x00007f5bdc218ed6 in CitusExecutorRun (queryDesc=0x55aacd4c5590, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at executor/multi_executor.c:238
#13 0x000055aacc1c0cff in PortalRunSelect (portal=0x55aacd626ff0, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:924
#14 0x000055aacc1c20e3 in PortalRun (portal=portal@entry=0x55aacd626ff0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x55aacd6a40f8, altdest=altdest@entry=0x55aacd6a40f8, 
    qc=0x7ffcb1d2bbe0) at pquery.c:768
#15 0x000055aacc1be5cd in exec_simple_query (
    query_string=0x55aacd5567a0 "SELECT unnest(shard_placement_rebalance_array(\n    ARRAY['{\"node_name\": \"hostname1\"}',\n          '{\"node_name\": \"hostname2\", \"capacity\": 3}']::json[],\n    ARRAY['{\"hostname1\":1, \"nodename\":\"hostname1\""...) at postgres.c:1274
#16 0x000055aacc1c0707 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4637
#17 0x000055aacc13af8f in BackendRun (port=0x55aacd5d8b00, port=0x55aacd5d8b00) at postmaster.c:4464
#18 BackendStartup (port=0x55aacd5d8b00) at postmaster.c:4192
#19 ServerLoop () at postmaster.c:1782
#20 0x000055aacc13bf95 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x55aacd4bdfd0) at postmaster.c:1466
#21 0x000055aacbe8fb91 in main (argc=3, argv=0x55aacd4bdfd0) at main.c:198

Best regards,
Egor Chindyaskin
Postgres Professional: http://postgrespro.com/

@JelteF
Contributor

JelteF commented Mar 7, 2024

Just like #7551, I don't consider this a problematic crash. Again, it's a function that's only supposed to be used in our tests. Feel free to submit a PR to fix it, though.
