[Bug] count(*) query crashed and db in recovery mode #212

Open
1 of 2 tasks
liyxbeijing opened this issue Sep 21, 2023 · 1 comment
Labels
priority: High After critical issues are fixed, these should be dealt with before any further issues. type: Bug Something isn't working

Comments

@liyxbeijing

Cloudberry Database version

warehouse=# select version();
version

PostgreSQL 14.4 (Cloudberry Database 1.4.0 build commit:e83e3ffc22d538deb2dbceeeae0138ca2de064e6) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20210130 (Red Hat 10.2.1-11), 64-bit compiled on Aug 3 2023 10:15:47
(1 row)

What happened

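A count(*) query on a heap table crashed the master process with a segmentation fault (SIGSEGV), and the database went into recovery mode: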

# \d+ zzz_admin.aaa
                                         Table "zzz_admin.aaa"
    Column     |               Type                | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
---------------+-----------------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 table_catalog | information_schema.sql_identifier |           |          |         | plain    |             |              | 
 table_schema  | information_schema.sql_identifier |           |          |         | plain    |             |              | 
 table_name    | information_schema.sql_identifier |           |          |         | plain    |             |              | 
 dt            | text                              |           |          |         | extended |             |              | 
Distributed by: (table_catalog)
Access method: heap

=# select count(*) from zzz_admin.aaa;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?> \q

Logs:

2023-09-21 08:40:37.498183 CST,"gpadmin","",p433642,th-891975552,"[local]",,2023-09-21 08:38:56 CST,0,con21392,cmd29,seg-1,,,,sx1,"LOG","00000","statement: select count(*) from zzz_admin.aaa;",,,,,,,0,,"postgres.c",1708,
2023-09-21 08:40:37.551299 CST,,,p433642,th123456,,,2023-09-21 08:38:56 CST,0,con21392,cmd29,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,,,,,0,,,,"1    0x7fb5c8cc6630 libpthread.so.0 <symbol not found> + 0xc8cc6630
2    0xd5cc6c postgres DirectFunctionCall1Coll (fmgr.c:806)
"
2023-09-21 08:40:37.651110 CST,,,p370720,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","server process (PID 433642) was terminated by signal 11: Segmentation fault","Failed process was running: select count(*) from zzz_admin.aaa;",,,,,,0,,"postmaster.c",4229,
2023-09-21 08:40:37.651133 CST,,,p370720,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","terminating any other active server processes",,,,,,,0,,"postmaster.c",3959,
2023-09-21 08:40:37.652496 CST,,,p75086,th-891975552,,,,0,,,seg-1,,,,,"WARNING","01000","ic-proxy-server: received signal 3",,,,,,,0,,"ic_proxy_main.c",474,
2023-09-21 08:40:37.652519 CST,"gpadmin","",p434360,th-891975552,"[local]",,2023-09-21 08:40:37 CST,0,,,seg-1,,,,,"FATAL","57P03","the database system is in recovery mode","last replayed record at 1C/27F1B788",,,,,,0,,"postmaster.c",2746,
2023-09-21 08:40:37.652569 CST,,,p370720,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","background worker ""ic proxy process"" (PID 75086) exited with exit code 1",,,,,,,0,,"postmaster.c",4208,
2023-09-21 08:40:37.656272 CST,"gpadmin",,p434361,th-891975552,"","18994",2023-09-21 08:40:37 CST,0,,,seg-1,,,,,"FATAL","57P03","the database system is in recovery mode","last replayed record at 1C/27F1B788",,,,,,0,,"postmaster.c",2746,
2023-09-21 08:40:37.665056 CST,,,p370720,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","all server processes terminated; reinitializing",,,,,,,0,,"postmaster.c",4515,
2023-09-21 08:40:37.831100 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","database system was interrupted; last known up at 2023-09-21 08:38:41 CST",,,,,,,0,,"xlog.c",6816,
2023-09-21 08:40:37.831129 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","Synchronization of the wal directory starts.",,,,,,,0,,"fd.c",3446,
2023-09-21 08:40:37.831191 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","synchronization of the wal directory finishes.",,,,,,,0,,"fd.c",3448,
2023-09-21 08:40:37.831701 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","database system was not properly shut down; automatic recovery in progress",,,,,,,0,,"xlog.c",7385,
2023-09-21 08:40:37.974973 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","redo starts at 1C/4035D810",,,,,,,0,,"xlog.c",7674,
2023-09-21 08:40:37.980902 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","invalid record length at 1C/40918430: wanted 24, got 0",,,,,,,0,,"xlog.c",4482,
2023-09-21 08:40:37.980911 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","redo done at 1C/40918260 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s",,,,,,,0,,"xlog.c",7962,
2023-09-21 08:40:38.265962 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","end of transaction log location is 1C/40918430",,,,,,,0,,"xlog.c",8051,
2023-09-21 08:40:38.281916 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","latest completed transaction id is 1719112 and next transaction id is 1719113",,,,,,,0,,"xlog.c",8449,
2023-09-21 08:40:38.281968 CST,,,p434363,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","database system is ready",,,,,,,0,,"xlog.c",8476,
2023-09-21 08:40:38.285848 CST,,,p370720,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","PostgreSQL 14.4 (Cloudberry Database 1.4.0 build commit:e83e3ffc22d538deb2dbceeeae0138ca2de064e6) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20210130 (Red Hat 10.2.1-11), 64-bit compiled on Aug  3 2023 10:15:10",,,,,,,0,,"postmaster.c",3538,
2023-09-21 08:40:38.285861 CST,,,p370720,th-891975552,,,,0,,,seg-1,,,,,"LOG","00000","database system is ready to accept connections","PostgreSQL 14.4 (Cloudberry Database 1.4.0 build commit:e83e3ffc22d538deb2dbceeeae0138ca2de064e6) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20210130 (Red Hat 10.2.1-11), 64-bit compiled on Aug  3 2023 10:15:10",,,,,,0,,"postmaster.c",3540,
2023-09-21 08:40:38.291444 CST,,,p434368,th-891975552,,,,0,con1,,seg-1,,,,sx1,"LOG","00000","initialized 1 resource queues",,,,,,,0,,"resscheduler.c",267,
2023-09-21 08:40:38.362261 CST,,,p434368,th-891975552,,,,0,con1,,seg-1,,,,sx1,"LOG","00000","Crash recovery broadcast of the distributed transaction 'Commit Prepared' broadcast succeeded for gid = 2015839.",,,,,,,0,,"cdbdtxrecovery.c",84,
2023-09-21 08:40:38.364541 CST,,,p434368,th-891975552,,,,0,con1,,seg-1,,,,sx1,"LOG","00000","DTM Started",,,,,,,0,,"cdbdtxrecovery.c",155,
2023-09-21 08:40:38.368227 CST,,,p434374,th-891975552,,,,0,con4,,seg-1,,,,,"LOG","00000","pg_cron scheduler started",,,,,,,0,,"pg_cron.c",376,
2023-09-21 08:40:42.656687 CST,"gpadmin",,p434423,th-891975552,"","19002",2023-09-21 08:40:42 CST,0,,,seg-1,,,,sx1,"LOG","00000","rejecting TCP connection to master using internalconnection protocol",,,,,,,0,,"auth.c",570,
2023-09-21 08:40:42.697845 CST,"gpadmin",,p434423,th-891975552,"","19002",2023-09-21 08:40:42 CST,0,,,seg-1,,,,,"LOG","00000","standby ""gp_walreceiver"" is now a synchronous standby with priority 1",,,,,,"START_REPLICATION 1C/40000000 TIMELINE 1",0,,"syncrep.c",671,
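
For triage, the PANIC and FATAL entries can be pulled out of a CSV log like the one above with a short script. This is only a sketch: the function name `severe_entries` is hypothetical, and the column positions (severity at index 16, SQLSTATE at 17, message at 18, stack trace at 29) are an assumption read off the sample lines in this report, not taken from the log format documentation. Python's `csv` module is used because the stack trace is a quoted field that spans several lines.

```python
import csv
import io

# Column positions are ASSUMPTIONS inferred from the log lines in this report.
PID_COL = 3
SEVERITY_COL = 16
SQLSTATE_COL = 17
MESSAGE_COL = 18
STACK_COL = 29

# Three records copied from the log above (the PANIC record spans three lines).
SAMPLE = '''2023-09-21 08:40:37.498183 CST,"gpadmin","",p433642,th-891975552,"[local]",,2023-09-21 08:38:56 CST,0,con21392,cmd29,seg-1,,,,sx1,"LOG","00000","statement: select count(*) from zzz_admin.aaa;",,,,,,,0,,"postgres.c",1708,
2023-09-21 08:40:37.551299 CST,,,p433642,th123456,,,2023-09-21 08:38:56 CST,0,con21392,cmd29,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,,,,,0,,,,"1    0x7fb5c8cc6630 libpthread.so.0 <symbol not found> + 0xc8cc6630
2    0xd5cc6c postgres DirectFunctionCall1Coll (fmgr.c:806)
"
2023-09-21 08:40:37.652519 CST,"gpadmin","",p434360,th-891975552,"[local]",,2023-09-21 08:40:37 CST,0,,,seg-1,,,,,"FATAL","57P03","the database system is in recovery mode","last replayed record at 1C/27F1B788",,,,,,0,,"postmaster.c",2746,
'''


def severe_entries(log_text, levels=("PANIC", "FATAL")):
    """Return one dict per log record whose severity is in `levels`."""
    # csv.reader reassembles quoted fields that contain embedded newlines.
    reader = csv.reader(io.StringIO(log_text, newline=""))
    entries = []
    for row in reader:
        if len(row) <= MESSAGE_COL:
            continue  # blank or truncated record
        if row[SEVERITY_COL] in levels:
            entries.append({
                "timestamp": row[0],
                "pid": row[PID_COL],
                "severity": row[SEVERITY_COL],
                "sqlstate": row[SQLSTATE_COL],
                "message": row[MESSAGE_COL],
                "stack": row[STACK_COL] if len(row) > STACK_COL else "",
            })
    return entries


if __name__ == "__main__":
    for entry in severe_entries(SAMPLE):
        print(entry["timestamp"], entry["severity"], entry["sqlstate"], entry["message"])
        if entry["stack"]:
            print(entry["stack"].rstrip())
```

Run against the full log file, this would list the SIGSEGV PANIC with its stack trace first, followed by the "recovery mode" FATAL entries from clients that tried to reconnect during crash recovery.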

What you think should happen instead

No response

How to reproduce

Rerun the same query: select count(*) from zzz_admin.aaa;

Operating System

CentOS 7.9

Anything else

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

@liyxbeijing liyxbeijing added the type: Bug Something isn't working label Sep 21, 2023
@my-ship-it my-ship-it added the priority: High After critical issues are fixed, these should be dealt with before any further issues. label Nov 13, 2023
@congxuebin
Collaborator

Just a reminder, this issue was marked as priority high by Max.
