[YSQL] Namespace Create Failed: The namespace is in process of deletion due to internal error #22046

Open
1 task done
archit-rastogi opened this issue Apr 18, 2024 · 1 comment
Labels: 2024.1_blocker, area/ysql, kind/bug, priority/medium


archit-rastogi commented Apr 18, 2024

Jira Link: DB-10962

Description

Namespace creation fails with an internal error when the master runs out of memory, which triggers a new leader election.
As a result, the client sees the following error:

com.yugabyte.util.PSQLException: An I/O error occurred while sending to the backend.

Stress run: http://stress.dev.yugabyte.com/stress_test/22936e17-ab94-4a8b-847e-20056f0b8907
Observed in versions: 2024.1, 2.23.0.0, 12.14.16.0
Adding more relevant logs for investigation:

Postgres logs:

2024-04-17 04:28:57.364 UTC [45456] ERROR:  Namespace Create Failed: The namespace is in process of deletion due to internal error.

2024-04-17 04:28:57.364 UTC [45456] STATEMENT:  create database db_200;
I0417 04:28:57.367967 45470 mem_tracker.cc:264] Creating root MemTracker with garbage collection threshold 5242880 bytes
I0417 04:28:57.368043 45470 mem_tracker.cc:268] Root memory limit is 9249301463
I0417 04:28:57.368681 45470 thread_pool.cc:178] Starting thread pool { name: pggate_ybclient max_workers: 1024 }
I0417 04:28:57.369256 45470 pg_client.cc:361] Using TServer host_port: 172.151.17.13:9100
I0417 04:28:57.370360 45470 pg_client.cc:374] Session id 129: Session id acquired. Postgres backend pid: 45470

2024-04-17 04:28:57.374 UTC [45470] FATAL:  database "db_200" does not exist


I0417 04:28:57.375366 45474 poller.cc:69] Poll stopped: Service unavailable (yb/rpc/scheduler.cc:78): Scheduler is shutting down (system error 108)
I0417 04:28:57.375527 45459 poller.cc:69] Poll stopped: Service unavailable (yb/rpc/scheduler.cc:78): Scheduler is shutting down (system error 108)
2024-04-17 04:28:57.378 UTC [36202] WARNING:  server process (PID 45470) exited with exit code 1
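
For triaging a longer stress-run log, the ERROR/STATEMENT pairing shown above can be extracted mechanically. The following is only a rough sketch: it assumes the Postgres log prefix looks exactly like the lines in this excerpt (timestamp, `UTC`, `[pid]`, level), and the names `PG_LINE` and `failed_statements` are made up for illustration, not part of any YugabyteDB tooling.

```python
import re

# Assumed prefix format, taken from the excerpt above:
#   2024-04-17 04:28:57.364 UTC [45456] ERROR:  <message>
PG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) UTC "
    r"\[(?P<pid>\d+)\] (?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def failed_statements(lines):
    """Pair each ERROR line with the next STATEMENT line from the same backend pid."""
    pending = {}  # pid -> (timestamp, error message) awaiting its STATEMENT line
    results = []
    for line in lines:
        m = PG_LINE.match(line.strip())
        if not m:
            continue  # glog-style lines (I0417 ...) and blanks are skipped
        pid, level = m.group("pid"), m.group("level")
        if level == "ERROR":
            pending[pid] = (m.group("ts"), m.group("msg"))
        elif level == "STATEMENT" and pid in pending:
            ts, err = pending.pop(pid)
            results.append((ts, pid, err, m.group("msg")))
    return results


# Lines copied from the Postgres log excerpt in this issue.
sample_pg_log = [
    "2024-04-17 04:28:57.364 UTC [45456] ERROR:  Namespace Create Failed: "
    "The namespace is in process of deletion due to internal error.",
    "",
    "2024-04-17 04:28:57.364 UTC [45456] STATEMENT:  create database db_200;",
    "I0417 04:28:57.369256 45470 pg_client.cc:361] Using TServer host_port: 172.151.17.13:9100",
    '2024-04-17 04:28:57.374 UTC [45470] FATAL:  database "db_200" does not exist',
]

failures = failed_statements(sample_pg_log)
for ts, pid, err, stmt in failures:
    print(f"{ts} pid={pid} stmt={stmt!r} error={err!r}")
```

The FATAL line from backend 45470 is deliberately not paired with anything, since it carries no STATEMENT line in the excerpt.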

Master logs:

W0417 04:28:46.777650 45377 scoped_leader_shared_lock.cc:170] RPC took a long time (../../src/yb/master/catalog_manager.cc:9182, ProcessPendingNamespace): 6.854s
    @     0xaaaac7999424  yb::master::CatalogManager::ProcessPendingNamespace()
    @     0xaaaac799a1b8  _ZNSt3__18__invokeB7v170002IRMN2yb6master14CatalogManagerEFvNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEENS_6vectorI13scoped_refptrINS2_9TableInfoEENS7_ISD_EEEENS1_19TransactionMetadataERKNS2_11LeaderEpochEERPS3_JRS9_RSF_RSG_RSH_EvEEDTcldsdeclsr3stdE7declvalIT0_EEclsr3stdE7declvalIT_EEspclsr3stdE7declvalIT1_EEEEOSU_OST_DpOSV_
    @     0xaaaac8a285d8  yb::ThreadPool::DispatchThread()
    @     0xaaaac8a24d58  yb::Thread::SuperviseThread()
    @     0xffff99c278b8  start_thread
    @     0xffff99c83afc  thread_start
W0417 04:28:49.578789 31446 binary_call_parser.cc:134] Unable to allocate read buffer because of limit, required: 81245751, blocked by: 0x0000273bffda36a0 -> Read Buffer->server->root, consumption: 0 of 81134223. Call will be ignored.

The master log is also flooded with "Tablespace information not found for table" warnings:

533:W0417 04:28:47.439368 31514 client_master_rpc.cc:92] ListTables: Leader Master has changed (172.151.17.13:7100 is no longer the leader), re-trying...
534:W0417 04:28:47.459101 31507 catalog_manager_util.cc:95] Internal error (yb/master/ysql_tablespace_manager.cc:72): Tablespace information not found for table 00004162000030008000000000004000
535:W0417 04:28:47.459120 43562 catalog_manager.cc:9640] Could not remove namespace from maps, name=db_199, id=00004163000030008000000000000000
536:W0417 04:28:47.459501 31507 catalog_manager_util.cc:95] Internal error (yb/master/ysql_tablespace_manager.cc:72): Tablespace information not found for table 00004161000030008000000000004000 (> 1200 entries in logs)
W0417 04:28:54.391921 43632 sys_catalog.cc:738] Waited for 5.000s for synchronous write to complete. Continuing to wait.
W0417 04:28:55.688746 43632 catalog_manager.cc:9230] Aborted (yb/consensus/replica_state.cc:643): Error copying PGSQL system tables for pending namespace: Operation aborted by new leader
W0417 04:28:55.688778 43632 scoped_leader_shared_lock.cc:170] RPC took a long time (../../src/yb/master/catalog_manager.cc:9182, ProcessPendingNamespace): 6.848s
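
To put a number on the warning noise in the master log, one option is to count warnings per emitting source location. This is a minimal sketch under the assumption that master log lines follow the glog-style prefix seen above (severity letter, MMDD, time, thread id, `file:line]`); the helper name `count_warnings_by_source` is hypothetical. The regex uses `search` so that a `grep -n` prefix such as `534:` is tolerated.

```python
import re
from collections import Counter

# Assumed glog-style prefix, as seen in the master log excerpt:
#   W0417 04:28:47.459101 31507 catalog_manager_util.cc:95] <message>
GLOG_LINE = re.compile(
    r"(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[\w./]+:\d+)\] (?P<msg>.*)"
)


def count_warnings_by_source(lines):
    """Count W-severity lines, keyed by the file:line that emitted them."""
    counts = Counter()
    for line in lines:
        m = GLOG_LINE.search(line)
        if m and m.group("sev") == "W":
            counts[m.group("src")] += 1
    return counts


# Lines copied from the master log excerpt in this issue.
sample_master_log = [
    "W0417 04:28:47.439368 31514 client_master_rpc.cc:92] ListTables: Leader Master "
    "has changed (172.151.17.13:7100 is no longer the leader), re-trying...",
    "W0417 04:28:47.459101 31507 catalog_manager_util.cc:95] Internal error "
    "(yb/master/ysql_tablespace_manager.cc:72): Tablespace information not found "
    "for table 00004162000030008000000000004000",
    "W0417 04:28:47.459120 43562 catalog_manager.cc:9640] Could not remove namespace "
    "from maps, name=db_199, id=00004163000030008000000000000000",
    "W0417 04:28:47.459501 31507 catalog_manager_util.cc:95] Internal error "
    "(yb/master/ysql_tablespace_manager.cc:72): Tablespace information not found "
    "for table 00004161000030008000000000004000",
]

warning_counts = count_warnings_by_source(sample_master_log)
for src, n in warning_counts.most_common():
    print(f"{n:6d}  {src}")
```

Run against the full master log (for example, by passing an open file object as `lines`), the top entry should show how much of the log volume comes from the tablespace warning alone.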

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.

lingamsandeep (Contributor) commented

Duplicate of #19903

@lingamsandeep lingamsandeep marked this as a duplicate of #19903 Apr 29, 2024