Primary & Replicas continuously restarting #3626

Closed
4 tasks
vaigau6g opened this issue Apr 9, 2023 · 6 comments
@vaigau6g

vaigau6g commented Apr 9, 2023

Please ensure you do the following when reporting a bug:

  • Provide a concise description of what the bug is.
    The primary and replicas restart continuously under the PostgreSQL operator on OpenShift 4.10.
  • Provide information about your environment.
    OpenShift: 4.10
    OpenShift Data Foundation: odf-operator.v4.10.10
    Crunchy Data PostgreSQL Operator: 5.3.0
    pgo-version: 5.3.0
    postgresVersion: 14
  • Provide clear steps to reproduce the bug.
  1. On OpenShift 4.10 with ODF 4.10.10 installed and a local storage cluster configured, install Crunchy Data PostgreSQL Operator 5.3.0.
  2. Create a cluster with replicas.
  3. Start performance testing, or otherwise generate a large number of transactions.
  4. The primary and replicas restart continuously.

@vaigau6g
Author

vaigau6g commented Apr 9, 2023

Exception in primary/replicas pods =======

2023-04-09 14:34:01,752 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:48184
2023-04-09 14:34:01,754 WARNING: Traceback (most recent call last):
File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
t.start()
File "/usr/lib64/python3.6/threading.py", line 867, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
2023-04-09 14:34:01,755 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:48182
2023-04-09 14:34:01,755 WARNING: Traceback (most recent call last):
File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
t.start()
File "/usr/lib64/python3.6/threading.py", line 867, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
2023-04-09 14:34:10,034 INFO: no action. I am (mdsp-psql-lpc-primary-8csb-0), the leader with the lock
2023-04-09 14:34:20,031 INFO: no action. I am (mdsp-psql-lpc-primary-8csb-0), the leader with the lock
2023-04-09 14:34:21,753 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:59482
2023-04-09 14:34:21,754 WARNING: Traceback (most recent call last):
File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
t.start()
File "/usr/lib64/python3.6/threading.py", line 867, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
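
To gauge how often the failure in the log above occurs and which clients trigger it, the log can be summarized with a few lines of Python. This is a minimal sketch, not part of the original report: the function name `summarize_thread_failures` and the `SAMPLE` text are illustrative, and the regular expression assumes the exact warning format shown in the log above.

```python
import re
from collections import Counter

# Matches the warning line format seen in the log above (an assumption:
# other Patroni versions or log settings may format this differently).
REQUEST_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} WARNING: "
    r"Exception happened during processing of request from "
    r"(?P<client>\S+):(?P<port>\d+)$"
)
THREAD_ERROR = "RuntimeError: can't start new thread"

# Shortened copy of the log lines from the report, for illustration only.
SAMPLE = """\
2023-04-09 14:34:01,752 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:48184
RuntimeError: can't start new thread
2023-04-09 14:34:10,034 INFO: no action. I am (mdsp-psql-lpc-primary-8csb-0), the leader with the lock
"""


def summarize_thread_failures(log_text):
    """Return (count of thread-start failures, Counter of client addresses)."""
    failures = 0
    clients = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line == THREAD_ERROR:
            failures += 1
            continue
        match = REQUEST_RE.match(line)
        if match:
            # Count by client address only; the source port changes per request.
            clients[match.group("client")] += 1
    return failures, clients
```

Calling `summarize_thread_failures(SAMPLE)` returns the failure count together with a per-client tally, which helps show whether the failing requests all come from a single address (for example a health-check source).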

@vaigau6g
Author

pg_hba configuration ============================

patronictl show-config
loop_wait: 10
postgresql:
  parameters:
    archive_command: pgbackrest --stanza=db archive-push "%p"
    archive_mode: 'on'
    archive_timeout: 60s
    default_pool_size: '100'
    jit: 'off'
    max_connections: '2500'
    password_encryption: scram-sha-256
    restore_command: pgbackrest --stanza=db archive-get %f "%p"
    shared_buffers: 4GB
    shared_preload_libraries: pgaudit
    ssl: 'on'
    ssl_ca_file: /pgconf/tls/ca.crt
    ssl_cert_file: /pgconf/tls/tls.crt
    ssl_key_file: /pgconf/tls/tls.key
    synchronous_commit: 'on'
    unix_socket_directories: /tmp/postgres
    wal_level: logical
  pg_hba:
  - local all "postgres" peer
  - hostssl replication "_crunchyrepl" all cert
  - hostssl "postgres" "_crunchyrepl" all cert
  - host all "_crunchyrepl" all reject
  - hostssl all "_crunchypgbouncer" all scram-sha-256
  - host all "_crunchypgbouncer" all reject
  - local all all trust
  - host all all all md5
  - hostssl all all all md5
  use_pg_rewind: true
  use_slots: false
synchronous_commit: 'on'
synchronous_mode: true
ttl: 30
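
PostgreSQL serves each client connection with its own backend process, so `max_connections: '2500'` in the configuration above permits up to roughly 2500 backend processes in the database container. Whether this is related to the `can't start new thread` errors is not confirmed anywhere in this thread. The sketch below only checks the arithmetic under stated assumptions: `pid_limit` is a hypothetical per-pod task limit, `other_tasks` is a hypothetical allowance for background processes and sidecar containers, and the function name is illustrative.

```python
def task_headroom(max_connections, pid_limit, other_tasks=50):
    """Return how many tasks remain under pid_limit if every connection slot
    is in use. A negative result means the limit would be exceeded.

    other_tasks is a hypothetical allowance for PostgreSQL background
    processes, Patroni threads, and sidecars; it is not a measured value.
    """
    return pid_limit - (max_connections + other_tasks)


# Illustrative limits only; the actual limit on the reporter's cluster
# is not stated in the issue.
for limit in (1024, 4096):
    print(limit, task_headroom(2500, limit))
```

A negative headroom would mean that a fully used connection budget alone exceeds the task limit, in which case new threads or processes could not be created in the pod.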

@vaigau6g
Author

The issue occurs when the application sends requests to PostgreSQL.
The error below appears in the PostgreSQL pod, which restarts continuously.

Exception happened during processing of request from ::ffff:10.128.10.1:43576
2023-04-11 06:58:03,453 WARNING: Traceback (most recent call last):
File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
t.start()
File "/usr/lib64/python3.6/threading.py", line 867, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
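
Python raises `RuntimeError: can't start new thread` when the operating system refuses to create another thread. One possible cause, not confirmed in this thread, is that the container has reached its task limit. On Linux, the pids cgroup controller exposes the current task count and the limit as `pids.current` and `pids.max`. The sketch below reads both; the directory to pass in is an assumption that depends on the cgroup version and mount layout inside the container (commonly `/sys/fs/cgroup/pids` on cgroup v1 or `/sys/fs/cgroup` on cgroup v2), and the function names are illustrative.

```python
from pathlib import Path


def parse_pids(current_text, max_text):
    """Parse the contents of pids.current and pids.max.

    Returns (current, limit); limit is None when pids.max contains "max",
    meaning no limit is set at this cgroup level.
    """
    current = int(current_text.strip())
    raw_limit = max_text.strip()
    limit = None if raw_limit == "max" else int(raw_limit)
    return current, limit


def read_pids_usage(cgroup_dir):
    """Read task usage from a pids cgroup directory (the path is an assumption)."""
    base = Path(cgroup_dir)
    return parse_pids(
        (base / "pids.current").read_text(),
        (base / "pids.max").read_text(),
    )
```

If `current` is at or near `limit` while the errors occur, the task limit is a plausible explanation; if it is far below, the cause lies elsewhere.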

@vaigau6g
Author

vaigau6g commented Apr 13, 2023

The postgres user credentials work with the connection string.
[Screenshot: postgres_credentials_working]

@vaigau6g
Author

Could this be caused by restoring a database from the older version into the newer PGO cluster?
Versions before the upgrade:
pgo version = 4.7.3
postgresql version = 13

@andrewlecuyer
Collaborator

Considering this issue is for an older version of CPK that is no longer actively maintained via the Crunchy Developer Program, I am proceeding with closing (see the Supported Platforms page for additional information about supported versions of CPK).

For information about upgrading from CPK v4 to v5, please see the upgrade guide:

https://access.crunchydata.com/documentation/postgres-operator/latest/upgrade/v4tov5

And if you still require support for CPK v4.7.3, I recommend reaching out to info@crunchydata.com to discuss your requirements/needs further.
