Allow recovery from rabbitmq restart #324
Merged: jmchilton merged 6 commits into galaxyproject:master from mvdbeek:dont_die_on_rabbitmq_restart on Apr 21, 2023
mvdbeek force-pushed the dont_die_on_rabbitmq_restart branch from 33724c0 to 73caeee (April 20, 2023 10:10)
Should fix

```
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]: pulsar.client.manager ERROR 2023-04-20 10:35:34,753 [pN:handler_0,p:1034676,tN:pulsar_client__default__kill_ack] Exception while handling kill acknowledgement messages, this shouldn't really happen. Handler should be restarted.
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]: Traceback (most recent call last):
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 446, in _reraise_as_library_errors
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     yield
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 433, in _ensure_connection
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     return retry_over_time(
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/utils/functional.py", line 312, in retry_over_time
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     return fun(*args, **kwargs)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 877, in _connection_factory
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self._connection = self._establish_connection()
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 812, in _establish_connection
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     conn = self.transport.establish_connection()
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/transport/pyamqp.py", line 201, in establish_connection
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     conn.connect()
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/amqp/connection.py", line 323, in connect
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self.transport.connect()
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/amqp/transport.py", line 129, in connect
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self._connect(self.host, self.port, self.connect_timeout)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/amqp/transport.py", line 184, in _connect
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self.sock.connect(sa)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]: ConnectionRefusedError: [Errno 111] Connection refused
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]: The above exception was the direct cause of the following exception:
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]: Traceback (most recent call last):
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/pulsar/client/manager.py", line 191, in ack_consumer
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self.exchange.consume(queue_name + '_ack', None, check=self)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/pulsar/client/amqp_exchange.py", line 124, in consume
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     with kombu.Consumer(connection, queues=[queue], callbacks=callbacks, accept=['json']):
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/messaging.py", line 387, in __init__
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self.revive(self.channel)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/messaging.py", line 400, in revive
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     channel = self.channel = maybe_channel(channel)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 1052, in maybe_channel
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     return channel.default_channel
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 895, in default_channel
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self._ensure_connection(**conn_opts)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 432, in _ensure_connection
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     with ctx():
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     self.gen.throw(typ, value, traceback)
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:   File "/srv/galaxy/venv/lib/python3.10/site-packages/kombu/connection.py", line 450, in _reraise_as_library_errors
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]:     raise ConnectionError(str(exc)) from exc
Apr 20 10:35:34 gat-4.eu.galaxy.training galaxyctl[1034676]: kombu.exceptions.OperationalError: [Errno 111] Connection refused
```

on the client side (=Galaxy) when restarting rabbitmq.
`ConnectionForced` seems very deliberate, and this may not fix all of #316, but a controlled restart of a rabbitmq server results in that error, and this PR fixes that.
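For readers unfamiliar with the failure mode, here is a minimal, stdlib-only sketch of the general idea of keeping a consumer loop alive across a broker restart. It is not the code merged in this PR. `consume_with_recovery`, `consume_once`, `should_continue` and the backoff parameters are hypothetical names chosen for illustration. The default `recoverable` tuple uses Python's built-in `ConnectionError`; with kombu one would presumably pass the library's own error types, such as the `kombu.exceptions.OperationalError` seen in the traceback.

```python
import time
from typing import Callable, Tuple, Type


def consume_with_recovery(
    consume_once: Callable[[], None],
    should_continue: Callable[[], bool],
    recoverable: Tuple[Type[BaseException], ...] = (ConnectionError,),
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call consume_once until should_continue() is false.

    Errors listed in `recoverable` are treated as a temporarily
    unreachable broker: the loop waits and retries instead of exiting.
    Returns the number of failures that were recovered from.
    """
    delay = initial_delay
    failures = 0
    while should_continue():
        try:
            consume_once()
            delay = initial_delay  # a clean pass resets the backoff
        except recoverable:
            # e.g. ConnectionRefusedError while the broker restarts
            failures += 1
            sleep(delay)
            delay = min(delay * 2, max_delay)  # capped exponential backoff
    return failures
```

Any exception not listed in `recoverable` still propagates, so genuine bugs in the handler are not hidden by the retry loop.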