Having issues generated by zerver.worker.queue_processors #12923

Open
crizise opened this issue Aug 2, 2019 · 6 comments
@crizise

crizise commented Aug 2, 2019

Hi,
My organization has started using Zulip, installed on a DigitalOcean droplet (2 vCPU, 4 GB RAM, Ubuntu 18.04).
From time to time we get the following errors:

[Django]: server closed the connection unexpectedly\n This probably means the server terminated abnormally\n before or while processing the request.\n

Logger root, from module zerver.worker.queue_processors line 147: Error generated by Anonymous user (not logged in) on zulip-ubuntu-s-2vcpu-4gb-sfo2-01 deployment

Traceback (most recent call last):

File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/home/zulip/deployments/2019-06-14-12-42-27/zerver/lib/db.py", line 32, in execute
return wrapper_execute(self, super().execute, query, vars)
File "/home/zulip/deployments/2019-06-14-12-42-27/zerver/lib/db.py", line 19, in wrapper_execute
return action(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly

This probably means the server terminated abnormally
before or while processing the request.
The above exception was the direct cause of the following exception:

Traceback (most recent call last):

File "/home/zulip/deployments/2019-06-14-12-42-27/zerver/worker/queue_processors.py", line 130, in consume_wrapper
self.consume(data)
File "/home/zulip/deployments/2019-06-14-12-42-27/zerver/worker/queue_processors.py", line 252, in consume
do_update_user_activity_interval(user_profile, log_time)
File "/home/zulip/deployments/2019-06-14-12-42-27/zerver/lib/actions.py", line 3760, in do_update_user_activity_interval
last = UserActivityInterval.objects.filter(user_profile=user_profile).order_by("-end")[0]
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/models/query.py", line 289, in __getitem__
return list(qs)[0]
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/models/query.py", line 250, in __iter__
self._fetch_all()
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/models/query.py", line 1121, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/models/query.py", line 53, in __iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch)
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 899, in execute_sql
raise original_exception
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 889, in execute_sql
cursor.execute(sql, params)
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/utils.py", line 94, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/utils/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/home/zulip/deployments/2019-06-14-12-42-27/zulip-py3-venv/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/home/zulip/deployments/2019-06-14-12-42-27/zerver/lib/db.py", line 32, in execute
return wrapper_execute(self, super().execute, query, vars)
File "/home/zulip/deployments/2019-06-14-12-42-27/zerver/lib/db.py", line 19, in wrapper_execute
return action(sql, params)
django.db.utils.OperationalError: server closed the connection unexpectedly

This probably means the server terminated abnormally
before or while processing the request.
Deployed code:

ZULIP_VERSION: 2.0.4

version: 2.0.4

Request info: none

Could you please tell us why we get errors like these and how to fix them?
Thanks!

@timabbott
Sponsor Member

@crizise that error is caused by the postgres server being in the process of restarting. I would do the following:

service postgresql restart
/home/zulip/deployments/current/scripts/restart-server

and see if that fixes it.

@hackerkid FYI.

@crizise
Author

crizise commented Aug 6, 2019

@timabbott thank you, I ran these commands and there seem to be no more errors.

@timabbott
Sponsor Member

OK. My guess is that apt auto-updated your postgres server version, and that's the source of the problem.

@hackerkid we probably will want to either disable unattended upgrades in that image or add a post-unattended-upgrade hook to restart the Zulip server. I'd be curious if the latter is possible; that would be a much cleaner solution.

@andersk do you know if there's a way to register hook code to run after unattended-upgrades updates a package?

@timabbott
Sponsor Member

I guess mvo5/unattended-upgrades#55 suggests one can use https://wiki.debian.org/AptConf to set a script to run after a successful dpkg invocation; we could have that check if the postgres version changed, and if so, restart the Zulip server?
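As a rough illustration of that idea, here is a minimal sketch of such a hook script. Everything in it beyond the `restart-server` path is an assumption for illustration: the script location, the state file, the `postgresql-10` package name (the default on Ubuntu 18.04), and the choice of apt's `DPkg::Post-Invoke` option to trigger it. Which user should run the restart is also left open.

```python
#!/usr/bin/env python3
# Hypothetical hook. It could be wired up with an apt.conf.d fragment such as:
#   DPkg::Post-Invoke { "/usr/local/bin/zulip-postgres-hook"; };
# The script path, state file, and package name below are assumptions.
import subprocess
from pathlib import Path
from typing import Callable, Optional

STATE_FILE = Path("/var/lib/zulip-postgres-hook/version")  # assumed location
RESTART_CMD = ["/home/zulip/deployments/current/scripts/restart-server"]


def installed_version(package: str = "postgresql-10") -> Optional[str]:
    """Ask dpkg-query for the installed version; None if not installed."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Version}", package],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def restart_if_changed(
    current: Optional[str], state_file: Path, restart: Callable[[], None]
) -> bool:
    """Restart only when a previously recorded version differs from `current`."""
    previous = state_file.read_text().strip() if state_file.exists() else None
    if current is None or current == previous:
        return False
    if previous is not None:
        restart()
    # Record the new baseline (on the first run this is all that happens).
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(current)
    return previous is not None


def main() -> None:
    """Entry point the hook would call after each dpkg invocation."""
    restart_if_changed(
        installed_version(),
        STATE_FILE,
        lambda: subprocess.run(RESTART_CMD, check=True),
    )
```

Keeping the comparison in `restart_if_changed` separate from the `dpkg-query` call means the decision logic can be exercised without touching the real package database.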

@andersk
Member

andersk commented Aug 8, 2019

Is that really a hook you want? It seems overly specific. It only helps under the conditions that Postgres is running on the same system and its restart was triggered by dpkg, neither of which fundamentally has anything to do with the problem.

Shouldn’t it be the Zulip server’s job to notice that it lost its Postgres connection for any reason, and retry or restart itself?
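For illustration only, the "notice and retry" behavior could look roughly like the sketch below. It is not Zulip's actual code: `call_with_reconnect` and its parameters are hypothetical names, and the function is deliberately generic so it runs without Django installed.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_reconnect(
    operation: Callable[[], T],
    reconnect: Callable[[], None],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> T:
    """Run `operation`; on a connection-type error, reconnect and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: let the caller see the error
            reconnect()  # drop the stale connection before retrying
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

In a Django-based worker, one plausible wiring would be `retry_on=(django.db.OperationalError,)` with `reconnect=django.db.connection.close`, since Django opens a fresh connection lazily on the next query. Whether that fits Zulip's queue workers is an assumption, not something this thread confirms.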

@timabbott
Sponsor Member

@andersk it's a good question. We do have code to do that sort of restart, and I suspect that happened in this case, but the current setup restarts only the queue worker or other process that hit the issue, not all of Zulip, so on low-traffic systems one gets a series of seemingly random exception emails like this one for roughly the next 24 hours.

This is extremely confusing to sysadmins for whom the server restart happened automatically via unattended-upgrades, because as far as they know, they didn't do anything, and suddenly they get this somewhat spread-out burst of exception emails.

I suppose there are other ways we could address this (e.g., silencing the exception if we can see the postgres server was just restarted), but all of the options are kinda messy, and having an explicit trigger is likely helpful.

(I suspect folks running a postgres-only system are much less likely to have unattended-upgrades running by accident)
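The "silence the exception if postgres was just restarted" alternative mentioned above could be sketched as a simple time-window check. The function name and the 24-hour default (borrowed from the estimate earlier in this comment) are assumptions; the restart time would have to come from somewhere such as PostgreSQL's `pg_postmaster_start_time()` function.

```python
from datetime import datetime, timedelta, timezone


def error_follows_restart(
    error_time: datetime,
    postmaster_start: datetime,
    grace: timedelta = timedelta(hours=24),
) -> bool:
    """True when the error occurred within `grace` after Postgres started."""
    return postmaster_start <= error_time <= postmaster_start + grace


# Example values (made up): a restart at noon UTC and an error one hour later.
restarted_at = datetime(2019, 8, 2, 12, 0, tzinfo=timezone.utc)
failed_at = restarted_at + timedelta(hours=1)
suppress_email = error_follows_restart(failed_at, restarted_at)
```

A check like this trades precision for simplicity: it would also hide unrelated connection errors that happen to occur inside the window, which is part of why this option is messy.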
