Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

synapse newer than 1.64 doesn't start #59

Closed
worldowner opened this issue Oct 17, 2022 · 22 comments
Closed

synapse newer than 1.64 doesn't start #59

worldowner opened this issue Oct 17, 2022 · 22 comments

Comments

@worldowner
Copy link

I start synapse using avhost/docker-matrix as a service:

docker service create \
    --name synapse \
    --publish published=3478,target=3478,protocol=tcp \
    --publish published=3478,target=3478,protocol=udp \
    --publish published=8448,target=8448 \
    --mount type=volume,src=synapse-data,dst=/data \
    --mount type=bind,src=/etc/localtime,dst=/etc/localtime,ro \
    --mount type=volume,src=certbot-etc-letsencrypt,dst=/etc/letsencrypt,ro \
    --env REPORT_STATS=no \
    --env SERVER_NAME="MYDOMAIN.TLD" \
    --network matrix \
    avhost/docker-matrix:v1.68.0

Any image newer than 1.64.0 doesn't seem to try to start synapse:

# docker logs synapse.1.mdk3la8v8urmihcrlnz3umkyf
-=> start turn
-=> start matrix
Cannot create pid file: /var/run/turnserver.pid: Permission denied
Cannot create relay thread
: Operation not permitted
0: : Black listing: 172.16.0.0-172.31.255.255
0: : log file opened: /data/turnserver_2022-10-17.log
0: : 
RFC 3489/5389/5766/5780/6062/6156 STUN/TURN Server
Version Coturn-4.5.2 'dan Eider'
0: : 
Max number of open files/sockets allowed for this process: 1048576
0: : 
Due to the open files/sockets limitation,
max supported number of TURN Sessions possible is: 524000 (approximately)
0: : 

==== Show him the instruments, Practical Frost: ====

0: : TLS supported
0: : DTLS supported
0: : DTLS 1.2 supported
0: : TURN/STUN ALPN supported
0: : Third-party authorization (oAuth) supported
0: : GCM (AEAD) supported
0: : OpenSSL compile-time version: OpenSSL 3.0.3 3 May 2022 (0x30000030)
0: : 
0: : SQLite supported, default database location is /var/lib/turn/turndb
0: : Redis supported
0: : PostgreSQL supported
0: : MySQL supported
0: : MongoDB is not supported
0: : 
0: : Default Net Engine version: 3 (UDP thread per CPU core)

=====================================================

0: : Domain name: 
0: : Default realm: turn.MYDOMAIN.TLD
0: : 
CONFIGURATION ALERT: You specified --lt-cred-mech and --use-auth-secret in the same time.
Be aware that you could not mix the username/password and the shared secret based auth methods. 
Shared secret overrides username/password based auth method. Check your configuration!
0: : SSL23: Certificate file found: /etc/letsencrypt/live/MYDOMAIN.TLD/fullchain.pem
0: : SSL23: Private key file found: /etc/letsencrypt/live/MYDOMAIN.TLD/privkey.pem
0: : TLS1.0: Certificate file found: /etc/letsencrypt/live/MYDOMAIN.TLD/fullchain.pem
0: : TLS1.0: Private key file found: /etc/letsencrypt/live/MYDOMAIN.TLD/privkey.pem
0: : TLS1.1: Certificate file found: /etc/letsencrypt/live/MYDOMAIN.TLD/fullchain.pem
0: : TLS1.1: Private key file found: /etc/letsencrypt/live/MYDOMAIN.TLD/privkey.pem
0: : TLS1.2: Certificate file found: /etc/letsencrypt/live/MYDOMAIN.TLD/fullchain.pem
0: : TLS1.2: Private key file found: /etc/letsencrypt/live/MYDOMAIN.TLD/privkey.pem
0: : TLS cipher suite: HIGH
0: : DTLS: Certificate file found: /etc/letsencrypt/live/MYDOMAIN.TLD/fullchain.pem
0: : DTLS: Private key file found: /etc/letsencrypt/live/MYDOMAIN.TLD/privkey.pem
0: : DTLS1.2: Certificate file found: /etc/letsencrypt/live/MYDOMAIN.TLD/fullchain.pem
0: : DTLS1.2: Private key file found: /etc/letsencrypt/live/MYDOMAIN.TLD/privkey.pem
0: : DTLS cipher suite: HIGH
0: : NO EXPLICIT LISTENER ADDRESS(ES) ARE CONFIGURED
0: : ===========Discovering listener addresses: =========
0: : Listener address to use: 127.0.0.1
0: : Listener address to use: 10.255.0.30
0: : Listener address to use: 172.18.0.7
0: : Listener address to use: 10.0.1.34
0: : =====================================================
0: : Total: 3 'real' addresses discovered
0: : =====================================================
0: : NO EXPLICIT RELAY ADDRESS(ES) ARE CONFIGURED
0: : ===========Discovering relay addresses: =============
0: : Relay address to use: 10.255.0.30
0: : Relay address to use: 172.18.0.7
0: : Relay address to use: 10.0.1.34
0: : =====================================================
0: : Total: 3 relay addresses discovered
0: : =====================================================
0: : Cannot create pid file: /var/run/turnserver.pid
0: : pid file created: /var/tmp/turnserver.pid
0: : IO method (main listener thread): epoll (with changelist)
0: : Wait for relay ports initialization...
0: :   relay 10.255.0.30 initialization...
0: :   relay 10.255.0.30 initialization done
0: :   relay 172.18.0.7 initialization...
0: :   relay 172.18.0.7 initialization done
0: :   relay 10.0.1.34 initialization...
0: :   relay 10.0.1.34 initialization done
0: : Relay ports initialization done

1.64.0 works perfectly fine.

@andreaspeters
Copy link
Member

Thanks for your issue. One of your error messages is Due to the open files/sockets limitation and you use coturn, I want ask you if you can disable this service.

@worldowner
Copy link
Author

I'm not sure how to disable coturn in your image without rebuilding it. README doesn't mention any option to control that.
Also this message looks like a warning and "max supported number of TURN Sessions possible is: 524000 (approximately)" is more than enough. In fact I don't even use it, my synapse is configured to use different TURN server so I'd be happy to get rid of it :)

@andreaspeters
Copy link
Member

Thats good, then we can just remove the coturn config file and remove the ports forward in the docker command. I only want to be sure that it's not an coturn issue. To be honest, no one of us contributors use coturn in this image we took over from Silvio. We just keep it inside for compatibility reasons. But if that one start to mess up something. It's a good time to remove it. 😅

@worldowner
Copy link
Author

Deleting config file and removing ports forward doesn't prevent coturn from starting :)

BTW. someone reported a problem with coturn version that you use in the image (4.5.2) that looks the same as I encounter:
coturn/coturn#683

@andreaspeters
Copy link
Member

Thats right, but it will run with default values. So, please remove the file and the forward ports.

@andreaspeters
Copy link
Member

But the coturn PR is interesting. 👍

@worldowner
Copy link
Author

worldowner commented Oct 18, 2022

I tried running coturn with default values (by deleting my config file and removing port forwarding from docker service) and synapse still doesn't start.

@andreaspeters
Copy link
Member

Messages the same as before? Next step would be, to share your homeserver.yaml config file and information about your host system (linux version, docker version). Please be sure that you remove credentials, tokens server name and IP's from the config file.

@biberino
Copy link

Same issue here, all Version above 1.64.0 don't start Synapse. I use Coturn.
Last lines when starting the container:

synapse_1 | 0: : ===========Discovering listener addresses: =========
synapse_1 | 0: : Listener address to use: 127.0.0.1
synapse_1 | 0: : Listener address to use: 172.29.0.2
synapse_1 | 0: : =====================================================
synapse_1 | 0: : Total: 1 'real' addresses discovered
synapse_1 | 0: : =====================================================
synapse_1 | 0: : NO EXPLICIT RELAY ADDRESS(ES) ARE CONFIGURED
synapse_1 | 0: : ===========Discovering relay addresses: =============
synapse_1 | 0: : Relay address to use: 172.29.0.2
synapse_1 | 0: : =====================================================
synapse_1 | 0: : Total: 1 relay addresses discovered
synapse_1 | 0: : =====================================================
synapse_1 | 0: : Cannot create pid file: /var/run/turnserver.pid
synapse_1 | 0: : pid file created: /var/tmp/turnserver.pid
synapse_1 | 0: : IO method (main listener thread): epoll (with changelist)
synapse_1 | 0: : WARNING: I cannot support STUN CHANGE_REQUEST functionality because only one IP address is provided
synapse_1 | 0: : Wait for relay ports initialization...
synapse_1 | 0: : relay 172.29.0.2 initialization...
synapse_1 | 0: : relay 172.29.0.2 initialization done
synapse_1 | 0: : Relay ports initialization done
matrix_synapse_1 exited with code 255

It looks like Coturn hangs at "Relay ports initialization done", or its initialization is completed and Synapse doesnt start.
Rolling the container back to v1.64.0 and everything works again and I dont get any of the above Coturn logs.

@andreaspeters
Copy link
Member

Ok, with the next release, I will add a feature to disable coturn and flag it as deprecated. I will also add these feature to version 1.69.x. Only to give you a working update path to the 1.7x version.

@biberino
Copy link

biberino commented Nov 1, 2022

I looked into it, and it looks like the coturn server is fine.
If i disable the turnserver in "start.sh", and only start synapse and remove the "&" to actually see the output, i get the following error from synapse:

synapse_1 | 2022-11-01 10:13:45,649 - synapse.federation.federation_server - 1385 - INFO - main - Registering federation query handler for 'profile'
synapse_1 | 2022-11-01 10:13:45,650 - synapse.federation.federation_server - 1365 - INFO - main - Registering federation EDU handler for 'm.presence'
synapse_1 | 2022-11-01 10:13:45,651 - synapse.federation.federation_server - 1365 - INFO - main - Registering federation EDU handler for 'm.typing'
synapse_1 | 2022-11-01 10:13:45,652 - synapse.federation.federation_server - 1385 - INFO - main - Registering federation query handler for 'directory'
synapse_1 | 2022-11-01 10:13:45,653 - twisted - 274 - INFO - main - Redirected stdout/stderr to logs
synapse_1 | 2022-11-01 10:13:45,653 - synapse.app.homeserver - 167 - INFO - sentinel - Running
synapse_1 | 2022-11-01 10:13:45,655 - synapse.app.homeserver - 29 - INFO - sentinel - Set file limit to: 1048576
synapse_1 | 2022-11-01 10:13:45,657 - twisted - 274 - CRITICAL - sentinel - Unhandled Error
synapse_1 | Traceback (most recent call last):
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/internet/base.py", line 498, in fireEvent
synapse_1 | DeferredList(beforeResults).addCallback(self._continueFiring)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 497, in addCallback
synapse_1 | return self.addCallbacks(callback, callbackArgs=args, callbackKeywords=kwargs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 477, in addCallbacks
synapse_1 | self._runCallbacks()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 857, in _runCallbacks
synapse_1 | current.result = callback( # type: ignore[misc]
synapse_1 | --- ---
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/internet/base.py", line 510, in _continueFiring
synapse_1 | callable(*args, **kwargs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/enterprise/adbapi.py", line 237, in _start
synapse_1 | return self.start()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/enterprise/adbapi.py", line 247, in start
synapse_1 | self.threadpool.start()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/python/threadpool.py", line 158, in start
synapse_1 | self.adjustPoolsize()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/python/threadpool.py", line 304, in adjustPoolsize
synapse_1 | self._team.grow(self.min - self.workers)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_team.py", line 118, in grow
synapse_1 | def createOneWorker():
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_threadworker.py", line 107, in do
synapse_1 | working.pop(0)()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_team.py", line 120, in createOneWorker
synapse_1 | worker = self._createWorker()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_pool.py", line 55, in limitedWorkerCreator
synapse_1 | return ThreadWorker(startThread, Queue())
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_threadworker.py", line 49, in init
synapse_1 | startThread(work)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_pool.py", line 49, in startThread
synapse_1 | return threadFactory(target=target).start()
synapse_1 | File "/usr/lib/python3.10/threading.py", line 935, in start
synapse_1 | _start_new_thread(self._bootstrap, ())
synapse_1 | builtins.RuntimeError: can't start new thread
synapse_1 |
synapse_1 | 2022-11-01 10:13:45,658 - synapse.metrics.background_process_metrics - 242 - ERROR - _handle_new_device_update_async-0 - Background process '_handle_new_device_update_async' threw an exception
synapse_1 | Traceback (most recent call last):
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/metrics/background_process_metrics.py", line 240, in run
synapse_1 | return await func(*args, **kwargs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/handlers/device.py", line 681, in _handle_new_device_update_async
synapse_1 | rows = await self.store.get_uncoverted_outbound_room_pokes()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/storage/databases/main/devices.py", line 1900, in get_uncoverted_outbound_room_pokes
synapse_1 | return await self.db_pool.runInteraction(
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/storage/database.py", line 882, in runInteraction
synapse_1 | return await delay_cancellation(_runInteraction())
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 959, in send
synapse_1 | raise result.value
synapse_1 | RuntimeError: can't start new thread
synapse_1 | 2022-11-01 10:13:45,660 - synapse.handlers.deactivate_account - 230 - INFO - user_parter_loop-0 - Starting user parter
synapse_1 | 2022-11-01 10:13:45,660 - synapse.metrics.background_process_metrics - 242 - ERROR - user_parter_loop-0 - Background process 'user_parter_loop' threw an exception
synapse_1 | Traceback (most recent call last):
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/metrics/background_process_metrics.py", line 240, in run
synapse_1 | return await func(*args, **kwargs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/handlers/deactivate_account.py", line 233, in _user_parter_loop
synapse_1 | user_id = await self.store.get_user_pending_deactivation()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/storage/databases/main/registration.py", line 1264, in get_user_pending_deactivation
synapse_1 | return await self.db_pool.simple_select_one_onecol(
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/storage/database.py", line 1614, in simple_select_one_onecol
synapse_1 | return await self.runInteraction(
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/storage/database.py", line 882, in runInteraction
synapse_1 | return await delay_cancellation(_runInteraction())
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 959, in send
synapse_1 | raise result.value
synapse_1 | RuntimeError: can't start new thread
synapse_1 | 2022-11-01 10:13:45,661 - synapse.app._base - 257 - CRITICAL - sentinel - Error during startup
synapse_1 | Traceback (most recent call last):
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/app/_base.py", line 242, in wrapper
synapse_1 | await cb(*args, **kwargs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/app/homeserver.py", line 391, in start
synapse_1 | await _base.start(hs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/app/_base.py", line 407, in start
synapse_1 | resolver_threadpool.start()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/python/threadpool.py", line 158, in start
synapse_1 | self.adjustPoolsize()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/python/threadpool.py", line 304, in adjustPoolsize
synapse_1 | self._team.grow(self.min - self.workers)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_team.py", line 118, in grow
synapse_1 | def createOneWorker():
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_threadworker.py", line 107, in do
synapse_1 | working.pop(0)()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_team.py", line 120, in createOneWorker
synapse_1 | worker = self._createWorker()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_pool.py", line 55, in limitedWorkerCreator
synapse_1 | return ThreadWorker(startThread, Queue())
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_threadworker.py", line 49, in init
synapse_1 | startThread(work)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_pool.py", line 49, in startThread
synapse_1 | return threadFactory(target=target).start()
synapse_1 | File "/usr/lib/python3.10/threading.py", line 935, in start
synapse_1 | _start_new_thread(self._bootstrap, ())
synapse_1 | RuntimeError: can't start new thread
synapse_1 | Error during startup:
synapse_1 | Traceback (most recent call last):
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/app/_base.py", line 242, in wrapper
synapse_1 | await cb(*args, **kwargs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/app/homeserver.py", line 391, in start
synapse_1 | await _base.start(hs)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/synapse/app/_base.py", line 407, in start
synapse_1 | resolver_threadpool.start()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/python/threadpool.py", line 158, in start
synapse_1 | self.adjustPoolsize()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/python/threadpool.py", line 304, in adjustPoolsize
synapse_1 | self._team.grow(self.min - self.workers)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_team.py", line 118, in grow
synapse_1 | def createOneWorker():
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_threadworker.py", line 107, in do
synapse_1 | working.pop(0)()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_team.py", line 120, in createOneWorker
synapse_1 | worker = self._createWorker()
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_pool.py", line 55, in limitedWorkerCreator
synapse_1 | return ThreadWorker(startThread, Queue())
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_threadworker.py", line 49, in init
synapse_1 | startThread(work)
synapse_1 | File "/usr/local/lib/python3.10/dist-packages/twisted/_threads/_pool.py", line 49, in startThread
synapse_1 | return threadFactory(target=target).start()
synapse_1 | File "/usr/lib/python3.10/threading.py", line 935, in start
synapse_1 | _start_new_thread(self._bootstrap, ())
synapse_1 | RuntimeError: can't start new thread

It looks like Synapse cant start new threads. This is only true for V1.65.0, V1.64.0 can start these threads!
I dont have any systemd or docker limits enabeld (at least that I'm aware of).

@andreaspeters
Copy link
Member

andreaspeters commented Nov 1, 2022

🤔 Looks like a limits issue from your system. May I ask where/how you run docker-matrix?

@biberino
Copy link

biberino commented Nov 1, 2022

Sure i run it on a Debian 10 Server, Docker version 20.10.3

image

with the following docker-compose.yaml:

    synapse:
            image: avhost/docker-matrix:v1.64.0
            ports:
                - "127.0.0.1:8008:8008"
                - "3478:3478"
            volumes:
                - /opt/matrix/synapse:/data
            restart: always

I dont see limits in systemd or docker itself.
cat /proc/sys/kernel/threads-max gives me 128106
Number of currently running threads is 1342, this includes Synapse 1.64.0.

Are there other limits i need to check? And why is it that only from v1.65.0 on, that Synapse has suddenly problems spawning threads?

@andreaspeters
Copy link
Member

Strange.... Yeah that's a good question. 🤷‍♂️ I do not have an answer right now and my own instance is also not affected by that. I will ask in our docker-matrix room if someone have an idea.

@neurosys-zero
Copy link

+1

@andreaspeters
Copy link
Member

No feedback from the community. But I add the possibility to disable coturn. (docker tag v1.65.0_av1). If you set the environment COTURN_ENABLE="false" then it should be disable. Please try and give me feedback. Also do not forget to remove the exposed coturn ports from your docker command. Sorry, I have no other idea. :-(

@neurosys-zero
Copy link

Unfortunately this did not work for me :(

@andreaspeters
Copy link
Member

:-( but thanks for feedback @neurosys-zero

@neurosys-zero
Copy link

Update: The host was running Debian 9 / Stretch I did a dist-update on her to 10 / Buster, updated the old docker to docker-ce and now any tags later than 1.64 run correctly. :)

@andreaspeters
Copy link
Member

🥳 thats a great feedback. Thanks a lot @neurosys-zero :-D

@worldowner
Copy link
Author

I can confirm that it works on Debian 10 or higher. I successfully upgraded to 1.84.1.

@andreaspeters
Copy link
Member

@worldowner thanks for feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants