Talk crashes the entire instance when doing public meetings. #2010
Comments
Could you send me a link to a public conversation at <my github name>@nextcloud.com? It works fine here. |
Going in a meeting but I can do that in exactly 2 hours if that works for you
|
I receive a "502 Bad Gateway". Can you check your apache2/nginx logs? There should be something somewhere. |
Yeah that's the web container crashing... |
When exactly does it crash: when you make the conversation public, when you join the chat as a guest, or when you start the call? |
As soon as a user is in the call the web container starts getting sluggish which ends up with 502 errors. |
We do long-polling requests (2 in parallel: one for chat messages and one for call-related WebRTC signaling messages). Maybe that is the problem? |
I'm not intelligent enough to understand that answer, unfortunately. I'm still investigating the nginx logs, but /var/log/nginx/error.log seems to be completely empty in both the web and app containers, which sounds rather weird. |
Basically, every user has 2 constant connections open to your server. Maybe something in your configuration limits the number of possible open connections and therefore causes this problem. |
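To make the point about 2 constant connections per user concrete, here is a rough sketch of the arithmetic. It assumes each long-polling request keeps one PHP worker busy for as long as it is open; the function names and the pool sizes are hypothetical and only illustrate the idea, they are not taken from Talk itself.

```python
def workers_held_by_polling(participants: int, polls_per_user: int = 2) -> int:
    """Workers tied up, assuming every open long poll occupies one worker."""
    return participants * polls_per_user


def free_workers(pool_size: int, participants: int, polls_per_user: int = 2) -> int:
    """Workers left over for ordinary page requests; never negative."""
    held = workers_held_by_polling(participants, polls_per_user)
    return max(0, pool_size - held)


if __name__ == "__main__":
    # Pool sizes below are illustrative values, not recommendations.
    for pool_size in (2, 5, 10, 20):
        for participants in (1, 3, 5):
            left = free_workers(pool_size, participants)
            print(f"pool={pool_size:2d} participants={participants} free={left}")
```

Under that assumption, a pool with only a couple of workers has nothing left for normal page loads as soon as a single participant is connected.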
I don't recall doing any specific setup, apart from setting php-fpm to static and limiting it to 2 instances. re: nginx:
Update: If someone's reading this later for some reason, yes, it's normal. This allows you to watch errors directly using docker, or portainer if you use that, without going to the file. |
I just checked my nginx-conf but for me this looks OK re basic setup, I double checked against the official docker one:
|
Also checked the main proxy log to see if there was anything there, but according to it all requests get forwarded normally with no issue. |
OK, I now have logs for the proxy, the web container, and the app container in front of me. ALL requests end up with a 200 status code. I'm at a loss as to where to look from here. |
So, running the test again with all logs active, I have some insight. It seems the client is closing the connection before everything crashes, but I still don't see why:
.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy HTTP/1.1" 404 79 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "DELETE /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy/participants/active HTTP/1.1" 200 138 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy HTTP/1.1" 499 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /apps/apporder/getOrder HTTP/1.1" 200 182 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /avatar/Ombi/64 HTTP/1.1" 201 951 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20" |
OK, I've even tried putting just the Nextcloud container online, with no load balancer or anything else running, and it still fails. I'm 99% sure this has to do with docker, and I think what @nickvergessen wrote is logical: since it happens as soon as a call is started, it could be the long polls. As it is, I don't know how to investigate any further and I am kind of stuck. I'll keep everything as-is for further testing and debugging if someone can help. |
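For anyone sifting through access logs like the ones above, a small script can tally status codes and pick out the 499 entries (nginx's non-standard code for a request the client closed before a response was sent). This is only a sketch: the regular expression assumes the request-then-status layout seen in the lines above, the helper names are made up, and the sample lines are copied from this thread with the user agent shortened.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it, e.g.
#   "GET /login HTTP/1.1" 302 0 ...
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

# Lines taken from the log in this thread; user agent shortened for brevity.
SAMPLE = [
    '10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "DELETE /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy/participants/active HTTP/1.1" 200 138 "-" "Mozilla/5.0" "10.0.4.20"',
    '10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy HTTP/1.1" 499 0 "-" "Mozilla/5.0" "10.0.4.20"',
    '10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0" "10.0.4.20"',
    '10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0" "10.0.4.20"',
]


def status_counts(lines):
    """Count how often each HTTP status code appears; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts


def requests_with_status(lines, status):
    """Return (method, path) for every line whose status equals `status`."""
    found = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status") == status:
            found.append((match.group("method"), match.group("path")))
    return found


if __name__ == "__main__":
    print(dict(status_counts(SAMPLE)))
    print(requests_with_status(SAMPLE, "499"))
```

A rising share of 499 entries on the room endpoints would fit the observation that the client gives up before the server answers.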
Hi everyone. I got the same error. When I use PHP 7.0, the problem disappears. |
Which version were you using before? |
@nickvergessen I used PHP 7.3. |
Ok I'm back - I'll try to do this next week. Just changing PHP version in the container was annoying so I'll clone into https://github.com/nextcloud/docker/tree/060cf0883ff12241081778714507e2823d84e629/16.0/fpm change the reference to PHP7.0 then build the image from that. If anyone has concerns about this way of testing the issue let me know. Probably will do this tomorrow or smth. |
@Windyo can you give me all your compose after that ? ;) |
Is there an issue with using PHP 7.3 with Nextcloud talk? I'm on PHP 7.3 and I sometimes can't get reliable joins from people in a video call. Considering dropping back to 7.2 as a test. |
@Windyo - PHP 5.6 and 7.0 are both end of life and no longer supported. I think you need 7.1 at a minimum, 7.2 probably better. (7.1 goes end of life in December 2019) |
did it work for you too @tdm4 ? |
@nickvergessen yes, downgrading to PHP 7.2 fixed my issues.. I think there's some problems with PHP 7.3. |
@tdm4 for us, we clearly got bad performance with 7.1, 7.2, and 7.3, but we use fpm. |
@tanguy-opendsi Yes, I use fpm too. It doesn't matter whether you use php-fpm or some kind of apache prefork.. it's PHP itself here. I just use php-fpm with |
@tdm4 thanks for your reply, but when I switch to fpm I get the same troubles. |
@tanguy-opendsi I don't use apache webserver. |
@tdm4, |
Quick update: I'm still trying to get the 7.2 container running with no errors. Building the image from https://github.com/nextcloud/docker/tree/060cf0883ff12241081778714507e2823d84e629/16.0/fpm with the reference changed to 7.2 works, but then I get a slew of server errors complaining about wrong permissions for some reason. I still haven't figured that out, so at the moment I can't test whether this fixes Talk. |
Hi, I am getting similar issues with public chats being very buggy and slow. I use Mail-in-a-Box ( https://mailinabox.email/ ), and modified it to install Nextcloud Talk (spreed).
You can see the basic nginx configuration on their Github: https://github.com/mail-in-a-box/mailinabox/tree/master/conf but I think you are mostly looking at https://github.com/mail-in-a-box/mailinabox/blob/master/conf/nginx-primaryonly.conf |
Well Nextcloud 15 is 3 major versions behind, have you considered doing an update? |
It looks like Mail-in-a-Box hard-coded Nextcloud version 17.0.2 with a hash of 8095fb46e9e0c536163708aee3d17fab8b498ad6. I would like to propose a change; are there any concerns I might want to address with the project? |
I ran this command, and got that it is up-to-date.
|
Well that updates the database based on the current files.
|
Hey, er, I still have the original issue though.
|
I guess so, but we can't fix a misconfigured php/webserver setup in our software. |
On one hand, I'm 100% with you that you can't fix a misconfigured server. On the other hand, it's a docker instance, so config shouldn't really be an issue, especially as I've tested with a brand new config... I'll check the HTTP2 thing out, but I don't think that's it, as in this case the entire container crashes and restarts. |
Hi, I was having a similar issue and I solved it.
Server configuration
Description
Starting a call makes PHP extremely slow. Once the conversation is closed and PHP restarted, everything is back to normal.
Solution
After looking at the PHP logs I saw that max_children was reached. After raising it (from 5 to 20; 10 was not enough) and restarting php7.3-fpm, everything is working again. |
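If you want to check your own PHP-FPM log for the same symptom, something along these lines could do it. It is a sketch under assumptions: it searches for the "server reached pm.max_children setting" warning that PHP-FPM writes when the pool is exhausted, the sample lines are invented for illustration (timestamp and pool name are made up), and the function name is hypothetical.

```python
import re

# PHP-FPM logs a warning containing this phrase when all workers are busy.
MAX_CHILDREN_RE = re.compile(r"server reached pm\.max_children setting \((\d+)\)")

# Invented sample lines, only to exercise the function below.
SAMPLE_LOG = [
    "[17-Jul-2019 22:43:31] NOTICE: fpm is running, pid 1",
    "[17-Jul-2019 22:45:02] WARNING: [pool www] server reached pm.max_children setting (5), consider raising it",
    "[17-Jul-2019 22:45:10] NOTICE: ready to handle connections",
]


def max_children_hits(lines):
    """Return the configured limit from every line reporting an exhausted pool."""
    limits = []
    for line in lines:
        match = MAX_CHILDREN_RE.search(line)
        if match:
            limits.append(int(match.group(1)))
    return limits


if __name__ == "__main__":
    hits = max_children_hits(SAMPLE_LOG)
    if hits:
        print(f"pool exhausted {len(hits)} time(s); limit(s) seen: {sorted(set(hits))}")
    else:
        print("no max_children warnings found")
```

In a real setup you would pass the lines of your own log file (its location depends on the distribution or container image) instead of `SAMPLE_LOG`.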
Steps to reproduce
Expected behaviour
Calls happen
Actual behaviour
Entire instance becomes ultra-slow on any operation.
Restarting App container gets the performance back to normal.
Netdata does not report any high CPU usage, iowait, or network issue. The instance just borks and ends up throwing a 502 error.
Browser
All
Microphone available: yes
Camera available: yes
Operating system: Windows
Browser name: All
Spreed app
Spreed app version: 6.0.2
Custom TURN server configured: no
Custom STUN server configured: no
Server configuration
Operating system: Ubuntu
Web server: Apache-fpm
Database: MySQL
PHP version: 7.3
Nextcloud Version: 16.0.3
working in a docker-container setup
List of activated apps:
Enabled:
Disabled:
server config:
Server log (data/nextcloud.log)