
Extremely slow on Ubuntu 20.04 Server #425

Closed
Davincible opened this issue Feb 21, 2021 · 25 comments

@Davincible

I am running the edge version of the Docker images; I tried it both locally (Manjaro) and on my Ubuntu server. Locally it works fine; on the server I need to wait a minute for all the requests to complete. Resources are not the issue, it's a quad-core with 8 GB RAM. I tried it both with a MariaDB container and with a MariaDB instance on the server itself. Running it directly on the server, installed with bench, also works fine.

I've disabled Traefik and exposed the port of the nginx container directly.
I can't seem to figure out what's causing it to slow down, as logs are extremely limited.

Any ideas as to what's going on?

Davincible added the bug label Feb 21, 2021
revant added question and removed bug labels Feb 22, 2021
@revant
Collaborator

revant commented Feb 22, 2021

Is it slow in docker only?

Site creation is slow; on the develop branch, it creates 950+ tables in mariadb when creating a new site.

@Davincible
Author

Davincible commented Feb 22, 2021

Yes, only in docker. I have also installed it with the bench command, and those web requests return instantaneously.

I don't mean during site creation, but during site operation. I see the requests coming in in the docker logs, but somehow it takes a long time to process all requests.

As the same simple setup does work fast on Manjaro, I am inclined to speculate that some configuration is affecting the Ubuntu server setup, but I am not sure where to look.

I have tried a few different containers on the server, and it's the same result every time.

@revant
Collaborator

revant commented Feb 24, 2021

I'm using Ubuntu 20.04 for k8s nodes. That's what my cloud provider provides by default.

@revant
Collaborator

revant commented Feb 25, 2021

I think we need more details to fix the config.

I'm closing the issue.

Re-open if needed.

revant closed this as completed Feb 25, 2021
@revant
Collaborator

revant commented Feb 25, 2021

❯ kubectl get node -o wide -w
NAME                            STATUS   ROLES    AGE   VERSION   INTERNAL-IP    EXTERNAL-IP     OS-IMAGE                        KERNEL-VERSION     CONTAINER-RUNTIME
project-44cfec5f844943c78f34c   Ready    <none>   87d   v1.20.4   XX.XX.XXX.XX   XX.XX.XXX.XXX   Ubuntu 20.04.1 LTS 2da9bb3059   5.4.0-53-generic   docker://19.3.13
project-95d303e2489c4af2b5c14   Ready    <none>   87d   v1.20.4   XX.XX.XX.XXX   XX.XXX.XXX.XX   Ubuntu 20.04.1 LTS 2da9bb3059   5.4.0-53-generic   docker://19.3.13
project-d58c47a5a3e7449b9c299   Ready    <none>   87d   v1.20.4   XX.XX.XXX.XX   XX.XX.XXX.XXX   Ubuntu 20.04.1 LTS 2da9bb3059   5.4.0-53-generic   docker://19.3.13

@Davincible
Author

What details do you want? I wouldn't consider it closed. This is the only docker container I get slow responses with; I've tried Odoo and Wazuh as well.

@revant
Collaborator

revant commented Feb 25, 2021

I'll keep it open. If someone else from the community finds a fix, we'll close it. At least I'm not fixing it for now.

I really can't help right now; my Ubuntu 20.04 LTS servers are not causing any problems. If I face any, I'll have to fix them, since I depend on this with a lot of data already running in production.

Also, the edge version introduces many things daily. Keep trying daily to see if there are any improvements, or keep track of the Travis CI cron job.

I'm not using the edge version in production.

I'm using up-to-date v12 and v13-beta.

revant reopened this Feb 25, 2021
@sunhoww
Contributor

sunhoww commented Mar 9, 2021

One possible reason could be the mariadb config. The bench install sets innodb-buffer-pool-size dynamically based on the host memory. See -
https://github.com/frappe/bench/blob/f3809b00acc4bfa586e9a12116fb6bd262d3226e/bench/playbooks/roles/mariadb/files/mariadb_config.cnf#L46

However, the config file here does not. So, I had to manually set the config value after going through this -
https://dba.stackexchange.com/questions/27328/how-large-should-be-mysql-innodb-buffer-pool-size
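
For reference, roughly what I ended up setting by hand; the 1G figure and the mount path are just examples from my setup, not something frappe_docker ships:

[mysqld]
# pick a value for your host; ~50-70% of the RAM you can dedicate to MariaDB
innodb-buffer-pool-size = 1G

One way to apply it is to mount a file like this into the mariadb container under /etc/mysql/conf.d/ (the official mariadb image picks up *.cnf files from there), or to set it however your database is actually deployed.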

Another thing could be the nginx config, specifically client_body_buffer_size. I was getting a lot of warnings like this -

1963/11/22 12:30:00 [warn] 150#150: *211054 a client request body is buffered to a temporary file /var/cache/nginx/client_temp/0000001631, client: 0.0.0.0, server: example.com, request: "POST /api/method/frappe.desk.form.save.savedocs HTTP/2.0", host: "example.com", referrer: "https://example.com/desk"

Possibly because requests forwarded from the proxy to the containers are already decompressed. I'm not 100% sure about this, though. I was using docker-compose-letsencrypt-nginx-proxy-companion at the time and did change the mentioned config value, among others.
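
For what it's worth, the kind of override I used on the proxy (the 64k value is arbitrary; tune it to your typical POST body size):

# http/server block of the proxy's nginx config: raise the in-memory request body
# buffer so small POST bodies are not spilled to /var/cache/nginx/client_temp
client_body_buffer_size 64k;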

Lastly, I noticed that the worker services seem to run wild sometimes; maybe you need some resource limits on these containers, although I haven't gotten around to doing this myself.
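
Something like this might be a starting point, though it's untested on my side; the service name and the limits are illustrative, and deploy.resources is honored in swarm mode or by docker-compose when run with --compatibility:

erpnext-worker-default:
  ...
  deploy:
    resources:
      limits:
        cpus: "1.0"
        memory: 1G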

@Davincible
Author

@sunhoww Hmm, those sound like things that could cause the issue. I'll have a look at them, thanks.

@Davincible
Author

@sunhoww I did some digging and don't think it's the database limit. In hindsight, I'm using a local DB on my host instead of a container, and frappe set the limit to 5G; I manually set it to 1G, as that was the value reported by the command in the linked StackExchange answer.

However, while looking at the request logs and the devtools network inspector, I noticed that the issue is a request to /website_script.js?ver=1615007321.0.
In my host nginx error log it shows up after a minute (timeout), with the following entries:

2021/03/18 22:37:41 [warn] 4093098#4093098: *53592 upstream server temporarily disabled while connecting to upstream, client: <ip>, server: example.com, request: "GET /website_script.js?ver=1615007321.0 HTTP/1.1", upstream: "http://[::1]:8060/website_script.js?ver=1615007321.0", host: "example.com", referrer: "example.com"
2021/03/18 22:37:41 [error] 4093098#4093098: *53592 upstream timed out (110: Connection timed out) while connecting to upstream, client: <ip>, server: example.com, request: "GET /website_script.js?ver=1615007321.0 HTTP/1.1", upstream: "http://[::1]:8060/website_script.js?ver=1615007321.0", host: "example.com", referrer: "example.com"

Do you have any idea what might be causing this?

@Davincible
Author

@revant How can I check the server logs in the python container? The default logs are very limited. I want to figure out why /website_script.js is timing out.

@revant
Collaborator

revant commented Mar 20, 2021

Recently "logs" volume was contributed

#422 check if it helps.

For old images logs are in container, exec into the container and grep frappe-bench/logs directory for errors.
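
Something like this; the container name depends on your compose project, and the bench path inside the image is my assumption, adjust if it lives elsewhere:

docker ps --filter name=erpnext-python      # find the exact container name
docker exec -it <project>_erpnext-python_1 \
  grep -riE "error|timeout" /home/frappe/frappe-bench/logs/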

@sunhoww
Contributor

sunhoww commented Mar 20, 2021

Sorry for the late reply. If it is only happening with /website_script.js and other dynamically generated content and NOT with the static assets, you can maybe disable/stop the other worker containers (-default, -short, -long, -schedule) and check if the timeout still persists.
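
Roughly (the service names follow the -default/-short/-long/-schedule suffixes from the single-bench compose file; adjust them to whatever your compose file actually uses):

docker-compose stop erpnext-worker-default erpnext-worker-short \
                    erpnext-worker-long erpnext-schedule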

@Davincible
Author

Hmm, when I turn off the workers none of the dynamic links get resolved; when they're on they all do except that single request. What I also noticed is that website_script.js is requested twice: the first request comes through without a problem, and then the second one times out a minute later.

@revant Yeah, I noticed. Unfortunately, those logs only record high-level processes and are not very telling in this case.

@sunhoww
Contributor

sunhoww commented Mar 20, 2021

Interesting. I have no issue serving any resources, dynamic or otherwise, when the workers are disabled (running just the -nginx, -python and -socketio containers).

  1. Are you running the frappe build of the images or have you rebuilt some of the images?
  2. Are you using docker-compose or running in swarm mode?

@Davincible
Author

@sunhoww

  1. Not sure what build I'm running, just regular old docker-compose without the traefik; tried the latest version and version-13-beta.
  2. WARNING: The Docker Engine you're using is running in swarm mode. -- I suppose I am; not sure what the implications of this are.
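
For what it's worth, something like this checks and disables swarm mode; exact commands may differ on your setup, and --force is only needed on a manager node:

docker info --format '{{.Swarm.LocalNodeState}}'   # prints "active" when swarm mode is on
docker swarm leave --force                          # take this node out of swarm mode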

@Davincible
Author

Davincible commented Mar 20, 2021

I've disabled swarm and tried with and without the extra services. Now the other dynamic content does get resolved, where previously it only did if the extra services were enabled, but it doesn't make a difference for the timed-out request, as shown in the screenshot.

[screenshot: devtools network inspector showing the timed-out /website_script.js request]

My docker-compose is here: https://gist.github.com/Davincible/d4b9f02bd5d9f60352780ffe5d88ae4c

@sunhoww
Contributor

sunhoww commented Mar 20, 2021

Weird how everything is being requested twice. I just noticed this on my production deploys as well. It's happening just with Firefox though, and not with Chrome. Maybe something to do with the application code, so probably not related to the issue at hand.

Did these...

I take it you're following the guide here - https://github.com/frappe/frappe_docker/blob/develop/docs/single-bench.md

I haven't used this, so I tried it on a 1 vCPU / 4 GB RAM VM running Debian 10. Again no issues, apart from the duped requests. I tried with both the version-12 and edge tags, even with YOUR compose file from the gist. Of course, I had to re-enable the mariadb container and make requests over port 80. Also, I did not proceed to the setup wizard stages.

One more thing you could check is the disk space. Some applications take a performance hit when available space becomes limited. While you're at it, you could look into the disk IOPS and throughput.
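
Quick ways to look (iostat comes from the sysstat package; the docker data path shown is the default one):

df -h /var/lib/docker     # free space where docker keeps images/volumes by default
iostat -dxm 5             # watch %util and await for the disk backing that path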

@revant
Collaborator

revant commented Mar 21, 2021

Can you also check the Ubuntu kernel and the Manjaro kernel, and see if something is related to that?

This is one of my DigitalOcean VPSes; it's on Ubuntu 20.04 and running in production with a few sites.
I'm using swarm mode with Portainer.

Distributor ID: Ubuntu
Description:    Ubuntu 20.04.1 LTS
Release:        20.04
Codename:       focal
Kernel:         Linux docker 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

I faced some problems with the KVM kernel; I didn't try to figure out why, I just changed to the generic kernel and everything worked as expected.

@Davincible
Author

@revant Pretty much the same here; I am running the docker containers on an Ubuntu VPS too, as I want to use them for production:

Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:        20.04
Codename:       focal
Kernel:         Linux Naboo 5.4.0-67-generic #75-Ubuntu SMP Fri Feb 19 18:03:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Docker:         Docker version 19.03.8, build afacb8b7f0
Docker-Compose: Docker-compose version 1.25.0, build unknown

@Davincible
Author

@sunhoww Disk usage is 50 GB/150 GB, and IOPS shouldn't be an issue either, as I wasn't running much else during testing. It's a quad-core VPS with 8 GB RAM that has handled a bunch of other containers too, so I'd assume hardware is not the issue.

I'm really curious to see the direct log output from the python server in the containers and watch the requests there as they come in and get processed, as I think that might give some valuable insight. Unfortunately, I haven't been able to figure out how to get any of these logs besides the main generic logs in the logs folder of the domain.

@Davincible
Author

@sunhoww My disk usage is 50 GB/150 GB, and the VPS has a quad-core CPU with 8 GB RAM, so I don't think hardware is the issue. I've run a bunch of other stuff without any issues.

What would be interesting would be to see the python server logs inside the python container, and to see the requests as they come in and are being processed; that could provide some information as to what is going wrong, but I haven't been able to find a way to get at such logs.

The other possibility I could think of is that something is going wrong in the proxy forwarding from the host to the container. My host nginx config is here.

@sunhoww
Contributor

sunhoww commented Mar 21, 2021

What would be interesting would be to see the python server logs inside the python container, and to see the requests as they come in and are being processed; that could provide some information as to what is going wrong, but I haven't been able to find a way to get at such logs.

I think you might need to change the gunicorn log level here -

gunicorn -b 0.0.0.0:$FRAPPE_PORT \
--worker-tmp-dir /dev/shm \
--threads=4 \
--workers $WORKERS \
--worker-class=gthread \
--log-file=- \
-t 120 frappe.app:application --preload
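
For example, the same command with verbose logging switched on; --log-level and --access-logfile are standard gunicorn flags, but how you get them into the container (overriding the command, patching the entrypoint, rebuilding) is up to you:

gunicorn -b 0.0.0.0:$FRAPPE_PORT \
--worker-tmp-dir /dev/shm \
--threads=4 \
--workers $WORKERS \
--worker-class=gthread \
--log-file=- \
--log-level=debug \
--access-logfile=- \
-t 120 frappe.app:application --preload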

Then you could just do docker logs <container-name>

The other possibility I could think of is that something is going wrong in the proxy forwarding from the host to the container.

Any particular reason why you need nginx and certbot? The traefik service included in the single-bench guide should be enough to proxy and manage certs. Maybe you can disable those host services and verify whether the proxy forwarding is indeed the issue.

@revant
Collaborator

revant commented May 6, 2021

The WORKER_CLASS environment variable for the erpnext-python container defaults to gthread.

Try setting it to sync:

erpnext-python:
  ...
  environment:
    ...
    - WORKER_CLASS=sync

I found the best performance with the gevent worker class.
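
For example (same pattern as above; this assumes gevent is available in the image, which I haven't verified for every tag):

erpnext-python:
  ...
  environment:
    ...
    - WORKER_CLASS=gevent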

@github-actions
Contributor

github-actions bot commented Aug 1, 2021

This issue has been automatically marked as stale. You have a week to explain why you believe this is an error.
