Very high CPU and swap space usage. #120

Closed
dessalines opened this issue Jun 5, 2020 · 8 comments

Comments

@dessalines
Contributor

dessalines commented Jun 5, 2020

I'm getting really high CPU usage, and swap space is being used up on my server all of a sudden by pictshare (I'm using the Docker image). pictshare has about 5000 pictures and has had no issues until now. Might be a memory leak?

7416  php-fpm        335072 kB
7415  php-fpm        334936 kB
7417  php-fpm        334376 kB
7428  php-fpm        334328 kB
7424  php-fpm        333968 kB
7426  php-fpm        231788 kB
7513  php-fpm        31904 kB
7422  php-fpm        12692 kB
7973  php-fpm        8356 kB
7396  php-fpm        3372 kB

I just checked and no new files are being added to the pictshare folder, so no uploads are taking place either.
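
For scale, the per-process figures in the listing can be added up. The sketch below is only arithmetic on the numbers pasted above; the comment does not say which metric the listing reports (swap, resident memory, or something else), so the total is a sum of that column and nothing more.

```python
# Per-process figures copied from the listing in the comment above.
LISTING = """\
7416  php-fpm        335072 kB
7415  php-fpm        334936 kB
7417  php-fpm        334376 kB
7428  php-fpm        334328 kB
7424  php-fpm        333968 kB
7426  php-fpm        231788 kB
7513  php-fpm        31904 kB
7422  php-fpm        12692 kB
7973  php-fpm        8356 kB
7396  php-fpm        3372 kB
"""


def total_kb(listing: str) -> int:
    """Sum the size column of 'PID NAME SIZE kB' lines, skipping anything else."""
    total = 0
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[3] == "kB":
            total += int(parts[2])
    return total


if __name__ == "__main__":
    kb = total_kb(LISTING)
    print(f"{kb} kB = {kb / 1024 / 1024:.2f} GiB")  # 1960792 kB = 1.87 GiB
```

Six of the ten workers account for almost all of that total.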

@Nutomic

Nutomic commented Jun 5, 2020

Some interesting log lines (I can send you the full log by email).

WARNING: [pool www] child 1456, script '/usr/share/nginx/html/index.php' (request: "GET /index.php?url=/webp/192/zlr3vh.jpg") execution timed out (618.597039 sec), terminating
[error] 261#261: *161002 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 172.19.0.1, server: _, request: "GET /webp/96/ed9ej7.jpg HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "dev.lemmy.ml", referrer: "https://dev.lemmy.ml/"
WARNING: [pool www] server reached pm.max_children setting (9), consider raising it

Edit: log sent to the contact email on https://haschek.solutions/
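
To see which requests are hitting the execution timeout, the php-fpm warnings can be pulled out of a full log with a small script. This is a rough sketch, not something from the thread: the regular expression is written against the warning format quoted in this issue, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches php-fpm's "execution timed out" warning as quoted in this thread
# and captures the requested URL and the elapsed seconds.
TIMEOUT_RE = re.compile(
    r'\(request: "GET (?P<url>[^"\s]+)"\) '
    r"execution timed out \((?P<secs>[\d.]+) sec\)"
)


def timed_out_requests(log_text: str):
    """Return a list of (url, seconds) for every timeout warning found."""
    return [(m["url"], float(m["secs"])) for m in TIMEOUT_RE.finditer(log_text)]


# Three warning lines copied from this issue.
SAMPLE = """\
WARNING: [pool www] child 1456, script '/usr/share/nginx/html/index.php' (request: "GET /index.php?url=/webp/192/zlr3vh.jpg") execution timed out (618.597039 sec), terminating
WARNING: [pool www] child 276, script '/usr/share/nginx/html/index.php' (request: "GET /index.php?url=/webp/192/zlr3vh.jpg") execution timed out (713.932975 sec), terminating
WARNING: [pool www] child 275, script '/usr/share/nginx/html/index.php' (request: "GET /index.php?url=/192/zlr3vh.jpg") execution timed out (713.929932 sec), terminating
"""

if __name__ == "__main__":
    hits = timed_out_requests(SAMPLE)
    for url, count in Counter(url for url, _ in hits).most_common():
        print(count, url)
```

In the lines quoted here, every timeout involves the same image, `zlr3vh.jpg`.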

@dessalines
Contributor Author

Here's a Server Fault answer with a possible fix: https://serverfault.com/questions/479443/php5-fpm-server-reached-pm-max-children

But I don't know if that's the issue, since it seems like the webp conversion might be leaking memory or something.
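
A common rule of thumb for `pm.max_children` is to divide the memory you are willing to give php-fpm by the size of one worker. The sketch below applies that rule to numbers from this thread. It is an assumption-laden illustration: the rule of thumb is not taken from the thread, the 1 GiB budget is hypothetical, and the per-worker figure is simply the largest entry in the process listing, whose exact metric is not stated.

```python
def max_children_for(budget_kb: int, per_child_kb: int) -> int:
    """How many workers of a given size fit into a memory budget."""
    if budget_kb < 0 or per_child_kb <= 0:
        raise ValueError("budget must be >= 0 and per-child size must be > 0")
    return budget_kb // per_child_kb


def worst_case_kb(max_children: int, per_child_kb: int) -> int:
    """Memory needed if every worker grows to the given size at once."""
    return max_children * per_child_kb


if __name__ == "__main__":
    per_child_kb = 335072        # largest php-fpm entry in the listing above
    current_limit = 9            # pm.max_children value reported in the log
    budget_kb = 1024 * 1024      # hypothetical 1 GiB budget for php-fpm

    worst = worst_case_kb(current_limit, per_child_kb)
    print(f"{current_limit} workers x {per_child_kb} kB = {worst / 1024 / 1024:.2f} GiB")
    print("workers that fit in the budget:", max_children_for(budget_kb, per_child_kb))
```

If workers really do reach that size, raising the limit increases the worst case rather than relieving it, which fits the doubt expressed above.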

@dessalines
Contributor Author

Some error messages:

pictshare_1  | 172.19.0.1 - - [05/Jun/2020:14:16:36 +0000] "GET /webp/192/zlr3vh.jpg HTTP/1.0" 504 569 "https://dev.lemmy.ml/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 OPR/68.0.3618.125"
pictshare_1  | [05-Jun-2020 14:26:01] WARNING: [pool www] child 277 said into stderr: "NOTICE: PHP message: PHP Fatal error:  gd-webp encoding failed in /usr/share/nginx/html/content-controllers/image/image.controller.php on line 256"
pictshare_1  | [05-Jun-2020 14:27:26] WARNING: [pool www] child 276, script '/usr/share/nginx/html/index.php' (request: "GET /index.php?url=/webp/192/zlr3vh.jpg") execution timed out (713.932975 sec), terminating
pictshare_1  | [05-Jun-2020 14:27:27] WARNING: [pool www] child 275, script '/usr/share/nginx/html/index.php' (request: "GET /index.php?url=/192/zlr3vh.jpg") execution timed out (713.929932 sec), terminating
pictshare_1  | 2020/06/05 14:31:06 [error] 261#261: *241 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 172.19.0.1, server: _, request: "GET /webp/192/zlr3vh.jpg HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "dev.lemmy.ml", referrer: "https://dev.lemmy.ml/"

LOTS of these ones:

pictshare_1  | 2020/06/05 12:32:08 [error] 261#261: *155304 FastCGI sent in stderr: "PHP message: PHP Warning:  imagecreatefromstring(): Empty string or invalid image in /usr/share/nginx/html/content-controllers/image/image.controller.php on line 225
PHP message: PHP Warning:  imagesx() expects parameter 1 to be resource, bool given in /usr/share/nginx/html/content-controllers/image/resize.php on line 82
PHP message: PHP Warning:  imagesy() expects parameter 1 to be resource, bool given in /usr/share/nginx/html/content-controllers/image/resize.php on line 83
PHP message: PHP Warning:  Division by zero in /usr/share/nginx/html/content-controllers/image/resize.php on line 99
PHP message: PHP Warning:  imagecreatetruecolor() expects parameter 2 to be int, float given in /usr/share/nginx/html/content-controllers/image/resize.php on line 104
PHP message: PHP Warning:  imagecolorstotal() expects parameter 1 to be resource, bool given in /usr/share/nginx/html/content-controllers/image/resize.php on line 106
PHP message: PHP Warning:  imagefill() expects parameter 1 to be resource, null given in /usr/share/nginx/html/content-controllers/image/resize.php on line 113
PHP message: PHP Warning:  imagesavealpha() expects parameter 1 to be resource, null given in /usr/share/nginx/html/content-controllers/image/resize.php on line 114
PHP message: PHP Warning:  imagealphablending() expects parameter 1 to be resource, null given in /usr/share/nginx/html/content-controllers/image/resize.php on line 115
PHP message: PHP Warning:  imagecopyresampled() expects parameter 1 to be resource, null given in /usr/share/nginx/html/content-controllers/image/resize.php on line 117
PHP message: PHP Warning:  imagewebp() expects parameter 1 to be resource, null given in /usr/share/nginx/html/content-controllers/image/image.controller.php on line 256
PHP message: PHP Warning:  filemtime(): stat failed for /usr/share/nginx/html/data/y595n6.jpg/4a26594eabeeb128e09d7cf85cdbfb6b_y595n6.jpg in /usr/share/nginx/html/content-controllers/image/image.controller.php on line 216
PHP
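
The warnings above read like a cascade: the image data is empty or invalid, the decode fails, and every later call receives `false` or `null` instead of an image. A guard before decoding would stop the chain at the first step. pictshare itself is PHP, so the Python below is only a language-neutral sketch of that idea; the function names are hypothetical and this is not how pictshare is implemented.

```python
import os


def sniff_image_type(data: bytes):
    """Return 'jpeg', 'png', 'gif' or 'webp' from the leading bytes, else None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None  # empty string or unrecognised data


def read_image_or_none(path: str):
    """Return the file's bytes only if it exists and looks like an image."""
    if not os.path.isfile(path):
        return None  # corresponds to the "stat failed" case in the log
    with open(path, "rb") as fh:
        data = fh.read()
    return data if sniff_image_type(data) else None


if __name__ == "__main__":
    data = read_image_or_none("/nonexistent/example.jpg")  # hypothetical path
    if data is None:
        print("skip resize: missing or invalid image")
```

A caller that gets `None` back can return an error straight away instead of running the resize and WebP steps on nothing.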

@geek-at
Member

geek-at commented Jun 5, 2020

Hmm, yes, this looks like a problem with the JPG-to-WebP conversion. pictshare.net is running the same Docker image with 6M hits a week and 30k images, and I have no such issues.

Your "Operation timed out" error seems to confirm that something is running that uses a large amount of resources. How much traffic are you seeing?

@Nutomic

Nutomic commented Jun 5, 2020

Not sure about the traffic, especially because pictshare is down now and we don't have any proper monitoring. My guess is less than 10 requests/s on average (with spikes when a user first opens the page).

@dessalines
Contributor Author

One thing: I removed a large picture (~20 MB) and it stabilized A LOT (down from 2 GB of swap to 20 MB). But there's still a memory leak somewhere, because the swap usage is still rising.
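
To check for other oversized uploads, the data folder can be scanned for files above a size threshold. A minimal sketch, assuming the images live under `/usr/share/nginx/html/data` (the path that appears in the log lines in this thread); the 10 MB threshold and the function name are arbitrary choices for illustration.

```python
import os


def files_larger_than(root: str, min_bytes: int):
    """Return (size, path) pairs for files of at least min_bytes, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    data_dir = "/usr/share/nginx/html/data"  # assumed location, taken from the logs
    for size, path in files_larger_than(data_dir, 10 * 1024 * 1024):
        print(f"{size / 1024 / 1024:8.1f} MB  {path}")
```

File size on disk is only a rough proxy: a heavily compressed JPEG can be small on disk and still have very large pixel dimensions.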

@Nutomic

Nutomic commented Jun 5, 2020

This image might be the one that broke it: https://commons.wikimedia.org/wiki/File:Pizigani_1367_Chart_10MB.jpg

@geek-at
Member

geek-at commented Nov 24, 2021

The reason seems to be that, behind the scenes, the php-gd library maps images pixel by pixel (basically a BMP in RAM), which blows up on larger images. It takes over a gig of RAM to convert the 10 MB JPEG you linked. That will remain the case until we have a better conversion rate.

I also tested cwebp on a server: it took about two minutes and used half a gigabyte of RAM to convert the same photo to WebP. It seems the only way to make it work is more RAM and higher timeouts.
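
The pixel-by-pixel point can be put into numbers: an uncompressed bitmap needs roughly width × height × bytes per pixel, no matter how small the JPEG is on disk. The sketch below assumes 4 bytes per pixel for a truecolor image, and the dimensions are made up for illustration; they are not the measured size of the image linked in this thread.

```python
def decoded_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Approximate size of an uncompressed bitmap held in memory."""
    if width <= 0 or height <= 0 or bytes_per_pixel <= 0:
        raise ValueError("dimensions and bytes_per_pixel must be positive")
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    # Hypothetical dimensions for a very large scan (an assumption, not a measurement).
    width, height = 20000, 15000
    size = decoded_bytes(width, height)
    print(f"{width}x{height} -> {size / 1024**3:.2f} GiB")  # 1.12 GiB
```

A resize or format conversion may hold a source and a destination bitmap at the same time, so peak usage can be higher than this single-bitmap estimate.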

@geek-at geek-at closed this as completed Nov 24, 2021