[Wordpress] oom-killer killing mysql #927

Closed
marianopeck opened this issue May 21, 2023 · 58 comments
Labels: php-fpm, solved, stale, tech-issues

Comments

@marianopeck

marianopeck commented May 21, 2023

Platform

AWS

bndiagnostic ID

307e09bd-5c59-eba7-92e5-403e0088aba8

bndiagnostic output

===== Begin of bndiagnostic tool output =====

✓ Resources: No issues found
? Connectivity: Found possible issues
✓ Mariadb: No issues found
✓ Processes: No issues found
✓ Wordpress: No issues found
✓ Apache: No issues found
✓ Php: No issues found

[Connectivity]

Server ports 22, 80 and/or 443 are not publicly accessible. Please check the
following guide to open server ports for remote access:

https://docs.bitnami.com/general/faq/administration/use-firewall/

===== End of bndiagnostic tool output =====

bndiagnostic was not useful. Could you please tell us why?

I reviewed everything found by the tool but nothing was useful

Describe your issue as much as you can

Every few days or weeks I find that my WordPress stack has stopped responding. After some investigation, I found the following in /var/log/messages.1:

May 19 12:19:38 ip-172-26-14-230 kernel: [230460.042721] systemd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.055206] CPU: 0 PID: 1 Comm: systemd Not tainted 5.10.0-21-cloud-amd64 #1 Debian 5.10.162-1
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.068379] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.077932] Call Trace:
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.083212]  dump_stack+0x6b/0x83
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.088544]  dump_header+0x4a/0x1f4
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.094403]  oom_kill_process.cold+0xb/0x10
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.100925]  out_of_memory+0x1bd/0x4e0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.106637]  __alloc_pages_slowpath.constprop.0+0xbcc/0xc90
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.113732]  __alloc_pages_nodemask+0x2de/0x310
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.120773]  pagecache_get_page+0x175/0x390
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.127646]  filemap_fault+0x6a2/0x900
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.133434]  ? xas_load+0x5/0x80
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.139383]  ext4_filemap_fault+0x2d/0x50
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.145641]  __do_fault+0x37/0xa0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.150840]  handle_mm_fault+0x1254/0x1c60
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.157523]  do_user_addr_fault+0x1b8/0x400
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.164795]  exc_page_fault+0x78/0x160
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.171433]  ? asm_exc_page_fault+0x8/0x30
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.178015]  asm_exc_page_fault+0x1e/0x30
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.184761] RIP: 0033:0x7f211cf40c91
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.191628] Code: Unable to access opcode bytes at RIP 0x7f211cf40c67.
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.200394] RSP: 002b:00007ffe5040a5e8 EFLAGS: 00010283
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.208396] RAX: 0000000000000cc9 RBX: 0000000000000007 RCX: 000000000000000a
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.218050] RDX: 0000000000000000 RSI: 000055a0795058c9 RDI: 000055a0796794c0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.228891] RBP: 000055a07969fe70 R08: 000055a07969fe68 R09: 000055a0795058aa
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.238367] R10: 000055a0795058b2 R11: 0000000000000007 R12: 000055a0796794c0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.247697] R13: 000055a0795058c9 R14: 000055a07969fe70 R15: 0000000000000000
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.257461] Mem-Info:
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.261230] active_anon:2763 inactive_anon:942671 isolated_anon:0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.261230]  active_file:31 inactive_file:32 isolated_file:0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.261230]  unevictable:0 dirty:0 writeback:0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.261230]  slab_reclaimable:7014 slab_unreclaimable:10568
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.261230]  mapped:32199 shmem:32295 pagetables:8157 bounce:0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.261230]  free:20810 free_pcp:408 free_cma:0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.301145] Node 0 active_anon:11052kB inactive_anon:3770684kB active_file:124kB inactive_file:128kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:128796kB dirty:0kB writeback:0kB shmem:129180kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1708032kB writeback_tmp:0kB kernel_stack:17328kB all_unreclaimable? yes
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.336190] Node 0 DMA free:15392kB min:268kB low:332kB high:396kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.371868] lowmem_reserve[]: 0 3713 3894 3894 3894
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.379412] Node 0 DMA32 free:64748kB min:64172kB low:80212kB high:96252kB reserved_highatomic:0KB active_anon:10812kB inactive_anon:3638428kB active_file:424kB inactive_file:408kB unevictable:0kB writepending:0kB present:3915776kB managed:3823196kB mlocked:0kB pagetables:27284kB bounce:0kB free_pcp:1084kB local_pcp:680kB free_cma:0kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.414399] lowmem_reserve[]: 0 0 181 181 181
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.421540] Node 0 Normal free:3224kB min:3140kB low:3924kB high:4708kB reserved_highatomic:0KB active_anon:240kB inactive_anon:132256kB active_file:380kB inactive_file:0kB unevictable:0kB writepending:0kB present:262144kB managed:186064kB mlocked:0kB pagetables:5344kB bounce:0kB free_pcp:548kB local_pcp:212kB free_cma:0kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.460798] lowmem_reserve[]: 0 0 0 0 0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.466530] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M) = 15392kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.481565] Node 0 DMA32: 529*4kB (UME) 454*8kB (UME) 589*16kB (UME) 441*32kB (UME) 219*64kB (UME) 103*128kB (UME) 35*256kB (ME) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 65444kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.496388] Node 0 Normal: 32*4kB (UM) 174*8kB (U) 94*16kB (UE) 8*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3280kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.508386] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.517037] 32381 total pagecache pages
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.521301] 0 pages in swap cache
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.525081] Swap cache stats: add 0, delete 0, find 0/0
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.533517] Free swap  = 0kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.538902] Total swap = 0kB
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.544166] 1048477 pages RAM
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.549573] 0 pages HighMem/MovableOnly
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.555839] 42186 pages reserved
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.560752] Tasks state (memory values in pages):
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.567169] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.577843] [    373]     0   373     2046     1138    57344        0             0 haveged
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.589130] [    418]     0   418    24973      344    77824        0             0 dhclient
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.600957] [    473]     0   473    24973      342    77824        0             0 dhclient
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.612286] [    570]     0   570     1687       65    49152        0             0 cron
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.620136] [    571]   101   571     2023      196    57344        0          -900 dbus-daemon
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.630333] [    574]     0   574    55200      796    69632        0             0 rsyslogd
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.641203] [    583]     0   583     3448      226    61440        0             0 systemd-logind
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.652085] [    588]     0   588      931       44    45056        0             0 atd
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.664596] [    616]     0   616     7310     1788    90112        0             0 unattended-upgr
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.676430] [    622]   106   622     2714      168    61440        0             0 chronyd
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.686771] [    625]   106   625     2682      138    61440        0             0 chronyd
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.698148] [   2983]     0  2983   346191     3517   196608        0             0 gonit
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.709410] [   3170]     0  3170     3340      244    65536        0         -1000 sshd
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.719882] [   3719]     0  3719    76024     3409   225280        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.731187] [   3723]     1  3723   102510    36831   536576        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.739517] [   3724]     1  3724   118607    45585   647168        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.748160] [   3725]     1  3725   109589    42065   602112        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.756730] [   3726]     1  3726   109230    40791   593920        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.765556] [   3727]     1  3727   110672    44527   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.775473] [   3728]     1  3728   111163    44425   610304        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.785713] [   3729]     1  3729   112689    47291   618496        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.795646] [   3730]     1  3730    82502    32892   512000        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.803811] [   3731]     1  3731   119086    44516   643072        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.814052] [   3732]     1  3732   112951    46057   626688        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.825229] [   3733]     1  3733   110684    43585   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.834561] [   3734]     1  3734   114367    49431   638976        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.845709] [   3735]     1  3735   107660    41750   581632        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.856507] [   3736]     1  3736   120816    47549   663552        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.868227] [   3737]     1  3737   110728    44323   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.879837] [   3738]     1  3738   117249    43225   634880        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.889176] [   3739]     1  3739   118522    46142   638976        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.901737] [   3740]     1  3740   106170    39107   565248        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.915839] [   3741]     1  3741    92277    42398   589824        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.925592] [   3742]     1  3742    94221    44742   602112        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.937506] [   3743]     1  3743    92847    43832   598016        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.947614] [   3744]     1  3744   110751    43618   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.959467] [   3745]     1  3745   109118    41655   593920        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.971215] [   3746]     1  3746   102022    36012   536576        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.982252] [   3747]     1  3747   110614    43511   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230460.993378] [   3748]     1  3748   118992    43506   647168        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.003314] [   3749]     1  3749   133502    58782   757760        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.013139] [   3750]     1  3750   112719    47144   626688        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.023538] [   3751]     1  3751    87594    38102   548864        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.032568] [   3752]     1  3752   112727    44815   626688        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.042394] [   3753]     1  3753   110684    44134   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.051764] [   3754]     1  3754    89705    39360   569344        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.247066] [   3755]     1  3755   110326    35396   573440        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.256848] [   3756]     1  3756   112699    45786   622592        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.267711] [   3757]     1  3757   120610    45832   659456        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.282658] [   3758]     1  3758   111289    44811   610304        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.294580] [   3759]     1  3759   111225    44778   610304        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.304747] [   3760]     1  3760   112769    46380   622592        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.315478] [   3761]     1  3761   110644    44736   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.325267] [   3762]     1  3762    94812    45144   606208        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.335371] [   3774]  1001  3774   418321   108996  1060864        0             0 mysqld
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.348442] [   3812]     0  3812     3189      580    61440        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.357654] [   3815]     1  3815   515695     3608   667648        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.367137] [   3816]     1  3816   515825     4499   667648        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.378541] [   3818]     1  3818   515775     3597   667648        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.389904] [   3831]     1  3831   515683     3659   667648        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.401349] [   4339]     1  4339   515699     5981   667648        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.412038] [   4470]     1  4470   515787     3678   667648        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.423367] [   4471]     1  4471   515693     3640   667648        0             0 httpd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.433653] [   4797]     1  4797   112869    45572   626688        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.444849] [   4803]     1  4803   112694    45188   622592        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.456866] [  10390]     0 10390     4805      204    57344        0         -1000 systemd-udevd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.468746] [  12806]     0 12806    20268      297   172032        0          -250 systemd-journal
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.478868] [  24391]     1 24391    92576    40808   581632        0             0 php-fpm
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.488762] [  42045]     0 42045     2424      130    57344        0             0 cron
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.498996] [  42046]     0 42046      620       16    45056        0             0 sh
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.509281] [  42047]     0 42047     2452      138    57344        0             0 su
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.518837] [  42049]     1 42049     3764      240    73728        0             0 systemd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.529099] [  42050]     1 42050    41641      665    86016        0             0 (sd-pam)
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.540905] [  42056]     1 42056      620       17    45056        0             0 sh
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.550734] [  42057]     1 42057    83303    38363   516096        0             0 php
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.563334] [  42190]     0 42190     1806       71    49152        0             0 cron
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.572476] [  42191]     0 42191      325       11    36864        0             0 sshd
May 19 12:19:38 ip-172-26-14-230 kernel: [230461.584459] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/bitnami.service,task=mysqld,pid=3774,uid=1001
bitnami@ip-

As you can see, the OOM Killer is killing MySQL. Notice that I did not change anything from the default bitnami wordpress configuration.

What caught my attention was the number of php-fpm processes running at the time of the kill. So I switched the PHP-FPM configuration from memory-medium.conf to memory-small.conf. Beyond that, I don't have any other clue.
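To quantify that impression, the task table in the OOM report can be summed per process name. The following is a minimal Python sketch, not a Bitnami tool: `rss_by_name` and `TASK_ROW` are hypothetical names, and the 4 KiB page size is an assumption that holds on typical x86-64 kernels. The `rss` column is in pages, and summing it overcounts memory shared between processes (shared libraries, shared-memory caches), so the totals are an upper bound rather than an exact figure.

```python
import re
from collections import defaultdict

PAGE_KIB = 4  # assumption: 4 KiB pages (typical for x86-64 kernels)

# Matches one row of the "Tasks state" table:
# [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
# The kernel timestamp "[230460.731187]" does not match because of the dot.
TASK_ROW = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(\S.*)$"
)


def rss_by_name(log_lines):
    """Return {process name: (process count, summed RSS in KiB)}."""
    totals = defaultdict(lambda: [0, 0])
    for line in log_lines:
        match = TASK_ROW.search(line.rstrip())
        if match is None:
            continue  # header, Mem-Info, call trace, or any other non-task line
        name = match.group(9)
        totals[name][0] += 1
        totals[name][1] += int(match.group(5)) * PAGE_KIB  # group 5 is rss (pages)
    return {name: (count, kib) for name, (count, kib) in totals.items()}
```

Passing it the lines of /var/log/messages.1 gives the php-fpm process count and combined RSS next to the single mysqld entry. In the report above, mysqld's 108996 pages are about 426 MiB, the largest RSS of any single process in the table, which is consistent with the OOM killer picking it even though the php-fpm rows add up to far more.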

The output of free is:

bitnami@ip-172-26-14-230:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           3.8Gi       907Mi       1.8Gi        63Mi       1.2Gi       2.6Gi
Swap:             0B          0B          0B

Also, I have another server with exactly the same hardware (same reported memory, same WordPress site, etc.) but with an older Bitnami image, and there is no problem there. It only happens on this "newer" Bitnami image.

Notice the php-fpm processes in the "old" bitnami image that works perfectly:

[Image: Screenshot 2023-05-22 at 2 57 48 PM]

And now notice in the "new" bitnami image:

[Image: Screenshot 2023-05-22 at 2 58 31 PM]

Maybe this is related to this other case? or this? or this? There seem to be many cases reported.... see also this.

Any ideas?

Thanks in advance,

@gongomgra
Collaborator

Hi @marianopeck,

Thanks for using Bitnami. According to the bndiagnostic information you shared, your server hasn't been under heavy load (given your server size), nor has it run short of memory (judging by the free output you shared). I'm afraid I don't know what could have caused those OOM-killer actions.

-----------------------------------
Check number of lines of Apache access log
-----------------------------------
Running: wc -l access_log
In: /opt/bitnami/apache2/logs/

Output:

514 access_log



-----------------------------------
Check performance issues: Count number of requests for the 10 most active IP addresses in the last 100.000 requests
-----------------------------------
Running: tail -n 100000 access_log | awk '{print $1}' | sort | uniq -c | sort -nr | head -n 10 | awk '{print $1}'
In: /opt/bitnami/apache2/logs/

Output:

284
21
14
12
11
11
10
9
8
8
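The per-IP count above can also be reproduced without the shell pipeline. This is a minimal Python sketch, assuming the client address is the first whitespace-separated field of each access-log line (as in Apache's common and combined log formats); `top_clients` is a hypothetical helper name. Unlike the bndiagnostic command, it keeps the address next to each count.

```python
from collections import Counter


def top_clients(log_lines, limit=10):
    """Return the `limit` most frequent first fields as (client, count) pairs."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)
```

It could be run over the same file the diagnostic tool inspects, for example `top_clients(open("/opt/bitnami/apache2/logs/access_log"))`. With only 514 lines in the log, a handful of clients accounts for most requests, which supports the reading that traffic is not what exhausts memory here.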

I also see you were still using the httpd-medium.conf file (at least when the bndiagnostic tool ran). Can you update the symbolic link and then restart all services for the changes to take effect?

sudo /opt/bitnami/ctlscript.sh restart
-----------------------------------
Check which server type configuration has been applied
-----------------------------------
Running: ls -la bitnami/httpd.conf
In: /opt/bitnami/apache2/conf

Output:

lrwxrwxrwx 1 root root 24 May 16 20:20 bitnami/httpd.conf -> memory/httpd-medium.conf
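The thread does not spell out a command for updating the symbolic link. One possible way is sketched below in Python: create the new link under a temporary name and rename it over the old one, so that `httpd.conf` never goes missing in between. `repoint_symlink` is a hypothetical helper, and the target name `memory/httpd-small.conf` used in the note after the code is an assumption based on the `httpd-medium.conf` naming in the output above; check that the file exists on your image before switching.

```python
import os


def repoint_symlink(link_path, new_target):
    """Replace the symlink at link_path with one pointing to new_target.

    new_target is stored as given, so a relative target is resolved
    relative to the directory that contains the link.
    """
    link_dir = os.path.dirname(os.path.abspath(link_path))
    tmp_path = os.path.join(link_dir, "." + os.path.basename(link_path) + ".tmp")
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)  # clear a leftover from an interrupted run
    os.symlink(new_target, tmp_path)
    os.replace(tmp_path, link_path)  # rename over the old link in one step
```

On the server this would be called as root, for example `repoint_symlink("/opt/bitnami/apache2/conf/bitnami/httpd.conf", "memory/httpd-small.conf")`, followed by the `ctlscript.sh restart` command quoted above.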

@marianopeck
Author

Hi @gongomgra

Thanks for your feedback. I agree with you that this server didn't have much load. In fact, this is a "staging" or "dev" site where we do the development before making it public. So yeah, little load. And as you said, plenty of available memory.

Yes, when bndiagnostic ran (hence the information you have), everything was at the default settings for my server size, that is, memory-medium.conf and httpd-medium.conf. As I said, AFTER the crash and the bndiagnostic run, I replaced memory-medium.conf with memory-small.conf to see if it would help. However, since these crashes happen days or weeks apart, I need to wait to see whether that change made any difference.

Thanks for the tip about httpd-medium.conf. I have now also changed httpd-medium.conf to httpd-small.conf. Are there any other files like these two that I could change?

BTW, so this "dev" server is now running with the "small" version of those 2 files. Let's see how it behaves for the next couple of days.

My gut feeling is that the default "medium" settings are a bit too much for the server size, mostly for php-fpm. Did you see the difference in the number of php-fpm processes that I reported previously? In addition, I can provide even more info. This is the old Bitnami image where everything is working fine (the production site):

$ cat /opt/bitnami/php/etc/bitnami/common-medium.conf
;
; Bitnami PHP-FPM Configuration
; Copyright 2020 Bitnami.com All Rights Reserved
;
; Note: This file will be modified on server size changes
;
pm.max_children=25
pm.start_servers=4
pm.min_spare_servers=4
pm.max_spare_servers=10
pm.max_requests=5000

In the new Bitnami image, where I see these crashes:

cat /opt/bitnami/php/etc/memory/memory-medium.conf
; Bitnami memory configuration for PHP-FPM
;
; Note: This will be modified on server size changes

pm.max_children=60
pm.start_servers=40
pm.min_spare_servers=40
pm.max_spare_servers=45
pm.max_requests=5000

Doesn't that look like a huge difference???

Thanks in advance,

Mariano
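The gap between the two pool configurations quoted above can be put into rough numbers. The sketch below is illustrative Python under two loudly stated assumptions: the pool runs in `pm = dynamic` mode (the quoted files do not show the mode), where php-fpm keeps at least `pm.min_spare_servers` idle children alive, and each child costs a fixed amount of private memory that you supply yourself. `parse_pm_settings` and `pool_memory_range_mib` are hypothetical helper names.

```python
def parse_pm_settings(text):
    """Parse integer 'pm.<key>=<value>' lines; skip ';' comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if sep and key.startswith("pm.") and value.isdigit():
            settings[key] = int(value)
    return settings


def pool_memory_range_mib(settings, per_child_mib):
    """Return (floor, ceiling) in MiB for an ASSUMED per-child memory cost.

    floor: idle children kept alive even with no traffic (dynamic mode).
    ceiling: all pm.max_children running at once.
    """
    floor = settings["pm.min_spare_servers"] * per_child_mib
    ceiling = settings["pm.max_children"] * per_child_mib
    return floor, ceiling
```

With a purely hypothetical 80 MiB per child, the old file (4 spare, 25 max) gives a range of 320 to 2000 MiB, while the new one (40 spare, 60 max) gives 3200 to 4800 MiB. The real per-child figure depends on the site and its plugins, so measure it on your own instance before drawing conclusions.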

@gongomgra
Collaborator

Hi @marianopeck,

Thanks for your message and for sharing your findings. It is true that the values have been bumped a lot from one image to another. I will double-check the reasons behind this with our engineering team.

@marianopeck
Author

Hi @gongomgra

You are welcome.

Yes, please let me know what you find out with the engineering team. I also found there is a memory.conf for MariaDB, but that one seems to be the same/equivalent between the old bitnami image and the new one (same for Apache). The only one with a big difference is PHP FPM.

Thanks!

@gongomgra
Collaborator

Hi @marianopeck,

I hope that using smaller values for the PHP-FPM configuration solves your issue.

I've filed an internal task for the engineering team. I will mark the ticket as 'on hold' until we have an update. The 'on hold' status will prevent the stale bot from closing the ticket due to inactivity.

@gongomgra added the on-hold label and removed the triage label on May 31, 2023
@savionlee

savionlee commented Jun 5, 2023

I was also experiencing this challenge. I remapped to the small version and my used memory instantly was cut by more than half from 2.7 GiB to 1.0 GiB.

It is a wordpress multisite deployment on azure, defaulted to the medium size config file. It doesn't get traffic really, it is more an internal office tool. After a week out of office, the medium config seemed to balloon many useless instances of php-fpm for what amounts to our default cronjobs, eating up tons of memory of a 2vCPU and 4GB instance.

If we had been active in the office making site edits, it probably would have crashed, because free memory was below 350 MiB this morning when we came back. We were getting nightly OOM kills of MariaDB in the days leading up to our office closure; the only changes we made were site edits such as adding new pages.
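A shrinking memory margin like the one described above can be spotted before the OOM killer steps in. The following is a minimal Python sketch, assuming a Linux host where `/proc/meminfo` reports a `MemAvailable:` line in kB; `mem_available_mib` and `is_low` are hypothetical helper names, and the 350 MiB default only echoes the figure mentioned in this comment, not a recommended threshold.

```python
def mem_available_mib(meminfo_text):
    """Extract MemAvailable from /proc/meminfo-style text, in whole MiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) // 1024  # the field is reported in kB
    raise ValueError("MemAvailable line not found")


def is_low(meminfo_text, threshold_mib=350):
    """True when available memory is below the (illustrative) threshold."""
    return mem_available_mib(meminfo_text) < threshold_mib
```

A cron job could read the file with `open("/proc/meminfo").read()` and log or alert when `is_low` returns true, which would give some warning ahead of the nightly kills.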

@github-actions bot added the triage label and removed the on-hold label on Jun 5, 2023
@marianopeck
Author

Hi @savionlee
Thanks for your input. Feels good not to be alone ;)
Just in case it helps, I can also confirm that my server is now running well (at least for two weeks) after I changed to the small version of files.
Regards,

@gongomgra
Collaborator

Hi @marianopeck, @savionlee,

Thanks for your updates. Our engineering team hasn't had time to check this yet, but I'm sure this will be very helpful information for them.

@savionlee

savionlee commented Jun 12, 2023

After switching, php-fpm still seems to keep spawning extra processes.
image

here's the config that is active from httpd.conf -> ./memory/httpd-micro.conf:

# Bitnami memory configuration for Apache
#
# Note: This will be modified on server size changes

<IfModule mpm_prefork_module>
  StartServers    5
  MinSpareServers 5
  MaxSpareServers 10
  MaxRequestWorkers       5
  MaxConnectionsPerChild  5000
  KeepAliveTimeout 1
</IfModule>

<IfModule mpm_event_module>
  ServerLimit               4
  StartServers              2
  MinSpareThreads         128
  MaxSpareThreads         192
  ThreadsPerChild          64
  MaxRequestWorkers       256
  MaxConnectionsPerChild 5000
  KeepAliveTimeout          2
</IfModule>

<IfModule mod_passenger.c>
  PassengerMinInstances       1
  # PassengerMaxInstancesPerApp 1
  PassengerMaxPoolSize        3
</IfModule>

What settings do I change to time out these unused threads, or to avoid spinning up 42 of them?

When I restart php-fpm with ctlscript.sh, my memory usage goes from 2.7 GiB down to 1 GiB. I definitely don't need all this extra capacity and would rather give it to MariaDB.

@savionlee

Apologies, I was changing Apache's config file instead. I found the right one, and now I'm monitoring its resource consumption some more.

@gongomgra
Collaborator

gongomgra commented Nov 16, 2023

Updated WordPress Multisite images have been released as well.

@olof-dev

Thanks! Are the values in #927 (comment) also appropriate for a singlesite WordPress image?

@gongomgra
Collaborator

Hi @olof-dev,

Thanks for your message. Yes, the updated values should work fine for a regular WordPress server. Just remember to pick the values matching your instance size and restart the services for the changes to take effect.

@eiiot

eiiot commented Dec 2, 2023

I'm running into a similar issue with a functional memory leak on the latest version (or latest as of yesterday) of Bitnami & WordPress. We migrated because of php 8, but now I'm starting to regret it. No matter how many php-fpm processes our config allows, our wordpress site seems to crash spectacularly every few hours.

For reference, this is just over the last half an hour:
image

Any help with the status of this issue would be appreciated.

@itify

itify commented Dec 2, 2023

@eiiot Does your php-fpm configuration match the recommendations above?

@eiiot

eiiot commented Dec 2, 2023

Yes, I've tried multiple configurations with no luck. Ended up moving away from bitnami last night, but I still have the old server up if there's anything else I can try.

@savionlee

@eiiot have you tried switching out your apache memory configs?

@github-actions bot removed the triage label on Dec 11, 2023
@github-actions bot assigned jotamartos and unassigned gongomgra and mdhont on Dec 11, 2023
@jotamartos assigned gongomgra and unassigned jotamartos on Dec 11, 2023
@gongomgra
Collaborator

@eiiot please double-check that both memory settings (Apache and php-fpm) match the updated ones (or manually update them if required), and restart the services to ensure the new settings take effect. Could it also be related to plugins or a custom theme not working properly with PHP 8?

@olof-dev

If there are some updated apache settings, I'd love for those to be posted too so that I can make sure my machine won't go down again. (Unless it's trivial to launch a new machine with the new image and get all the existing custom configuration from the old machine moved over? I haven't seen anything about that.) [I'm running the single-site WP image.]

@gongomgra
Collaborator

Hi @olof-dev,

Sorry, my fault. There shouldn't be any updated settings for Apache, we only updated PHP-FPM configuration. Did you update PHP-FPM settings manually on your instance? Remember to restart services for changes to take effect.

@olof-dev

No worries, good to have that confirmed. I updated the values manually and rebooted the machine. I haven't had any crashes since! (But I'd want to give it another week or so to make sure.)


This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@github-actions bot added the stale label on Dec 28, 2023

github-actions bot commented Jan 2, 2024

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
