Skip to content
This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

High TTFB with PageSpeed #2089

Open
AnonDev1312 opened this issue Apr 11, 2022 · 15 comments
Open

High TTFB with PageSpeed #2089

AnonDev1312 opened this issue Apr 11, 2022 · 15 comments

Comments

@AnonDev1312
Copy link

Hi,

TTFB is much higher when PageSpeed is enabled and I don't understand why

264ms TTFB with PageSpeed=off
https://speedvitals.com/report/www.keinegotterkeinemeister.de/eEuqEG/

1100ms TTFB with PageSpeed ON
https://speedvitals.com/report/www.keinegotterkeinemeister.de/tg1zwa/

Pagespeed config:
https://pastebin.com/fLqVcP1M

@jmarantz
Copy link
Contributor

Thanks for the data including the config; it's very helpful.

It does look like the performance is pretty good even from the US, despite the TTFB. I assume the server is in Germany with no US CDN.

Of course TTFB will never get faster with PageSpeed, only slower. The question is whether the optimizations it makes during that time provide a better user experience.

I visited view-source:https://www.keinegotterkeinemeister.de/tshirt-sabotage-black-cat-D01004174288P0210/?PageSpeedFilters=+debug to see what timing info PageSpeed injected at the end, and found this:

#NumFlushes            0
#EndDocument after     5906us
#Total Parse duration  5685us
#Total Render duration 411563us
#Total Idle duration   221us
No critical images detected.
The following filters were disabled for this request:
CriticalSelectorFilter: No critical selector info in cache
DelayImages

So it looks like your server spent 5.7 seconds in PageSpeed parsing your HTML -- that's mostly CPU bound. Then it spent 452ms rendering it -- that may be a combination of image optimization, network operations, waiting for asynchronous optimizations and URL fetches (depending on the cache setup) That leaves 1100-270= 378ms unaccounted for.

I'm guessing that most of this comes from the file cache. It's possible your disk is I/O bound. My main suggestion is to try deploying memcached or redis as an asynchronous cache. That should be faster than the file-cache.

@AnonDev1312
Copy link
Author

AnonDev1312 commented Apr 12, 2022

Hi, thanks for the reply

The server is a dedicated server in OVH Datacenter in Beauharnois, Quebec, Canada
Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz (16 core(s)) - 2499 MHz
64 GB Ram
NVME SSD
1GBPS network

CPU load is always under 25% https://prnt.sc/Xte_RHbEGSgR

We also integrated a benchmark into the website to calculate the page load speed of the different PHP modules and calculate SQL queries execution time. You can see it on the bottom of the page when adding ?benchmark=1 to the URL
https://www.keinegotterkeinemeister.de/tshirt-sabotage-black-cat-D01004174288P0210/?benchmark=1

Currently, it takes 0.2110 seconds to generate the page with PHP on the first page load. But out of this 0.21 seconds, 0.1865 is due to calling an external geolocation API. This is just because Cloudflare is currently disabled for our testing phase. When Cloudflare is enabled, we get the geolocation from $_SERVER["HTTP_CF_IPCOUNTRY"] so we avoid calling the external API and page load time goes down to around 0.01s

TTFB is not the only problem, we also have high LCP and it throws warnings in Google Search Console: https://prnt.sc/2PjB74ekUgdR
Our search ranking dropped a lot and we lost 50% of our search traffic since June 2021 and we suspect it could be partly due to that

How can I confirm whether the disk is I/O bound or not? I followed Plesk guide and here are the results:
https://support.plesk.com/hc/en-us/articles/360006437794-Plesk-for-Linux-server-is-running-slow-too-much-IO-interactivity

5000+0 records in
5000+0 records out
2560000 bytes (2.6 MB, 2.4 MiB) copied, 13.634 s, 188 kB/s
[root@ns508279 ~]# dd if=/dev/zero of=/root/testfile bs=512 count=5000 oflag=direct
5000+0 records in
5000+0 records out
2560000 bytes (2.6 MB, 2.4 MiB) copied, 14.1703 s, 181 kB/s
[root@ns508279 ~]# dd if=/dev/zero of=/root/testfile bs=512 count=5000 oflag=direct
5000+0 records in
5000+0 records out
2560000 bytes (2.6 MB, 2.4 MiB) copied, 14.0026 s, 183 kB/s

This seems pretty slow. Could it be because we run pagespeed from too many domains?

Latency seems fine:

[root@ns508279 ~]# dd if=/dev/zero of=/root/testfile bs=1024M count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.05548 s, 522 MB/s

however iotop looks perfectly fine. Monitored it for around 10 minutes and IO never went above 1%
https://prnt.sc/F65mP1MkTeBC

also no high %WA with top:
https://prnt.sc/Oe1hhgcqrNDC

@jmarantz
Copy link
Contributor

I was suspicious there may be spinning-disk latency but as you have an SSD I'm not sure. Nevertheless it might be worth trying redis/memcached, as we never really tuned the file-cache for SSD.

@AnonDev1312
Copy link
Author

AnonDev1312 commented Apr 12, 2022

There could be an IO issue because according to the Plesk guide, the first test I ran should get a speed of 142 MB/s but I got only 183 kB/s

Memcached is now installed but I'm not sure how to check if it's correctly configured
https://www.keinegotterkeinemeister.de/phpinfo.php
I enabled statistics and I can see some data related to Memcached so it seems to be working:
https://www.keinegotterkeinemeister.de/mod_pagespeed_statistics?memcached

ModPagespeedMemcachedServers localhost:11211
ModPagespeedCreateSharedMemoryMetadataCache "/var/cache/mod_pagespeed/" 51200

Is 51200 enough? We have plenty of memory on the server and the websites are big with thousands of images

It seems to have lowered TTFB, great! Now it is almost the same as with Pagespeed disabled. Thanks!

Anything suggestions to improve LCP?

@jmarantz
Copy link
Contributor

jmarantz commented Apr 12, 2022 via email

@jmarantz
Copy link
Contributor

One more thing -- you probably should control access to /pagespeed_admin and /mod_pagespeed_statistics with allow/deny directives per https://www.modpagespeed.com/doc/admin#handlers

@AnonDev1312
Copy link
Author

Thanks, we disabled defer_javascript and some other filters

However today we noticed that images are not getting converted to webp anymore after switching to Memcached
When commenting out Memcached in the config, the images are converted again to webp

@jmarantz
Copy link
Contributor

My first suspicion is that they just haven't been (re)converted into webp yet. By switching to memcached you effectively are flushing your cache. As you may have noticed, PageSpeed optimizes resources in the background when it learns they need to be optimized, but may serve the first few requests unoptimized. Is it possible that if you refresh your page a few times the images will start coming in as webp?

@AnonDev1312
Copy link
Author

I tried CTRL+F5 many many times but I didn't see any webp images in the code. As soon as I turn off Memcached, I see webp images after 2 refreshes. We let Memcached be enabled for 18 hours and didn't see any webp images on the homepage.

Memcached disabled, many webp images in the code:
view-source:https://www.keinegotterkeinemeister.de/

Memcached enabled for 18 hours with zero webp in the code:
view-source:https://www.ni-dieu-ni-maitre.com/

@AnonDev1312
Copy link
Author

AnonDev1312 commented Apr 12, 2022

I see this error in /mod_pagespeed_statistics?cache#physical_cache
Error getting memcached server status for localhost:11211

I temporarily re-enabled public statistics on both domains if you want to check

[root@ns508279 ~]# service memcached status
Redirecting to /bin/systemctl status memcached.service
● memcached.service - memcached daemon
   Loaded: loaded (/usr/lib/systemd/system/memcached.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

edit: got it enabled now, but still confused about the last line, inactive/dead

[root@ns508279 ~]# systemctl enable memcached
Created symlink /etc/systemd/system/multi-user.target.wants/memcached.service → /usr/lib/systemd/system/mem                                           cached.service.
[root@ns508279 ~]# systemctl start memcached.service
[root@ns508279 ~]# service memcached status
Redirecting to /bin/systemctl status memcached.service
● memcached.service - memcached daemon
   Loaded: loaded (/usr/lib/systemd/system/memcached.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-04-12 22:39:25 UTC; 3s ago
 Main PID: 805728 (memcached)
    Tasks: 10 (limit: 409644)
   Memory: 3.6M
   CGroup: /system.slice/memcached.service
           └─805728 /usr/bin/memcached -p 11211 -u memcached -m 64 -c 1024 -l 127.0.0.1,::1

Apr 12 22:39:25 ns508279.ip-167-114-64.net systemd[1]: Started memcached daemon.

Now the error is gone in /mod_pagespeed_statistics?cache#physical_cache

edit: it's fixed, webp images are working now

edit2: TTFB is now back above 1200ms 😭😭😭
https://speedvitals.com/report/www.keinegotterkeinemeister.de/B4rUYX/

@jmarantz
Copy link
Contributor

It's possible the reason that TTFB was good for a while is that the memcached code is good at failing-fast if it looks like memcached is not working.

I can think of three reasons generally that PageSpeed would make TTFB noticeably worse:

  • HTML parsing overhead: Running with ?PageSpeedFilters=+debug shows us that parsing overhead is less than 6ms so I don't think it's that.
  • Rewrite Deadlines: in the config you pasted above it looks like that's set to 20ms which should be fine. But it might be worth looking at the config again to see if anything deadline-related is overridden
  • General overhead and queuing on your server. PageSpeed keeps a limited number of threads for each process, with work queues for each thread, and at some scale the queue-depths might start growing. There are histograms for that on /pagespeed_admin.

@AnonDev1312
Copy link
Author

Hi,

Makes sense. I didn't understand how it was faster with Memcached failing lol

I don't think the rewrite deadline is being overwritten. I have not modified any conf file, the entire config is in Plesk GUI except for ModPagespeedCreateSharedMemoryMetadataCache

I have looked at the histograms but I'm not sure how to interpret the data
image

You can check the admin section, I put it public temporarily again
https://www.keinegotterkeinemeister.de/pagespeed_admin/

@Lofesa
Copy link
Contributor

Lofesa commented Apr 14, 2022

Hi @Ungov77 @jmarantz

Take a look on this:

https://www.keinegotterkeinemeister.de/pagespeed_admin/cache#physical_cache

See evictions and current objects stored:

curr_items: 53457
evictions: 25654735

I think is a big amount of evictions.

You are using only a 64Mb memcache database

bytes: 63936177

In the config file for memcached (normaly /etc7memcached.conf or /etc/system/memcached.conf) you can increase the size of RAM memcached uses:

change m 64 whit the size you will use.

Think you are using memcached to store those object stored in disk before.

@AnonDev1312
Copy link
Author

AnonDev1312 commented Apr 14, 2022

Hi,

The config file aas in /etc/sysconfig/memcached

What size should we increase it to? We have 64gb of ram on this server
I increased it from 64 to 6400 for now

edit: 1 hour later, Evictions are still at zero. Seems to be fixed now.
Just tested TTFB again, ran 3 tests
2.4s https://speedvitals.com/report/www.keinegotterkeinemeister.de/CmxbQ4/
670ms https://speedvitals.com/report/www.keinegotterkeinemeister.de/jXjxJs/
660ms https://speedvitals.com/report/www.keinegotterkeinemeister.de/FB2FrO/

@Lofesa
Copy link
Contributor

Lofesa commented Apr 15, 2022

hi @Ungov77
Take account that your filesystem cache was near of 40GB, if I remember +30 (Yes, I´m Longinos in the goolge group email list)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants