
☔ Pagespeed Insights results differ from lighthouse in chrome #6708

Open
vertic4l opened this Issue Dec 3, 2018 · 55 comments


vertic4l commented Dec 3, 2018

Hey there!

I'm optimizing a mobile site and Lighthouse reported a score of 92. So far so good, I thought. But after checking with PageSpeed Insights, which also uses Lighthouse, I'm getting a score of 57.

Is there any reliable way to get the same score?


exterkamp commented Dec 3, 2018

Hey, thanks for reaching out!

To help diagnose this we'll need the URL you're testing and the context you're running Lighthouse in.

How did you get the 92? Are you running Lighthouse in DevTools in Chrome? Via the node CLI? And what throttling settings are you using?
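For reference, here is a minimal sketch of a reproducible run via the node module with the throttling method and form factor made explicit; the URL and option values are placeholders, and the CLI flags in the comment are rough equivalents for a 4.x CLI.

```js
// Minimal sketch of a reproducible Lighthouse run from Node (3.x/4.x-era options).
// Rough CLI equivalent: lighthouse https://example.com --emulated-form-factor=mobile
//                       --throttling-method=simulate --only-categories=performance --output=json
const lighthouse = require('lighthouse');
const chromeLauncher = require('chrome-launcher');

async function run(url) {
  const chrome = await chromeLauncher.launch({chromeFlags: ['--headless']});
  const {lhr} = await lighthouse(url, {
    port: chrome.port,
    onlyCategories: ['performance'],
    emulatedFormFactor: 'mobile',  // placeholder: match whatever you used in DevTools
    throttlingMethod: 'simulate',  // 'simulate' | 'devtools' | 'provided'
  });
  console.log('Performance:', Math.round(lhr.categories.performance.score * 100));
  await chrome.kill();
}

run('https://example.com'); // placeholder URL
```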


vertic4l commented Dec 4, 2018

Hey @exterkamp,

I ran an audit with Lighthouse in DevTools in Chrome and have now tested with the node CLI as well.
Lighthouse in DevTools currently gives a Performance score of 92. The CLI (latest, 4.0.0-alpha.2-3.2.1) reports 75 for Best Practices and 0 for Performance. Either way, there's still a huge difference from the PSI score. I've mailed you the corresponding website!

These are my settings: [screenshot]

Last test with Lighthouse in DevTools: [screenshot]

Last test with PSI: [screenshots]

Last test with the CLI: [screenshot]


exterkamp commented Dec 4, 2018

So I ran that URL against all channels and I'm finding things to be consistently in the high 50s / low 60s for Performance. I would say that's within expected variance.
Node v4.0.0-alpha.1: [screenshot]
PSI (I also force-ran against an EU data center and got similar results): [screenshot]
DevTools in production Chrome v3.0.3: [screenshot]

The 0 for Performance in the CLI definitely seems like an error; that site shouldn't get a 0. If it consistently does without surfacing an error, that might be a bug, or something might be blocking Lighthouse from running locally.

If you run from other machines, does DevTools still give such a high score? That seems unreasonably high given those settings. I'm seeing a consistent ~10 seconds for TTI, but that DevTools screenshot shows ~4 seconds, which seems oddly fast compared to the other runs.


vertic4l commented Dec 6, 2018

@exterkamp Thanks for testing it. It's odd that there's no Performance score when using the CLI version (although Best Practices gets a score as you can see). And it's very odd that the TTI differs so much.


vertic4l commented Dec 18, 2018

@exterkamp So, I made some changes to pass more of the audits for mobile.

Lighthouse in DevTools
Passed audits: 15
Performance Score: 83

Pagespeed Insights
Passed Audits: 12
Performance Score: 45 (sometimes 55)

Both tests were run against the newly updated production site.


AlexVadkovskiy commented Dec 18, 2018

Currently on Chromium Version 71.0.3578.80 I get the same weird results for almost any website. Lighthouse in DevTools is always 25-45 points better than on PSI.
example: https://www.omgubuntu.co.uk/ (got 82 in lighthouse and around 50 on PSI)
[screenshots]

Could this be related to DNS lookups or something similar? Could the PSI server have slightly slower access to the tested website than my local machine (depending on whether the server is actually located in the US or the EU)?


vertic4l commented Dec 18, 2018

@AlexVadkovskiy just checked the website (https://www.omgubuntu.co.uk/) from Germany.

PSI: 29
Lighthouse in DevTools: 90

(mobile scores)


exterkamp commented Dec 18, 2018

Hmmm checking out https://www.omgubuntu.co.uk/ I get:
~80 from cli on v4.0.0-alpha.1
~40 from pagespeed insights (running v4.0.0-alpha.1)

So this is solidly reproducible for that site.

I force-ran PageSpeed in the EU and am seeing similar results. I will say that PageSpeed has a consistent ~200ms TTFB while local runs have ~60ms TTFB, and that URL seems to serve more images to PageSpeed for some reason. Some odd behavior for sure, and some questionable latency that might be us or the site.

@patrickhulce might want to take a look at this from a lantern perspective on why this could be different/maybe something is up with the trace.

@paulirish might want to take a look at this from a global pagespeed latency perspective.


patrickhulce commented Dec 19, 2018

@exterkamp Is this the same case that @wardpeet brought up? Some of his runs in LR had every request duplicated, which would immediately explain 2x Lantern predictions.

@AlexVadkovskiy the Chromium difference is also probably explained by #6772 FYI


exterkamp commented Dec 19, 2018

@patrickhulce I remember that, but I can't find that issue/discussion. Do you have a link to it, or can you bump it to mention this thread?


patrickhulce commented Dec 19, 2018

but I can't find that issue/discussion. Do you have a link to it, or can you bump it to mention this thread?

IIRC that's because we just discussed it over chat :) @wardpeet do you happen to still have those?


vertic4l commented Jan 2, 2019

@exterkamp any update here?


vertic4l commented Jan 14, 2019

@patrickhulce So there is a bug in Chrome that's mentioned in other issues. Will that fix the PSI score as well?


patrickhulce commented Jan 14, 2019

@vertic4l to which bug are you referring?


exterkamp commented Jan 14, 2019

Oops this got buried! Missed that bump. I will definitely be looking into these URLs again soon. So far we have:

Taking a guess, the bug might be ignoring flags? But that shouldn't be an issue because this was reported before that version of Chrome shipped iirc.

exterkamp self-assigned this Jan 14, 2019


ashtonlance commented Jan 15, 2019

I'm having the complete opposite experience with this. My DevTools Lighthouse is giving me a score of 37, whereas PSI is giving a score of 91.

The url in question is: https://biketours.com


patrickhulce commented Jan 15, 2019

@ashtonlance I'm seeing a PSI score of 24 for that URL: [screenshot]


ashtonlance commented Jan 15, 2019

@patrickhulce Sorry, those numbers I gave were the desktop scores.


patrickhulce commented Jan 15, 2019

@ashtonlance Ah you're probably experiencing #6772 then. Give it a whirl in Chrome Canary.


ashtonlance commented Jan 15, 2019

Lighthouse is busted for me in Canary Version 73.0.3672.0.

[screenshot]

I was able to get the Lighthouse CLI to produce similar results as PSI by turning off emulation and throttling.
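For anyone who wants to try the same comparison, here is a rough sketch of what "emulation and throttling off" looks like via the node module (assuming Lighthouse 4.x option names; the URL is just the one from this thread):

```js
// Rough sketch: run Lighthouse with device emulation and throttling disabled
// (assumes Lighthouse 4.x option names; roughly --emulated-form-factor=none
// --throttling-method=provided on the CLI).
const lighthouse = require('lighthouse');
const chromeLauncher = require('chrome-launcher');

async function runUnthrottled(url) {
  const chrome = await chromeLauncher.launch({chromeFlags: ['--headless']});
  const {lhr} = await lighthouse(url, {
    port: chrome.port,
    onlyCategories: ['performance'],
    emulatedFormFactor: 'none',   // no mobile screen/UA emulation
    throttlingMethod: 'provided', // use the real connection/CPU, no simulated slowdown
  });
  console.log('Performance:', Math.round(lhr.categories.performance.score * 100));
  await chrome.kill();
}

runUnthrottled('https://biketours.com/');
```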


amaladevi-r commented Jan 16, 2019

@exterkamp I'm seeing the same issue, where Lighthouse reports a high score while PSI shows a lower one. You can try this URL:

https://www.bankbazaar.com/credit-card.html?mobileSite=true


vertic4l commented Jan 16, 2019

[screenshot]
Well, not that bad...


patrickhulce commented Jan 16, 2019

@ashtonlance Something is off; that's Lighthouse 3.0.0-beta.0, which is ~8 months old, not the right Canary version 🤔


ashtonlance commented Jan 16, 2019

@patrickhulce Weird indeed. That was from a build I downloaded at https://www.google.com/chrome/canary/. However, I just downloaded a fresh copy and all seems to be good now.


vertic4l commented Feb 1, 2019

@exterkamp any news?


Lux589 commented Feb 8, 2019

Hi @exterkamp,

We are experiencing this issue on all of our sites: Lighthouse shows a higher score and PSI shows a lower one. Here are the sites we see it on: https://skinrenewal.co.za, https://bodyrenewal.co.za, https://healthrenewal.co.za, https://duram.co.za

[screenshots]


sdenathaniel commented Feb 8, 2019

We are also experiencing some wide inconsistencies,

DevTools against the production server, running Mobile 3G / 4x slowdown: [screenshot]

Scores twice as high as PSI: [screenshot]

Should we be concerned about SEO implications from PSI, especially if it's returning lower numbers?


duartegarin commented Feb 9, 2019

@exterkamp Thank you for the info; it definitely did stabilise, although still lower than expected.

I guess the main issue that persists, as some others have pointed out, is the considerable inconsistency between PSI and Lighthouse.

On all pages, Lighthouse (both the Chrome audit and the node CLI tool) always shows higher scores than PSI, and this makes it difficult to create a proper testing process.

Can anyone clarify what causes these differences?

paulirish changed the title Pagespeed Insights results differ from lighthouse in chrome → ☔ Pagespeed Insights results differ from lighthouse in chrome Feb 13, 2019

paulirish added the P1 label and removed the needs-priority label Feb 13, 2019


florianjung commented Feb 15, 2019

I experience the same inconsistencies using PSI and Lighthouse in DevTools for the domain https://www.fitness-tracker-test.info

While I get a performance score of 98-100 using Lighthouse, PSI scores only 65-70. There is no difference between using the built-in DevTools Lighthouse or the node CLI; both show far higher scores.

[screenshots]

@exterkamp I'm joining this issue thread in the hope of providing some more information to help identify and resolve the issue.


snake-345 commented Feb 21, 2019

Hello, I ran into the same problem. While I was trying to find out what the problem is, I created a really simple page that scores 100 in Chrome Lighthouse and about 85 in PSI. Maybe it can help with further investigation:
Simple Page: this page contains only twenty dummy JS files, each of which only prints performance.now() to the console.
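Each dummy file is essentially just a one-liner along these lines (a sketch of the idea):

```js
// dummy1.js … dummy20.js: each script only logs the time at which it executed.
console.log('dummy script executed at', performance.now());
```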

PSI results
I ran the PSI test five times and got these results: 84, 69, 85, 88, 86.
You can see the full reports here: PSI.zip

Google Chrome Lighthouse results
I ran the test five times and got these results: 99, 100, 99, 99, 100.
You can see the full reports here: Google-chrome-lighthouse.zip


benschwarz commented Feb 21, 2019

@snake-345, please bear in mind that PageSpeed Insights runs in a data center somewhere and will definitely have different performance characteristics than running Lighthouse on your local machine.

The numbers you've posted, while having some variance, are pretty stable overall.


snake-345 commented Feb 21, 2019

@benschwarz But when I run Lighthouse in Chrome, I'm testing a site hosted on GitHub's servers, which has to serve those requests too (just as it does for PSI). Isn't that too big a difference in scoring between requests sent from my local machine to the server and requests sent from PSI to the server?


benschwarz commented Feb 21, 2019

Isn't that too big a difference in scoring between requests sent from my local machine to the server and requests sent from PSI to the server?

Yeah! Well… there can be. Your local machine is likely faster than the servers used to test PSI and your connection speed will be different too!

PSI will throttle the network for "desktop" to a "4G"-like connection.


sdenathaniel commented Feb 21, 2019

@benschwarz You should note that I have been running our Lighthouse audits with 3G throttling, so if PSI is running 4G throttling, that would make the discrepancies even MORE alarming.


snake-345 commented Feb 21, 2019

@benschwarz But I used throttling in Lighthouse, like @sdenathaniel.


duartegarin commented Feb 22, 2019

Hi everyone.
I find this somewhat strange.

So many reports from different users in different regions cannot be a coincidence. The discrepancies aren't minor; we see variations of around 15-20%, and that's not marginal.

Also, the results are consistent within each tool: running a report via the CLI or Chrome consecutively yields similar results (yes, with an expected level of variance). But when comparing with PSI, PSI is always lower.

We need to understand this as it's hard to comply with the metrics if they aren't reproducible.

All of us seem to be using throttling and other simulated behaviours. Is there anything else we should be doing?

Thanks!


benschwarz commented Feb 22, 2019

[screenshot]

I investigated a trace captured by PageSpeed / Lighthouse and observed the benchmarkIndex for this test: it was 762.

🤔 What is the benchmark index anyway?

The benchmark index is a quick (but not hugely clever) test that Lighthouse runs to assess how fast the machine is before conducting the audit.

It's an abstract number, but it is rather effective at describing the speed of the machine via JS runtime performance.

For example, my current-model MacBook Pro yields a benchmarkIndex of around 1200. As you'll note, that's far higher than the desktop trace from PageSpeed.

This is most likely the reason why your local tests are so different from PSI's: the servers are possibly working on other tasks, or running in containers or some other form of virtualisation that is specific to Google.
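If you want to check your own machine, the value is written into the Lighthouse JSON report under environment.benchmarkIndex; a minimal sketch of pulling it out of a local report (the file path is just an example):

```js
// Minimal sketch: read benchmarkIndex from a local Lighthouse JSON report.
// Generate one first with: lighthouse <url> --output=json --output-path=./report.json
const fs = require('fs');

const lhr = JSON.parse(fs.readFileSync('./report.json', 'utf8')); // example path
console.log('benchmarkIndex:', lhr.environment.benchmarkIndex);
console.log('Performance score:', lhr.categories.performance.score * 100);
```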


sdenathaniel commented Feb 22, 2019

@benschwarz This is very interesting, but how would we check our benchmark index locally? I can see it pretty clearly for PSI in the network tab (see responses below), but since the audit happens locally it doesn't show up in the network tab.

*** Edit ***
Is this the benchmark index? It is labeled differently but seems to be a similar format.
[screenshot]

Also note that on this PSI run the returned Mobile index was higher than the Desktop one, and nowhere near as high as yours.

*** Desktop ***

"lighthouseResult": {
"requestedUrl": "http://shootdotedit.com/",
"finalUrl": "https://shootdotedit.com/",
"lighthouseVersion": "4.1.0",
"userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/72.0.3603.0 Safari/537.36",
"fetchTime": "2019-02-22T19:53:10.201Z",
"environment": {
"networkUserAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3559.0 Safari/537.36",
"hostUserAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/72.0.3603.0 Safari/537.36",
"benchmarkIndex": 537.0
},
"runWarnings": [],
"configSettings": {
"emulatedFormFactor": "desktop",
"locale": "en-US",
"onlyCategories": [
"performance"
]
},

*** Mobile ***

"lighthouseResult": {
  "requestedUrl": "http://shootdotedit.com/",
  "finalUrl": "https://shootdotedit.com/",
  "lighthouseVersion": "4.1.0",
  "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/72.0.3603.0 Safari/537.36",
  "fetchTime": "2019-02-22T19:53:09.330Z",
  "environment": {
    "networkUserAgent": "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3559.0 Mobile Safari/537.36",
    "hostUserAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/72.0.3603.0 Safari/537.36",
    "benchmarkIndex": 659.0
  },
  "runWarnings": [],
  "configSettings": {
    "emulatedFormFactor": "mobile",
    "locale": "en-US",
    "onlyCategories": [
      "performance"
    ]
  },
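For what it's worth, a response in this shape can also be fetched directly from the PSI v5 API instead of digging through the network tab; a minimal Node sketch, where the target URL is just an example and an API key parameter may be needed for regular use:

```js
// Minimal sketch: query the PSI v5 API and print the environment info shown above.
const https = require('https');

const target = encodeURIComponent('https://shootdotedit.com/');
const api = `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=${target}&strategy=mobile`;

https.get(api, (res) => {
  let body = '';
  res.on('data', (chunk) => (body += chunk));
  res.on('end', () => {
    const lhr = JSON.parse(body).lighthouseResult;
    console.log('Lighthouse version:', lhr.lighthouseVersion);
    console.log('benchmarkIndex:', lhr.environment.benchmarkIndex);
    console.log('Performance score:', lhr.categories.performance.score * 100);
  });
});
```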

*** Summary Question ***
If the 2x factor is JS execution related would that suggest that if a page of static html is loaded on PSI and Lighthouse Audit the results should be nearly 1:1 as JS execution time and script eval is not happening.


benschwarz commented Feb 22, 2019

Is this the benchmark index? It is labeled differently but seems to be a similar format.
Yep. That's it.

*** Summary Question ***
If the 2x factor is JS execution related would that suggest that if a page of static html is loaded on PSI and Lighthouse Audit the results should be nearly 1:1 as JS execution time and script eval is not happening.

Sorry I don't understand what the question is here. Could you rephrase?

Also note that on this PSI run the returned Mobile index was higher than the Desktop one, and no where near as high as yours.

Yeah. I ran another pagespeed test this morning and observed a similar 600 range benchmarkIndex.

It seems like PSI's machines have variability in regards to CPU activity or perhaps even machine type (Let's remember, we have no idea what machines these are running on).

A couple of points worth noting:

  • Certain sites (read: sites with lots of JavaScript) will be more affected by CPU / main-thread work than others. They'll likely vary more, too.
  • Mobile runs are run with CPU simulation. The benchmark index that you're seeing for a mobile run is taken while Chrome adds delays to main-thread tasks in order to simulate a mobile CPU (the default simulated throttling constants are sketched just below this list).
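For reference, these are (to the best of my knowledge) roughly the default mobile throttling constants that Lighthouse simulates against; treat the exact numbers as indicative rather than authoritative:

```js
// Approximate Lighthouse default mobile ("slow 4G") simulated-throttling settings,
// as passed via settings.throttling; values are indicative of the shipped defaults.
const mobileThrottling = {
  rttMs: 150,                 // simulated round-trip time
  throughputKbps: 1.6 * 1024, // ~1.6 Mbps of downlink throughput
  cpuSlowdownMultiplier: 4,   // main-thread tasks slowed 4x to mimic a mid-tier phone
};
```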

TL;DR: PageSpeed's test machines are more than likely slower than your local desktop, which causes scores to be lower than when you run the tests on your own machine.

I believe there will be some work done to calibrate a test based on the benchmarkIndex, but that has not been completed yet.

Comparing tests conducted in different locations, on different hardware is not going to yield the same result.

At Calibre we've developed a number of strategies to mitigate these issues:

  • We have some machine learning that watches a given site's metrics during monitoring. If a result is detected outside of an expected range (for certain metrics), the test is quarantined and run up to three times before the result is "believed". This happens transparently, so you'll never see a result that we're not 100% sure of.
  • Our machines have been carefully selected to provide stable performance. This made a huge impact on stability.
  • A whole bunch of other details

I think it's fair to say that what PSI is giving you is still a solid and believable metric to work with, but it's maybe just not the experience that you might imagine. If you go ahead on the basis of understanding that tests run in different places on different hardware will give different results, then I believe you'll find some comfort. 👍

I hope all my details have been useful. I'm on leave next week but will pickup any questions thereafter.


sdenathaniel commented Feb 22, 2019

@benschwarz Yes, that is very helpful. So basically our options are either to run PSI against an all-else-equal staging copy of the live site, OR to plan for a discrepancy proportional to the amount of JS on the page. Then, in the future, if test results are normalized based on benchmarkIndex, we should see a 1:1.


vertic4l commented Feb 28, 2019

@benschwarz Very interesting, but how come PSI was just fine in terms of mobile scores until 2-3 months ago? The crazy low mobile scores appeared after something changed in PSI.
v4 was fine; v5 smashed the score into the ground.


benschwarz commented Mar 3, 2019

AFAIK PageSpeed didn't use Lighthouse before v5 and wasn't actually performing an analysis of your site; it was checking it against a bunch of static rules (e.g. compress your scripts, use small images) that added up to a final score. That's why it was more consistent.


vertic4l commented Mar 4, 2019

@benschwarz Thanks for the explanation. Unfortunately it's now very inconsistent and no longer reliable. We are still struggling to get back into the green (still 45-55).


benschwarz commented Mar 4, 2019

@vertic4l, I don't want to push our product here, but I'd be interested in hearing how stable you find Calibre for your site, given the troubles you've been having with PSI.


vertic4l commented Mar 11, 2019

@benschwarz Doesn't matter what other tools are saying. PSI is the one that counts for the company.


benschwarz commented Mar 11, 2019

Doesn't matter what other tools are saying. PSI is the one that counts for the company.
FYI, Calibre runs Lighthouse. The PSI score is the same as the Performance score from Lighthouse.

Based on where I think PSI is at right now, I think you're going to have a bumpy time trying to rely on it. Good luck all the same. ✌️


duartegarin commented Mar 11, 2019

@benschwarz Just to provide some context, the reason many of us want to rely on Lighthouse is that we are extremely dependent on SEO.

And given that Lighthouse is Google's tool, we like to align ourselves with it as much as possible, to ensure we score as well as possible in the way Google measures performance and other indicators.

This is my case at least. Hope that makes sense.


benschwarz commented Mar 11, 2019

@duartegarin Yeah absolutely. It sucks that PSI is so unreliable at the moment. It'll get better, but it's going to require a lot of patience and trust from all the people who use it.


bflopez commented Mar 14, 2019

Glad I found this issue. We have been using PageSpeed Insights since before they switched to Lighthouse and something DEFINITELY changed recently. I used to be able to get scores from Lighthouse in DevTools that were consistent with PSI until maybe 1-2 months ago. Not sure what changed. I honestly thought it was my fixes and just kept smashing ahead against the wall. Glad it is not just me.

Right now PSI says my main-thread work and JavaScript execution time are something like 40-60 seconds! That is crazy, right? There's no way that can be accurate. I run the exact same page through DevTools and get a more plausible 6-10s.

Google does have another page speed tool: https://www.thinkwithgoogle.com/feature/testmysite but I am not sure if that just uses PSI underneath. Anyone know?


graylaurenm commented Mar 21, 2019

I'm seeing this same thing, specifically on sites with LiteSpeed.

As a baseline, if I test https://www.stetted.com/ (not LiteSpeed):

  • PSI in 80s
  • DevTools in 80s
  • CLI in 70s

However, here's an example of a site on LiteSpeed, https://diabetesstrong.com/:

  • PSI in 40s
  • DevTools in 70s
  • CLI in 70s
    edit: actually, the live site does have the LiteSpeed module on the server, but is currently using WP Rocket for caching; sorry for the confusion.

Now, if I put that same site on a different server, http://ds-staging.flywheelsites.com/:

  • PSI in the 60s
  • DevTools in the 70s
  • CLI in the 60s

And I can replicate that with https://bakingmischief.com/ (also LiteSpeed) and http://bm-stage.flywheelsites.com/.

  • PSI in 30s -> 80s
  • DevTools in 80s -> 80s
  • CLI in 60s -> 70s

I understand that DevTools and CLI could be higher because they have better resources & connection. I also realize that changing servers will impact performance and thus scores.

I'm not really connecting all the dots, though... in terms of paint, LS seems to perform very well on my devices. Is this an issue where LiteSpeed is specifically disadvantageous on slow connections? That's been my primary takeaway from the above examples and conversation. There's no real bug here?
