Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(gather-runner): use final document when reporting non-HTML error #11620

Merged

Conversation

bradfrosty
Copy link
Contributor

Summary

Fixes a bug where non-HTML page load error would be reported for sites that redirect and resolve to a valid HTML document. The main document's mime type would be checked instead of the "final" document resolved from the redirect.

This change is needed because previously, false positive errors may be reported for valid pages that ultimately respond with HTML via redirects.

Related Issues/PRs

Fixes #11585
Introduced in #11042

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks very much @bfrost2893! 🎉 this is great navigating of core Lighthouse code, and excellent tests too 👍

* @param {LH.Artifacts.NetworkRequest|undefined} mainDocument
* @returns {LH.Artifacts.NetworkRequest}
*/
static findFinalDocument(mainDocument) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about...

Suggested change
static findFinalDocument(mainDocument) {
static resolveRedirects(request) {
if (!request) return undefined;
while (request.redirectDestination) request = request.redirectDestination;
return request;

findFinalDocument sounds more like a function that would search through an array of records :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree, given it takes a single NetworkRequest.

Comment on lines 262 to 265
let finalRecord;
try {
finalRecord = NetworkAnalyzer.findFinalDocument(mainRecord);
} catch (_) {}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need try/catch around this if we end up going with the resolveRedirects option that can handle undefined :)

Suggested change
let finalRecord;
try {
finalRecord = NetworkAnalyzer.findFinalDocument(mainRecord);
} catch (_) {}
// MIME Type is only set on the final redirected document request. Use this for the HTML check instead of root.
const finalRecord = NetworkAnalyzer.resolveRedirects(mainRecord);

@@ -385,4 +388,26 @@ describe('DependencyGraph/Simulator/NetworkAnalyzer', () => {
assert.equal(mainDocument.url, 'https://www.example.com');
});
});

describe('#findFinalDocument', () => {
it('should find the final document without redirects', async () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one extra one for the undefined case?

@bradfrosty
Copy link
Contributor Author

bradfrosty commented Nov 2, 2020

@patrickhulce thanks for the review! Addressed the changes and added the additional test.

One thing I'm not sure of is why smoke_3_ToT is failing. However it seems to be failing on other branches as well.

PS: Sorry for the name change, been itching to get rid of those numbers...

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! over to @Beytoven for the final seal of 👍

lighthouse-core/gather/gather-runner.js Show resolved Hide resolved
Copy link
Contributor

@Beytoven Beytoven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything LGTM as well! Thanks.

@patrickhulce
Copy link
Collaborator

The test failure is a result of a Chrome change, we can land this once https://github.com/GoogleChrome/lighthouse/pull/11625/files lands 👍

@Beytoven
Copy link
Contributor

Beytoven commented Nov 4, 2020

The test failure is a result of a Chrome change, we can land this once https://github.com/GoogleChrome/lighthouse/pull/11625/files lands 👍

It has landed now.

@patrickhulce
Copy link
Collaborator

Awesome, @bradfrosty if you can rebase we should be able to get this landed 👍

@bradfrosty
Copy link
Contributor Author

@patrickhulce @Beytoven any idea about how to fix the CLA check?

@connorjclark
Copy link
Collaborator

CLA is blocked on "patrick.hulce@gmail.com" for some reason .... We can just override. Yours went through. Thanks!

* @param {LH.Artifacts.NetworkRequest|undefined} request
* @returns {LH.Artifacts.NetworkRequest|undefined}
*/
static resolveRedirects(request) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: @patrickhulce would this method be better on NetworkRequest?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marginally yes, but we have network analysis somewhat scattered at the moment, so I don't think it makes a large difference.

* @returns {LH.Artifacts.NetworkRequest|undefined}
*/
static resolveRedirects(request) {
if (!request) return undefined;
Copy link
Collaborator

@connorjclark connorjclark Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we handle the undefined case where this is called instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with either, but this came from this suggestion

cc @patrickhulce

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested that so it would be drop-in replacement with the rest :)

I would also have a slight preference to handle the undefined at the caller, but I don't think it matters much honestly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alrighty, I leave it to you @bradfrosty, we're good with w/e!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the undefined case to the caller -- let me know if that looks alright! Instead of handling the possible exception (like with the call to findMainDocument()), I type checked the mainDocument .

I also updated the undefined case test in network-analyzer-test.js to expect an exception to throw.

@google-cla
Copy link

google-cla bot commented Nov 5, 2020

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Nov 5, 2020

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@patrickhulce
Copy link
Collaborator

@googlebot I consent.

@@ -259,9 +261,15 @@ class GatherRunner {
mainRecord = NetworkAnalyzer.findMainDocument(networkRecords, passContext.url);
} catch (_) {}

// MIME Type is only set on the final redirected document request. Use this for the HTML check instead of root.
let finalRecord;
if (mainRecord instanceof NetworkRequest) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple truthy check is good enough, other than that this LGTM.

@@ -259,9 +261,15 @@ class GatherRunner {
mainRecord = NetworkAnalyzer.findMainDocument(networkRecords, passContext.url);
} catch (_) {}

// MIME Type is only set on the final redirected document request. Use this for the HTML check instead of root.
let finalRecord;
if (mainRecord instanceof NetworkRequest) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (mainRecord instanceof NetworkRequest) {
if (mainRecord) {

assert.equal(finalDocument.url, 'https://m.vk.com/');
});

it('should throw if not given a request', () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the parameter is typechecked to not allow undefined, we can consider this case covered trivially by the typesystem, so let's delete this test case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@patrickhulce
Copy link
Collaborator

heads up @bradfrosty there's a lint error to fix based on one of the github applied suggestions :) (unused variable)

looks great though, can merge after that!

@patrickhulce patrickhulce merged commit 4d3bda1 into GoogleChrome:master Nov 5, 2020
@bradfrosty bradfrosty deleted the non-html-final-document branch November 5, 2020 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect main document used when retrieving page load errors
5 participants