runTimeError exists when there are NO_ERRORs #6336

ebidel · 2018-10-18T06:03:14Z

Ran into this working on a web server that returns LH results. It's unexpected that runTimeErrors exists in the LHR if there were no errors! It also requires users to check for a "NO_ERROR" string code which overcomplicates matters.

    if (lhr.runtimeError && lhr.runtimeError.code !== 'NO_ERROR') {
      throw new Error(`${lhr.runtimeError.code} ${lhr.runtimeError.message}`);
    }

Environment Information

Affected Channels: all
Lighthouse version: 3.2.1
Node.js version: 10+

The text was updated successfully, but these errors were encountered:

patrickhulce · 2018-10-18T14:21:54Z

I'm missing rationale for this too.

In #6014, @brendankenny said

I'd prefer to continue throwing exceptions for these cases and then checking outside core in runLighthouseForLR that the thrown error has lhrRuntimeError set on it. Most of our clients just want a thrown error in those cases, not a pseudo-LHR, and dealing with a pseudo-LHR just makes tracking the LHR type and what's on it messier when we really just want an exception floated up.

Which makes a lot of sense to me too. Why did we not go that route? :)

ebidel · 2018-10-18T15:13:02Z

+1 just to echo that, it was unclear in our tool that consumes lhrs that the results we were saving were actually illegitimate. And you almost never want a psuedo real result... doesn't tell you much :)

…

On Thu, Oct 18, 2018, 7:22 AM Patrick Hulce ***@***.*** wrote: I'm missing rationale for this too. In #6014 <#6014>, @brendankenny <https://github.com/brendankenny> said I'd prefer to continue throwing exceptions for these cases and then checking outside core in runLighthouseForLR that the thrown error has lhrRuntimeError set on it. Most of our clients just want a thrown error in those cases, not a pseudo-LHR, and dealing with a pseudo-LHR just makes tracking the LHR type and what's on it messier when we really just want an exception floated up. Which makes a lot of sense to me too. Why did we not go that route? :) — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#6336 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAOigLmtwoXJdG_0ePFjYY-LTJKaA5k5ks5umI6JgaJpZM4XsuU0> .

exterkamp · 2018-10-18T17:16:14Z

This has implications for proto enums as well. @paulirish

brendankenny · 2018-10-18T17:24:59Z

#6330 is also part of this

paulirish · 2018-10-19T06:37:43Z

runtimeError for failed document requests

@ebidel currently in the case of a 500 or a DNS resolving error for the page being tested, we do return an LHR with a populated runtimeError object. So you're going to have to check this prop before considering the rest of the LHR valid enough to read. (But.. more on NO_ERROR below)

On throwing and pseudo-LHRs

I'm not sold that throwing to our node module consumers is useful, but we can leave that be. I'll live with our current approach.

However, for CLI users (and I'm mostly considering CLI use at scale here), I don't think communicating over exit code is tremendously effective. In order to convey any detail about the error we'd need to log something, but that now means the user has to parse stderr, which we've been logging status to this whole time. :/

So I'd argue that CLI users would be better served with a psuedo-LHR with the runtimeError object populated.

NO_ERROR

Sounds like nobody likes lhr.runtimeError being set if there was no error. I can understand that. While it's more idomatic in JS land to leave it off, in the proto, we do need the field defined and initialized with NO_ERROR. However, we think we can populate it after going JSON->Proto. So that seems like the rough plan here.

API flexibility

Last, we need to confirm what we can do on the PSI API side of things. We don't have unlimited flexibility there, but after discussing it, @exterkamp thinks we can probably handle this well. But whatever the plan is, it needs to be compatible there, because those LHRs will be consumed more than LHRs from the CLI. ;)

patrickhulce · 2018-10-19T15:05:50Z

I'm not sold that throwing to our node module consumers is useful

😮 I am so curious why, but I'll table that for now too :)

Thanks for the context!

I don't think communicating over exit code is tremendously effective

I can understand getting more information about the error is more difficult in the CLI case, but I'm curious who those CLI users are where getting the error string easily is more important than detecting that an error happened easily.

I would expect that the only actionable information it gives anyone not on this team is that they need to re-run LH. From bash script angle, being able to tell LH failed just by exit code is a huge win over parsing JSON and checking if a specific property is equal to a particular value to figure out there was a failure. This also assumes they're not using the CLI to generate HTML output in which case determining there was an error seems impossibly difficult. Are we at least considering doing both? Populating a JSON runtimeError for those who need more detail and still exiting with 1 + basic logs to stderr?

in the proto, we do need the field defined and initialized with NO_ERROR

I should've known proto was to blame :) does this mean it's the default and won't be set in the API response? I forget where we landed on if our proto-version does that.

Last, we need to confirm what we can do on the PSI API side of things.

I probably missed this, but do our error responses and the LHR have to be same proto? In my experience, it's very common for error responses with a non-200 status to have very different schemas.

ebidel · 2018-10-19T19:18:12Z

I probably missed this, but do our error responses and the LHR have to be same proto? In my experience, it's very common for error responses with a non-200 status to have very different schemas.

In fact, I think that's what a lot of Google service APIs do. The return object is something like {errors: ...} instead of the normal object you get back with an embedded errors property you have to check. IOW, you get the response status code and response body tell similar stories (there was a failure).

paulirish · 2018-10-19T21:31:30Z

A few updates:

The proto story looks okay
Yes you're right about Google service APIs. If we return a real runtimeError, the API will return a 500 with a lil object. No LHR.

paulirish · 2018-10-22T23:57:02Z

Are we at least considering doing both? Populating a JSON runtimeError for those who need more detail and still exiting with 1 + basic logs to stderr?

Yes we'd def still set an exit code correctly.

Current status is...

Node module has a few cases where it throws. Any clients who don't like that have to catch on their own.
lhr.runtimeError object should only be created if there's an error. If there's not, then that prop will be undefined.
API will not return an empty or NO_ERROR runtimeError object
API will respond with 500 in case of a runtimeError, though the response will be some other JSON, but not the LHR.
CLI should return non-0 exit codet if runtimeError is populated
CLI will set an exit code for if it's catching an exception.
In the case mentioned immediately above, there's no LHR, so no lhr.runtimeError. Perhaps there should be?

I believe this reflects the consensus. AFAICT, node module + API behavior are settled, but we have some unresolved questions around CLI behavior.

exterkamp · 2018-12-03T20:57:04Z

I will try to organize this a little bit so that we don't get out of hand with #6671.

Error Definitions

It seems like there are two main kinds of Lighthouse errors:

"Audit Errors" which are small localized errors that are reported at the audit level.
- "Gatherer Errors" can cause Audit Errors. Think PageLoadError which is fatal, but presents itself by erroring out all gatherers which leads to most audits having an error.
"Runtime Errors" which are large errors that are catastrophic to the Lighthouse Run.
- "Fatal Errors" Errors that immediately cause the run to terminate e.g. PROTOCOL_ERROR when getting browser version, i.e. errors we can never recover from and LH will never make sense with one of these errors occurring. For clarity I will only refer to "runtime errors" since they are equivalent to "fatal errors", I think I introduced confusion around this 😨 sorry

Error Actions

As noted in this convo, surfacing these runtime errors can be a problem. This seems like the consensus.

CLI	Lightrider	node	DevTools/Extension
Non 0 Exit Code + save an LHR with runtimeError populated	Graceful exit with LHR with runtimeError populated.	~~Throw error + save output with runtimeError populated (either LHR if json or html report if html)~~	Don't worry about it?

PSI

PSI should return an HTTP status code that is consistent with the LHR inside it. Currently it will return a 200 if there is a response from LR, even if that response is that a runtime error has occured. This is not desired and should return 200 if successful (no runtimeErrors or runtimeWarnings) and a 5XX if the run was not successful.

Also PSI shouldn't mirror what LR encountered e.g. if Lightrider gets a 404 PSI should return a 5XX not a 404. This does make me want a better way to show that there was an underlying status code/details to the error. After #6457 the error codes will have some more explicit status codes appended to them, but it might be prudent to add a field to Lighthouse_result.runtime_error something like underlying_status_code so that it is simple to parse what the cause was exactly if it was a page load problem, or possibly error_details. Basically something that could be parsed easier for better error reporting/bucketing.

Eng to make this happen

Runtime errors should generate the most descriptive LHR possible and return it. Not just throw and error.
- The CLI wil have to return exit codes based on the runtimeError, not catching thrown errors, but it should also always save an LHR.
- LR will return any LHR it gets, not construct a little error-msg-only pseudo LHR.
- PSI should return 5XX on errors w/explicit status codes or protocol methods included in reseponse
Refactor PageLoadError into the Runtime Error that it is and stop erroring the gatherers, and just exit with an error like a protocol error during artifact generation.

patrickhulce · 2018-12-03T21:27:44Z

Thanks for the clarification @exterkamp this helps quite a bit!! Few Qs

This seems like the consensus.

We're missing node module in the table and it sounds like the result of this is that the node module will not throw on "Runtime errors" anymore? This seems somewhat unfortunate as the first thing all consumers will have to do is the same reason the issue was filed in the first place if (lhr.runtimeError) throw new Error(...).

The CLI, LR, and PSI end-behavior all sound great, but given that an error is just an object, can we just attach whatever additional information to it we need?

Refactor PageLoadError into the Runtime Error that it is

This isn't really the case for every pass, some passes it's OK for PageLoadError to just be a gatherer error, and move on and in fact is why we converted it from a fatal error to a non-fatal error. Is the core of what we're getting at here when the root document of the default pass errors we want the run to stop and produce a runtimeError? Because that seems like a more specific problem than all PageLoadErrors Maybe just better discussed on the actual PR though.

Graceful exit with LHR, save an LHR with runtimeError populated

I think a hurdle for me is the idea that you get an "LHR" when there's a fatal runtimeError. WDYT if we treated it exactly like we want the PSI API to work, you don't get back an LHR but an error response with details. I think the fact that you get an LHR is one of the hurdles of the original issue filing and if the node module instead returned an object like {runtimeError: error, lhr: undefined} instead of {lhr: {runtimeError: error}} I could begin to see the argument for not throwing (though I'd still fight it 😃).

exterkamp · 2018-12-06T22:38:40Z

So @patrickhulce response made me look more into the LR errors, and I finally found the root cause (#6739)! So I am going to table more advanced error handling (#6671) for later. But I definitely think that PSI error handling needs to be re-evaluated next, but I'm going to leave LH CLI, node and others for later.

exterkamp · 2019-02-15T18:51:26Z

Okay picking this back up.

Current state of affairs:

node output=json	PSI-API	Lightrider Proto
`"runtimeError": {"code": "NO_ERROR", "message": ""},`	`"runWarnings":[]` and no runtimeError 😄	No runtimeError and an empty runWarnings ListValue! 😄

This is the result of the proto definition for Lighthouse Errors using NO_ERROR as the default enum of 0, therefore in proto3 they are left blank since it is the default.

Are we satisfied with this outcome? Do we also want to remove NO_ERRORs from the LHR by default (I think the main reason they were there in the beginning was PSI, which doesn't need them anymore).

Generating a set of, what I'm still going to call emergency-artifacts and a bailout-LHR has come up again in #6802. So that might be related to LR error handling as well.

patrickhulce · 2019-02-15T19:29:43Z

I'm not really satisfied with the node outcome right now. Eliminating the "runtimeError": {"code": "NO_ERROR"} was the original point of the issue I believe, so seeing that through in all environments seems reasonable.

I think I still stand by my opinions/suggestions in #6336 (comment), so good to see a few months hasn't changed me too much :)

exterkamp · 2019-02-16T00:50:33Z

I thought this would be the answer 😉 I'll work on refactoring the node to have no runtimeError.

connorjclark · 2019-05-20T19:41:07Z

@exterkamp follow up pr for exit code (#7358 (comment)) then we can close this issue?

exterkamp · 2019-05-28T18:20:06Z

Yup afaict I think this is 💯

ebidel changed the title ~~runTimeErrors exists when there are NO_ERRORs~~ runTimeError exists when there are NO_ERRORs Oct 18, 2018

exterkamp mentioned this issue Oct 22, 2018

misc(proto): make NO_ERROR default in proto #6358

Merged

This was referenced Nov 17, 2018

INSECURE error not working as expected #6595

Closed

core: INSECURE_DOCUMENT_REQUEST errors still return lhr #6608

Merged

patrickhulce mentioned this issue Nov 28, 2018

core(gather-runner): gracefully recover from fatal errors #6671

Closed

exterkamp mentioned this issue Dec 6, 2018

core(driver): handle promise rejections from sendCommand #6739

Merged

paulirish added the needs-priority label Dec 17, 2018

paulirish added P1 and removed needs-priority labels Feb 13, 2019

paulirish assigned exterkamp Feb 13, 2019

exterkamp mentioned this issue Mar 2, 2019

core: remove NO_ERROR from default responses #7358

Merged

patrickhulce mentioned this issue Apr 16, 2019

core(gather-runner): treat NO_FCP as a pageLoadError #8340

Merged

exterkamp closed this as completed May 28, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

runTimeError exists when there are NO_ERRORs #6336

runTimeError exists when there are NO_ERRORs #6336

ebidel commented Oct 18, 2018

patrickhulce commented Oct 18, 2018

ebidel commented Oct 18, 2018 via email

exterkamp commented Oct 18, 2018

brendankenny commented Oct 18, 2018

paulirish commented Oct 19, 2018

patrickhulce commented Oct 19, 2018

ebidel commented Oct 19, 2018

paulirish commented Oct 19, 2018

paulirish commented Oct 22, 2018 •

edited by exterkamp

Loading

exterkamp commented Dec 3, 2018 •

edited

Loading

patrickhulce commented Dec 3, 2018

exterkamp commented Dec 6, 2018

exterkamp commented Feb 15, 2019

patrickhulce commented Feb 15, 2019

exterkamp commented Feb 16, 2019

connorjclark commented May 20, 2019

exterkamp commented May 28, 2019

runTimeError exists when there are NO_ERRORs #6336

runTimeError exists when there are NO_ERRORs #6336

Comments

ebidel commented Oct 18, 2018

Environment Information

patrickhulce commented Oct 18, 2018

ebidel commented Oct 18, 2018 via email

exterkamp commented Oct 18, 2018

brendankenny commented Oct 18, 2018

paulirish commented Oct 19, 2018

runtimeError for failed document requests

On throwing and pseudo-LHRs

NO_ERROR

API flexibility

patrickhulce commented Oct 19, 2018

ebidel commented Oct 19, 2018

paulirish commented Oct 19, 2018

paulirish commented Oct 22, 2018 • edited by exterkamp Loading

exterkamp commented Dec 3, 2018 • edited Loading

Error Definitions

Error Actions

PSI

Eng to make this happen

patrickhulce commented Dec 3, 2018

exterkamp commented Dec 6, 2018

exterkamp commented Feb 15, 2019

patrickhulce commented Feb 15, 2019

exterkamp commented Feb 16, 2019

connorjclark commented May 20, 2019

exterkamp commented May 28, 2019

paulirish commented Oct 22, 2018 •

edited by exterkamp

Loading

exterkamp commented Dec 3, 2018 •

edited

Loading