Display BigQuery error stream when a load fails during dbt seed. #1079
Conversation
joshtemple
referenced this pull request
Oct 22, 2018
Closed
Full BigQuery load errors are neither logged nor displayed for dbt seed #1076
@joshtemple nice! I just tried to kick off tests for this PR, but I think GitHub is still working its way through webhooks. This provisionally looks good to me :)
drewbanin
requested a review
from
beckjake
Oct 23, 2018
beckjake
reviewed
Oct 23, 2018
I like the general idea, I do have concerns about the |
@@ -278,7 +278,8 @@ def poll_until_job_completes(cls, job, timeout):
             raise dbt.exceptions.RuntimeException("BigQuery Timeout Exceeded")

         elif job.error_result:
-            raise job.exception()
+            e = job.exception()
+            raise type(e)(message=e.message, errors=job.errors)
beckjake
Oct 23, 2018
Contributor
I'm not sure about calling type to get a class object and just assuming that it works to call as a constructor. I mean, I know it's ok here, but job and job.exception() come from Google, not us.
Is this interface (in particular, the fact that the __init__ of the exception class returned by job.exception() accepts an errors keyword argument) considered stable in any way?
I think I would prefer something like:
msg = '{}\n{}'.format(e.message, '\n'.join(str(e) for e in job.errors)).strip()
raise dbt.exceptions.RuntimeException(msg)
I haven't tested it, and I'm not 100% sure on the type of job.errors, but I assume something like that would work.
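The suggestion above can be sketched in isolation. This is a minimal sketch, not the code that was merged: RuntimeException here is a hypothetical stand-in for dbt.exceptions.RuntimeException, format_job_error and raise_for_failed_job are illustrative names, and errors is assumed to be a list of dict-like entries (or None), which is how job.errors is treated here.

```python
class RuntimeException(Exception):
    """Hypothetical stand-in for dbt.exceptions.RuntimeException."""


def format_job_error(message, errors):
    """Join the top-level message with one line per detailed error.

    `errors` is assumed to be an iterable of dict-like entries, or None
    when the job reported no detailed error stream.
    """
    details = "\n".join(str(err) for err in (errors or []))
    # strip() drops the trailing newline left when there are no details.
    return "{}\n{}".format(message, details).strip()


def raise_for_failed_job(message, errors):
    """Raise a single dbt-style exception carrying the full error text."""
    raise RuntimeException(format_job_error(message, errors))
```

Guarding with `errors or []` keeps the helper from failing when no detailed errors are available, which the one-line version in the comment would not tolerate.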
joshtemple
Oct 23, 2018
•
Author
Contributor
Yeah, totally fair, I went back and forth on that myself. In the end I decided not to hardcode a dbt exception since I wasn't sure about the implications of that downstream for logging. If you're more comfortable with raising a RuntimeException as you outlined, I'll change it.
Google API errors inherit from a base class (GoogleAPICallError) that accepts errors and message as keyword arguments, so it should be safe to assume we can pass those args. Alternatively, we could hardcode a generic GoogleAPICallError exception (see here) or BadRequest (which is what is actually raised in this case), which would ensure we can pass those args, rather than using type.
What do you think?
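The trade-off being discussed can be illustrated with stand-in classes. This is a sketch under stated assumptions: BaseAPIError, BadRequestLike, CustomInitError and rebuild_with_errors are all hypothetical names written for this example, not part of any Google library. It shows that rebuilding an exception via type(e) works only as long as every subclass keeps the base constructor signature.

```python
class BaseAPIError(Exception):
    """Hypothetical base error that accepts message and errors."""

    def __init__(self, message, errors=()):
        super().__init__(message)
        self.message = message
        self.errors = list(errors)


class BadRequestLike(BaseAPIError):
    """Subclass that inherits the base constructor unchanged."""


class CustomInitError(BaseAPIError):
    """Hypothetical subclass whose constructor takes no `errors` keyword."""

    def __init__(self, message):
        super().__init__(message)


def rebuild_with_errors(exc, errors):
    """Rebuild an exception of the same type, attaching detailed errors.

    Raises TypeError if the exception's class does not accept the
    `message` and `errors` keywords.
    """
    return type(exc)(message=exc.message, errors=errors)
```

Calling rebuild_with_errors on a BadRequestLike instance succeeds, while the same call on a CustomInitError instance raises TypeError, which is the kind of breakage a changed upstream constructor could cause.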
beckjake
Oct 23, 2018
Contributor
Unless it has a negative downstream impact (triggering the exception handler in the wrong way comes to mind), I would prefer to raise a dbt-native exception. At some point we'll convert it anyway for display, so we might as well get it done early.
Made the change. Only slight difference now is that the error message displays
|
beckjake
approved these changes
Oct 23, 2018
woop woop! Nice work @joshtemple :) I'm going to let the tests run, and then will merge this in. This will go out in our 0.12.0 release!
joshtemple commented Oct 22, 2018
•
edited
Creates and raises a new exception, augmenting the errors attribute of the exception with the detailed error stream from the job object. This errors attribute is unpacked downstream by the handle_error method.
I tested this out with a toy CSV file, adding a leading comma in the header row to induce a BigQuery load API error.
Before this change, the error is displayed as follows:
After this change, the full error details are included:
Fixes #1076