bigquery: Queries with multiple statements can fail silently #3304
Comments
Thanks for reporting this, and thanks for providing the details to make this easy to repro. This is nuanced, but I think it's an error in the backend response. I've filed an issue against the backend team to look at this (internally issue 174588888). The first call to the iterator Next() should respond with an error, and it is not doing so correctly in this case. A bit more detail: the multiple statements in the input SQL mean the backend treats this query as a script. This yields a parent job (the script), which may have multiple child jobs for the individual statements. The backend documents specific behavior:
The issue is that the call to jobs.getQueryResults (part of row iteration) returns an empty response but doesn't return the error signal, so the iterator treats this as an empty result set, rather than the proper "your SELECT has errors".
That makes sense - thanks for the quick response. Quick follow-up question - as you describe it, it seems that the I ask because my program doesn't actually retrieve results - it's just a simple job runner which either returns success or failure (it mostly runs create table statements). By the sounds of it, I actually should be explicitly checking the job status for errors regardless of this behaviour? It's mainly academic at this point, just curious about it. 🙂
The Job abstraction serves multiple purposes, not just query execution. For things like batch data ingestion, a job can return warnings and still be considered a success (e.g. loading a CSV while allowing some number of malformed rows). An error in a query is considered more severe, so row iteration should not proceed. Additionally, multi-statement execution adds more nuance to the scenario. If you're not interested in consuming results, I'd suggest you may not even care to bother with the Verbosely, something like this:
For your error case, this would have surfaced the problem with the first SELECT, as it does attach the error to the parent script. The next interesting bit is how deeply you want to interrogate the multi-statement execution. Scripting does support statements like Alternately, you can ask for a list of child jobs and check for errors per statement, if your SQL is less about dependent statements and more a collection of independent work. Job statistics will tell you whether there are any child jobs, and the job object surfaces a convenience method for iterating through them:
With that, you could more precisely identify which statement(s) had problems, which is harder to determine from just the parent metadata. Hope this helps.
Very helpful indeed. Many thanks!
Current status: still awaiting updates to backend response to resolve this.
Thanks for keeping me updated @shollyman ! I've since run into another quirk of the backend (seems that way at least) which might be of interest - since it's not an open-source project I don't have a good way to report it, but thought it might be useful to know about. If I have a script containing multiple
It seems that auto-inference of region works just fine if it's a single It's not really bothersome - we just set the region explicitly to
Location auto-detection is going to depend on how accurately the location can be inferred from the SQL. Scripts are a bit more complex, so if the script starts with statements that don't implicate a location (defining temp tables, setting variables, etc), it's likely just picking the US as the fallback default. Also, just another resource to be aware of: https://cloud.google.com/bigquery/docs/getting-support#issuetracker has links for reviewing public issues and/or filing a new one against the service itself.
Oh nice, didn't know I could file a report via that site, thanks!
The backend changes to resolve this are now in production. Running your repro, the error is propagating correctly and triggers failure:
I'm going to close this issue at this time. Thanks again for reporting!
Client
BigQuery
Environment
OSX
Go Environment
$ go version
go version go1.15.5 darwin/amd64
$ go env
Code
Expected behavior
I'm not 100% sure if this is a bug or I'm just not quite understanding an instrumentation detail. When using query.Run() and job.Read(), an error will be thrown if a query fails. However, if the query script contains multiple statements, the first can fail but no error is thrown.
I can manually check for an error with something like the below, but I'm not sure that this is the intended behaviour.
Note that this leads to a different format of error - which leads me to suspect that either the API unintentionally doesn't handle errors in these cases, or I misunderstand something and should instrument the code differently.
Any guidance you may have as to which is the case would be very much appreciated. :)
Additional Detail
In case it's not immediately clear, this can be awkward to work around because some legitimate queries necessarily have multiple statements (e.g. BEGIN ... END; transactions, or any query containing a variable declaration, which may fail before the final statement).