Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigQuery jobs.insert needs both upload (for configuration.load) and doit (for configuration.query, .extract and .copy) #191

Closed
emk opened this issue Jun 5, 2018 · 10 comments

Comments

@emk
Copy link

emk commented Jun 5, 2018

Hello! I've been experimenting with BigQuery and Rust and I've been I've been really impressed with the google-bigquery2 crate. I managed to get OAuth2 and some simple queries working, and everything worked beautifully.

However, I ran into an interesting complication. The hub.jobs().query(...) is meant for small queries that output a few hundred or a few thousand rows of JSON. Larger queries need to be run using jobs().insert(...) with a configuration.query structure, and the results need to be exported to Google Cloud Storage using the same API with a configuration.extract structure.

But this means that JobInsertCall actually needs both an upload method, and a doit method! Specifically, if you look at the documentation here, the breakdown appears to be:

  • configuration.load: Needs upload.
  • configuration.query: Needs a regular doit.
  • configuration.extract: Needs a regular doit.
  • configuration.copy: Needs a regular doit.

But JobInsertCall has a private doit method that only provides an implementation for upload using POST https://www.googleapis.com/upload/bigquery/v2/projects/projectId/jobs, and which has no access to regular doit using POST https://www.googleapis.com/bigquery/v2/projects/projectId/jobs.

The key code appears to be in mbuild.mako, which is set up to generate either a public doit function, or both a public upload function and a private doit function that can only be used to help implement upload.

The error that BigQuery gives me is:

Error: Bad Request (400): Uploads are only allowed with load jobs
    global: Uploads are only allowed with load jobs, invalid

I would be happy to submit a PR, but this involves a refactoring of mbuild.mako. Is this something that you would prefer to do yourself? Or would you be willing to explain how you'd like the code to be refactored?

I'm going to go ahead and modify my local copy to get rid of upload_action on this endpoint, regenerate everything, and verify that it works as expected. If it would be useful, I could also supply you with a test program.

Thank you for a very helpful and well-built library, and for any advice you can provide!

@emk
Copy link
Author

emk commented Jun 5, 2018

Confirmed! If I remove the upload support from jobs.insert, then configuration.query works nicely! So the general strategy of offering both an upload method and a normal public doit method on JobInsertCall appears to be perfectly sound.

emk added a commit to faradayio/google-apis-rs that referenced this issue Jun 5, 2018
See Byron#191 and
https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/insert

This makes it possible to use:

- `configuration.query`
- `configuration.extract`
- `configuration.copy`

...but it breaks `configuration.load`.
@emk
Copy link
Author

emk commented Jun 5, 2018

Here's a version of google-bigquery2 which provides doit instead of upload: faradayio@7c314e5

As mentioned above, I've verified that configuration.query works as expected. So the proposed fix of generating both an upload and regular doit method version should work nicely.

@Byron
Copy link
Owner

Byron commented Jun 5, 2018

Thanks so much for taking the time to write things down so that it's perfectly possible to follow you, and of course for the praise :).

I also think that the issue you encountered must be taken into consideration for the next iteration of this API generator here: #189 .

Your fix could be merged, but is likely to be overwritten next time someone runs make update-json. As of know, there is no mechanism for patching JSONS automatically, and possibly the generator should create 'working' versions from the originals anyway.
If you ever get an intuition how the current implementation could be fixed, please let me know or submit a PR - I am happy to merge and re-release.

@emk
Copy link
Author

emk commented Jun 5, 2018

Oh, yeah, my "fix" isn't mergeable as is, because it breaks configuration.load while fixing configuration.query, configuration.extract and configuration.copy. That's just our internal workaround so I can test some other stuff.

The right fix, I think, it to change mbuild.mako to always generate a pub fn doit(&self), and to additionally generate a completely separate upload function when the JSON file indicates that is possible. This could do done with some copying and pasting in mkbuild.mako to (1) always make doit public with no stream or MIME argument, and to (2) fix upload to call a new upload_helper function instead of doit.

Does this make sense?

@Byron
Copy link
Owner

Byron commented Jun 6, 2018 via email

@rossmeissl
Copy link

@Byron just curious if you plan to move forward on this?

@rossmeissl
Copy link

@Byron checking in on this

@rossmeissl
Copy link

@Byron any word?

@Byron
Copy link
Owner

Byron commented Jul 11, 2018

Apologies for the late reply.
I have currently no plan of working on this project, but welcome any contributions.

@Byron
Copy link
Owner

Byron commented Apr 15, 2021

I am closing this issue as since then a lot has changed. All crates are now async, available as v2.0 on crates.io and build against the latest API spec.
Please let me know in the comments if the issue is still relevant though.

@Byron Byron closed this as completed Apr 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants