Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid possible job already exists error #751

Merged
merged 2 commits into from Jul 14, 2021
Merged

Conversation

@plamut
Copy link
Contributor

@plamut plamut commented Jul 12, 2021

Fixes #738.

If job create request fails, a query job might still have started successfully. This PR handles this edge case and returns such
query job one can be found.

Based on the similar fix in the Java client.

PR checklist:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)
If job create request fails, a query job might still have started
successfully. This commit handles this edge case and returns such
query job one can be found.
@plamut plamut requested a review from tswast Jul 12, 2021
@plamut plamut requested review from as code owners Jul 12, 2021
@google-cla google-cla bot added the cla: yes label Jul 12, 2021
google/cloud/bigquery/client.py Show resolved Hide resolved
Loading
google/cloud/bigquery/client.py Outdated Show resolved Hide resolved
Loading
@plamut
Copy link
Contributor Author

@plamut plamut commented Jul 12, 2021

@tseaver I don't know the exact mechanics on the backend, this fix is mostly based on a similar fix in the Java client.

@tswast Can you chime in?

Loading

@plamut
Copy link
Contributor Author

@plamut plamut commented Jul 14, 2021

The docs check failure does not seem to be related:

gpg: keyserver receive failed: No name

Update: Indeed, but the fix is on its way.

Loading

@tseaver
Copy link
Contributor

@tseaver tseaver commented Jul 14, 2021

googleapis/synthtool#1155 landed here in #762. I'm not sure why the config isn't making you merge with master to pick that fix up, however.

Loading

tswast
tswast approved these changes Jul 14, 2021
@tswast tswast merged commit 45b9308 into googleapis:master Jul 14, 2021
11 of 12 checks passed
Loading
@plamut plamut deleted the iss-738 branch Jul 14, 2021
gcf-merge-on-green bot pushed a commit that referenced this issue Jul 19, 2021
🤖 I have created a release \*beep\* \*boop\*
---
## [2.22.0](https://www.github.com/googleapis/python-bigquery/compare/v2.21.0...v2.22.0) (2021-07-19)


### Features

* add `LoadJobConfig.projection_fields` to select DATASTORE_BACKUP fields ([#736](https://www.github.com/googleapis/python-bigquery/issues/736)) ([c45a738](https://www.github.com/googleapis/python-bigquery/commit/c45a7380871af3dfbd3c45524cb606c60e1a01d1))
* add standard sql table type, update scalar type enums ([#777](https://www.github.com/googleapis/python-bigquery/issues/777)) ([b8b5433](https://www.github.com/googleapis/python-bigquery/commit/b8b5433898ec881f8da1303614780a660d94733a))
* add support for more detailed DML stats ([#758](https://www.github.com/googleapis/python-bigquery/issues/758)) ([36fe86f](https://www.github.com/googleapis/python-bigquery/commit/36fe86f41c1a8f46167284f752a6d6bbf886a04b))
* add support for user defined Table View Functions ([#724](https://www.github.com/googleapis/python-bigquery/issues/724)) ([8c7b839](https://www.github.com/googleapis/python-bigquery/commit/8c7b839a6ac1491c1c3b6b0e8755f4b70ed72ee3))


### Bug Fixes

* avoid possible job already exists error ([#751](https://www.github.com/googleapis/python-bigquery/issues/751)) ([45b9308](https://www.github.com/googleapis/python-bigquery/commit/45b93089f5398740413104285cc8acfd5ebc9c08))


### Dependencies

* allow 2.x versions of `google-api-core`, `google-cloud-core`, `google-resumable-media` ([#770](https://www.github.com/googleapis/python-bigquery/issues/770)) ([87a09fa](https://www.github.com/googleapis/python-bigquery/commit/87a09fa3f2a9ab35728a1ac925f9d5f2e6616c65))


### Documentation

* add loading data from Firestore backup sample ([#737](https://www.github.com/googleapis/python-bigquery/issues/737)) ([22fd848](https://www.github.com/googleapis/python-bigquery/commit/22fd848cae4af1148040e1faa31dd15a4d674687))
---


This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
raise create_exc

try:
query_job = self.get_job(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, there is a slight problem with this change - self.get_job has a different return type to this function. It can return LoadJob, etc as well as the QueryJob we're expecting so the actual return type doesn't match what is declared for this function.

I don't understand the situations that could result in this code being called, but presumably in reality this would always be a QueryJob? Unfortunately this is causing me problems when running pylint over some code that calls this, because it thinks the function can return LoadJob, and that has a different set of members to QueryJob.

Many thanks,
Andrew Wilkinson

Loading

Copy link
Contributor Author

@plamut plamut Jul 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, in this context self.get_job() returns a QueryJob, because job_id is the same ID that was used a few lines above when constructing a new query job (and then starting it).

This project uses pytype for static type checks and it did not complain, but apparently pylint could not deduce the same and reported a false issue.

Could you tell pylint to ignore return type in that specific line where query() is called? IMHO that justifiable, because pylint is wrong there.

Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having looked into this a bit further I agree that pylint is wrong. It's a bit of a pain to have disable this check every time we call query, but I think this is a sign that pylint is aging and not keeping up with modern Python's type syntax.

Sorry for the noise.

Cheers,
Andrew

Loading

Copy link
Contributor Author

@plamut plamut Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, it was a perfectly valid comment.

Ideally, pylint would allow ignoring particular warnings for lines matching a regex, but I'm not sure if that's currently supported? It would make disabling those false positives much cleaner compared to spamming the # pylint: disable=... comments all around the code.

Loading

Copy link

@andrewjw andrewjw Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly the error isn't raised on the call to query, but when you try and use the return value. In my case this is accessing num_dml_affected_rows, which only exists on QueryJob, and not on LoadJob. Even if it did support disabling errors using a regex, I'm not sure it would be practical to create one.

It's been bugging me why this wouldn't be picked up by the type checker. I think I've tracked it down to the fact that LoadJob, QueryJob, etc all derive from _AsyncJob, which in turn derives from google.api_core.future.polling.PollingFuture. The problem is that google.api_core.future.polling.PollingFuture is not typable, so it gets turns into an Any type, which makes all the job types equivalent and therefore doesn't generate an error. When testing with mypy you have to add # type: ignore to the PollingFuture import line explicitly, but I guess pytype is more forgiving.

I've create the attached file demonstrating the problem (annoyingly github won't let me attach the file as a .py). As currently written it'll generate an error in both mypy and pytype, but swap the comments on lines 5 and 6 and the error goes away.

Anyway, I have a reasonable workaround, so if you want to leave this that's absolutely fine. If in future the python-api-core library adds typing then I expect this to break though. Adding an assert isinstance(query_job, job.QueryJob) will resolve the issue.

Cheers,
Andrew
invalid_return_union.txt

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

5 participants