Error message "404 job not found" for a successful job involving Cloud SQL federated query when no location specified #1303

Closed
mattwelke opened this issue Aug 2, 2022 · 3 comments
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. priority: p3 Desirable enhancement or fix. May not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@mattwelke

Thanks for stopping by to let us know something could be better!

PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.

Done. GCP support was able to solve my issue fairly quickly. Now, I'm creating this bug report here to help the developers improve the library so people don't need to use GCP support in the future if they run into the same issue.

Please run down the following list and make sure you've tried the usual "quick fixes":

Done. Did this just now before posting the issue. Searched for issues matching "federated" and "cloud sql".

Done. Did this before opening my GCP support case when I first encountered the issue.

If you are still having issues, please be sure to include as much information as possible:

Environment details

  • OS type and version: Ubuntu 22.04
  • Python version: 3.10.5 (python --version)
  • pip version: 22.1.2 (pip --version)
  • google-cloud-bigquery version: 3.3.0 (pip show google-cloud-bigquery)

Steps to reproduce

  1. Use a Python script to perform a query where the query involves federation with a Cloud SQL database.
  2. Note that a "404 job not found" error is produced, even though a job is started and will complete successfully (assuming there are no issues with it). This prevents your script from reading the query results idiomatically.

Code example

return f"""
        SELECT * 
        FROM EXTERNAL_QUERY("projects/REDACTED/locations/us-central1/connections/REDACTED",
    \"""SELECT customers.name as customer, count(distinct(product_id)) as products_enriched FROM curated_product_fields cpf join customers on (customers.id = cpf.customer_id)
    where
    date(cpf.updated_at) >= '{start}'
    and date(cpf.updated_at) < '{end}'
    group by customers.name;\""");
    """

Stack trace

Exception has occurred: NotFound
404 GET https://www.googleapis.com/bigquery/v2/projects/REDACTED/queries/REDACTED?maxResults=0: Not found: Job REDACTED:REDACTED
  File "REDACTED/usage-report-generator/main.py", line 208, in main
    df = job.to_dataframe()

More details:

I originally reported this issue on the GCP bug tracker. Through troubleshooting, I determined that my job was running and had results I could query, just not idiomatically. With idiomatic code, where I simply read the results as I would for any other job, the client raised "404 not found". In the VS Code debugger, I could see the temporary table name, and when I copied and pasted it into the GCP UI, I got my results that way. A job did indeed exist.

I found that I could also run the query as is in the GCP UI by copying and pasting it into the query editor and clicking run. That's what made me think the problem in this situation was in the Python client library, not in BigQuery itself.
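The debugger inspection described above can also be done in code. The following is a minimal sketch, not part of the original report: `describe_job` is a hypothetical helper name, and it assumes the job object exposes `job_id`, `location`, `state`, and `destination` attributes, as `QueryJob` objects in google-cloud-bigquery do. A stand-in object is used so that the sketch runs without credentials.

```python
from types import SimpleNamespace


def describe_job(job):
    """Collect the fields that identify a query job and its results table.

    Hypothetical helper: it only reads attributes and makes no API calls.
    """
    destination = getattr(job, "destination", None)
    table = None
    if destination is not None:
        # Fully qualified name that can be pasted into the console.
        table = f"{destination.project}.{destination.dataset_id}.{destination.table_id}"
    return {
        "job_id": getattr(job, "job_id", None),
        "location": getattr(job, "location", None),
        "state": getattr(job, "state", None),
        "destination_table": table,
    }


# Stand-in for a real job (all values are made up). With the real library,
# pass the object returned by client.query(sql) instead.
fake_job = SimpleNamespace(
    job_id="job-123",
    location="us-central1",
    state="DONE",
    destination=SimpleNamespace(
        project="my-project", dataset_id="_tmp", table_id="anon123"
    ),
)
print(describe_job(fake_job))
```

Logging these fields next to the failing call makes it easier to confirm that a job exists and to see which location, if any, is attached to it.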

GCP support eventually told me that I needed to specify the processing location when doing federated queries. So I was able to get the code in my script to work by changing:

client = bigquery.Client()

to

client = bigquery.Client()
client2 = bigquery.Client(location="us-central1")

If I used the second client to perform the federated query where the Cloud SQL instance was in us-central1, it worked. I was able to have my script run all the queries it needed to run, reading the results from each query, for queries that both involved and didn't involve federation.
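A minimal sketch of this workaround follows, assuming the connection ID uses the long `projects/.../locations/.../connections/...` form shown in the code example above. The helper names `connection_location` and `run_federated_query` are hypothetical and not part of the library; only `bigquery.Client(location=...)`, `client.query`, and `job.to_dataframe` are library calls.

```python
import re

# Captures the location segment of a connection ID such as
# projects/<project>/locations/<location>/connections/<connection>.
_CONNECTION_RE = re.compile(r"projects/[^/\s]+/locations/([^/\s]+)/connections/")


def connection_location(sql):
    """Return the location named in the first connection ID in sql, or None."""
    match = _CONNECTION_RE.search(sql)
    return match.group(1) if match else None


def run_federated_query(sql, default_location=None):
    """Run sql with an explicit processing location and return a DataFrame."""
    # Imported lazily so connection_location works without the library installed.
    from google.cloud import bigquery

    # Assumption: the processing location should match the connection's location.
    location = connection_location(sql) or default_location
    client = bigquery.Client(location=location)
    job = client.query(sql)
    return job.to_dataframe()
```

Deriving the location from the query avoids hard-coding it in two places, but the sketch does not handle other connection ID formats; in that case, pass `default_location` explicitly.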

I think this constraint is reasonable, but I would have liked a better error message telling me that I should set the processing location. I did read the docs at https://googleapis.dev/python/bigquery/latest/reference.html#query but didn't see any mention of location being important. For example, maybe the client library could look for the string "FROM EXTERNAL_QUERY" and warn when it can't find a job but finds this string in the query?
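The suggested warning could look roughly like the sketch below. This is an illustration of the idea only, not how the library behaves: `not_found_hint` and `to_dataframe_with_hint` are hypothetical names, and the string checks are a heuristic based on the error text in the stack trace above.

```python
def not_found_hint(sql, error_message):
    """Return error_message, extended with a location hint when it looks relevant."""
    is_missing_job = "Not found: Job" in error_message
    is_federated = "EXTERNAL_QUERY" in sql.upper()
    if is_missing_job and is_federated:
        return error_message + (
            " (hint: this query uses EXTERNAL_QUERY; try setting the processing"
            " location, for example bigquery.Client(location=...))"
        )
    return error_message


def to_dataframe_with_hint(job, sql):
    """Read job results, re-raising a 404 with the hint attached."""
    # Imported lazily so not_found_hint works without the library installed.
    from google.api_core.exceptions import NotFound

    try:
        return job.to_dataframe()
    except NotFound as exc:
        raise RuntimeError(not_found_hint(sql, str(exc))) from exc
```

Wrapping the read call keeps the original exception available through exception chaining while putting the likely cause in front of the user.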

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Aug 2, 2022
@meredithslota
Contributor

Thank you for the detailed bug report!

@meredithslota meredithslota added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Aug 12, 2022
@chalmerlowe
Contributor

Due to conflicting priorities and the presence of a workaround, moving this to priority P3.

@chalmerlowe chalmerlowe added priority: p3 Desirable enhancement or fix. May not be included in next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Apr 11, 2023
@meredithslota
Contributor

This is related to #1584. The problem isn't specific to Cloud SQL or federated queries; it concerns the 404 error and the missing location more generally. I'm going to close this one in favor of the linked issue, which requests adding a hint about the location to the error, though we haven't yet settled on the solution that will be implemented.
