"[notFound]" error but query was successful in logs #274

Closed
zacdav opened this issue Nov 21, 2018 · 11 comments

Comments

@zacdav
commented Nov 21, 2018

Following the README's instructions for the sample data worked without problems. However, when I switched to my own dataset (using the same project for billing that I used with the sample data), I was never able to return results and always got this error:

Error: Not found: Job PROJECT_NAME:job_5v9bcjpgsMi0wHuMO-JaEWoK58XK [notFound]

Connection Example

library(DBI)
library(bigrquery)
bigrquery::set_service_token("../../xxx.json")
con <- DBI::dbConnect(bigrquery::bigquery(),
                      dataset = "dataset_name_here",
                      project = "project_name_here",
                      billing = "project_name_here")

Listing tables and their fields works without problems:

dbListTables(con) # success
dbListFields(con, "table_name_here") # success

dplyr test (this fails with the error mentioned above):

library(dplyr) # provides tbl(), collect() and %>%

test_query <- tbl(con, "table_name_here") %>%
  head(10) %>%
  collect()

Interestingly though, when I check the logs in the BigQuery web UI I can see the successful queries that R isn't returning.

[Screenshot: query log]

I've tried authenticating both via OAuth and with a JSON service token that I created, making sure that the user has full access to the BigQuery APIs.

I have also tried all methods for submitting the query (DBI, bigrquery, dplyr).

I began digging around the source code and tried using query_exec() directly, which gave a different error:

Error: Not found: Dataset PROJECT_NAME: TABLE_NAME was not found in location US [notFound]

Note: Data is stored and processed in australia-southeast1.

Is this expected behaviour, and have I just glossed over something simple?

Thanks,
Zac


sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.7.7     DBI_1.0.0       bigrquery_1.0.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.19      prettyunits_1.0.2 dbplyr_1.2.2      crayon_1.3.4      assertthat_0.2.0  R6_2.3.0         
 [7] jsonlite_1.5      magrittr_1.5      pillar_1.3.0      httr_1.3.1        progress_1.2.0    rlang_0.3.0.1    
[13] curl_3.2          rstudioapi_0.8    bindrcpp_0.2.2    tools_3.5.1       glue_1.3.0        purrr_0.2.5      
[19] hms_0.4.2         compiler_3.5.1    pkgconfig_2.0.2   openssl_1.0.2     tidyselect_0.2.5  bindr_0.1.1      
[25] tibble_1.4.2     
@extrobe

commented Dec 14, 2018

I'm having exactly the same issue. Public datasets are fine, and I can use

DBI::dbListTables(con)
DBI::dbReadTable(con, "tablename", max_results = 10)
bq_table_download("project.dataset.tablename")

all successfully against my private dataset, but I can't execute a query. I always get the same error as above and, as above, I can see that the query executed successfully, and the name of the job appears to match the result set.

Even more of a coincidence is that we also have our data stored in australia-southeast1.

Going to see if a colleague can load some data to another location to see if that changes things.

@trickbooter

commented Dec 17, 2018

After we execute a query using (for example) bq_perform_query (but this applies to more than just that), we get a bq_job back. The bq_job expects the format for a job id to be $projectid.$jobid. When inspecting the BQ console, the actual identification for a job is $projectid:$region.$jobid. If a region isn't supplied, Google defaults the region to US. This means that US datasets work fine, but non-US datasets fail with job not found.

The bq_refs.R file (which handles the creation of bq_job objects) also creates references to bq_dataset, bq_table and bq_job. I believe bq_dataset and bq_table are safe from this naming convention change.

I don't know R (I am looking at bigrquery to help @extrobe), so I don't feel confident enough to build a PR to fix this.

@trickbooter

commented Dec 17, 2018

Just to contribute a bit more...

Once a query is performed, the dataset location can be retrieved with

ds <- as_bq_dataset("projectid.datasetid")
region <- bq_dataset_meta(ds)$location

Location can then be used in the job id as "$projectid:$region.$jobid"
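
As a rough illustration of the identifier format described above, here is a minimal sketch in base R. The helpers format_job_id() and parse_job_id() are hypothetical names made up for this example and are not part of bigrquery. The parser also assumes the project id contains no ":" and the location contains no ".", which may not hold for every project.

```r
# Hypothetical helpers (not part of bigrquery): build and parse a
# location-qualified job identifier of the form "project:location.job".
format_job_id <- function(project, job, location = NULL) {
  if (is.null(location)) {
    # No location given: fall back to the short "project.job" form.
    paste0(project, ".", job)
  } else {
    paste0(project, ":", location, ".", job)
  }
}

parse_job_id <- function(x) {
  # Assumption: the project has no ":" and the location has no ".".
  m <- regmatches(x, regexec("^([^:]+):([^.]+)\\.(.+)$", x))[[1]]
  if (length(m) != 4) stop("Not a location-qualified job id: ", x)
  list(project = m[2], location = m[3], job = m[4])
}

id <- format_job_id("my-project", "job_abc123",
                    location = "australia-southeast1")
print(id)
str(parse_job_id(id))
```

The location argument could be filled from bq_dataset_meta(ds)$location as shown in the snippet above.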

@zacdav

Author

commented Dec 17, 2018

@trickbooter Nice finds, I'll have a poke around now and see what I can do

@zacdav

Author

commented Dec 17, 2018

As far as I can tell, the bigrquery:::bq_get() function does not support the location parameter currently.
I was able to force the function to return a successful request by adding it in as specified here.

I modified part of bigrquery:::bq_get() like so:

req <- httr::GET(paste0(base_url, url),
                 httr::config(token = bigrquery:::get_access_cred()), 
                 httr::user_agent(bigrquery:::bq_ua()),
                 query = "location=australia-southeast1")

Which is good to know as there then just needs to be a way to feed through the location parameter.
However, the location is stripped by as_bq_job.list and the bigrquery:::bq_path() function doesn't facilitate the location in its current form either.

So it might be a more involved fix to add support for regions unless there is a region specified upfront or determined at the time of the GET request?

@hadley I'm happy to work on the fix if you can lend some guidance.
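
To make the idea of feeding the location through concrete, here is a small base R sketch that builds the job lookup URL with an optional location query parameter. job_get_url() is a hypothetical helper, not bigrquery's actual code, and the default base_url is an assumption about the endpoint being used.

```r
# Hypothetical helper (not bigrquery's implementation): build the URL for
# fetching a job, appending `location` only when one is supplied.
job_get_url <- function(project, job, location = NULL,
                        base_url = "https://www.googleapis.com/bigquery/v2/") {
  url <- paste0(base_url, "projects/", project, "/jobs/", job)
  if (!is.null(location)) {
    # Percent-encode the value so it is safe inside a query string.
    url <- paste0(url, "?location=",
                  utils::URLencode(location, reserved = TRUE))
  }
  url
}

us_url <- job_get_url("my-project", "job_abc123")
au_url <- job_get_url("my-project", "job_abc123",
                      location = "australia-southeast1")
print(us_url)
print(au_url)
```

With a helper like this, the location would only need to be carried on the job reference and passed down at request time, rather than hard-coded in the GET call.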

@trickbooter

commented Dec 17, 2018

Thanks @zacdav for investigating. The API reference that you link to does state that the region is required for non-US / EU based datasets. Frustratingly asymmetric!

The fix complexity is compounded because it undoes some of the efficiency of bq_refs.R, with the job being the only thing that requires a region.

I'm sorry I cannot help more with the code, but with my 90 minutes of experience of R I'm sure I'm going to break things more than I fix them!

@hadley

Member

commented Jan 23, 2019

@zacdav thanks for the offer of help, but analysing the problem to determine the fix was 90% of the work, so I just went ahead and did it myself.

@hadley hadley closed this in 4d00b7b Jan 23, 2019

@hadley

Member

commented Jan 23, 2019

This now just works for a simple example:

asia <- bq_test_dataset(location = "asia-east1")

tb <- bq_table(asia, "mtcars2")
bq_table_upload(tb, mtcars)
bq_table_download(tb)

But I'd appreciate it if someone would try the development version for a real problem, and provide a reprex if there are still problems.

@zacdav

Author

commented Jan 26, 2019

@hadley Thanks a lot, works well with tests so far.

@extrobe

commented Jan 28, 2019

@hadley, I also had success running a few different examples from our global datasets. Many thanks.

@alistairewj

commented Apr 26, 2019

@hadley I ran into this issue again. It seems some functions still exhibit the bug; specifically, I get the issue with query_exec.

southamerica <- bq_test_dataset(location = "southamerica-east1")
tb <- bq_table(southamerica, "mtcars2")
bq_table_upload(tb, mtcars)
bq_table_download(tb)

The above works fine, but the below fails with Error: Not found:

test_query = paste("SELECT * FROM `",Sys.getenv("BIGQUERY_TEST_PROJECT"),".TESTING_eietbbiduf.mtcars2`",sep="")
data <- query_exec(test_query, 
                   location = "southamerica-east1",
                   use_legacy_sql = FALSE,
                   project=Sys.getenv("BIGQUERY_TEST_PROJECT"))

I got around this by using DBI::dbConnect and dbGetQuery which work fine.
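
For anyone wanting to try the same workaround, it might look roughly like the sketch below. query_via_dbi() and its arguments are illustrative names, not from bigrquery. Running it assumes the DBI and bigrquery packages are installed and that valid credentials are already set up.

```r
# Illustrative wrapper (hypothetical name): run a query through the DBI
# interface instead of query_exec(). Needs DBI, bigrquery and credentials.
query_via_dbi <- function(sql, project, dataset, billing = project) {
  con <- DBI::dbConnect(
    bigrquery::bigquery(),
    project = project,
    dataset = dataset,
    billing = billing
  )
  # Close the connection when the function exits, even on error.
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  DBI::dbGetQuery(con, sql)
}

# Example call (not run here, as it needs a real project and dataset):
# data <- query_via_dbi(test_query,
#                       project = Sys.getenv("BIGQUERY_TEST_PROJECT"),
#                       dataset = "TESTING_eietbbiduf")
```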
