Ch.9 - can't export data to GCS - wrong project number? #164

Open
jgammerman opened this issue Feb 6, 2023 · 4 comments

Comments

@jgammerman

Hi @lakshmanok - loving the book! I'm on chapter 9 now and I've encountered an error I can't debug.

In the section "Preparing BigQuery data for Tensorflow" (p. 315), there's a part where you extract some BigQuery tables to Google Cloud Storage. The relevant bit of code in your notebook is as follows:

PROJECT=$(gcloud config get-value project)
for dataset in "train" "eval" "all"; do
  TABLE=dsongcp.flights_${dataset}_data
  CSV=gs://${BUCKET}/ch9/data/${dataset}.csv
  echo "Exporting ${TABLE} to ${CSV} and deleting table"
  bq --project_id=${PROJECT} extract --destination_format=CSV $TABLE $CSV
  bq --project_id=${PROJECT} rm -f $TABLE
done

Which gives me the following error:

Exporting dsongcp.flights_train_data to gs://peppy-booth-371612-dsongcp/ch9/data/train.csv and deleting table
BigQuery error in extract operation: BigQuery API has not been used in project 457198359346 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/bigquery.googleapis.com/overview?project=457198359346 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.

I know that the BigQuery API has already been enabled for my project, so I think the problem is that it's picking up the wrong project number: in the output above, the end of the URL refers to project=457198359346, but that's not my project number! It's 506913857436, as shown here:

image

And indeed the correct project number is returned if I ask for it explicitly in my notebook:

image

Can you think of any reason why it would be picking up the wrong project number when trying to export from BQ to GCS?
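
A quick way to rule out a misread is to pull the project number out of the error text programmatically and compare it with the number of the configured project. The sketch below assumes the error message is available on stdin; `extract_project_number` is a hypothetical helper name, not part of `bq` or `gcloud`. It joins wrapped lines before matching, because a line break in terminal output can split the digits after `project=`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bq or gcloud): read an error message on
# stdin and print the project number found after "project=" in the URL.
extract_project_number() {
  # Delete newlines first so a number split by line wrapping is rejoined,
  # then keep only the first "project=<digits>" match.
  tr -d '\n' | grep -o 'project=[0-9][0-9]*' | head -n 1 | cut -d= -f2
}

# Example with a wrapped fragment like the one in the error above.
wrapped_error='overview?project=45719835934
6 then retry.'
printf '%s\n' "$wrapped_error" | extract_project_number   # prints 457198359346
```

The result can then be compared with the output of `gcloud projects describe "$PROJECT" --format='value(projectNumber)'`, which prints the number for the project ID the notebook is using. Note that deleting newlines is only safe here because the text after the number starts with a non-digit.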

@lakshmanok
Contributor

lakshmanok commented Feb 6, 2023 via email

@jgammerman
Author

Thanks for the prompt response Lak. Unfortunately I don't think it's that simple, unless I've just misunderstood you...

See the screenshot below. My project ID appears to be correct, as does the project number (first 2 outputs), but when I run the code the error suggests that it is looking at a different project number, even though the project ID is definitely correct:

image

Simply setting PROJECT=506913857436 made no difference I'm afraid.

@lakshmanok
Contributor

lakshmanok commented Feb 6, 2023 via email

@jgammerman
Author

@lakshmanok see my in-line responses:

Could you check whether the BUCKET is in the same region as the BigQuery
dataset?

So originally my BQ datasets were located in the US and my bucket was in the EU (eu-west1). I've tried creating a new bucket in us-central1 and re-running the extraction, but unfortunately that's producing exactly the same error (with the same incorrect project number).

Also, please file a bug in BigQuery

Done - see here
