BigQuery: Allow cross-project queries #1427

Closed
tswast opened this Issue Apr 16, 2018 · 2 comments

@tswast
Contributor

tswast commented Apr 16, 2018

It would be great if there were a way to submit query jobs to one project while querying tables that live in another project.

I tried querying the StackOverflow public dataset like this:

import ibis

con = ibis.bigquery.connect(
    project_id='swast-scratch',
    dataset_id='bigquery-public-data.stackoverflow')
table = con.table('posts_questions')
expr = table[table.tags.contains('ibis')][['title', 'tags']]
print(expr)

and got the following error:

$ python ibis_bigquery.py
Traceback (most recent call last):
  File "ibis_bigquery.py", line 20, in <module>
    table = con.table('posts_questions')
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 281, in table
    t = super(BigQueryClient, self).table(*args, **kwargs)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/client.py", line 119, in table
    schema = self._get_table_schema(qualified_name)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 304, in _get_table_schema
    return self.get_schema(qualified_name)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 363, in get_schema
    (table_id, dataset_id) = _ensure_split(name, database)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 32, in _ensure_split
    assert len(split) == 2
AssertionError

Note: I also tried:

con = ibis.bigquery.connect(project_id='bigquery-public-data', dataset_id='stackoverflow')
table = con.table('posts_questions')
expr = table[table.tags.contains('ibis')][['title', 'tags']]
print(expr)

but I got this error:

google.api.core.exceptions.Forbidden: 403 POST https://www.googleapis.com/bigquery/v2/projects/bigquery-public-data/queries: Access Denied: Project bigquery-public-data: The user macbook@swast-scratch.iam.gserviceaccount.com does not have bigquery.jobs.create permission in project bigquery-public-data.

I'd actually expect this second error, since I do not have permission to charge queries to the bigquery-public-data project. I should, however, have permission to run queries against those tables while charging them to my own project.
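To make the desired behavior concrete, here is a minimal sketch of how a project-qualified dataset_id could be split into a billing project and a data project. The names `resolve_dataset`, `ResolvedDataset`, and `qualified_table` are hypothetical and are not part of ibis; the sketch only assumes that dataset IDs never contain a dot, so everything before the last dot can be treated as the project that owns the tables.

```python
from typing import NamedTuple


class ResolvedDataset(NamedTuple):
    billing_project: str  # project the query job is charged to
    data_project: str     # project that owns the tables
    dataset: str


def resolve_dataset(project_id: str, dataset_id: str) -> ResolvedDataset:
    """Hypothetical helper (not an ibis API): split 'project.dataset'.

    If dataset_id has no project prefix, the tables are assumed to live
    in the billing project.
    """
    # rpartition splits on the LAST dot, so the project ID stays whole.
    data_project, _, dataset = dataset_id.rpartition(".")
    return ResolvedDataset(
        billing_project=project_id,
        data_project=data_project or project_id,
        dataset=dataset,
    )


def qualified_table(resolved: ResolvedDataset, table: str) -> str:
    # Backtick-quoted project.dataset.table path, as used in standard SQL.
    return f"`{resolved.data_project}.{resolved.dataset}.{table}`"


if __name__ == "__main__":
    resolved = resolve_dataset(
        "swast-scratch", "bigquery-public-data.stackoverflow"
    )
    print(resolved)
    print(qualified_table(resolved, "posts_questions"))
```

With this split, the job would be created in swast-scratch while the generated SQL refers to the table in bigquery-public-data, which is the combination neither of the two attempts above can express.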

@cpcloud

Member

cpcloud commented Apr 16, 2018

This is up next after the 1.0.0 merge

@cpcloud cpcloud added this to the 0.14 milestone Apr 16, 2018

@cpcloud cpcloud added this to To do in BigQuery via automation Apr 16, 2018

@cpcloud cpcloud self-assigned this Apr 16, 2018

@tswast

Contributor

tswast commented Apr 16, 2018

I wonder whether project_id in ibis.bigquery.connect becomes ambiguous if we do this. One way to disambiguate might be to have folks pass in a google.cloud.bigquery.Client object when they want the project that owns the tables they query to differ from the project they charge queries to.
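A rough sketch of that disambiguation rule, assuming a hypothetical `pick_projects` helper (not an ibis function) and relying only on the `project` attribute that google.cloud.bigquery.Client exposes. A stub object stands in for the real client so the example runs without credentials.

```python
from types import SimpleNamespace
from typing import Any, Optional, Tuple


def pick_projects(project_id: str, client: Optional[Any] = None) -> Tuple[str, str]:
    """Hypothetical rule: return (billing_project, data_project).

    When a client is passed, its project pays for the queries and
    project_id only names where the tables live. Without a client,
    project_id plays both roles, as it does today.
    """
    billing_project = client.project if client is not None else project_id
    return billing_project, project_id


if __name__ == "__main__":
    # Stand-in for google.cloud.bigquery.Client(project="swast-scratch").
    stub_client = SimpleNamespace(project="swast-scratch")
    print(pick_projects("bigquery-public-data", client=stub_client))
    print(pick_projects("swast-scratch"))
```

The appeal of this approach is that existing calls with only project_id keep their current meaning, and the second project only appears when the caller opts in by supplying a client.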

@cpcloud cpcloud added the enhancement label Apr 18, 2018

@cpcloud cpcloud closed this in efe3587 Apr 20, 2018

BigQuery automation moved this from To do to Done Apr 20, 2018
