BigQuery: Allow cross-project queries #1427

Closed
tswast opened this issue Apr 16, 2018 · 2 comments
Labels: bug (incorrect behavior inside of ibis), feature (features or general enhancements)

Comments

tswast (Collaborator) commented Apr 16, 2018

It would be great if there were a way to submit query jobs to one project while querying tables in another project.

I tried querying the StackOverflow public dataset like this:

import ibis

# project_id is the project I want to bill; the dataset lives in bigquery-public-data
con = ibis.bigquery.connect(
    project_id='swast-scratch',
    dataset_id='bigquery-public-data.stackoverflow')
table = con.table('posts_questions')
expr = table[table.tags.contains('ibis')][['title', 'tags']]
print(expr)

and got the following error:

$ python ibis_bigquery.py
Traceback (most recent call last):
  File "ibis_bigquery.py", line 20, in <module>
    table = con.table('posts_questions')
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 281, in table
    t = super(BigQueryClient, self).table(*args, **kwargs)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/client.py", line 119, in table
    schema = self._get_table_schema(qualified_name)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 304, in _get_table_schema
    return self.get_schema(qualified_name)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 363, in get_schema
    (table_id, dataset_id) = _ensure_split(name, database)
  File "/Users/swast/.pyenv/versions/3.6.4/envs/ibis-stable/lib/python3.6/site-packages/ibis/bigquery/client.py", line 32, in _ensure_split
    assert len(split) == 2
AssertionError
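
The traceback suggests that the dotted dataset_id produces a qualified name with three parts, which trips the `assert len(split) == 2` check. As a minimal sketch of one way name splitting could accept an optional project prefix: `TableRef` and `split_qualified_name` are hypothetical names for illustration, not ibis functions, and the real `_ensure_split` is not shown in this thread.

```python
from typing import NamedTuple, Optional


class TableRef(NamedTuple):
    project: Optional[str]  # None means "fall back to the connection's project"
    dataset: str
    table: str


def split_qualified_name(qualified_name: str) -> TableRef:
    """Hypothetical helper: accept dataset.table or project.dataset.table."""
    parts = qualified_name.split(".")
    if len(parts) == 2:
        dataset, table = parts
        return TableRef(None, dataset, table)
    if len(parts) == 3:
        project, dataset, table = parts
        return TableRef(project, dataset, table)
    # Raise a descriptive error instead of a bare AssertionError.
    raise ValueError(
        f"expected 'dataset.table' or 'project.dataset.table', got {qualified_name!r}"
    )


ref = split_qualified_name("bigquery-public-data.stackoverflow.posts_questions")
print(ref)
```

Raising ValueError with the offending name would also make the failure easier to diagnose than the bare AssertionError above.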

Note: I also tried:

con = ibis.bigquery.connect(project_id='bigquery-public-data', dataset_id='stackoverflow')
table = con.table('posts_questions')
expr = table[table.tags.contains('ibis')][['title', 'tags']]
print(expr)

but I got this error:

google.api.core.exceptions.Forbidden: 403 POST https://www.googleapis.com/bigquery/v2/projects/bigquery-public-data/queries: Access Denied: Project bigquery-public-data: The user macbook@swast-scratch.iam.gserviceaccount.com does not have bigquery.jobs.create permission in project bigquery-public-data.

I'd actually expect this second error, since I do not have permission to charge queries to the bigquery-public-data project. But I should have permission to run queries against those tables while charging them to my own project.
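
To make the two roles concrete, here is a small sketch that separates the project the job is created in (and billed to) from the project that holds the tables. The URL shape is copied from the 403 message above; the helper names and the hand-written SQL are assumptions for illustration, not what ibis generates.

```python
def jobs_query_url(billing_project: str) -> str:
    """The project in the URL path is where the query job is created and billed."""
    return (
        "https://www.googleapis.com/bigquery/v2/projects/"
        f"{billing_project}/queries"
    )


def standard_sql_table(data_project: str, dataset: str, table: str) -> str:
    """Fully qualified, backtick-quoted table reference for standard SQL."""
    return f"`{data_project}.{dataset}.{table}`"


billing_project = "swast-scratch"
table = standard_sql_table("bigquery-public-data", "stackoverflow", "posts_questions")
# Hand-written query, used only to show where each project name appears.
sql = f"SELECT title, tags FROM {table} WHERE tags LIKE '%ibis%'"

print(jobs_query_url(billing_project))
print(sql)
```

The billing project appears only in the request URL, and the data project appears only inside the SQL text, so the two do not need to match.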

cpcloud (Member) commented Apr 16, 2018

This is up next after the 1.0.0 merge

cpcloud added this to the 0.14 milestone Apr 16, 2018
cpcloud added the bug and bigquery labels Apr 16, 2018
cpcloud self-assigned this Apr 16, 2018

tswast (Collaborator, Author) commented Apr 16, 2018

I wonder if project_id in ibis.bigquery.connect becomes ambiguous if we do this. One way to disambiguate might be to have folks pass in a google.cloud.bigquery.Client object when they want the project of the tables they query to differ from the project they charge queries to.
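
A different way to resolve the ambiguity, closer to the first attempt in this issue, would be to treat project_id as the billing project and let a dotted dataset_id name the data project. The sketch below only illustrates that rule; `resolve_projects` and `ResolvedProjects` are hypothetical names, not part of ibis or google-cloud-bigquery.

```python
from typing import NamedTuple


class ResolvedProjects(NamedTuple):
    billing_project: str  # project that query jobs are charged to
    data_project: str     # project that owns the dataset being queried
    dataset: str


def resolve_projects(project_id: str, dataset_id: str) -> ResolvedProjects:
    """Hypothetical rule: 'project.dataset' overrides the data project only."""
    data_project, sep, dataset = dataset_id.rpartition(".")
    if not sep:
        # No project prefix: tables live in the billing project.
        data_project = project_id
    return ResolvedProjects(project_id, data_project, dataset)


print(resolve_projects("swast-scratch", "bigquery-public-data.stackoverflow"))
print(resolve_projects("swast-scratch", "my_dataset"))
```

With this rule, project_id keeps a single meaning (who pays), and the common single-project case is unchanged.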
