BUG: Add bigquery scope for google credentials #33

Closed
xcompass wants to merge 3 commits

@xcompass xcompass commented May 3, 2017

BigQuery requires scoped credentials when loading application default credentials.

The quick test code below fails with an "Invalid Credentials" (HTTP 401) error. When the create_scoped() statement is uncommented, the code runs correctly without any error.

# Shell: use Google application default credentials
export GOOGLE_APPLICATION_CREDENTIALS=/PATH/TO/GOOGLE_DEFAULT_CREDENTIALS.json

# Python: test script
import httplib2

from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
#credentials = credentials.create_scoped('https://www.googleapis.com/auth/bigquery')

http = httplib2.Http()
http = credentials.authorize(http)

service = build('bigquery', 'v2', http=http)

jobs = service.jobs()
job_data = {'configuration': {'query': {'query': 'SELECT 1'}}}

jobs.insert(projectId='projectid', body=job_data).execute()

@codecov-io

codecov-io commented May 3, 2017

Codecov Report

Merging #33 into master will decrease coverage by 45.5%.
The diff coverage is 0%.


@@             Coverage Diff             @@
##           master      #33       +/-   ##
===========================================
- Coverage   74.42%   28.91%   -45.51%     
===========================================
  Files           4        4               
  Lines        1552     1553        +1     
===========================================
- Hits         1155      449      -706     
- Misses        397     1104      +707
Impacted Files                  Coverage Δ
pandas_gbq/gbq.py               16.2% <0%> (-62.36%) ⬇️
pandas_gbq/tests/test_gbq.py    31.03% <0%> (-51.3%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c210de1...d8de384.

@jreback
Contributor

jreback commented May 3, 2017

@parthea

this would need a test.

@parthea
Contributor

parthea commented May 3, 2017

@xcompass
Unfortunately, I wasn't able to reproduce this issue locally by following the sample code provided. I noticed that pandas-gbq already has this exact sample code in test_gbq.py: https://github.com/pydata/pandas-gbq/blob/master/pandas_gbq/tests/test_gbq.py#L203 .

The Google Application Default Credentials docs mention the following:

The build() method takes care of injecting the proper scopes for the given service, although the method create_scoped can be used to do this explicitly.

Based on this text, I don't expect that we need to add create_scoped. My understanding is that explicitly creating scoped credentials couldn't hurt, but we should determine whether it is necessary. In the past, when the scope was missing, I would see the following error: 'invalid_scope: Empty or missing scope not allowed.'

I'm happy to help you troubleshoot further. Does this issue occur every time you run the sample code?

@parthea
Contributor

parthea commented May 3, 2017

To recreate the issue in a similar environment on my end, please let me know which versions of the following you have installed:
oauth2client
googleapiclient
pandas-gbq

@xcompass
Author

xcompass commented May 4, 2017

@parthea. Here are the versions:

appdirs (1.4.3)
google-api-python-client (1.6.2)
httplib2 (0.10.3)
numpy (1.12.1)
oauth2client (4.0.0)
packaging (16.8)
pandas (0.19.2)
pandas-gbq (0.1.6)
pip (9.0.1)
pyasn1 (0.2.3)
pyasn1-modules (0.0.8)
pyparsing (2.2.0)
python-dateutil (2.6.0)
pytz (2017.2)
rsa (3.4.2)
setuptools (35.0.2)
six (1.10.0)
uritemplate (3.0.0)
wheel (0.29.0)

Here are the logs showing how I tested it:

▶ python --version
Python 2.7.10
▶ mkvirtualenv pandas-gbq
New python executable in /opt/python/pandas-gbq/bin/python
Installing setuptools, pip, wheel...done.
virtualenvwrapper.user_scripts creating /opt/python/pandas-gbq/bin/predeactivate
virtualenvwrapper.user_scripts creating /opt/python/pandas-gbq/bin/postdeactivate
virtualenvwrapper.user_scripts creating /opt/python/pandas-gbq/bin/preactivate
virtualenvwrapper.user_scripts creating /opt/python/pandas-gbq/bin/postactivate
virtualenvwrapper.user_scripts creating /opt/python/pandas-gbq/bin/get_env_details
(pandas-gbq)
~/temp
▶ pip install pandas-gbq
Collecting pandas-gbq
  Downloading pandas_gbq-0.1.6-py2.py3-none-any.whl
Collecting httplib2 (from pandas-gbq)
Collecting google-api-python-client (from pandas-gbq)
  Using cached google_api_python_client-1.6.2-py2.py3-none-any.whl
Collecting oauth2client (from pandas-gbq)
  Using cached oauth2client-4.0.0-py2.py3-none-any.whl
Collecting pandas (from pandas-gbq)
  Downloading pandas-0.19.2-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (11.9MB)
    100% |████████████████████████████████| 11.9MB 118kB/s
Collecting uritemplate<4dev,>=3.0.0 (from google-api-python-client->pandas-gbq)
  Using cached uritemplate-3.0.0-py2.py3-none-any.whl
Requirement already satisfied: six<2dev,>=1.6.1 in /opt/python/pandas-gbq/lib/python2.7/site-packages (from google-api-python-client->pandas-gbq)
Collecting pyasn1-modules>=0.0.5 (from oauth2client->pandas-gbq)
  Using cached pyasn1_modules-0.0.8-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from oauth2client->pandas-gbq)
  Using cached pyasn1-0.2.3-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client->pandas-gbq)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Collecting python-dateutil (from pandas->pandas-gbq)
  Using cached python_dateutil-2.6.0-py2.py3-none-any.whl
Collecting numpy>=1.7.0 (from pandas->pandas-gbq)
  Downloading numpy-1.12.1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (4.4MB)
    100% |████████████████████████████████| 4.4MB 319kB/s
Collecting pytz>=2011k (from pandas->pandas-gbq)
  Downloading pytz-2017.2-py2.py3-none-any.whl (484kB)
    100% |████████████████████████████████| 491kB 2.8MB/s
Installing collected packages: httplib2, uritemplate, pyasn1, pyasn1-modules, rsa, oauth2client, google-api-python-client, python-dateutil, numpy, pytz, pandas, pandas-gbq
Successfully installed google-api-python-client-1.6.2 httplib2-0.10.3 numpy-1.12.1 oauth2client-4.0.0 pandas-0.19.2 pandas-gbq-0.1.6 pyasn1-0.2.3 pyasn1-modules-0.0.8 python-dateutil-2.6.0 pytz-2017.2 rsa-3.4.2 uritemplate-3.0.0
(pandas-gbq)
~/temp
▶ export GOOGLE_APPLICATION_CREDENTIALS=/Users/compass/temp/google.json
(pandas-gbq)

▶ more gbq_test.py
import httplib2

from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
#credentials = credentials.create_scoped('https://www.googleapis.com/auth/bigquery')

http = httplib2.Http()
http = credentials.authorize(http)

service = build('bigquery', 'v2', http=http)

jobs = service.jobs()
job_data = {'configuration': {'query': {'query': 'SELECT 1'}}}

jobs.insert(projectId='projectid', body=job_data).execute()
(pandas-gbq)
~/temp
▶ python gbq_test.py
Traceback (most recent call last):
  File "gbq_test.py", line 17, in <module>
    jobs.insert(projectId='projectid', body=job_data).execute()
  File "/opt/python/pandas-gbq/lib/python2.7/site-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/opt/python/pandas-gbq/lib/python2.7/site-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 401 when requesting https://www.googleapis.com/bigquery/v2/projects/projectid/jobs?alt=json returned "Invalid Credentials">
(pandas-gbq)

▶ sed -i .bak 's/^#credentials/credentials/' gbq_test.py
(pandas-gbq)

▶ python gbq_test.py
(pandas-gbq)
~/temp
▶

@xcompass
Author

xcompass commented May 4, 2017

It seems the tests you have for this are skipped: https://travis-ci.org/pydata/pandas-gbq/jobs/228579445#L601 and https://travis-ci.org/pydata/pandas-gbq/jobs/228579445#L602. Did you set up the Google credentials in the environment variables in Travis? If so, the tests could be skipped because of this bug.

@xcompass
Author

xcompass commented May 4, 2017

The build() method takes care of injecting the proper scopes for the given service, although the method create_scoped can be used to do this explicitly.

I don't think that is true according to the source code: https://github.com/google/google-api-python-client/blob/master/googleapiclient/discovery.py#L358. When you pass in an http object, it bypasses the scope handling.

@xcompass
Author

xcompass commented May 4, 2017

Just got a reply from the google-api-python-client repo on the issue I created related to this: googleapis/google-api-python-client#394 (comment).

It seems you have to handle the scope yourself if you are using build().
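
For reference, the "handle the scope yourself" step could look roughly like the sketch below. It assumes oauth2client-style credentials, which expose create_scoped_required() and create_scoped(); the helper names ensure_scoped and build_bigquery_service are hypothetical and not part of pandas-gbq or this PR.

```python
BIGQUERY_SCOPE = 'https://www.googleapis.com/auth/bigquery'


def ensure_scoped(credentials, scopes=(BIGQUERY_SCOPE,)):
    """Return credentials that carry the given scopes.

    Assumes `credentials` behaves like oauth2client credentials:
    create_scoped_required() reports whether scopes are still missing,
    and create_scoped() returns a scoped copy.
    """
    if credentials.create_scoped_required():
        return credentials.create_scoped(list(scopes))
    return credentials


def build_bigquery_service():
    """Hypothetical helper: build a BigQuery client with scoped credentials."""
    # Imports are local so that ensure_scoped() can be used without the
    # Google client libraries installed.
    import httplib2
    from googleapiclient.discovery import build
    from oauth2client.client import GoogleCredentials

    credentials = ensure_scoped(GoogleCredentials.get_application_default())
    # Passing our own http object means we are responsible for the scopes,
    # which is why they are attached before authorize() is called.
    http = credentials.authorize(httplib2.Http())
    return build('bigquery', 'v2', http=http)
```

Checking create_scoped_required() first leaves credentials that already carry scopes untouched, so the helper should be safe to call for either kind of credentials.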

@jreback
Contributor

jreback commented May 12, 2017

ok, this seems reasonable. can you add an entry to the changelog?

@jreback jreback modified the milestone: 0.1.7 May 12, 2017
@jreback jreback added the type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. label May 12, 2017
@jreback jreback modified the milestones: 0.1.7, 0.2.0 May 18, 2017
Bigquery requires scoped credentials when loading application default credentials
@xcompass
Author

xcompass commented Jun 7, 2017

@jreback sorry for the delay. Was off the grid for a vacation. I've updated the changelog. Let me know if there is anything else.

@jreback
Contributor

jreback commented Jun 9, 2017

@parthea ok with this?

@parthea parthea changed the title from "Add bigquery scope for google credentials" to "BUG: Add bigquery scope for google credentials" Jun 13, 2017
@xcompass
Author

conflict is resolved.

@tswast
Collaborator

tswast commented Jun 22, 2017

I believe I handled this when I converted to the google-auth library.
https://github.com/pydata/pandas-gbq/blob/852b2a3434e00ff82f13146e76ca90cee76132b1/pandas_gbq/gbq.py#L239
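
As a rough illustration only (not necessarily what the linked commit does), requesting scoped application default credentials with the google-auth library might look like the sketch below. The function names are hypothetical; google.auth.default(scopes=...) and google_auth_httplib2.AuthorizedHttp are the library calls assumed here.

```python
BIGQUERY_SCOPE = 'https://www.googleapis.com/auth/bigquery'


def get_scoped_default_credentials(scopes=(BIGQUERY_SCOPE,)):
    """Hypothetical helper: load application default credentials with scopes."""
    # Local import so this module can be loaded without google-auth installed.
    import google.auth

    # google.auth.default() returns a (credentials, project_id) tuple.
    credentials, project_id = google.auth.default(scopes=list(scopes))
    return credentials, project_id


def build_bigquery_service(credentials):
    """Hypothetical helper: build a BigQuery client from google-auth credentials."""
    import google_auth_httplib2
    import httplib2
    from googleapiclient.discovery import build

    # Wrap httplib2 so every request is signed with the scoped credentials.
    http = google_auth_httplib2.AuthorizedHttp(credentials, http=httplib2.Http())
    return build('bigquery', 'v2', http=http)
```

Here the scope is requested when the credentials are loaded, so no separate create_scoped() call is needed afterwards.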

@xcompass
Author

@tswast Thanks, but I don't think so. I just tested with the script above and still get the error 401 when requesting https://www.googleapis.com/bigquery/v2/projects/ubcxdata/jobs?alt=json returned "Invalid Credentials" when running without the create_scoped statement. Here is my shortened pip list:

pandas (0.20.2)
pandas-gbq (0.1.6+8.g852b2a3, /Users/compass/projects/pandas-gbq)
oauth2client (4.1.1)
google-api-python-client (1.6.2)
google-auth (1.0.0)
google-auth-httplib2 (0.0.2)
google-auth-oauthlib (0.1.0)

@jreback
Contributor

jreback commented Jul 1, 2017

can you rebase

@xcompass
Author

xcompass commented Jul 5, 2017

@tswast My bad, I was using the wrong script. I can confirm that your commit fixes the issue. This PR is no longer needed, so I'm closing it.

@xcompass xcompass closed this Jul 5, 2017