DOC: Add how-to guide for authentication (#183)
* Clarifies order of authentication methods.
* Updates links to external resources.
tswast committed May 25, 2018
1 parent 0816668 commit 7439771
Showing 5 changed files with 81 additions and 75 deletions.
8 changes: 6 additions & 2 deletions docs/source/conf.py
@@ -359,8 +359,12 @@
# texinfo_no_detailmenu = False


# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {'https://docs.python.org/': None}
# Configuration for intersphinx:
intersphinx_mapping = {
'https://docs.python.org/': None,
'https://pandas.pydata.org/pandas-docs/stable/': None,
'https://google-auth.readthedocs.io/en/latest/': None,
}

extlinks = {'issue': ('https://github.com/pydata/pandas-gbq/issues/%s',
'GH#'),
62 changes: 62 additions & 0 deletions docs/source/howto/authentication.rst
@@ -0,0 +1,62 @@
Authentication
==============

pandas-gbq `authenticates with the Google BigQuery service
<https://cloud.google.com/bigquery/docs/authentication/>`_ via OAuth 2.0.

.. _authentication:


Authentication with a Service Account
--------------------------------------

Using service account credentials is particularly useful when working on
remote servers without access to user input.

Create a service account key via the `service account key creation page
<https://console.cloud.google.com/apis/credentials/serviceaccountkey>`_ in
the Google Cloud Platform Console. Select the JSON key type and download the
key file.

To use service account credentials, set the ``private_key`` parameter to one
of:

* A file path to the JSON file.
* A string containing the JSON file contents.

See the `Getting started with authentication on Google Cloud Platform
<https://cloud.google.com/docs/authentication/getting-started>`_ guide for
more information on service accounts.
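
The two accepted forms of ``private_key`` can be sketched as follows. This is a minimal illustration, not part of pandas-gbq: ``load_key_contents`` and ``read_with_service_account`` are hypothetical helper names, and the sketch assumes the pandas-gbq release documented here, where ``read_gbq`` accepts a ``private_key`` argument (other releases may expose a different parameter).

```python
import json


def load_key_contents(key_path):
    """Read a service account JSON key file and return its contents as a string."""
    with open(key_path) as key_file:
        contents = key_file.read()
    json.loads(contents)  # Fail early if the file is not valid JSON.
    return contents


def read_with_service_account(query, project_id, private_key):
    """Run ``query`` with service account credentials (hypothetical helper).

    ``private_key`` may be either the path to the JSON key file or a string
    holding the JSON file contents.
    """
    # Imported here so load_key_contents works without pandas-gbq installed.
    import pandas_gbq

    return pandas_gbq.read_gbq(
        query, project_id=project_id, private_key=private_key
    )
```

With these helpers, ``read_with_service_account(query, project_id, "/path/to/key.json")`` passes the file path, while ``read_with_service_account(query, project_id, load_key_contents("/path/to/key.json"))`` passes the file contents. The path shown is a placeholder.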

Default Authentication Methods
------------------------------

If the ``private_key`` parameter is ``None``, pandas-gbq tries the following
authentication methods in order:

1. Application Default Credentials via the :func:`google.auth.default`
function.

.. note::

If pandas-gbq can obtain default credentials but those credentials
cannot be used to query BigQuery, pandas-gbq will also try obtaining
user account credentials.

A common problem with default credentials when running on Google
Compute Engine is that the VM does not have sufficient scopes to query
BigQuery.

2. User account credentials.

pandas-gbq loads cached credentials from a hidden user folder on the
operating system. Override the location of the cached user credentials
by setting the ``PANDAS_GBQ_CREDENTIALS_FILE`` environment variable.

If pandas-gbq does not find cached credentials, it opens a browser window
asking you to authenticate to your BigQuery account using the product
name ``pandas GBQ``.

Additional information on the user credentials authentication mechanism
can be found `here
<https://developers.google.com/identity/protocols/OAuth2#clientside/>`__.
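
The fallback order described above can be sketched roughly as follows. This is not pandas-gbq's implementation: ``cached_credentials_path``, ``first_usable_credentials``, the provider callables, and ``can_query`` are all hypothetical stand-ins. The sketch only illustrates trying each method in turn, moving on when credentials are missing or cannot query BigQuery, and honoring the ``PANDAS_GBQ_CREDENTIALS_FILE`` override.

```python
import os


def cached_credentials_path(default_path):
    """Return where cached user credentials are read from.

    ``PANDAS_GBQ_CREDENTIALS_FILE`` overrides ``default_path`` when set.
    ``default_path`` is a placeholder, not the real pandas-gbq location.
    """
    return os.environ.get("PANDAS_GBQ_CREDENTIALS_FILE", default_path)


def first_usable_credentials(providers, can_query):
    """Return ``(name, credentials)`` from the first provider that works.

    ``providers`` is an ordered list of ``(name, callable)`` pairs and
    ``can_query`` reports whether credentials can actually query BigQuery.
    Both are hypothetical stand-ins, not pandas-gbq functions.
    """
    for name, get_credentials in providers:
        try:
            credentials = get_credentials()
        except Exception:
            # Any failure to obtain credentials means "try the next method".
            continue
        if credentials is not None and can_query(credentials):
            return name, credentials
    raise RuntimeError("no usable credentials found")


# Stand-in providers: default credentials exist but lack BigQuery scopes,
# so the chain falls through to user account credentials.
providers = [
    ("application default", lambda: "adc-without-bigquery-scope"),
    ("user account", lambda: "user-credentials"),
]
name, credentials = first_usable_credentials(
    providers, can_query=lambda creds: creds == "user-credentials"
)
print(name)  # prints "user account"
```

The second provider is reached only because the first one's credentials fail the ``can_query`` check, which mirrors the Compute Engine scope problem mentioned in the note.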
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -26,6 +26,7 @@ Contents:

install.rst
intro.rst
howto/authentication.rst
reading.rst
writing.rst
tables.rst
39 changes: 0 additions & 39 deletions docs/source/intro.rst
@@ -41,42 +41,3 @@ more verbose logs, you can do something like:
logger = logging.getLogger('pandas_gbq')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(stream=sys.stdout))
.. _authentication:

Authentication
''''''''''''''

Authentication to the Google ``BigQuery`` service via ``OAuth 2.0``
is possible with either user or service account credentials.

Authentication via user account credentials is as simple as following the prompts in a browser window
which will automatically open for you. You authenticate to the specified
``BigQuery`` account using the product name ``pandas GBQ``.
The remote authentication is supported via the ``auth_local_webserver`` in ``read_gbq``. By default,
account credentials are stored in an application-specific hidden user folder on the operating system. You
can override the default credentials location via the ``PANDAS_GBQ_CREDENTIALS_FILE`` environment variable.
Additional information on the authentication mechanism can be found
`here <https://developers.google.com/identity/protocols/OAuth2#clientside/>`__.

Authentication via service account credentials is possible through the `'private_key'` parameter. This method
is particularly useful when working on remote servers (eg. Jupyter Notebooks on remote host).
Additional information on service accounts can be found
`here <https://developers.google.com/identity/protocols/OAuth2#serviceaccount>`__.

Authentication via ``application default credentials`` is also possible, but only valid
if the parameter ``private_key`` is not provided. This method requires that the
credentials can be fetched from the development environment. Otherwise, the OAuth2
client-side authentication is used. Additional information can be found on
`application default credentials <https://developers.google.com/identity/protocols/application-default-credentials>`__.

.. note::

The `'private_key'` parameter can be set to either the file path of the service account key
in JSON format, or key contents of the service account key in JSON format.

.. note::

A private key can be obtained from the Google developers console by clicking
`here <https://console.developers.google.com/permissions/serviceaccounts>`__. Use JSON key type.
46 changes: 12 additions & 34 deletions pandas_gbq/gbq.py
@@ -476,23 +476,12 @@ def read_gbq(query, project_id=None, index_col=None, col_order=None,
The main method a user calls to execute a Query in Google BigQuery
and read results into a pandas DataFrame.
The Google Cloud library is used.
Documentation is available `here
<https://googlecloudplatform.github.io/google-cloud-python/stable/>`__
This method uses the Google Cloud client library to make requests to
Google BigQuery, documented `here
<https://google-cloud-python.readthedocs.io/en/latest/bigquery/usage.html>`__.
Authentication to the Google BigQuery service is via OAuth 2.0.
- If "private_key" is not provided:
By default "application default credentials" are used.
If default application credentials are not found or are restrictive,
user account credentials are used. In this case, you will be asked to
grant permissions for product name 'pandas GBQ'.
- If "private_key" is provided:
Service account credentials will be used to authenticate.
See the :ref:`How to authenticate with Google BigQuery <authentication>`
guide for authentication instructions.
Parameters
----------
@@ -612,29 +601,18 @@ def to_gbq(dataframe, destination_table, project_id=None, chunksize=None,
The main method a user calls to export pandas DataFrame contents to
Google BigQuery table.
Google BigQuery API Client Library v2 for Python is used.
Documentation is available `here
<https://developers.google.com/api-client-library/python/apis/bigquery/v2>`__
Authentication to the Google BigQuery service is via OAuth 2.0.
- If "private_key" is not provided:
By default "application default credentials" are used.
If default application credentials are not found or are restrictive,
user account credentials are used. In this case, you will be asked to
grant permissions for product name 'pandas GBQ'.
- If "private_key" is provided:
This method uses the Google Cloud client library to make requests to
Google BigQuery, documented `here
<https://google-cloud-python.readthedocs.io/en/latest/bigquery/usage.html>`__.
Service account credentials will be used to authenticate.
See the :ref:`How to authenticate with Google BigQuery <authentication>`
guide for authentication instructions.
Parameters
----------
dataframe : DataFrame
dataframe : pandas.DataFrame
DataFrame to be written
destination_table : string
destination_table : str
Name of table to be written, in the form 'dataset.tablename'
project_id : str (optional when available in environment)
Google BigQuery Account project ID.