The Google Cloud Platform connection type enables the :ref:`GCP Integrations <GCP>`.

There are three ways to connect to GCP using Airflow.

- Use Application Default Credentials, such as via the metadata server when running on Google Compute Engine.
- Use a service account key file (JSON format) on disk - ``Keyfile Path``.
- Use a service account key file (JSON format) from connection configuration - ``Keyfile JSON``.

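
As a rough illustration of how the three options relate to the connection fields, the sketch below picks a method from whichever fields are filled in. The function name ``choose_auth_method`` and the returned labels are hypothetical, and the precedence applied when both key fields are set is an assumption of this sketch, not documented behavior.

```python
from typing import Optional


def choose_auth_method(key_path: Optional[str] = None,
                       keyfile_json: Optional[str] = None) -> str:
    """Return a label for the authentication method implied by the fields.

    Hypothetical helper for illustration only; it performs no authentication.
    """
    if not key_path and not keyfile_json:
        # Neither key field is set: fall back to Application Default Credentials.
        return "application_default_credentials"
    if key_path:
        # Assumption of this sketch: Keyfile Path wins when both fields are set.
        return "keyfile_path"
    return "keyfile_json"


print(choose_auth_method())
print(choose_auth_method(key_path="/keys/key.json"))
print(choose_auth_method(keyfile_json='{"type": "service_account"}'))
```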

The following connection IDs are used by default.

``bigquery_default``
    Used by the :class:`~airflow.contrib.hooks.bigquery_hook.BigQueryHook` hook.

``google_cloud_datastore_default``
    Used by the :class:`~airflow.contrib.hooks.datastore_hook.DatastoreHook` hook.

``google_cloud_default``
    Used by the following hooks:

    - :class:`~airflow.contrib.hooks.gcp_api_base_hook.GoogleCloudBaseHook`
    - :class:`~airflow.contrib.hooks.gcp_dataflow_hook.DataFlowHook`
    - :class:`~airflow.contrib.hooks.gcp_dataproc_hook.DataProcHook`
    - :class:`~airflow.contrib.hooks.gcp_mlengine_hook.MLEngineHook`
    - :class:`~airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook`
    - :class:`~airflow.contrib.hooks.gcp_bigtable_hook.BigtableHook`
    - :class:`~airflow.contrib.hooks.gcp_compute_hook.GceHook`
    - :class:`~airflow.contrib.hooks.gcp_function_hook.GcfHook`
    - :class:`~airflow.contrib.hooks.gcp_spanner_hook.CloudSpannerHook`
    - :class:`~airflow.contrib.hooks.gcp_sql_hook.CloudSqlHook`


Project Id (optional)
    The Google Cloud project ID to connect to. Operators that use this connection take it as their default project ID, and it can usually be overridden at the operator level.

Keyfile Path
    Path to a service account key file (JSON format) on disk.

    Not required if using Application Default Credentials.

Keyfile JSON
    Contents of a service account key file (JSON format) on disk. It is recommended to :doc:`Secure your connections <../secure-connections>` if using this method to authenticate.

    Not required if using Application Default Credentials.

Scopes (comma separated)
    A comma-separated list of Google Cloud scopes to authenticate with.

Number of Retries
    Integer, the number of times to retry with randomized exponential backoff. If all retries fail, the :class:`googleapiclient.errors.HttpError` represents the last request. If zero (default), the request is attempted only once.

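
To make "randomized exponential backoff" concrete, here is a minimal sketch of how retry delays could be generated. It is not the client library's code: the function name ``backoff_delays`` and the exact delay formula (a random fraction of ``2 ** attempt`` seconds) are assumptions chosen for illustration.

```python
import random
from typing import List, Optional


def backoff_delays(num_retries: int,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one illustrative sleep time (in seconds) per retry.

    Hypothetical helper: the upper bound doubles with each attempt, and the
    actual delay is a random fraction of that bound.
    """
    if rng is None:
        rng = random.Random()
    return [rng.random() * 2 ** attempt
            for attempt in range(1, num_retries + 1)]


# With num_retries=0 (the default described above) there are no retries,
# so the list of delays is empty and the request is attempted only once.
print(backoff_delays(0))

# With num_retries=5 there are five delays, each below 2, 4, 8, 16, 32 seconds.
print(backoff_delays(5, random.Random(0)))
```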

When specifying the connection in an environment variable, you should specify it using URI syntax, with the following requirements:

- the scheme part should be equal to ``google-cloud-platform`` (note the hyphen characters)
- the authority (username, password, host, port) and path are ignored
- the query parameters contain information specific to this type of
  connection. The following keys are accepted:

  - ``extra__google_cloud_platform__project`` - Project Id
  - ``extra__google_cloud_platform__key_path`` - Keyfile Path
  - ``extra__google_cloud_platform__keyfile_dict`` - Keyfile JSON
  - ``extra__google_cloud_platform__scope`` - Scopes
  - ``extra__google_cloud_platform__num_retries`` - Number of Retries


Note that all components of the URI should be URL-encoded.

For example:

.. code-block:: bash

   export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fkeys%2Fkey.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&extra__google_cloud_platform__project=airflow&extra__google_cloud_platform__num_retries=5'

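
Hand-encoding the query string is error-prone, so it may help to generate it. The sketch below uses only the Python standard library to build the URI from plain values and then parses it back as a sanity check. The helper name ``build_gcp_conn_uri`` is hypothetical, and the values are the ones from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Prefix shared by all of the accepted query keys.
PREFIX = "extra__google_cloud_platform__"


def build_gcp_conn_uri(**fields) -> str:
    """Build a google-cloud-platform:// URI with URL-encoded query parameters.

    Hypothetical helper: each keyword (for example ``key_path``) is expanded
    to the full key by prepending PREFIX.
    """
    query = urlencode({PREFIX + name: value for name, value in fields.items()})
    return "google-cloud-platform://?" + query


uri = build_gcp_conn_uri(
    key_path="/keys/key.json",
    scope="https://www.googleapis.com/auth/cloud-platform",
    project="airflow",
    num_retries=5,
)
print(uri)

# Decode the query string again to confirm the original values survive.
decoded = parse_qs(urlsplit(uri).query)
print(decoded[PREFIX + "key_path"])
```

The printed URI can then be assigned to the ``AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT`` environment variable.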