Add additional BigQuery authentication method#62

Merged
WilliamDee merged 2 commits into main from
will/user-creds-bigquery
May 6, 2022

@WilliamDee
Contributor

@WilliamDee WilliamDee commented May 2, 2022

Context

Currently, MetricFlow only accepts a path to a Google JSON credential file for a service account. This PR exposes a second way to authenticate with BigQuery: Application Default Credentials (ADC), which lets users authenticate with their own end-user credentials. More information about ADC is available here (https://google.aip.dev/auth/4110).

Changes

  • Renamed the BigQuery config key from dwh_password to dwh_path_to_creds for clarity
  • Added a dwh_project_id config key for users who want to authenticate via ADC
  • Modified the BigQuery engine creation flow to accept both methods of passing credentials

Usage

Current Service Account Auth (No changes)

dwh_dialect: bigquery  # Dialect (one of BigQuery, Snowflake, Redshift)
dwh_path_to_creds: '<path_to_service_account_creds.json>'  # Provide the path to the BigQuery credential file, ignore to use ADC auth
dwh_schema: 'transform_staging'
dwh_project_id: ''  # Provide the GCP Project ID, ignore if using service account credentials

Note that dwh_project_id can be ignored

Additional Auth Method

Run gcloud auth application-default login on the command line. This creates a credential file linked to your user account in the default location defined by Google, and that file is picked up automatically when the BQ engine is created. The config file will then look something like this:

dwh_dialect: bigquery  # Dialect (one of BigQuery, Snowflake, Redshift)
dwh_path_to_creds: ''  # Provide the path to the BigQuery credential file, ignore to use ADC auth
dwh_schema: 'transform_staging'
dwh_project_id: '<project_id>'  # Provide the GCP Project ID, ignore if using service account credentials

Note that dwh_path_to_creds can be ignored

Disclosure

For production, the recommended way to authenticate is with a service account credential, not an end-user credential.

@WilliamDee
Contributor Author

WilliamDee commented May 2, 2022

An important thing to note: because this PR changes the config key name (from dwh_password to dwh_path_to_creds), any BQ user who upgrades to the version that ships this change will hit errors, since the code will look for dwh_path_to_creds while their existing config file still has dwh_password. Users will need to either

  1. delete the config file and rerun mf setup

or

  2. rename dwh_password to dwh_path_to_creds manually
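
For the manual rename, the change amounts to moving one key. Here is a minimal sketch in Python, assuming the config has already been loaded into a plain dict. The migrate_bigquery_config helper is hypothetical and not part of MetricFlow, and restricting the rename to the bigquery dialect is an assumption based on the rename applying only to the BigQuery config.

```python
from typing import Any, Dict

OLD_KEY = "dwh_password"
NEW_KEY = "dwh_path_to_creds"


def migrate_bigquery_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with dwh_password renamed to dwh_path_to_creds.

    Hypothetical helper for illustration only; MetricFlow does not ship it.
    """
    migrated = dict(config)  # shallow copy so the caller's dict is untouched
    is_bigquery = str(migrated.get("dwh_dialect", "")).lower() == "bigquery"
    # Assumption: other dialects keep dwh_password, so only BigQuery configs change.
    if is_bigquery and OLD_KEY in migrated and NEW_KEY not in migrated:
        migrated[NEW_KEY] = migrated.pop(OLD_KEY)
    return migrated


old_config = {
    "dwh_dialect": "bigquery",
    "dwh_password": "/path/to/service_account_creds.json",
    "dwh_schema": "transform_staging",
}
new_config = migrate_bigquery_config(old_config)
```

Deleting the config file and rerunning mf setup reaches the same end state without touching the keys by hand.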

@WilliamDee WilliamDee force-pushed the will/user-creds-bigquery branch 2 times, most recently from 1811f21 to 49c6fec Compare May 4, 2022 18:33
@tlento
Contributor

tlento commented May 5, 2022

An important thing to note here is by changing the config key name here (from dwh_password to dwh_path_to_creds) any BQ user that upgrades to whatever version this change gets pushed out to will cause errors as the code will look for dwh_path_to_creds but their existing config file has dwh_password

Something just occurred to me here.

What kind of error message will these users see? Can we make sure the error message provides a concrete indication of what they should do to fix the issue? Like if they have BQ as the warehouse type, can we have the error message tell them that they need to fill out the dwh_path_to_creds (and only that key)?

This isn't blocking on merge but we should follow up with that improvement before release.

@WilliamDee WilliamDee force-pushed the will/user-creds-bigquery branch from 49c6fec to 88605dc Compare May 5, 2022 19:29
@WilliamDee
Contributor Author

Added an error message. For this specific case of a user with the old config format, they'll see:

ValueError: One of `dwh_path_to_creds` or `dwh_project_id` should be filled.
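
The check behind that message can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: select_bigquery_auth and the returned labels are hypothetical names, and letting the service account path take precedence when both keys are filled is an assumption.

```python
from typing import Optional


def select_bigquery_auth(path_to_creds: Optional[str], project_id: Optional[str]) -> str:
    """Pick an auth method from the two config values (hypothetical sketch)."""
    if path_to_creds:
        # A service account JSON file was provided.
        return "service_account"
    if project_id:
        # No file given: fall back to Application Default Credentials.
        return "adc"
    # Neither key is filled, e.g. an old config that still uses dwh_password.
    raise ValueError("One of `dwh_path_to_creds` or `dwh_project_id` should be filled.")


# An old-style config has neither of the new keys, so both lookups return None.
old_config = {"dwh_dialect": "bigquery", "dwh_password": "/path/to/creds.json"}
try:
    select_bigquery_auth(old_config.get("dwh_path_to_creds"), old_config.get("dwh_project_id"))
    old_config_error = None
except ValueError as err:
    old_config_error = str(err)
```

With this shape, an empty string and a missing key are treated the same way, which matches the config templates above where the unused key is left as ''.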

@WilliamDee WilliamDee merged commit 3598b1d into main May 6, 2022
@WilliamDee WilliamDee deleted the will/user-creds-bigquery branch May 6, 2022 00:32