This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

BigQuery documentation is misleading #263

Closed
milanaleksic opened this issue Feb 16, 2021 · 7 comments
Labels: bug, good first issue, soda-sql

Comments

@milanaleksic
Contributor

Describe the bug
Our documentation currently says:

name: my_bigquery_project
connection:
    type: bigquery
    account_info: <PATH TO YOUR BIGQUERY ACCOUNT INFO JSON FILE>
    dataset: sodasql
...

but our code expects

  1. account_info_json, and
  2. that property should contain the JSON object itself, not the path to the file

OS:
Python Version: *
Soda SQL Version: main
Warehouse Type: BigQuery
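
Point 2 above can be illustrated without Soda SQL: a file path is not valid JSON, so a path placed where a JSON document is expected fails at parse time. This is only a minimal sketch using the standard library; the helper name `is_inline_json_object` and the path are hypothetical and not part of Soda SQL.

```python
import json


def is_inline_json_object(value: str) -> bool:
    """Return True if `value` parses as a JSON object, False otherwise."""
    try:
        return isinstance(json.loads(value), dict)
    except json.JSONDecodeError:
        # A file path such as "/path/to/file.json" ends up here.
        return False


# Hypothetical values, for illustration only.
path_value = "/path/to/account_info.json"
inline_value = '{"type": "service_account", "project_id": "..."}'

print(is_inline_json_object(path_value))
print(is_inline_json_object(inline_value))
```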

@milanaleksic milanaleksic added the bug, soda-sql, and good first issue labels Feb 16, 2021
@Antoninj Antoninj self-assigned this Feb 16, 2021
@Antoninj
Contributor

closed via e1e3903

@abuckenheimer

@Antoninj this helps, but it doesn't actually solve the problem described in the issue. Your diff should look like this:

name: my_bigquery_project
connection:
    type: bigquery
    - account_info: <PATH TO YOUR BIGQUERY ACCOUNT INFO JSON FILE>
    + account_info_json: <YOUR BIGQUERY SERVICE ACCOUNT INFO JSON FILE>
    dataset: sodasql

The key needs to be renamed. Right now, when I run with a warehouse.yml that uses the account_info key instead of account_info_json, I get this rather unhelpful error message:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sodasql/cli/cli.py", line 181, in analyze
    warehouse_yml_parser = WarehouseYmlParser(warehouse_yml_dict, warehouse_file)
  File "/usr/local/lib/python3.8/site-packages/sodasql/scan/warehouse_yml_parser.py", line 64, in __init__
    self.warehouse_yml.dialect = Dialect.create(self)
  File "/usr/local/lib/python3.8/site-packages/sodasql/scan/dialect.py", line 61, in create
    return BigQueryDialect(parser)
  File "/usr/local/lib/python3.8/site-packages/sodasql/dialects/bigquery_dialect.py", line 34, in __init__
    self.account_info_dict = self.__parse_json_credential('account_info_json', parser)
  File "/usr/local/lib/python3.8/site-packages/sodasql/dialects/bigquery_dialect.py", line 92, in __parse_json_credential
    return json.loads(parser.get_credential(credential_name))
  File "/usr/local/lib/python3.8/json/__init__.py", line 341, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType
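
The last frame of this traceback can be reproduced outside Soda SQL, which also suggests what a more helpful message could look like. The following is a sketch only: it assumes that looking up a missing `account_info_json` key yields `None`, and `parse_json_credential` is a hypothetical stand-in, not Soda SQL's actual implementation.

```python
import json


def parse_json_credential(connection: dict, credential_name: str) -> dict:
    """Hypothetical stand-in for the dialect's parsing, with an explicit check."""
    raw = connection.get(credential_name)
    if raw is None:
        # Fail early with a message that names the missing key.
        raise ValueError(
            f"'{credential_name}' is missing from the connection configuration"
        )
    return json.loads(raw)


# Passing None straight to json.loads raises the TypeError from the traceback.
try:
    json.loads(None)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

With a check like this, a warehouse.yml that still uses `account_info` would fail with an error naming the expected key rather than a `TypeError` from the `json` module.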

FWIW, I think the wording "YOUR BIGQUERY SERVICE ACCOUNT INFO JSON FILE" is still a bit ambiguous as to whether the value should be inline JSON or a path to a file. Maybe an example would be better:

name: my_bigquery_project
connection:
    type: bigquery
    # YOUR BIGQUERY SERVICE ACCOUNT INFO JSON FILE
    account_info_json: >
      {
        "type": "service_account",
        "project_id": "...",
        "private_key_id": "...",
        "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
        "client_email": "user@project.iam.gserviceaccount.com",
        "client_id": "...",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://oauth2.googleapis.com/token",
        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
        "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/..."
      }
    dataset: dataset
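
If the key file already exists on disk, one way to get a value suitable for pasting under `account_info_json` is to load it and re-serialize it. This is a rough sketch under that assumption; `inline_account_info` is a hypothetical helper and not something Soda SQL provides.

```python
import json
from pathlib import Path


def inline_account_info(key_file: Path) -> str:
    """Read a service account key file and return it as one line of JSON."""
    with key_file.open(encoding="utf-8") as f:
        # Loading first also verifies that the file really contains JSON.
        return json.dumps(json.load(f))
```

Round-tripping through `json.load` and `json.dumps` keeps newlines in `private_key` escaped as `\n`, as in the example above.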

@tombaeyens
Contributor

@abuckenheimer there is another issue about refactoring the file references with include and read: sodadata/soda-core#725

@Antoninj
Contributor

Antoninj commented Mar 3, 2021

Thanks @abuckenheimer for pointing this out! I just updated the documentation following your suggestion.

@Antoninj Antoninj reopened this Mar 3, 2021
@vijaykiran vijaykiran changed the title from "[bug] BigQuery documentation is misleading" to "BigQuery documentation is misleading" Mar 3, 2021
@Antoninj
Contributor

Antoninj commented Mar 3, 2021

closed via 33cab9b

@Antoninj Antoninj closed this as completed Mar 3, 2021
@bjornvandijkman-ingka

Following the documentation still does not work for me. I have tried the following:

name: soda_sql_tutorial
connection:
    type: bigquery
    account_info_json: env_var(BIG_QUERY_ACCESS)
    auth_scopes:
    - https://www.googleapis.com/auth/bigquery
    dataset: dbt_customer_360

soda_account:
  host: cloud.soda.io
  api_key_id: env_var(API_PUBLIC)
  api_key_secret: env_var(API_PRIVATE)

Which throws the following exception:

 2.1.0b22
  | Analyzing warehouse.yml ...
  | Dialect initiated from the create command, cred is None.
  | Exception: Account_info_json is not provided
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/sodasql/cli/cli.py", line 248, in analyze
    warehouse = Warehouse(warehouse_yml_parser.warehouse_yml)
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/sodasql/scan/warehouse.py", line 26, in __init__
    self.connection = self.dialect.create_connection()
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/sodasql/dialects/bigquery_dialect.py", line 74, in create_connection
    self.try_to_raise_soda_sql_exception(e)
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/sodasql/scan/dialect.py", line 473, in try_to_raise_soda_sql_exception
    raise exception
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/sodasql/dialects/bigquery_dialect.py", line 66, in create_connection
    raise Exception("Account_info_json is not provided")
Exception: Account_info_json is not provided
  | If you think this is a bug in Soda SQL, please open an issue athttps://github.com/sodadata/soda-sql/issues/new/choose
  | Starting new HTTPS connection (1): collect.dev.sodadata.io:443
  | https://collect.dev.sodadata.io:443 "POST /v1/traces HTTP/1.1" 200 0

I get the same exception when providing a path to the JSON file. The only thing that works is including the credentials directly in the scan.yml file.

@francis

francis commented Jan 12, 2022

I think if you change:
account_info_json: env_var(BIG_QUERY_ACCESS)
to:
account_info_path: env_var(BIG_QUERY_ACCESS)

it might work. See the BigQuery Datasource Configurations documentation.
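
Before switching keys, it may help to check what the environment variable actually holds. The sketch below is a heuristic only: it assumes `account_info_json` expects inline JSON and `account_info_path` expects a path to an existing file, and `suggest_key` is a hypothetical helper, not part of Soda SQL.

```python
import json
import os
from typing import Optional


def suggest_key(value: Optional[str]) -> str:
    """Guess which warehouse.yml key suits a credential value (heuristic only)."""
    if not value:
        return "unset"
    try:
        if isinstance(json.loads(value), dict):
            return "account_info_json"
    except json.JSONDecodeError:
        pass  # Not JSON; it may still be a file path.
    if os.path.isfile(value):
        return "account_info_path"
    return "unknown"


# BIG_QUERY_ACCESS is the variable name used in the configuration above.
print(suggest_key(os.environ.get("BIG_QUERY_ACCESS")))
```

A result of "unset" would be consistent with the "cred is None" line in the log above, although that is only a guess about the cause.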

@vijaykiran vijaykiran transferred this issue from sodadata/soda-core Mar 22, 2022