Empty schema of filtered_information_schema_columns when initializing #544

@yu-iskw yu-iskw commented Sep 22, 2023

Overview

We use elementary and dbt with BigQuery. The BigQuery tables modeled by dbt are created in multiple Google Cloud projects, and we have GCP projects dedicated to persisting the results of elementary.

When we tried to upgrade elementary and dbt-data-reliability from 0.8.0 to 0.10.3 by running dbt run --select elementary --profile elementary, we got the following error. As far as I can tell, the elementary_v0.filtered_information_schema_columns model tries to access the metadata of all tables in the dbt project, including datasets that the elementary profile has no permission to list. In my opinion, when creating or upgrading the elementary schemas, it would be fine to create empty ones.

  • dbt: 1.5.0 and 1.6.2
  • elementary: 0.10.0
  • dbt-data-reliability: 0.10.3
07:26:53  14 of 29 ERROR creating sql view model elementary.filtered_information_schema_columns  [ERROR in 33.93s]

403 GET https://bigquery.googleapis.com/bigquery/v2/projects/xxx-analysis/datasets/if_master/tables?maxResults=1&prettyPrint=false: Access Denied: Dataset xxx-analysis:master: Permission bigquery.tables.list denied on dataset xxx-analysis:master (or it may not exist).

What is the change?

I would like to create empty schemas for elementary when creating or upgrading its dbt schemas. There are a couple of possible solutions. First, we could use an environment variable to call the get_empty_columns_from_information_schema_table macro, which yields an empty schema for filtered_information_schema_columns. Second, as in this pull request, we could take advantage of the --select option: context["invocation_args_dict"]["select"] gives us the list of values passed to that option.
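
To illustrate the second approach, here is a minimal Python sketch of the decision the model would have to make. It is not the code in this pull request: the helper name selects_only_elementary and its package parameter are hypothetical, and the sketch assumes the select value may arrive either as a single string or as a sequence of strings, which may differ between dbt versions.

```python
from typing import Any, Mapping


def selects_only_elementary(
    invocation_args: Mapping[str, Any], package: str = "elementary"
) -> bool:
    """Return True when every --select value targets only the given package.

    Hypothetical helper: it mimics the check a macro could run against
    context["invocation_args_dict"] before deciding to build an empty schema.
    """
    select = invocation_args.get("select") or ()
    # Assumption: the value is either one string or a sequence of strings.
    if isinstance(select, str):
        select = [select]
    # A single --select value may contain several space-separated selectors.
    selectors = [token for value in select for token in value.split()]
    if not selectors:
        # No --select given: the whole project runs, so do not use empty schemas.
        return False
    return all(
        token == package or token.startswith(package + ".")
        for token in selectors
    )


# Example: dbt run --select elementary --profile elementary
use_empty_schema = selects_only_elementary({"select": ["elementary"]})
```

When the check passes, the model could fall back to get_empty_columns_from_information_schema_table instead of querying every dataset. Matching on the package name alone is a deliberate simplification; graph operators such as +elementary or tag-based selectors would need extra handling.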

Signed-off-by: Yu Ishikawa <yu-iskw@users.noreply.github.com>
yu-iskw commented Sep 26, 2023

As the elementary team plans to change how schema snapshots are taken, we can close this pull request. It would be better to discuss and implement the new approach instead.

elementary-data/elementary#945 (comment)

@yu-iskw yu-iskw closed this Sep 26, 2023