Policy tags removed by BigQueryInsertJobOperator in data-platform-foundations airflow DAG #1532

Closed
danieldeleo opened this issue Jul 27, 2023 · 4 comments · Fixed by #1533

Comments

@danieldeleo
Collaborator

Both of the following BQ insert jobs in datapipeline_dc_tags.py strip the "customer_purchase" destination tables of their policy tags:

This occurs because writeDisposition is set to "WRITE_TRUNCATE", which behaves as the docs state:

  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the data, removes the constraints, and uses the schema from the query result.

This should be fixed because removing policy tags exposes sensitive data and removes the column-level security that policy tags enable.
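One way to confirm the regression is to compare the destination table's schema before and after the job runs. The sketch below is only an illustration: it assumes the schema is available as the `fields` list of a BigQuery table resource (where each column may carry `policyTags.names`), and the column names and taxonomy path are hypothetical, not taken from the DAG.

```python
def policy_tagged_columns(fields, prefix=""):
    """Map each column path to its policy tag names, descending into nested RECORD fields."""
    tagged = {}
    for field in fields:
        path = prefix + field["name"]
        names = field.get("policyTags", {}).get("names", [])
        if names:
            tagged[path] = list(names)
        # Nested columns of a RECORD field are listed under "fields".
        tagged.update(policy_tagged_columns(field.get("fields", []), path + "."))
    return tagged


def lost_policy_tags(before_fields, after_fields):
    """Return the columns that had policy tags before the job and have none afterwards."""
    before = policy_tagged_columns(before_fields)
    after = policy_tagged_columns(after_fields)
    return sorted(column for column in before if column not in after)


# Hypothetical schema snapshots taken before and after a WRITE_TRUNCATE job.
HYPOTHETICAL_TAG = "projects/example-project/locations/eu/taxonomies/1/policyTags/2"
schema_before = [
    {"name": "id", "type": "INTEGER"},
    {"name": "customer_id", "type": "INTEGER",
     "policyTags": {"names": [HYPOTHETICAL_TAG]}},
]
schema_after = [
    {"name": "id", "type": "INTEGER"},
    {"name": "customer_id", "type": "INTEGER"},
]

print(lost_policy_tags(schema_before, schema_after))  # ['customer_id']
```

Any non-empty result means a previously protected column is now readable without column-level access control.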

@danieldeleo
Collaborator Author

danieldeleo commented Jul 27, 2023

datapipeline_dc_tags_flex.py would require the same fix

@juliocc
Collaborator

juliocc commented Jul 27, 2023

@lcaggio @iht can you take a look?

@danieldeleo
Collaborator Author

In my tests, changing writeDisposition to WRITE_APPEND resolved the issue. I can submit a PR with the fix tomorrow.
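For reference, a minimal sketch of what the changed job configuration could look like. The helper name, SQL, and project/dataset identifiers are hypothetical placeholders, not the values used in datapipeline_dc_tags.py; only the `writeDisposition` field and the "customer_purchase" table name come from this issue.

```python
ALLOWED_WRITE_DISPOSITIONS = {"WRITE_APPEND", "WRITE_TRUNCATE", "WRITE_EMPTY"}


def build_query_job_config(sql, project_id, dataset_id, table_id,
                           write_disposition="WRITE_APPEND"):
    """Build a BigQuery query-job configuration dict (hypothetical helper).

    Defaults to WRITE_APPEND so an existing destination table is not overwritten.
    """
    if write_disposition not in ALLOWED_WRITE_DISPOSITIONS:
        raise ValueError(f"Unsupported writeDisposition: {write_disposition}")
    return {
        "query": {
            "query": sql,
            "destinationTable": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            "writeDisposition": write_disposition,
            "useLegacySql": False,
        }
    }


# Placeholder values for illustration only.
config = build_query_job_config(
    sql="SELECT 1 AS id",
    project_id="example-project",
    dataset_id="example_dataset",
    table_id="customer_purchase",
)

# In the DAG, the dict would be passed to the operator roughly like this:
# BigQueryInsertJobOperator(task_id="example_task", configuration=config)
print(config["query"]["writeDisposition"])  # WRITE_APPEND
```

Note that WRITE_APPEND adds rows to the existing table instead of replacing them, so reruns of the DAG will accumulate data unless the table is emptied by some other step.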

@wiktorn
Collaborator

wiktorn commented Jul 27, 2023

There is a BigQuery feature request for that. I agree that changing writeDisposition is the way forward for now.
