Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add schema update functionality for Big Query tables #14769

Closed
terekete opened this issue May 31, 2023 · 15 comments
Closed

feat: add schema update functionality for Big Query tables #14769

terekete opened this issue May 31, 2023 · 15 comments

Comments

@terekete
Copy link

terekete commented May 31, 2023

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.

Description

With Big Query tables if a schema is updated the complete tables needs to be replaced. BQ is now supporting in GA the ability to rename column and drop columns. Is it possible to enable this same functionality from terraform such that schema changes are handled for updates/patches.

https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#alter_table_drop_column_statement

New or Affected Resource(s)

google_bigquery_table

Potential Terraform Configuration

The change on schema upon apply should handle these updates to the schema without recreating the table.
Cases are add table column, drop table column, rename table column, shuffle table schema.

# Propose what you think the configuration to take advantage of this feature should look like.
# We may not use it verbatim, but it's helpful in understanding your intent.

References

@terekete terekete changed the title feat: add schema update functionality for Big Query feat: add schema update functionality for Big Query tables May 31, 2023
@rileykarson
Copy link
Collaborator

Note: Nothing in the API indicates this wouldn't be updatable using standard update methods

@terekete
Copy link
Author

@rileykarson are you saying this looks doable or this already should work with present tf versions?

@yi-wei-neo-db
Copy link

This will not work with present tf version.

@terekete
Copy link
Author

Any update on this feature?

@obada-ab
Copy link

obada-ab commented Feb 1, 2024

Picking this up.

I looked into allowed table modifications (https://cloud.google.com/bigquery/docs/managing-table-schemas), and I did some testing with terraform 1.7.2, it looks like at least some types of schema updates work fine, for example adding a column:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following
symbols:
  ~ update in-place

Terraform will perform the following actions:

  # google_bigquery_table.test will be updated in-place
  ~ resource "google_bigquery_table" "test" {
        id                       = "projects/xxxxx/datasets/random_foo_bar/tables/random_foo_bar"
      ~ schema                   = jsonencode(
          ~ [
                # (1 unchanged element hidden)
                {
                    name = "column2"
                    type = "STRING"
                },
              + {
                  + name = "column3"
                  + type = "STRING"
                },
            ]
        )
        # (18 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Likewise, relaxing a field mode from REQUIRED to NULLABLE happens in-place.

As for renaming and dropping columns, I need to check if these types of in-place updates are allowed through the API, the page I linked above mentions the DDL commands as the only option for these types of changes.

@RomeoAva
Copy link

RomeoAva commented Feb 1, 2024

+1 on this. In the past had to handle Bigquery tables outside of Terraform with migrations handled with Alembic. Which made it awkward since everything else was defined in Terraform..

This would be awesome if it works!

@terekete
Copy link
Author

terekete commented Feb 2, 2024 via email

@terekete
Copy link
Author

@obada-ab - any luck on this issue?

@obada-ab
Copy link

@terekete we're still verifying internally whether the API supports this, but so far it doesn't seem to be the case. I'm also looking into other possible workarounds for renaming/dropping columns.

@obada-ab
Copy link

Working on a PR for in-place column dropping, renaming nullable columns would technically happen in-place but TF would detect it as a column being dropped and a new column being added, which would delete existing data for that column.

For general column renaming backend support would likely be needed since detecting all cases can add a lot of complexity.

@PabloPardoGarcia
Copy link

@obada-ab I recently face unexpected behavior adding a new column to an existing table. I added a column to the middle of the schema, however when applying terraform, the new column was added to the end of the columns list.

According to the documentation, new columns added not at the end should raise an error:

Add the new columns to the end of the schema definition. If you attempt to add new columns elsewhere in the array, the following error is returned: BigQuery error in update operation: Precondition Failed.

It is also not super clear in BQ documentation, since this is only mentioned in the bq section and not in any of the other languages.

@obada-ab
Copy link

Add the new columns to the end of the schema definition. If you attempt to add new columns elsewhere in the array, the following error is returned: BigQuery error in update operation: Precondition Failed.

It is also not super clear in BQ documentation, since this is only mentioned in the bq section and not in any of the other languages.

@PabloPardoGarcia This seems to be a condition only for the bq cli tools, there's no such requirements in the API afaik.

the new column was added to the end of the columns list.

I'm not aware of any way to determine a specific place for the added column.

@obada-ab
Copy link

obada-ab commented Apr 8, 2024

Added in-place column drop functionality, column renaming is more complicated and we might have to wait for backend support.

@rileykarson I think we can close this issue and track column renaming in #17787

@melinath
Copy link
Collaborator

Per #14769 (comment) closing this ticket as completed.

Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 18, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants