Set relation_column_encoding to AUTO by default #698

Merged
merged 1 commit into from
Apr 18, 2022

@thomas-vogels thomas-vogels commented Apr 7, 2022

Description

User-visible Changes

This changes the default of the relation_column_encoding setting from ON to AUTO. With AUTO, Arthur picks an encoding for each column, based on the column's type and on whether the column is used in a distribution or sort key. Specifying the encoding up front speeds up loading data because COMPUPDATE can be left off.
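
As a rough illustration of what "picked based on the type and whether it's used in distribution or sort keys" can look like, here is a minimal sketch. The function name, its parameters, and the specific rules are assumptions made for illustration; they are not taken from Arthur's code and Arthur's actual mapping may differ.

```python
# Illustrative rules only; these are assumptions, not Arthur's actual mapping.
UNCOMPRESSED_TYPES = {"boolean", "real", "double precision"}


def pick_encoding(column_type, is_distribution_key=False, is_sort_key=False):
    """Return a Redshift column encoding name for one column (hypothetical helper)."""
    if is_distribution_key or is_sort_key:
        # Assumption: leave key columns uncompressed.
        return "raw"
    if column_type.lower() in UNCOMPRESSED_TYPES:
        # Assumption: these types are left uncompressed.
        return "raw"
    # Assumption: zstd as the general-purpose choice, as in the
    # "created_at" example in the Testing section.
    return "zstd"
```

The point of the sketch is only that the decision is a pure function of the column's type and its key membership, so it can be made when the DDL is generated rather than by Redshift during the load.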

Links

Closes #229

Testing

You can check the active setting using:

    arthur.py settings arthur_settings.redshift.relation_column_encoding

You can check which encoding is used by running show_ddl on a table. Unless you already had the AUTO setting active, you would previously have seen column definitions without an encoding, for example:

    "created_at" TIMESTAMP WITHOUT TIME ZONE NOT NULL,

With the new default, the same column is shown with an encoding:

    "created_at" TIMESTAMP WITHOUT TIME ZONE ENCODE zstd NOT NULL,
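
To check many columns at once, the comparison above can be scripted. The following is a minimal sketch, assuming the DDL text has already been captured into a string and that each column definition sits on its own line starting with a double-quoted name, as in the examples above. The function name is hypothetical and not part of Arthur.

```python
import re

# Matches a column definition line such as:
#     "created_at" TIMESTAMP WITHOUT TIME ZONE ENCODE zstd NOT NULL,
COLUMN_LINE = re.compile(r'^\s*"(?P<name>[^"]+)"\s+(?P<rest>.*)$')
ENCODE_CLAUSE = re.compile(r"\bENCODE\s+(?P<encoding>\w+)", re.IGNORECASE)


def column_encodings(ddl_text):
    """Map each column name to its encoding, or to None if it has no ENCODE clause."""
    encodings = {}
    for line in ddl_text.splitlines():
        column = COLUMN_LINE.match(line)
        if column is None:
            continue  # not a column definition line
        clause = ENCODE_CLAUSE.search(column.group("rest"))
        encodings[column.group("name")] = clause.group("encoding") if clause else None
    return encodings
```

Columns that map to None have no explicit encoding in the DDL, which is what you would expect to see when the AUTO setting is not active.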

Deploy Notes

If relation_column_encoding is set to AUTO in a local configuration, that override can now be removed. If automatic encoding is not desired, relation_column_encoding must now be set explicitly to OFF.
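
For setups that need to opt out, an override could be generated along these lines. This is a sketch under assumptions: it assumes local configuration accepts JSON whose keys are nested like the dotted setting name arthur_settings.redshift.relation_column_encoding, and the helper name is hypothetical. The allowed values are just the three mentioned in this pull request.

```python
import json


def encoding_override(mode):
    """Return JSON text that sets relation_column_encoding to the given mode."""
    allowed = {"ON", "OFF", "AUTO"}  # values mentioned in this pull request
    if mode not in allowed:
        raise ValueError(f"unexpected mode: {mode!r}")
    settings = {"arthur_settings": {"redshift": {"relation_column_encoding": mode}}}
    return json.dumps(settings, indent=2)


print(encoding_override("OFF"))
```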

Harry's internal note: This has been effectively in place in production given that we have a file in the object store with this content:

{
  "arthur_settings": {
    "redshift": {
      "relation_column_encoding": "AUTO"
    }
  }
}

(This file should now be removed.)

@github-actions github-actions bot added the python Pull requests that update Python code label Apr 7, 2022
@thomas-vogels thomas-vogels force-pushed the tom/make-auto-encoding-default branch from 2341c63 to c191f30 Compare April 7, 2022 13:16
@thomas-vogels thomas-vogels marked this pull request as ready for review April 7, 2022 13:27

@ynaim94 ynaim94 left a comment

LGTM

@thomas-vogels thomas-vogels merged commit 2032a0d into next Apr 18, 2022
@thomas-vogels thomas-vogels deleted the tom/make-auto-encoding-default branch April 18, 2022 16:21
Labels
component: load, feature, python
Development

Successfully merging this pull request may close these issues.

Add default encodings