Redshift Specific Notes
Choosing the distribution style and distribution key (Distkey) well has a significant impact on query performance in AWS Redshift, because it determines how a table's rows are spread across the cluster. Tables with fewer than 10,000 records benefit from the 'DISTSTYLE ALL' table attribute. For tables exceeding 10,000 records, use 'DISTSTYLE KEY' instead. For a Hub or a Satellite, use the Hash Key as the Distkey. For a Link, use the Hash Key of the referenced Hub with the highest cardinality as the Distkey. See the dbt documentation for how to implement this.
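The rules above can be sketched as plain Redshift DDL. This is only an illustration: all table and column names (ref_country, customer_h, customer_s, customer_store_l, hk_customer_h, and so on) are hypothetical, the CHAR(32) hash columns assume MD5 hashing, and the link example assumes that the customer hub has a higher cardinality than the store hub.

```sql
-- Small table (fewer than 10,000 records): keep a full copy on every node.
CREATE TABLE ref_country (
    country_code CHAR(2)      NOT NULL,
    country_name VARCHAR(100) NOT NULL
)
DISTSTYLE ALL;

-- Hub: distribute on its own hash key.
CREATE TABLE customer_h (
    hk_customer_h CHAR(32)     NOT NULL,  -- assumed MD5 hash key
    customer_id   VARCHAR(50)  NOT NULL,
    ldts          TIMESTAMP    NOT NULL,
    rsrc          VARCHAR(255) NOT NULL
)
DISTSTYLE KEY
DISTKEY (hk_customer_h);

-- Satellite: distribute on the hash key of its parent hub.
CREATE TABLE customer_s (
    hk_customer_h CHAR(32)     NOT NULL,
    hd_customer_s CHAR(32)     NOT NULL,  -- assumed hashdiff column
    customer_name VARCHAR(200),
    ldts          TIMESTAMP    NOT NULL,
    rsrc          VARCHAR(255) NOT NULL
)
DISTSTYLE KEY
DISTKEY (hk_customer_h);

-- Link: distribute on the hash key of the referenced hub with the
-- highest cardinality (assumed here to be the customer hub).
CREATE TABLE customer_store_l (
    hk_customer_store_l CHAR(32)     NOT NULL,
    hk_customer_h       CHAR(32)     NOT NULL,
    hk_store_h          CHAR(32)     NOT NULL,
    ldts                TIMESTAMP    NOT NULL,
    rsrc                VARCHAR(255) NOT NULL
)
DISTSTYLE KEY
DISTKEY (hk_customer_h);
```

In a dbt project on Redshift, these attributes are typically set through the model's dist config (for example dist='all' or dist='hk_customer_h') rather than hand-written DDL.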
To counter AWS Redshift's poor join performance, avoid unnecessary satellite splits. Fewer satellites mean fewer joins per query, which directly improves the overall performance of the Data Vault.