Skip to content

aws-samples/sample-redshift-serverless-database-migration

Migrate Tenants Between Redshift Clusters and Redshift Serverless Workgroups

An automated AWS-native solution for migrating tenants between Amazon Redshift environments (Provisioned Clusters and Serverless Workgroups). This tool orchestrates complete schema, data, and permission migrations while minimizing downtime and ensuring data integrity.

Overview

Purpose

This migration solution provides:

  • Complete tenant migration of schemas and database objects between Redshift environments
  • Automatic endpoint repointing to redirect tenant connections to new clusters/workgroups
  • ETL pipeline integration for continuous data loading to new endpoints
  • Minimal downtime with automated orchestration and parallel processing
  • Data integrity through validation and error handling mechanisms

Key Features

Schema Migration: Tables, views, materialized views, stored procedures
User & Permission Migration: Users, groups, roles, and granular permissions
Data Migration: Parallel UNLOAD/COPY operations with Spark optimization
Automated Orchestration: AWS Step Functions workflow coordination
Security: KMS encryption, IAM roles, Secrets Manager integration
Monitoring: CloudWatch logs with detailed execution tracking
Error Handling: Automatic retries and graceful failure recovery

Intended Audience

This migration template is intended for:

  • Database Administrators (DBAs) responsible for managing Redshift clusters and serverless workgroups
  • Database Owners who oversee schema design, user access, and data integrity
  • Cloud Infrastructure Engineers managing AWS environments and networking
  • Platform / DevOps Teams responsible for provisioning and orchestrating cloud resources

⚠️ This template should not be used by application developers or end users without DBA oversight. It performs privileged operations including user creation, permission grants, DDL migration, and system table access.

Credential Requirements

The secrets stored in AWS Secrets Manager for both source and target Redshift environments must use master-level (superuser) credentials. This is required because the migration process needs to:

  • Access Redshift system tables (pg_user_info, svv_all_schemas, pg_catalog, etc.) to extract users, groups, and permissions
  • Create and alter users and groups on the target cluster/workgroup
  • Execute GRANT and REVOKE statements across schemas
  • Run UNLOAD and COPY commands with IAM role association
  • Create and modify database objects (tables, views, stored procedures) across tenant schemas

Using non-superuser credentials will result in incomplete migrations — system table queries will return partial results and permission/user operations will fail silently or with errors.

Architecture

High-Level Architecture

High-Level Architecture

Low-Level Architecture

Components

┌─────────────────────┐
│  CloudFormation     │  One-click deployment
│  Template           │  Infrastructure as Code
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  AWS Step Functions │  Orchestration Layer
│  State Machine      │  - CreateSchemaObjects
└──────────┬──────────┘  - FetchTables (Lambda)
           │              - DataMigration (Map State)
           │              - RefreshViews
           ▼
┌─────────────────────┐
│  AWS Glue Jobs      │  Execution Layer
├─────────────────────┤
│ 1. Schema Migration │  Python Shell Job
│ 2. Data Migration   │  Spark Job (Parallel)
│ 3. View Refresh     │  Python Shell Job
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Shell Scripts      │  Migration Logic
├─────────────────────┤
│ 01_create_users_    │  Users & Groups
│    groups.sh        │
│ 02_migrate_ddl.sh   │  Tables & Schemas
│ 03_migrate_         │  Permissions
│    permissions.sh   │
│ 04_migrate_views.sh │  Views & Procedures
│ 05_refresh_         │  Materialized Views
│    materialized_    │
│    views.sh         │
└─────────────────────┘
           │
           ▼
┌─────────────────────┐
│  Supporting         │
│  Services           │
├─────────────────────┤
│ • S3 Buckets        │  Script & Data Storage
│ • Secrets Manager   │  Credentials
│ • IAM Roles         │  Access Control
│ • KMS Keys          │  Encryption
│ • CloudWatch Logs   │  Monitoring
│ • Lambda Functions  │  Table Discovery
└─────────────────────┘

Workflow Execution

Step 1: CreateSchemaObjects (Glue Python Shell)
   ├─ Load credentials from Secrets Manager
   ├─ Execute 01_create_users_groups.sh
   ├─ Execute 02_migrate_ddl.sh
   ├─ Execute 03_migrate_permissions.sh
   └─ Execute 04_migrate_views.sh

Step 2: Lambda Invoke (FetchRedshiftTables)
   └─ Query source database for table list

Step 3: Map State (Parallel Data Migration)
   ├─ Process tables in parallel (max 10 concurrent)
   ├─ Each table: Glue Spark Job
   │  ├─ UNLOAD from source to S3
   │  └─ COPY from S3 to target
   └─ Tolerate 100% failures (continue on error)

Step 4: RefreshMaterializedViews (Glue Python Shell)
   └─ Execute 05_refresh_materialized_views.sh

Prerequisites

AWS Account Requirements

  • AWS CLI configured with appropriate permissions
  • CloudFormation deployment permissions
  • VPC with private subnets for Glue jobs
  • Existing S3 buckets for scripts and data storage

IAM Permissions for Deploying User

The user or role deploying this CloudFormation stack requires permissions across multiple AWS services. A ready-to-use IAM policy is provided at cloudformation/deployment-iam-policy.json.

To create and attach this policy:

# Create the IAM policy
aws iam create-policy \
  --policy-name RedshiftMigrationDeploymentPolicy \
  --policy-document file://cloudformation/deployment-iam-policy.json

# Attach to your IAM user or role
aws iam attach-user-policy \
  --user-name <your-iam-username> \
  --policy-arn arn:aws:iam::<account-id>:policy/RedshiftMigrationDeploymentPolicy

The policy covers: CloudFormation stack operations, IAM role/policy creation, KMS key management, Glue jobs and connections, Lambda functions and layers, Step Functions state machines, CloudWatch log groups, EC2 networking (VPC endpoints, security groups), SSM parameter reads, and Secrets Manager access.

Network Requirements

  • VPC: Private subnets with NAT Gateway or VPC Endpoints
  • S3 Gateway Endpoint: Required for Redshift UNLOAD/COPY operations
  • Security Groups: Allow Redshift port (5439) access
  • Route Tables: Properly configured for S3 endpoint

Redshift Requirements

  • Source: Redshift Provisioned Cluster or Serverless Workgroup
  • Target: Redshift Provisioned Cluster or Serverless Workgroup
  • Credentials: Stored in AWS Secrets Manager with required format
  • IAM Roles: Associated with target workgroup for S3 access

Deployment

Step 1: Pre-Deployment Setup

1.1 Create S3 Gateway Endpoint (if not exists)

# Via AWS Console
AWS Console > VPC > Endpoints > Create Endpoint
- Service: com.amazonaws.<region>.s3 (Gateway type)
- VPC: Select VPC where Redshift is located
- Route Tables: Select route tables for Redshift subnets

# Via AWS CLI
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxxx \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-xxxxxxxx rtb-yyyyyyyy

1.2 Create Secrets Manager Secrets

Secrets must contain the following key-value pairs:

{
  "host": "cluster-name.region.redshift.amazonaws.com",
  "port": "5439",
  "database": "database_name",
  "username": "admin_user",
  "password": "secure_password"
}

Create secrets:

# Source cluster/workgroup secret
aws secretsmanager create-secret \
  --name "redshift-source-cluster" \
  --description "Source Redshift credentials" \
  --secret-string '{
    "host": "source-cluster.us-east-1.redshift.amazonaws.com",
    "port": "5439",
    "database": "dev",
    "username": "admin",
    "password": "YourSecurePassword123!"
  }'

# Target cluster/workgroup secret (Producer)
aws secretsmanager create-secret \
  --name "redshift-target-producer" \
  --description "Target Redshift producer credentials" \
  --secret-string '{
    "host": "target-workgroup.us-east-1.redshift-serverless.amazonaws.com",
    "port": "5439",
    "database": "dev",
    "username": "admin",
    "password": "YourSecurePassword456!"
  }'

# Consumer secret (optional, for reader users)
aws secretsmanager create-secret \
  --name "redshift-target-consumer" \
  --description "Target Redshift consumer credentials" \
  --secret-string '{
    "host": "target-workgroup.us-east-1.redshift-serverless.amazonaws.com",
    "port": "5439",
    "database": "dev",
    "username": "reader",
    "password": "YourSecurePassword789!"
  }'

1.3 Create SSM Parameters

# VPC ID
aws ssm put-parameter \
  --name "/migration/VpcId" \
  --value "vpc-xxxxxxxxx" \
  --type "String"

# Private Subnets (comma-separated)
aws ssm put-parameter \
  --name "/migration/Subnet" \
  --value "subnet-xxxxxxxx,subnet-yyyyyyyy" \
  --type "StringList"

# Security Group ID
aws ssm put-parameter \
  --name "/migration/SecurityGroupId" \
  --value "sg-xxxxxxxxx" \
  --type "String"

1.4 Upload Migration Scripts to S3

# Create S3 bucket structure
aws s3 mb s3://your-migration-bucket

# Upload all files maintaining folder structure
cd /path/to/redshift-migration
aws s3 sync glue-jobs/ s3://your-migration-bucket/redshift-migrate/glue-jobs/
aws s3 sync shell-scripts/ s3://your-migration-bucket/redshift-migrate/shell-scripts/
aws s3 sync sql/ s3://your-migration-bucket/redshift-migrate/sql/

# Verify structure
aws s3 ls s3://your-migration-bucket/redshift-migrate/ --recursive

Expected S3 structure:

s3://your-migration-bucket/
└── redshift-migrate/
    ├── glue-jobs/
    │   ├── redshift_schema_migration_job.py
    │   ├── redshift-data-migration-spark-job.py
    │   └── refresh_views_glue_job.py
    ├── shell-scripts/
    │   ├── 01_create_users_groups.sh
    │   ├── 02_migrate_ddl.sh
    │   ├── 03_migrate_permissions.sh
    │   ├── 04_migrate_views.sh
    │   ├── 05_refresh_materialized_views.sh
    │   ├── common.sh
    │   ├── load_secrets.sh
    │   ├── pgpass.sh
    │   └── migrate.sh
    ├── sql/
    │   └── get_table_ddl.sql
    ├── lambda/
    │   └── fetch_redshift_tables_list.zip
    ├── lambda-layers/
    │   └── redshift_connector.zip
    └── libraries/
        ├── glue-psycopg2-dependencies.zip
        └── psycopg2_binary-2.9.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

1.5 Create Data Bucket with Folders

# Create data bucket
aws s3 mb s3://your-data-bucket

# Create required folders
aws s3api put-object --bucket your-data-bucket --key redshift-migrate/unload/
aws s3api put-object --bucket your-data-bucket --key redshift-migrate/temporary/

Step 2: Deploy CloudFormation Stack

Option A: AWS Console Deployment

  1. Navigate to AWS CloudFormation Console
  2. Click Create stackWith new resources (standard)
  3. Upload template: redshift_tenant_migration.yaml
  4. Configure parameters (see table below)
  5. Acknowledge IAM resource creation
  6. Click Create stack

Option B: AWS CLI Deployment

aws cloudformation create-stack \
  --stack-name redshift-tenant-migration \
  --template-body file://redshift_tenant_migration.yaml \
  --parameters \
    ParameterKey=BucketName,ParameterValue=your-migration-bucket \
    ParameterKey=DataBucketName,ParameterValue=your-data-bucket \
    ParameterKey=SourceSecretArn,ParameterValue=arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-source-cluster-abc123 \
    ParameterKey=TargetSecretArn,ParameterValue=arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-target-producer-def456 \
    ParameterKey=Schema,ParameterValue="('tenant_schema')" \
    ParameterKey=TenantName,ParameterValue=tenant_name \
    ParameterKey=TenantStackName,ParameterValue=tenant-cloudformation-stack \
    ParameterKey=SecurityGroupSSM,ParameterValue=/migration/SecurityGroupId \
    ParameterKey=SubnetSSM,ParameterValue=/migration/Subnet \
    ParameterKey=VpcSSM,ParameterValue=/migration/VpcId \
    ParameterKey=Region,ParameterValue=us-east-1 \
  --capabilities CAPABILITY_IAM

# Monitor stack creation
aws cloudformation wait stack-create-complete \
  --stack-name redshift-tenant-migration

# Get outputs
aws cloudformation describe-stacks \
  --stack-name redshift-tenant-migration \
  --query 'Stacks[0].Outputs'

CloudFormation Parameters

Parameter Description Required Example
BucketName S3 bucket for code artifacts Yes my-migration-bucket
DataBucketName S3 bucket for Redshift data (with unload/temporary folders) Yes my-data-bucket
SourceSecretArn ARN of source cluster credentials Yes arn:aws:secretsmanager:...
TargetSecretArn ARN of target cluster credentials Yes arn:aws:secretsmanager:...
Schema Schema name in SQL format Yes "('sales_data')"
TenantName Name of the tenant Yes acme_corp
TenantStackName CloudFormation stack name of tenant Yes acme-corp-stack
Region AWS Region Yes us-east-1
SecurityGroupSSM SSM parameter for security group ID Yes /migration/SecurityGroupId
SubnetSSM SSM parameter for subnet IDs Yes /migration/Subnet
VpcSSM SSM parameter for VPC ID Yes /migration/VpcId
RedshiftServerlessPort Redshift port No 5439 (default)

Execution

Start Migration via Step Functions

Option A: AWS Console

  1. Navigate to AWS Step Functions Console
  2. Select MigrationStateMachine
  3. Click Start execution
  4. Provide input JSON:
{
  "SCHEMAS": "('tenant_schema')",
  "SOURCE_SECRET_ARN": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-source-cluster-abc123",
  "TARGET_SECRET_ARN": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-target-producer-def456",
  "TENANT_NAME": "tenant_name",
  "CONSUMER_SECRET_ARN": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-target-consumer-ghi789",
  "TENANT_STACK_NAME": "tenant-cloudformation-stack"
}
  1. Click Start execution

Option B: AWS CLI

aws stepfunctions start-execution \
  --state-machine-arn "arn:aws:states:us-east-1:123456789012:stateMachine:MigrationStateMachine" \
  --name "migration-$(date +%Y%m%d-%H%M%S)" \
  --input '{
    "SCHEMAS": "('\''tenant_schema'\'')",
    "SOURCE_SECRET_ARN": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-source-cluster-abc123",
    "TARGET_SECRET_ARN": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-target-producer-def456",
    "TENANT_NAME": "tenant_name",
    "CONSUMER_SECRET_ARN": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-target-consumer-ghi789",
    "TENANT_STACK_NAME": "tenant-cloudformation-stack"
  }'

Monitor Execution

# Get execution status
aws stepfunctions describe-execution \
  --execution-arn "arn:aws:states:us-east-1:123456789012:execution:MigrationStateMachine:migration-20260202-120000"

# Get execution history
aws stepfunctions get-execution-history \
  --execution-arn "arn:aws:states:us-east-1:123456789012:execution:MigrationStateMachine:migration-20260202-120000" \
  --max-results 100

Migration Process Details

Phase 1: Create Schema Objects (Glue Python Shell Job)

Duration: 5-30 minutes (depending on schema complexity)

Tasks:

  1. Load Secrets: Retrieve credentials from Secrets Manager
  2. Create Users & Groups (01_create_users_groups.sh):
    • Extract users from pg_user_info (Serverless compatible)
    • Create users with proper attributes (CREATEDB, CREATEUSER, etc.)
    • Create groups and assign memberships
  3. Migrate DDL (02_migrate_ddl.sh):
    • Create schemas
    • Extract table DDL using get_table_ddl.sql
    • Create tables with proper data types and constraints
    • Handle data type conversions (Provisioned → Serverless)
  4. Migrate Permissions (03_migrate_permissions.sh):
    • Schema-level grants
    • Table-level permissions
    • Column-level permissions
    • Default privileges
  5. Migrate Views (04_migrate_views.sh):
    • Standard views
    • Stored procedures and functions
    • Handle view dependencies

Logs: /aws-glue/jobs/output and /aws-glue/jobs/error

Phase 2: Fetch Tables (Lambda Function)

Duration: < 1 minute

Tasks:

  • Query source database for table list in specified schemas
  • Exclude materialized view backing tables (mv_tbl__%)
  • Return array of table names for parallel processing

Logs: /aws/lambda/FetchRedshiftTablesFunction

Phase 3: Data Migration (Distributed Map State)

Duration: Variable (depends on data volume)

Configuration:

  • Max Concurrency: 10 tables in parallel
  • Tolerated Failure: 100% (continues even if tables fail)
  • Execution Type: STANDARD (distributed processing)

Per-Table Process (Glue Spark Job):

  1. Associate IAM Role: Attach S3 access role to target workgroup
  2. UNLOAD: Export table data from source to S3
    UNLOAD ('SELECT * FROM schema.table')
    TO 's3://bucket/unload/schema/table/'
    IAM_ROLE 'arn:aws:iam::account:role/RedshiftS3Role'
    PARALLEL ON
    GZIP
    ALLOWOVERWRITE
  3. COPY: Import data from S3 to target
    COPY schema.table
    FROM 's3://bucket/unload/schema/table/'
    IAM_ROLE 'arn:aws:iam::account:role/RedshiftS3Role'
    GZIP
    COMPUPDATE ON
    STATUPDATE ON

Error Handling:

  • Failed tables logged but don't stop execution
  • Review Map State results for failed tables
  • Retry individual tables if needed

Logs: /aws-glue/jobs/output (per table execution)

Phase 4: Refresh Materialized Views (Glue Python Shell Job)

Duration: 5-60 minutes (depends on view complexity)

Tasks:

  1. Execute 05_refresh_materialized_views.sh
  2. Identify materialized views in target schemas
  3. Refresh views in dependency order
  4. Update statistics

Logs: /aws-glue/jobs/output and /aws-glue/jobs/error

Monitoring & Troubleshooting

CloudWatch Logs

Component Log Group Log Stream
Step Functions /aws/stepfunctions/<stack-name> Execution ID
Schema Migration Job /aws-glue/jobs/output Job run ID
Data Migration Job /aws-glue/jobs/output Job run ID (per table)
View Refresh Job /aws-glue/jobs/output Job run ID
Lambda Function /aws/lambda/FetchRedshiftTablesFunction Date-based

Troubleshooting Steps

1. Step Function Failures

# View failed step details
AWS Console > Step Functions > Select execution > Click failed step

# Check associated resource (Glue/Lambda) in right pane
# Navigate to that resource's logs

2. Glue Job Failures

# Via Console
AWS Console > Glue > Jobs > Select job > Runs tab > View logs

# Via CLI
aws glue get-job-runs --job-name <job-name> --max-results 10

# View specific run logs
aws logs tail /aws-glue/jobs/output --follow \
  --log-stream-name-prefix <job-run-id>

3. Lambda Function Errors

# Via Console
AWS Console > Lambda > Select function > Monitor > View CloudWatch Logs

# Via CLI
aws logs tail /aws/lambda/FetchRedshiftTablesFunction --follow

4. Map State (Parallel Data Migration) Errors

# Via Console
Step Functions > Select execution > Click Map State > Click "Map Run" (top right)

# View individual table execution results
# Check failed tables in execution details

5. Data Migration Errors (UNLOAD/COPY)

# Check Glue Spark job logs
aws logs filter-log-events \
  --log-group-name /aws-glue/jobs/error \
  --filter-pattern "ERROR"

# Common issues:
# - IAM role not associated with workgroup
# - S3 permissions issues
# - Network connectivity (VPC endpoint)
# - Table doesn't exist in source

Common Issues & Solutions

Issue Cause Solution
Connection timeout Security group or network issue Verify security group allows port 5439, check VPC endpoint
IAM role not found Role not associated with workgroup Glue job automatically associates role; check CloudWatch logs
S3 access denied Missing S3 permissions Verify IAM role has S3 read/write permissions
Table already exists Previous partial migration Scripts skip existing objects; safe to re-run
User creation fails User already exists Scripts check existence before creation
View dependency error Views created out of order 04_migrate_views.sh handles dependencies automatically
Materialized view refresh fails Base tables not migrated Ensure Phase 3 completed successfully

Retry Failed Executions

If a Step Function execution fails:

  1. Review logs to identify root cause
  2. Fix the issue (permissions, network, etc.)
  3. Restart execution with same input parameters
aws stepfunctions start-execution \
  --state-machine-arn "arn:aws:states:region:account:stateMachine:MigrationStateMachine" \
  --name "migration-retry-$(date +%Y%m%d-%H%M%S)" \
  --input '<same-input-json-as-before>'

Scripts are idempotent - they check for existing objects and skip creation if already present.

Retry Individual Failed Tables

After the data migration (Map State) completes, a Lambda function (CollectFailedTables) automatically generates a migration report and retry input file in S3. If more than 20% of tables fail, the Step Function stops and provides the report location.

1. Check the Migration Report

The report is saved to S3 at:

s3://<data-bucket>/migration-reports/<tenant>/<timestamp>/migration_report.json

Download and review it:

aws s3 cp s3://<data-bucket>/migration-reports/<tenant>/<timestamp>/migration_report.json .
cat migration_report.json

The report contains:

{
  "timestamp": "2026-03-18T04-30-00",
  "tenant": "sales_data",
  "total_tables": 10,
  "succeeded_count": 8,
  "failed_count": 2,
  "succeeded_tables": ["customers", "products", ...],
  "failed_tables": [
    {
      "table": "orders",
      "schema": "sales_data",
      "error": "TableMigrationFailed",
      "cause": "COPY failed: date conversion error"
    }
  ]
}

2. Download the Retry Input File

A retry payload is automatically generated alongside the report:

s3://<data-bucket>/migration-reports/<tenant>/<timestamp>/retry_failed_tables.json
aws s3 cp s3://<data-bucket>/migration-reports/<tenant>/<timestamp>/retry_failed_tables.json .
cat retry_failed_tables.json

The retry file contains:

{
  "tables": [
    {
      "SCHEMAS": "sales_data",
      "TABLE_NAME": "orders",
      "SOURCE_SECRET_ARN": "arn:aws:secretsmanager:...:secret:source-xxx",
      "TARGET_SECRET_ARN": "arn:aws:secretsmanager:...:secret:target-yyy"
    }
  ]
}

3. Retry a Single Table

To retry a specific failed table, start the data migration Glue job directly with the table name:

aws glue start-job-run \
  --job-name <data-migration-glue-job-name> \
  --arguments '{
    "--SCHEMAS": "sales_data",
    "--TABLE_NAME": "orders",
    "--S3_BUCKET": "s3://<data-bucket>/unload",
    "--IAM_ROLE": "arn:aws:iam::<account-id>:role/<redshift-access-role>",
    "--SOURCE_SECRET_ARN": "arn:aws:secretsmanager:...:secret:source-xxx",
    "--TARGET_SECRET_ARN": "arn:aws:secretsmanager:...:secret:target-yyy"
  }'

4. Retry All Failed Tables

To retry all failed tables from the retry file, run each entry:

# Parse retry file and start a Glue job for each failed table
for table in $(cat retry_failed_tables.json | python3 -c "
import sys, json
data = json.load(sys.stdin)
for t in data['tables']:
    print(t['TABLE_NAME'])
"); do
  echo "Retrying table: $table"
  aws glue start-job-run \
    --job-name <data-migration-glue-job-name> \
    --arguments "{
      \"--SCHEMAS\": \"sales_data\",
      \"--TABLE_NAME\": \"$table\",
      \"--S3_BUCKET\": \"s3://<data-bucket>/unload\",
      \"--IAM_ROLE\": \"arn:aws:iam::<account-id>:role/<redshift-access-role>\",
      \"--SOURCE_SECRET_ARN\": \"arn:aws:secretsmanager:...:secret:source-xxx\",
      \"--TARGET_SECRET_ARN\": \"arn:aws:secretsmanager:...:secret:target-yyy\"
    }"
done

⚠️ Replace <data-migration-glue-job-name>, bucket names, IAM role ARN, and secret ARNs with your actual values from the CloudFormation stack outputs.

Post-Migration Activities

User Password Reset (Required)

During migration, users are created on the target cluster with a randomly generated temporary password. The DBA must reset passwords for all migrated users after the migration completes.

Option 1: Manually reset each user's password

-- Reset password for a specific user
ALTER USER "username" PASSWORD 'new_secure_password';

-- List all migrated users to identify who needs a reset
SELECT usename FROM pg_user_info WHERE usename LIKE '%<tenant_name>%';

Option 2: Set a short-lived password to force a reset

Redshift does not support a native "force change on next login" flag. Instead, set a near-future expiry so users must contact the DBA for a new password:

ALTER USER "username" PASSWORD 'temp_password' VALID UNTIL '2026-03-15';

Option 3: Migrate password hashes from source (preserves original passwords)

Extract the MD5 password hash from the source cluster and apply it on the target. This lets users keep their existing passwords with no disruption:

-- On source: extract the password hash
SELECT usename, passwd FROM pg_catalog.pg_shadow WHERE usename = 'username';

-- On target: set the hash directly (use the md5 hash value from source)
ALTER USER "username" PASSWORD 'md5<hash_from_source>';

⚠️ Whichever option you choose, ensure passwords are communicated securely and comply with your organization's credential management policies.

Key Differences: Provisioned vs Serverless

System Tables

Provisioned Serverless Usage
svl_user_info pg_user_info User information
svv_all_schemas svv_all_schemas Schema listing

Note: Scripts automatically use Serverless-compatible system tables (pg_user_info).

IAM Role Association

  • Provisioned: Roles attached at cluster level
  • Serverless: Roles attached at workgroup level
  • Migration: Glue Spark job automatically associates roles with target workgroup

Security Features

Encryption

  • KMS Keys: Customer-managed keys for Glue jobs and CloudWatch logs
  • Secrets Manager: Encrypted credential storage
  • S3: Server-side encryption for data at rest
  • CloudWatch Logs: Encrypted with KMS

Network Security

  • VPC Isolation: All Glue jobs run in private subnets
  • Security Groups: Restrictive inbound/outbound rules
  • S3 Gateway Endpoint: Private connectivity to S3
  • No Internet Access: Glue jobs don't require internet connectivity

Access Control

  • IAM Roles: Principle of least privilege
  • Resource Policies: Fine-grained access control
  • Service-Linked Roles: Secure service-to-service communication
  • Secrets Manager Policies: Restricted secret access

Cost Optimization

Resource Usage

Resource Pricing Model Optimization
AWS Glue Per-second billing (Python Shell: 0.44 DPU, Spark: configurable) Use Python Shell for lightweight tasks
Step Functions Per state transition Minimize state transitions
Lambda Per invocation + duration Single invocation for table discovery
CloudWatch Logs Per GB ingested + storage 30-day retention policy
S3 Per GB stored + requests Lifecycle policies for temporary data

Best Practices

  • Parallel Processing: Leverage Map State concurrency (max 10)
  • Glue DPU: Right-size Spark job DPUs based on data volume
  • S3 Lifecycle: Delete temporary UNLOAD data after migration
  • Log Retention: Adjust CloudWatch retention based on compliance needs

Cleanup

Delete CloudFormation Stack

# Delete stack (removes all resources)
aws cloudformation delete-stack --stack-name redshift-tenant-migration

# Wait for deletion
aws cloudformation wait stack-delete-complete \
  --stack-name redshift-tenant-migration

Clean Up S3 Data

# Remove migration scripts
aws s3 rm s3://your-migration-bucket/redshift-migrate/ --recursive

# Remove temporary data
aws s3 rm s3://your-data-bucket/redshift-migrate/unload/ --recursive
aws s3 rm s3://your-data-bucket/redshift-migrate/temporary/ --recursive

Remove Secrets and Parameters

# Delete secrets (30-day recovery window)
aws secretsmanager delete-secret \
  --secret-id redshift-source-cluster \
  --recovery-window-in-days 30

aws secretsmanager delete-secret \
  --secret-id redshift-target-producer \
  --recovery-window-in-days 30

# Delete SSM parameters
aws ssm delete-parameter --name /migration/VpcId
aws ssm delete-parameter --name /migration/Subnet
aws ssm delete-parameter --name /migration/SecurityGroupId

File Structure

redshift-migration/
├── cloudformation/
│   └── redshift_tenant_migration.yaml    # CloudFormation template
├── glue-jobs/
│   ├── redshift_schema_migration_job.py  # Schema migration orchestrator
│   ├── redshift-data-migration-spark-job.py  # Data migration (UNLOAD/COPY)
│   └── refresh_views_glue_job.py         # Materialized view refresh
├── shell-scripts/
│   ├── migrate.sh                        # Main orchestration script
│   ├── common.sh                         # Shared functions
│   ├── load_secrets.sh                   # Secrets Manager integration
│   ├── pgpass.sh                         # PostgreSQL password file
│   ├── 01_create_users_groups.sh         # User/group creation
│   ├── 02_migrate_ddl.sh                 # Table DDL migration
│   ├── 03_migrate_permissions.sh         # Permission migration
│   ├── 04_migrate_views.sh               # View/procedure migration
│   └── 05_refresh_materialized_views.sh  # Materialized view refresh
├── sql/
│   └── get_table_ddl.sql                 # DDL extraction queries
├── requirements.txt                       # Python dependencies
└── README.md                              # This file

Support & Contributing

Getting Help

  1. CloudWatch Logs: Check detailed execution logs
  2. Step Function History: Review execution flow
  3. AWS Support: Contact AWS Support for infrastructure issues

Contributing

When contributing:

  1. Follow AWS security best practices
  2. Update CloudFormation templates for new features
  3. Add comprehensive error handling and logging
  4. Test in isolated environments before production
  5. Update documentation for any changes

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments


Version: 1.0
Last Updated: February 2026
Maintained By: AWS Migration Team

About

No description, website, or topics provided.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors