# PostgreSQL to Aurora DSQL Migration Guide

<div style="background-color: #f8f9fa; border: 1px solid #e9ecef; border-radius: 8px; padding: 10px; margin: 10px;">
<strong>📋 Workshop Contents</strong>
<ul style="line-height: 1.2;">
<li><a href="#Introduction">Introduction</a></li>
<li><a href="#Migration-Options">Migration Options</a></li>
<li><a href="#Option-1-AWS-Glue-Migration-Path">Option 1: AWS Glue Migration Path</a></li>
<li><a href="#Option-2-DMS-with-Kinesis-Integration">Option 2: DMS with Kinesis Integration</a></li>
<li><a href="#Migration-Approach-Comparison">Migration Approach Comparison</a></li>
<li><a href="#Selecting-the-appropriate-migration-path">Selecting the appropriate migration path</a></li>
<li><a href="#Evaluating-Aurora-DSQL-for-Your-Workloads">Evaluating Aurora DSQL for Your Workloads</a></li>
</ul>
</div>

## Introduction

As organizations scale their operations, they often run into the limits of traditional database architectures, particularly when they need a multi-region presence with strong consistency. Amazon Aurora DSQL addresses these challenges by providing a distributed SQL database that offers:

- Active-active multi-region capability
- Automatic and continuous scaling of reads and writes
- Strong consistency across all regions
- PostgreSQL compatibility
- Serverless infrastructure
- Built-in fault tolerance
- Independent scaling of compute and storage

For organizations currently using RDS PostgreSQL, Aurora PostgreSQL, or self-hosted PostgreSQL databases, migrating to Aurora DSQL can provide these advanced capabilities while maintaining PostgreSQL compatibility. This enables you to scale your database operations globally without compromising on consistency or performance.

### Why Consider Aurora DSQL?

- **Global Scale**: Automatically scales reads, writes, compute, and storage independently
- **High Availability**: 99.99% availability for single-region clusters (replicated across 3 AZs) and 99.999% for multi-region deployments with region-level fault tolerance
- **Active-Active Architecture**: Both single-region and multi-region clusters are active-active; a multi-region cluster exposes two regional endpoints, and each accepts reads and writes
- **Strong Consistency**: Reads and writes are strongly consistent across all regional endpoints; a witness region participates in the write quorum and helps handle network partitions
- **Simplified Management**: Eliminates the need for manual sharding or complex replication setups
- **Cost Efficiency**: Pay-as-you-go pricing model with serverless infrastructure
- **Security**: Built-in encryption, authentication, and authorization capabilities

This guide outlines the available options for migrating from PostgreSQL (RDS PostgreSQL, Aurora PostgreSQL, or self-hosted PostgreSQL) to Amazon Aurora DSQL. Each migration path has its own characteristics, benefits, and considerations that should be carefully evaluated based on your specific requirements.

<div style="padding: 15px; background-color: #e6f7e6; border-left: 5px solid #28a745; margin-bottom: 10px;">
<strong>Advanced Application:</strong> You can also enhance the serverless web application you built in <a href="../../3_Building_Your_First_Serverless_Web_App_with_Aurora/">Module 3: Building Your First Serverless Web App with Aurora</a> by replacing the Aurora database with Aurora DSQL. This upgrade would transform your application into a multi-region resilient system that can withstand regional failures while maintaining data consistency across regions. Aurora DSQL's active-active architecture ensures your application remains available even if an entire AWS region experiences an outage.
</div>


## Migration Options

> **Note:** If you're interested in creating an Aurora DSQL cluster before proceeding with migration, follow the instructions in [Create a multi-region Aurora DSQL cluster](./5.3.3_Create_Aurora-DSQL.ipynb). Once you have created a cluster, you can return to this section and continue with the migration options. If you're only reading through the material, you can skip the DSQL creation steps and proceed directly with the migration options.

### Option 1: AWS Glue Migration Path


AWS Glue provides a serverless ETL (Extract, Transform, Load) solution for migrating your PostgreSQL database to Aurora DSQL. This approach leverages AWS's managed services to handle large-scale data migrations with minimal infrastructure management.

![rds-dsql-migration-1.png](../images/5.3-rds-to-dsql-migration-step-1.png)

#### How It Works
The migration process follows these steps:
1. A snapshot of the source RDS or Aurora PostgreSQL database is exported to Amazon S3 in Parquet format
2. AWS Glue crawlers scan the Parquet files to create a schema in the Glue Data Catalog
3. A PySpark-based Glue ETL job processes the data and loads it into Aurora DSQL
4. Data type conversions and transformations are handled within the Glue job
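
Step 1 above can be started programmatically. The following is a minimal sketch, assuming the export is requested through boto3's `start_export_task` call on the RDS client; the helper name `build_export_request` and all ARNs, bucket names, and identifiers are hypothetical placeholders, not values from this workshop.

```python
from typing import Any, Dict, Optional, Sequence


def build_export_request(
    snapshot_arn: str,
    bucket: str,
    role_arn: str,
    kms_key_id: str,
    task_id: str,
    tables: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the RDS start_export_task API call.

    Hypothetical helper: it only builds the request so it can be reviewed
    (or unit tested) before anything is sent to AWS.
    """
    request: Dict[str, Any] = {
        "ExportTaskIdentifier": task_id,
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        "IamRoleArn": role_arn,
        "KmsKeyId": kms_key_id,
    }
    if tables:
        # ExportOnly narrows the export to specific databases, schemas, or tables.
        request["ExportOnly"] = list(tables)
    return request


# Sending the request needs AWS credentials, so it is left commented out:
# import boto3
# boto3.client("rds").start_export_task(**build_export_request(...))
```

Keeping the request construction separate from the API call makes it easy to check the parameters in a dry run before starting an export.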

#### Detailed Process
- **Snapshot Export**: 
  - Utilizes RDS/Aurora native snapshot export functionality
  - Data is automatically converted to Parquet format
  - Keeps the exported data consistent, because the snapshot is a single point-in-time copy
  
- **Schema Discovery**:
  - Glue crawlers automatically detect data structure
  - Creates table definitions in Glue Data Catalog
  - Preserves metadata information

- **Data Processing**:
  - Parallel processing across multiple Glue workers
  - Custom transformation logic can be implemented
  - Handles data type mappings and conversions
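
The load step of the data processing stage could be organized as follows. This is a minimal sketch, assuming the job writes to Aurora DSQL through a PostgreSQL driver that uses `%s` placeholders, and that rows are grouped into batches so each transaction stays within Aurora DSQL's documented transaction quotas. The constant `ASSUMED_MAX_ROWS_PER_TXN` is a placeholder, not an official limit, and `chunked` and `build_insert` are hypothetical helper names.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List, Sequence, Tuple

# Placeholder only: check the current Aurora DSQL quotas before choosing a value.
ASSUMED_MAX_ROWS_PER_TXN = 1000


def chunked(rows: Iterable[Sequence[Any]], batch_size: int) -> Iterator[List[Sequence[Any]]]:
    """Yield successive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def build_insert(
    table: str, columns: Sequence[str], batch: Sequence[Sequence[Any]]
) -> Tuple[str, List[Any]]:
    """Build one parameterized multi-row INSERT for a batch.

    Table and column names are interpolated directly, so they must come
    from trusted configuration, never from user input.
    """
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([row_placeholder] * len(batch))
    )
    params = [value for row in batch for value in row]
    return sql, params
```

Inside a Glue job, helpers like these could run per partition (for example from `DataFrame.foreachPartition`), committing one transaction per batch so that a failed batch can be retried on its own.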



### Option 2: DMS with Kinesis Integration


This approach combines AWS Database Migration Service (DMS) with Kinesis Data Streams to provide a real-time migration path to Aurora DSQL. It's particularly useful when continuous data replication is required during the migration process.

![rds-dsql-migration-2.png](../images/5.3-rds-to-dsql-migration-step-2.png)

#### How It Works
The migration process involves:
1. DMS reads data from the source PostgreSQL database
2. Data is streamed to Kinesis Data Streams
3. A custom consumer (Lambda or application) processes the stream
4. Processed data is written to Aurora DSQL
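
Steps 3 and 4 above could look like the following inside a Lambda consumer. This is a minimal sketch, assuming DMS writes its default JSON message format to the stream (a `data` object plus a `metadata` object with `record-type`, `operation`, `schema-name`, and `table-name`), and that each table's key columns are known to the consumer. The function names and the `key_columns` argument are hypothetical.

```python
import base64
import json
from typing import Any, Dict, Iterator, List, Sequence, Tuple


def decode_dms_records(event: Dict[str, Any]) -> Iterator[Tuple[str, str, Dict[str, Any]]]:
    """Yield (operation, qualified_table, row) for each DMS data record in a Kinesis event."""
    for record in event.get("Records", []):
        # Lambda delivers the Kinesis payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        meta = payload.get("metadata", {})
        if meta.get("record-type") != "data":
            continue  # skip control records such as table-creation notices
        table = f"{meta['schema-name']}.{meta['table-name']}"
        yield meta["operation"], table, payload.get("data", {})


def to_statement(
    operation: str, table: str, row: Dict[str, Any], key_columns: Sequence[str]
) -> Tuple[str, List[Any]]:
    """Translate one change record into a parameterized SQL statement."""
    keys = [c for c in row if c in key_columns]
    others = [c for c in row if c not in key_columns]
    where = " AND ".join(f"{c} = %s" for c in keys)
    if operation in ("load", "insert"):
        cols = list(row)
        placeholders = ", ".join(["%s"] * len(cols))
        sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
        return sql, [row[c] for c in cols]
    if operation == "update":
        assignments = ", ".join(f"{c} = %s" for c in others)
        sql = f"UPDATE {table} SET {assignments} WHERE {where}"
        return sql, [row[c] for c in others] + [row[c] for c in keys]
    if operation == "delete":
        return f"DELETE FROM {table} WHERE {where}", [row[c] for c in keys]
    raise ValueError(f"unsupported operation: {operation}")
```

A handler would iterate over `decode_dms_records(event)`, call `to_statement` for each record, and execute the result against Aurora DSQL. Because Kinesis can deliver a record more than once, a real consumer would also need to make these writes idempotent.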

## Migration Approach Comparison

| Aspect | AWS Glue Migration | DMS with Kinesis |
|--------|-------------------|------------------|
| Use Case | Batch migration, one-time transfer | Real-time migration, ongoing replication |
| Setup Complexity | Medium | High |
| Development Effort | PySpark knowledge required | Custom consumer development needed |
| Scalability | Glue workers | Kinesis shards |
| Cost Model | Pay per ETL runtime | Pay for continuous services |
| Real-time Capability | No | Yes |
| Data Consistency | Point-in-time snapshot | Continuous replication |

## Selecting the appropriate migration path
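Choosing the right migration approach depends on database size, downtime tolerance, and whether you need ongoing replication. The AWS Glue path suits a one-time batch transfer from a point-in-time snapshot, so writes made to the source after the snapshot are not carried over. The DMS with Kinesis path suits migrations that need continuous replication, at the cost of developing and operating a custom consumer. With either option, careful planning and testing are essential for a successful migration.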


## Evaluating Aurora DSQL for Your Workloads

Before migrating to Aurora DSQL, conduct thorough due diligence to ensure it's the right solution for your use case:

### Key Considerations

- **Compatibility Assessment**: Review the [Aurora DSQL User Guide](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/what-is-aurora-dsql.html) for supported PostgreSQL features and versions
- **Feature Limitations**: Verify compatibility with your current PostgreSQL extensions, data types, and operational features
- **Migration Tools**: Explore available migration paths (AWS DMS, AWS Glue, snapshot exports) as support is rapidly evolving
- **Application Changes**: Evaluate required code modifications to accommodate any behavioral differences
- **Cost Analysis**: Compare pricing models between your current solution and Aurora DSQL's serverless model

### Migration Readiness

Start with a small proof-of-concept migration to validate compatibility and performance before committing to a full migration. Monitor AWS announcements for new migration tools and expanded feature support as this service continues to mature.

For a detailed comparison of feature compatibility, see [Aurora DSQL PostgreSQL Compatibility](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with.html).
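
One simple check during a proof-of-concept migration is to compare per-table row counts between the source and the target. The sketch below assumes the counts have already been collected (for example with `SELECT count(*)` on each side) into dictionaries keyed by table name; `compare_row_counts` is a hypothetical helper, and matching counts alone do not prove the data is identical.

```python
from typing import Dict, Optional, Tuple


def compare_row_counts(
    source: Dict[str, int], target: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return {table: (source_count, target_count)} for every table that differs.

    A table missing on one side is reported with None for that side.
    """
    mismatches: Dict[str, Tuple[Optional[int], Optional[int]]] = {}
    for table in sorted(set(source) | set(target)):
        source_count = source.get(table)
        target_count = target.get(table)
        if source_count != target_count:
            mismatches[table] = (source_count, target_count)
    return mismatches
```

An empty result means every table has the same count on both sides; any entry points to a table worth inspecting before moving on to a full migration.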

### Additional Learning Resources

For those interested in gaining deeper knowledge about Aurora DSQL and hands-on experience, the [Aurora DSQL Workshop](https://catalog.workshops.aws/aurora-dsql/en-US) provides comprehensive training that covers:

- Hands-on experience with Amazon Aurora DSQL's serverless distributed SQL capabilities
- Understanding Aurora DSQL's ACID transactions and active-active replication mechanisms
- Best practices for data modeling and optimistic concurrency control
- Building a retail rewards points application with multi-region active-active resiliency

## Additional Resources 📚

### Aurora DSQL Documentation
- [Aurora DSQL User Guide](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/what-is-aurora-dsql.html)
- [Aurora DSQL API Reference](https://docs.aws.amazon.com/aurora-dsql/latest/APIReference/)
- [Aurora DSQL Workshop](https://catalog.workshops.aws/aurora-dsql/en-US)

### Migration Tools & Services
- [AWS Database Migration Service](https://docs.aws.amazon.com/dms/latest/userguide/Welcome.html)
- [AWS Glue ETL Jobs](https://docs.aws.amazon.com/glue/latest/dg/author-job.html)
- [Amazon Kinesis Data Streams](https://docs.aws.amazon.com/kinesis/latest/dev/introduction.html)

### PostgreSQL Compatibility
- [Aurora DSQL PostgreSQL Compatibility](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with.html)
- [PostgreSQL Migration Best Practices](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html)
- [Database Migration Planning](https://docs.aws.amazon.com/prescriptive-guidance/latest/migration-sql-server-aurora-postgresql/)