bretamyers/Azure-Synapse-Lakehouse-Sync

Azure Synapse Lakehouse Sync

Description

Azure Synapse Lakehouse Sync provides a straightforward way to synchronize modeled Gold Zone data from your data lake to your Synapse Analytics data warehouse. Through a series of Databricks notebooks and Synapse Analytics pipelines, it offers a working example of how to continually synchronize your tables.

Additionally, it leverages the Change Data Feed capability in Delta Lake 2.x to track changes to your Gold Zone tables, which makes extracting changed data significantly easier and faster. Best practices are then used to stage, ingest, and store the data efficiently within an Azure Synapse dedicated SQL pool. The synchronization schedule can be configured for whatever interval works best for your environment, whether that is every 10 minutes or once a day.
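To make the Change Data Feed idea concrete, here is a minimal sketch in plain Python of how change rows might be applied to a target table. It is not the code used by this repository's notebooks or pipelines. The `_change_type` and `_commit_version` column names and the four change types are the ones Delta Lake writes to a change feed; the function `apply_change_feed`, the `id` key column, and the sample rows are hypothetical and exist only for illustration. In Databricks, the feed itself would typically be read with `spark.read.format("delta").option("readChangeFeed", "true").option("startingVersion", n)` on a table that has `delta.enableChangeDataFeed` set to `true`.

```python
# Change types that Delta Lake records in the _change_type column.
INSERT = "insert"
UPDATE_PRE = "update_preimage"
UPDATE_POST = "update_postimage"
DELETE = "delete"


def apply_change_feed(target, changes, key="id", last_synced_version=-1):
    """Apply change rows newer than last_synced_version to target.

    target is a dict of rows keyed by the (hypothetical) key column.
    Returns the highest commit version seen, which the caller can store
    as the starting point for the next sync run.
    """
    newest = last_synced_version
    pending = [c for c in changes if c["_commit_version"] > last_synced_version]
    # sorted() is stable, so rows within one commit keep their feed order.
    for row in sorted(pending, key=lambda c: c["_commit_version"]):
        newest = max(newest, row["_commit_version"])
        change_type = row["_change_type"]
        if change_type in (INSERT, UPDATE_POST):
            # Drop the feed's metadata columns before storing the row.
            target[row[key]] = {k: v for k, v in row.items() if not k.startswith("_")}
        elif change_type == DELETE:
            target.pop(row[key], None)
        # update_preimage rows hold the old values and need no action here.
    return newest


# Hypothetical sample data: one existing row and a small change feed.
warehouse = {1: {"id": 1, "amount": 10}}
feed = [
    {"id": 2, "amount": 5, "_change_type": INSERT, "_commit_version": 3},
    {"id": 1, "amount": 10, "_change_type": UPDATE_PRE, "_commit_version": 4},
    {"id": 1, "amount": 12, "_change_type": UPDATE_POST, "_commit_version": 4},
    {"id": 2, "amount": 5, "_change_type": DELETE, "_commit_version": 5},
]
version = apply_change_feed(warehouse, feed, last_synced_version=2)
print(version, warehouse)  # 5 {1: {'id': 1, 'amount': 12}}
```

Because the function returns the newest commit version it processed, a scheduled run only needs to persist that one number to resume where the previous run stopped, and re-running with the same version applies nothing.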

Azure Synapse Lakehouse Sync is designed to be a fully automated, self-healing, and hands-off approach to continually synchronizing your data lake with your data warehouse.

Azure.Synapse.Lakehouse.Sync.Overview.mp4

Using Azure Synapse Lakehouse Sync

Self Deployment: Instructions for deploying, configuring, and using Azure Synapse Lakehouse Sync in your own environment.

Tutorial Environment: Deploys a fully working Azure Synapse Lakehouse Sync tutorial environment in your Azure Subscription. This is a great way to experience how Azure Synapse Lakehouse Sync works end-to-end.
