Microsoft Data Integration Pipeline Training

The Fundamentals to Level 300

with Paul Andrew

Hey friends and welcome to my training workshop on Microsoft Data Integration Pipelines.

Overview

In this full day of training, we’ll start with the very basics and learn how to orchestrate your Azure data platform from start to finish. You will learn how to build out Azure control flow and data flow components as processing pipelines using Azure Data Factory and Azure Synapse Analytics. We’ll start by covering the fundamentals within the resources and together build out pipelines that ingest data from local source systems, transform and serve it to consumers. We’ll then continue taking an end-to-end look at our Azure integration pipeline tools within highly scalable cloud native architectures, dealing with triggering, monitoring, dynamic pipeline content as well as CI/CD practices. Start the day knowing nothing about Azure Data Integration pipelines and leave with the knowledge, slides, demos, and code to apply these resources in your role as a data engineering professional.

Objectives

How cloud native data integration resources have evolved over time.
What the basic data pipeline artifacts are.
What the common data movement deployment patterns are.
How to build complex, highly dynamic control flows.
How to massively scale out executions and handle parallel orchestration workloads.
Best practices for the deployment of orchestration resources.

Agenda

The following offers an insight into the complete agenda and module breakdown for this workshop.

Module 1: Pipeline Fundamentals - Slides PDF >>>
- The History of Azure Orchestration
- Synapse Analytics vs Data Factory vs Microsoft Fabric
- Integration Components
- Common Activities
- Execution Dependencies

Module 2: Integration Runtime Design Patterns - Slides PDF >>>
- Compute Types
  - Azure
  - Hosted
  - SSIS
- Patterns & Configuration

Module 3: Data Transformation - Slides PDF >>>
- Data Flows
- Power Query Injection
- Spark Configuration
- Use Cases

Labs: Getting Hands On
- Create Azure Resources
- Build a Copy Pipeline
- Create a Reusable Pipeline
- Author a Data Flow
- Monitor Factory Activities
- Explore Synapse Pipelines
- Explore Fabric Pipelines
- Mini-project

Module 4: Dynamic Pipelines - Slides PDF >>>
- Expressions & Interpolation
- Simple Metadata Driven Execution
- Dynamic Content Chains
- Reference Names

Module 5: Pipeline Extensibility - Slides PDF >>>
- Azure Batch Service
  - Tasks
  - Compute Pools
  - Scaling
- Pipeline Custom Activities
- Azure Management API
- Azure Functions

Module 6: Execution Parallelism - Slides PDF >>>
- Control Flow Scale Out
- Concurrency Limitations
- Internal vs External Activities
- Orchestration Framework - See Cloud Formations: CF.Cumulus

Module 7: VNet Integration - Slides PDF >>>
- Private Endpoints
- Managed VNets
- Firewall Bypass

Module 8: Security - Slides PDF >>>
- Service Principals
- Managed Identities
- Azure Key Vault Integration
- Customer Managed Keys
- Pipeline Access & Permissions

Module 9: Monitoring & Alerting - Slides PDF >>>
- Studio Monitoring
- Log Analytics & Kusto Queries
- Operational Dashboards
- Advanced Alerting

Module 10: Solution Testing - Slides PDF >>>
- Development Time Validation
- Test Coverage
- NUnit Tests

Module 11: CI/CD - Slides PDF >>>
- Source Control vs Developer UI
- Basic ARM Template Deployments
- Advanced Deployment Patterns

Module 12: Final Thoughts - Slides PDF >>>
- Running Costs
- Conclusions
- Best Practices

Suggested Prerequisites

If you participate in this training workshop, there will be labs to work through and demo code that you can optionally follow along with. These labs focus on the development of Azure data platform resources, so it is recommended that you bring the following ready to use. There will be little spare time for initial setup work.

- Most importantly, access to a Microsoft Azure Tenant including a usable Azure Subscription.
  - A free trial account is sufficient, but please have this set up prior to the event to avoid delays.
  - This should include the ability to provision resources in an Azure Resource Group with owner level access.
- A developer laptop with power and some form of WiFi connectivity (sorry if obvious).
- Suggested software to be installed on your laptop to make the learning experience run smoothly:
  - A modern web browser, Microsoft Edge or similar as preferred.
  - A suitable IDE, VS Code or Visual Studio, including Azure development extensions.
  - Database tools, SQL Server Management Studio or Azure Data Studio.
  - GitHub Desktop or similar for repository interaction.
  - Azure Storage Explorer.
  - A PDF file viewer.
- Play the Azure Icon Game, it will help. See blog post for context: https://mrpaulandrew.com/2017/12/15/the-azure-icon-game

Please complete any software downloads prior to the event to avoid internet bandwidth contention with other attendees.

Many thanks

Speaker Biography

Paul (AKA @mrpaulandrew) is the Founder & CTO of Cloud Formations, a specialist data consultancy based in the UK. With nearly 20 years’ experience designing and delivering Microsoft data architectures, Paul leads a passionate team of engineers, supporting businesses small and large with scalable cloud platforms. Business value delivered through data insights. Over the years, Paul has covered the breadth and depth of design patterns and industry leading concepts, including Lambda, Kappa, Delta Lake, Data Mesh and Data Fabric.

Paul is also a Microsoft Data Platform MVP, organiser for the Data Relay community conference, East Midlands user group leader, book author and mentor. In addition to the day job(s), Paul is a father of three, husband, foodie, runner, blood donor, geek, Lego and Star Wars fan! Lastly, Paul confesses to enjoying a Rammstein playlist when given half a chance to do some coding for a customer project.

Speaker Contact Details

mrpaulandrew.com/contact
