In this repository, you will find various demos and presentations I have delivered throughout the year. This includes links to the videos, the source code and the data files.

wilson-mok/demo

Presentations and Demos

In this repo, you will find the recordings of my presentations, along with all the source code and data used in my demos.

Presentation

EDMDATA | Azure Synapse Analytics | Conduct data analysis using Serverless SQL and Serverless Spark

In this session, I will cover:

  • Overview of Azure Synapse Analytics
  • What Serverless SQL and Serverless Spark are, and the benefits of using them.
  • The importance of data analysis: why accurate and relevant data matter.
  • Demo: using Serverless SQL and Serverless Spark to review, validate and visualize the sample data.

After this session, you will be able to use Azure Synapse Analytics to conduct your own data analysis.

Source code folder: EDMDATA 2024 - Conduct data analysis using Azure Synapse Analytics

MSDEVMTL | Azure Databricks | Azure Data Factory | Develop data pipelines for a Medallion Delta Lake using Azure Databricks and Azure Data Factory

In this session, I will cover:

  • What is a Medallion Delta Lake architecture?
  • Design and demo Auto Loader and Structured Streaming (micro-batching) for batch processing using Azure Data Factory and Azure Databricks.
  • Using Databricks Serverless SQL to serve data to Power BI.

After this session, you will be able to create your own streaming solution using Azure Databricks and Azure Data Factory.

Source code folder: Azure Databricks - MSDEVMTL1 - Develop data pipelines for a Medallion Delta Lake

Part 4: Azure Databricks | Azure Data Factory | Building streaming data pipelines with Medallion architecture

We have reached the fourth and final part of the Azure Databricks series. In this session, I will cover:

  • Design and implement streaming pipelines using Azure Databricks and Azure Data Factory.
  • A quick introduction to Delta Live Tables in Azure Databricks.

After this session, you will be able to create your own streaming data pipeline for your Lakehouse using Azure Databricks and Azure Data Factory.

Source code folder: Azure Databricks - Building a streaming data pipeline

Part 3: Azure Databricks | Building a Lakehouse | Medallion architecture

This session is the third installment of a four-part series. In this session, I will discuss:

  • Create a Lakehouse with Medallion architecture.
  • Create a data model in the gold layer to share with multiple projects.
  • Implement the Azure Data Factory pipeline to automate and orchestrate the Databricks notebooks.
  • Use Databricks SQL to connect to Power BI for reporting.

After this session, you will be able to create your own Lakehouse using Azure Databricks and Azure Data Factory.

Source code folder: Azure Databricks - Building a Lakehouse
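
As a rough illustration of the Medallion layering covered in this session, the sketch below moves records through bronze (raw), silver (cleaned) and gold (aggregated) stages in plain Python. It is not the code in the source folder: the record fields (`city`, `amount`) and the function names are hypothetical, and a real Lakehouse would use Delta tables in Databricks rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw records, as they might land in the bronze layer.
bronze = [
    {"city": " Edmonton ", "amount": "10.5"},
    {"city": "Montreal", "amount": "n/a"},  # invalid amount
    {"city": "edmonton", "amount": "4.5"},
]


def to_silver(rows):
    """Clean and validate bronze rows; drop rows whose amount is not numeric."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that fail validation
        cleaned.append({"city": row["city"].strip().title(), "amount": amount})
    return cleaned


def to_gold(rows):
    """Aggregate silver rows into a reporting-ready total per city."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["city"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer reads only from the layer before it, so the gold output can be rebuilt from bronze at any time if a cleaning rule changes.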

Part 2: Azure Databricks | Azure Data Factory | Building your first data pipeline

This session is the second installment of a four-part series. In this session, I will discuss:

  • Introduce the Delta Lake format.
  • Discuss different types of Databricks clusters and associated costs.
  • Introduce Azure Data Factory and how to schedule your data pipeline.
  • A demo of using Azure Data Factory and Databricks to schedule your data pipeline.

At the end of this session, you will be able to create your own data pipeline and create a schedule for it to run automatically using Azure Databricks and Azure Data Factory.

Source code folder: Azure Databricks - Building your first pipeline

Part 1: Azure Databricks | Getting started with Azure Databricks

This is the first part of a four-part series on Azure Databricks. In this session, I will discuss:

  • Introduce Databricks and its features.
  • Compare Databricks Community Edition with Azure Databricks Premium.
  • Provide a brief tour of the Databricks UI.
  • A demo of using Databricks to query data using PySpark.

At the end of this session, you will have a basic understanding of what Databricks is and how it can be used for big data processing and analytics.

Source code folder: Azure Databricks - Getting started

Azure Synapse | Data Warehousing | Dimensional Data Model | Mapping Data Flow

In this session, I will discuss the best practices for data modeling and the process of creating a data pipeline using Mapping Data Flow in Synapse Analytics.

This includes:

  • What is Data Warehousing?
  • How to design and create a dimensional data model
  • A demo of using Azure Synapse Analytics to create a data pipeline to store data into the data warehouse.

This is the second part of a two-part series on Azure Synapse Analytics.

Source code folder: Azure Synapse - Data Warehousing
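
To make the idea of a dimensional data model concrete, here is a minimal sketch in plain Python that splits flat sales rows into one dimension table and one fact table (a small star schema). It is not taken from the source folder: the column names, the sample rows and the `build_star_schema` helper are all assumptions for illustration, and in Synapse this shaping would normally be done in a Mapping Data Flow.

```python
# Hypothetical source rows from an operational system.
sales = [
    {"product": "Widget", "category": "Tools", "quantity": 2, "unit_price": 5.0},
    {"product": "Gadget", "category": "Toys", "quantity": 1, "unit_price": 12.0},
    {"product": "Widget", "category": "Tools", "quantity": 3, "unit_price": 5.0},
]


def build_star_schema(rows):
    """Split flat rows into a product dimension and a sales fact table."""
    dim_product = {}  # natural key (product name) -> dimension row
    fact_sales = []
    for row in rows:
        name = row["product"]
        if name not in dim_product:
            # Assign a surrogate key the first time a product is seen.
            dim_product[name] = {
                "product_key": len(dim_product) + 1,
                "product": name,
                "category": row["category"],
            }
        # The fact row keeps only the surrogate key and the measures.
        fact_sales.append({
            "product_key": dim_product[name]["product_key"],
            "quantity": row["quantity"],
            "sales_amount": row["quantity"] * row["unit_price"],
        })
    return list(dim_product.values()), fact_sales


dim_product, fact_sales = build_star_schema(sales)
print(dim_product)
print(fact_sales)
```

Descriptive attributes such as `category` live once in the dimension, while the fact table stays narrow and numeric, which is what keeps reporting queries simple.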

Azure Synapse | Data Exploration | Serverless SQL | Serverless Spark

This is the first part of a two-part series on Azure Synapse Analytics. In this session, I will guide you through the best practices for data exploration. This includes:

  • Overview of Azure Synapse Analytics
  • How to conduct a data exploration
  • A demo of using Azure Synapse to prepare, clean and analyze data to generate insights using Serverless SQL and Serverless Spark.

Source code folder: Azure Synapse - Data Exploration

Azure Data Factory | Git | CI/CD

This is the second part of the Azure Data Factory series. In this session, I will guide you through the best practices for code management and code deployment for Azure Data Factory. This includes:

  • Setting up a Git repository in Azure Data Factory.
  • What is the continuous integration and continuous delivery (CI/CD) process?
  • A demo of creating an Azure DevOps CI/CD pipeline for Azure Data Factory.

Source code folder: Azure Data Factory - CI/CD

Azure Data Factory | Data pipeline | Mapping Data Flow

In this session, I will guide you through the best practices and the process of creating a data pipeline using Mapping Data Flow in Azure Data Factory. This includes:

  • Overview of Azure Data Factory
  • How to design a data pipeline?
  • A demo of creating an end-to-end data pipeline.

Source code folder: Azure Data Factory - Data Pipeline
