How to implement a streaming at scale solution in Azure

Streaming at Scale

Sample end-to-end solutions to implement streaming at scale scenarios using Azure

About the repository

The samples show how to set up an end-to-end solution that implements a streaming at scale scenario using a choice of different Azure technologies. There are many possible ways to implement such a solution in Azure, following Kappa or Lambda architectures, a variation of them, or even custom ones. Each architectural solution can also be implemented with different technologies, each with its own pros and cons.

More info on Streaming architectures can also be found here:

Here's also a list of scenarios where a Streaming solution fits nicely

A good document that describes the stream processing technologies available on Azure is the following one:

Choosing a stream processing technology in Azure

The goal of this repository is to showcase the common architectural solutions and their implementations, describe the pros and cons of each, and provide sample scripts that deploy the whole solution with 100% automation.

Running the samples

All samples use AZ CLI and Bash scripts. Make sure you have AZ CLI installed:

https://docs.microsoft.com/en-us/cli/azure/?view=azure-cli-latest

If you're running on Windows, we suggest running the scripts from WSL

https://docs.microsoft.com/en-us/windows/wsl/install-win10

although you can also run them from any Bash environment. Just keep in mind that the scripts have been tested only on Ubuntu on WSL and on OS X.

In order to clone the repository you'll also need Git:

https://git-scm.com/downloads

Git for Windows also comes with a Bash shell:

https://gitforwindows.org/

Some samples may have more specific needs. In that case, the required software is listed in the sample's README.
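Before running a sample, it can help to confirm that the tools mentioned above are on your PATH. The following is a minimal sketch, not part of the repository's scripts: the function name missing_tools is hypothetical, and the tool list (az, git) is taken only from the requirements stated in this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): report which of the
# given command-line tools are not on PATH.
# Prints one missing tool name per line; prints nothing if all are found.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Tools named in this README: AZ CLI (az) and Git.
missing="$(missing_tools az git)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are joined onto one line.
  echo "Missing required tools:" $missing
else
  echo "All required tools found."
fi
```

If a sample's README lists additional software, pass the extra command names to missing_tools as well. Once az is found, running az --version shows the installed AZ CLI version.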

Available solutions

At present, the available solutions are:

Event Hubs Capture Sample

Implement a stream processing architecture using:

  • Event Hubs (Ingest)
  • Event Hubs Capture (Store)
  • Azure Blob Storage (Data Lake)
  • Apache Drill (Query/Serve)

Event Hubs + Azure Functions + Cosmos DB

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Azure Functions (Stream Process)
  • Cosmos DB (Serve)

Event Hubs + Stream Analytics + Cosmos DB

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Stream Analytics (Stream Process)
  • Cosmos DB (Serve)

Event Hubs + Stream Analytics + Azure SQL

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Stream Analytics (Stream Process)
  • Azure SQL (Serve)

Event Hubs + Stream Analytics + Event Hubs

Implement a stream processing architecture using:

  • Event Hubs (Ingest / Immutable Log)
  • Stream Analytics (Stream Process)
  • Event Hubs (Serve)

Roadmap

The following technologies are planned for use in future end-to-end sample solutions:

Ingestion

  • IoT Hub
  • Event Hubs Kafka

Stream Processing

  • Databricks Spark Structured Streaming
  • Azure Data Explorer

Batch Processing

  • Event Hubs Capture
  • Databricks Spark
  • Azure Data Explorer
  • Open Source solutions (like Apache Drill)

Serving Layer

  • Azure Data Explorer
  • Azure DW