getting-started-with-data-injection

This sample demonstrates how to use Azure Data Factory to create a pipeline that creates Azure Digital Twins instances from your enterprise data to populate your simulation models.

Pipeline

Continuing on from our previous tutorials, our digital twin instances represent bars and their customers. Our model consists of two twin models: one representing a bar and one representing a customer.
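As an illustration of this two-model setup, here is a minimal sketch of what the two DTDL v2 interfaces could look like, built as Python dicts. The dtmi identifiers and property names are assumptions for this sketch, not the exact models from the model-creation tutorials.

```python
import json

# Hypothetical DTDL v2 interfaces for the two twin models: a Bar and a
# Customer. The dtmi identifiers and properties below are illustrative
# assumptions, not the exact models from the previous tutorials.
bar_model = {
    "@id": "dtmi:com:cosmotech:Bar;1",
    "@type": "Interface",
    "@context": "dtmi:dtdl:context;2",
    "displayName": "Bar",
    "contents": [
        {"@type": "Property", "name": "NbWaiters", "schema": "integer"},
    ],
}

customer_model = {
    "@id": "dtmi:com:cosmotech:Customer;1",
    "@type": "Interface",
    "@context": "dtmi:dtdl:context;2",
    "displayName": "Customer",
    "contents": [
        {"@type": "Property", "name": "Thirsty", "schema": "boolean"},
    ],
}

print(json.dumps([bar_model, customer_model], indent=2))
```

Each twin instance created by the pipeline will reference one of these two interfaces through its model id.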

To speedrun this tutorial: Deploy to Azure

Setting the stage

Make sure you are logged into the Azure portal before you start. If you followed the tutorials on model creation, you already have an Azure Digital Twins instance populated with your models; you can skip the first step and go directly to Uploading the input file to Azure storage. Just make sure to create all the Azure resources in the same resource group as your existing Azure Digital Twins instance.

Digital Twins Instances

Azure Digital Twins is the data source for the Cosmo Tech simulation engine and the sink of our data injection pipeline. The following ARM template deploys Azure Digital Twins and injects the tutorial's twin models into it. If you are starting from scratch, use a new resource group to contain all the resources of this tutorial. If you are continuing on from a previous tutorial, make sure to apply the deployment templates of this tutorial in the same resource group as your existing Digital Twins deployment.

Deploy to Azure

Input storage

Our input data is a set of three CSV files: bars.csv, customers.csv and customer_relationships.csv.
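The real files ship in the deployed container, but their shapes can be illustrated with a small sketch. All column names below are assumptions except 'bar' in customers.csv, which the pipeline uses later to link each customer to a bar.

```python
import csv
import io

# Illustrative contents for the three input files. Column names are
# assumptions for this sketch, except 'bar' in customers.csv, which
# links each customer to the bar they visit.
bars_csv = "id,name\nbar1,The Thirsty Bee\n"
customers_csv = (
    "id,name,bar\n"
    "customer1,Alice,bar1\n"
    "customer2,Bob,bar1\n"
)
customer_relationships_csv = "source,target\ncustomer1,customer2\n"

customers = list(csv.DictReader(io.StringIO(customers_csv)))
for row in customers:
    print(row["id"], "->", row["bar"])
```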

The following link will create a new storage account and a blob container named input-csv containing the input files. Additionally, it will create a second container temp-csv for temporary datasets used later in Data Factory. Select the resource group containing your Digital Twins and give a name to your new storage account.

Deploy to Azure

Setting up Data Factory

In the same resource group as your Digital Twins and storage account, create a new data factory using the following link.

Deploy to Azure

Once the deployment of the Data Factory is finished, open the Data Factory Editor and select the data factory you just created.

Linked Services

Linking the input Azure Storage to Data Factory

Our data source CSV files are stored in an Azure Blob Storage container. Data Factory connects to external services using Linked Services. To create a linked service connecting to our Azure storage from the Azure Data Factory Editor, open the management panel (1). In the Linked Services tab (2), click New (3), then select Azure Blob Storage (4). You are presented with a configuration pane allowing you to select a storage account to link to.

Blob Storage Linked service configuration

The following link will create the linked service to Azure storage in Data Factory. Enter the same resource group as before, the names of the storage account and data factory created in the previous steps, and a new name for the linked service.

Deploy to Azure

Linking Data Factory to Digital Twins

Linking Data Factory to Azure Digital Twins is a two-step process. First, we need to install azure-digital-twin-injector, a set of Azure Functions that wraps the Azure Digital Twins REST API to ease its connection into Data Factory. The following template deploys the injector Azure Function. WARNING: After deploying, you must grant the Azure Function the 'Azure Digital Twins Data Owner' role on the ADT instance you want it to handle. As in the previous steps, select your resource group, enter the name of your Digital Twins instance, and a new name for the injector Function App.

Deploy to Azure

Once the Function App for the injector is deployed, we need to register it as a linked service in our Data Factory, so that Data Factory can call the functions. In the Azure Data Factory Editor, open the management panel (1). In the Linked Services tab (2), click New (3). In the wizard pane on the right, select the Compute tab (4) and then Azure Function (5).

Azure Function linked service configuration

After clicking 'Continue', select the Function App you just deployed. Contrary to the storage linked service, the authentication setup is not automatic. You can get a function key from the Azure portal (searching for the name of your Function App in the global search bar should bring it up). Open the 'App keys' tab in your Azure Function App to retrieve the default key.

The Function App can now be called from the data factory. One more step is necessary: the function reads its input from a storage account deployed alongside the injector, named storage<injector_name> (look in your resource group for the exact name). Link this storage account into Data Factory as you did for the input storage.

Deploy to Azure

Datasets

The digital twin injector expects CSV files with certain fixed columns (see ....). We need four files to create our digital twins and their relationships.

We create seven datasets: three for our inputs and four for our outputs.
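As a sketch of what such fixed-column files might look like, the snippet below builds a twin CSV and a relationship CSV in memory. The column names (`$id`, `$metadata.$model`, `$sourceId`, `$targetId`, `$relationshipName`) are assumptions for illustration; the injector's documentation defines the real contract.

```python
import csv
import io

# Sketch of the two fixed-column shapes the injector consumes: one for
# twin creation, one for relationship creation. All column names here
# are assumptions; check the injector's documentation for the real ones.
TWIN_COLUMNS = ["$id", "$metadata.$model"]
REL_COLUMNS = ["$sourceId", "$targetId", "$relationshipName"]

def to_csv(columns, rows):
    """Serialize dict rows into a CSV string with the given header."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

twins = to_csv(TWIN_COLUMNS, [
    {"$id": "bar1", "$metadata.$model": "dtmi:com:cosmotech:Bar;1"},
])
rels = to_csv(REL_COLUMNS, [
    {"$sourceId": "bar1", "$targetId": "customer1",
     "$relationshipName": "contains"},
])
print(twins)
print(rels)
```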

Deploy to Azure

Dataflows

Dataflows are linear processes that transform data from one dataset into another. We need to create four dataflows, one for each output dataset:

  • a file to create bar twins, based on the info from bars.csv,
    Bar dataflow
  • a file to create customer twins, based on the info from customers.csv,
    Customer dataflow
  • a file to create relationships between bars and customers, based on the column 'bar' from customers.csv,
    Customer-Bar dataflow
  • a file to create relationships between customers, based on the customer_relationships.csv file.
    Customer-Customer dataflow
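The customer-bar dataflow, for example, can be sketched in plain Python: every row of customers.csv yields one relationship row linking the bar in its 'bar' column to the customer. The output column names and the relationship name 'contains' are assumptions for illustration.

```python
import csv
import io

# Pure-Python sketch of the customer-bar dataflow: each customers.csv
# row produces one relationship row from the bar in its 'bar' column to
# the customer. Output columns and relationship name are assumptions.
customers_csv = "id,name,bar\ncustomer1,Alice,bar1\ncustomer2,Bob,bar1\n"

out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["$sourceId", "$targetId", "$relationshipName"])
writer.writeheader()
for row in csv.DictReader(io.StringIO(customers_csv)):
    writer.writerow({
        "$sourceId": row["bar"],
        "$targetId": row["id"],
        "$relationshipName": "contains",
    })

relationship_csv = out.getvalue()
print(relationship_csv)
```

In Data Factory this same mapping is expressed visually as a source, a derived-column step, and a sink.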

Deploy to Azure

Pipeline

The pipeline enforces dependencies between the various steps of the data injection process. For instance, in our case, bar and customer twins must exist before we can create relationships between them.
Customer dataflow
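This ordering constraint can be pictured as a small dependency graph. A hypothetical sketch (the step names are made up for illustration; in Data Factory the same thing is expressed with activity dependencies):

```python
# Sketch of the pipeline's ordering constraint as a dependency graph.
# Step names are made up; the point is that both relationship steps
# wait on the twin-creation steps having completed.
deps = {
    "create_bar_twins": [],
    "create_customer_twins": [],
    "link_customers_to_bars": ["create_bar_twins", "create_customer_twins"],
    "link_customers_to_customers": ["create_customer_twins"],
}

def run_order(deps):
    """Return one valid execution order (depth-first topological sort)."""
    done, order = set(), []
    def visit(step):
        if step in done:
            return
        done.add(step)
        for dep in deps[step]:
            visit(dep)
        order.append(step)
    for step in deps:
        visit(step)
    return order

order = run_order(deps)
print(order)
```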

Deploy to Azure
