Start Data Explorer when stopped #1637

Merged
merged 16 commits into from
Jun 25, 2025
Changes from 13 commits
1 change: 1 addition & 0 deletions docs-mslearn/toolkit/hubs/data-processing.md
@@ -47,6 +47,7 @@ The following diagram depicts the end-to-end data ingestion process within FinOp
5. (Optional) If using Azure Data Explorer:
   1. The **ingestion_ExecuteETL** pipeline queues the Data Explorer ingestion pipeline when **manifest.json** files are added to the **ingestion** container.
      - If ingesting custom datasets outside of Cost Management exports, create an empty **manifest.json** file in the target ingestion folder after all other files are ready (don't add this file while files are still uploading). The **manifest.json** file isn't parsed and can be empty. Its sole purpose is to indicate that all files for this ingestion job have been added.
      - If the cluster isn't running, the pipeline starts it. Azure Data Explorer can take 15 minutes or more to start.
   2. The **ingestion_ETL_dataExplorer** pipeline ingests data into the `{dataset}_raw` table in Data Explorer.
      - The dataset name is the first folder in the **ingestion** container.
      - All raw tables are in the **Ingestion** database in Data Explorer.
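The naming rule in step 5 can be sketched in a few lines of Python. This is only an illustration of the convention described above, not code from the toolkit: the function names and the `Costs/2025/06/...` folder layout are hypothetical, and the path is assumed to be relative to the root of the **ingestion** container.

```python
from pathlib import PurePosixPath


def raw_table_for(blob_path: str) -> str:
    """Return the raw table name for a blob path inside the ingestion container.

    Assumes the path is relative to the container root, for example
    "Costs/2025/06/manifest.json" (an illustrative layout, not a required one).
    """
    parts = PurePosixPath(blob_path.strip("/")).parts
    if len(parts) < 2:
        raise ValueError(f"expected '<dataset>/.../<file>', got {blob_path!r}")
    dataset = parts[0]  # the first folder in the container is the dataset name
    return f"{dataset}_raw"


def is_ingestion_trigger(blob_path: str) -> bool:
    """True when the blob is a manifest.json, the file that queues ingestion."""
    return PurePosixPath(blob_path).name == "manifest.json"
```

Under these assumptions, `raw_table_for("Costs/2025/06/manifest.json")` yields `Costs_raw`, and only the `manifest.json` blob counts as a trigger.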
150 changes: 147 additions & 3 deletions src/templates/finops-hub/modules/dataFactory.bicep
@@ -150,6 +150,11 @@ var storageRbacRoles = [
'18d7d88d-d35e-4fb5-a5c3-7773c20a72d9' // User Access Administrator https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator
]

// Roles for ADF to check the ADX cluster state and start the cluster if stopped
var adxRbacRoles = [
'b24988ac-6180-42a0-ab88-20f7382dd24c' // Contributor permissions on the cluster
]

//==============================================================================
// Resources
//==============================================================================
@@ -169,6 +174,11 @@ resource keyVault 'Microsoft.KeyVault/vaults@2023-02-01' existing = if (!empty(r
name: keyVaultName
}

// Get ADX cluster instance
resource dataExplorerCluster 'Microsoft.Kusto/clusters@2023-08-15' existing = if (deployDataExplorer) {
name: dataExplorerName
}

// cSpell:ignore azuretimezones
module azuretimezones 'azuretimezones.bicep' = {
name: 'azuretimezones'
@@ -344,6 +354,17 @@ resource factoryIdentityStorageRoleAssignments 'Microsoft.Authorization/roleAssi
}
}]

// Grant ADF identity access to manage ADX cluster
resource factoryIdentityDataExplorerRoleAssignments 'Microsoft.Authorization/roleAssignments@2022-04-01' = [for role in adxRbacRoles: {
name: guid(dataExplorerCluster.id, role, dataFactory.id)
scope: dataExplorerCluster
properties: {
roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', role)
principalId: dataFactory.identity.principalId
principalType: 'ServicePrincipal'
}
}]

//------------------------------------------------------------------------------
// Delete old triggers and pipelines
//------------------------------------------------------------------------------
@@ -4936,7 +4957,7 @@ resource pipeline_ExecuteIngestionETL 'Microsoft.DataFactory/factories/pipelines
]
}
{
- activity: 'Set Ingestion Timestamp'
+ activity: 'Data Explorer validation'
dependencyConditions: [
'Succeeded'
]
@@ -5026,6 +5047,130 @@ resource pipeline_ExecuteIngestionETL 'Microsoft.DataFactory/factories/pipelines
}
]
}
}
{
name: 'Data Explorer validation'
description: 'When the FinOps toolkit is deployed with an Azure Data Explorer cluster, start the cluster to ensure it is running before ingestion.'
type: 'IfCondition'
dependsOn: [
{
activity: 'Set Ingestion Timestamp'
dependencyConditions: [
'Succeeded'
]
}
]
userProperties: []
typeProperties: {
expression: {
value: '@equals(${deployDataExplorer}, true)'
type: 'Expression'
}
ifTrueActivities: [
{
name: 'Start ADX Cluster'
type: 'WebActivity'
dependsOn: []
policy: {
timeout: '0.12:00:00'
retry: 0
retryIntervalInSeconds: 30
secureOutput: false
secureInput: false
}
userProperties: []
typeProperties: {
method: 'POST'
url: {
value: '${environment().resourceManager}${dataExplorerCluster.id}/start?api-version=2024-04-13'
type: 'Expression'
}
body: '{}'
authentication: {
type: 'MSI'
resource: {
value: environment().resourceManager
type: 'Expression'
}
}
}
}
{
name: 'Error ADX Start'
type: 'Fail'
dependsOn: [
{
activity: 'Start ADX Cluster After Error'
dependencyConditions: [
'Failed'
]
}
]
userProperties: []
typeProperties: {
message: {
value: '@concat(\'Failed to start Data Explorer instance. Message: \', activity(\'Start ADX Cluster After Error\').output.error.message)'
type: 'Expression'
}
errorCode: {
value: '@activity(\'Start ADX Cluster After Error\').output.error.code'
type: 'Expression'
}
}
}
{
name: 'Wait ADX Provision State'
type: 'Wait'
dependsOn: [
{
activity: 'Start ADX Cluster'
dependencyConditions: [
'Failed'
]
}
]
userProperties: []
typeProperties: {
waitTimeInSeconds: 600
}
}
{
name: 'Start ADX Cluster After Error'
type: 'WebActivity'
dependsOn: [
{
activity: 'Wait ADX Provision State'
dependencyConditions: [
'Succeeded'
]
}
]
policy: {
timeout: '0.12:00:00'
retry: 0
retryIntervalInSeconds: 30
secureOutput: false
secureInput: false
}
userProperties: []
typeProperties: {
method: 'POST'
url: {
value: '${environment().resourceManager}${dataExplorerCluster.id}/start?api-version=2024-04-13'
type: 'Expression'
}
body: '{}'
authentication: {
type: 'MSI'
resource: {
value: environment().resourceManager
type: 'Expression'
}
}
}
}
]
}
}
]
parameters: {
@@ -5038,8 +5183,7 @@ resource pipeline_ExecuteIngestionETL 'Microsoft.DataFactory/factories/pipelines
type: 'string'
}
timestamp: {
type: 'string'
}
type: 'string'
}
annotations: [
'New ingestion'
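For reviewers who want the control flow of the new **Data Explorer validation** branch at a glance, here is a rough Python model of the four activities in the diff. It is a sketch, not the pipeline itself: `post` is a hypothetical callable that raises when the request fails, `ClusterStartError` stands in for the Fail activity, and the real pipeline authenticates with the factory's managed identity, which is not modeled here.

```python
import time
from typing import Callable

API_VERSION = "2024-04-13"   # taken from the start URL in the diff
RETRY_WAIT_SECONDS = 600     # waitTimeInSeconds of 'Wait ADX Provision State'


class ClusterStartError(RuntimeError):
    """Stands in for the 'Error ADX Start' Fail activity."""


def start_url(resource_manager: str, cluster_id: str) -> str:
    """Concatenate the ARM endpoint and cluster ID as the Bicep interpolation does."""
    return f"{resource_manager}{cluster_id}/start?api-version={API_VERSION}"


def ensure_cluster_started(
    post: Callable[[str], None],
    url: str,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """POST to the start endpoint; on failure wait once, then try once more.

    Returns the number of attempts made (1 or 2). Raises ClusterStartError
    when the second attempt also fails.
    """
    try:
        post(url)                    # 'Start ADX Cluster'
        return 1
    except Exception:
        sleep(RETRY_WAIT_SECONDS)    # 'Wait ADX Provision State' (runs on Failed)
    try:
        post(url)                    # 'Start ADX Cluster After Error'
        return 2
    except Exception as err:
        raise ClusterStartError(
            f"Failed to start Data Explorer instance. Message: {err}"
        ) from err
```

The model makes one property of the design easy to see: there is exactly one retry, separated from the first attempt by a fixed 10-minute wait, and only a second failure fails the pipeline run.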