
Setting up private networking for Cromwell on Azure

Venkat Malladi edited this page Apr 28, 2023 · 11 revisions

The following instructions describe how to set up a resource group with a storage account, virtual network, and Azure Container Registry to run Cromwell on Azure (CoA) with private networking. In this configuration no CoA components have public IP addresses, and all traffic is routed through the virtual network or private endpoints.

0. Set variables and create Resource group:

subscription="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
resource_group_name=rgname
vnet_name=vnetname
mycontainerregistry=containerregistry
storage_account_name=storageaccount123
private_endpoint_name_storage=myprivateendpoint123storage
private_endpoint_name_cr=myprivateendpoint123cr
location=eastus
    
// Create Resource group

az group create -n $resource_group_name -l $location
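
Azure rejects some resource names at creation time, so it can help to check the variables before running any `az` command. The following is a minimal sketch and not part of the original instructions: the helper names `valid_storage_name` and `valid_acr_name` are hypothetical, and the patterns encode the documented naming limits (storage accounts: 3 to 24 lowercase letters and digits; container registries: 5 to 50 alphanumeric characters).

```shell
# Hypothetical helpers (not part of CoA or the Azure CLI).

# Storage account names: 3-24 characters, lowercase letters and digits only.
valid_storage_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Container registry names: 5-50 alphanumeric characters.
valid_acr_name() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9]{5,50}$'
}

# Example with the sample values from step 0.
valid_storage_name storageaccount123 && echo "storage account name looks valid"
valid_acr_name containerregistry && echo "registry name looks valid"
```

Both names must also be globally unique, which these checks cannot verify offline.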

1. Provision a virtual network and subnets for CoA resources with an appropriate CIDR.

az network vnet create -g $resource_group_name -n $vnet_name --address-prefixes 10.1.0.0/16
az network vnet subnet create -g $resource_group_name --vnet-name $vnet_name -n vmsubnet --address-prefixes 10.1.0.0/24
az network vnet subnet create -g $resource_group_name --vnet-name $vnet_name -n sqlsubnet --address-prefixes 10.1.1.0/24
az network vnet subnet create -g $resource_group_name --vnet-name $vnet_name -n batchnodessubnet --address-prefixes 10.1.2.0/24
az network vnet subnet create -g $resource_group_name --vnet-name $vnet_name -n pesubnet --address-prefixes 10.1.3.0/24

az network vnet subnet update --resource-group $resource_group_name --vnet-name $vnet_name --name vmsubnet --service-endpoints "Microsoft.Storage"
az network vnet subnet update --resource-group $resource_group_name --vnet-name $vnet_name --name sqlsubnet --service-endpoints "Microsoft.Storage"
az network vnet subnet update --resource-group $resource_group_name --vnet-name $vnet_name --name batchnodessubnet --service-endpoints "Microsoft.Storage"

az network vnet subnet update \
        --name pesubnet \
        --resource-group $resource_group_name \
        --vnet-name $vnet_name \
        --disable-private-endpoint-network-policies true

az network vnet subnet update \
        --name batchnodessubnet \
        --resource-group $resource_group_name \
        --vnet-name $vnet_name \
        --disable-private-link-service-network-policies true
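
Each subnet prefix in step 1 has to fall inside the virtual network's address space (10.1.0.0/16 here). If you pick different ranges, a quick offline check can catch mistakes before `az` does. This is a rough sketch in plain POSIX shell; `ip_to_int` and `cidr_contains` are hypothetical helper names, and only IPv4 is handled.

```shell
# Hypothetical helpers for checking IPv4 CIDR containment.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if CIDR $2 lies entirely inside CIDR $1.
cidr_contains() {
  outer_ip=${1%/*}; outer_len=${1#*/}
  inner_ip=${2%/*}; inner_len=${2#*/}
  # The inner prefix must be at least as long as the outer one.
  [ "$inner_len" -ge "$outer_len" ] || return 1
  mask=$(( (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$outer_ip") & mask )) -eq $(( $(ip_to_int "$inner_ip") & mask )) ]
}

# Check the four subnets from step 1 against the VNet prefix.
for subnet in 10.1.0.0/24 10.1.1.0/24 10.1.2.0/24 10.1.3.0/24; do
  cidr_contains 10.1.0.0/16 "$subnet" && echo "$subnet is inside 10.1.0.0/16"
done
```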

2. Provision a VM to run the CoA deployer. Since the CoA deployment will not have direct internet access, we need to create a temporary jumpbox on the virtual network with a public IP address so the deployer can have SSH access to the CoA deployment.

az vm create -n PrivateDeployerVM3 -g $resource_group_name --vnet-name $vnet_name --subnet vmsubnet --image canonical:0001-com-ubuntu-server-jammy:22_04-lts-gen2:latest --admin-username azureuser --generate-ssh-keys

3. SSH into the deployer VM we created

ssh azureuser@$(az vm show -d -g $resource_group_name -n PrivateDeployerVM3 --query publicIps -o tsv)
    
    
// Install AZ CLI

curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
    
// Log in to Azure

az login  


// Copy and paste all the environment variables, such as $subscription, from the other shell.

az account set --subscription $subscription
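
Copying variables by hand between shells is easy to get wrong. One alternative, sketched below, is to write them to a file on the local machine and source that file on the deployer VM. The file name `coa-vars.sh` is hypothetical, and the fallback values are the sample values from step 0.

```shell
# Hypothetical file name; any path works.
vars_file="${TMPDIR:-/tmp}/coa-vars.sh"

# Fall back to the sample values from step 0 when a variable is not already set.
: "${subscription:=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee}"
: "${resource_group_name:=rgname}"
: "${vnet_name:=vnetname}"
: "${mycontainerregistry:=containerregistry}"
: "${storage_account_name:=storageaccount123}"
: "${private_endpoint_name_storage:=myprivateendpoint123storage}"
: "${private_endpoint_name_cr:=myprivateendpoint123cr}"
: "${location:=eastus}"

# Write one assignment per line so the file can be sourced on the VM.
for name in subscription resource_group_name vnet_name mycontainerregistry \
            storage_account_name private_endpoint_name_storage \
            private_endpoint_name_cr location; do
  eval "value=\$$name"
  printf '%s="%s"\n' "$name" "$value"
done > "$vars_file"

# On the local machine (needs network access, so left commented out):
#   scp "$vars_file" azureuser@<vm-public-ip>:~/coa-vars.sh
# On the deployer VM:
#   . ~/coa-vars.sh
```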
    

4. Create a Storage Account for CoA without public access, and establish private endpoints in the virtual network we already created.

az storage account create --name $storage_account_name --resource-group $resource_group_name --allow-blob-public-access false

az storage account network-rule add --resource-group $resource_group_name --account-name $storage_account_name --subnet vmsubnet --vnet-name $vnet_name
az storage account network-rule add --resource-group $resource_group_name --account-name $storage_account_name --subnet sqlsubnet --vnet-name $vnet_name
az storage account network-rule add --resource-group $resource_group_name --account-name $storage_account_name --subnet batchnodessubnet --vnet-name $vnet_name
    
storageAccountId="/subscriptions/$subscription/resourceGroups/$resource_group_name/providers/Microsoft.Storage/storageAccounts/$storage_account_name"
MSYS_NO_PATHCONV=1 az network private-endpoint create \
        --name $private_endpoint_name_storage \
        --resource-group $resource_group_name \
        --vnet-name $vnet_name \
        --subnet pesubnet \
        --private-connection-resource-id $storageAccountId \
        --group-id "Blob" \
        --connection-name "myConnection"
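
The private endpoint commands in this guide all pass a full resource ID that is assembled by hand from the subscription, resource group, provider type, and resource name. A small helper can reduce typos in those strings. This is only a sketch; `resource_id` is a hypothetical function name, not an Azure CLI feature. (The `MSYS_NO_PATHCONV=1` prefix used in this guide stops Git Bash on Windows from rewriting these leading-slash strings as file paths.)

```shell
# Hypothetical helper: build a resource ID from its parts.
# $1 = subscription ID, $2 = resource group, $3 = provider/type, $4 = resource name
resource_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example with the sample values from step 0.
resource_id "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" rgname \
  Microsoft.Storage/storageAccounts storageaccount123
# prints /subscriptions/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee/resourceGroups/rgname/providers/Microsoft.Storage/storageAccounts/storageaccount123
```

The same helper would cover the registry with `Microsoft.ContainerRegistry/registries` as the third argument.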

4. Provision Azure Container Registry

// Install Docker if it is not already on the machine

sudo apt  install docker.io
sudo usermod -aG docker azureuser

// Exit and log in again to pick up the new group permissions.
    
    
az acr create --resource-group $resource_group_name --name $mycontainerregistry --sku Premium
az acr login --name $mycontainerregistry
    
// To account for Docker-in-Docker [issue 401](https://github.com/microsoft/CromwellOnAzure/issues/401)

wget https://raw.githubusercontent.com/microsoft/CromwellOnAzure/develop/src/deploy-cromwell-on-azure/samples/docker-dockerfile -O Dockerfile
docker build -t $mycontainerregistry.azurecr.io/docker:v1 .
docker push $mycontainerregistry.azurecr.io/docker:v1

az acr import \
      --name $mycontainerregistry \
      --source mcr.microsoft.com/blobxfer \
      --image blobxfer:v1

az acr import \
      --name $mycontainerregistry \
      --source mcr.microsoft.com/mirror/docker/library/ubuntu:22.04 \
      --image ubuntu:22.04
    
az acr update --name $mycontainerregistry --public-network-enabled false
    
  
acrID="/subscriptions/$subscription/resourceGroups/$resource_group_name/providers/Microsoft.ContainerRegistry/registries/$mycontainerregistry"
MSYS_NO_PATHCONV=1 az network private-endpoint create \
        --name $private_endpoint_name_cr \
        --resource-group $resource_group_name \
        --vnet-name $vnet_name \
        --subnet pesubnet \
        --private-connection-resource-id $acrID \
        --group-id "registry" \
        --connection-name "myConnection"
    
zoneName="privatelink.azurecr.io"

az network private-dns zone create --resource-group $resource_group_name \
        --name  $zoneName

az network private-dns link vnet create --resource-group $resource_group_name \
        --zone-name $zoneName \
        --name myzonelink \
        --virtual-network $vnet_name \
        --registration-enabled false 

az network private-endpoint dns-zone-group create \
        --resource-group $resource_group_name \
        --endpoint-name $private_endpoint_name_cr \
        --name "MyPrivateZoneGroup" \
        --private-dns-zone $zoneName \
        --zone-name "myzone" 
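
Once the private endpoint and DNS zone group exist, the registry hostname should resolve to a private address when looked up from inside the virtual network. The sketch below classifies an IPv4 address as private (RFC 1918) or not; `is_private_ipv4` is a hypothetical helper, and the lookup itself is left as a comment because it only gives a meaningful answer on the deployer VM.

```shell
# Hypothetical helper: return 0 if $1 is an RFC 1918 private IPv4 address.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.*)
      # 172.16.0.0/12 covers second octets 16 through 31.
      second=${1#172.}
      second=${second%%.*}
      [ "$second" -ge 16 ] && [ "$second" -le 31 ] ;;
    *) return 1 ;;
  esac
}

is_private_ipv4 10.1.3.4 && echo "10.1.3.4 is private"

# On the deployer VM, something along these lines could feed the helper:
#   ip=$(getent hosts "$mycontainerregistry.azurecr.io" | awk '{print $1}')
#   is_private_ipv4 "$ip" || echo "registry still resolves to a public address"
```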

5. Run the deployer.

// Install helm

curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
    
// Set environment variables for CoA

version=4.2.0
coa_identifier=vsmp
batchsubnetid=$(az network vnet subnet show --resource-group $resource_group_name --vnet-name $vnet_name --name batchnodessubnet --query id --output tsv)
    
// Download the installer

wget https://github.com/microsoft/CromwellOnAzure/releases/download/$version/deploy-cromwell-on-azure-linux.tar.gz
tar -xf deploy-cromwell-on-azure-linux.tar.gz 
chmod 744 deploy-cromwell-on-azure-linux
   
export AzureServicesAuthConnectionString="RunAs=Developer;DeveloperTool=AzureCli"
./deploy-cromwell-on-azure-linux --SubscriptionId $subscription --RegionName $location \
        --MainIdentifierPrefix $coa_identifier \
        --StorageAccountName $storage_account_name \
        --PrivateNetworking true \
        --BatchNodesSubnetId $batchsubnetid \
        --DisableBatchNodesPublicIpAddress true \
        --DockerInDockerImageName "$mycontainerregistry.azurecr.io/docker:v1" \
        --BlobxferImageName "$mycontainerregistry.azurecr.io/blobxfer:v1" \
        --ResourceGroupName $resource_group_name \
        --VnetName $vnet_name \
        --VnetResourceGroupName $resource_group_name \
        --VmSubnetName vmsubnet \
        --PostgreSqlSubnetName sqlsubnet \
        --HelmBinaryPath /usr/sbin/helm
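
The deployer command depends on many shell variables, and an unset one expands to an empty argument. A pre-flight check along the following lines can catch that before the deployer starts. This is a sketch only; `require_vars` is a hypothetical helper name.

```shell
# Hypothetical helper: return non-zero if any named variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check the variables used by the deployer invocation above.
if require_vars subscription location coa_identifier storage_account_name \
                batchsubnetid mycontainerregistry resource_group_name vnet_name; then
  echo "all deployer variables are set"
else
  echo "set the missing variables before running the deployer"
fi
```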

6. Update WDL

// [Use private Docker containers hosted on Azure](https://github.com/microsoft/CromwellOnAzure/blob/develop/docs/troubleshooting-guide.md#use-private-docker-containers-hosted-on-azure)

// Update the container path in test.wdl ($storage_account_name/inputs/test/test.wdl) [issue 585](https://github.com/microsoft/CromwellOnAzure/issues/585)

docker: '$mycontainerregistry.azurecr.io/ubuntu:22.04'

// Upload a new trigger file to test.

{
    "WorkflowUrl": "/$storage_account_name/inputs/test/test.wdl",
    "WorkflowInputsUrl": "/$storage_account_name/inputs/test/testInputs.json",
    "WorkflowInputsUrls": null,
    "WorkflowOptionsUrl": null,
    "WorkflowDependenciesUrl": null
}
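
The trigger file shown above uses `$storage_account_name` as a placeholder, so the real account name has to be substituted before upload. One way to do that, sketched here, is to generate the file from a here-document. The output path `test-trigger.json` is hypothetical, and the upload command in the closing comment assumes the trigger belongs under `new/` in the `workflows` container; verify that against your CoA version.

```shell
# Fall back to the sample value from step 0 if the variable is not set.
storage_account_name=${storage_account_name:-storageaccount123}

# Hypothetical output path.
trigger_file="${TMPDIR:-/tmp}/test-trigger.json"

# The unquoted EOF lets the shell expand $storage_account_name.
cat > "$trigger_file" <<EOF
{
    "WorkflowUrl": "/$storage_account_name/inputs/test/test.wdl",
    "WorkflowInputsUrl": "/$storage_account_name/inputs/test/testInputs.json",
    "WorkflowInputsUrls": null,
    "WorkflowOptionsUrl": null,
    "WorkflowDependenciesUrl": null
}
EOF

cat "$trigger_file"

# Assumed upload step (needs network access and permissions, so commented out):
#   az storage blob upload --account-name "$storage_account_name" \
#       --container-name workflows --name new/test-trigger.json \
#       --file "$trigger_file" --auth-mode login
```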

For CoA revisions prior to 4.0 that use a VM, please add these additional steps.

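4. Create a Cosmos DB account for CoA without public access, and establish private endpoints in the virtual network we already created.

// Set $failoverLocation and $private_endpoint_name_cosmos first; they are not defined in step 0.

cosmos_db_name=coacosmosdbjsaun123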

az cosmosdb create --name $cosmos_db_name --resource-group $resource_group_name \
    --default-consistency-level Eventual \
    --locations regionName="$location" failoverPriority=0 isZoneRedundant=False \
    --locations regionName="$failoverLocation" failoverPriority=1 isZoneRedundant=False

cosmosId="/subscriptions/$subscription/resourceGroups/$resource_group_name/providers/Microsoft.DocumentDB/databaseAccounts/$cosmos_db_name"
MSYS_NO_PATHCONV=1 az network private-endpoint create \
    --name $private_endpoint_name_cosmos \
    --resource-group $resource_group_name \
    --vnet-name $vnet_name  \
    --subnet pesubnet \
    --private-connection-resource-id $cosmosId \
    --group-id "Sql" \
    --connection-name "myConnection"
zoneName="privatelink.documents.azure.com"

az network private-dns zone create --resource-group $resource_group_name \
    --name  $zoneName

az network private-dns link vnet create --resource-group $resource_group_name \
    --zone-name $zoneName \
    --name myzonelink \
    --virtual-network $vnet_name \
    --registration-enabled false 

az network private-endpoint dns-zone-group create \
    --resource-group $resource_group_name \
    --endpoint-name $private_endpoint_name_cosmos \
    --name "MyPrivateZoneGroup" \
    --private-dns-zone $zoneName \
    --zone-name "myzone" 

For CoA revisions prior to 4.2, please add the following.

6. Update managed identity access to run the test workflow

// Add Network Contributor at the Resource Group [issue 450](https://github.com/microsoft/CromwellOnAzure/issues/450)

Assign the Network Contributor RBAC role to the managed identity (MI) of the CoA VM at the resource group scope; this fixes the issue.
Restart the CoA VM.
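
The role assignment can also be made from the CLI. The sketch below only prints the command so it can be reviewed before it is run; `network_contributor_cmd` is a hypothetical helper, and `<principal-id>` is a placeholder for the principal ID of the managed identity in your deployment.

```shell
# Hypothetical helper: print the role assignment command instead of running it.
# $1 = principal (object) ID of the CoA managed identity
# $2 = subscription ID, $3 = resource group name
network_contributor_cmd() {
  printf 'az role assignment create --assignee "%s" --role "Network Contributor" --scope "/subscriptions/%s/resourceGroups/%s"\n' "$1" "$2" "$3"
}

# <principal-id> is a placeholder; look it up for your deployment.
network_contributor_cmd "<principal-id>" "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" rgname
```

After the assignment, the VM can be restarted with `az vm restart -g $resource_group_name -n <coa-vm-name>`, where `<coa-vm-name>` is again a placeholder.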