Feature/aml managed vnet #43
base: main
More to come!
## Scenario

<!-- Describe the usage scenario for this template. Describe the challenges this recipe aims to address. -->
This scenario aims to address the challenge of correctly configuring an Azure machine learning workspace within a Microsoft managed VNet including ensuring appropriate connectivity with common services such as Azure Storage Account, Azure Key Vault, Azure Container Registry.
Suggested change:
This scenario aims to address the challenge of correctly configuring an Azure Machine Learning (AML) workspace within a managed virtual network (managed VNet). This includes ensuring appropriate connectivity with common Azure services such as Azure Storage Account, Azure Key Vault, and Azure Container Registry.
### Problem Summary

<!-- Briefly describe the problem that this recipe intends to resolve or make easier. -->
Azure machine learning workspace is composed of a number of different components: Machine Learning Studio, workspace storage account, key vault, machine learning Data Pipelines, container registry and other external data sources like Azure SQL Server, ADLS Gen2. Despite being under a single machine learning umbrella service, each of these sub-components require a slightly different VNet configuration treatment to properly isolate network traffic. For example, generally you need at least four Private Endpoints configured for a single workspace each with connecting to a different sub-component. Another example, while managed workspace are generally a single tenant service with compute resources spun up within a designated Managed VNet, data scientist vm's, azure dev ops pipelines could be multi-tenanted and therefore require provisioning a Private Endpoint within the bridge vnet in order to connect to the workspace successfully.
Suggested change:
Azure machine learning workspace is composed of a number of different components: Machine Learning Studio, workspace storage account, key vault, machine learning Data Pipelines, container registry and other external data sources like Azure SQL Server, Azure Data Lake Storage Gen2 (ADLS Gen2). Despite being under a single machine learning umbrella service, each of these sub-components require a slightly different VNet configuration treatment to properly isolate network traffic. For instance, configuring at least four private endpoints is necessary for a single workspace, each connecting to a distinct sub-component. Another example pertains to managed workspaces, which are typically single-tenant services with compute resources deployed within a designated managed VNet. However, data scientist VMs and Azure DevOps pipelines could be multi-tenanted, and thus require the provisioning of a private endpoint within the bridge VNet in order to connect to the workspace successfully.
In addition to this, customers will also need to ensure that traffic between the Azure machine learning workspace studio can still privately flow between the workspace components and additional Azure services such as storage external to the managed VNet. This is done through the use of Private Endpoints. Another important that one has to keep in mind is the secure integration of Azure Machine Learning with Azure DevOps pipelines and Github actions that are enabled through the bridge virtual network (VNet to access resources in the architecure diagram).
Suggested change:
In addition to this, customers will also need to ensure that traffic within the Azure machine learning workspace studio can still privately flow between the workspace components and additional Azure services such as external storage accounts. This is done through the use of private endpoints. Another crucial aspect to consider is the secure integration of Azure Machine Learning with Azure DevOps pipelines and GitHub actions, which is facilitated through a bridge virtual network ("VNet to access AML workspace and resources" in the architecture diagram).
This recipe aims to provide developers a starting point with an IaC example of an Azure machine learning managed-vnet workspace with all sub-components correctly configured to ensure traffic stays private, while still being able to connect to common additional services such as Azure Storage Account, ADLS Gen2 and Azure Key Vault.
Suggested change:
This recipe aims to provide developers with a starting point, offering an Infrastructure as Code (IaC) example of an Azure Machine Learning managed VNet workspace with the required secured networking configured as described above.
### Architecture

<!-- Include a high-level architecture diagram of the components used in this recipe. -->
![architecture](./media/AML_ManagedVNet_Secure_Architecture.png)
Suggest removing the Azure services if they are not deployed as part of the recipe.
Also, suggest using a different font "helvetica" or "tahoma" for the text.
Suggest providing a brief description (few paragraphs) of what's happening in this architecture diagram.
In addition to adding a private endpoint for the AML workspace, a private endpoint should be created for the AML dependent resources such as Azure Blob Storage and Key Vault inside your chosen VNet to enable connectivity to these resources.

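As a rough illustration of the private endpoint for a dependent resource described above, a minimal Terraform sketch for the Blob Storage case might look like the following. The resource labels `azurerm_subnet.private_endpoints` and `azurerm_storage_account.default`, as well as the endpoint names, are assumptions for illustration and are not taken from this PR; a Key Vault endpoint would follow the same shape with `subresource_names = ["vault"]`.

```hcl
# Sketch only: the subnet and storage account labels below are hypothetical
# and must be replaced with the resources defined in your own configuration.
resource "azurerm_private_endpoint" "blob" {
  name                = "pe-blob-${var.environment}"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name

  # Subnet in the chosen VNet that hosts the private endpoint (assumed name).
  subnet_id = azurerm_subnet.private_endpoints.id

  private_service_connection {
    name                           = "psc-blob-${var.environment}"
    private_connection_resource_id = azurerm_storage_account.default.id
    subresource_names              = ["blob"] # target the Blob service
    is_manual_connection           = false
  }
}
```

Name resolution for the endpoint (for example, a private DNS zone linked to the VNet) is not shown here and would need to be configured separately.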
### Testing Solution
To test Azure Machine Learning (AML) end-to-end, you can follow these steps using the **AzureML in a day notebook** from the **Samples** folder within the Notebooks section of Azure Machine Learning:
Where is the notebook? If it's just one notebook, I suggest copying it into the repo and adding a reference to it. That way, the user doesn't have to go to another repo.
Having said that, please feel free to add a link to the "AML repo" for reference.
### Testing Solution
In general, please follow the markdown linting rules, for example leaving a blank line before and after headers. The "markdownlint" VS Code extension can help with this.
- Create a Compute Instance:
  - On the left navigation, select Compute and then Compute Instance. Create a new compute instance and supply a name. Keep all the defaults, except under Networking select No Public IP. Select Review & Create and wait for the deployment to be completed.
Can we add a GIF image to show this?
name = "sa${random_string.sa_prefix.result}${var.environment}"
location = azurerm_resource_group.default.location
resource_group_name = azurerm_resource_group.default.name
account_tier = "Standard"
Is there any particular reason for not deploying "ADLS Gen2" account? (is_hns_enabled=true).
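For reference, a minimal sketch of what an ADLS Gen2 account could look like in Terraform, assuming it is deployed as an additional storage account next to the one in the diff. The resource label `adls`, the name prefix, and the replication type are assumptions for illustration and are not part of this PR. Whether the workspace's own default storage account may have hierarchical namespace enabled should be checked against the Azure Machine Learning documentation before reusing this for that account.

```hcl
# Sketch only: "adls" and the "adls" name prefix are hypothetical.
resource "azurerm_storage_account" "adls" {
  name                     = "adls${random_string.sa_prefix.result}${var.environment}"
  location                 = azurerm_resource_group.default.location
  resource_group_name      = azurerm_resource_group.default.name
  account_tier             = "Standard"
  account_replication_type = "LRS" # assumption; choose per your redundancy needs
  account_kind             = "StorageV2"

  # Enables the hierarchical namespace, which makes this an ADLS Gen2 account.
  is_hns_enabled = true
}
```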
- Sign in to the Azure Machine Learning studio.
- Navigate to Notebooks and select the Samples tab.
- Look for the AzureML in a day notebook.
- Open the notebook and click on 'Clone' to create a copy in your workspace file share.
Hi @yogitasrivastava / @oloomi / @RenSilvaAU, overall, this is looking really good. Well done! The recipes in this repo also deploy the "customer VNet", and that's missing from this sample. Suggest you add the following Azure resources:
Refer to https://github.com/Azure-Samples/virtual-network-integration-recipes/pull/27/files?short_path=64a62a4#diff-64a62a4611883e924d833faeb925f4339bb201c9a3b2df284c1446b9db30f94b for additional guidance. Happy to chat about the details.
Add automation for managed-vnet AML workspace (allow internet outbound) to Azure recipes
Does this introduce a breaking change?
Pull Request Type
What kind of change does this Pull Request introduce?
How to Test
    git clone [repo-address]
    cd [repo-name]
    git checkout [branch-name]
    npm install
What to Check
Verify that the following are valid
Other Information
Please follow the Readme.md to install the code for automation.