From 683185eb10752938210545c06afb12e39003ebe2 Mon Sep 17 00:00:00 2001 From: Chad Kittel Date: Tue, 22 Apr 2025 16:22:00 +0000 Subject: [PATCH] End to end walkthrough cleanup --- README.md | 25 +-- infra-as-code/bicep/acr.bicep | 16 +- infra-as-code/bicep/keyvault.bicep | 19 +- infra-as-code/bicep/machinelearning.bicep | 197 +++++++++++++++++- infra-as-code/bicep/main.bicep | 6 +- .../modules/keyvaultRoleAssignment.bicep | 2 +- infra-as-code/bicep/openai.bicep | 30 ++- infra-as-code/bicep/storage.bicep | 51 ++++- infra-as-code/bicep/webapp.bicep | 2 +- 9 files changed, 307 insertions(+), 41 deletions(-) diff --git a/README.md b/README.md index 81f5304..2ea040d 100644 --- a/README.md +++ b/README.md @@ -61,8 +61,6 @@ Follow these instructions to deploy this example to your Azure subscription, try If you're executing this from WSL, be sure the Azure CLI is installed in WSL and is not using the version installed in Windows. `which az` should show `/usr/bin/az`. -- The [az Bicep tools installed](https://learn.microsoft.com/azure/azure-resource-manager/bicep/install) - ### 1. :rocket: Deploy the infrastructure The following steps are required to deploy the infrastructure from the command line. @@ -98,7 +96,7 @@ The following steps are required to deploy the infrastructure from the command l 1. Create a resource group and deploy the infrastructure. *There is an optional tracking ID on this deployment. To opt out of its use, add the following parameter to the deployment code below: `-p telemetryOptOut true`.* - + ```bash RESOURCE_GROUP=rg-chat-basic-${LOCATION} az group create -l $LOCATION -n $RESOURCE_GROUP @@ -150,11 +148,9 @@ Here you'll test your flow by invoking it directly from the Azure AI Foundry por 1. Click **Start compute session**. -1. :clock8: Wait for that button to change to *Compute session running*. This may take about five minutes. 
- - If you get an error related to pip and dependency resolver, this is because of the temporary workaround you followed in the prior steps, this is safe to ignore. +1. :clock8: Wait for that button to change to *Compute session running*. This may take about six minutes. - *Do not advance until the serverless compute is running.* + *Do not advance until the serverless compute session is running.* 1. Click the enabled **Chat** button on the UI. @@ -174,7 +170,7 @@ Here you'll take your tested flow and deploy it to a managed online endpoint. - **Deployment name**: ept-chat-deployment - **Virtual machine**: Choose a small virtual machine size from which you have quota. 'Standard_D2as_v4' is plenty for this sample. - - **Instance count**: 3. This is the recommended minimum count. + - **Instance count**: 3. *This is the recommended minimum count.* - **Inferencing data collection**: Enabled 1. Set the following Advanced settings, and click **Next**. @@ -196,7 +192,7 @@ Here you'll take your tested flow and deploy it to a managed online endpoint. 1. :clock9: Wait for the deployment to finish creating. - The deployment can take over ten minutes to create. To check on the process, navigate to the **Deployments** screen using the link in the left navigation. Eventually 'ept-chat-deployment' will be on this list and then eventually the deployment will be listed with a State of 'Succeeded'. Use the **Refresh** button as needed. + The deployment can take over ten minutes to create. To check on the process, navigate to the deployments screen using the **Models + endpoints** link in the left navigation. Eventually 'ept-chat-deployment' will be on this list and the deployment will be listed with a State of 'Succeeded'. Use the **Refresh** button as needed. *Do not advance until this deployment is complete.* @@ -215,12 +211,12 @@ Workloads build chat functionality into an application. 
Those interfaces usually ```bash APPSERVICE_NAME=app-$BASE_NAME -az webapp deploy -g $RESOURCE_GROUP -n $APPSERVICE_NAME --type zip --src-url https://raw.githubusercontent.com/Azure-Samples/openai-end-to-end-basic/main/website/chatui.zip +az webapp deploy -g $RESOURCE_GROUP -n $APPSERVICE_NAME --type zip --src-url https://github.com/Azure-Samples/openai-end-to-end-basic/raw/refs/heads/main/website/chatui.zip ``` -> Sometimes the prior deployment will fail with a `GatewayTimeout`. If you receive that error, you're safe to simply execute the command again. +> Sometimes the prior command will fail with a `GatewayTimeout`. If you receive that error, you're safe to simply execute the command again. -## :checkered_flag: Try it out. Test the deployed application. +## :checkered_flag: Try it out. Test the deployed application After the deployment is complete, you can try the deployed application by navigating to the Web App's URL in a web browser. @@ -230,15 +226,16 @@ You can also execute the following from your workstation. Unfortunately, this co az webapp browse -g $RESOURCE_GROUP -n $APPSERVICE_NAME ``` -Once you're there, ask your solution a question. Like before, you question should ideally involve recent data or events, something that would only be known by the RAG process including content from Wikipedia. +Once you're there, ask your solution a question. Like before, your question should ideally involve recent data or events, something that would only be known by the RAG process including context from Wikipedia. ## :broom: Clean up resources -Most Azure resources deployed in the prior steps will incur ongoing charges unless removed. Additionally, a few of the resources deployed go into a soft delete status which may restrict the ability to redeploy another resource with the same name and may not release quota, so it is best to purge any soft deleted resources once you are done exploring. 
Use the following commands to delete the deployed resources and resource group and to purge each of the resources with soft delete. +Most Azure resources deployed in the prior steps will incur ongoing charges unless removed. Additionally, a few of the resources deployed go into a soft delete status which will restrict the ability to redeploy another resource with the same name and might not release quota. It's best to purge any soft deleted resources once you are done exploring. Use the following commands to delete the deployed resources and resource group and to purge each of the resources with soft delete. > **Note:** This will completely delete any data you may have included in this example and it will be unrecoverable. ```bash +# These deletes and purges take about 30 minutes to run. az group delete -n $RESOURCE_GROUP -y # Purge the soft delete resources diff --git a/infra-as-code/bicep/acr.bicep b/infra-as-code/bicep/acr.bicep index c97dfc7..8c30a4f 100755 --- a/infra-as-code/bicep/acr.bicep +++ b/infra-as-code/bicep/acr.bicep @@ -25,7 +25,7 @@ resource logWorkspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' exis name: logWorkspaceName } -resource acrResource 'Microsoft.ContainerRegistry/registries@2023-01-01-preview' = { +resource acrResource 'Microsoft.ContainerRegistry/registries@2024-11-01-preview' = { name: acrName location: location sku: { @@ -41,18 +41,28 @@ resource acrResource 'Microsoft.ContainerRegistry/registries@2023-01-01-preview' networkRuleBypassOptions: 'None' publicNetworkAccess: 'Enabled' zoneRedundancy: 'Disabled' + dataEndpointEnabled: true + metadataSearch: 'Disabled' } } //ACR diagnostic settings resource acrResourceDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { - name: '${acrResource.name}-diagnosticSettings' + name: 'default' scope: acrResource properties: { workspaceId: logWorkspace.id logs: [ { - categoryGroup: 'allLogs' + category: 'ContainerRegistryRepositoryEvents' + enabled: true + 
retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'ContainerRegistryLoginEvents' enabled: true retentionPolicy: { enabled: false diff --git a/infra-as-code/bicep/keyvault.bicep b/infra-as-code/bicep/keyvault.bicep index f0f1839..ab43bd7 100755 --- a/infra-as-code/bicep/keyvault.bicep +++ b/infra-as-code/bicep/keyvault.bicep @@ -21,7 +21,7 @@ resource logWorkspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' exis name: logWorkspaceName } -resource keyVault 'Microsoft.KeyVault/vaults@2019-09-01' = { +resource keyVault 'Microsoft.KeyVault/vaults@2024-11-01' = { name: keyVaultName location: location properties: { @@ -32,6 +32,8 @@ resource keyVault 'Microsoft.KeyVault/vaults@2019-09-01' = { networkAcls: { defaultAction: 'Allow' // Production readiness change: This sample uses identity as the perimeter. Production scenarios should layer in network perimeter control as well. bypass: 'AzureServices' // Required for AppGW communication if firewall is enabled in the future. + ipRules: [] + virtualNetworkRules: [] } tenantId: subscription().tenantId @@ -39,6 +41,9 @@ resource keyVault 'Microsoft.KeyVault/vaults@2019-09-01' = { enableRbacAuthorization: true // Using RBAC enabledForDeployment: true // VMs can retrieve certificates enabledForTemplateDeployment: true // ARM can retrieve values + accessPolicies: [] // Using RBAC + publicNetworkAccess: 'Enabled' // Production readiness change: This sample uses identity as the perimeter. Production scenarios should layer in network perimeter control as well. 
+ enabledForDiskEncryption: false enableSoftDelete: true softDeleteRetentionInDays: 7 @@ -48,13 +53,21 @@ resource keyVault 'Microsoft.KeyVault/vaults@2019-09-01' = { //Key Vault diagnostic settings resource keyVaultDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { - name: '${keyVault.name}-diagnosticSettings' + name: 'default' scope: keyVault properties: { workspaceId: logWorkspace.id logs: [ { - categoryGroup: 'allLogs' + category: 'AuditEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AzurePolicyEvaluationDetails' enabled: true retentionPolicy: { enabled: false diff --git a/infra-as-code/bicep/machinelearning.bicep b/infra-as-code/bicep/machinelearning.bicep index 41874ef..6b24edf 100755 --- a/infra-as-code/bicep/machinelearning.bicep +++ b/infra-as-code/bicep/machinelearning.bicep @@ -103,7 +103,7 @@ resource amlWorkspaceSecretsReaderRole 'Microsoft.Authorization/roleDefinitions@ // Endpoint -> AcrPull to the Container Registry // Endpoint -> Storage Blob Data Contributor to the storage account -// To light up the Azure AI portal experience, the user themsleves need a few data plane permissions. To simulate that for this implementation +// To light up the Azure AI portal experience, the user themselves need a few data plane permissions. To simulate that for this implementation // we will assign the user that is running this deployment the following three roles: @description('Assign your user the ability to manage files in storage. This is needed to use the Prompt flow editor in the Azure AI Foundry portal.') @@ -153,7 +153,7 @@ resource aiHub 'Microsoft.MachineLearningServices/workspaces@2024-07-01-preview' tier: 'Basic' } identity: { - type: 'SystemAssigned' // This resource's identity is automatically assigned priviledge access to ACR, Storage, Key Vault, and Application Insights. 
+ type: 'SystemAssigned' // This resource's identity is automatically assigned privileged access to ACR, Storage, Key Vault, and Application Insights. } properties: { friendlyName: 'Azure OpenAI Chat Hub' @@ -201,7 +201,7 @@ resource aiHub 'Microsoft.MachineLearningServices/workspaces@2024-07-01-preview' } } -@description('Azure Diagnostics: Azure AI Foundry hub - allLogs') +@description('Azure Diagnostics: Azure AI Foundry hub') resource aiHubDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { name: 'default' scope: aiHub @@ -209,7 +209,7 @@ resource aiHubDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-pre workspaceId: logWorkspace.id logs: [ { - categoryGroup: 'allLogs' // Production readiness change: In production, this is probably excessive. Please tune to just the log streams that add value to your workload's operations. + category: 'ComputeInstanceEvent' enabled: true retentionPolicy: { enabled: false @@ -232,7 +232,7 @@ resource chatProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = tier: 'Basic' } identity: { - type: 'SystemAssigned' // This resource's identity is automatically assigned priviledge access to ACR, Storage, Key Vault, and Application Insights. + type: 'SystemAssigned' // This resource's identity is automatically assigned privileged access to ACR, Storage, Key Vault, and Application Insights. } properties: { friendlyName: 'Chat with Wikipedia project' @@ -256,7 +256,7 @@ resource chatProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = } // TODO: Noticed that traffic goes back to 0% if this is template redeployed after the Prompt flow - // deplopyment is complete. How can we stop that? + // deployment is complete. How can we stop that? 
} } @@ -285,15 +285,176 @@ resource projectOpenAIUserForOnlineEndpointRoleAssignment 'Microsoft.Authorizati } } -@description('Azure Diagnostics: AI Foundry chat project - allLogs') +@description('Azure Diagnostics: AI Foundry chat project') resource chatProjectDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { name: 'default' scope: chatProject properties: { workspaceId: logWorkspace.id logs: [ + // Production readiness change: In production, these log categories are probably excessive. Please tune to just enable the log streams that add value to your workload's operations. { - categoryGroup: 'allLogs' // Production readiness change: In production, this is probably excessive. Please tune to just the log streams that add value to your workload's operations. + category: 'AmlComputeClusterEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AmlComputeClusterNodeEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AmlComputeJobEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AmlComputeCpuGpuUtilization' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AmlRunStatusChangedEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'ModelsChangeEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'ModelsReadEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'ModelsActionEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'DeploymentReadEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'DeploymentEventACI' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'InferencingOperationACI' + enabled: true + retentionPolicy: { + enabled: false + 
days: 0 + } + } + { + category: 'EnvironmentChangeEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'EnvironmentReadEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'DataLabelChangeEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'DataLabelReadEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'DataSetChangeEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'DataSetReadEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'PipelineChangeEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'PipelineReadEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'RunEvent' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'RunReadEvent' enabled: true retentionPolicy: { enabled: false @@ -305,7 +466,7 @@ resource chatProjectDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05- } -@description('Azure Diagnostics: AI Foundry chat project -> endpoint allLogs') +@description('Azure Diagnostics: AI Foundry chat project -> endpoint') resource chatProjectEndpointDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { name: 'default' scope: chatProject::endpoint @@ -313,7 +474,23 @@ resource chatProjectEndpointDiagSettings 'Microsoft.Insights/diagnosticSettings@ workspaceId: logWorkspace.id logs: [ { - categoryGroup: 'allLogs' // Production readiness change: In production, this is probably excessive. Please tune to just the log streams that add value to your workload's operations. 
+ category: 'AmlOnlineEndpointConsoleLog' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AmlOnlineEndpointTrafficLog' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AmlOnlineEndpointEventLog' enabled: true retentionPolicy: { enabled: false diff --git a/infra-as-code/bicep/main.bicep b/infra-as-code/bicep/main.bicep index a19900b..217b3cb 100755 --- a/infra-as-code/bicep/main.bicep +++ b/infra-as-code/bicep/main.bicep @@ -18,7 +18,7 @@ param telemetryOptOut bool = false var varCuaid = '6aa4564a-a8b7-4ced-8e57-1043a41f4747' // ---- Log Analytics workspace ---- -resource logWorkspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' = { +resource logWorkspace 'Microsoft.OperationalInsights/workspaces@2023-09-01' = { name: 'log-${baseName}' location: location properties: { @@ -26,6 +26,10 @@ resource logWorkspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' = { name: 'PerGB2018' } retentionInDays: 30 + forceCmkForQuery: false + workspaceCapping: { + dailyQuotaGb: 10 // Production readiness change: In production, tune this value to ensure operational logs are collected, but a reasonable cap is set. 
+ } publicNetworkAccessForIngestion: 'Enabled' publicNetworkAccessForQuery: 'Enabled' } diff --git a/infra-as-code/bicep/modules/keyvaultRoleAssignment.bicep b/infra-as-code/bicep/modules/keyvaultRoleAssignment.bicep index fde2522..a804e8d 100755 --- a/infra-as-code/bicep/modules/keyvaultRoleAssignment.bicep +++ b/infra-as-code/bicep/modules/keyvaultRoleAssignment.bicep @@ -15,7 +15,7 @@ param principalId string param keyVaultName string // ---- Existing resources ---- -resource keyVault 'Microsoft.KeyVault/vaults@2023-02-01' existing = { +resource keyVault 'Microsoft.KeyVault/vaults@2024-11-01' existing = { name: keyVaultName } diff --git a/infra-as-code/bicep/openai.bicep b/infra-as-code/bicep/openai.bicep index 33beb4f..34d1245 100755 --- a/infra-as-code/bicep/openai.bicep +++ b/infra-as-code/bicep/openai.bicep @@ -141,7 +141,7 @@ resource openAiAccount 'Microsoft.CognitiveServices/accounts@2023-10-01-preview' format: 'OpenAI' name: 'gpt-35-turbo' version: '0125' // If your selected region doesn't support this version, please change it. - // az cognitiveservices model list -l YOUR_REGION --query "sort([?model.name == 'gpt-35-turbo' && kind == 'OpenAI'].model.version)" -o tsv + // az cognitiveservices model list -l $LOCATION --query "sort([?model.name == 'gpt-35-turbo' && kind == 'OpenAI'].model.version)" -o tsv } raiPolicyName: openAiAccount::blockingFilter.name versionUpgradeOption: 'OnceNewDefaultVersionAvailable' // Production readiness change: Always be explicit about model versions, use 'NoAutoUpgrade' to prevent version changes. 
@@ -151,13 +151,37 @@ resource openAiAccount 'Microsoft.CognitiveServices/accounts@2023-10-01-preview' //OpenAI diagnostic settings resource openAIDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { - name: '${openAiAccount.name}-diagnosticSettings' + name: 'default' scope: openAiAccount properties: { workspaceId: logWorkspace.id logs: [ { - categoryGroup: 'allLogs' + category: 'Audit' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'RequestResponse' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'AzureOpenAIRequestUsage' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'Trace' enabled: true retentionPolicy: { enabled: false diff --git a/infra-as-code/bicep/storage.bicep b/infra-as-code/bicep/storage.bicep index 199996a..952b958 100755 --- a/infra-as-code/bicep/storage.bicep +++ b/infra-as-code/bicep/storage.bicep @@ -23,7 +23,7 @@ resource logWorkspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' exis } // ---- Storage resources ---- -resource aiStudioStorageAccount 'Microsoft.Storage/storageAccounts@2023-05-01' = { +resource aiStudioStorageAccount 'Microsoft.Storage/storageAccounts@2024-01-01' = { name: aiStudioStorageAccountName location: location sku: { @@ -35,7 +35,16 @@ resource aiStudioStorageAccount 'Microsoft.Storage/storageAccounts@2023-05-01' = accessTier: 'Hot' allowBlobPublicAccess: true allowSharedKeyAccess: true + isSftpEnabled: false + isHnsEnabled: false allowCrossTenantReplication: false + defaultToOAuthAuthentication: true + isLocalUserEnabled: false + routingPreference: { + publishInternetEndpoints: true + publishMicrosoftEndpoints: true + routingChoice: 'MicrosoftRouting' + } encryption: { keySource: 'Microsoft.Storage' requireInfrastructureEncryption: false @@ -75,13 +84,29 @@ resource aiStudioStorageAccount 'Microsoft.Storage/storageAccounts@2023-05-01' = @description('Azure AI Foundry\'s 
blob storage account diagnostic settings.') resource aiStudioStorageAccountBlobDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { - name: '${aiStudioStorageAccount.name}-blobdiagnosticSettings' + name: 'default' scope: aiStudioStorageAccount::Blob properties: { workspaceId: logWorkspace.id logs: [ { - categoryGroup: 'allLogs' + category: 'StorageRead' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'StorageWrite' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'StorageDelete' enabled: true retentionPolicy: { enabled: false @@ -95,13 +120,29 @@ resource aiStudioStorageAccountBlobDiagSettings 'Microsoft.Insights/diagnosticSe @description('Azure AI Foundry\'s file storage account diagnostic settings.') resource aiStudioStorageAccountFileDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { - name: '${aiStudioStorageAccount.name}-filediagnosticSettings' + name: 'default' scope: aiStudioStorageAccount::File properties: { workspaceId: logWorkspace.id logs: [ { - categoryGroup: 'allLogs' + category: 'StorageRead' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'StorageWrite' + enabled: true + retentionPolicy: { + enabled: false + days: 0 + } + } + { + category: 'StorageDelete' enabled: true retentionPolicy: { enabled: false diff --git a/infra-as-code/bicep/webapp.bicep b/infra-as-code/bicep/webapp.bicep index 7205fac..247fe1d 100755 --- a/infra-as-code/bicep/webapp.bicep +++ b/infra-as-code/bicep/webapp.bicep @@ -127,7 +127,7 @@ resource appsettings 'Microsoft.Web/sites/config@2022-09-01' = { //Web App diagnostic settings resource webAppDiagSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = { - name: '${webApp.name}-diagnosticSettings' + name: 'default' scope: webApp properties: { workspaceId: logWorkspace.id