Azure Cloud Architect

rajeshdevopsengineer edited this page Sep 19, 2025 · 2 revisions

Azure Cloud Architect interviews

  • Section A: Azure Cloud Architect (200 Questions)
  • Section B: Terraform (100 Questions)
  • Section C: Azure ARM Templates (50 Questions)
  • Section D: DevOps Releases & Deployments (150 Questions)

✅ Section A – Azure Cloud Architect (200 Questions)


1. Explain the difference between Azure Regions, Availability Zones, and Availability Sets

Short explanation

  • Azure Region — a geographic area (e.g., East US, West Europe). A region contains multiple datacenters and is the highest-level location boundary. Choose a region for data residency, latency, compliance, and available services.
  • Availability Zone (AZ) — physically separate datacenters within a single Azure region. Each zone has independent power, cooling, and networking. Use AZs to tolerate datacenter-level failures. Zone-aware services (VMs with --zone, managed disks, Load Balancer, etc.) can be deployed to specific zones.
  • Availability Set — logical grouping of VMs within a single datacenter (single region). It distributes VMs across fault domains (hardware racks) and update domains (rolling upgrade groups) to protect against host or rack failures and planned maintenance. Availability Sets do not protect against a whole-datacenter outage — use Availability Zones for that.

Real-world project example

Project: A mid-sized e-commerce site with a web tier (VMs), API tier (VMs), and database (managed service).

  • Use Availability Sets if you only need protection against host failures within the same datacenter (cost sensitive, region does not support zones or cost constraints).
  • Use Availability Zones to survive a datacenter outage (higher SLA, multi-zone architecture).
  • Choose an Azure Region close to customers that meets compliance requirements (e.g., southindia for India customers).

Step-by-step: create an Availability Set vs zone-aware VM (Portal + CLI)

Portal (Availability Set)

  1. Portal → Resource Group → Create → Search “Availability set”.
  2. Create → name myAVSet → choose Fault domains (2 or 3), Update domains (20) → Create.
  3. When creating VMs, in the “Availability options” choose “Availability set” and select myAVSet.

CLI (Availability Set)

# create resource group
az group create -n RG-AVSet -l southindia

# create availability set
az vm availability-set create \
  --resource-group RG-AVSet \
  --name myAVSet \
  --platform-fault-domain-count 2 \
  --platform-update-domain-count 20
# create VM in that availability set
az vm create -g RG-AVSet -n webvm1 --image Ubuntu2204 \
  --availability-set myAVSet --admin-username azureuser --generate-ssh-keys

Portal (Availability Zones)

  1. Portal → Create VM → under “Availability options” select “Availability zone” and pick Zone 1/2/3.
  2. Repeat for other VMs in Zone 2 and Zone 3.

CLI (Availability Zones)

# create VMs distributed across zones
az group create -n RG-AZ -l eastus

az vm create -g RG-AZ -n web-az1 --image Ubuntu2204 --zone 1 --admin-username azureuser --generate-ssh-keys
az vm create -g RG-AZ -n web-az2 --image Ubuntu2204 --zone 2 --admin-username azureuser --generate-ssh-keys
az vm create -g RG-AZ -n web-az3 --image Ubuntu2204 --zone 3 --admin-username azureuser --generate-ssh-keys

When to use which

  • Use Availability Zones for high SLA and cross-datacenter resilience (if region supports them).
  • Use Availability Sets when zones aren't available, or cost/architecture requires single-datacenter redundancy.
  • Regions: choose by geography, compliance, service availability, latency.
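
Before committing to a zonal design, it is worth verifying that the target region and SKU actually support zones. A quick check (Standard_D2s_v3 is an example size; output columns vary by CLI version):

```shell
# list SKU entries that report Availability Zone support in the region
az vm list-skus --location eastus --size Standard_D2s_v3 --zone -o table
```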

2. How do you choose the right Azure Region for a global workload?

Stepwise decision factors (explain)

  1. Latency / proximity to users — pick regions close to majority of users.
  2. Data residency & compliance — legal/regulatory constraints might require certain regions.
  3. Service availability — not all Azure services / SKUs exist in every region.
  4. Costs & SKUs — price and available VM SKUs differ by region.
  5. Disaster recovery / business continuity — choose paired/nearby regions for geo-replication.
  6. Capacity & quotas — some regions may have temporary capacity constraints.
  7. Network topology — proximity to on-premises sites, ExpressRoute/Peering locations.
  8. Latency testing / proof-of-concept — measure real latency from user locations.

Real-world project example

Project: SaaS reporting platform with users in EU and APAC.

  • Primary region: westeurope (low latency to EU users).
  • Secondary (DR) region: northeurope (paired with westeurope), plus an APAC region (e.g., southeastasia) closer to APAC users, with active read replicas or geo-replication.
  • Use a global front door (Azure Front Door) to route users to nearest backend.

Step-by-step checks you can run (Portal + CLI)

Portal checks

  1. Portal → Subscriptions → Locations to see regions available to your subscription.
  2. Use the portal when creating resources to confirm they support the required service.

CLI checks

# list all regions available to subscription
az account list-locations -o table

# check if a particular VM SKU is available in a region
az vm list-skus --location westeurope --size Standard_D2s_v3 -o table

# check which regions support a resource type via provider metadata
az provider show --namespace Microsoft.Compute \
  --query "resourceTypes[?resourceType=='virtualMachines'].locations" -o table

Practical steps

  1. Map user geography and legal constraints.
  2. Shortlist candidate regions (primary + DR).
  3. Validate required services and SKUs in those regions (CLI az vm list-skus, attempt small test deployments).
  4. Run latency tests from representative clients (from on-prem or performance tests).
  5. Decide DR strategy: same-region zones for rack failure + cross-region for region failure.
  6. Document cost differences and expected SLAs.
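
For step 4, a rough client-side latency probe can be scripted with curl. The hostnames below are hypothetical placeholders; substitute an endpoint you control in each candidate region (for example a small storage account or static site):

```shell
# measure TCP connect and TLS handshake time to candidate-region endpoints;
# hostnames are placeholders - replace with real endpoints per region
for host in mytesteus.blob.core.windows.net mytestweu.blob.core.windows.net; do
  printf '%s: ' "$host"
  curl -s -o /dev/null -w 'connect=%{time_connect}s tls=%{time_appconnect}s\n' "https://$host/"
done
```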

3. What is the difference between IaaS, PaaS, and SaaS in Azure?

Definition & Azure examples

  • IaaS (Infrastructure as a Service) — you manage OS, runtime, and apps; Azure provides VMs, networking, storage. Examples: Azure Virtual Machines, Managed Disks, Azure Load Balancer.
  • PaaS (Platform as a Service) — Azure handles OS and platform; you deploy apps. Examples: Azure App Service, Azure SQL Database, Azure Functions, Azure Kubernetes Service (AKS has a managed control plane, so it sits between IaaS and PaaS).
  • SaaS (Software as a Service) — fully managed applications consumed by users. Examples: Microsoft 365, Dynamics 365, and third-party SaaS offerings.

Real-world project example

Project: Corporate web app with job scheduling and analytics.

  • IaaS: legacy background worker running on an Ubuntu VM and tooling that requires admin access.
  • PaaS: migrate web front-end to Azure App Service and database to Azure SQL to reduce ops overhead.
  • SaaS: use Microsoft 365 for collaboration and the Power BI service for analytics.

Step-by-step: quick setups (Portal + CLI)

IaaS — create a VM (CLI)

az group create -n RG-IaaS -l eastus
az vm create -g RG-IaaS -n vm-legacy --image Ubuntu2204 --admin-username azureuser --generate-ssh-keys

PaaS — create App Service + Azure SQL (CLI)

# resource group
az group create -n RG-PaaS -l eastus

# app service plan (Linux) and web app
az appservice plan create -g RG-PaaS -n myPlan --sku S1 --is-linux
az webapp create -g RG-PaaS -p myPlan -n mywebapp12345 --runtime "NODE:20-lts"

# Azure SQL
az sql server create -g RG-PaaS -n mysqlsrv123 -l eastus -u sqladmin -p 'P@ssw0rd!'
az sql db create -g RG-PaaS -s mysqlsrv123 -n myappdb --service-objective S0

SaaS — consume Microsoft 365: (no infra to create; configure tenant subscriptions, users, SSO). Steps are administrative (portal.office.com).

When to prefer which

  • If you need full OS control: IaaS.
  • If you want to focus on code, reduce patching/ops: PaaS.
  • For off-the-shelf productivity tools: SaaS.

4. Explain Azure Resource Manager (ARM) vs Classic deployment

Short explanation

  • Classic (Azure Service Manager, ASM) — the legacy management model: resources managed individually, no resource groups, older APIs. Microsoft has retired the classic deployment model for most services.
  • Azure Resource Manager (ARM) — the modern management layer: declarative templates (ARM templates / Bicep), resource groups, role-based access control (RBAC), tagging, dependency handling, and transactional deployments. ARM is the recommended and default model.

Benefits of ARM vs Classic

  • Group related resources into resource groups and manage them as a unit.
  • Declarative templates (repeatable, idempotent).
  • Fine-grained RBAC and policies.
  • Tagging, auditing, and easier automation.

Real-world project example

Project: Replace a manual multi-VM application deployment with ARM/Bicep pipelines for reproducibility. Use ARM templates in CI/CD to deploy VMs, network, and storage in a single deployment step.

Step-by-step: deploy a resource with ARM template (Portal + CLI)

Sample minimal ARM template (skeleton):

{
  "$schema":"https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion":"1.0.0.0",
  "parameters":{
    "vmName":{"type":"string"}
  },
  "resources":[
    {
      "type":"Microsoft.Compute/virtualMachines",
      "apiVersion":"2021-07-01",
      "name":"[parameters('vmName')]",
      "location":"eastus",
      "properties":{
        "hardwareProfile":{"vmSize":"Standard_B1s"},
        "storageProfile":{ /* ... */ },
        "osProfile":{ /* ... */ },
        "networkProfile":{ /* ... */ }
      }
    }
  ]
}

Deploy via CLI

az group create -n RG-ARM -l eastus
az deployment group create -g RG-ARM --template-file ./mainTemplate.json --parameters vmName=myVmFromARM
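
Before deploying for real, the what-if operation can preview which resources the template would create, change, or delete:

```shell
# dry-run the deployment and show the predicted changes
az deployment group what-if -g RG-ARM \
  --template-file ./mainTemplate.json --parameters vmName=myVmFromARM
```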

Migrate classic resources (if you still have classic resources)

  • Use Azure Portal → Search “Migrate classic resources to ARM” or the migration tool in the portal.
  • Prefer re-deploying via ARM/Bicep for predictable infrastructure-as-code rather than migrating ad-hoc.

5. How do you design a highly available architecture in Azure?

Key patterns & components

  • Zones & Regions: distribute across Availability Zones and have a cross-region DR strategy.
  • Stateless frontends: use autoscaling (VMSS) or PaaS (App Service with Scale-out).
  • Load balancing: internal/external Azure Load Balancer (L4), Application Gateway (L7 / WAF), Front Door / Traffic Manager for multi-region traffic management.
  • Resilient data: geo-replication for DBs, zone redundant storage, Cosmos DB multi-region.
  • Distributed network: hub-and-spoke VNets, Azure Firewall, NAT Gateway for stable outbound.
  • Monitoring & automation: alerts, runbooks, automated failover scripts.

Real-world project example

Project: Global e-commerce app (web frontend, API, order DB, caching, search).

  • Front door (Azure Front Door) for global routing and WAF.
  • App Service (or VMSS) in two regions (primary + secondary) with autoscale.
  • Azure SQL primary in primary region + geo-replica in secondary (automatic failover group).
  • Redis Cache (Azure Cache for Redis) with geo-replication or separate caches in each region.
  • Blob Storage with RA-GRS for geo-redundancy, fronted by a CDN for static content.
  • Monitoring with Application Insights and Log Analytics.

Step-by-step: core HA components (simplified)

A. Create Resource Group, VNet, and subnets (CLI)

az group create -n RG-HA -l eastus

az network vnet create -g RG-HA -n vnet-ha --address-prefix 10.0.0.0/16 \
  --subnet-name web-subnet --subnet-prefix 10.0.1.0/24

B. Frontend: App Service + Front Door (Portal/CLI)

# App Service Plan (Linux)
az appservice plan create -g RG-HA -n haPlan --sku P1v2 --is-linux

# Web app in primary region
az webapp create -g RG-HA -p haPlan -n ha-web-primary --runtime "DOTNETCORE:8.0"

# For multi-region, repeat in secondary region resource group/plan

Then create Azure Front Door (Portal recommended for configuration) to route traffic to both endpoints with health probes and priority.

C. Compute: VM Scale Set across zones

az vmss create -g RG-HA -n vmss-ha --image Ubuntu2204 --instance-count 3 \
  --upgrade-policy-mode automatic --vnet-name vnet-ha --subnet web-subnet \
  --zones 1 2 3 --admin-username azureuser --generate-ssh-keys

D. Database: Azure SQL + Failover Group (high level steps)

  1. Create primary Azure SQL server & DB in primary region.
  2. Create secondary server & DB in secondary region.
  3. Configure Auto-failover group between primary and secondary (this provides geo-failover). (Portal or az sql failover-group commands.)
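
The steps above can be sketched with the CLI; server, database, and group names here are assumptions, not values from this page:

```shell
# link an existing primary and secondary server into an auto-failover group
# (assumes sql-primary/sql-secondary and ordersdb were created in steps 1-2)
az sql failover-group create \
  --name fg-orders \
  --resource-group RG-HA \
  --server sql-primary \
  --partner-resource-group RG-HA-DR \
  --partner-server sql-secondary \
  --add-db ordersdb \
  --failover-policy Automatic --grace-period 1
```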

E. Storage: use zone-redundant or geo-replicated options, enable soft delete and lifecycle management.

F. Traffic manager or Front Door for cross-region failover:

  • Front Door for global HTTP/HTTPS with low latency and WAF.
  • Traffic Manager for DNS-based failover.

G. Monitoring & Automation

  • Create Log Analytics workspace; enable diagnostics on App Service, SQL & Storage. Create alerts and automated runbooks.

Design checklist

  • Use zone-aware compute + regional DR.
  • Make app stateless (store session/objects centrally).
  • Use managed services that provide built-in replication.
  • Test failover regularly with planned drills.

6. What is Azure Landing Zone?

Definition

An Azure Landing Zone is an opinionated, repeatable, enterprise-scale baseline environment in Azure that implements foundational constructs (identity, governance, networking, subscriptions, security, monitoring, and resource organization). It’s the starting point for deploying workloads at scale and aligns to governance and compliance.

Real-world project example

Project: Onboarding a new business unit with 20 apps — you deploy a landing zone that includes:

  • Management group hierarchy and subscription structure (hub for shared services, spokes per app),
  • Azure Policy initiative for tagging and allowed locations,
  • Hub VNet with Azure Firewall / VPN / ExpressRoute,
  • Log Analytics workspace and centralized monitoring,
  • CI/CD pipeline for infrastructure as code (Bicep/ARM/Terraform),
  • Identity design (Azure AD, conditional access) and enterprise key vault strategy.

Step-by-step: implement a basic landing zone (high level)

  1. Plan: map subscriptions, naming, tag strategy, networking (hub-and-spoke), identities, and controls.

  2. Create management groups

    • Portal or CLI: create root → landing-zone → subscriptions mapping.

    • CLI example:

      az account management-group create --name LandingZone
      az account management-group create --name Platform --parent LandingZone
      az account management-group create --name LandingZone-Prod --parent LandingZone
  3. Policies & initiatives

    • Define policy definitions (allowed locations, enforce tag).

    • Assign policy initiatives at management group scope.

    • Example CLI to assign builtin policy:

      az policy assignment create --name 'enforceTag' --scope /providers/Microsoft.Management/managementGroups/LandingZone --policy "built-in-policy-id-or-path"
  4. Network hub:

    • Deploy a hub VNet, Azure Firewall, and create spoke VNets for apps. Use Bicep/Terraform templates for repeatability.
  5. Identity & security:

    • Configure Azure AD structure, conditional access, create service principals and managed identities.
    • Deploy Azure Key Vault (centralized secrets).
  6. Logging & monitoring:

    • Create Log Analytics workspace and Storage accounts for logs; set diagnostic settings for platform resources.
  7. Automation & CI/CD:

    • Store IaC in repo (Bicep/Terraform). Implement pipelines to deploy landing zone artifacts.
  8. Blueprints (optional): Use Azure Blueprints or IaC to orchestrate policy, role assignments, and resource deployments in one artifact.
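
Steps 5 and 6 can be sketched with the CLI; resource names are placeholders:

```shell
# shared platform resource group (placeholder names throughout)
az group create -n RG-Platform -l eastus

# centralized Key Vault with purge protection and RBAC authorization
az keyvault create -g RG-Platform -n kv-landingzone-001 -l eastus \
  --enable-purge-protection true --enable-rbac-authorization true

# centralized Log Analytics workspace for diagnostic settings
az monitor log-analytics workspace create -g RG-Platform -n law-central -l eastus
```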

Tip: Microsoft provides well-documented Landing Zone accelerators (Enterprise-Scale landing zone). Use those artifacts as a starting point when implementing a production-grade landing zone.


7. Explain Azure Well-Architected Framework pillars

The core pillars (concise)

  1. Reliability — design for availability, redundancy, recovery (backups, failover, health probes).
  2. Security — protect identities, data, network, and services; apply least privilege and encryption.
  3. Performance Efficiency — choose right resources and scale efficiently; use caching and autoscaling.
  4. Cost Optimization — right-size, reserved instances, cleanup unused resources, choose appropriate SKUs and tiers.
  5. Operational Excellence — automation, monitoring, deployments, incident response, and runbooks.

(These are the canonical five pillars. You may also consider Sustainability as an emerging focus in cloud architecture and organizational goals; Microsoft has guidance for sustainable cloud design.)

Real-world project example

Project: SaaS app where the customer required high reliability and low cost — we:

  • Built multi-AZ frontend with VMSS and autoscale (Reliability + Performance).
  • Used Azure SQL with read replicas and automated backups (Reliability).
  • Implemented RBAC, Key Vault, and network NSGs (Security).
  • Enabled monitoring, alerts, and runbooks in Log Analytics (Operational Excellence).
  • Used reserved instances + autoscale + Azure Advisor recommendations to cut costs by 30% (Cost Optimization).

Implementation steps & practical checks

  • Reliability: configure health probes & LB, set recovery objectives (RTO/RPO), test failover.
  • Security: enable MFA + Conditional Access, use managed identities, Key Vault, limit NSG rules, enable DDoS protection and WAF.
  • Performance: instrument with Application Insights, use autoscale rules, use CDN for static content.
  • Cost: enable Cost Management, budgets, tagging, and use Reserved Instances / Savings Plans where appropriate.
  • Operational: create playbooks and CI/CD pipelines for infrastructure, and define runbooks for incident recovery.
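
For the cost checks, Azure Advisor recommendations can be pulled straight from the CLI (a sketch; results depend on your subscription):

```shell
# list right-sizing / reservation suggestions from Azure Advisor
az advisor recommendation list --category Cost -o table
```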

8. What are Azure Management Groups and why are they important in large organizations?

Explanation

  • Azure Management Groups organize subscriptions into a hierarchy for unified policy, role, and access management. Policies and role assignments applied to a management group flow down to all child resources and subscriptions.
  • They enable enterprise-wide governance, compliance, and consistent configuration across many subscriptions.

Real-world project example

Company with multiple lines of business:

  • Root management group (organization),
  • Child groups: Platform, Security, Prod, Dev,
  • Apply Azure Policy initiative (e.g., "Deny public IP on Storage") at Platform so all subscriptions inherit it.
  • Apply role assignments for platform team at Platform mgmt group, so members have rights across those subscriptions.

Step-by-step (Portal + CLI)

CLI:

# create management groups
az account management-group create --name Contoso

# create child groups
az account management-group create --name Platform --parent Contoso
az account management-group create --name Production --parent Contoso

# move subscription under a management group
az account management-group subscription add --name Contoso --subscription <subscriptionId>

# assign a policy at management group scope
az policy assignment create --name 'enforceTag' --scope /providers/Microsoft.Management/managementGroups/Contoso --policy /path/to/policyDefinition.json

Portal:

  1. Portal → Management Groups → create groups and drag subscriptions into them.
  2. Portal → Policy → Assign policy/initiative at management group scope.

Why important

  • Centralized policies, role assignments, and cost controls.
  • Scales governance for large enterprises with many subscriptions.

9. Difference between Azure Policy and Azure RBAC?

High-level difference

  • Azure Policy — used to enforce resource properties / configurations. It’s about what resources look like (e.g., allowed locations, SKUs, tagging, resource types). Policies evaluate and optionally remediate resources.
  • Azure RBAC (Role-Based Access Control) — used to grant permissions to users, groups, or service principals. RBAC answers who can perform what actions (read/write/delete) on resources.

Real-world example

  • Policy: Enforce that all Storage Accounts have allowBlobPublicAccess = false and that resources must have tags costCenter and environment.
  • RBAC: Grant the DevOps team the “Contributor” role at a resource group so they can create resources; grant read-only to auditors.

Step-by-step: create and enforce a policy, and assign RBAC (CLI)

A. Assign a built-in policy (e.g., restrict locations)

# assign built-in "Allowed locations" policy at subscription scope
# (built-in definitions are referenced by GUID; the GUID below is "Allowed locations")
az policy assignment create \
  --name "AllowedLocationsProd" \
  --scope /subscriptions/<sub-id> \
  --policy "e56962a6-4747-49cd-b67b-bf8b01975c4c" \
  --params '{"listOfAllowedLocations":{"value":["eastus","westeurope"]}}'

B. Create a custom policy (skeleton) then assign:

// customPolicy.json (sample: deny resources missing the costCenter tag)
// note: --rules expects just the policy rule (the if/then block), not the
// full definition wrapper; the display name is passed on the command line
{
  "if": {
    "field": "tags['costCenter']",
    "exists": "false"
  },
  "then": {
    "effect": "deny"
  }
}
az policy definition create --name RequireCostCenter \
  --display-name "Require costCenter tag" \
  --rules customPolicy.json --mode All
az policy assignment create --name EnforceCostCenter --policy RequireCostCenter --scope /subscriptions/<sub-id>

C. RBAC assignment (example: grant Contributor to a group)

az role assignment create --assignee "<objectId-or-user-principal-name>" \
  --role "Contributor" \
  --scope /subscriptions/<sub-id>/resourceGroups/myResourceGroup

Key takeaway

  • Use Policy to prevent or remediate undesired resource state.
  • Use RBAC to control who can do what on resources.

10. How do you secure Azure Storage Account against public access?

Threats to mitigate

  • Public blob/container access, unsecured network access, weak auth (shared keys/SAS misuse), older TLS versions.

Best practice controls (practical checklist)

  1. Disable public access at storage account level (allowBlobPublicAccess = false).
  2. Use Private Endpoint so storage is accessible via private VNet (recommended).
  3. Enable firewall rules — allow only selected IPs or VNets/subnets.
  4. Use Azure AD / RBAC for Blob access — prefer Azure AD over shared keys and SAS.
  5. Use SAS carefully — restrict permissions, time windows, and IP ranges. Prefer stored access policies.
  6. Enforce secure transfer (HTTPS only) and minimum TLS version (TLS 1.2).
  7. Enable encryption & customer-managed keys (CMK) if required.
  8. Enable soft delete / immutable storage / change feed for recovery & retention.
  9. Enable logging & monitoring (Storage analytics, diagnostic settings to Log Analytics).
  10. Use Microsoft Defender for Storage for advanced threat detection.

Real-world project example

Project: A company storing sensitive documents used by internal apps only. They:

  • Disabled public access,
  • Deployed a private endpoint on the hub VNet,
  • Configured RBAC so only a service principal has Blob Data Contributor on container,
  • Enabled soft delete and CMK using Key Vault,
  • Set storage firewall to allow only the hub subnet.

Step-by-step (Portal + CLI)

A. Create storage account with public access disabled (CLI)

az group create -n RG-Storage -l eastus

az storage account create \
  --name mystorageacctxyz \
  --resource-group RG-Storage \
  --sku Standard_LRS \
  --kind StorageV2 \
  --https-only true \
  --allow-blob-public-access false \
  --min-tls-version TLS1_2

Portal steps

  1. Portal → Storage Account → Create → On “Security” tab set “Allow public access to blobs” = Disabled.
  2. On “Networking” choose “Selected networks” (to restrict access) or enable Private Endpoint.

B. Add VNet rule or private endpoint (CLI example — private endpoint)

# create vnet and subnet for private endpoint
az network vnet create -g RG-Storage -n hubVnet --address-prefix 10.1.0.0/16 --subnet-name pe-subnet --subnet-prefix 10.1.0.0/24

# create private endpoint to storage
az network private-endpoint create -g RG-Storage -n storagePE --vnet-name hubVnet --subnet pe-subnet \
  --private-connection-resource-id /subscriptions/<sub>/resourceGroups/RG-Storage/providers/Microsoft.Storage/storageAccounts/mystorageacctxyz \
  --group-ids blob

C. Use Azure AD & RBAC for blob access (recommended vs account keys)

  1. Create a managed identity or service principal for your app.
  2. Assign Storage Blob Data Contributor role to the identity on the storage account or container.
# assign role
az role assignment create --assignee <principalId> --role "Storage Blob Data Contributor" \
  --scope /subscriptions/<sub>/resourceGroups/RG-Storage/providers/Microsoft.Storage/storageAccounts/mystorageacctxyz

D. Disable shared key authorization where possible (az storage account update --allow-shared-key-access false) or rotate keys often; prefer OAuth (Azure AD) plus limited-scope SAS when necessary.

E. Enable diagnostic logs and alerting

  1. Portal → Storage account → Diagnostics → Send logs to Log Analytics or Storage.
  2. Create alerts for anomalous activities.

F. Enable soft delete and immutable storage (Portal or CLI)

# enable blob soft-delete
az storage blob service-properties delete-policy update --account-name mystorageacctxyz --enable true --days-retention 30

G. Restrict SAS usage (if used)

  • Use short-lived SAS tokens, restrict IP ranges, limit permissions.
  • Use stored access policy where possible to revoke SAS access centrally.
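
A short-lived, read-only user delegation SAS (backed by Azure AD rather than the account key, and revocable centrally) can be generated like this sketch; the account and container names are placeholders, and the `date` invocation assumes GNU date:

```shell
# expiry one hour from now (GNU date syntax)
end=$(date -u -d '+1 hour' '+%Y-%m-%dT%H:%MZ')

# read-only user delegation SAS for a single container
az storage container generate-sas \
  --account-name mystorageacctxyz \
  --name docs \
  --permissions r \
  --expiry "$end" \
  --https-only \
  --auth-mode login --as-user
```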

Final checklist / quick cheat-sheet for interviews

  • Regions vs Zones vs Availability Sets: Region = geo, Zone = datacenter-level isolation in a region, Availability Set = logical distribution inside one datacenter.
  • Region selection: proximity, compliance, services & SKU availability, cost, DR pairing.
  • IaaS/PaaS/SaaS: who manages OS/runtime? (I=you, P=platform, S=vendor).
  • ARM vs Classic: ARM is the modern IaC, use ARM/Bicep/Terraform.
  • HA design: stateless frontends, zone + region redundancy, managed DB geo-replication, load balancing, monitoring, DR drills.
  • Landing Zone: the enterprise baseline (MGs, policies, networking, security, monitoring, IaC pipelines).
  • Well-Architected pillars: Reliability, Security, Performance Efficiency, Cost Optimization, Operational Excellence.
  • Management Groups: scale governance across subscriptions.
  • Policy vs RBAC: Policy = resource state enforcement; RBAC = permission assignment.
  • Secure Storage: disable public access, private endpoints, firewall rules, AAD/RBAC, HTTPS/TLS, logging, soft delete, CMK.


11. When would you use Azure App Service vs Azure Kubernetes Service (AKS)?

Explanation (quick):

  • App Service = managed PaaS for web apps/APIs: fast to deploy, platform-managed scaling, built-in CI/CD, lower ops. Use when apps are standard web workloads and you want minimal infra overhead.
  • AKS = managed Kubernetes: full container orchestration, control over networking, multi-container microservices, custom scheduling, sidecars, service meshes. Use when you need container orchestration, complex deployments, or multi-container apps.

Real-world example:

  • Small B2B web portal → App Service for quick onboarding.
  • Microservices-based payment processing with multiple services, custom networking and sidecars (service mesh, canary releases) → AKS.

Step-by-step setup (high-level):

App Service (CLI)

az group create -n RG-App -l eastus
az appservice plan create -g RG-App -n AppPlan --sku P1v2 --is-linux
az webapp create -g RG-App -p AppPlan -n mywebapp123 --runtime "DOTNETCORE:8.0"
# deploy via zip or configure GitHub Actions in portal

AKS (CLI)

az group create -n RG-AKS -l eastus
az aks create -g RG-AKS -n aks-cluster --node-count 3 --enable-addons monitoring --generate-ssh-keys
az aks get-credentials -g RG-AKS -n aks-cluster
# deploy workloads with kubectl apply -f deployment.yaml

Decision checklist: choose App Service for fast time-to-market and standard apps; choose AKS for microservices, portability, and complex infra requirements.
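
The `kubectl apply -f deployment.yaml` step above assumes a manifest roughly like this minimal sketch (the image name and registry are placeholders):

```yaml
# deployment.yaml - minimal 3-replica web Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: myregistry.azurecr.io/web:1.0.0   # placeholder image
          ports:
            - containerPort: 8080
```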


12. What is the role of Azure Blueprints in governance?

Explanation: Azure Blueprints let you define a repeatable set of governance artifacts (ARM templates, Policy assignments, Role assignments, Resource Groups) and apply them consistently to subscriptions or management groups. Blueprints help enforce a standard landing zone quickly. Note that Microsoft has announced the deprecation of Azure Blueprints in favor of Template Specs and Deployment Stacks, so prefer those for new work.

Real-world example: Onboard a new business unit by assigning a blueprint that provisions a hub VNet, Log Analytics workspace, Tagging policy, and RBAC roles automatically.

Step-by-step (conceptual):

  1. Plan artifacts (policy definitions, role assignments, ARM templates).
  2. Create a Blueprint (Portal → Blueprints → Create blueprint) and add artifacts: ARM template (network), policy assignment (allowed locations/tags), role assignment (platform team).
  3. Publish the blueprint version.
  4. Assign the blueprint to a subscription or management group (this creates the resources and applies policies & RBAC).
  5. Track assignments, update blueprint versions and reassign/upgrade when needed.

(You can also automate Blueprint creation via REST/PowerShell/az blueprint extension; many teams implement Blueprints as part of IaC pipelines.)
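
The automation mentioned above can be sketched with the `blueprint` CLI extension (`az extension add --name blueprint`); all names, scopes, and versions below are placeholders:

```shell
# define, publish, then assign a blueprint (placeholder names throughout)
az blueprint create --name lz-blueprint --management-group Contoso \
  --description "Landing zone baseline"

az blueprint publish --blueprint-name lz-blueprint --management-group Contoso --version 1.0

az blueprint assignment create --name lz-assign-sub1 --subscription <sub-id> \
  --location eastus --identity-type SystemAssigned \
  --blueprint-version "/providers/Microsoft.Management/managementGroups/Contoso/providers/Microsoft.Blueprint/blueprints/lz-blueprint/versions/1.0"
```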


13. Explain hub-and-spoke vs mesh topology in Azure networking

Explanation:

  • Hub-and-spoke: central hub VNet (shared services: firewall, DNS, peering/ExpressRoute) with multiple spoke VNets for applications. Traffic to shared services funnels through the hub — simpler to govern and scale.
  • Mesh (full peer-to-peer): every VNet peers directly with every other VNet (many-to-many). Flexible, but the number of peerings grows quadratically, so it becomes operationally complex and harder to secure at scale.

Real-world example: Enterprise uses hub-and-spoke: hub has Azure Firewall + VPN/ExpressRoute + private DNS; each application team gets a spoke VNet peered to hub. Mesh might be used in small fleet where each VNet needs direct connectivity to every other VNet.

Step-by-step (hub-and-spoke setup CLI summary):

# create hub VNet
az network vnet create -g RG -n vnet-hub --address-prefix 10.0.0.0/16 --subnet-name AzureFirewallSubnet --subnet-prefix 10.0.0.0/24

# create spoke VNet
az network vnet create -g RG -n vnet-spoke1 --address-prefix 10.1.0.0/16 --subnet-name app-subnet --subnet-prefix 10.1.0.0/24

# create peering hub->spoke and spoke->hub
az network vnet peering create -g RG --name hub-to-spoke1 --vnet-name vnet-hub --remote-vnet vnet-spoke1 --allow-vnet-access
az network vnet peering create -g RG --name spoke1-to-hub --vnet-name vnet-spoke1 --remote-vnet vnet-hub --allow-vnet-access

Design tips: use hub for shared services, central inspection, centralized egress; use peering + route tables to control traffic flows.
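
To force spoke egress through central inspection in the hub, attach a user-defined route to the spoke subnet; 10.0.0.4 is an assumed Azure Firewall private IP in AzureFirewallSubnet, not a value from this page:

```shell
# route table with a default route pointing at the hub firewall
az network route-table create -g RG -n rt-spoke1
az network route-table route create -g RG --route-table-name rt-spoke1 -n default-via-fw \
  --address-prefix 0.0.0.0/0 --next-hop-type VirtualAppliance --next-hop-ip-address 10.0.0.4

# attach the route table to the spoke's application subnet
az network vnet subnet update -g RG --vnet-name vnet-spoke1 -n app-subnet --route-table rt-spoke1
```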


14. What are Private Endpoints, and how are they different from Service Endpoints?

Explanation:

  • Private Endpoint: Azure PaaS resource is reachable via a private IP in your VNet (Private Link). Traffic stays on the Azure backbone; you can limit access to that private IP and disable public network access.
  • Service Endpoint: extends your VNet identity to the PaaS service, keeping traffic over backbone, but the service still has a public endpoint and access control is done by allowing the VNet/subnet to the service. Service endpoints do NOT create a private IP in your VNet.

Real-world example: Highly secure app: use Private Endpoint to connect to Azure Storage and Key Vault; disable public access. For less strict workloads where you want subnet-level access without private IPs, use Service Endpoints.

Step-by-step (CLI):

Service Endpoint (add to subnet)

az network vnet subnet update -g RG --vnet-name vnet1 --name app-subnet --service-endpoints Microsoft.Storage

Private Endpoint (creates NIC with private IP)

az network private-endpoint create -g RG -n pe-storage --vnet-name vnet1 --subnet app-subnet \
  --private-connection-resource-id /subscriptions/<sub>/resourceGroups/RG/providers/Microsoft.Storage/storageAccounts/myStorage \
  --group-ids blob

Recommendation: prefer Private Endpoints for fine-grained, identity-based access and to fully block public network exposure.
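
A private endpoint alone does not change name resolution: clients inside the VNet usually need a matching private DNS zone so the storage FQDN resolves to the private IP. A sketch (zone-group and link names are placeholders):

```shell
# privatelink zone for blob, linked to the VNet
az network private-dns zone create -g RG -n "privatelink.blob.core.windows.net"
az network private-dns link vnet create -g RG -n link-vnet1 \
  --zone-name "privatelink.blob.core.windows.net" \
  --virtual-network vnet1 --registration-enabled false

# let the private endpoint register its A record in the zone
az network private-endpoint dns-zone-group create -g RG \
  --endpoint-name pe-storage -n default \
  --private-dns-zone "privatelink.blob.core.windows.net" --zone-name blob
```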


15. How do you handle multi-region deployment for a financial application with strict compliance?

Explanation: A finance app must meet compliance (data residency, encryption, audit) and resilience (RTO/RPO). Approach: define allowed regions, use multiple subscriptions/landing zones, separate duties, use strong encryption (CMK) and private connectivity, implement strict logging/retention and routine compliance reporting.

Real-world example: Payment system: active-active in two permitted regions (both in-country or approved jurisdictions), data encrypted with customer-managed keys stored in Key Vault with purge protection; traffic routed via Azure Front Door and origin private endpoints; DB uses geo-replication with legal-approved region pairing.

Step-by-step (high-level):

  1. Policy & planning: use management groups + Azure Policy to allow only approved regions and enforce encryption/tags.
  2. Network: private endpoints, hub in each region, use Azure Front Door or Traffic Manager for traffic routing with geo/priority rules.
  3. Data layer: for relational DB use Geo-replication (Azure SQL failover groups) or active-active pattern depending on DB type. For data residency, ensure replicas only in approved regions. Enable Transparent Data Encryption (TDE) + CMK in Key Vault.
  4. Secrets & keys: store keys in Key Vault with soft-delete & purge-protection and RBAC/PIM for key admins.
  5. Backups & retention: encrypted backups, cross-region retention policies that respect compliance.
  6. Monitoring & Audit: central Log Analytics workspace (or per region with central ingestion), Activity Log retention, alerts for policy drift.
  7. DR testing: regularly test failover, maintain runbooks, and keep audit trails.

CLI snippets (examples)

# create Log Analytics workspace in each region
az monitor log-analytics workspace create -g RG-Primary -n law-prim -l eastus
az monitor log-analytics workspace create -g RG-DR -n law-dr -l centralindia

# create SQL failover group (high level)
az sql server create -g RG-Primary -n sql-prim -l eastus -u sqladmin -p 'P@ssw0rd!'
az sql server create -g RG-DR -n sql-dr -l centralindia -u sqladmin -p 'P@ssw0rd!'
# create DBs and configure failover group via az sql failover-group

Key governance: use Azure Policy to enforce region and cryptography rules and keep separation of duties for admin roles.
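A minimal sketch of such a region restriction as a custom Azure Policy definition (the region list is illustrative, and Azure also ships a built-in "Allowed locations" policy you can assign directly):

```json
{
  "properties": {
    "displayName": "Allowed locations for compliance",
    "mode": "Indexed",
    "policyRule": {
      "if": {
        "not": {
          "field": "location",
          "in": [ "centralindia", "southindia" ]
        }
      },
      "then": { "effect": "deny" }
    }
  }
}
```

Assign this at a management group or subscription scope so any resource created outside the approved regions is denied at deployment time.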


16. Explain how Azure Monitor, Log Analytics, and Application Insights integrate

Explanation:

  • Azure Monitor = umbrella for metrics, logs, alerts, and visualizations.
  • Log Analytics workspace = where collected logs are stored and queried with Kusto (KQL).
  • Application Insights = Application Performance Monitoring (APM) that collects telemetry (requests, exceptions, traces) and can send data into a Log Analytics workspace for unified querying.

Real-world example: Microservices app: App Insights collects request traces and exceptions; infrastructure metrics and activity logs flow into a Log Analytics workspace. Alerts combine metrics and log queries; workbooks present combined dashboards.

Step-by-step (CLI):

az group create -n RG-Monitor -l eastus
az monitor log-analytics workspace create -g RG-Monitor -n law-prod -l eastus

# create App Insights and link to workspace
az monitor app-insights component create -g RG-Monitor -n ai-prod -l eastus \
  --application-type web --workspace /subscriptions/<sub>/resourceGroups/RG-Monitor/providers/Microsoft.OperationalInsights/workspaces/law-prod

# enable diagnostics for a resource to send logs to workspace
az monitor diagnostic-settings create --resource /subscriptions/<sub>/resourceGroups/RG/providers/Microsoft.Sql/servers/sql-prim \
  --name sendToLA --workspace /subscriptions/<sub>/resourceGroups/RG-Monitor/providers/Microsoft.OperationalInsights/workspaces/law-prod --logs '[{"category":"SQLSecurityAuditEvents","enabled":true}]'

Then create alerts (metric or log query based) and build dashboards/workbooks in portal.
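As an example of the unified querying this integration enables, a sketch of a log-query alert in KQL (table and column names assume the workspace-based Application Insights schema):

```kusto
// Failed requests per 5-minute bucket, per service role, over the last hour —
// suitable as the query behind a scheduled log alert rule
AppRequests
| where TimeGenerated > ago(1h) and Success == false
| summarize failedRequests = count() by bin(TimeGenerated, 5m), AppRoleName
| order by TimeGenerated desc
```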


17. What are Availability Zones, and how do you ensure resiliency with them?

Explanation: Availability Zones are physically separate datacenters inside an Azure region. Resiliency is achieved by distributing compute, disks, and networking across zones so a single datacenter failure doesn’t take down your application.

Real-world example: high-SLA online trading app: web tier in a VMSS distributed across zones, zone-redundant load balancer, zone-redundant storage (ZRS), and cross-region DB replicas to cover region-level failure.

Step-by-step (CLI examples):

# VM Scale Set across zones
az vmss create -g RG-AZ -n vmss-az --image UbuntuLTS --instance-count 4 --zones 1 2 3 --vnet-name vnet --subnet web-subnet --admin-username azureuser --generate-ssh-keys

# Standard Load Balancer (zone redundant) in front of VMSS
az network lb create -g RG-AZ -n lb-std --sku Standard --vnet-name vnet --subnet web-subnet

# Use ZRS for storage
az storage account create -g RG-AZ -n stazstore --sku Standard_ZRS --kind StorageV2 --location eastus

Design rules: ensure all critical resources are zone-aware (VMSS, managed disks, load balancer, public IP Standard), test zone failure scenarios, and combine zones with cross-region DR when necessary.


18. How do you implement Zero Trust in Azure?

Explanation: Zero Trust = “never trust, always verify.” Core pillars: verify identity, enforce least privilege, micro-segmentation, device health, and continuous monitoring.

Real-world example: Company enforces Conditional Access (MFA, device compliance), uses PIM for privileged roles, isolates workloads with NSGs & Firewall, uses Private Endpoints, and continuously monitors with Defender & Log Analytics.

Step-by-step (practical controls):

  1. Identity: enable Azure AD MFA + Conditional Access policies (block legacy auth, require compliant devices).
  2. Least privilege: enable Azure AD PIM for admin roles and assign minimal roles via RBAC.
  3. Device posture: integrate Intune for device compliance and require compliant devices in Conditional Access.
  4. Network segmentation: use NSGs, Azure Firewall, and Private Endpoints for microsegmentation.
  5. Encryption & keys: use Key Vault + CMK for data at rest.
  6. Monitoring & response: enable Defender for Cloud, Application Insights, central Log Analytics and alerts.
  7. Enforce policies: use Azure Policy to prevent risky configurations.

CLI snippets (examples)

# create PIM and role assignments via portal (PIM setup typically via portal)
# create Key Vault
az keyvault create -g RG -n kv-zt --enable-purge-protection true --enable-soft-delete true
# create firewall
az network firewall create -g RG -n az-fw -l eastus

Start with identity + device controls and then extend microsegmentation and monitoring.


19. What is the difference between Azure Bastion and Just-In-Time (JIT) VM Access?

Explanation:

  • Azure Bastion: fully managed platform service that provides secure RDP/SSH to VMs via the Azure Portal over TLS — no public IPs on the VMs and no inbound management ports exposed. Good for routine, browser-based secure access.
  • JIT VM Access: a Microsoft Defender for Cloud (formerly Security Center) feature that temporarily opens a port on the VM’s NSG for a limited time and source IP. Good for reducing a VM’s attack surface by keeping management ports closed until needed.

Real-world example:

  • Admins needing seamless browser-based access → Bastion.
  • Emergency access for occasional maintenance windows with audit trails → JIT.

Step-by-step:

Azure Bastion (CLI)

# ensure AzureBastionSubnet (/26 or larger) exists
az network bastion create -g RG -n bastion-host --public-ip-address bastion-pip --vnet-name vnet --subnet AzureBastionSubnet -l eastus

JIT (portal / Defender for Cloud)

  1. Enable Microsoft Defender for Cloud.
  2. Go to Defender → Just-in-time VM access → Configure JIT on selected VMs → set allowed ports, allowed source ranges and maximum request duration.
  3. When admin requests access, JIT opens NSG rules for the window; all activity is logged.

Recommendation: use Bastion for secure routine access; use JIT for short, auditable elevation of access.


20. Explain cost optimization strategies for Azure workloads

Strategies (concise):

  • Right-size VMs and databases; autoscale and use serverless where possible.
  • Use Reserved Instances / Savings Plans and Azure Hybrid Benefit for licenses.
  • Use spot VMs for noncritical workloads.
  • Migrate to PaaS where appropriate (lower ops + cost).
  • Remove idle resources and enforce lifecycle policies.
  • Use Azure Cost Management + budgets + tags.
  • Use Azure Advisor recommendations and automate remediation.

Real-world example: After migrating a dev/test fleet to Azure, buy 1-yr reserved instances for steady production workloads, move some background batch jobs to serverless functions and spot VMs for non-critical batch runs — cut cost by ~35%.

Step-by-step tips (portal/CLI):

  1. Enable Azure Advisor and review top recommendations in portal.
  2. Create budgets & alerts: Portal → Cost Management → Budgets.
  3. Purchase reservation (portal or az reservations) for VMs or SQL compute.
  4. Implement Autoscale rules for VMSS/App Service.
  5. Tag resources and run periodic reports to find unused resources.

CLI (example budget)

az consumption budget create --category cost --amount 1000 --time-grain monthly --budget-name MyBudget \
  --start-date 2025-01-01 --end-date 2026-01-01

21. How do you handle disaster recovery in Azure using Site Recovery?

Explanation: Azure Site Recovery (ASR) orchestrates replication, failover and failback of VMs (on-prem → Azure, Azure → Azure, or VMware/Hyper-V scenarios). It provides continuous replication and orchestrated failover/runbooks.

Real-world example: On-prem Hyper-V VMs replicating to Azure for DR; in a datacenter outage, failover the complete tier to Azure and fail back when on-prem restored.

Step-by-step (portal & high level):

  1. Create a Recovery Services vault: Portal → Recovery Services vault → Create.
  2. Configure the vault for the environment (Azure/On-Prem).
  3. Install/prepare replication agents (for on-prem Hyper-V/VMware) or enable replication for Azure VMs (Azure → Azure).
  4. Configure replication policies (RPO settings, retention).
  5. Enable replication for the VMs and perform an initial replication (test replication available).
  6. Create recovery plans (orchestrated failover steps) and test failover regularly.
  7. Execute planned/unplanned failover when needed; perform failback after recovery.

(Portal provides most guided steps for on-prem to Azure and Azure-to-Azure replication. You can automate some tasks via REST API/PowerShell.)


22. What is Azure Advisor and how do you use it in governance?

Explanation: Azure Advisor analyzes resource telemetry and provides recommendations across cost, security, reliability, performance, and operational excellence. It’s an advisory engine to improve your environment.

Real-world example: Use Advisor to detect underutilized VMs (recommend downsizing), security gaps, and high-cost SKU choices; integrate recommendations into governance pipelines for remediation.

Step-by-step usage:

  1. Open Azure Advisor in Portal → review recommendations by category.
  2. Export recommendations or pin to dashboards.
  3. Automate remediations where safe (e.g., schedule downsizing or use runbooks).
  4. Incorporate Advisor outputs into governance reviews and change processes.

You can pull recommendations programmatically (REST/API) and integrate into ticketing or automation pipelines.


23. When to use Azure Functions vs Logic Apps vs WebJobs?

Explanation & use-cases:

  • Azure Functions = code-first serverless for event-driven compute, custom logic. Good for compute tasks, microservices, triggers (queues, HTTP).
  • Logic Apps = low-code/no-code workflow/orchestration with hundreds of connectors (SaaS, on-prem). Good for business workflows, approvals, ETL with connectors.
  • WebJobs = background tasks tied to App Service (runs alongside an App Service app). Good for legacy apps that already run on App Service and need background processing.

Real-world example:

  • Use Logic Apps to orchestrate a payment workflow with connectors to banking APIs and Teams notifications.
  • Use Functions to process incoming payment events (high throughput, custom code).
  • Use WebJobs for an App Service that needs scheduled background cleanups.

Step-by-step (creating each):

Function (CLI)

az functionapp create -g RG -n func-app --storage-account mystorage --consumption-plan-location eastus --runtime dotnet --functions-version 4

Logic App (Portal recommended)

  • Portal → Logic Apps → Create workflow → design triggers/actions with connectors (e.g., HTTP trigger, SQL action, send email).

WebJob (deploy to App Service)

  • Zip the job and deploy to App Service: Portal → App Service → WebJobs → Add; or include as part of web app deployment (continuous job).

24. Explain Azure Hybrid Benefits

Explanation: Azure Hybrid Benefit (AHB) lets you use existing Microsoft Server licenses (Windows Server, SQL Server) with Software Assurance or qualifying subscriptions to save on Azure compute or SQL costs. You pay reduced rates for base software and only pay Azure for compute.

Real-world example: A company migrates 200 Windows Server VMs to Azure and applies AHB to lower VM and SQL Managed Instance costs.

How to apply (CLI example):

# use license-type when creating a Windows VM to indicate Hybrid Benefit
az vm create -g RG -n winvm --image Win2019Datacenter --license-type Windows_Server --admin-username azureuser --admin-password '<StrongP@ssw0rd>'

For SQL, during deployment choose “use existing license” (Portal or specific CLI flags like --license-type BasePrice depending on resource).

Always verify license eligibility and retain Software Assurance or qualifying licensing.


25. How would you design Azure storage for a global e-commerce platform?

Design principles:

  • Use regionally close storage accounts for low latency with CDN for static assets.
  • Use geo-redundancy appropriate to SLA (ZRS/RA-GRS) and ensure compliance constraints for data locality.
  • Partition data by tenant/region to reduce cross-region transfer.
  • Use caching (Azure CDN + Azure Cache for Redis) for hot assets, lifecycle policies for infrequently accessed blobs.
  • Secure using Private Endpoints, RBAC, encryption with CMK, and enforce logging/soft-delete.

Real-world example: Store product images in regional blob accounts + Azure CDN with origin failover; transactional logs in geo-replicated storage; object metadata in regionally distributed Cosmos DB or regional SQL with read replicas; central backups encrypted & archived.

Step-by-step (CLI/summary):

# create regional storage accounts (names allow only lowercase letters and numbers, no hyphens)
az storage account create -g RG-EU -n steustore01 --location northeurope --sku Standard_LRS --kind StorageV2 --min-tls-version TLS1_2 --allow-blob-public-access false

# enable CDN endpoint for static assets
az cdn profile create -g RG-EU -n cdnProfile --sku Standard_Microsoft
az cdn endpoint create -g RG-EU --profile-name cdnProfile -n cdn-eu --origin steustore01.blob.core.windows.net

# create Redis cache
az redis create -g RG-EU -n redis-eu -l northeurope --sku Basic --vm-size c0

Add lifecycle rules, soft-delete, immutable blob storage where needed, and monitor costs and performance.
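A sketch of a lifecycle management policy for the tiering mentioned above (container prefix and day thresholds are illustrative; apply it with az storage account management-policy create):

```json
{
  "rules": [
    {
      "name": "archive-old-assets",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "product-images/" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool":    { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 180 },
            "delete":        { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
```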


26. Explain the Shared Responsibility Model in Azure

Explanation: Azure secures the cloud infrastructure (physical hosts, network, datacenter facilities). The customer is responsible for what they put in the cloud: applications, data, identity, OS (for IaaS), network controls, backups, and configuration. The split varies by service model (IaaS > more customer responsibility; PaaS/SaaS > Azure manages more).

Mapping (example):

  • IaaS: Azure → physical infrastructure; Customer → OS patching, app security, firewall, backups.
  • PaaS: Azure → OS & platform; Customer → app code, data and access controls.
  • SaaS: Azure → full app/platform; Customer → data, identity and user configuration.

Use this model to define controls, responsibilities, and audit evidence.


27. What is ExpressRoute and when would you use it over VPN Gateway?

Explanation: ExpressRoute provides a private, dedicated network connection (via a connectivity provider) between on-premises and Azure, bypassing the public internet. It offers higher bandwidth, lower latency, SLA-backed connectivity and optional connectivity to Microsoft 365/online services. VPN Gateway is encrypted IPsec over the public internet — easier and cheaper but lower performance and without SLAs for the public internet path.

When to use ExpressRoute: large data transfers, low/consistent latency needs (financial trading), higher throughput, regulatory requirements for private connectivity.

Step-by-step (high-level):

  1. Order ExpressRoute circuit from a connectivity provider or use ExpressRoute Direct.
  2. Create an ExpressRoute circuit in Azure Portal and share circuit ID with provider.
  3. Configure ExpressRoute Gateway (virtual network gateway of type ExpressRoute) and link to the hub VNet.
  4. Configure routing (private peering, Microsoft peering as needed) and attach VNets via ExpressRoute gateway or use Global Reach.

VPN Gateway (quick CLI)

az network vnet-gateway create -g RG -n vpnGateway --vnet vnet-hub --public-ip-address gw-pip --gateway-type Vpn --vpn-type RouteBased --sku VpnGw1

Choose VPN for lower cost or simple site-to-site; pick ExpressRoute for enterprise-level private connectivity and consistent performance.


28. Explain design considerations for Azure Firewall vs NSG vs Application Gateway

Roles & differences:

  • NSG (Network Security Group) — stateful L3/L4 packet filtering applied at subnet or NIC scope. Good for simple allow/deny IP/port rules. Lightweight and incurs no extra cost.
  • Azure Firewall — managed, stateful network firewall (L3–L7), supports FQDN filtering, threat intelligence, NAT, and central rule management. Use for centralized east–west and north–south control.
  • Application Gateway (with WAF) — L7 load balancer and WAF for HTTP/HTTPS traffic, cookie-based affinity, URL-based routing, end-to-end TLS, and protection from OWASP threats.

Real-world design:

  • Hub contains Azure Firewall for centralized outbound/ingress filtering and FQDN rules.
  • NSGs in spokes for micro-level segmentation and least privilege at subnet level.
  • App Gateway with WAF sits in front of web apps (App Service / AKS ingress) to provide L7 routing and security.

Step-by-step (CLI examples): NSG

az network nsg create -g RG -n nsg-app
az network nsg rule create -g RG --nsg-name nsg-app -n AllowAppInbound --priority 100 --protocol Tcp --destination-port-ranges 443 --access Allow --direction Inbound

Azure Firewall

az network firewall create -g RG -n azfw -l eastus
# configure IP configuration and public IPs, then application rules and network rules

Application Gateway

az network application-gateway create -g RG -n appgw --sku WAF_v2 --capacity 2 --vnet-name vnet --subnet appgw-subnet --public-ip-address appgw-pip

Combine: NSGs for micro-perimeter control; Azure Firewall for centralized policy and inspection; App Gateway for L7 traffic and WAF.


29. What is a Bicep file and how is it different from ARM?

Explanation: Bicep is a domain-specific language (DSL) for authoring Azure infrastructure as code that compiles to ARM JSON templates. Bicep is more concise, easier to read, supports modules, type-safety, and better tooling. ARM templates are the JSON declarative format (verbose) that Bicep generates under the hood.

Real-world example: Use Bicep for modular landing zone templates (reusable modules for VNet, NSG, Key Vault). Compile to ARM when deploying via CI/CD if required.

Example Bicep vs ARM (tiny):

Bicep (main.bicep)

param storageName string
resource st 'Microsoft.Storage/storageAccounts@2022-09-01' = {
  name: storageName
  location: 'eastus'
  sku: { name: 'Standard_LRS' }
  kind: 'StorageV2'
}

Deploy Bicep

az deployment group create -g RG --template-file ./main.bicep --parameters storageName=stdemo123

Use Bicep for readable, maintainable IaC; it reduces ARM complexity and encourages modular patterns.
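The modular pattern mentioned above can be sketched as a parent template consuming a module (the module path ./modules/storage.bicep and its params/outputs are hypothetical):

```bicep
// main.bicep — composes reusable modules (paths and parameter names are illustrative)
param env string = 'prod'

module stg './modules/storage.bicep' = {
  name: 'storageDeploy'
  params: {
    storageName: 'st${env}demo123'
  }
}

// module outputs can feed other modules or deployments,
// assuming storage.bicep declares a storageId output
output storageId string = stg.outputs.storageId
```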


30. Explain Azure Key Vault soft-delete and purge-protection

Explanation:

  • Soft-delete: when a key/secret/certificate is deleted it is retained in a recoverable state for a retention period, allowing recovery. This prevents accidental or malicious immediate irrevocable deletion.
  • Purge-protection: when enabled, prevents permanent purge (hard delete) of the vault and its deleted objects until the retention period expires — even by privileged users. Once enabled, purge protection cannot be disabled, providing extra protection for compliance.

Real-world example: Critical production keys are protected with CMK in Key Vault with soft-delete enabled and purge-protection on so even if an attacker deletes a key, it cannot be permanently purged before investigation.

Step-by-step (CLI):

# create Key Vault with soft-delete and purge-protection enabled
az keyvault create -g RG -n kv-prod --enable-purge-protection true --enable-soft-delete true --location eastus

# delete a secret (goes to soft-delete)
az keyvault secret delete --vault-name kv-prod --name my-secret

# recover the deleted secret (during retention)
az keyvault secret recover --vault-name kv-prod --name my-secret

# attempt to purge the deleted secret (will fail while purge-protection is enabled)
az keyvault secret purge --vault-name kv-prod --name my-secret

Operational notes: enable soft-delete & purge-protection for production vaults. Combine with RBAC, access policies, logging, and Key Vault firewalls or Private Endpoints for maximum security.



41. How do you design a multi-tenant SaaS platform on Azure?

Explanation (core choices) Multi-tenant SaaS design choices center on tenant isolation (cost vs isolation), identity, data partitioning, and onboarding / telemetry / metering:

  • Tenant isolation models: fully shared (single app and single DB with a tenantId filter — maximum efficiency), shared app with a DB per shard (balanced), or isolated per-tenant subscription/DB (highest isolation, best for strict compliance).
  • Identity: Azure AD multi-tenant app (external customers sign in with their AAD), or Azure AD B2C for consumer identities.
  • Data: Choose tenantId column, sharding, or per-tenant DB based on scale & compliance.
  • Extensibility: feature flags per tenant, metering & billing, onboarding automation.

Real-world project example SaaS HR platform serving SMBs: uses single App Service cluster (stateless) + Azure SQL elastic pool (DB-per-tenant for moderate isolation) + Azure AD B2C for user sign-up, API gateway per tenant for rate limiting, and tenant provisioning via automation.

Step-by-step (high-level)

  1. Plan tenancy model — pick shared DB, sharded DB, or per-tenant DB by risk/compliance.

  2. Identity — register a multi-tenant app (or use B2C).

    • Portal: Azure AD → App registrations → New registration → Supported account types → Accounts in any organizational directory (Any Azure AD directory - Multitenant).
  3. Compute — deploy stateless service:

    • Example CLI (App Service):

      az group create -n rg-saas -l eastus
      az appservice plan create -g rg-saas -n saas-plan --sku P1v2
      az webapp create -g rg-saas -p saas-plan -n saas-app-001 --runtime "DOTNET|6.0"
    • For microservices: AKS with ingress and namespace per tenant (if needed).

  4. Data layer — choose database topology:

    • DB-per-tenant with Elastic Pool:

      az sql server create -g rg-saas -n saas-sql -l eastus -u sqladmin -p 'P@ssw0rd!'
      az sql elastic-pool create -g rg-saas -s saas-sql -n saas-pool --capacity 50
      # create databases in pool per tenant
    • Or shared DB: implement tenantId in tables + row-level security (RLS).

  5. Secrets & keys — store in Key Vault, use managed identity for app access.

  6. Networking & security — use Private Endpoints for DB/Key Vault; WAF on front door or App Gateway.

  7. Onboarding & metering — implement automated provisioning pipelines: Resource Manager templates or Bicep to create DB & assign quotas; use Application Insights / Azure Monitor for telemetry and Azure Cost Management for billing.
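For the shared-DB option in step 4, row-level security can be sketched in T-SQL (the Security schema, dbo.Orders table, and session key name are illustrative):

```sql
-- Predicate function: a row is visible only when its TenantId matches the
-- tenant the application placed in SESSION_CONTEXT at connection time
CREATE FUNCTION Security.fn_tenantAccessPredicate(@TenantId int)
    RETURNS TABLE
    WITH SCHEMABINDING
AS
    RETURN SELECT 1 AS fn_result
    WHERE @TenantId = CAST(SESSION_CONTEXT(N'TenantId') AS int);
GO

-- Bind the predicate to tenant-scoped tables
CREATE SECURITY POLICY Security.TenantFilter
    ADD FILTER PREDICATE Security.fn_tenantAccessPredicate(TenantId) ON dbo.Orders
    WITH (STATE = ON);
```

The application sets `EXEC sp_set_session_context N'TenantId', @tenantId;` after opening each connection, so every query is transparently filtered to the current tenant.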


42. Explain reference architecture for a secure AKS deployment

Explanation (core components) Secure AKS should combine: private cluster / private endpoint, Azure AD integration and RBAC, Azure Policy / OPA Gatekeeper for governance, network policy (Azure CNI + Calico), Key Vault + Secrets Store CSI driver, Azure Defender for Containers, monitoring (Container Insights), and hardened node image + node pool isolation.

Real-world project example Financial microservices cluster: private AKS cluster with Azure AD pod identity, Calico network policies isolating namespaces, secrets fetched via Key Vault CSI driver, policy gates to block privileged containers, and Defender for Container scanning images.

Step-by-step (core CLI sample)

Create a secure AKS cluster:

# resource group
az group create -n rg-aks-sec -l eastus

# create AKS with private cluster, Azure AD, managed identity, azure CNI, monitoring and azure-policy addon
az aks create -g rg-aks-sec -n aks-sec \
  --node-count 3 --node-vm-size Standard_DS3_v2 \
  --network-plugin azure \
  --enable-aad \
  --enable-azure-rbac \
  --enable-managed-identity \
  --enable-private-cluster \
  --enable-addons monitoring,azure-policy \
  --generate-ssh-keys

Hardening & extensions:

  • Enable Azure Policy for AKS (addon already enabled above) — enforces pod security/privileged restrictions.

  • Install Calico/network policy if advanced network policies needed.

  • Install Secrets Store CSI driver + Key Vault provider, give cluster MSI access to Key Vault:

    az keyvault set-policy -n kv-sec --spn <aks_managed_identity_clientId> --secret-permissions get
  • Enable Defender for Cloud (Defender for containers) — portal or Security Center.

  • Use private container registry (ACR) and attach ACR to AKS:

    az acr create -g rg-aks-sec -n myacr --sku Standard
    az aks update -n aks-sec -g rg-aks-sec --attach-acr myacr

Operational best practices: node pool separation (system vs user), automated image scanning in CI, RBAC for cluster and Azure resources, and regular vulnerability scans.
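The namespace isolation mentioned above can be sketched as a Kubernetes NetworkPolicy (namespace name and ingress-controller labels are illustrative):

```yaml
# Deny all ingress to pods in a tenant namespace except traffic
# originating from the ingress controller's namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller-only
  namespace: payments            # hypothetical workload namespace
spec:
  podSelector: {}                # applies to every pod in the namespace
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # hypothetical ingress namespace
```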


43. How would you implement encryption at rest and in transit for SQL Database?

Explanation

  • At rest: Azure SQL uses TDE (Transparent Data Encryption) by default (service-managed keys). For customer control, use Customer-Managed Keys (CMK) stored in Azure Key Vault (bring your own key). Also consider Always Encrypted for column-level encryption where application-side encryption is required (client holds keys).
  • In transit: enforce TLS 1.2+ for DB connections, restrict connectivity via VNet/private endpoint, and use server firewall rules so only approved networks can connect.

Real-world project example Banking app: Azure SQL with TDE + CMK in Key Vault (Key Vault accessible only via Private Endpoint), enforce TLS 1.2, and use private endpoint for DB so traffic never traverses internet. Use Always Encrypted for PII columns.

Step-by-step (high level + CLI snippets)

  1. Create Key Vault & key (CMK)

az keyvault create -g rg-sql -n kv-sql -l eastus --enable-purge-protection true --enable-soft-delete true
az keyvault key create --vault-name kv-sql -n sql-cmk --kty RSA

  2. Create Azure SQL server & DB

az sql server create -g rg-sql -n sqlsrv01 -l eastus -u sqladmin -p 'P@ssword!'
az sql db create -g rg-sql -s sqlsrv01 -n sqldb01 --service-objective S0

  3. Grant SQL Server access to Key Vault key

    • Create a user-assigned managed identity (or use the server-level identity) and grant get, wrapKey, and unwrapKey permissions on the Key Vault key (Portal or az keyvault set-policy).

  4. Configure TDE with CMK (portal steps or REST/PowerShell)

    • In portal: SQL Server → Transparent data encryption → Choose customer-managed key → select the Key Vault key.
    • CLI / PowerShell steps vary by API version; typically create a server key and set it as the encryption protector.

  5. Enable Private Endpoint for DB

az network private-endpoint create -g rg-sql -n pe-sql --vnet-name vnet-hub --subnet private-subnet \
  --private-connection-resource-id /subscriptions/<sub>/resourceGroups/rg-sql/providers/Microsoft.Sql/servers/sqlsrv01 --group-ids sqlServer

  6. Enforce a minimum TLS version

az sql server update -g rg-sql -n sqlsrv01 --minimal-tls-version 1.2

  7. Always Encrypted: configure a column master key stored in Key Vault plus column encryption keys, and enable client driver support in the application to encrypt/decrypt sensitive columns.

Notes: TDE protects data files/backup; Always Encrypted protects sensitive data from DB administrators. Use private endpoints + firewall + managed identities + auditing for compliance.


44. What’s the difference between Standard and Premium SSDs in Azure?

Explanation (summary)

  • Standard SSD: cost-effective SSD storage with moderate IOPS/throughput — good for dev/test, web servers, general purpose workloads with moderate performance needs.
  • Premium SSD: high-performance enterprise SSD with low latency, higher IOPS and throughput — ideal for production databases, high-IOPS VMs, transactional systems. Differences: IOPS, throughput, latency, durability SLA, bursting behavior (some sizes allow bursting).

Real-world project example Use Premium SSDs for production SQL Server or high-transaction AKS PVs; use Standard SSD for frontend app servers and non-critical workloads.

Step-by-step (create disk CLI):

# Standard SSD disk
az disk create -g rg-disks -n disk-std --sku StandardSSD_LRS --size-gb 128

# Premium SSD disk
az disk create -g rg-disks -n disk-prem --sku Premium_LRS --size-gb 128

Attach to VM or use as Managed Disk for VMSS/VM. Choose SKU based on required IOPS and throughput (consult Azure docs for SKU limits).


45. How do you migrate on-prem VMware workloads to Azure?

Explanation (process)

  • Use Azure Migrate for discovery, assessment, and migration planning. For lift-and-shift, Azure Migrate Server Migration or Azure Site Recovery (ASR) replicates VMs to Azure. Steps: assess dependencies, sizing/cost estimates, choose Azure target (VMs, VMSS, specialized services), replicate/test, cutover.

Real-world project example Migrate 200 VMware VMs: run Azure Migrate appliance for discovery and dependency mapping, assess VM readiness and cost, prioritize tiers, replicate with Azure Migrate/ASR, run test failover, then cut over in scheduled windows.

Step-by-step (high level)

  1. Assess

    • Portal: Azure Migrate → create a project → add the Server Assessment tool → download and run the Azure Migrate appliance on-prem to discover VMs.
  2. Plan

    • Review sizing, bandwidth, storage type choices, and target regions.
  3. Replicate

    • Use Azure Migrate: Server Migration or Site Recovery:

      • Azure Migrate → Server Migration → add source (VMware) → replicate selected VMs to Azure.
      • OR configure ASR for continuous replication.
  4. Test

    • Run test failover into isolated network in Azure to validate functionality without cutting over.
  5. Cutover

    • Perform planned failover/cutover and finalize (DNS updates, reconfigure IPs).
  6. Optimize

    • After migration, right-size VM SKUs, consider reserved instances, and adopt PaaS where suitable.

Notes: For production databases consider re-platforming to managed services (Azure SQL) for operational benefits.


46. Explain how to design an Azure VDI solution

Explanation Azure VDI (Virtual Desktop Infrastructure) is implemented via Azure Virtual Desktop (AVD). Key design elements: host pools (pooled vs personal), FSLogix for profile containers, image management (golden images), scale plans, storage for profile containers (Azure Files or Azure NetApp Files), networking, identity integration, and security controls.

Real-world project example A global company provides secure desktops for remote employees: AVD host pools in each region, FSLogix profile containers stored in Azure Files premium with identity-based access, conditional access policies to require compliant devices.

Step-by-step (core tasks)

  1. Create resource group & networking

  2. Create storage for FSLogix profiles (Azure Files premium or ANF)

    az storage account create -g rg-avd -n stavdprofiles --sku Premium_LRS --kind FileStorage
  3. Create host pool

    • Portal: Azure Virtual Desktop → Host pools → Add → choose pooled/personal, set VM size, image, domain join config.
    • You can create host pool VMs via ARM/Bicep/Scale Set for scale capabilities.
  4. Configure FSLogix

    • Create the file share and configure FSLogix (via Group Policy or registry settings on the session hosts) to point profile containers at the share.
  5. Create Application Groups & Workspaces — assign users to desktop app groups.

  6. Scale & image management

    • Use Azure Compute Gallery (formerly Shared Image Gallery) for golden images.
    • Use autoscale solutions (Azure Automation runbooks or Scale Plans) to reduce costs.
  7. Security

    • Domain join hosts (hybrid Azure AD join or Azure AD join), conditional access, MFA, NSGs, and endpoint protections.

Notes: Use Azure Files Premium or Azure NetApp Files if FSLogix IOPS requirements are high. Leverage Managed Identities and private endpoints for storage access.
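The FSLogix storage step above can be sketched with the CLI. A minimal sketch: account and share names are illustrative, and premium file shares require the FileStorage account kind.

```shell
# Premium file share for FSLogix profile containers (names are illustrative)
az storage account create -g rg-avd -n stavdprofiles \
  --kind FileStorage --sku Premium_LRS

# SMB share sized (GiB quota) for the expected profile load
az storage share-rm create --storage-account stavdprofiles -g rg-avd \
  --name fslogix-profiles --quota 1024 --enabled-protocols SMB
```

Session hosts would then point FSLogix's VHDLocations setting at `\\stavdprofiles.file.core.windows.net\fslogix-profiles`.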


47. What is Azure Arc and where would you use it?

Explanation Azure Arc extends Azure management & governance to resources outside Azure: servers (Windows/Linux), Kubernetes clusters, and data services running on-premises or in other clouds. Arc lets you apply policies, inventory, tagging, GitOps deployment, and centralized management.

Real-world project example An enterprise runs apps across Azure, AWS, and on-prem VMware. Use Azure Arc to onboard on-prem servers into Azure for centralized policy, monitoring, and deploy consistent configuration via GitOps to Arc-enabled Kubernetes clusters.

Step-by-step (onboard a machine)

  1. Install the Connected Machine agent on the target server (Linux/Windows):

    • In portal: Azure Arc → Servers → Add → download the generated script and run it on the machine (the script installs the agent and registers the machine with the resource group).
  2. For Kubernetes: install the Azure Arc agents (az connectedk8s connect).

    # Run on the on-prem server itself (azcmagent agent command, not az):
    azcmagent connect --resource-group rg-arc --location eastus --subscription-id <sub> --tenant-id <tenant>
    # For k8s (run where kubectl has access to the cluster):
    az connectedk8s connect -g rg-arc -n my-cluster
  3. Manage: apply Azure Policy at Arc resource group scope, enable monitoring, deploy configuration via GitOps using k8s-configuration feature.

Use cases: centralized policies for hybrid, deploy consistent config across clusters, manage SQL Server instances via Arc Data Services, and enable inventory & compliance reporting.
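Step 3 above (policy at the Arc resource-group scope) might look like the following sketch; the assignment name, scope, and policy definition ID are placeholders to be looked up.

```shell
# Assign a built-in policy to the resource group holding Arc-enabled servers
# (scope and definition ID are illustrative placeholders)
az policy assignment create \
  --name audit-arc-servers \
  --scope /subscriptions/<sub>/resourceGroups/rg-arc \
  --policy <built-in-policy-definition-id>
```

Because Arc-enabled machines are first-class ARM resources, the same assignment mechanics apply to them as to native Azure VMs.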


48. How do you integrate on-prem Active Directory with Azure AD?

Explanation Integrate via Azure AD Connect which synchronizes identities (users/groups) and optionally passwords (Password Hash Sync), pass-through authentication, or federation (AD FS). You can enable Seamless Single Sign-On and configure writeback (password/mail). This creates hybrid identity enabling SSO to Azure services.

Real-world project example Enterprise enables cloud SSO for Office 365: install Azure AD Connect with Password Hash Sync and Seamless SSO to provide users with the same credentials on-prem and in the cloud.

Step-by-step (summary)

  1. Prerequisites: verify AD DS health, network connectivity, and an Azure AD global admin account.

  2. Install Azure AD Connect (on a server with connectivity to AD):

    • Download installer from Microsoft.
    • Choose sign-in method: Password Hash Sync (PHS) (simple), Pass-through Auth (PTA), or Federation (AD FS).
  3. Configure options: OU filtering, attribute filtering, password writeback, device writeback if needed.

  4. Enable Seamless SSO during configuration (adds computer account to on-prem AD).

  5. Verify synchronization in portal: Azure AD → Users → check synchronized flag.

Notes: Use PHS for simplicity and resiliency; choose AD FS only if required for advanced scenarios (e.g., complex claims rules).


49. Explain conditional access policies in Azure AD

Explanation Conditional Access (CA) enforces access controls based on conditions: user/group, location, device state, client app, risk level, and more. Policies specify conditions and controls (require MFA, block access, require compliant device, require terms of use). Use CA to implement Zero Trust controls.

Real-world project example Require MFA for all admins and block legacy protocols; allow access to corporate apps only from devices marked compliant by Intune and when users are using modern authentication clients.

Step-by-step (Portal)

  1. Azure AD → Security → Conditional Access → New policy.
  2. Assignments: select users/groups (e.g., All users except breakglass), cloud apps (Office 365, custom apps).
  3. Conditions: sign-in risk, device platform, locations (include/exclude trusted IPs).
  4. Access controls: grant — require MFA + require device to be marked compliant; or block access.
  5. Enable & Test: first configure policy in Report-Only mode or exclude break-glass accounts. Use Sign-in logs to validate effect.

Best practices: Always exclude emergency access / break-glass accounts from CA; test in report mode before full enablement.
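There is no dedicated az CLI surface for Conditional Access, but policies can be reviewed programmatically via Microsoft Graph. A sketch, assuming the signed-in identity has Policy.Read.All:

```shell
# List Conditional Access policies and their states through Microsoft Graph
az rest --method GET \
  --url "https://graph.microsoft.com/v1.0/identity/conditionalAccess/policies" \
  --query "value[].{name:displayName,state:state}" -o table
```

This is useful for auditing which policies are in `enabledForReportingButNotEnforced` (Report-Only) mode before full enablement.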


50. How would you implement API throttling in Azure API Management (APIM)?

Explanation APIM provides policies to control rate, quota and call throttling at different scopes (product, API, operation). Use <rate-limit>, <quota> and <rate-limit-by-key> to enforce calls per period and per key (subscription or header). Combine with caching and backend circuit breaker policies.

Real-world project example Public API product: enforce 1000 calls/day per subscription, with per-minute burst limit to 60 calls/minute. Use policies to protect backend and to differentiate paid/free tiers.

Step-by-step (Portal & policy snippet)

  1. Create APIM instance (Portal or CLI):

    az apim create -g rg-apim -n apim-prod --publisher-email ops@contoso.com --publisher-name Contoso
  2. Create Product and associate APIs & subscriptions.

  3. Apply policy (inbound policy at product or API level). Example policy snippet:

<inbound>
  <rate-limit-by-key calls="60" renewal-period="60" counter-key="@(context.Subscription?.Id ?? context.Request.IpAddress)" />
  <quota-by-key calls="1000" renewal-period="86400" counter-key="@(context.Subscription?.Id ?? context.Request.IpAddress)" />
  <!-- other policies (validate JWT, transform) -->
</inbound>
  4. Test and monitor throttling via APIM analytics.

Notes: Use <rate-limit-by-key> to implement per-subscriber throttles; use caching and retry strategies in client SDKs.
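On the client side, throttled calls surface as HTTP 429 with a Retry-After header. A minimal curl-based retry sketch; the `$URL` and `$KEY` variables are assumptions standing in for a real APIM endpoint and subscription key:

```shell
# Retry on 429, honouring Retry-After (URL and KEY are placeholders)
for attempt in 1 2 3; do
  hdrs=$(mktemp)
  code=$(curl -s -D "$hdrs" -o /dev/null -w '%{http_code}' \
    -H "Ocp-Apim-Subscription-Key: $KEY" "$URL")
  [ "$code" != "429" ] && break
  wait=$(awk 'tolower($1)=="retry-after:"{print $2}' "$hdrs" | tr -d '\r')
  sleep "${wait:-5}"   # fall back to 5s if the header is absent
done
```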


51. Explain architecture of Azure Data Lake for analytics

Explanation (core components) A modern analytics architecture usually includes: Ingest (Event Hubs, IoT Hub, Data Factory), landing ADLS Gen2 (hierarchical namespace), processing (Azure Databricks / Synapse Spark), serving layers (Parquet curated datasets, Synapse SQL pools), catalog & governance (Microsoft Purview), and visualization (Power BI). Security: RBAC, ACLs on ADLS, private endpoints, Key Vault CMKs.

Real-world project example Retail analytics: ingest POS events to Event Hubs, process with Databricks streaming into curated Parquet on ADLS Gen2, catalog with Purview, use Synapse to run near-real-time SQL queries, and Power BI for dashboards.

Step-by-step (core CLI snippets)

  1. Create ADLS Gen2 storage
az storage account create -g rg-data -n dlsgen2 --sku Standard_LRS --kind StorageV2 --hns true
  2. Create Data Factory & Integration Runtime — for orchestrations and copy jobs.
  3. Create Databricks workspace (requires the databricks CLI extension)
az databricks workspace create -g rg-data -n dbricks01 -l eastus --sku standard
  4. Provision Synapse workspace for SQL analytics.
  5. Catalog & governance — provision Purview and register ADLS.
  6. Security — enable firewall, private endpoints, and set ACLs on the filesystem (POSIX style); use managed identities for jobs.

Notes: Design separate layers (raw/bronze, curated/silver/gold), partition data, compress and use columnar formats, and implement lifecycle rules.
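The lifecycle rules mentioned in the notes can be applied as a storage management policy. A sketch: the prefix and tiering thresholds are illustrative.

```shell
# Tier raw-zone blobs to cool after 30 days, delete after 365 (illustrative)
cat > lifecycle.json <<'EOF'
{
  "rules": [{
    "enabled": true,
    "name": "raw-zone-tiering",
    "type": "Lifecycle",
    "definition": {
      "filters": { "blobTypes": ["blockBlob"], "prefixMatch": ["raw/"] },
      "actions": {
        "baseBlob": {
          "tierToCool": { "daysAfterModificationGreaterThan": 30 },
          "delete": { "daysAfterModificationGreaterThan": 365 }
        }
      }
    }
  }]
}
EOF
az storage account management-policy create \
  -g rg-data --account-name dlsgen2 --policy @lifecycle.json
```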


52. How do you design a secure multi-tier app in Azure?

Explanation Design a DMZ + web tier + app tier + data tier with least privilege, network segmentation (hub-and-spoke), central firewall, NSGs in spokes, WAF for web facing ingress, private endpoints for backend PaaS, Key Vault & managed identities, logging and monitoring.

Real-world project example Customer portal: Azure Front Door + WAF → App Gateway → Web App in spoke → internal API in spoke with NSG & Application Gateway internal → SQL DB with private endpoint → Key Vault for secrets.

Step-by-step (core elements)

  1. Create hub VNet with Azure Firewall and hub peering.
  2. Create spoke VNets for web, application, and data (or use PaaS components with private endpoints).
  3. Deploy WAF (Front Door or App Gateway) in front of web tier.
  4. Use NSGs on subnets to allow only required ports between tiers.
  5. Use Private Endpoints for SQL/Storage/KeyVault and assign RBAC permissions (no public access).
  6. Managed Identities for apps to access Key Vault.
  7. Monitoring/Auditing — Log Analytics, App Insights, Azure Sentinel/Defender.

Example CLI (private endpoint for SQL):

az network private-endpoint create -g rg-sec -n pe-sql --vnet-name hub-vnet --subnet private-subnet \
  --private-connection-resource-id /subscriptions/<sub>/resourceGroups/rg-db/providers/Microsoft.Sql/servers/sqlsrv01 \
  --group-ids sqlServer --connection-name pe-sql-conn
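A private endpoint also needs DNS so clients resolve the SQL server's FQDN to the private IP. A sketch of the matching private DNS zone wiring; resource names are illustrative:

```shell
# Private DNS zone for Azure SQL, linked to the VNet
az network private-dns zone create -g rg-sec -n privatelink.database.windows.net
az network private-dns link vnet create -g rg-sec \
  --zone-name privatelink.database.windows.net \
  --name link-hub --virtual-network hub-vnet --registration-enabled false

# Attach the zone to the private endpoint so A records are managed automatically
az network private-endpoint dns-zone-group create -g rg-sec \
  --endpoint-name pe-sql --name default \
  --private-dns-zone privatelink.database.windows.net --zone-name sql
```

Without this step, clients on the VNet keep resolving the public endpoint and the private link is silently bypassed.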

53. What’s the difference between NSG vs ASG?

Explanation

  • NSG (Network Security Group): contains rules that allow/deny traffic by source IP, dest IP, port, protocol and is applied to subnets or NICs.
  • ASG (Application Security Group): a logical grouping of NICs/VMs to allow use of application-centric names in NSG rules (instead of IP addresses). ASGs simplify rule management as workloads scale.

Real-world project example In a multi-tier app use ASGs web-tier, app-tier, db-tier and create NSG rules referencing those ASGs (e.g., allow web-tier → app-tier on 443).

Step-by-step (CLI)

# create ASG
az network asg create -g rg-net -n asg-web

# create NSG and use ASG in rule (reference by id)
az network nsg create -g rg-net -n nsg-app
az network nsg rule create -g rg-net --nsg-name nsg-app -n AllowWebToApp --priority 100 \
  --direction Inbound --access Allow --source-asgs /subscriptions/<sub>/resourceGroups/rg-net/providers/Microsoft.Network/applicationSecurityGroups/asg-web \
  --destination-port-ranges 443

Best practice: use ASGs to group workloads in dynamic scaling scenarios; NSGs hold the rules that actually enforce policy.


54. How do you implement multi-region SQL failover?

Explanation

  • For Azure SQL Database use Auto-Failover Groups (read/write failover groups or geo-replication).
  • For SQL Managed Instance use Auto-failover groups.
  • For IaaS SQL servers, use Always On Availability Groups across regions with careful networking and replication (requires Azure Load Balancer / traffic configuration or ASR for DR).

Real-world project example E-commerce DB uses Azure SQL with an automatic failover group to replicate to a secondary region; app uses DNS name of the failover group listener to connect and supports read-only secondary reads.

Step-by-step (CLI for Azure SQL failover group)

  1. Create primary and secondary servers & DBs
  2. Create failover group
az sql failover-group create -g rg-sql --server sqlsrv-primary --name fg1 \
  --partner-server sqlsrv-secondary --partner-resource-group rg-sql-dr \
  --failover-policy Automatic
  3. Configure read/write listener endpoint for connection strings; the application uses the read/write listener and read-only routing to secondaries as needed.

Notes: Test failover regularly and design client retry logic; for strict RPO/RTO pick appropriate service tiers.
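The regular failover testing recommended above can be driven from the CLI by promoting the secondary (a sketch; run drills against a non-production copy first):

```shell
# Planned failover: promote the secondary region's server to primary
az sql failover-group set-primary \
  -g rg-sql-dr --server sqlsrv-secondary --name fg1

# Fail back after the drill
az sql failover-group set-primary \
  -g rg-sql --server sqlsrv-primary --name fg1
```

Because the application connects via the failover group's listener DNS name, no connection-string change is needed during the drill.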


55. How do you monitor cost for multi-subscription Azure environments?

Explanation Use Management Groups hierarchy and set Cost Management & Billing policies at required scopes; apply tags for chargeback; create budgets at subscription/management group scope; export billing and usage to storage or Power BI for cross-subscription reporting.

Real-world project example Global company groups subscriptions by region under management groups and applies budgets & alerts for each BU; exports usage daily to a storage account and loads into Power BI for aggregated dashboards.

Step-by-step

  1. Organize subscriptions into Management Groups.

  2. Tagging: enforce required tags via Azure Policy (costCenter, environment).

  3. Budgets: create budgets per subscription or mgmt group:

    # flag names per the current az CLI; dates are illustrative
    az consumption budget create --budget-name ProdBudget --amount 5000 --category cost \
      --time-grain monthly --start-date 2025-01-01 --end-date 2025-12-31
  4. Alerts & actions: attach action group to send email/webhook or trigger automation.

  5. Export daily usage to storage (Cost Management exports) and build Power BI reports.

  6. Use Cost Management APIs for automated reports.
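Step 5's daily export can be automated with the Cost Management export API. A hedged sketch: it assumes the costmanagement CLI extension, and all names/IDs are illustrative.

```shell
# Daily usage export to a storage container for Power BI ingestion
az costmanagement export create \
  --scope "/subscriptions/<sub-id>" \
  --name daily-usage \
  --type Usage --timeframe MonthToDate \
  --storage-account-id /subscriptions/<sub-id>/resourceGroups/rg-fin/providers/Microsoft.Storage/storageAccounts/stbilling \
  --storage-container exports --storage-directory usage \
  --recurrence Daily \
  --recurrence-period from=2025-01-01T00:00:00Z to=2026-01-01T00:00:00Z
```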


56. Explain Secure Score in Azure Security Center (Defender for Cloud)

Explanation Secure Score is an assessment metric that scores your security posture based on Microsoft Defender for Cloud recommendations. It shows prioritized improvement actions and expected security posture increase if you implement them.

Real-world project example An operations team monitors Secure Score weekly, implements high-impact recommendations (enable MFA, enable endpoint protection, enable disk encryption), and tracks progress across subscriptions.

How to use

  1. Open Microsoft Defender for Cloud → Secure Score.
  2. Review top recommendations, see impact & resources affected.
  3. Implement remediation (Azure Policy assignments, resource changes) and track score improvement.

Automation: Use automated playbooks to remediate low-effort/high-impact recommendations.


57. How do you handle compliance requirements (GDPR, HIPAA, PCI-DSS) in Azure?

Explanation Compliance is about people/process/tech. In Azure: choose compliant regions, enable encryption (CMK), use Private Endpoints, implement least privilege & auditing, and use Azure services with compliance attestations. Use Azure Policy, Azure Blueprints (compliance blueprints), Compliance Manager, and Azure Security Center to measure & remediate.

Real-world project example Healthcare app (HIPAA): host data in approved regions, use Key Vault + CMK, restrict access with private endpoints, enable audit logging/retention, apply HIPAA/ISO/PCI blueprints and document controls in Compliance Manager.

Step-by-step

  1. Assess: map required controls to Azure services using Compliance Manager.
  2. Deploy baseline: apply compliance blueprint (Azure Blueprint for HIPAA/PCI), policy assignments for encryption, allowed regions, resource types.
  3. Implement technical controls: encryption, access control, network segmentation, logging retention.
  4. Continuous compliance: enable Defender for Cloud, continuous export of findings, and remediation playbooks.
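Step 2's compliance baseline can be expressed as a policy initiative assignment. A sketch: the built-in initiative name must be looked up first and is shown as a placeholder.

```shell
# Find the built-in HIPAA/HITRUST initiative, then assign it at subscription scope
az policy set-definition list \
  --query "[?contains(displayName, 'HITRUST')].{name:name,displayName:displayName}" -o table

az policy assignment create \
  --name hipaa-baseline \
  --scope /subscriptions/<sub-id> \
  --policy-set-definition <initiative-name-from-above>
```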

58. Explain how Azure Lighthouse helps MSPs manage customer subscriptions

Explanation Azure Lighthouse enables managed service providers (MSPs) to manage customer subscriptions/resources at scale using delegated resource management—customers grant explicit, auditable access to the MSP. MSPs can use unified view, templates, automation, and scale managed services across tenants.

Real-world project example An MSP manages 50 customers; each customer grants the MSP delegated access via ARM template or Marketplace offer; MSP uses Azure Policy, Sentinel, and automation across customers from a single management tenant.

Step-by-step (onboard customer via ARM template)

  1. Customer runs a Lighthouse ARM template which includes Microsoft.ManagedServices/registrationDefinitions with delegated role assignments.
  2. Customer assigns the registration definition scope to subscriptions/resource groups granting the MSP defined roles.
  3. MSP sees customer tenancy resources in their portal under “Customer resources” and can manage per delegated roles.

Notes: Use minimum required roles, audit delegated access, and use Lighthouse with automation and monitoring tools.
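From the customer side, the delegated access mentioned above can be audited with the managedservices commands (a minimal sketch):

```shell
# What the customer has delegated, and to whom
az managedservices definition list -o table   # registration definitions (MSP + roles)
az managedservices assignment list -o table   # active delegations (scope + definition)
```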


59. When would you use Azure Dedicated Host vs Scale Set?

Explanation

  • Azure Dedicated Host: provides a physical server dedicated to single tenant — use when hardware isolation, regulatory/licensing (e.g., BYOL) or compliance require dedicated hardware.
  • Virtual Machine Scale Set (VMSS): auto-scalable group of identical VMs across the Azure shared infrastructure — use for elastic, high-scale stateless workloads.

Real-world project example Government workload requiring physical isolation uses Dedicated Hosts; a web front end with autoscaling uses VMSS.

Step-by-step (create Dedicated Host group & host)

# host group (fault-domain count is required)
az vm host group create -g rg-hosts -n myHostGroup -l eastus --platform-fault-domain-count 1

# create a host (specify SKU)
az vm host create -g rg-hosts -n myHost --host-group myHostGroup --sku DSv3-Type1
# then create VM on that host (assign host)
az vm create -g rg-hosts -n vm1 --image Win2019Datacenter --host myHost ...

When to pick which

  • Dedicated Host → compliance / isolation / licensing.
  • VMSS → horizontal scale, autoscale, cloud-native designs.
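For contrast with the Dedicated Host commands above, the VMSS path is a single create with autoscale attached. A sketch: resource names, sizes, and counts are illustrative.

```shell
# Elastic, stateless web tier on shared infrastructure
az vmss create -g rg-web -n vmss-web --image Ubuntu2204 \
  --instance-count 2 --vm-sku Standard_D2s_v3

# Autoscale profile: scale out on sustained CPU pressure
az monitor autoscale create -g rg-web --name autoscale-web \
  --resource vmss-web --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --min-count 2 --max-count 10 --count 2
az monitor autoscale rule create -g rg-web --autoscale-name autoscale-web \
  --condition "Percentage CPU > 70 avg 5m" --scale out 2
```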

60. Explain how to secure AKS nodes with Azure Policy

Explanation Azure Policy for AKS (built on OPA Gatekeeper) enables cluster-level enforcement: restrict privileged containers, disallow hostPath, ensure allowed sysctls, ensure use of approved images, enforce Pod Security Standards, and require node auto-upgrade / baseline security. The Azure Policy add-on for AKS enforces policies at the Kubernetes API level.

Real-world project example Enterprise enforces policies to block privileged containers and required Pod Security Standards across all AKS clusters via management group policy assignment.

Step-by-step (enable & use)

  1. Create AKS with Azure Policy addon
az aks create -g rg-aks -n aks-sec --enable-addons azure-policy --enable-managed-identity --node-count 3 --node-vm-size Standard_DS3_v2 --generate-ssh-keys
  2. Assign built-in policy definitions (e.g., “Kubernetes cluster should not allow privileged containers”) at management group or subscription scope in the portal or via az policy assignment create.
  3. Use a Kubernetes policy initiative (Microsoft provides built-in initiatives) to apply a set of constraints together.
  4. Monitor non-compliant resources in Azure Policy & remediate (deny or audit). To block new non-compliant workloads, use deny/OPA constraints.

Additional node hardening

  • Use latest node image (kured/automatic node image upgrades) and use system node pools with minimal privileges; enable Azure Defender for containers.
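Compliance results from the monitoring step can be queried with the policy insights commands (a sketch; the resource group name is illustrative):

```shell
# Non-compliant Kubernetes policy states for the cluster's resource group
az policy state list -g rg-aks \
  --filter "complianceState eq 'NonCompliant'" \
  --query "[].{policy:policyDefinitionName,resource:resourceId}" -o table
```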
