[FEATURE] refactor(blueprints): consolidate full-single-node-cluster into unified full-multi-node-cluster #264

@katriendg

Description

Target Components

  • Terraform modules
  • Bicep modules
  • Blueprints
  • GitHub Actions
  • Documentation
  • Other

Other Component Details

Copilot instructions, agent files, learning katas, scripts, sitemap, and Go test utilities.

Problem Statement

The full-single-node-cluster and full-multi-node-cluster blueprints share ~90% of their Terraform and Bicep code but diverge in multi-node support, Arc machine integration, and a handful of variables. Both blueprints must be kept in sync whenever components change, creating maintenance burden, code duplication, and frequent drift. The single-node blueprint has historically been the "default" recommendation for getting started, yet the multi-node variant adds only ~14 extra variables and conditional VM/Arc logic. There is no technical reason to maintain two separate blueprints when a single configurable blueprint can serve both use cases.

Proposed Solution

Consolidate both blueprints into a single full-multi-node-cluster blueprint that supports single-node and multi-node deployments via a configurable nodes count variable (defaulting to 1).

The work proceeds in five phases:

  1. Backport unique multi-node features into full-single-node-cluster (14 variables, Arc machine data sources, conditional VM host creation, multi-node cluster slicing, server token generation)
  2. Delete the current full-multi-node-cluster and rename full-single-node-cluster to full-multi-node-cluster
  3. Update all ~195 references across ~58 files in documentation, copilot instructions, agents, learning content, application READMEs, sidebars, scripts, and sitemap
  4. Update documentation language so the unified blueprint is the recommended starting point with guidance: "Set nodes to 1 for single-node deployments (default). Increase to support multi-node clusters."
  5. Validate that no orphaned references remain, that Terraform validation passes, and that Bicep builds succeed

Benefits

  • Eliminates code duplication and blueprint drift between single-node and multi-node variants
  • Reduces ongoing maintenance burden when components change (update one blueprint instead of two)
  • Provides a single entry point for all "full" deployments, simplifying the getting started experience
  • Makes the single-node experience a natural subset of multi-node without losing any functionality
  • Simplifies documentation by removing the need to describe and distinguish two overlapping blueprints

Alternative Solutions

  • Keep both blueprints and use shared modules. This reduces duplication in IaC code but does not eliminate the documentation and reference sprawl across ~58 files. Both blueprints would still need independent testing and validation.
  • Create a third "unified" blueprint alongside the existing two. This avoids breaking existing references but adds a third blueprint to maintain and creates confusion about which one to use.

Implementation Ideas

Phase 1 — Backport unique multi-node capabilities

Terraform variables only in full-multi-node-cluster (14 variables):

| Variable | Purpose |
| --- | --- |
| should_use_arc_machines | Toggle Arc machine mode vs. VM-hosted nodes |
| arc_machine_count | Number of pre-existing Arc machines to use |
| arc_machine_name_prefix | Naming prefix for Arc machine lookups |
| arc_machine_resource_group_name | Resource group containing Arc machines |
| host_machine_count | Number of VMs to create (multi-node) |
| cluster_server_ip | Explicit server IP override |
| cluster_server_host_machine_username | SSH username for cluster server |
| onboard_identity_type | Identity type for Arc onboarding |
| enable_auto_scaling | AKS auto-scaling toggle |
| min_count / max_count | AKS auto-scaling bounds |
| dns_prefix | AKS DNS prefix |
| subnet_address_prefixes_aks | AKS subnet CIDR |
| subnet_address_prefixes_aks_pod | AKS pod subnet CIDR |
| aks_private_dns_zone_id | Private DNS zone for AKS |
| should_enable_otel_collector | OpenTelemetry collector for edge IoT Ops |

Terraform main.tf logic only in full-multi-node-cluster:

  • Arc machine data sources (data "azurerm_arc_machine") and deferred prefix computation
  • Conditional VM host creation (count = local.should_use_arc_machines ? 0 : 1)
  • Multi-node cluster machine slicing for server vs. worker nodes
  • cluster_node_machine / cluster_node_machine_count parameters to edge_cncf_cluster
  • should_generate_cluster_server_token for multi-node join
  • arc_onboarding_principal_ids for Arc machine identities

Logic to keep from full-single-node-cluster:

  • should_deploy_aio conditional gating on IoT Ops, assets, messaging modules
  • should_create_secret_sync_identity linked to should_deploy_aio
  • should_create_schema_registry and should_create_adr_namespace conditionals in cloud_data
  • tags variable passed to cloud_ai_foundry
  • All connector .tfvars.example files (dataflow, REST, ONVIF, SSE, Avro-to-JSON, Foundry)

Bicep: Unify main.bicep, main.bicepparam, and types.core.bicep similarly.

Example files to consolidate: simple.tfvars.example and simple-arc.tfvars.example from multi-node plus all .tfvars.example files from single-node.
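The example-file consolidation could be scripted along these lines. This is a minimal sketch, not part of the proposal: the function name and both directory arguments are hypothetical, and the real location of the .tfvars.example files inside each blueprint is assumed rather than known.

```shell
#!/bin/sh
# Copy every *.tfvars.example from one blueprint directory into another,
# refusing to overwrite a file that already exists in the destination.
# The body runs in a subshell, so its variables stay local to the call.
# $1: source directory (assumed: Terraform folder of the multi-node blueprint)
# $2: destination directory (assumed: Terraform folder of the single-node blueprint)
merge_tfvars_examples() (
  src=$1
  dst=$2
  status=0
  for file in "$src"/*.tfvars.example; do
    [ -e "$file" ] || continue              # the glob matched nothing
    name=$(basename "$file")
    if [ -e "$dst/$name" ]; then
      echo "conflict, resolve by hand: $name" >&2
      status=1
    else
      cp "$file" "$dst/$name"
      echo "copied $name"
    fi
  done
  [ "$status" -eq 0 ]                       # non-zero if any conflict was seen
)
```

A name collision is reported rather than resolved automatically, because two examples with the same name may describe different deployments and need a manual merge.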

Phase 2 — Rename

  1. Delete blueprints/full-multi-node-cluster/
  2. Rename blueprints/full-single-node-cluster/ to blueprints/full-multi-node-cluster/
  3. Update blueprint tag "full-single-cluster" to "full-multi-cluster"
  4. Update module doc comment header
  5. Update tests/go.mod module path and Go test file package comments
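Steps 1 and 2 above can be sketched as a small shell function. The function name is hypothetical, and the sketch assumes Phase 1 has already been completed so that full-single-node-cluster holds the unified code.

```shell
#!/bin/sh
# Phase 2, steps 1 and 2: replace the old multi-node blueprint with the
# unified one. $1 is the repository root (defaults to the current directory).
# The body runs in a subshell, so its variables stay local to the call.
rename_blueprint() (
  root=${1:-.}
  old="$root/blueprints/full-single-node-cluster"
  new="$root/blueprints/full-multi-node-cluster"

  if [ ! -d "$old" ]; then
    echo "nothing to rename: $old not found" >&2
    exit 1
  fi

  rm -rf "$new"       # step 1: delete the current multi-node blueprint
  mv "$old" "$new"    # step 2: the unified blueprint takes over the name
)
```

Running `rename_blueprint .` from the repository root and then staging the blueprints/ directory records the change. Steps 3 to 5 are not covered here. For the Go module, `go mod edit -module` can rewrite the module line in tests/go.mod, but the actual module path is not stated in this issue.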

Phase 3 — Update all references (~58 files, ~195 references)

.github/ (instructions, agents, prompts) — 9 files, ~15 refs:

| File | Change |
| --- | --- |
| .github/copilot-instructions.md (6 refs) | Replace all paths; merge "Complete Deployments" entry; update examples |
| .github/agents/security-plan-creator.agent.md (2 refs) | Merge single/multi blueprint descriptions |
| .github/agents/learning-content-creator.agent.md (1 ref) | Update main.tf path |
| .github/agents/wasm-operator-builder.agent.md (1 ref) | Update .tfvars.example path |
| .github/instructions/wasm-build-deploy.instructions.md | Update .tfvars.example path |
| .github/instructions/wasm-operator-templates.instructions.md | Update .tfvars.example path |
| .github/instructions/README.md | Update bicep file path example |
| .github/prompts/terraform-from-blueprint.prompt.md | Update example blueprint name |
| .github/prompts/iotops-version-upgrade.prompt.md | Update main.tf path |

copilot/ — 4 files, ~5 refs:

| File | Change |
| --- | --- |
| copilot/getting-started.md | Change default blueprint name |
| copilot/deploy.md | Change suggested blueprint name |
| copilot/bicep/bicep.md | Verify reference (already multi) |
| copilot/terraform/terraform.md | Update directory listing |

docs/ — 16 files, ~55 refs:

| File | Change |
| --- | --- |
| docs/getting-started/general-user.md (8 refs) | Merge table rows, update tree/commands/links |
| docs/getting-started/blueprint-developer.md (8 refs) | Update cp command and test reference links |
| docs/contributing/testing-validation.md (4 refs) | Update test paths and cd commands |
| docs/contributing/documentation-development.md (1 ref) | Update sidebar loading example |
| docs/_sidebar.md (6 refs) | Remove duplicate entries; point to unified |
| docs/_parts/_sidebar.md (6 refs) | Same |
| docs/_parts/blueprints-sidebar.md (6 refs) | Same |
| docs/_parts/_navbar.md (2 refs) | Merge into single entry |
| docs/solution-adr-library/sse-connector-real-time-event-streaming.md (2 refs) | Update paths |
| docs/solution-adr-library/edge-video-streaming-and-image-capture.md (1 ref) | Update blueprint name |
| docs/solution-adr-library/akri-connector-component-organization.md (2 refs) | Update name and test reference |
| docs/solution-adr-library/onvif-connector-camera-integration.md (2 refs) | Update paths |
| docs/project-planning/scenarios/packaging-line-performance-optimization/prerequisites.md | Merge table rows; update link defs |
| docs/project-planning/scenarios/yield-process-optimization/prerequisites.md | Same |
| docs/project-planning/scenarios/operational-performance-monitoring/prerequisites.md | Same |
| docs/project-planning/scenarios/quality-process-optimization-automation/prerequisites.md | Same |
| docs/project-planning/scenarios/predictive-maintenance/prerequisites.md | Same |

blueprints/ (cross-references) — 8 files, ~20 refs:

| File | Change |
| --- | --- |
| blueprints/README.md (6 refs) | Merge two rows; update all paths |
| blueprints/only-output-cncf-cluster-script/README.md | Verify link (already correct after rename) |
| blueprints/only-edge-iot-ops/README.md (2 refs) | Remove single-node line or point to unified |
| blueprints/only-cloud-single-node-cluster/README.md | Same |
| blueprints/partial-single-node-cluster/README.md | Update comparison reference |
| blueprints/fabric/README.md | Update blueprint name |
| blueprints/fabric-rti/README.md | Update prerequisite blueprint name |

src/500-application/ — 7 files, ~45 refs:

| File | Change |
| --- | --- |
| src/500-application/505-akri-rest-http-connector/README.md (11 refs) | Rename all paths, commands, descriptions |
| src/500-application/508-media-connector/README.md (8 refs) | Same |
| src/500-application/508-media-connector/.nobuild (1 ref) | Update comment |
| src/500-application/509-sse-connector/README.md (8 refs) | Same |
| src/500-application/510-onvif-connector/README.md (8 refs) | Same |
| src/500-application/511-rust-embedded-wasm-provider/README.md (7 refs) | Same |
| src/500-application/512-avro-to-json/README.md (2 refs) | Same |

src/ (other) — 3 files, ~25 refs:

| File | Change |
| --- | --- |
| src/000-cloud/085-ai-foundry/README.md | Update blueprint names |
| src/100-edge/README.md (2 refs) | Update "Choose Blueprint" links |
| src/900-tools-utilities/904-test-utilities/README.md (20+ refs) | Update all test reference paths |

learning/ — 8 files, ~25 refs:

| File | Change |
| --- | --- |
| learning/katas/ai-assisted-engineering/100-getting-started-basics.md | Update blueprint name and link |
| learning/katas/ai-assisted-engineering/100-ai-development-fundamentals.md | Update prompt example |
| learning/katas/edge-deployment/README.md | Update link definition |
| learning/katas/fabric-integration/200-prerequisite-full-deployment.md | Bulk rename all paths/descriptions |
| learning/katas/fabric-integration/README.md | Update blueprint reference |
| learning/katas/troubleshooting/400-multi-component-debugging.md | Update all paths |
| learning/katas/troubleshooting/400-performance-optimization.md | Update prerequisite |
| learning/katas/troubleshooting/README.md | Update blueprint reference |

Root files — 3 files, ~5 refs:

| File | Change |
| --- | --- |
| sitemap.xml (2 refs) | Update both URLs |
| scripts/location-check.sh | Verify reference (already multi) |
| scripts/build/Detect-Folder-Changes.ps1 | Update example comments |
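The purely mechanical part of the Phase 3 renames could be scripted roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and skipping .git and .copilot-tracking is a choice made here (the latter because Phase 5 treats it as advisory only).

```shell
#!/bin/sh
# Rewrite every occurrence of the old blueprint name under $1 (default: .)
# and print each file that was changed. The body runs in a subshell, so its
# variables stay local to the call.
rewrite_blueprint_refs() (
  root=${1:-.}
  old='full-single-node-cluster'
  new='full-multi-node-cluster'
  list=$(mktemp)

  # Collect matching files first so nothing is rewritten while find is walking.
  # grep exits non-zero when nothing matches, which is not an error here.
  find "$root" -type f \
    ! -path '*/.git/*' ! -path '*/.copilot-tracking/*' \
    -exec grep -lF "$old" {} + > "$list" || true

  while IFS= read -r file; do
    # Go through a temp file (in-place sed flags differ between GNU and BSD),
    # then write back with cat so the original file mode is preserved.
    if sed "s/$old/$new/g" "$file" > "$file.tmp"; then
      cat "$file.tmp" > "$file"
      echo "updated $file"
    fi
    rm -f "$file.tmp"
  done < "$list"

  rm -f "$list"
)
```

A plain substitution only covers rows that say "update" or "rename". Rows that call for merging table entries or removing duplicate sidebar items still need manual editing afterwards, and the blueprint tag "full-single-cluster" is a different string that this function does not touch.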

Phase 4 — Documentation language

Update all references previously describing full-single-node-cluster as "the recommended starting blueprint" to describe full-multi-node-cluster with guidance:

Set the nodes count to 1 for single-node deployments (default). Increase to support multi-node clusters.

Key docs: copilot/getting-started.md, copilot/deploy.md, docs/getting-started/general-user.md, learning/katas/ai-assisted-engineering/100-getting-started-basics.md, blueprints/README.md.

Phase 5 — Validation

  • Confirm no CI/CD pipeline YAML files reference full-single-node-cluster (checked: none in .azdo/ or azure-pipelines.yml)
  • Confirm .copilot-tracking/ research documents are internal (advisory only, no updates required)
  • Verify tests/go.mod module path updated along with all _test.go files
  • Run grep -r "full-single-node-cluster" . after all changes to catch remaining references
  • Terraform validate and lint on unified blueprint
  • Bicep build validation on unified blueprint
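The checks above could be wrapped in two small functions, sketched here under assumptions. The function names are hypothetical, and the Terraform directory and Bicep file passed to validate_blueprint depend on the blueprint's actual layout, which this issue does not spell out. The Terraform lint step is left out because the linter in use is not named.

```shell
#!/bin/sh
# Print every remaining reference to the old blueprint name under $1
# (default: .) and fail if any exist. .git and .copilot-tracking are skipped,
# the latter because it is treated as advisory only.
check_stale_refs() (
  root=${1:-.}
  # /dev/null forces grep to prefix each hit with its file name.
  hits=$(find "$root" -type f \
           ! -path '*/.git/*' ! -path '*/.copilot-tracking/*' \
           -exec grep -nF 'full-single-node-cluster' /dev/null {} + || true)
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    exit 1
  fi
)

# Run Terraform validation and a Bicep build on the unified blueprint.
# $1: Terraform directory, $2: main.bicep path (both are assumed paths).
validate_blueprint() (
  tf_dir=$1
  bicep_file=$2
  terraform -chdir="$tf_dir" init -backend=false -input=false &&
  terraform -chdir="$tf_dir" validate &&
  az bicep build --file "$bicep_file" --stdout > /dev/null
)
```

A typical run would be `check_stale_refs . && validate_blueprint <terraform-dir> <path-to-main.bicep>`, with the placeholders replaced by the real paths inside blueprints/full-multi-node-cluster/.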

Potential Challenges

  • ~195 cross-references across ~58 files require careful bulk renaming; a missed reference will produce broken links or incorrect documentation.
  • Bicep unification requires merging two separate sets of main.bicep / main.bicepparam / types.core.bicep files, which may have diverged in different ways than the Terraform code.
  • Go test module path in tests/go.mod and all _test.go package comments must be updated consistently to avoid build failures.
  • Blueprint layering documentation (e.g., fabric-rti layered on top of full-single-node-cluster) must be updated to reference the new name while preserving the layering semantics.
  • Users with existing deployments using full-single-node-cluster will need to be aware of the path change if they reference the blueprint by directory path.

Additional Context

Reference count summary:

| Category | Files Affected | Approximate Reference Count |
| --- | --- | --- |
| .github/ (instructions, agents, prompts) | 9 | ~15 |
| copilot/ | 4 | ~5 |
| docs/ | 16 | ~55 |
| blueprints/ (cross-references) | 8 | ~20 |
| src/500-application/ | 7 | ~45 |
| src/ (other) | 3 | ~25 |
| learning/ | 8 | ~25 |
| Root files (sitemap.xml, scripts) | 3 | ~5 |
| Total | ~58 files | ~195 references |

Acceptance criteria:

  • blueprints/full-single-node-cluster/ directory no longer exists
  • blueprints/full-multi-node-cluster/ contains the unified blueprint with all features from both predecessors
  • Setting nodes = 1 (or the equivalent default) produces behavior identical to the old full-single-node-cluster
  • Multi-node capabilities (Arc machines, host_machine_count > 1, worker node slicing) function correctly
  • All .tfvars.example files from both blueprints are present in the unified blueprint
  • Zero references to full-single-node-cluster remain in the repository
  • All documentation, sidebar, navigation, and learning content updated
  • Terraform validate passes
  • Bicep build passes

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), needs-triage (Requires triage and prioritization), refactor (Code refactoring, no version bump)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests