AIITOps is an enterprise platform for running AI agents that plan, build, test, deploy, document, and operate customer applications through a governed control plane.
This repository is the control-plane foundation. It currently contains the Phase 0 bootstrap assets:
- project requirements and PRD
- repo standards and contribution workflow
- shared agent instruction baseline
- Python project metadata managed with `uv`
- local setup automation for the first development environment
Phase 1 adds:
- modular control-plane service architecture
- observability and secure configuration foundations
- Terraform scaffolding for Azure dev environments
- CI/CD workflows for validation and non-production deployment
Phase 2 adds:
- backend authentication modes for development and Microsoft Entra
- tenancy models for organizations, projects, and scoped memberships
- role-to-feature authorization checks
- approval-aware policy evaluation and audit tracking
- governed agent identity cataloging for managed-identity planning
Phase 3 adds:
- control-plane domain records for apps, environments, resources, events, and notifications
- a workflow engine with manual and scheduled definitions
- task state transitions with waiting, retry, and resume behavior
- notification routing for workflow and approval activity
- event capture with unusual-event filtering and archival hooks
- customer resource topology and dependency graph APIs
Phase 4 adds:
- a governed agent registry for the first six core agents
- context assembly from workflows, app metadata, requirements, and asset-library guidance
- Azure Foundry-compatible model access with safe preview-mode fallback
- policy-aware agent invocation with approval blocking for sensitive runs
- structured agent outputs, branch-level plans, and agent run history APIs
Phase 5 adds:
- GitHub-linked repository records for seeded applications
- backlog synchronization from Project Manager plans into GitHub-style work items
- assignment flows for humans and agents
- branch creation using Coding Agent branch proposals
- pull request creation with workflow and agent metadata
- testing evidence attachment and release-gate summaries for review
Phase 6 adds:
- a Flutter human UI under `apps/control_plane_ui`
- a web-first responsive shell for dashboard, workflows, inbox, delivery, and conversation views
- live control-plane API integration with deterministic demo-data fallback
- persona switching for Project Manager, Approver, and Local Admin workflows
- Flutter validation in both local tooling and CI
Phase 7 adds:
- a customer-plane application factory service for one reference MVP archetype
- requirement capture, backlog generation, repo registration, and scaffold planning in one flow
- per-app dev environment planning and Azure resource-topology registration
- automatic use of Project Manager, Coding, Testing, and Author agent runs inside the factory flow
- factory APIs for archetypes and run history
Phase 8 adds:
- governance review services for security, budget, operations, and nightly health bundles
- expanded agent coverage for Security, Budget, Operations, UserAdmin, Versioning, and Cloud Admin
- approval-aware admin and cloud change planning
- automatic execution of the nightly health workflow into a reviewable governance record
- governance APIs for findings, budget reviews, and stored nightly health-check runs
Phase 9 adds:
- persistent regression suites and readiness snapshots under `debug/`
- provider-readiness records for Azure-active and AWS/GCP interface-ready expansion
- platform testing-readiness reporting across docs, automation, governance, and artifacts
- release-readiness reporting for pull requests using evidence, approvals, governance, and regression signals
- a final pre-testing API surface that prepares the platform state for QA handoff
The initial implementation is Azure-first and currently centers on:
- a Flutter-based human UI
- Python backend services
- Microsoft Agent Framework orchestration
- Azure Foundry model usage
- GitHub-based planning and delivery workflows
- one reference customer application factory flow
- governed operations, security, budget, and admin review flows
- `pwsh -File .\scripts\bootstrap.ps1`
- `pwsh -File .\scripts\install-external-tools.ps1`
- `pwsh -File .\scripts\dev-shell.ps1`
- `.\.venv\Scripts\Activate.ps1`
- `uv run uvicorn aiitops.main:app --reload`
- `uv run pytest`
- `Set-Location .\apps\control_plane_ui`
- `& ".\.local-tools\flutter\flutter\bin\flutter.bat" run -d chrome`
- `pwsh -File .\scripts\validate.ps1`
- `pwsh -File .\scripts\prepare-testing.ps1 -RunValidation`

docs/ Product docs, ADRs, and design records
apps/ Flutter UI and future human-facing applications
infra/ Infrastructure-as-code and environment definitions
src/ Python source for the control plane
tests/ Automated tests
scripts/ Bootstrap and developer automation
temp/ Local temporary files (ignored)
debug/ Local debug output (ignored)
Reqs/ Source requirements documents
- `src/aiitops/api`: FastAPI app factory, middleware, and routers
- `src/aiitops/application`: service wiring and dependency container
- `src/aiitops/orchestration`: orchestration runtime primitives
- `src/aiitops/workflows`: workflow service foundation
- `src/aiitops/policy`: approval and policy guard foundations
- `src/aiitops/integrations`: Azure and GitHub integration boundaries
- `src/aiitops/observability`: logging and telemetry setup
- `src/aiitops_shared`: shared contracts and reusable library primitives
- `development` auth mode uses seeded local users and the optional `x-aiitops-user-id` header.
- `entra` auth mode validates bearer tokens against Microsoft Entra signing keys.
- Organizations and projects are access-scoped through memberships and feature mappings.
- Sensitive actions can return `require_approval` and create approval records before execution.
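The approval behavior described above can be pictured with a minimal sketch. The `PolicyGuard` and `PolicyDecision` names, the `allow` outcome, and the in-memory approval store are all hypothetical and are not taken from `src/aiitops/policy`; only the `require_approval` outcome comes from this README.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional
from uuid import uuid4


@dataclass
class PolicyDecision:
    """Result of a policy evaluation (hypothetical shape)."""
    outcome: str  # "allow" or "require_approval"
    approval_id: Optional[str] = None


@dataclass
class PolicyGuard:
    """Illustrative guard: sensitive actions produce a pending approval record."""
    sensitive_actions: set
    approvals: dict = field(default_factory=dict)

    def evaluate(self, user_id: str, action: str) -> PolicyDecision:
        if action not in self.sensitive_actions:
            return PolicyDecision(outcome="allow")
        # Record the approval request before anything is executed.
        approval_id = str(uuid4())
        self.approvals[approval_id] = {
            "user_id": user_id,
            "action": action,
            "status": "pending",
        }
        return PolicyDecision(outcome="require_approval", approval_id=approval_id)


guard = PolicyGuard(sensitive_actions={"deploy"})
decision = guard.evaluate("local-admin", "deploy")
print(decision.outcome)
```

In this sketch the caller is expected to stop when it sees `require_approval` and to proceed only after the stored record has been decided.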
- `GET /api/v1/identity/me`
- `GET /api/v1/identity/agents`
- `GET /api/v1/tenancy/organizations`
- `GET /api/v1/tenancy/organizations/{org_id}/projects`
- `POST /api/v1/policy/evaluate`
- `GET /api/v1/policy/approvals`
- `POST /api/v1/policy/approvals/{approval_id}/decisions`
- `GET /api/v1/policy/audit`
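As a rough illustration of calling one of these routes in `development` auth mode, the sketch below builds a request with the `x-aiitops-user-id` header. The base URL assumes uvicorn's default local port, and the `local-admin` user id is a made-up placeholder rather than a known seeded user; the request is only constructed, not sent.

```python
from urllib.request import Request


def build_identity_request(base_url: str, user_id: str) -> Request:
    """Build (but do not send) a GET request for the current identity."""
    url = base_url.rstrip("/") + "/api/v1/identity/me"
    # Header name comes from the README; the value is a placeholder user id.
    return Request(url, headers={"x-aiitops-user-id": user_id}, method="GET")


# Assumption: the API is running locally on uvicorn's default port 8000.
request = build_identity_request("http://localhost:8000", "local-admin")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call once the local API is running.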
- `GET /api/v1/domain/apps`
- `GET /api/v1/domain/apps/{app_id}`
- `GET /api/v1/domain/apps/{app_id}/resources`
- `GET /api/v1/workflows/definitions`
- `GET /api/v1/workflows/runs`
- `POST /api/v1/workflows/runs`
- `GET /api/v1/workflows/runs/{workflow_run_id}`
- `POST /api/v1/workflows/runs/{workflow_run_id}/tasks/{task_run_id}/transition`
- `POST /api/v1/workflows/runs/{workflow_run_id}/retry`
- `POST /api/v1/workflows/runs/{workflow_run_id}/resume`
- `POST /api/v1/workflows/scheduler/tick`
- `GET /api/v1/notifications`
- `POST /api/v1/notifications/{notification_id}/acknowledge`
- `GET /api/v1/events`
- `POST /api/v1/events/archive`
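The transition, retry, and resume routes suggest a small task state machine. The following is a sketch of how such transitions might be validated; the state names and the allowed-transition table are assumptions for illustration, not the actual workflow engine model.

```python
from enum import Enum


class TaskState(str, Enum):
    # Hypothetical state names; the real engine may use different ones.
    PENDING = "pending"
    RUNNING = "running"
    WAITING = "waiting"
    FAILED = "failed"
    COMPLETED = "completed"


# Which target states are reachable from each state in this sketch.
ALLOWED_TRANSITIONS = {
    TaskState.PENDING: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.WAITING, TaskState.FAILED, TaskState.COMPLETED},
    TaskState.WAITING: {TaskState.RUNNING},  # resume
    TaskState.FAILED: {TaskState.RUNNING},   # retry
    TaskState.COMPLETED: set(),              # terminal
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move task from {current.value} to {target.value}")
    return target


state = transition(TaskState.RUNNING, TaskState.WAITING)
state = transition(state, TaskState.RUNNING)  # resume after waiting
print(state.value)
```

Keeping the table explicit makes it easy to reject illegal moves, such as restarting a completed task, before any side effects happen.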
- `GET /api/v1/agents/definitions`
- `GET /api/v1/agents/definitions/{agent_id}`
- `POST /api/v1/agents/definitions/{agent_id}/context`
- `POST /api/v1/agents/definitions/{agent_id}/invoke`
- `GET /api/v1/agents/runs`
- `GET /api/v1/agents/runs/{agent_run_id}`
- `GET /api/v1/github/repositories`
- `GET /api/v1/github/repositories/{repo_id}`
- `GET /api/v1/github/repositories/{repo_id}/work-items`
- `POST /api/v1/github/repositories/{repo_id}/work-items/sync`
- `POST /api/v1/github/work-items/{work_item_id}/assignments`
- `GET /api/v1/github/repositories/{repo_id}/branches`
- `POST /api/v1/github/repositories/{repo_id}/branches`
- `GET /api/v1/github/repositories/{repo_id}/pull-requests`
- `POST /api/v1/github/repositories/{repo_id}/pull-requests`
- `GET /api/v1/github/pull-requests/{pull_request_id}`
- `POST /api/v1/github/pull-requests/{pull_request_id}/evidence`
- `apps/control_plane_ui/lib/src/presentation/control_plane_shell.dart` provides the responsive command-deck shell.
- `apps/control_plane_ui/lib/src/data/control_plane_repository.dart` maps Flutter interactions to the Phase 2-5 backend APIs.
- The UI defaults to the Azure Container App API and can be overridden with `AIITOPS_API_BASE_URL` for alternate environments.
- When the backend is unavailable, the app falls back to seeded demo data so flows remain reviewable.
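The override and fallback behavior can be sketched as follows. The real UI is written in Dart; this Python version only illustrates the pattern, and the function names, demo record, and default URL are hypothetical.

```python
from __future__ import annotations

from typing import Callable, Mapping

# Hypothetical seeded demo record, used only when the backend is unreachable.
DEMO_APPS = [{"id": "demo-app", "name": "Demo customer portal"}]


def resolve_base_url(env: Mapping[str, str], default: str) -> str:
    """Prefer AIITOPS_API_BASE_URL when set, otherwise use the default."""
    return env.get("AIITOPS_API_BASE_URL") or default


def load_with_fallback(
    fetch: Callable[[], list[dict]], demo: list[dict]
) -> tuple[list[dict], bool]:
    """Return (data, used_demo); fall back to demo data on connection errors."""
    try:
        return fetch(), False
    except OSError:  # covers ConnectionError and urllib.error.URLError
        return list(demo), True


def unavailable_backend() -> list[dict]:
    raise ConnectionError("backend is down")


apps, used_demo = load_with_fallback(unavailable_backend, DEMO_APPS)
print(used_demo, apps[0]["id"])
```

Returning the `used_demo` flag alongside the data lets the caller show that the content is seeded rather than live.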
- `GET /api/v1/factory/archetypes`
- `GET /api/v1/factory/runs`
- `GET /api/v1/factory/runs/{factory_run_id}`
- `POST /api/v1/factory/runs`
- `src/aiitops/factory/service.py` orchestrates the MVP factory across workflows, agents, GitHub delivery, Azure planning, and topology registration.
- `src/aiitops_shared/models/factory.py` defines the archetype, deployment-plan, scaffold, and factory-run contracts.
- The current MVP archetype is a customer service portal built with Flutter Web, FastAPI, and Azure-first infrastructure.
- Creating a factory run registers a new app, seeds a repository/backlog/PR, records deployment artifacts, and exposes the generated app through the existing domain routes.
- `GET /api/v1/governance/security/findings`
- `GET /api/v1/governance/budget/reviews`
- `GET /api/v1/governance/operations/findings`
- `GET /api/v1/governance/health-checks`
- `src/aiitops/governance/service.py` derives scoped security, budget, and operations signals and stores nightly health-check bundles.
- The nightly `wf-nightly-health-check` workflow now auto-completes with governance outputs instead of remaining an empty scheduled shell.
- The agent registry now includes governance and admin-focused agents, all enforced through the same policy and approval model as earlier phases.
- Identity posture for the expanded agent set is visible through `GET /api/v1/identity/agents`.
- `GET /api/v1/readiness/providers`
- `GET /api/v1/readiness/platform`
- `POST /api/v1/readiness/platform/prepare`
- `GET /api/v1/readiness/regression-suites`
- `POST /api/v1/readiness/regression-suites`
- `GET /api/v1/readiness/snapshots`
- `POST /api/v1/readiness/snapshots`
- `GET /api/v1/readiness/releases/{pull_request_id}`
- `src/aiitops/readiness/service.py` persists regression suites and readiness snapshots to `debug/regression-suites` and `debug/readiness-snapshots`.
- `scripts/prepare-testing.ps1` generates a local QA handoff pack, including JSON and Markdown summaries in `debug/testing-handoffs`.
- `src/aiitops/readiness/handoff.py` provides the CLI entry point for that handoff generation flow.
- `POST /api/v1/readiness/platform/prepare` runs the platform regression suite, ensures a governance health bundle exists, captures a snapshot, and returns a consolidated readiness report.
- Release-readiness reports evaluate passed evidence, release checkpoints, pending approvals, scoped health-check coverage, and the latest regression suite for the target app.
- Provider readiness is now explicit: Azure is the active implementation and AWS/GCP are represented as interface-ready expansion targets.
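To make the release-readiness inputs listed above concrete, here is a minimal sketch of combining them into a single report. The `ReleaseSignals` fields, the blocker messages, and the rule that any failing signal blocks the release are assumptions for illustration; the actual scoring in `src/aiitops/readiness/service.py` may differ.

```python
from dataclasses import dataclass


@dataclass
class ReleaseSignals:
    """Hypothetical inputs mirroring the signals named in this README."""
    evidence_passed: bool
    checkpoints_complete: bool
    pending_approvals: int
    health_check_covered: bool
    regression_passed: bool


def evaluate_release(signals: ReleaseSignals) -> dict:
    """Collect blockers; the release is ready only when there are none."""
    blockers = []
    if not signals.evidence_passed:
        blockers.append("testing evidence has not passed")
    if not signals.checkpoints_complete:
        blockers.append("release checkpoints are incomplete")
    if signals.pending_approvals > 0:
        blockers.append(f"{signals.pending_approvals} approval(s) still pending")
    if not signals.health_check_covered:
        blockers.append("no scoped health-check coverage")
    if not signals.regression_passed:
        blockers.append("latest regression suite did not pass")
    return {"ready": not blockers, "blockers": blockers}


report = evaluate_release(
    ReleaseSignals(
        evidence_passed=True,
        checkpoints_complete=True,
        pending_approvals=1,
        health_check_covered=True,
        regression_passed=True,
    )
)
print(report)
```

Listing every blocker, rather than stopping at the first, gives reviewers the full picture in a single report.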
- Security first and least privilege by default
- Reuse the shared asset library before introducing new components
- Keep `docs/` and `README.md` current with every meaningful change
- Use `.env` locally and never commit secrets
- Prefer cloud services over custom infrastructure when possible
- Do not use mockups in place of working flows