platform-infra

Last updated: 2026-05-03.

Infrastructure and deployment orchestration for the AxiomNode platform.

Responsibility

What this repository owns

  • Kubernetes base manifests and overlays.
  • Environment-specific compose assets.
  • Infrastructure validation and deployment automation.
  • Cross-repository image build orchestration.
  • Local dev orchestration of the full-stack container runtime.

Runtime role

Distribution logic

  • dev

    • Local-only distribution.
    • Full stack runs via Docker Compose on a developer machine.
    • Service-to-service connections are local (localhost/host.docker.internal).
    • Entry point: scripts/dev-local-stack.sh.
  • stg

    • Remote Kubernetes distribution on sebss@amksandbox.cloud.
    • Public domains route through ingress.
    • The default staging overlay currently deploys the split AI runtime (ai-engine-api, ai-engine-stats, ai-engine-cache) in-cluster.
    • The llama runtime may remain external and is therefore still a runtime-routing concern rather than a pure manifest concern.
    • When you need the full in-cluster AI topology again, use the optional kubernetes/overlays/stg-with-ai-engine variant through manual deploy.
    • CI/CD auto-deploy target after successful image builds on main.
  • prod

    • Final distribution tier for production scalability.
    • Can run distributed services and external cloud-managed resources (DB, ingress, scaling).
    • Deployment is manual/controlled, not the default automatic target.
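The tier-to-overlay selection described above, including the optional stg-with-ai-engine variant, can be sketched as a small helper. The overlay paths and the include_ai_engine flag come from this README; the helper itself is illustrative, not a script that exists in the repo:

```shell
# Illustrative helper: map an environment tier (plus the optional
# include_ai_engine flag) to the kustomize overlay it should render.
# Overlay paths follow this repository's kubernetes/overlays layout.
overlay_path() {
  local env="$1" include_ai_engine="${2:-false}"
  if [ "$env" = "stg" ] && [ "$include_ai_engine" = "true" ]; then
    echo "kubernetes/overlays/stg-with-ai-engine"
  else
    echo "kubernetes/overlays/${env}"
  fi
}

# Example: render the default staging overlay locally.
# kubectl kustomize "$(overlay_path stg)"
```

Rendering through a single selection point like this keeps the "default stg" versus "stg with in-cluster AI" decision in one place instead of scattered across deploy steps.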

Runtime surface

Repository structure

  • kubernetes/: base resources + dev/stg/prod overlays.
  • environments/: compose-based integration environments.
  • terraform/: infrastructure as code modules.
  • .github/workflows/: CI/CD workflows.

Documentation

  • kubernetes/README.md
  • kubernetes/base/README.md
  • environments/dev/README.md
  • environments/stg/README.md
  • environments/prod/README.md
  • terraform/README.md

Ownership boundary

This repository owns deployment and packaging policy, not business behavior.

Concrete ownership includes:

  • manifest shape and overlay composition
  • image selection and rollout behavior
  • cross-repository build orchestration
  • environment rendering and rollout validation

Service-specific business contracts, route semantics, and domain validation belong in their respective service repositories.

Local setup

CI/CD workflows

  • validate-infra.yml

    • Trigger: push (main, develop), pull request, manual dispatch.
    • Purpose: validates required infrastructure directories, blocks mutable Kubernetes :latest image tags, and renders dev/stg/prod overlays with kubectl kustomize.
  • build-push.yaml (Build & Push Docker Images)

    • Trigger: push (main, develop) and manual dispatch.
    • Purpose: detects changed services (or selected service), checks out source repos, and publishes images to GHCR.
    • Notes:
      • Uses CROSS_REPO_READ_TOKEN to access private source repos.
      • Publishes dev tags, and on main also publishes stg.
      • Covered service repos dispatch this workflow only after their own validation jobs succeed on main.
      • Optional publish_prod_tag=true on manual dispatch adds mutable prod tags for controlled production promotion.
  • deploy.yaml (Deploy to Kubernetes)

    • Trigger: successful completion of build-push.yaml on main, or manual dispatch.
    • Current policy: automatic deployment is pinned to stg.
    • Purpose: validates manifests, renders the selected overlay, applies manifests to k3s, and waits for rollout.
    • Notes:
      • Workflow-driven staging deploys pin changed services to the immutable short-SHA tags produced by the triggering build run.
      • Manual deploys keep the environment tags (stg/prod) and still force restarts when a mutable tag must be refreshed.
      • Manual staging deploys can opt into the stg-with-ai-engine overlay with include_ai_engine=true.
    • Safety: rollout status + available replica checks fail the workflow if services are not healthy.
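The immutable short-SHA pinning mentioned above can be sketched as follows. Only the pin-to-short-SHA policy comes from this README; the 7-character truncation and the image name are assumptions for illustration:

```shell
# Sketch of the tag-pinning rule: staging deploys pin changed services
# to an immutable short-SHA tag from the triggering build run.
# The 7-character truncation and the image name are assumptions.
short_sha_tag() {
  local image="$1" sha="$2"
  printf '%s:%.7s\n' "$image" "$sha"
}

short_sha_tag "ghcr.io/axiomnode/api-gateway" "0123abcdeffedcba9876"
# → ghcr.io/axiomnode/api-gateway:0123abc
```

Pinning to a commit-derived tag is what lets a failed rollout be traced back to the exact build that produced it, unlike the mutable stg/prod tags used by manual deploys.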

Deployment and operations notes

Current automation chain

  1. A service repo receives a push on main.
  2. That repo's CI validates the build, tests, lint, and any service-specific smoke checks.
  3. Only after those checks succeed does the repo dispatch platform-infra/.github/workflows/build-push.yaml with a service input.
  4. Build/push publishes updated image tags in GHCR.
  5. deploy.yaml runs and applies changes to axiomnode-stg.
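Step 3 of the chain can be sketched as a single GitHub CLI call. The workflow file and the service input come from this README; the org/repo slug and the use of gh are assumptions, and the sketch prints the command rather than executing it:

```shell
# Sketch of step 3: a covered service repo dispatching the cross-repo
# build workflow. The workflow file and "service" input are from this
# README; the org/repo slug and the gh CLI invocation are assumptions.
SERVICE="api-gateway"
DISPATCH=(gh workflow run build-push.yaml -R AxiomNode/platform-infra -f "service=${SERVICE}")

# Print instead of executing, to keep the sketch side-effect free:
echo "${DISPATCH[*]}"
```

In the real chain this dispatch is issued by the service repo's own CI with CROSS_REPO_READ_TOKEN-backed credentials, never by hand from a developer machine.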

Covered automatic chain services:

  • api-gateway
  • bff-mobile
  • bff-backoffice
  • backoffice
  • ai-engine-api
  • ai-engine-stats
  • microservice-quizz
  • microservice-wordpass
  • microservice-users

Not covered by this automatic GHCR-to-k3s chain:

  • mobile-app
  • external llama runtime hosts

The optional stg-with-ai-engine overlay is still required when you want the full in-cluster AI topology, including the llama runtime, for controlled smoke tests or diagnostic runs.

Effective runtime caveat

platform-infra describes the deployed resources but not the whole effective topology.

In particular:

  • bff-backoffice can persist service-target overrides
  • api-gateway can persist the live ai-engine target
  • ai-engine-api can persist the active llama target

Operational documentation must therefore be read together with the central runtime-routing documents.

Local dev stack

Run all dev services locally with a single script:

./scripts/dev-local-stack.sh up cpu

Useful commands:

./scripts/dev-local-stack.sh status
./scripts/dev-local-stack.sh logs api-gateway
./scripts/dev-local-stack.sh down

Staging canary

Run an in-cluster ai-engine canary against staging without port-forwarding, but only when you deliberately deploy the optional in-cluster ai-engine manifests:

./scripts/ai-engine-stg-canary.sh

Run a public staging smoke test covering the edge, aggregated services, apps, and AI-exposed checks (the llama runtime is excluded):

./scripts/smoke-stg-edge.sh

Useful overrides:

GAME_TYPE=word-pass QUERY="sistema solar" ./scripts/ai-engine-stg-canary.sh
QUERY="teorema de pitagoras" CATEGORY_ID=19 NUM_QUESTIONS=3 ./scripts/ai-engine-stg-canary.sh

Dependencies and contracts

Required secrets in this repository

  • CROSS_REPO_READ_TOKEN
  • GHCR_PULL_USERNAME
  • GHCR_PULL_TOKEN
  • K3S_HOST
  • K3S_USER
  • K3S_SSH_KEY

GITHUB_TOKEN is used by the build workflow to publish packages to GHCR.

Local sealed-secret inputs

  • Keep real environment secret files only in untracked local paths such as secrets/dev.env, secrets/stg.env, and secrets/prod.env.
  • Use the committed templates under secrets/*.env.example as the starting point.
  • Do not commit populated secrets/*.env files; .gitignore intentionally blocks them.
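The template-to-local-file step above can be scripted. A minimal sketch, assuming the secrets/ layout described in this README; the helper function itself is illustrative:

```shell
# Copy each committed template to its untracked local counterpart,
# without overwriting files that already exist.
seed_local_secrets() {
  local dir="${1:-secrets}" env src dst
  for env in dev stg prod; do
    src="${dir}/${env}.env.example"
    dst="${dir}/${env}.env"
    if [ -f "$src" ] && [ ! -f "$dst" ]; then
      cp "$src" "$dst"
      echo "created $dst -- fill in real values before use"
    fi
  done
}

seed_local_secrets
```

Because the populated files are gitignored, the copy step has to be repeated on every fresh clone; the guard against overwriting keeps reruns safe.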

Repository-local documentation scope

This repository should document:

  • what gets built and deployed
  • which overlays exist and when each one is used
  • how automatic versus manual rollout works
  • which parts of runtime behavior are outside declarative manifest ownership

References

  • kubernetes/README.md
  • environments/dev/README.md
  • environments/stg/README.md
  • environments/prod/README.md
