History

Revisions

  • monitoring-gmp: new page — prod GMP stack (PodMonitoring + materializer-lag Rules) + #103 flip prep status; cross-linked Home/Sidebar

    @kadyapam kadyapam committed Jun 19, 2026
    8f51b63
  • wiki: mtls-rust-stack - add Helm chart values-gated mTLS (Phase 4d)

    automation/helm/noetl/ now exposes the same mTLS shape via values; off by default. Documents the tls.* values block, the cert-manager template, the server + worker-pool patches, and the GKE issuer choice (GCP CAS / SPIRE ClusterIssuer for prod, self-signed bootstrap for kind).

    noetl/ops#165 (Secrets Wallet Phase 4 GKE flip, noetl/ai-meta#61).

    @kadyapam kadyapam committed Jun 6, 2026
    b2069ad
  • wiki: mTLS (Rust stack) page - cert-manager certs + manifests (Phase 4c)

    cert-manager-issued mutual TLS between noetl-server-rust + noetl-worker-rust; ci/manifests/noetl/tls/ runbook + the two probe/init caveats. Cross-linked from Home + _Sidebar.

    noetl/ops#163 (Secrets Wallet Phase 4c, noetl/ai-meta#61).

    @kadyapam kadyapam committed Jun 6, 2026
    b85a01e
  • docs(home): document validate-shard-routing-n2.sh

    Phase F R4-5 of noetl/ai-meta#49. Adds a row to the "Kind-cluster validation rigs" table for the new validate-shard-routing-n2.sh script: the end-to-end DbPoolMap routing validation that complements the R3b drift-guard with actual data-residency checks across N=2 shards.

    Refs noetl/ops#160
    Refs noetl/ai-meta#49

    @kadyapam kadyapam committed Jun 4, 2026
    7460902
  • docs(home): document validate-shard-drift-guard.sh

    Phase F R3b-3 of noetl/ai-meta#49. Adds a row to the "Kind-cluster validation rigs" table for the new validate-shard-drift-guard.sh script: the end-to-end shard-routing drift-guard that posts to both the noetl-server shard-info endpoint (R3b-1) and the noetl-gateway twin (R3b-2) across a battery of (execution_id, shard_count) pairs.

    Refs noetl/ops#157
    Refs noetl/ai-meta#49

    @kadyapam kadyapam committed Jun 4, 2026
    8c7035b
  • add System worker pool deploy page (proposed)

    Operational companion to the docs-site ADR (https://noetl.dev/docs/architecture/system_pool_and_wasm_plugins). Reserves the manifest + Helm shape for the proposed ``worker-system-pool``:
    - Workload inventory after the Python → Rust migration completes.
    - NATS routing: per-pool scheme extension to ``noetl.commands.system.>`` and the ``system_*`` POOL_FILTER_MAP family.
    - KEDA scaler manifest (smaller cap, tighter lag threshold than user pools).
    - Deployment manifest with ``--mode=system`` args and WASM cache volume.
    - ServiceAccount + RBAC differences (distinct from user pool SA).
    - Helm chart integration via opt-in ``workerPools.system-pool`` values section (default disabled).
    - Kind validation rig sketch.

    Wired into Home + _Sidebar.

    Tracked under:
    - noetl/ai-meta#45: compiled rewrite of publisher / projector / server.
    - noetl/ai-meta#46: system pool plug-in surface.

    @kadyapam kadyapam committed Jun 2, 2026
    94852ff
  • docs(home): rust-worker-r2 rig now auto-pins to Rust worker

    Row mirrors the result-fetch + flight-tls rigs' worker-pinning annotation: all three are Rust-only and scale the Python worker pool to 0 for the duration of the run. Also notes the SQL probes' post-EE-4 filter shape (execution_id via psql -v, not worker_id).

    Refs noetl/ai-meta#35.

    @kadyapam kadyapam committed Jun 1, 2026
    aed2dfb
  • wiki: flight-tls-validation rig (R-2.3 Phase C2.5 + C2.6)

    @kadyapam kadyapam committed Jun 1, 2026
    de523b6
  • wiki: kind-cluster validation rigs catalog

    Documents the existing automation/development/ validation rigs as a discoverable list in Home.md. Two entries today:
    - rust-worker-r2-validation: R-2.1 / R-2.2 surface (Sep 2026).
    - result-fetch-validation: new (noetl/ops#131). Exercises the `result_fetch` tool kind end-to-end + codifies the Python-worker-pool-scale-to-0 convention required for the over-budget Arrow IPC branch.

    Per-rig deep-dive pages can land later; this list at least gives operators a discoverable entry point.

    Refs noetl/ai-meta#33
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

    @kadyapam kadyapam committed Jun 1, 2026
    ff4f4ee
  • wiki(agents): add firestore MCP page

    @kadyapam kadyapam committed May 24, 2026
    be04744
  • wiki(automation-gcp-gke): expand PgBouncer budget + Pending PVC cleanup

    Two operational sections fleshed out beyond the install-page brief:

    PgBouncer connection budget (replaces the prior 2-paragraph stub):
    - Layer diagram showing app sessions -> PgBouncer -> Cloud SQL.
    - Knobs table listing pgbouncer_pool_size, pgbouncer_max_client_conn, pgbouncer_replicas, pgbouncer_pool_mode, Cloud SQL max_connections, and per-pod inflight, plus where each one lives.
    - Cloud SQL tier defaults table (~50 conns on db-g1-small, etc).
    - Math:
        cloud_sql_max_connections >= pgbouncer_replicas * pool_size + reserved_admin
        pgbouncer_replicas * pgbouncer_max_client_conn >= sum(app inflight)
    - Sizing checklist: backends first, client capacity, bump PgBouncer before Cloud SQL tier, max_connections is a flag not a tier limit.
    - Diagnostics: pg_stat_activity from a server pod, pgbouncer SHOW POOLS and SHOW STATS via the admin port; how to read cl_waiting / maxwait to spot throttling.

    Stale Pending PVCs cleanup (replaces the prior 2-command snippet):
    - 5-step safe-cleanup recipe: inventory, confirm no pod / deploy / sts / job references the PVC, snapshot before delete, delete one at a time without force-removing finalizers, verify.
    - jq filters to find references in pod specs and workload templates.
    - Explanation of why a non-empty VOLUME field means the PVC was bound at some point and should not be deleted blindly.
    - Root-cause callout: stale Pendings come from kubectl apply of kind-profile manifests on GKE; the Helm chart's dynamic-provisioning PVCs avoid the issue.

    @kadyapam kadyapam committed May 24, 2026
    e0a9b4e
  • wiki(automation-gcp-gke): add GKE Helm install + clarify kind/GKE KEDA split

    New page automation-gcp-gke.md documenting the GKE install path:
    - Topology (Cloud SQL + PgBouncer + Helm NATS + chart-templated KEDA).
    - Prerequisites: gcloud / kubectl / helm, GCP project setup, one-time KEDA + cert-manager installs.
    - Install via the noetl_gke_fresh_stack.yaml playbook with the frequently-edited workload variables called out.
    - Upgrade flow (helm upgrade --reuse-values), including the gotcha where --reuse-values does not merge new chart defaults (PR #116 migration hit this).
    - Verify checklist including KEDA scaledobject, NATS durable consumer, Cloud SQL connectivity through PgBouncer, and a smoke run.
    - Tuning section for KEDA, PgBouncer connection budget, Cloud SQL HA.
    - Common pitfalls: two autoscalers fighting (HPA conflict that drove ops #115), live-patching the autoscaler (the anti-pattern that drove ops #116), worker durable consumer drift (noetl #600), and stale Pending PVCs.

    Update Home.md to add an Automation playbooks row for the new page.

    Update _Sidebar.md with an Automation section.

    Update manifests-keda.md to be honest about the kind/GKE split: the existing page sample (account: NOETL, nats.nats.svc:8222) is the kind-cluster artifact; the GKE artifact is chart-rendered with account: $G and nats-headless. Added a profile-note callout near the top and a cross-link in Related.

    Cross-references:
    - ai-meta decision doc 2026-05-24-gke-postgres-topology (Option A).
    - ops PRs #115, #116 (HPA conflict + KEDA chart promotion).
    - noetl PR #600 (worker consumer self-heal).

    @kadyapam kadyapam committed May 24, 2026
    bfd81eb
  • wiki(nats-supercluster): document single-node kind CPU footprint + mitigations

    @kadyapam kadyapam committed May 23, 2026
    48c5107
  • wiki(home): note Scope B consolidation - single source of manifests

    @kadyapam kadyapam committed May 23, 2026
    06529ef
  • wiki(ops): bootstrap with KEDA + NATS supercluster operational pages

    First content for the noetl/ops wiki:
    - Home.md: landing page; index of operational topics + cross-link to the noetl/noetl wiki for application docs.
    - _Sidebar.md: navigation.
    - manifests-keda.md: operational guide for ci/manifests/keda/ (architecture, sample manifest, helm install + apply + verify, smoke-test recipe, tuning, multi-cluster + per-tenant note).
    - manifests-nats-supercluster.md: operational guide for ci/manifests/nats-supercluster/ (cluster vs. supercluster, generated nats.conf example, apply + verify, three live-validation operational caveats: server_name / split /healthz / publishNotReadyAddresses, tuning).

    Companion: noetl/noetl wiki retains the Python generator API reference pages and adds prominent 'Operations:' callouts pointing here.

    Convention going forward: deployment-shaped content lives in this wiki next to the manifests + automation; Python API + DSL semantics in the noetl/noetl wiki.

    @kadyapam kadyapam committed May 23, 2026
    e755987
  • Initial Home page

    @kadyapam kadyapam committed May 23, 2026
    59946bb