Phase 2 Service Onboarding

Ivan P edited this page Jun 30, 2026 · 5 revisions

Phase 2 adds OTel auto-instrumentation to all 8 services, enabling distributed tracing and application-level metrics (request rate, error rate, latency) for each service. Each service has three issues: local development setup (a local grafana/otel-lgtm stack), code changes (instrumentation library setup), and Helm chart changes (environment variable injection). Services are independent and can be done in any order.

Both signals — traces and metrics — are exported via the same OTLP connection to Alloy. Alloy routes traces → Tempo and metrics → Mimir. Logs are not exported via OTLP; they continue via stdout → Alloy scraping → Loki. OTel log correlation (trace_id injection into stdout logs) is enabled via OTEL_PYTHON_LOG_CORRELATION=true (Python) or the appropriate Node.js logging instrumentation.
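As an illustration of what log correlation means for the stdout log path, the sketch below uses only the standard library to stamp each log record with a trace ID and span ID. It is a stand-in, not the OTel logging instrumentation itself: the real instrumentation takes the IDs from the active span, whereas the IDs here are fixed example values, and the logger name and format string are assumptions made for the example.

```python
import io
import logging


class TraceContextFilter(logging.Filter):
    """Enrich every record with trace identifiers (illustrative stand-in)."""

    def __init__(self, trace_id: str, span_id: str) -> None:
        super().__init__()
        self.trace_id = trace_id
        self.span_id = span_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.otelTraceID = self.trace_id
        record.otelSpanID = self.span_id
        return True  # never drop the record, only enrich it


def build_logger(stream, trace_id: str, span_id: str) -> logging.Logger:
    logger = logging.getLogger("correlation-demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s [trace_id=%(otelTraceID)s span_id=%(otelSpanID)s] %(message)s"
    ))
    handler.addFilter(TraceContextFilter(trace_id, span_id))
    logger.addHandler(handler)
    return logger


# A StringIO stands in for stdout so the resulting line can be inspected.
buffer = io.StringIO()
log = build_logger(
    buffer,
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",  # example value only
    span_id="00f067aa0ba902b7",  # example value only
)
log.info("connection accepted")
line = buffer.getvalue().strip()
print(line)
```

With the trace ID present in each stdout line, a log entry in Loki can be matched to the corresponding trace in Tempo.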

Total estimated effort: ~29h across all 8 services (includes ~½h local dev setup per service)

Recommended starting points: acapy-agent and traction

For detailed per-service Helm chart changes see Helm Chart Assessments.


acapy-agent

Issue: acapy-agent — Local Development Setup

Labels: phase-2 local-dev python acapy

Estimated Effort: ~½h

Description: Add the grafana/otel-lgtm observability stack to the local development environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.

Context: acapy-agent uses a VS Code devcontainer (.devcontainer/) with docker-in-docker. Developers run aca-py directly in the devcontainer shell — not as a compose service. There is no root-level docker-compose.yml for general local dev.

Requirements:

  • Create docker-compose.observability.yml at the repo root with only the grafana/otel-lgtm service (ports 3000:3000, 4317:4317, 4318:4318)
  • Set the following env vars before running aca-py in the devcontainer shell (e.g. exported in the terminal or sourced from a .env.observability file):
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 (LGTM port is mapped to the devcontainer localhost via docker-in-docker)
    • OTEL_SERVICE_NAME=acapy-agent
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp
    • OTEL_PYTHON_LOG_CORRELATION=true
  • Update README or CONTRIBUTING with local observability instructions

Acceptance Criteria:

  • docker compose -f docker-compose.observability.yml up -d starts the LGTM stack
  • Grafana accessible at http://localhost:3000
  • Traces visible in Tempo after instrumentation is added and test requests are made locally
  • Developer docs updated

Issue: acapy-agent — OTel Instrumentation (Code)

Labels: phase-2 instrumentation python acapy

Estimated Effort: ~2.5h

Description: Add OpenTelemetry auto-instrumentation to acapy-agent to enable distributed tracing and application metrics for HTTP endpoints and PostgreSQL queries.

Requirements:

  • Add opentelemetry-api, opentelemetry-sdk, opentelemetry-exporter-otlp, opentelemetry-instrumentation-aiohttp-server to dependencies in pyproject.toml
  • Initialize OTel SDK in acapy_agent/commands/start.py inside run_app(), before conductor.setup(), and not in __main__.py (that file is only a thin debug/dispatch wrapper); set up both TracerProvider and MeterProvider with OTLP exporters
  • Add aiohttp server instrumentation in acapy_agent/admin/server.py, inside make_application(), where the middleware list is built
  • Add PostgreSQL tracing via opentelemetry-instrumentation-psycopg (psycopg v3); instrument at pool creation in acapy_agent/database_manager/databases/postgresql_normalized/connection_pool.py

Acceptance Criteria:

  • OTel SDK initializes on startup without errors
  • Inbound HTTP requests to the admin API generate trace spans visible in Tempo
  • PostgreSQL queries appear as child spans within HTTP request traces
  • HTTP RED metrics (http.server.request.duration histogram) appear in Mimir
  • No regression in existing test suite

Issue: acapy-agent — Helm Chart Updates

Labels: phase-2 helm acapy

Estimated Effort: ~1h

Requirements:

  • Add a native otel: section to values.yaml for per-chart settings (serviceName defaulting to the release name) and shared settings (enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation)
  • Support global.otel.* for all shared settings so that when acapy is deployed as a sub-chart (e.g. inside vc-authn-oidc), the parent's global values cascade down automatically; templates should prefer global with a local fallback: {{ .Values.global.otel.endpoint | default .Values.otel.endpoint }}
  • Render OTEL env vars conditionally in templates/deployment.yaml when otel.enabled (or global.otel.enabled) is true — treat OTel as a first-class chart feature

Acceptance Criteria:

  • otel.enabled: true + otel.endpoint in values.yaml results in correct env vars in the deployed pod
  • When deployed as a sub-chart, global.otel.* values from the parent chart are respected
  • Traces appear in Tempo after deployment
  • HTTP metrics appear in Mimir after deployment
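The "prefer global with a local fallback" rule in the requirements above can be modelled in a few lines of Python to check the expected precedence before writing the template. This is a sketch of the lookup logic only, not Helm itself. The mapping from values keys to env var names is an assumption inferred from the field names, and http://alloy:4318 is a hypothetical endpoint. Helm's default function treats empty values (including false) as unset, which the sketch mirrors with Python truthiness. A real template may also need to guard against a missing global.otel map.

```python
# Assumed mapping from shared values.yaml fields to OTel env vars.
SHARED_FIELDS = {
    "endpoint": "OTEL_EXPORTER_OTLP_ENDPOINT",
    "protocol": "OTEL_EXPORTER_OTLP_PROTOCOL",
    "tracesExporter": "OTEL_TRACES_EXPORTER",
    "metricsExporter": "OTEL_METRICS_EXPORTER",
    "logCorrelation": "OTEL_PYTHON_LOG_CORRELATION",
}


def _section(values: dict, *path: str) -> dict:
    """Walk nested keys, treating any missing level as an empty map."""
    node = values
    for key in path:
        node = (node or {}).get(key)
    return node or {}


def _as_env_value(value) -> str:
    """Render booleans the way env vars expect them ("true"/"false")."""
    return str(value).lower() if isinstance(value, bool) else str(value)


def render_otel_env(values: dict, release_name: str) -> list:
    global_otel = _section(values, "global", "otel")
    local_otel = _section(values, "otel")

    def pick(field: str):
        # Global wins when set and non-empty; otherwise fall back to local.
        return global_otel.get(field) or local_otel.get(field)

    if not pick("enabled"):
        return []  # OTel disabled: render no env vars at all

    # serviceName is per-chart, so it is never read from global.
    service_name = local_otel.get("serviceName") or release_name
    env = [{"name": "OTEL_SERVICE_NAME", "value": service_name}]
    for field, env_name in SHARED_FIELDS.items():
        value = pick(field)
        if value:
            env.append({"name": env_name, "value": _as_env_value(value)})
    return env


# Sub-chart scenario: the parent sets global.otel, the sub-chart keeps local defaults.
sub_chart_values = {
    "global": {"otel": {"enabled": True, "endpoint": "http://alloy:4318"}},
    "otel": {
        "endpoint": "http://localhost:4318",
        "protocol": "http/protobuf",
        "serviceName": "agent",
    },
}
env = render_otel_env(sub_chart_values, release_name="my-release")
for entry in env:
    print(f'{entry["name"]}={entry["value"]}')
```

In this scenario the endpoint comes from the parent's global values, while the protocol and service name come from the sub-chart's own otel: section.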

acapy-vc-authn-oidc

Issue: acapy-vc-authn-oidc — Local Development Setup

Labels: phase-2 local-dev python vc-authn

Estimated Effort: ~½h

Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.

Context: The local dev stack is fully compose-based (docker/docker-compose.yaml). No devcontainer. The controller runs as a compose service on the vc_auth network, so http://lgtm:4318 is the correct OTLP endpoint.

Requirements:

  • Add grafana/otel-lgtm service to docker/docker-compose.yaml on the vc_auth network (ports 3000:3000, 4317:4317, 4318:4318)
  • Add the following to the controller service environment block:
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
    • OTEL_SERVICE_NAME=acapy-vc-authn-oidc
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp
    • OTEL_PYTHON_LOG_CORRELATION=true
  • Update README or CONTRIBUTING with local observability instructions

Acceptance Criteria:

  • docker compose -f docker/docker-compose.yaml up starts the LGTM stack alongside the service
  • Grafana accessible at http://localhost:3000
  • Traces visible in Tempo after instrumentation is added and test requests are made locally
  • Developer docs updated

Issue: acapy-vc-authn-oidc — OTel Instrumentation (Code)

Labels: phase-2 instrumentation python vc-authn

Estimated Effort: ~2h

Description: Add OTel auto-instrumentation to the FastAPI backend with MongoDB and Redis driver tracing and application metrics.

Requirements:

  • Add opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-httpx, opentelemetry-instrumentation-pymongo, opentelemetry-instrumentation-redis to pyproject.toml
  • Initialize OTel SDK at the top of oidc-controller/api/main.py, before app = get_application() — set up both TracerProvider and MeterProvider with OTLP exporters
  • Apply FastAPIInstrumentor.instrument_app(app) after app = get_application()
  • httpx calls (to ACA-Py admin) are instrumented automatically by opentelemetry-instrumentation-httpx once registered; no code changes needed for the httpx client itself

Acceptance Criteria:

  • FastAPI request spans visible in Tempo
  • MongoDB and Redis operations appear as child spans
  • Outbound httpx calls to ACA-Py appear as child spans
  • HTTP RED metrics appear in Mimir
  • No regression in existing test suite
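The client libraries are only traced once their instrumentors have been registered. A minimal sketch of that registration step follows, assuming it is called once during startup after the OTel SDK is initialized. The function name instrument_clients and the choice to skip missing packages with a warning are assumptions for this example. The instrumentor class names are those exported by the opentelemetry-instrumentation-httpx, -pymongo and -redis packages.

```python
import importlib
import logging

logger = logging.getLogger(__name__)

# (module path, instrumentor class) pairs from the opentelemetry-instrumentation-* packages.
CLIENT_INSTRUMENTORS = [
    ("opentelemetry.instrumentation.httpx", "HTTPXClientInstrumentor"),
    ("opentelemetry.instrumentation.pymongo", "PymongoInstrumentor"),
    ("opentelemetry.instrumentation.redis", "RedisInstrumentor"),
]


def instrument_clients(instrumentors=CLIENT_INSTRUMENTORS) -> list:
    """Register each available instrumentor; return the class names that were activated."""
    active = []
    for module_path, class_name in instrumentors:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            # Assumption: a missing package is reported and skipped, not treated as fatal.
            logger.warning("%s is not installed; skipping", module_path)
            continue
        getattr(module, class_name)().instrument()
        active.append(class_name)
    return active


# Demonstration with a deliberately missing (hypothetical) module: nothing is activated.
skipped = instrument_clients([("no_such_otel_package_xyz", "MissingInstrumentor")])
print(skipped)
```

Calling instrument_clients() with no arguments during application startup would register all three client instrumentors in one place.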

Issue: acapy-vc-authn-oidc — Helm Chart Updates

Labels: phase-2 helm vc-authn

Estimated Effort: ~1h

Context: The vc-authn-oidc chart (charts/vc-authn-oidc in helm-charts repo) has two components to instrument:

  1. Controller (the FastAPI OIDC controller) — templates/deployment.yaml
  2. ACA-Py agent — deployed via the acapy sub-chart dependency

Use Helm's global: section so shared OTel settings are declared once and cascade to both components automatically.

Requirements:

  • Add a global.otel: section to values.yaml with shared fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation
  • Controller: Add otel.serviceName (local, defaults to release name); render all OTEL env vars in templates/deployment.yaml from global.otel.* (with local fallback) when global.otel.enabled is true
  • ACA-Py agent: Once the acapy sub-chart has native otel: support (see acapy-agent Helm Chart issue), it will automatically inherit global.otel.* from the parent — set only acapy.otel.serviceName in the parent values.yaml to distinguish the agent's service name

Acceptance Criteria:

  • Setting global.otel.enabled: true and global.otel.endpoint once results in correct env vars in both the controller pod and the aca-py pod
  • Each pod has its own distinct OTEL_SERVICE_NAME
  • Traces from both components appear in Tempo after deployment
  • HTTP metrics from both components appear in Mimir after deployment

traction

Issue: traction — Local Development Setup

Labels: phase-2 local-dev nodejs python traction

Estimated Effort: ~½h

Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.

Context: Both tenant-ui and traction-agent run as compose services in scripts/docker-compose.yml on a shared network, so http://lgtm:4318 is the correct endpoint for both. The traction_innkeeper plugin runs inside the ACA-Py process — no separate container.

Requirements:

  • Add grafana/otel-lgtm service to scripts/docker-compose.yml on the same network (ports 3000:3000, 4317:4317, 4318:4318)
  • Add to the tenant-ui service environment:
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
    • OTEL_SERVICE_NAME=traction-tenant-ui
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp
  • Add to the traction-agent service environment:
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
    • OTEL_SERVICE_NAME=traction-agent
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp
    • OTEL_PYTHON_LOG_CORRELATION=true
  • Update README or CONTRIBUTING with local observability instructions

Acceptance Criteria:

  • docker compose -f scripts/docker-compose.yml up starts the LGTM stack alongside both components
  • Grafana accessible at http://localhost:3000
  • Traces from both tenant-ui and the ACA-Py agent visible in Tempo after instrumentation is added
  • Developer docs updated

Issue: traction — OTel Instrumentation (Code)

Labels: phase-2 instrumentation nodejs traction

Estimated Effort: ~2h

Description: Add OTel instrumentation to the tenant-ui (Node.js/Express) component. The ACA-Py plugin (traction_innkeeper) requires no separate code changes — it runs inside the same ACA-Py process and inherits all instrumentation (aiohttp server spans, psycopg spans, SDK init) from the acapy-agent OTel issue.

Requirements:

  • tenant-ui:
    • Add @opentelemetry/sdk-node, @opentelemetry/auto-instrumentations-node, @opentelemetry/sdk-metrics, @opentelemetry/exporter-trace-otlp-http, @opentelemetry/exporter-metrics-otlp-http
    • Create src/tracing.ts with NodeSDK configured with both trace and metric exporters, using the HTTP/protobuf exporters (port 4318, matching all other services; do not use gRPC exporters)
    • Import src/tracing.ts as the first line of src/index.ts, before Express initializes

Acceptance Criteria:

  • tenant-ui HTTP spans visible in Tempo
  • HTTP RED metrics from tenant-ui appear in Mimir
  • No regression in existing test suite

Issue: traction — Helm Chart Updates

Labels: phase-2 helm traction

Estimated Effort: ~1h

Context: The traction chart (charts/traction/) has two components: tenant-ui (own deployment template at templates/ui/deployment.yaml with hardcoded env + envFrom configmap) and traction-agent (via acapy sub-chart dependency). The traction_innkeeper plugin runs inside the ACA-Py process — no separate deployment.

Requirements:

  • Add a global.otel: section to values.yaml with shared fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation
  • tenant-ui: Add ui.otel.serviceName (local, defaults to traction-tenant-ui); render all OTEL env vars conditionally in templates/ui/deployment.yaml from global.otel.* when global.otel.enabled is true
  • ACA-Py agent: Once the acapy sub-chart has native otel: support (see acapy-agent Helm Chart issue), it inherits global.otel.* automatically — set only acapy.otel.serviceName: traction-agent in this chart's values.yaml

Acceptance Criteria:

  • Setting global.otel.enabled: true and global.otel.endpoint once results in correct env vars in both the tenant-ui pod and the traction-agent pod
  • Each pod has its own distinct OTEL_SERVICE_NAME
  • Traces from both components visible in Tempo
  • Metrics from both components visible in Mimir

acapy-endorser-service

Issue: acapy-endorser-service — Local Development Setup

Labels: phase-2 local-dev python endorser

Estimated Effort: ~½h

Description: Add the grafana/otel-lgtm observability stack to the local development environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.

Context: The repo has a VS Code devcontainer (.devcontainer/) with docker-in-docker and a full compose stack at docker/docker-compose.yml (including acapy-endorser-api as a service with source mounted for hot reload). Developers may run the API either via compose or directly in the devcontainer shell.

Requirements:

  • Create docker/docker-compose.observability.yml with only the grafana/otel-lgtm service (ports 3000:3000, 4317:4317, 4318:4318) on the same network as docker/docker-compose.yml
  • When running via compose, add to the acapy-endorser-api service environment in docker/docker-compose.yml:
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
    • OTEL_SERVICE_NAME=acapy-endorser-service
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp
    • OTEL_PYTHON_LOG_CORRELATION=true
  • When running uvicorn directly in the devcontainer shell, use OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 instead
  • Update README or CONTRIBUTING with local observability instructions

Acceptance Criteria:

  • docker compose -f docker/docker-compose.observability.yml up -d starts the LGTM stack
  • Grafana accessible at http://localhost:3000
  • Traces visible in Tempo after instrumentation is added and test requests are made locally
  • Developer docs updated
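The endpoint choice described in the requirements above (lgtm inside the compose network, localhost from the devcontainer shell or host) can be captured in a small helper that also renders an env file. This is a sketch under the assumption that the LGTM compose service is named lgtm, as the endpoints in this page imply. The function names otel_env and to_env_file are hypothetical.

```python
def otel_env(service_name: str, *, in_compose_network: bool, python_service: bool) -> dict:
    """Build the OTel env vars used for local development."""
    # Compose services reach LGTM by service name; host or devcontainer shells use the mapped port.
    host = "lgtm" if in_compose_network else "localhost"
    env = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{host}:4318",
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_TRACES_EXPORTER": "otlp",
        "OTEL_METRICS_EXPORTER": "otlp",
    }
    if python_service:
        # Log correlation via this variable applies to Python services only.
        env["OTEL_PYTHON_LOG_CORRELATION"] = "true"
    return env


def to_env_file(env: dict) -> str:
    """Render NAME=value lines, one per variable."""
    return "".join(f"{name}={value}\n" for name, value in env.items())


# Running uvicorn directly in the devcontainer shell:
shell_env = otel_env("acapy-endorser-service", in_compose_network=False, python_service=True)
# Running as the acapy-endorser-api compose service:
compose_env = otel_env("acapy-endorser-service", in_compose_network=True, python_service=True)
print(to_env_file(shell_env), end="")
```

The two variants differ only in the endpoint host, which is the detail most likely to be misconfigured when switching between compose and shell.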

Issue: acapy-endorser-service — OTel Instrumentation (Code)

Labels: phase-2 instrumentation python endorser

Estimated Effort: ~2h

Description: Add OTel auto-instrumentation to the FastAPI endorser service with SQLAlchemy tracing and application metrics.

Requirements:

  • Add opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-aiohttp-client, opentelemetry-instrumentation-sqlalchemy to pyproject.toml
  • Do not add opentelemetry-instrumentation-asyncpg separately — the endorser uses SQLAlchemy with an asyncpg backend (create_async_engine in endorser/api/db/session.py); the SQLAlchemy instrumentor covers database spans at the correct level and adding asyncpg on top produces duplicate spans
  • Initialize OTel SDK at the top of endorser/api/main.py, before app = get_application() — set up both TracerProvider and MeterProvider with OTLP exporters
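  • Apply SQLAlchemyInstrumentor().instrument(engine=engine.sync_engine) after the engine is created in endorser/api/db/session.py (the instrumentor hooks the synchronous engine, so an engine built with create_async_engine is passed via its sync_engine attribute)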

Acceptance Criteria:

  • FastAPI request spans visible in Tempo
  • SQLAlchemy query spans appear as children of request spans
  • Outbound aiohttp calls to ACA-Py appear as child spans
  • HTTP RED metrics appear in Mimir
  • No regression in existing test suite

Issue: acapy-endorser-service — Helm Chart Updates

Labels: phase-2 helm endorser

Estimated Effort: ~1h

Context: The endorser chart (charts/endorser-service/) has two components: the endorser API (templates/api/deployment.yaml, hardcoded env) and the acapy sub-chart dependency.

Requirements:

  • Add a global.otel: section to values.yaml with shared fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation
  • Endorser API: Add api.otel.serviceName (local); render all OTEL env vars conditionally in templates/api/deployment.yaml from global.otel.* when global.otel.enabled is true
  • ACA-Py agent: Once the acapy sub-chart has native otel: support, it inherits global.otel.* automatically — set only acapy.otel.serviceName in this chart's values.yaml

Acceptance Criteria:

  • Setting global.otel.enabled: true and global.otel.endpoint once results in correct env vars in both the endorser API pod and the aca-py pod
  • Traces appear in Tempo after deployment
  • HTTP metrics appear in Mimir after deployment

didwebvh-server-py

Issue: didwebvh-server-py — Local Development Setup

Labels: phase-2 local-dev python didwebvh

Estimated Effort: ~½h

Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.

Context: No devcontainer. The compose file is demo/docker-compose.yml; webvh-server runs as a compose service on webvh-network, so http://lgtm:4318 is the correct endpoint.

Requirements:

  • Add grafana/otel-lgtm service to demo/docker-compose.yml on the webvh-network (ports 3000:3000, 4317:4317, 4318:4318)
  • Add to the webvh-server service environment:
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
    • OTEL_SERVICE_NAME=didwebvh-server-py
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp
    • OTEL_PYTHON_LOG_CORRELATION=true
  • Update README or CONTRIBUTING with local observability instructions

Acceptance Criteria:

  • docker compose -f demo/docker-compose.yml up starts the LGTM stack alongside the service
  • Grafana accessible at http://localhost:3000
  • Traces visible in Tempo after instrumentation is added and test requests are made locally
  • Developer docs updated

Issue: didwebvh-server-py — OTel Instrumentation (Code)

Labels: phase-2 instrumentation python didwebvh

Estimated Effort: ~2.5h

Description: Add OTel auto-instrumentation to the didwebvh FastAPI server with centralized OTel init covering both traces and metrics, and SQLAlchemy database tracing.

Requirements:

  • Add opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-logging, opentelemetry-instrumentation-sqlalchemy to server/pyproject.toml
  • Create server/app/otel.py for centralized OTel init — configure both TracerProvider (BatchSpanProcessor + OTLPSpanExporter) and MeterProvider (PeriodicExportingMetricReader + OTLPMetricExporter)
  • Call OTel init from server/main.py before uvicorn.run (must be active before uvicorn imports the app module)
  • Apply FastAPIInstrumentor.instrument_app(app) in server/app/__init__.py where app is defined (uses modern lifespan context manager)
  • Apply SQLAlchemyInstrumentor().instrument(engine=self._engine) in server/app/plugins/storage.py, inside StorageManager.__init__, immediately after self._engine is first created (the engine is a lazy singleton, so instrument once on first init)

Acceptance Criteria:

  • FastAPI request spans visible in Tempo
  • SQLAlchemy/psycopg2 query spans appear as children
  • HTTP RED metrics appear in Mimir
  • Log correlation enabled (trace_id present in stdout log records)
  • No regression in existing test suite
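A centralized init module along the lines described in the requirements might look like the sketch below. It is a minimal example under stated assumptions, not the project's implementation. The function names otel_enabled and init_otel are hypothetical, and so is the rule of skipping setup when no endpoint is configured. OTEL_SDK_DISABLED is the standard SDK switch. The OTel imports are deferred into init_otel so the module can be imported even when telemetry is off. The exporters are created without arguments, so they read OTEL_EXPORTER_OTLP_ENDPOINT and related variables from the environment.

```python
import os


def otel_enabled(env=os.environ) -> bool:
    """Assumed rule: enable only when an endpoint is set and the SDK is not disabled."""
    if env.get("OTEL_SDK_DISABLED", "").strip().lower() == "true":
        return False
    return bool(env.get("OTEL_EXPORTER_OTLP_ENDPOINT"))


def init_otel(env=os.environ) -> bool:
    """Configure TracerProvider and MeterProvider; return False when skipped."""
    if not otel_enabled(env):
        return False

    # Deferred imports: the OTel packages are only needed when telemetry is on.
    from opentelemetry import metrics, trace
    from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Resource.create() picks up OTEL_SERVICE_NAME from the process environment.
    resource = Resource.create()

    tracer_provider = TracerProvider(resource=resource)
    tracer_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    trace.set_tracer_provider(tracer_provider)

    reader = PeriodicExportingMetricReader(OTLPMetricExporter())
    metrics.set_meter_provider(MeterProvider(resource=resource, metric_readers=[reader]))
    return True


# With no endpoint configured, init is a no-op and no OTel package is imported.
print(init_otel({}))
```

Calling init_otel() before uvicorn.run keeps the providers in place before the app module is imported, which is the ordering the requirements call for.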

Issue: didwebvh-server-py — Helm Chart Updates

Labels: phase-2 helm didwebvh

Estimated Effort: ~1h

Context: The upstream chart (didwebvh-server-py v0.7.0) lives in ~/code/helm-charts/charts/didwebvh-server/ (DIF helm-charts repo). The deployment template (templates/server/deployment.yaml) has hardcoded env vars with no extraEnvVars support. Changes go to that repo; DITP gitops per-env overrides in services/didwebvh-server-py/charts/{env}/values.yaml pick them up automatically once the chart is updated.

Requirements:

  • Add a native otel: section to values.yaml in the DIF chart with fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation, serviceName
  • Support global.otel.* for all shared fields (prefer global with local fallback), consistent with the pattern used across all other services in this project
  • Render OTEL env vars conditionally in templates/server/deployment.yaml when otel.enabled (or global.otel.enabled) is true

Acceptance Criteria:

  • Setting global.otel.enabled: true and global.otel.endpoint in the gitops values.yaml results in correct env vars in the deployed pod
  • Traces appear in Tempo after deployment
  • HTTP metrics appear in Mimir after deployment

didcomm-mediator-credo

Issue: didcomm-mediator-credo — Local Development Setup

Labels: phase-2 local-dev typescript mediator

Estimated Effort: ~½h

Description: Update the existing local observability compose file to use grafana/otel-lgtm for consistency with other services. See Local Development for background.

Context: apps/mediator/docker-compose.observability.yml already exists but uses a full custom stack (OTel Collector → Jaeger, Prometheus, Grafana). For Phase 2 local dev, replace this with the simpler grafana/otel-lgtm image. The devcontainer (.devcontainer/docker-compose.yml) has a mediator service; LGTM goes on the same network.

Requirements:

  • Replace the contents of apps/mediator/docker-compose.observability.yml with a minimal grafana/otel-lgtm service (ports 3000:3000, 4317:4317, 4318:4318)
  • Update the mediator service environment in that file to:
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
    • OTEL_SERVICE_NAME=didcomm-mediator-credo
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp (use OTLP for local dev; Prometheus scraping is for production)
  • Update README with updated local observability instructions

Acceptance Criteria:

  • docker compose -f apps/mediator/docker-compose.observability.yml up starts the LGTM stack alongside the mediator
  • Grafana accessible at http://localhost:3000
  • Traces visible in Tempo after test requests are made
  • Developer docs updated

Issue: didcomm-mediator-credo — OTel Instrumentation (Code)

Labels: phase-2 instrumentation typescript mediator

Estimated Effort: ~1h

Description: Wire the existing instrumentation.js preloader into the main Dockerfile and add OTel packages as proper dependencies. Most of the instrumentation work is already done.

Context: apps/mediator/instrumentation.js already exists and is complete: it configures NodeSDK with auto-instrumentations for http, express, ws, pg (enhancedDatabaseReporting: true), OTLP trace/metrics exporters, and Prometheus. Two problems remain:

  1. OTel packages are not in apps/mediator/package.json — the main Dockerfile cannot resolve them
  2. The main Dockerfile does not load instrumentation.js — only Dockerfile.observability does, via a hacky separate npm install into /otel/ with NODE_PATH

Requirements:

  • Add all required OTel packages to apps/mediator/package.json as proper dependencies: @opentelemetry/api, @opentelemetry/sdk-node, @opentelemetry/auto-instrumentations-node, @opentelemetry/exporter-trace-otlp-http, @opentelemetry/exporter-metrics-otlp-http, @opentelemetry/exporter-prometheus, @opentelemetry/sdk-metrics, @opentelemetry/resources, @opentelemetry/semantic-conventions
  • Update the main Dockerfile to copy apps/mediator/instrumentation.js into the final image and set NODE_OPTIONS="--require /app/instrumentation.js" (or pass --require in the entrypoint)
  • Dockerfile.observability can be removed or kept as a reference — its NODE_PATH workaround is no longer needed once deps are in package.json

Acceptance Criteria:

  • Main Dockerfile build succeeds with OTel packages resolved via package.json
  • Mediator HTTP, WebSocket, and pg spans visible in Tempo
  • HTTP RED metrics visible in Mimir
  • No regression in existing behaviour

Issue: didcomm-mediator-credo — Helm Chart Updates

Labels: phase-2 helm mediator

Estimated Effort: ~1h

Context: The deployment chart is the OWF chart at owf/helm-charts/charts/didcomm-mediator-credo/. The deployment template accepts arbitrary env vars via {{- toYaml .Values.environment }}, which is flexible but not a native otel: integration. DITP gitops per-env overrides live in services/mediator-credo/charts/{env}/values.yaml.

Requirements:

  • Add a native otel: section to the OWF chart's values.yaml with fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, serviceName
  • Support global.otel.* for all shared fields (prefer global with local fallback), consistent with the pattern used across all other services
  • Render OTEL env vars conditionally in templates/deployment.yaml when otel.enabled (or global.otel.enabled) is true, replacing reliance on the generic environment passthrough for OTel config

Acceptance Criteria:

  • Setting global.otel.enabled: true and global.otel.endpoint in the gitops values.yaml results in correct env vars in the deployed pod
  • Traces appear in Tempo after deployment
  • HTTP metrics appear in Mimir after deployment

bc-wallet-demo

Issue: bc-wallet-demo — Local Development Setup

Labels: phase-2 local-dev typescript bc-wallet

Estimated Effort: ~½h

Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.

Context: No devcontainer. The server runs on the host via yarn server:dev (ts-node-dev) — the server compose service in docker-compose.yml is under profiles: [prod] and is not used during development. Only MongoDB runs via compose. LGTM is added to the same compose file; the server running on the host connects via localhost.

Requirements:

  • Add grafana/otel-lgtm service to docker-compose.yml (ports 3000:3000, 4317:4317, 4318:4318) on the same network as mongodb
  • The server runs on the host, not in compose — set OTEL env vars in server/.env.example (and server/.env locally):
    • OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
    • OTEL_SERVICE_NAME=bc-wallet-demo
    • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    • OTEL_TRACES_EXPORTER=otlp
    • OTEL_METRICS_EXPORTER=otlp
  • Update README with local observability instructions

Acceptance Criteria:

  • docker compose up starts LGTM and MongoDB together
  • Grafana accessible at http://localhost:3000
  • Running yarn server:dev with OTEL env vars set produces traces in Tempo
  • Developer docs updated

Issue: bc-wallet-demo — OTel Instrumentation (Code)

Labels: phase-2 instrumentation typescript bc-wallet

Estimated Effort: ~2h

Description: Add OTel auto-instrumentation to the bc-wallet-demo server via a preloader file covering traces, metrics, and Mongoose database spans.

Context: Express 4 with routing-controllers (decorator-based). MongoDB is accessed via mongoose. The server entry is server/src/index.ts.

Requirements:

  • Create server/src/instrumentation.ts:
    • Use NodeSDK with getNodeAutoInstrumentations() — this covers HTTP, Express, and MongoDB driver spans automatically
    • Add MongooseInstrumentation from @opentelemetry/instrumentation-mongoose explicitly for ODM-level spans (not included in auto-instrumentations-node)
    • Add PeriodicExportingMetricReader with OTLPMetricExporter from @opentelemetry/exporter-metrics-otlp-http
    • Use OTLPTraceExporter from @opentelemetry/exporter-trace-otlp-http
    • Read config from OTEL_ENABLED, OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_TRACES_EXPORTER, OTEL_METRICS_EXPORTER; no-op when disabled
  • Import as the first line of server/src/index.ts (before Express and mongoose initialization)
  • Add required packages to server/package.json: @opentelemetry/api, @opentelemetry/sdk-node, @opentelemetry/auto-instrumentations-node, @opentelemetry/exporter-trace-otlp-http, @opentelemetry/exporter-metrics-otlp-http, @opentelemetry/sdk-metrics, @opentelemetry/instrumentation-mongoose
  • Do NOT add @opentelemetry/instrumentation-mongodb separately — it is already included in auto-instrumentations-node and adding it again creates duplicate spans

Acceptance Criteria:

  • Server HTTP spans visible in Tempo
  • Mongoose operation spans appear as children of request spans
  • HTTP RED metrics visible in Mimir
  • No regression in existing test suite

Issue: bc-wallet-demo — Helm Chart Updates

Labels: phase-2 helm bc-wallet

Estimated Effort: ~1h

Context: The Helm chart is in the repo itself at charts/showcase/. The server deployment template (charts/showcase/templates/server/deployment.yaml) already has a {{- range .Values.showcase.server.extraEnv }} mechanism with {name, value} pairs — but following the pattern established across all other services, OTel config should be native rather than passed through extraEnv.

Requirements:

  • Add a native otel: section to charts/showcase/values.yaml under showcase.server.otel with fields: enabled, serviceName
  • Support global.otel.* for shared fields (endpoint, protocol, tracesExporter, metricsExporter), with local showcase.server.otel.* as overrides where needed
  • Render OTEL env vars conditionally in charts/showcase/templates/server/deployment.yaml when otel.enabled (or global.otel.enabled) is true

Acceptance Criteria:

  • Setting global.otel.enabled: true and global.otel.endpoint in the gitops values.yaml results in correct env vars in the deployed pod
  • Traces appear in Tempo after deployment
  • HTTP metrics appear in Mimir after deployment
