Phase 2 Service Onboarding
OTel auto-instrumentation for all 8 services, enabling distributed tracing and application-level metrics (request rate, error rate, latency) for each service. Each service has three issues: local development setup (LGTM stack in docker-compose), code changes (instrumentation library setup), and Helm chart changes (environment variable injection). Services are independent and can be done in any order.
Both signals — traces and metrics — are exported via the same OTLP connection to Alloy. Alloy routes traces → Tempo and metrics → Mimir. Logs are not exported via OTLP; they continue via stdout → Alloy scraping → Loki. OTel log correlation (trace_id injection into stdout logs) is enabled via OTEL_PYTHON_LOG_CORRELATION=true (Python) or the appropriate Node.js logging instrumentation.
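As a rough illustration of what log correlation means for the stdout → Alloy → Loki path, the sketch below uses only the standard library to stamp a trace ID onto each log record. It is not how the OTel logging instrumentation is implemented; `current_trace_id`, `TraceIdFilter`, and the log format are hypothetical stand-ins for the context and fields that the instrumentation manages.

```python
import contextvars
import io
import logging

# Hypothetical stand-in for the active trace context that OTel manages.
current_trace_id = contextvars.ContextVar("current_trace_id", default="0" * 32)


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every record so the formatter can print it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True  # annotate only, never drop the record


def build_logger(stream) -> logging.Logger:
    """Return a logger that writes trace-correlated lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger = logging.getLogger("correlation-sketch")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def demo() -> str:
    """Log one line inside a simulated request and return what was written."""
    buffer = io.StringIO()
    logger = build_logger(buffer)
    token = current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
    try:
        logger.info("handled request")
    finally:
        current_trace_id.reset(token)
    return buffer.getvalue()
```

With a trace ID on every stdout line, a log entry found in Loki can be matched to the corresponding trace in Tempo.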
Total estimated effort: ~33h across all 8 services (~25h code changes, ~4h Helm chart changes, ~4h local dev setup at ~½h per service)
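The per-issue estimates listed on this page can be summed as a quick cross-check. The sketch below copies the estimates as they appear in each issue; the dictionary keys are just labels for the eight services.

```python
# Per-service estimates in hours, copied from the issues on this page:
# (local dev setup, code changes, Helm chart changes)
ESTIMATES = {
    "acapy-agent": (0.5, 2.5, 0.5),
    "acapy-vc-authn-oidc": (0.5, 2.0, 0.5),
    "traction": (0.5, 3.0, 0.5),
    "acapy-endorser-service": (0.5, 2.0, 0.5),
    "didwebvh-server-py": (0.5, 5.0, 0.5),
    "credo-ts": (0.5, 2.0, 0.5),
    "didcomm-mediator-credo": (0.5, 2.5, 0.5),
    "bc-wallet-demo": (0.5, 6.0, 0.5),
}


def totals(estimates):
    """Sum each issue type across services, plus the grand total."""
    local_dev = sum(hours[0] for hours in estimates.values())
    code = sum(hours[1] for hours in estimates.values())
    helm = sum(hours[2] for hours in estimates.values())
    return {
        "local_dev": local_dev,
        "code": code,
        "helm": helm,
        "all": local_dev + code + helm,
    }
```

`totals(ESTIMATES)` gives 25h of code changes plus 4h each for local dev setup and Helm chart changes. Counting code changes with only one of the two ½h-per-service categories gives 29h; counting all three issue types gives 33h.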
Recommended starting points: acapy-agent and traction
For detailed per-service Helm chart changes see Helm Chart Assessments.
Labels: phase-2 local-dev python acapy
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to `docker-compose.yml` (or a separate `docker-compose.observability.yml`)
- Add `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_SERVICE_NAME=acapy-agent`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp`, `OTEL_PYTHON_LOG_CORRELATION=true` to the service container in docker-compose
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- `docker compose up` starts the LGTM stack alongside the service
- Grafana accessible at `http://localhost:3000`
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
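The local-dev variables follow the same pattern for every service, so they can be generated rather than copied by hand. The sketch below is one way to do that; the helper names `local_otel_env` and `as_compose_lines` are hypothetical, while the variable names and values are the ones listed in the requirements above.

```python
from typing import Dict, List


def local_otel_env(service_name: str, python_service: bool = True) -> Dict[str, str]:
    """Build the OTel environment for a service running next to the LGTM container."""
    env = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": "http://lgtm:4318",
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_TRACES_EXPORTER": "otlp",
        "OTEL_METRICS_EXPORTER": "otlp",
    }
    if python_service:
        # Log correlation is a Python-specific setting on this page.
        env["OTEL_PYTHON_LOG_CORRELATION"] = "true"
    return env


def as_compose_lines(env: Dict[str, str]) -> List[str]:
    """Render the variables in docker-compose list form (`- KEY=value`)."""
    return [f"- {key}={value}" for key, value in env.items()]


if __name__ == "__main__":
    print("\n".join(as_compose_lines(local_otel_env("acapy-agent"))))
```

The printed lines can be pasted under the `environment:` key of the service container; pass `python_service=False` for the Node.js services.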
Labels: phase-2 instrumentation python acapy
Estimated Effort: ~2.5h
Description: Add OpenTelemetry auto-instrumentation to acapy-agent to enable distributed tracing and application metrics for HTTP endpoints and PostgreSQL queries.
Requirements:
- Add `opentelemetry-api`, `opentelemetry-sdk`, `opentelemetry-exporter-otlp`, `opentelemetry-instrumentation-aiohttp-server` to dependencies
- Initialize OTel SDK in `acapy_agent/__main__.py`: set up both `TracerProvider` and `MeterProvider` with OTLP exporters
- Add aiohttp server middleware in `acapy_agent/admin/server.py`
- Add PostgreSQL tracing via `opentelemetry-instrumentation-psycopg` (psycopg v3)
Acceptance Criteria:
- OTel SDK initializes on startup without errors
- Inbound HTTP requests to the admin API generate trace spans visible in Tempo
- PostgreSQL queries appear as child spans within HTTP request traces
- HTTP RED metrics (`http.server.request.duration` histogram) appear in Mimir
- No regression in existing test suite
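For readers unfamiliar with RED metrics, the sketch below shows what the three numbers (rate, errors, duration) represent, computed from raw request samples. In the deployed stack these would come from queries over the duration histogram in Mimir rather than from application code; the `Request` type, the bucket bounds, and the choice to count only 5xx responses as errors are assumptions made for illustration.

```python
from bisect import bisect_left
from typing import Dict, NamedTuple, Sequence


class Request(NamedTuple):
    status: int        # HTTP response status code
    duration_s: float  # server-side handling time in seconds


def red_summary(
    requests: Sequence[Request],
    window_s: float,
    bounds: Sequence[float] = (0.1, 0.5, 1.0),  # hypothetical bucket upper bounds
) -> Dict[str, object]:
    """Summarize rate, error ratio, and a duration histogram for one time window."""
    # One bucket per upper bound, plus an overflow bucket for slower requests.
    counts = [0] * (len(bounds) + 1)
    for request in requests:
        # bisect_left puts a value equal to a bound into that bound's bucket.
        counts[bisect_left(bounds, request.duration_s)] += 1
    errors = sum(1 for request in requests if request.status >= 500)
    return {
        "rate_per_s": len(requests) / window_s,
        "error_ratio": errors / len(requests) if requests else 0.0,
        "bucket_counts": counts,
    }
```

A histogram is used for duration because latency percentiles can be estimated from bucket counts after the fact, without storing every individual request.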
Labels: phase-2 helm acapy
Estimated Effort: ~30min
Requirements:
- Add `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`, `OTEL_PYTHON_LOG_CORRELATION` to the deployment env block via `values.yaml`
Acceptance Criteria:
- Env vars present in deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
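The "env vars present in deployed pod" criterion can be checked mechanically. The sketch below assumes the output of running `env` inside the pod has already been captured as text (how it is captured is left out); `parse_env_output` and `missing_otel_vars` are hypothetical helper names.

```python
from typing import Dict, List, Sequence

# Variables this Helm issue asks for on the deployment.
REQUIRED_VARS = (
    "OTEL_SERVICE_NAME",
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_TRACES_EXPORTER",
    "OTEL_METRICS_EXPORTER",
    "OTEL_PYTHON_LOG_CORRELATION",
)


def parse_env_output(text: str) -> Dict[str, str]:
    """Parse `KEY=value` lines, as printed by `env`, into a dictionary."""
    env = {}
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if separator:  # skip lines without an '='
            env[key.strip()] = value
    return env


def missing_otel_vars(
    env: Dict[str, str], required: Sequence[str] = REQUIRED_VARS
) -> List[str]:
    """Return the required variables that are absent or empty."""
    return [name for name in required if not env.get(name)]
```

An empty list from `missing_otel_vars` means the first acceptance criterion is met; the Node.js services would pass a `required` tuple without `OTEL_PYTHON_LOG_CORRELATION`.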
Labels: phase-2 local-dev python vc-authn
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to `docker-compose.yml`
- Add `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_SERVICE_NAME=acapy-vc-authn-oidc`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp`, `OTEL_PYTHON_LOG_CORRELATION=true` to the `oidc-controller` container in docker-compose
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- `docker compose up` starts the LGTM stack alongside the service
- Grafana accessible at `http://localhost:3000`
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation python vc-authn
Estimated Effort: ~2h
Description: Add OTel auto-instrumentation to the FastAPI backend with MongoDB and Redis driver tracing and application metrics.
Requirements:
- Add `opentelemetry-instrumentation-fastapi`, `opentelemetry-instrumentation-httpx`, `opentelemetry-instrumentation-pymongo`, `opentelemetry-instrumentation-redis`
- Initialize OTel SDK in `oidc-controller/api/main.py`: set up both `TracerProvider` and `MeterProvider` with OTLP exporters
- Apply `FastAPIInstrumentor.instrument_app(app)`
Acceptance Criteria:
- FastAPI request spans visible in Tempo
- MongoDB and Redis operations appear as child spans
- HTTP RED metrics appear in Mimir
- No regression in existing test suite
Labels: phase-2 helm vc-authn
Estimated Effort: ~30min
Requirements:
- Add `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`, `OTEL_PYTHON_LOG_CORRELATION`
Acceptance Criteria:
- Env vars present in deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
Labels: phase-2 local-dev nodejs python traction
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to `docker-compose.yml`
- Add `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp` to both the `tenant-ui` and ACA-Py plugin containers; add `OTEL_PYTHON_LOG_CORRELATION=true` to the ACA-Py plugin container only
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- `docker compose up` starts the LGTM stack alongside both components
- Grafana accessible at `http://localhost:3000`
- Traces from both tenant-ui and ACA-Py plugin visible and linked in Tempo after instrumentation is added
- Developer docs updated
Labels: phase-2 instrumentation nodejs python traction
Estimated Effort: ~3h
Description: Add OTel instrumentation to both the tenant-ui (Node.js/Express) and the ACA-Py plugin (Python/aiohttp) components, enabling traces and application metrics for both.
Requirements:
- tenant-ui: Add `@opentelemetry/sdk-node`, `@opentelemetry/auto-instrumentations-node`, `@opentelemetry/sdk-metrics`, `@opentelemetry/exporter-metrics-otlp-grpc`; create `src/tracing.ts` with `NodeSDK` configured with both trace and metric exporters; import before Express starts in `src/index.ts`
- ACA-Py plugin: Wrap startup command with `opentelemetry-instrument` CLI in `plugins/docker/Dockerfile`; set `OTEL_METRICS_EXPORTER=otlp` (the CLI instrumentor enables metrics export automatically); add `opentelemetry-instrumentation-asyncpg` for PostgreSQL
Acceptance Criteria:
- tenant-ui HTTP spans visible in Tempo
- ACA-Py plugin spans visible in Tempo, linked to tenant-ui parent spans
- PostgreSQL spans appear as children of ACA-Py spans
- HTTP RED metrics from both components appear in Mimir
- No regression in existing test suite
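Spans from the ACA-Py plugin can only be linked to tenant-ui parent spans if trace context travels with each HTTP call. Assuming the default W3C Trace Context propagator is in use on both sides, that context is carried in the `traceparent` header. The auto-instrumentation handles this automatically; the stdlib-only sketch below just shows the mechanics, and `SpanContext`, `parse_traceparent`, and `child_traceparent` are hypothetical names.

```python
import re
import secrets
from typing import NamedTuple

# traceparent layout: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class SpanContext(NamedTuple):
    trace_id: str
    span_id: str
    flags: str


def parse_traceparent(header: str) -> SpanContext:
    """Extract the caller's trace context from an incoming traceparent header."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None or match.group(1) == "ff":
        raise ValueError(f"malformed traceparent: {header!r}")
    _, trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        raise ValueError("all-zero trace or span ID is invalid")
    return SpanContext(trace_id, span_id, flags)


def child_traceparent(parent: SpanContext) -> str:
    """Build the header for a downstream call: same trace ID, fresh span ID."""
    return f"00-{parent.trace_id}-{secrets.token_hex(8)}-{parent.flags}"
```

Because the trace ID is preserved across the hop, Tempo can assemble the tenant-ui and plugin spans into one trace. If the plugin's spans show up as separate traces, a missing or stripped `traceparent` header is a likely cause.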
Labels: phase-2 helm traction
Estimated Effort: ~30min
Requirements:
- `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER` for both tenant-ui and ACA-Py plugin deployments
- `OTEL_PYTHON_LOG_CORRELATION=true` for the ACA-Py plugin
Acceptance Criteria:
- Env vars present in both deployed pods
- Traces from both components visible and linked in Tempo
- Metrics from both components visible in Mimir
Labels: phase-2 local-dev python endorser
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to `docker-compose.yml`
- Add `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_SERVICE_NAME=acapy-endorser-service`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp`, `OTEL_PYTHON_LOG_CORRELATION=true` to the endorser container in docker-compose
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- `docker compose up` starts the LGTM stack alongside the service
- Grafana accessible at `http://localhost:3000`
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation python endorser
Estimated Effort: ~2h
Description: Add OTel auto-instrumentation to the FastAPI endorser service with asyncpg and SQLAlchemy tracing and application metrics.
Requirements:
- Add `opentelemetry-instrumentation-fastapi`, `opentelemetry-instrumentation-aiohttp-client`, `opentelemetry-instrumentation-asyncpg`, `opentelemetry-instrumentation-sqlalchemy`
- Initialize OTel SDK in `endorser/api/main.py`: set up both `TracerProvider` and `MeterProvider` with OTLP exporters
Acceptance Criteria:
- FastAPI request spans visible in Tempo
- Database query spans (asyncpg / SQLAlchemy) appear as children
- HTTP RED metrics appear in Mimir
- No regression in existing test suite
Labels: phase-2 helm endorser
Estimated Effort: ~30min
Requirements:
- Add `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`, `OTEL_PYTHON_LOG_CORRELATION`
Acceptance Criteria:
- Env vars present in deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
Labels: phase-2 local-dev python didwebvh
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to `docker-compose.yml`
- Add `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_SERVICE_NAME=didwebvh-server-py`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp`, `OTEL_PYTHON_LOG_CORRELATION=true` to the server container in docker-compose
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- `docker compose up` starts the LGTM stack alongside the service
- Grafana accessible at `http://localhost:3000`
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation python didwebvh
Estimated Effort: ~5h
Description: Add OTel auto-instrumentation to the didwebvh FastAPI server with centralized OTel init covering both traces and metrics, and SQLAlchemy database tracing.
Requirements:
- Add `opentelemetry-instrumentation-fastapi`, `opentelemetry-instrumentation-logging`, `opentelemetry-instrumentation-sqlalchemy` to `server/pyproject.toml`
- Create `server/app/otel.py` for centralized OTel init: configure both `TracerProvider` (BatchSpanProcessor + OTLPSpanExporter) and `MeterProvider` (PeriodicExportingMetricReader + OTLPMetricExporter)
- Call OTel init from `server/main.py` before `uvicorn.run`
- Apply `FastAPIInstrumentor.instrument_app(app)` in `server/app/__init__.py`
- Apply `SQLAlchemyInstrumentor().instrument(engine=self.engine)` in `server/app/plugins/storage.py`
Acceptance Criteria:
- FastAPI request spans visible in Tempo
- SQLAlchemy/psycopg2 query spans appear as children
- HTTP RED metrics appear in Mimir
- Log correlation enabled (`trace_id` present in stdout log records)
- No regression in existing test suite
Labels: phase-2 helm didwebvh
Estimated Effort: ~30min
Requirements:
- Add `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`, `OTEL_PYTHON_LOG_CORRELATION`
Acceptance Criteria:
- Env vars present in deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
Labels: phase-2 local-dev typescript credo
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local development environment so developers can validate credo-ts instrumentation locally before deploying to OpenShift. Because credo-ts is a library, the LGTM stack is added to the consuming application's docker-compose (or a standalone compose file in the credo-ts repo for integration testing). See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to the docker-compose used for local integration testing
- Set `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_SERVICE_NAME=credo-ts`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp` in the test environment
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- LGTM stack starts alongside the integration test environment
- Grafana accessible at `http://localhost:3000`
- Traces from credo-ts operations visible in Tempo after instrumentation is added
- Developer docs updated
Labels: phase-2 instrumentation typescript credo
Estimated Effort: ~2h
Description: Add OTel instrumentation via a preloader bootstrap wrapper for the credo-ts library, enabling traces and application metrics for consuming services.
Requirements:
- Add `@opentelemetry/sdk-node`, `@opentelemetry/api`, `@opentelemetry/auto-instrumentations-node`, `@opentelemetry/exporter-trace-otlp-grpc`, `@opentelemetry/sdk-metrics`, `@opentelemetry/exporter-metrics-otlp-grpc`
- Create `otel-wrapper.ts` bootstrap preloader with `NodeSDK` configured with both trace and metric exporters (PeriodicExportingMetricReader)
- Initialize in the client agent constructor in `packages/core/src/agent/Agent.ts`
- Add `@opentelemetry/instrumentation-pg` for PostgreSQL or `@opentelemetry/instrumentation-sqlite3` for SQLite
Note: credo-ts is a library — Helm chart changes target the consuming service's chart.
Acceptance Criteria:
- OTel spans generated for outbound HTTP calls and DB queries
- Traces visible in Tempo from a consuming service
- HTTP/DB metrics visible in Mimir from a consuming service
- No regression in existing test suite
Labels: phase-2 helm credo
Estimated Effort: ~30min
Requirements:
- Add `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER` on the consuming service deployment
Acceptance Criteria:
- Env vars present in consuming service pod
- Traces from credo-ts operations visible in Tempo
- Metrics from credo-ts operations visible in Mimir
Labels: phase-2 local-dev typescript mediator
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to `docker-compose.yml`
- Add `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_SERVICE_NAME=didcomm-mediator-credo`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp` to the mediator container in docker-compose
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- `docker compose up` starts the LGTM stack alongside the service
- Grafana accessible at `http://localhost:3000`
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation typescript mediator
Estimated Effort: ~2.5h
Description: Resolve the existing Dockerfile OTel exporter build issue and add OTel instrumentation via a preloader file covering both traces and metrics.
Requirements:
- Resolve Dockerfile build failure: add `@opentelemetry/exporter-trace-otlp-http` and `@opentelemetry/exporter-metrics-otlp-http` in the compilation phase
- Create/update `apps/mediator/instrumentation.js` as the Node.js preloader with both trace (BatchSpanProcessor) and metric (PeriodicExportingMetricReader) exporters
- Enable `enhancedDatabaseReporting: true` for `@opentelemetry/instrumentation-pg`
Acceptance Criteria:
- Docker image builds without OTel exporter errors
- Mediator HTTP and pg spans visible in Tempo
- HTTP RED metrics visible in Mimir
- No regression in existing behaviour
Labels: phase-2 helm mediator
Estimated Effort: ~30min
Requirements:
- Add `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT` (port 4318 HTTP for this service), `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`
Acceptance Criteria:
- Env vars present in deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
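The port note matters because OTLP conventionally uses port 4317 for gRPC and port 4318 for HTTP, and the mediator's exporters are the `-otlp-http` variants. With `http/protobuf`, SDKs derive per-signal URLs by appending `/v1/traces` and `/v1/metrics` to the base endpoint. The sketch below makes that relationship explicit; the host name `alloy.example` and both function names are hypothetical.

```python
from typing import Dict

# Conventional OTLP receiver ports per protocol.
DEFAULT_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def otlp_base_endpoint(host: str, protocol: str) -> str:
    """Return the value for OTEL_EXPORTER_OTLP_ENDPOINT for the given protocol."""
    if protocol not in DEFAULT_PORTS:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
    return f"http://{host}:{DEFAULT_PORTS[protocol]}"


def signal_urls(base_endpoint: str, protocol: str) -> Dict[str, str]:
    """Show where each signal is actually sent for a given base endpoint."""
    if protocol == "grpc":
        # gRPC uses a single endpoint; the signal is selected by the RPC service.
        return {"traces": base_endpoint, "metrics": base_endpoint}
    root = base_endpoint.rstrip("/")
    return {"traces": f"{root}/v1/traces", "metrics": f"{root}/v1/metrics"}
```

Pointing an HTTP exporter at the gRPC port (or the reverse) typically fails at export time rather than at startup, so checking the derived URLs is a cheap first step when traces do not appear in Tempo.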
Labels: phase-2 local-dev typescript bc-wallet
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Requirements:
- Add `grafana/otel-lgtm` service to `docker-compose.yml`
- Add `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`, `OTEL_SERVICE_NAME=bc-wallet-demo`, `OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf`, `OTEL_TRACES_EXPORTER=otlp`, `OTEL_METRICS_EXPORTER=otlp` to the server container in docker-compose
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- `docker compose up` starts the LGTM stack alongside the service
- Grafana accessible at `http://localhost:3000`
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation typescript bc-wallet
Estimated Effort: ~6h
Description: Add OTel auto-instrumentation to the bc-wallet-demo server with MongoDB/Mongoose database tracing and application metrics.
Requirements:
- Create `server/src/instrumentation.ts` using `NodeSDK` with `getNodeAutoInstrumentations()`, `PeriodicExportingMetricReader`, and `OTLPMetricExporter`
- Import as first line of `server/src/index.ts` (or use `--require` flag in start script)
- Add `@opentelemetry/instrumentation-mongoose` and `@opentelemetry/instrumentation-mongodb` for MongoDB
Acceptance Criteria:
- Server HTTP spans visible in Tempo
- MongoDB/Mongoose operations appear as child spans
- HTTP RED metrics visible in Mimir
- No regression in existing test suite
Labels: phase-2 helm bc-wallet
Estimated Effort: ~30min
Requirements:
- Add `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`, `OTEL_NODEJS_AUTO_INSTRUMENTATIONS_ENABLED`
Acceptance Criteria:
- Env vars present in deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment