Phase 2 Service Onboarding
OTel auto-instrumentation for all 8 services, enabling distributed tracing and application-level metrics (request rate, error rate, latency) for each service. Each service has three issues: local dev setup (observability stack), code changes (instrumentation library setup), and Helm chart changes (environment variable injection). Services are independent and can be done in any order.
Both signals — traces and metrics — are exported via the same OTLP connection to Alloy. Alloy routes traces → Tempo and metrics → Mimir. Logs are not exported via OTLP; they continue via stdout → Alloy scraping → Loki. OTel log correlation (trace_id injection into stdout logs) is enabled via OTEL_PYTHON_LOG_CORRELATION=true (Python) or the appropriate Node.js logging instrumentation.
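To make the log correlation idea concrete, here is a stdlib-only sketch that imitates the effect: every stdout log line carries the active trace ID, so a Loki entry can be matched to a Tempo trace. It does not use the OTel SDK. The context variable `current_trace_id`, the `TraceIdFilter` class, and the sample trace ID are hypothetical stand-ins; only the record attribute name `otelTraceID` follows the field that opentelemetry-instrumentation-logging injects.

```python
import contextvars
import io
import logging

# Hypothetical stand-in for the active span context; the real SDK tracks this itself.
current_trace_id = contextvars.ContextVar("current_trace_id", default="0")


class TraceIdFilter(logging.Filter):
    """Attach the current trace ID to every record passing through the handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.otelTraceID = current_trace_id.get()
        return True


def build_logger(stream) -> logging.Logger:
    """Return a logger whose lines include trace_id=<id> before the message."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(TraceIdFilter())
    handler.setFormatter(
        logging.Formatter("%(levelname)s [trace_id=%(otelTraceID)s] %(message)s")
    )
    logger = logging.getLogger("correlation-demo")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def demo() -> str:
    """Log one line inside a simulated request and return the captured output."""
    buffer = io.StringIO()
    logger = build_logger(buffer)
    token = current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
    try:
        logger.info("credential issued")
    finally:
        current_trace_id.reset(token)
    return buffer.getvalue()
```

In the real services the instrumentation library supplies the trace ID from the current span; the sketch only shows the shape of the resulting log line.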
Total estimated effort: ~29h across all 8 services (includes ~½h local dev setup per service)
Recommended starting points: acapy-agent and traction
For detailed per-service Helm chart changes see Helm Chart Assessments.
Labels: phase-2 local-dev python acapy
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local development environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Context: acapy-agent uses a VS Code devcontainer (.devcontainer/) with docker-in-docker. Developers run aca-py directly in the devcontainer shell — not as a compose service. There is no root-level docker-compose.yml for general local dev.
Requirements:
- Create docker-compose.observability.yml at the repo root with only the grafana/otel-lgtm service (ports 3000:3000, 4317:4317, 4318:4318)
- Set the following env vars before running aca-py in the devcontainer shell (e.g. exported in the terminal or sourced from a .env.observability file):
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 (LGTM port is mapped to the devcontainer localhost via docker-in-docker)
  - OTEL_SERVICE_NAME=acapy-agent
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp
  - OTEL_PYTHON_LOG_CORRELATION=true
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- docker compose -f docker-compose.observability.yml up -d starts the LGTM stack
- Grafana accessible at http://localhost:3000
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
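The OTLP endpoint depends on where the instrumented process runs: a process on the host or in the devcontainer shell reaches LGTM on localhost, while a compose service reaches it by service name. A minimal sketch of that rule follows. The helper names `otel_env` and `to_dotenv` are hypothetical, and the compose hostname `lgtm` is an assumption about how the grafana/otel-lgtm service is named.

```python
def otel_env(service_name: str, in_compose: bool, python_service: bool = True) -> dict[str, str]:
    """Build the OTEL_* variables for one service.

    in_compose=True assumes the LGTM container is reachable as "lgtm" on a
    shared compose network; otherwise the mapped localhost port is used.
    """
    host = "lgtm" if in_compose else "localhost"
    env = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{host}:4318",
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_TRACES_EXPORTER": "otlp",
        "OTEL_METRICS_EXPORTER": "otlp",
    }
    if python_service:
        # Only the Python services use this variable for log correlation.
        env["OTEL_PYTHON_LOG_CORRELATION"] = "true"
    return env


def to_dotenv(env: dict[str, str]) -> str:
    """Render the variables as KEY=value lines, e.g. for a .env.observability file."""
    return "\n".join(f"{key}={value}" for key, value in env.items()) + "\n"
```

For example, `to_dotenv(otel_env("acapy-agent", in_compose=False))` yields the variable list for a devcontainer shell.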
Labels: phase-2 instrumentation python acapy
Estimated Effort: ~2.5h
Description: Add OpenTelemetry auto-instrumentation to acapy-agent to enable distributed tracing and application metrics for HTTP endpoints and PostgreSQL queries.
Requirements:
- Add opentelemetry-api, opentelemetry-sdk, opentelemetry-exporter-otlp, opentelemetry-instrumentation-aiohttp-server to dependencies in pyproject.toml
- Initialize OTel SDK in acapy_agent/commands/start.py inside run_app(), before conductor.setup(), not in __main__.py (that file is a thin debug/dispatch wrapper only); set up both TracerProvider and MeterProvider with OTLP exporters
- Add aiohttp server instrumentation in acapy_agent/admin/server.py → make_application() where the middleware list is built
- Add PostgreSQL tracing via opentelemetry-instrumentation-psycopg (psycopg v3); instrument at pool creation in acapy_agent/database_manager/databases/postgresql_normalized/connection_pool.py
Acceptance Criteria:
- OTel SDK initializes on startup without errors
- Inbound HTTP requests to the admin API generate trace spans visible in Tempo
- PostgreSQL queries appear as child spans within HTTP request traces
- HTTP RED metrics (http.server.request.duration histogram) appear in Mimir
- No regression in existing test suite
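The SDK initialization step could look roughly like the sketch below, which sets up both a TracerProvider and a MeterProvider with OTLP/HTTP exporters. It is an illustration under stated assumptions, not the project's implementation: the function names `otel_configured` and `init_otel` are hypothetical, and skipping initialization when no endpoint is set is an assumed opt-in rule. The OTel imports are deferred into the function so the module still loads where the packages are not installed.

```python
import os
from typing import Mapping, Optional


def otel_configured(env: Optional[Mapping[str, str]] = None) -> bool:
    """Assumed opt-in rule: initialize only when an OTLP endpoint is set."""
    env = os.environ if env is None else env
    return bool(env.get("OTEL_EXPORTER_OTLP_ENDPOINT"))


def init_otel(env: Optional[Mapping[str, str]] = None) -> bool:
    """Install global tracer and meter providers; return False when skipped.

    The env argument only drives the opt-in check; the exporters themselves
    read their settings from the process environment.
    """
    if not otel_configured(env):
        return False

    # Imported lazily so this module loads even without the OTel packages.
    from opentelemetry import metrics, trace
    from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Resource.create() picks up OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES.
    resource = Resource.create()

    # Traces: batch spans and send them to the OTLP endpoint.
    tracer_provider = TracerProvider(resource=resource)
    tracer_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    trace.set_tracer_provider(tracer_provider)

    # Metrics: export periodically over the same OTLP endpoint.
    reader = PeriodicExportingMetricReader(OTLPMetricExporter())
    metrics.set_meter_provider(MeterProvider(resource=resource, metric_readers=[reader]))
    return True
```

Per the requirements above, a call like `init_otel()` would sit inside run_app() before conductor.setup(), so the providers exist before any instrumented component starts.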
Labels: phase-2 helm acapy
Estimated Effort: ~1h
Requirements:
- Add a native otel: section to values.yaml for per-chart settings (serviceName defaulting to the release name) and shared settings (enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation)
- Support global.otel.* for all shared settings so that when acapy is deployed as a sub-chart (e.g. inside vc-authn-oidc), the parent's global values cascade down automatically; templates should prefer global with a local fallback: {{ .Values.global.otel.endpoint | default .Values.otel.endpoint }}
- Render OTEL env vars conditionally in templates/deployment.yaml when otel.enabled (or global.otel.enabled) is true; treat OTel as a first-class chart feature
Acceptance Criteria:
- otel.enabled: true + otel.endpoint in values.yaml results in correct env vars in the deployed pod
- When deployed as a sub-chart, global.otel.* values from the parent chart are respected
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
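The "prefer global with a local fallback" rule can be modelled outside Helm to check the expected outcomes. The sketch below is a plain-Python model of that lookup, not chart code: `resolve_otel` is a hypothetical helper and the endpoint URLs in the usage are made up. Python's `or` is used because it falls through on empty values in the same way Helm's `default` function does.

```python
from typing import Any, Mapping

# Shared settings named in the requirements above.
SHARED_KEYS = (
    "enabled",
    "endpoint",
    "protocol",
    "tracesExporter",
    "metricsExporter",
    "logCorrelation",
)


def resolve_otel(values: Mapping[str, Any]) -> dict[str, Any]:
    """Resolve each shared setting: the global value wins, the local value is the fallback."""
    global_otel = (values.get("global") or {}).get("otel") or {}
    local_otel = values.get("otel") or {}
    # Empty values (None, "", False, 0) fall through to the local setting.
    return {key: global_otel.get(key) or local_otel.get(key) for key in SHARED_KEYS}


# Standalone chart: only the local otel: section is set.
standalone = resolve_otel({"otel": {"enabled": True, "endpoint": "http://alloy.example:4318"}})

# Sub-chart: the parent's global value overrides the local one.
as_subchart = resolve_otel(
    {
        "global": {"otel": {"endpoint": "http://parent.example:4318"}},
        "otel": {"endpoint": "http://local.example:4318"},
    }
)
```

One consequence worth noting: because false counts as empty, a parent that sets global.otel.enabled to false does not switch off a sub-chart whose local otel.enabled is true. A real template would also need to tolerate a missing global.otel map, which this model handles with the `or {}` guards.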
Labels: phase-2 local-dev python vc-authn
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Context: The local dev stack is fully compose-based (docker/docker-compose.yaml). No devcontainer. The controller runs as a compose service on the vc_auth network, so http://lgtm:4318 is the correct OTLP endpoint.
Requirements:
- Add grafana/otel-lgtm service to docker/docker-compose.yaml on the vc_auth network (ports 3000:3000, 4317:4317, 4318:4318)
- Add the following to the controller service environment block:
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
  - OTEL_SERVICE_NAME=acapy-vc-authn-oidc
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp
  - OTEL_PYTHON_LOG_CORRELATION=true
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- docker compose -f docker/docker-compose.yaml up starts the LGTM stack alongside the service
- Grafana accessible at http://localhost:3000
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation python vc-authn
Estimated Effort: ~2h
Description: Add OTel auto-instrumentation to the FastAPI backend with MongoDB and Redis driver tracing and application metrics.
Requirements:
- Add opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-httpx, opentelemetry-instrumentation-pymongo, opentelemetry-instrumentation-redis to pyproject.toml
- Initialize OTel SDK at the top of oidc-controller/api/main.py, before app = get_application(); set up both TracerProvider and MeterProvider with OTLP exporters
- Apply FastAPIInstrumentor.instrument_app(app) after app = get_application()
- httpx calls (to ACA-Py admin) are instrumented automatically by opentelemetry-instrumentation-httpx once registered; no code changes needed for the httpx client itself
Acceptance Criteria:
- FastAPI request spans visible in Tempo
- MongoDB and Redis operations appear as child spans
- Outbound httpx calls to ACA-Py appear as child spans
- HTTP RED metrics appear in Mimir
- No regression in existing test suite
Labels: phase-2 helm vc-authn
Estimated Effort: ~1h
Context: The vc-authn-oidc chart (charts/vc-authn-oidc in helm-charts repo) has two components to instrument:
- Controller (the FastAPI OIDC controller): templates/deployment.yaml
- ACA-Py agent: deployed via the acapy sub-chart dependency
Use Helm's global: section so shared OTel settings are declared once and cascade to both components automatically.
Requirements:
- Add a global.otel: section to values.yaml with shared fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation
- Controller: Add otel.serviceName (local, defaults to release name); render all OTEL env vars in templates/deployment.yaml from global.otel.* (with local fallback) when global.otel.enabled is true
- ACA-Py agent: Once the acapy sub-chart has native otel: support (see acapy-agent Helm Chart issue), it will automatically inherit global.otel.* from the parent; set only acapy.otel.serviceName in the parent values.yaml to distinguish the agent's service name
Acceptance Criteria:
- Setting global.otel.enabled: true and global.otel.endpoint once results in correct env vars in both the controller pod and the aca-py pod
- Each pod has its own distinct OTEL_SERVICE_NAME
- Traces from both components appear in Tempo after deployment
- HTTP metrics from both components appear in Mimir after deployment
Labels: phase-2 local-dev nodejs python traction
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Context: Both tenant-ui and traction-agent run as compose services in scripts/docker-compose.yml on a shared network, so http://lgtm:4318 is the correct endpoint for both. The traction_innkeeper plugin runs inside the ACA-Py process — no separate container.
Requirements:
- Add grafana/otel-lgtm service to scripts/docker-compose.yml on the same network (ports 3000:3000, 4317:4317, 4318:4318)
- Add to the tenant-ui service environment:
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
  - OTEL_SERVICE_NAME=traction-tenant-ui
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp
- Add to the traction-agent service environment:
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
  - OTEL_SERVICE_NAME=traction-agent
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp
  - OTEL_PYTHON_LOG_CORRELATION=true
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- docker compose -f scripts/docker-compose.yml up starts the LGTM stack alongside both components
- Grafana accessible at http://localhost:3000
- Traces from both tenant-ui and the ACA-Py agent visible in Tempo after instrumentation is added
- Developer docs updated
Labels: phase-2 instrumentation nodejs traction
Estimated Effort: ~2h
Description: Add OTel instrumentation to the tenant-ui (Node.js/Express) component. The ACA-Py plugin (traction_innkeeper) requires no separate code changes — it runs inside the same ACA-Py process and inherits all instrumentation (aiohttp server spans, psycopg spans, SDK init) from the acapy-agent OTel issue.
Requirements:
- tenant-ui: Add @opentelemetry/sdk-node, @opentelemetry/auto-instrumentations-node, @opentelemetry/sdk-metrics, @opentelemetry/exporter-trace-otlp-http, @opentelemetry/exporter-metrics-otlp-http; create src/tracing.ts with NodeSDK configured with both trace and metric exporters using the OTLP/HTTP exporters listed here (port 4318, matching all other services; do not use gRPC exporters); import as the first line of src/index.ts before Express initializes
Acceptance Criteria:
- tenant-ui HTTP spans visible in Tempo
- HTTP RED metrics from tenant-ui appear in Mimir
- No regression in existing test suite
Labels: phase-2 helm traction
Estimated Effort: ~1h
Context: The traction chart (charts/traction/) has two components: tenant-ui (own deployment template at templates/ui/deployment.yaml with hardcoded env + envFrom configmap) and traction-agent (via acapy sub-chart dependency). The traction_innkeeper plugin runs inside the ACA-Py process — no separate deployment.
Requirements:
- Add a global.otel: section to values.yaml with shared fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation
- tenant-ui: Add ui.otel.serviceName (local, defaults to traction-tenant-ui); render all OTEL env vars conditionally in templates/ui/deployment.yaml from global.otel.* when global.otel.enabled is true
- ACA-Py agent: Once the acapy sub-chart has native otel: support (see acapy-agent Helm Chart issue), it inherits global.otel.* automatically; set only acapy.otel.serviceName: traction-agent in this chart's values.yaml
Acceptance Criteria:
- Setting global.otel.enabled: true and global.otel.endpoint once results in correct env vars in both the tenant-ui pod and the traction-agent pod
- Each pod has its own distinct OTEL_SERVICE_NAME
- Traces from both components visible in Tempo
- Metrics from both components visible in Mimir
Labels: phase-2 local-dev python endorser
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local development environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Context: The repo has a VS Code devcontainer (.devcontainer/) with docker-in-docker and a full compose stack at docker/docker-compose.yml (including acapy-endorser-api as a service with source mounted for hot reload). Developers may run the API either via compose or directly in the devcontainer shell.
Requirements:
- Create docker/docker-compose.observability.yml with only the grafana/otel-lgtm service (ports 3000:3000, 4317:4317, 4318:4318) on the same network as docker/docker-compose.yml
- When running via compose, add to the acapy-endorser-api service environment in docker/docker-compose.yml:
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
  - OTEL_SERVICE_NAME=acapy-endorser-service
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp
  - OTEL_PYTHON_LOG_CORRELATION=true
- When running uvicorn directly in the devcontainer shell, use OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 instead
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- docker compose -f docker/docker-compose.observability.yml up -d starts the LGTM stack
- Grafana accessible at http://localhost:3000
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation python endorser
Estimated Effort: ~2h
Description: Add OTel auto-instrumentation to the FastAPI endorser service with SQLAlchemy tracing and application metrics.
Requirements:
- Add opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-aiohttp-client, opentelemetry-instrumentation-sqlalchemy to pyproject.toml
- Do not add opentelemetry-instrumentation-asyncpg separately: the endorser uses SQLAlchemy with an asyncpg backend (create_async_engine in endorser/api/db/session.py); the SQLAlchemy instrumentor covers database spans at the correct level and adding asyncpg on top produces duplicate spans
- Initialize OTel SDK at the top of endorser/api/main.py, before app = get_application(); set up both TracerProvider and MeterProvider with OTLP exporters
- Apply SQLAlchemyInstrumentor().instrument(engine=engine) after the engine is created in endorser/api/db/session.py (because create_async_engine returns an AsyncEngine, pass engine.sync_engine to the instrumentor)
Acceptance Criteria:
- FastAPI request spans visible in Tempo
- SQLAlchemy query spans appear as children of request spans
- Outbound aiohttp calls to ACA-Py appear as child spans
- HTTP RED metrics appear in Mimir
- No regression in existing test suite
Labels: phase-2 helm endorser
Estimated Effort: ~1h
Context: The endorser chart (charts/endorser-service/) has two components: the endorser API (templates/api/deployment.yaml, hardcoded env) and the acapy sub-chart dependency.
Requirements:
- Add a global.otel: section to values.yaml with shared fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation
- Endorser API: Add api.otel.serviceName (local); render all OTEL env vars conditionally in templates/api/deployment.yaml from global.otel.* when global.otel.enabled is true
- ACA-Py agent: Once the acapy sub-chart has native otel: support, it inherits global.otel.* automatically; set only acapy.otel.serviceName in this chart's values.yaml
Acceptance Criteria:
- Setting global.otel.enabled: true and global.otel.endpoint once results in correct env vars in both the endorser API pod and the aca-py pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
Labels: phase-2 local-dev python didwebvh
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Context: No devcontainer. The compose file is demo/docker-compose.yml — webvh-server runs as a compose service on webvh-network, so http://lgtm:4318 is the correct endpoint.
Requirements:
- Add grafana/otel-lgtm service to demo/docker-compose.yml on the webvh-network (ports 3000:3000, 4317:4317, 4318:4318)
- Add to the webvh-server service environment:
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
  - OTEL_SERVICE_NAME=didwebvh-server-py
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp
  - OTEL_PYTHON_LOG_CORRELATION=true
- Update README or CONTRIBUTING with local observability instructions
Acceptance Criteria:
- docker compose -f demo/docker-compose.yml up starts the LGTM stack alongside the service
- Grafana accessible at http://localhost:3000
- Traces visible in Tempo after instrumentation is added and test requests are made locally
- Developer docs updated
Labels: phase-2 instrumentation python didwebvh
Estimated Effort: ~2.5h
Description: Add OTel auto-instrumentation to the didwebvh FastAPI server with centralized OTel init covering both traces and metrics, and SQLAlchemy database tracing.
Requirements:
- Add opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-logging, opentelemetry-instrumentation-sqlalchemy to server/pyproject.toml
- Create server/app/otel.py for centralized OTel init; configure both TracerProvider (BatchSpanProcessor + OTLPSpanExporter) and MeterProvider (PeriodicExportingMetricReader + OTLPMetricExporter)
- Call OTel init from server/main.py before uvicorn.run (must be active before uvicorn imports the app module)
- Apply FastAPIInstrumentor.instrument_app(app) in server/app/__init__.py where app is defined (uses modern lifespan context manager)
- Apply SQLAlchemyInstrumentor().instrument(engine=self._engine) in server/app/plugins/storage.py → StorageManager.__init__, immediately after self._engine is first created (engine is a lazy singleton; instrument once on first init)
Acceptance Criteria:
- FastAPI request spans visible in Tempo
- SQLAlchemy/psycopg2 query spans appear as children
- HTTP RED metrics appear in Mimir
- Log correlation enabled (trace_id present in stdout log records)
- No regression in existing test suite
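The "instrument once on first init" requirement for a lazily created engine can be sketched as a small guard. This is a generic illustration, not the StorageManager code: `LazyEngineHolder` and its injected callables are hypothetical, and the demo uses a plain object in place of a real engine so the pattern can be exercised without a database.

```python
from typing import Any, Callable, Optional


class LazyEngineHolder:
    """Hypothetical stand-in for a storage manager with a lazily created engine."""

    def __init__(
        self,
        create_engine: Callable[[], Any],
        instrument: Callable[[Any], None],
    ) -> None:
        self._create_engine = create_engine
        self._instrument = instrument
        self._engine: Optional[Any] = None

    @property
    def engine(self) -> Any:
        """Create and instrument the engine on first access, then reuse it."""
        if self._engine is None:
            self._engine = self._create_engine()
            # Runs only in the branch that creates the engine, so it happens once.
            self._instrument(self._engine)
        return self._engine


def demo() -> list:
    """Access the engine twice and report [same_object, instrument_call_count]."""
    instrumented = []
    holder = LazyEngineHolder(create_engine=object, instrument=instrumented.append)
    first = holder.engine
    second = holder.engine
    return [first is second, len(instrumented)]
```

In the real service the injected `instrument` callable would presumably wrap the `SQLAlchemyInstrumentor().instrument(engine=...)` call named in the requirements.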
Labels: phase-2 helm didwebvh
Estimated Effort: ~1h
Context: The upstream chart (didwebvh-server-py v0.7.0) lives in ~/code/helm-charts/charts/didwebvh-server/ (DIF helm-charts repo). The deployment template (templates/server/deployment.yaml) has hardcoded env vars with no extraEnvVars support. Changes go to that repo; DITP gitops per-env overrides in services/didwebvh-server-py/charts/{env}/values.yaml pick them up automatically once the chart is updated.
Requirements:
- Add a native otel: section to values.yaml in the DIF chart with fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, logCorrelation, serviceName
- Support global.otel.* for all shared fields (prefer global with local fallback), consistent with the pattern used across all other services in this project
- Render OTEL env vars conditionally in templates/server/deployment.yaml when otel.enabled (or global.otel.enabled) is true
Acceptance Criteria:
- Setting global.otel.enabled: true and global.otel.endpoint in the gitops values.yaml results in correct env vars in the deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
Labels: phase-2 local-dev typescript mediator
Estimated Effort: ~½h
Description: Update the existing local observability compose file to use grafana/otel-lgtm for consistency with other services. See Local Development for background.
Context: apps/mediator/docker-compose.observability.yml already exists but uses a full custom stack (OTel Collector → Jaeger, Prometheus, Grafana). For Phase 2 local dev, replace this with the simpler grafana/otel-lgtm image. The devcontainer (.devcontainer/docker-compose.yml) has a mediator service; LGTM goes on the same network.
Requirements:
- Replace the contents of apps/mediator/docker-compose.observability.yml with a minimal grafana/otel-lgtm service (ports 3000:3000, 4317:4317, 4318:4318)
- Update the mediator service environment in that file to:
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318
  - OTEL_SERVICE_NAME=didcomm-mediator-credo
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp (use OTLP for local dev; Prometheus scraping is for production)
- Update README with updated local observability instructions
Acceptance Criteria:
- docker compose -f apps/mediator/docker-compose.observability.yml up starts the LGTM stack alongside the mediator
- Grafana accessible at http://localhost:3000
- Traces visible in Tempo after test requests are made
- Developer docs updated
Labels: phase-2 instrumentation typescript mediator
Estimated Effort: ~1h
Description: Wire the existing instrumentation.js preloader into the main Dockerfile and add OTel packages as proper dependencies. Most of the instrumentation work is already done.
Context: apps/mediator/instrumentation.js already exists and is complete — it configures NodeSDK with auto-instrumentations for http, express, ws, pg (enhancedDatabaseReporting: true), OTLP trace/metrics exporters, and Prometheus. The problem is:
- OTel packages are not in apps/mediator/package.json, so the main Dockerfile cannot resolve them
- The main Dockerfile does not load instrumentation.js; only Dockerfile.observability does, via a hacky separate npm install into /otel/ with NODE_PATH
Requirements:
- Add all required OTel packages to apps/mediator/package.json as proper dependencies: @opentelemetry/api, @opentelemetry/sdk-node, @opentelemetry/auto-instrumentations-node, @opentelemetry/exporter-trace-otlp-http, @opentelemetry/exporter-metrics-otlp-http, @opentelemetry/exporter-prometheus, @opentelemetry/sdk-metrics, @opentelemetry/resources, @opentelemetry/semantic-conventions
- Update the main Dockerfile to copy apps/mediator/instrumentation.js into the final image and set NODE_OPTIONS="--require /app/instrumentation.js" (or pass --require in the entrypoint)
- Dockerfile.observability can be removed or kept as a reference; its NODE_PATH workaround is no longer needed once deps are in package.json
Acceptance Criteria:
- Main Dockerfile build succeeds with OTel packages resolved via package.json
- Mediator HTTP, WebSocket, and pg spans visible in Tempo
- HTTP RED metrics visible in Mimir
- No regression in existing behaviour
Labels: phase-2 helm mediator
Estimated Effort: ~1h
Context: The deployment chart is the OWF chart at owf/helm-charts/charts/didcomm-mediator-credo/. The deployment template accepts arbitrary env vars via {{- toYaml .Values.environment }}, which is flexible but not a native otel: integration. DITP gitops per-env overrides live in services/mediator-credo/charts/{env}/values.yaml.
Requirements:
- Add a native otel: section to the OWF chart's values.yaml with fields: enabled, endpoint, protocol, tracesExporter, metricsExporter, serviceName
- Support global.otel.* for all shared fields (prefer global with local fallback), consistent with the pattern used across all other services
- Render OTEL env vars conditionally in templates/deployment.yaml when otel.enabled (or global.otel.enabled) is true, replacing reliance on the generic environment passthrough for OTel config
Acceptance Criteria:
- Setting global.otel.enabled: true and global.otel.endpoint in the gitops values.yaml results in correct env vars in the deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment
Labels: phase-2 local-dev typescript bc-wallet
Estimated Effort: ~½h
Description: Add the grafana/otel-lgtm observability stack to the local docker-compose environment so developers can validate instrumentation locally before deploying to OpenShift. See Local Development for background.
Context: No devcontainer. The server runs on the host via yarn server:dev (ts-node-dev) — the server compose service in docker-compose.yml is under profiles: [prod] and is not used during development. Only MongoDB runs via compose. LGTM is added to the same compose file; the server running on the host connects via localhost.
Requirements:
- Add grafana/otel-lgtm service to docker-compose.yml (ports 3000:3000, 4317:4317, 4318:4318) on the same network as mongodb
- The server runs on the host, not in compose; set OTEL env vars in server/.env.example (and server/.env locally):
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
  - OTEL_SERVICE_NAME=bc-wallet-demo
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_TRACES_EXPORTER=otlp
  - OTEL_METRICS_EXPORTER=otlp
- Update README with local observability instructions
Acceptance Criteria:
- docker compose up starts LGTM and MongoDB together
- Grafana accessible at http://localhost:3000
- Running yarn server:dev with OTEL env vars set produces traces in Tempo
- Developer docs updated
Labels: phase-2 instrumentation typescript bc-wallet
Estimated Effort: ~2h
Description: Add OTel auto-instrumentation to the bc-wallet-demo server via a preloader file covering traces, metrics, and Mongoose database spans.
Context: Express 4 with routing-controllers (decorator-based). MongoDB is accessed via mongoose. The server entry is server/src/index.ts.
Requirements:
- Create server/src/instrumentation.ts:
  - Use NodeSDK with getNodeAutoInstrumentations(); this covers HTTP, Express, and MongoDB driver spans automatically
  - Add MongooseInstrumentation from @opentelemetry/instrumentation-mongoose explicitly for ODM-level spans (not included in auto-instrumentations-node)
  - Add PeriodicExportingMetricReader with OTLPMetricExporter from @opentelemetry/exporter-metrics-otlp-http
  - Use OTLPTraceExporter from @opentelemetry/exporter-trace-otlp-http
  - Read config from OTEL_ENABLED, OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_TRACES_EXPORTER, OTEL_METRICS_EXPORTER; no-op when disabled
- Import as the first line of server/src/index.ts (before Express and mongoose initialization)
- Add required packages to server/package.json: @opentelemetry/api, @opentelemetry/sdk-node, @opentelemetry/auto-instrumentations-node, @opentelemetry/exporter-trace-otlp-http, @opentelemetry/exporter-metrics-otlp-http, @opentelemetry/sdk-metrics, @opentelemetry/instrumentation-mongoose
- Do NOT add @opentelemetry/instrumentation-mongodb separately; it is already included in auto-instrumentations-node and adding it again creates duplicate spans
Acceptance Criteria:
- Server HTTP spans visible in Tempo
- Mongoose operation spans appear as children of request spans
- HTTP RED metrics visible in Mimir
- No regression in existing test suite
Labels: phase-2 helm bc-wallet
Estimated Effort: ~1h
Context: The Helm chart is in the repo itself at charts/showcase/. The server deployment template (charts/showcase/templates/server/deployment.yaml) already has a {{- range .Values.showcase.server.extraEnv }} mechanism with {name, value} pairs — but following the pattern established across all other services, OTel config should be native rather than passed through extraEnv.
Requirements:
- Add a native otel: section to charts/showcase/values.yaml under showcase.server.otel with fields: enabled, serviceName
- Support global.otel.* for shared fields (endpoint, protocol, tracesExporter, metricsExporter), with local showcase.server.otel.* as overrides where needed
- Render OTEL env vars conditionally in charts/showcase/templates/server/deployment.yaml when otel.enabled (or global.otel.enabled) is true
Acceptance Criteria:
- Setting global.otel.enabled: true and global.otel.endpoint in the gitops values.yaml results in correct env vars in the deployed pod
- Traces appear in Tempo after deployment
- HTTP metrics appear in Mimir after deployment