Summary
On Python pods where an OpenTelemetry auto-instrumentation wrapper has already installed an OTel TracerProvider before user code runs (e.g. when an OpenTelemetry-Operator Instrumentation CR injects an init-container that copies its bundled SDK into /otel-auto-instrumentation-python and prepends it to PYTHONPATH), sap_cloud_sdk.core.telemetry.auto_instrument() silently fails to deliver its resource attributes to the globally active TracerProvider. SAP-cloud-sdk attrs — sap.cloud_sdk.*, sap.solution_area, mlflow.experiment_id, sap.cld.*, deployment.environment.name, cloud.region — are missing from emitted spans.
This affects every sap-cloud-sdk consumer running on a managed Kubernetes runtime that auto-injects Python OTel auto-instrumentation. SAP App Foundation is one such environment — every Python pod gets the operator wrapper via a Kyverno ClusterPolicy that matches on the otel.instrumentation/enabled: python label, which the runtime's CI/CD workflow stamps automatically. Platform tracking ticket: AFSDK-2840.
Root cause
auto_instrument() calls Traceloop.init(..., resource_attributes=resource, ...). Internally Traceloop builds its own TracerProvider with the supplied Resource and calls trace.set_tracer_provider(...) to install it globally. But OTel's set_tracer_provider honours only the first call per process — it's gated by _TRACER_PROVIDER_SET_ONCE, with no override=True parameter (upstream issue thread). When a wrapper has already called set_tracer_provider during Python startup, Traceloop's call is silently dropped, and the resource_attributes we passed never reach the globally active provider.
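The first-call-wins gating described above can be modelled without any OTel dependency. This is a minimal sketch, not OTel's actual implementation: `ProviderRegistry` is a hypothetical stand-in for the global provider slot, the providers are plain dicts, and the attribute values are placeholders.

```python
import threading


class ProviderRegistry:
    """Hypothetical model of a global provider slot that can be set only once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._provider = None

    def set_provider(self, provider) -> bool:
        """Install `provider`; return False (and keep the old one) if already set."""
        with self._lock:
            if self._provider is not None:
                return False
            self._provider = provider
            return True

    def get_provider(self):
        return self._provider


registry = ProviderRegistry()

# Stand-ins for real providers; the values are illustrative placeholders.
wrapper_provider = {"telemetry.auto.version": "0.62b1"}
sdk_provider = {"sap.solution_area": "example"}

# The wrapper registers first, during interpreter startup.
registry.set_provider(wrapper_provider)

# The later call made on behalf of auto_instrument() is rejected,
# so its resource attributes never become globally visible.
accepted = registry.set_provider(sdk_provider)
```

After both calls, `registry.get_provider()` still returns the wrapper's provider, which mirrors the state the issue describes.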
Reproduction
- Deploy a Python application that calls sap_cloud_sdk.core.telemetry.auto_instrument() from its startup path.
- Run it on a Kubernetes cluster with an OTel-Operator Instrumentation CR that auto-injects Python auto-instrumentation (or set instrumentation.opentelemetry.io/inject-python: "true" on the pod manually). The pod will get an init-container that mounts an OTel SDK at /otel-auto-instrumentation-python and prepends it to PYTHONPATH.
- Read trace.get_tracer_provider().resource.attributes after auto_instrument() returns.
Expected: the active provider's Resource carries the full sap-cloud-sdk enrichment — sap.cloud_sdk.*, sap.solution_area, mlflow.experiment_id, sap.cld.*, deployment.environment.name, cloud.region, plus service.name from APPFND_CONHOS_APP_NAME.
Observed: the active provider's Resource carries only operator-supplied attrs (telemetry.sdk.*, telemetry.auto.version, k8s.*, service.namespace, service.instance.id, and service.name derived from the k8s deployment name). All sap-cloud-sdk attrs are missing.
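To make the expected/observed comparison mechanical, a small helper can report which of the expected attribute families are absent. This is a sketch under assumptions: `missing_sdk_attributes` and `EXPECTED_PATTERNS` are hypothetical names, and the pattern list is simply the set of keys named in this issue, not an authoritative list from sap-cloud-sdk.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Mapping

# Attribute keys / key families this issue says should be present.
EXPECTED_PATTERNS = (
    "sap.cloud_sdk.*",
    "sap.solution_area",
    "mlflow.experiment_id",
    "sap.cld.*",
    "deployment.environment.name",
    "cloud.region",
)


def missing_sdk_attributes(
    attributes: Mapping[str, object],
    patterns: Iterable[str] = EXPECTED_PATTERNS,
) -> List[str]:
    """Return every expected pattern that matches no key in `attributes`."""
    keys = list(attributes)
    return [
        pattern
        for pattern in patterns
        if not any(fnmatchcase(key, pattern) for key in keys)
    ]


# In a pod, pass the mapping read in the last reproduction step:
#   from opentelemetry import trace
#   missing_sdk_attributes(trace.get_tracer_provider().resource.attributes)
# (a provider without a `resource` attribute, such as a proxy, would need
# separate handling).
```

An empty result means all expected families are present; on an affected pod the helper should list every pattern.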
App Foundation reproducer
For SAP App Foundation tenants the trigger chain is automatic on every Python deploy:
- CI auto-detects Python and stamps otel.instrumentation/enabled: python onto the workload CR; see ci-cd-workflow/.github/actions/detect-otel-runtime/detect.py and inject-otel-app-yaml/inject.py:39-47.
- The workload chart propagates the CR label onto the rendered Deployment + Pod template; see helm-templates/charts/agent/templates/_helpers.tpl:170-193 ("Kyverno matches Deployment labels").
- The cluster's Kyverno otel-inject-python-pod ClusterPolicy fires the OTel-Operator webhook → init-container injection → wrapper-installed TracerProvider → bug above.
Concrete evidence
OTEL resource attributes on a deployed App Foundation pod (sap-cloud-sdk==0.11.6, OTel-Operator bundle telemetry.auto.version=0.62b1, OTel SDK 1.41.1):
{
  "resource_attribute_count": 15,
  "resource_attributes": {
    "telemetry.sdk.language": "python",
    "telemetry.sdk.name": "opentelemetry",
    "telemetry.sdk.version": "1.41.1",
    "service.version": "0.0.1",
    "sap.service.display_name": "buyer-agent-evals-fina",
    "k8s.container.name": "buyer-agent-evals-fina",
    "k8s.deployment.name": "buyer-agent-evals-fina-deployment",
    "k8s.namespace.name": "buyer-agent-evals-fsmcba",
    "k8s.node.name": "ip-10-250-152-52.eu-central-1.compute.internal",
    "k8s.pod.name": "buyer-agent-evals-fina-deployment-97d4f6795-7r2vz",
    "k8s.replicaset.name": "buyer-agent-evals-fina-deployment-97d4f6795",
    "service.instance.id": "buyer-agent-evals-fsmcba.buyer-agent-evals-fina-deployment-97d4f6795-7r2vz.buyer-agent-evals-fina",
    "service.namespace": "buyer-agent-evals-fsmcba",
    "service.name": "buyer-agent-evals-fina-deployment",
    "telemetry.auto.version": "0.62b1"
  }
}
For comparison, auto_instrument() on a single-tenant pod (no wrapper active) produces a Resource with 23 attributes, including all expected sap-cloud-sdk keys. The resource-building path is therefore fine; the problem is purely that those attrs never reach the globally active provider when something else got there first.
Impact
- MLflow trace routing breaks. The mlflow.experiment_id resource attribute is the routing key on the collector side; without it, spans land in the wrong (or default) experiment.
- Solution-area / sub-account attribution breaks. sap.solution_area, sap.cld.subaccount_id, sap.cld.system_role (used for filtering / quota / chargeback) are absent.
- Cloud-SDK provenance signals lost. sap.cloud_sdk.{name,language,version} no longer identify spans as coming from a sap-cloud-sdk-instrumented workload.
- Service identity is incorrect. service.name ends up as <appname>-deployment (k8s deployment name) rather than the cloud-sdk-supplied <appname> from APPFND_CONHOS_APP_NAME.
The bug is silent — auto_instrument() returns successfully and emits the "Cloud auto instrumentation initialized successfully" log line — so it's easy to miss without specifically inspecting the resulting provider's Resource.
Versions
- sap-cloud-sdk 0.11.x (current).
- traceloop-sdk 0.54.0.
- OTel SDK 1.41.x (any).
- Reproduces on any cluster with an OTel-Operator Instrumentation CR auto-injecting Python auto-instrumentation. Confirmed on App Foundation's Kyma runtime.
Proposed fix
Detect the wrapper-installed-provider case via the standard upstream OTel-Operator marker telemetry.auto.version on the active provider's Resource, and merge the sap-cloud-sdk attrs onto it via provider._resource = provider.resource.merge(Resource.create(sap_attrs)). Right-side wins on collisions. No new public API; no parameter additions; existing single-tenant flows unaffected (no marker → no merge). Auto-detection means existing callers pick up the fix transparently on upgrade. PR forthcoming.
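A rough sketch of what the detect-and-merge step could look like, pending the actual PR. The function name `merge_sdk_resource` and the constant `WRAPPER_MARKER` are hypothetical. The helper is duck-typed: it assumes `provider` behaves like `opentelemetry.sdk.trace.TracerProvider` and `sdk_resource` like `opentelemetry.sdk.resources.Resource`, and it writes to the private `_resource` attribute exactly as proposed above.

```python
from typing import Any

# Marker named in this issue as the OTel-Operator wrapper signal.
WRAPPER_MARKER = "telemetry.auto.version"


def merge_sdk_resource(provider: Any, sdk_resource: Any) -> bool:
    """Merge `sdk_resource` into a wrapper-installed provider's Resource.

    Returns True if a merge happened, False if the provider was left alone.
    """
    current = getattr(provider, "resource", None)
    if current is None:
        # Provider exposes no Resource at all (e.g. a proxy or no-op provider).
        return False
    if WRAPPER_MARKER not in current.attributes:
        # No wrapper marker: single-tenant flow, leave the provider untouched.
        return False
    # merge() lets the argument's attributes win on key collisions, so the
    # SDK-supplied service.name replaces the deployment-derived one.
    provider._resource = current.merge(sdk_resource)
    return True


# Hypothetical call site inside auto_instrument(), after Traceloop.init():
#   from opentelemetry import trace
#   from opentelemetry.sdk.resources import Resource
#   merge_sdk_resource(trace.get_tracer_provider(), Resource.create(sap_attrs))
```

One caveat worth verifying before relying on this: the SDK appears to hand the Resource to each Tracer when it is created, so tracers obtained before the merge may keep emitting spans with the old Resource.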