Merged
18 commits
5e3b057
Update RELEASING.md with contrib.yml notes (#4864)
emdneto Dec 29, 2025
1ac0158
Prevent recursive logging issue in `SimpleLogRecordProcessor.on_emit`…
DylanRussell Dec 29, 2025
52137e0
Allow loading all resource detectors by setting `OTEL_EXPERIMENTAL_RE…
bschoenmaeckers Dec 30, 2025
f40ae14
Add CC users for performance alert comments (#4871)
emdneto Jan 9, 2026
3750c14
add more benchmark for logs signal (#4870)
emdneto Jan 9, 2026
3987f3b
Switch to using contextvar for recursion indicator (#4867)
DylanRussell Jan 9, 2026
7e6f11e
Update Emídio Neto's affiliation in README (#4874)
emdneto Jan 9, 2026
784442f
Fix docs/example semconv link (#4875)
tammy-baylis-swi Jan 12, 2026
62e9ad3
Fix duplicate HELP/TYPE declarations in Prometheus exporter (#4869)
Antrakos Jan 15, 2026
b3193f2
feat(http): add error handling for exporting (#4709)
pafi-code Jan 20, 2026
0018c00
Make ConcurrentMultiSpanProcessor fork safe (#4862)
gregoiredx Jan 20, 2026
615d467
test-util: allow filtering metrics by scope (#4883)
anuraaga Jan 23, 2026
9ee6de8
Fix: Reinitialize gRPC channel on UNAVAILABLE error (#4825)
dheeraj-vanamala Jan 27, 2026
72c3729
update tox command’s deps and allowlist
MikeGoldsmith Feb 4, 2026
a2dfa41
add use-union-operator to datamodel-codegen and regenerate models file
MikeGoldsmith Feb 4, 2026
659ab65
Merge branch 'main' of github.com:open-telemetry/opentelemetry-python…
MikeGoldsmith Feb 4, 2026
dd6a2cd
add changelog
MikeGoldsmith Feb 4, 2026
99e9570
disable union-operator and set target python to 3.10
MikeGoldsmith Feb 4, 2026
1 change: 1 addition & 0 deletions .github/workflows/benchmarks.yml
@@ -42,3 +42,4 @@ jobs:
# Alert with a commit comment on possible performance regression
alert-threshold: '200%'
comment-on-alert: true
alert-comment-cc-users: "@open-telemetry/python-approvers,@open-telemetry/python-maintainers"
18 changes: 16 additions & 2 deletions CHANGELOG.md
@@ -12,14 +12,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

- `opentelemetry-exporter-otlp-proto-grpc`: Fix re-initialization of gRPC channel on UNAVAILABLE error
([#4825](https://github.com/open-telemetry/opentelemetry-python/pull/4825))
- `opentelemetry-exporter-prometheus`: Fix duplicate HELP/TYPE declarations for metrics with different label sets
([#4868](https://github.com/open-telemetry/opentelemetry-python/issues/4868))
- Allow loading all resource detectors by setting `OTEL_EXPERIMENTAL_RESOURCE_DETECTORS` to `*`
([#4819](https://github.com/open-telemetry/opentelemetry-python/pull/4819))
- `opentelemetry-sdk`: Fix the type hint of the `_metrics_data` property to allow `None`
([#4837](https://github.com/open-telemetry/opentelemetry-python/pull/4837)
([#4837](https://github.com/open-telemetry/opentelemetry-python/pull/4837)).
- Regenerate opentelemetry-proto code with v1.9.0 release
([#4840](https://github.com/open-telemetry/opentelemetry-python/pull/4840))
- Add python 3.14 support
([#4798](https://github.com/open-telemetry/opentelemetry-python/pull/4798))
- Silence events API warnings for internal users
([#4847](https://github.com/open-telemetry/opentelemetry-python/pull/4847))
- Prevent possible endless recursion from happening in `SimpleLogRecordProcessor.on_emit`,
([#4799](https://github.com/open-telemetry/opentelemetry-python/pull/4799)) and ([#4867](https://github.com/open-telemetry/opentelemetry-python/pull/4867)).
- Make ConcurrentMultiSpanProcessor fork safe
([#4862](https://github.com/open-telemetry/opentelemetry-python/pull/4862))
- `opentelemetry-exporter-otlp-proto-http`: fix retry logic and error handling for connection failures in trace, metric, and log exporters
([#4709](https://github.com/open-telemetry/opentelemetry-python/pull/4709))
- `opentelemetry-sdk`: automatically generate configuration models using OTel config JSON schema
([#4879](https://github.com/open-telemetry/opentelemetry-python/pull/4879))

## Version 1.39.0/0.60b0 (2025-12-03)

@@ -87,7 +101,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
([#4654](https://github.com/open-telemetry/opentelemetry-python/pull/4654)).
- Fix type checking for built-in metric exporters
([#4820](https://github.com/open-telemetry/opentelemetry-python/pull/4820))

## Version 1.38.0/0.59b0 (2025-10-16)

- Add `rstcheck` to pre-commit to stop introducing invalid RST
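As an aside on the `OTEL_EXPERIMENTAL_RESOURCE_DETECTORS` entry above, here is a minimal sketch of the wildcard in use, assuming (as the entry implies) that `Resource.create()` is where the SDK consults this variable:

```python
import os

# Opt in to every registered resource detector (PR #4819). The variable
# must be set before the SDK builds its Resource, so in practice it is
# usually set in the process environment rather than in code.
os.environ.setdefault("OTEL_EXPERIMENTAL_RESOURCE_DETECTORS", "*")

from opentelemetry.sdk.resources import Resource  # noqa: E402

# Discovers detectors via entry points and merges their attributes.
resource = Resource.create()
print(resource.attributes)
```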
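Similarly, the recursion-prevention entries (#4799 and #4867) boil down to a context-variable guard around the emit path. The sketch below is generic, with hypothetical names, and does not mirror the SDK's internals:

```python
from contextvars import ContextVar

_suppress_emit: ContextVar[bool] = ContextVar("_suppress_emit", default=False)


def on_emit(log_record, export):
    # If exporting produces another log record (for example, the exporter
    # logs a warning), the nested call sees the flag and returns instead
    # of recursing indefinitely.
    if _suppress_emit.get():
        return
    token = _suppress_emit.set(True)
    try:
        export(log_record)
    finally:
        _suppress_emit.reset(token)
```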
2 changes: 1 addition & 1 deletion README.md
@@ -111,7 +111,7 @@ For more information about the maintainer role, see the [community repository](h
### Approvers

- [Dylan Russell](https://github.com/dylanrussell), Google
- [Emídio Neto](https://github.com/emdneto), PicPay
- [Emídio Neto](https://github.com/emdneto), Independent
- [Héctor Hernández](https://github.com/hectorhdzg), Microsoft
- [Jeremy Voss](https://github.com/jeremydvoss), Microsoft
- [Liudmila Molkova](https://github.com/lmolkova), Grafana Labs
12 changes: 12 additions & 0 deletions RELEASING.md
@@ -33,6 +33,18 @@
* Review and merge the pull request that it creates for updating the version.
* Note: If you are doing a patch release in `-core` repo, you should also do an equivalent patch release in `-contrib` repo (even if there's no fix to release), otherwise tests in CI will fail.

### Note on `contrib.yml` Workflow Behavior

The [contrib.yml](https://github.com/open-telemetry/opentelemetry-python/blob/main/.github/workflows/contrib.yml) workflow in the core repository references reusable workflows from opentelemetry-python-contrib using the hard-coded `main` branch.

Because `uses:` statements cannot receive environment variables and workflows cannot patch or modify other workflows, this reference cannot dynamically follow release branches the way other workflows do.

As a result, when preparing a release branch that contains a different set of instrumentations (e.g., older branches without newly added tox environments), CI may attempt to run tox environments that do not exist in that branch. In this case:

* It is safe to merge the release PR even if the contrib workflow fails for this reason, or

* Optionally update the contrib.yml workflow to point to the corresponding release branch before running CI.

## Making the release

* Run the [Release workflow](https://github.com/open-telemetry/opentelemetry-python/actions/workflows/release.yml).
4 changes: 2 additions & 2 deletions docs/examples/sqlcommenter/README.rst
@@ -5,7 +5,7 @@ This is an example of how to use OpenTelemetry Python instrumentation with
sqlcommenter to enrich database query statements with contextual information.
For more information on sqlcommenter concepts, see:

* `Semantic Conventions - Database Spans <https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/database-spans.md#sql-commenter>`_
* `Semantic Conventions - Database Spans <https://github.com/open-telemetry/semantic-conventions/blob/main/docs/db/database-spans.md#sql-commenter>`_
* `sqlcommenter <https://google.github.io/sqlcommenter/>`_

The source files of this example are available `here <https://github.com/open-telemetry/opentelemetry-python/tree/main/docs/examples/sqlcommenter/>`_.
@@ -120,5 +120,5 @@ References
* `OpenTelemetry Project <https://opentelemetry.io/>`_
* `OpenTelemetry Collector <https://github.com/open-telemetry/opentelemetry-collector>`_
* `OpenTelemetry MySQL instrumentation <https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation/opentelemetry-instrumentation-mysql>`_
* `Semantic Conventions - Database Spans <https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/database-spans.md#sql-commenter>`_
* `Semantic Conventions - Database Spans <https://github.com/open-telemetry/semantic-conventions/blob/main/docs/db/database-spans.md#sql-commenter>`_
* `sqlcommenter <https://google.github.io/sqlcommenter/>`_
@@ -12,7 +12,13 @@
# See the License for the specific language governing permissions and
# limitations under the License.

"""OTLP Exporter"""
"""OTLP Exporter

This module provides a mixin class for OTLP exporters that send telemetry data
to an OTLP-compatible receiver via gRPC. It includes configurable reconnection
logic to handle transient collector outages.

"""

import random
import threading
@@ -251,20 +257,27 @@ def _get_credentials(
if certificate_file:
client_key_file = environ.get(client_key_file_env_key)
client_certificate_file = environ.get(client_certificate_file_env_key)
return _load_credentials(
credentials = _load_credentials(
certificate_file, client_key_file, client_certificate_file
)
if credentials is not None:
return credentials
return ssl_channel_credentials()


# pylint: disable=no-member
class OTLPExporterMixin(
ABC, Generic[SDKDataT, ExportServiceRequestT, ExportResultT, ExportStubT]
):
"""OTLP span exporter
"""OTLP gRPC exporter mixin.

This class provides the base functionality for OTLP exporters that send
telemetry data (spans or metrics) to an OTLP-compatible receiver via gRPC.
It includes a configurable reconnection mechanism to handle transient
receiver outages.

Args:
endpoint: OpenTelemetry Collector receiver endpoint
endpoint: OTLP-compatible receiver endpoint
insecure: Connection type
credentials: ChannelCredentials object for server authentication
headers: Headers to send when exporting
@@ -308,6 +321,8 @@ def __init__(
if parsed_url.netloc:
self._endpoint = parsed_url.netloc

self._insecure = insecure
self._credentials = credentials
self._headers = headers or environ.get(OTEL_EXPORTER_OTLP_HEADERS)
if isinstance(self._headers, str):
temp_headers = parse_env_headers(self._headers, liberal=True)
@@ -336,37 +351,52 @@ def __init__(
)
self._collector_kwargs = None

compression = (
self._compression = (
environ_to_compression(OTEL_EXPORTER_OTLP_COMPRESSION)
if compression is None
else compression
) or Compression.NoCompression

if insecure:
self._channel = insecure_channel(
self._endpoint,
compression=compression,
options=self._channel_options,
)
else:
self._channel = None
self._client = None

self._shutdown_in_progress = threading.Event()
self._shutdown = False

if not self._insecure:
self._credentials = _get_credentials(
credentials,
self._credentials,
_OTEL_PYTHON_EXPORTER_OTLP_GRPC_CREDENTIAL_PROVIDER,
OTEL_EXPORTER_OTLP_CERTIFICATE,
OTEL_EXPORTER_OTLP_CLIENT_KEY,
OTEL_EXPORTER_OTLP_CLIENT_CERTIFICATE,
)

self._initialize_channel_and_stub()

def _initialize_channel_and_stub(self):
"""
Create a new gRPC channel and stub.

This method is used during initialization and by the reconnection
mechanism to reinitialize the channel on transient errors.
"""
if self._insecure:
self._channel = insecure_channel(
self._endpoint,
compression=self._compression,
options=self._channel_options,
)
else:
assert self._credentials is not None
self._channel = secure_channel(
self._endpoint,
self._credentials,
compression=compression,
compression=self._compression,
options=self._channel_options,
)
self._client = self._stub(self._channel) # type: ignore [reportCallIssue]

self._shutdown_in_progress = threading.Event()
self._shutdown = False

@abstractmethod
def _translate_data(
self,
@@ -388,6 +418,8 @@ def _export(
deadline_sec = time() + self._timeout
for retry_num in range(_MAX_RETRYS):
try:
if self._client is None:
return self._result.FAILURE
self._client.Export(
request=self._translate_data(data),
metadata=self._headers,
@@ -407,6 +439,26 @@
retry_info.retry_delay.seconds
+ retry_info.retry_delay.nanos / 1.0e9
)

# For UNAVAILABLE errors, reinitialize the channel to force reconnection
if error.code() == StatusCode.UNAVAILABLE and retry_num == 0: # type: ignore
logger.debug(
"Reinitializing gRPC channel for %s exporter due to UNAVAILABLE error",
self._exporting,
)
try:
if self._channel:
self._channel.close()
except Exception as e:
logger.debug(
"Error closing channel for %s exporter to %s: %s",
self._exporting,
self._endpoint,
str(e),
)
# Enable channel reconnection for subsequent calls
self._initialize_channel_and_stub()

if (
error.code() not in _RETRYABLE_ERROR_CODES # type: ignore [reportAttributeAccessIssue]
or retry_num + 1 == _MAX_RETRYS
@@ -436,12 +488,19 @@ def _export(
return self._result.FAILURE # type: ignore [reportReturnType]

def shutdown(self, timeout_millis: float = 30_000, **kwargs) -> None:
"""
Shut down the exporter.

Args:
timeout_millis: Timeout in milliseconds for shutting down the exporter.
"""
if self._shutdown:
logger.warning("Exporter already shutdown, ignoring call")
return
self._shutdown = True
self._shutdown_in_progress.set()
self._channel.close()
if self._channel:
self._channel.close()

@property
@abstractmethod
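To make the reconnection behaviour introduced in this file easier to review, here is a standalone sketch of the pattern reduced to its essentials. The `export_with_reconnect`, `make_channel`, and `make_stub` names are hypothetical helpers for illustration, not part of the exporter API:

```python
import time

import grpc


def export_with_reconnect(make_channel, make_stub, request, max_retries=3):
    """Retry an export, rebuilding the channel once on UNAVAILABLE.

    Mirrors the idea behind _initialize_channel_and_stub: a transient
    collector outage invalidates the channel, so the first UNAVAILABLE
    error closes it and creates a fresh one before retrying.
    """
    channel = make_channel()
    stub = make_stub(channel)
    for retry_num in range(max_retries):
        try:
            return stub.Export(request)
        except grpc.RpcError as error:
            if error.code() == grpc.StatusCode.UNAVAILABLE and retry_num == 0:
                channel.close()
                channel = make_channel()
                stub = make_stub(channel)
            if retry_num + 1 == max_retries:
                raise
            time.sleep(2**retry_num)  # exponential backoff between attempts
```

The real exporter additionally honours RetryInfo delays and a shutdown event, which the sketch omits.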
@@ -24,6 +24,7 @@
from unittest import TestCase
from unittest.mock import Mock, patch

import grpc
from google.protobuf.duration_pb2 import ( # pylint: disable=no-name-in-module
Duration,
)
@@ -91,8 +92,8 @@ def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
def _exporting(self):
return "traces"

def shutdown(self, timeout_millis=30_000):
return OTLPExporterMixin.shutdown(self, timeout_millis)
def shutdown(self, timeout_millis: float = 30_000, **kwargs):
return OTLPExporterMixin.shutdown(self, timeout_millis, **kwargs)


class TraceServiceServicerWithExportParams(TraceServiceServicer):
@@ -513,6 +514,16 @@ def test_timeout_set_correctly(self):
self.assertEqual(mock_trace_service.num_requests, 2)
self.assertAlmostEqual(after - before, 1.4, 1)

def test_channel_options_set_correctly(self):
"""Test that gRPC channel options are set correctly for keepalive and reconnection"""
# This test verifies that the channel is created with the right options
# We patch grpc.insecure_channel to ensure it is called without errors
with patch(
"opentelemetry.exporter.otlp.proto.grpc.exporter.insecure_channel"
) as mock_channel:
OTLPSpanExporterForTesting(insecure=True)
self.assertTrue(mock_channel.called)

def test_otlp_headers_from_env(self):
# pylint: disable=protected-access
# This ensures that there is no other header than standard user-agent.
@@ -536,3 +547,27 @@ def test_permanent_failure(self):
warning.records[-1].message,
"Failed to export traces to localhost:4317, error code: StatusCode.ALREADY_EXISTS",
)

def test_unavailable_reconnects(self):
"""Test that the exporter reconnects on UNAVAILABLE error"""
add_TraceServiceServicer_to_server(
TraceServiceServicerWithExportParams(StatusCode.UNAVAILABLE),
self.server,
)

# Spy on grpc.insecure_channel to verify it's called for reconnection
with patch(
"opentelemetry.exporter.otlp.proto.grpc.exporter.insecure_channel",
side_effect=grpc.insecure_channel,
) as mock_insecure_channel:
# Mock sleep to avoid waiting
with patch("time.sleep"):
# We expect FAILURE because the server keeps returning UNAVAILABLE
# but we want to verify reconnection attempts happened
self.exporter.export([self.span])

# Verify that we attempted to reinitialize the channel (called insecure_channel)
# Since the initial channel was created in setUp (unpatched), this call
# must be from the reconnection logic.
self.assertTrue(mock_insecure_channel.called)
# Verify that reconnection enabled flag is set
@@ -186,26 +186,42 @@ def export(
serialized_data = encode_logs(batch).SerializeToString()
deadline_sec = time() + self._timeout
for retry_num in range(_MAX_RETRYS):
resp = self._export(serialized_data, deadline_sec - time())
if resp.ok:
return LogRecordExportResult.SUCCESS
# multiplying by a random number between 0.8 and 1.2 introduces a +/-20% jitter to each backoff.
backoff_seconds = 2**retry_num * random.uniform(0.8, 1.2)
try:
resp = self._export(serialized_data, deadline_sec - time())
if resp.ok:
return LogRecordExportResult.SUCCESS
except requests.exceptions.RequestException as error:
reason = error
retryable = isinstance(error, ConnectionError)
status_code = None
else:
reason = resp.reason
retryable = _is_retryable(resp)
status_code = resp.status_code

if not retryable:
_logger.error(
"Failed to export logs batch code: %s, reason: %s",
status_code,
reason,
)
return LogRecordExportResult.FAILURE

if (
not _is_retryable(resp)
or retry_num + 1 == _MAX_RETRYS
retry_num + 1 == _MAX_RETRYS
or backoff_seconds > (deadline_sec - time())
or self._shutdown
):
_logger.error(
"Failed to export logs batch code: %s, reason: %s",
resp.status_code,
resp.text,
"Failed to export logs batch due to timeout, "
"max retries or shutdown."
)
return LogRecordExportResult.FAILURE
_logger.warning(
"Transient error %s encountered while exporting logs batch, retrying in %.2fs.",
resp.reason,
reason,
backoff_seconds,
)
shutdown = self._shutdown_is_occuring.wait(backoff_seconds)
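The retry loop's backoff arithmetic (shared in spirit with the gRPC exporter above) can also be read in isolation. The `backoff_schedule` helper below is purely illustrative; the exporter inlines this logic:

```python
import random
import time


def backoff_schedule(max_retries, deadline_sec):
    """Yield the wait before each retry: exponential growth with +/-20%
    jitter, stopping once the next wait would overrun the deadline."""
    for retry_num in range(max_retries):
        backoff_seconds = 2**retry_num * random.uniform(0.8, 1.2)
        if backoff_seconds > (deadline_sec - time.time()):
            return  # not enough time budget left for another attempt
        yield backoff_seconds


# Example: waits grow as roughly 1s, 2s, 4s, ... until the remaining
# deadline budget is exceeded.
for wait in backoff_schedule(max_retries=6, deadline_sec=time.time() + 30):
    print(round(wait, 2))
```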