
[CODE HEALTH] Fix clang-tidy bugprone-exception-escape warnings in API#3964

Open
dbarker wants to merge 14 commits into open-telemetry:main from
dbarker:fix_clangtidy_exception_escape_in_api

Conversation

@dbarker
Member

@dbarker dbarker commented Apr 1, 2026

Fixes #3981

Contributes to #2053, #3013

Changes

  • Add macros to support try/catch when compiling with and without exceptions
  • Add macro for unreachable code paths
  • Add unit test coverage for cases with string_view, string_util, and trace state.
  • Fixes the following warnings:

bugprone-exception-escape (4 warnings)

| File | Line | Message |
| --- | --- | --- |
| opentelemetry-cpp/api/include/opentelemetry/trace/trace_state.h | 55 | an exception may be thrown in function 'FromHeader' which should not throw exceptions |
| opentelemetry-cpp/api/include/opentelemetry/trace/trace_state.h | 118 | an exception may be thrown in function 'Get' which should not throw exceptions |
| opentelemetry-cpp/api/include/opentelemetry/trace/trace_state.h | 138 | an exception may be thrown in function 'Set' which should not throw exceptions |
| opentelemetry-cpp/api/include/opentelemetry/trace/trace_state.h | 172 | an exception may be thrown in function 'Delete' which should not throw exceptions |

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

@codecov

codecov bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 85.88235% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.06%. Comparing base (185dbd7) to head (6105492).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| api/include/opentelemetry/trace/trace_state.h | 84.32% | 8 Missing ⚠️ |
| api/include/opentelemetry/common/kv_properties.h | 92.60% | 2 Missing ⚠️ |
| api/include/opentelemetry/plugin/detail/utility.h | 50.00% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3964      +/-   ##
==========================================
- Coverage   90.18%   90.06%   -0.11%     
==========================================
  Files         230      230              
  Lines        7299     7312      +13     
==========================================
+ Hits         6582     6585       +3     
- Misses        717      727      +10     

| Files with missing lines | Coverage Δ |
| --- | --- |
| api/include/opentelemetry/common/string_util.h | 100.00% <100.00%> (ø) |
| api/include/opentelemetry/nostd/string_view.h | 98.15% <ø> (ø) |
| api/include/opentelemetry/nostd/variant.h | 66.67% <ø> (ø) |
| api/include/opentelemetry/common/kv_properties.h | 96.67% <92.60%> (-2.19%) ⬇️ |
| api/include/opentelemetry/plugin/detail/utility.h | 50.00% <50.00%> (ø) |
| api/include/opentelemetry/trace/trace_state.h | 90.33% <84.32%> (-7.29%) ⬇️ |

... and 1 file with indirect coverage changes

Member

@marcalff marcalff left a comment


The change to replace nostd::holds_alternative with nostd::get_if is clean and a no-brainer.

Please prepare a separate patch to fix all nostd::holds_alternative so this part can be merged out of the way.

This PR can then focus on the try/catch code alone, as it will be more tricky to resolve.

@dbarker dbarker marked this pull request as ready for review April 3, 2026 13:30
@dbarker dbarker requested a review from a team as a code owner April 3, 2026 13:30
@dbarker
Member Author

dbarker commented Apr 3, 2026

> The change to replace nostd::holds_alternative with nostd::get_if is clean and a no-brainer.
>
> Please prepare a separate patch to fix all nostd::holds_alternative so this part can be merged out of the way.
>
> This PR can then focus on the try/catch code alone, as it will be more tricky to resolve.

Thanks @marcalff. I've split this up. The nostd::variant access changes are now in #3965. Interested in feedback on the try/catch/unreachable macros given our need to support noexcept builds and analyzers.

@marcalff
Member

marcalff commented Apr 9, 2026

@dbarker

Thanks for working on this; it is a delicate topic in opentelemetry-cpp.

Here is some background, as I understand the history, to help design the proper fix, followed by some general comments.

1. Otel Instrumentation should not kill the application

When an application is instrumented with otel, it makes API calls to otel-cpp.
These calls are to collect data (SDK installed and configured), or not (SDK not installed, or discarding data), and the application does not expect data returned from these API calls.

Whatever happens within otel-cpp should not put the application at risk, which is why every API entry point is designed to be noexcept.

2. The application code cannot be trusted to be robust

Every API that returns a pointer (say, a span) can never return a null pointer, because we cannot expect the application code to systematically check for null pointers on every single instrumented line.

Constantly checking pointers would be very tedious on the application side, and unsafe, possibly causing crashes if a check is missing.

For this reason, otel-cpp will never return a null pointer, but will return a "no-op" object instead, that the instrumented code can call without precautions.

In other words, we don't want to impose a defensive programming style on the application, when adding opentelemetry instrumentation.
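The "no-op object instead of null" idea in point 2 can be sketched as follows. This is a minimal illustration, not the opentelemetry-cpp implementation: `Span`, `NoopSpan`, `Tracer`, and `CurrentSpan` are simplified, hypothetical stand-ins, and the sketch returns a raw pointer to a function-local static so that no heap allocation is involved.

```cpp
// Hypothetical, simplified stand-ins; not the real opentelemetry-cpp classes.
class Span
{
public:
  virtual ~Span() = default;
  virtual void SetAttribute(const char *key, int value) noexcept = 0;
  virtual bool IsRecording() const noexcept                      = 0;
};

// Accepts every call and discards the data.
class NoopSpan final : public Span
{
public:
  void SetAttribute(const char * /*key*/, int /*value*/) noexcept override {}
  bool IsRecording() const noexcept override { return false; }
};

class Tracer
{
public:
  // `real` may be null, e.g. when no SDK is installed.
  explicit Tracer(Span *real) noexcept : real_(real) {}

  // Never returns a null pointer: falls back to a shared no-op object.
  Span *CurrentSpan() noexcept
  {
    static NoopSpan noop;  // function-local static, no heap allocation
    return real_ != nullptr ? real_ : &noop;
  }

private:
  Span *real_;
};
```

With this shape, instrumented code can write `tracer.CurrentSpan()->SetAttribute("k", 1);` without a null check, whether or not a real span exists.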

3. Memory allocation and noexcept

Points 1 and 2 already impose severe constraints, and I am not even sure it is possible to safely return a pointer to a no-op object without raising exceptions at the same time.

When an API returns a nostd::unique_ptr<T> or nostd::shared_ptr<T>, it needs to allocate memory, which can raise exceptions, making it hard to have the API typed as noexcept.

Putting aside exceptions caused by out-of-memory conditions, we should at least make an effort not to raise any other exceptions.

4. Different build flavors

Applications can be built with, or without, exception support, and we need to accommodate both.

Given that we support many compilers/platforms as well, hiding all the details behind OPENTELEMETRY_HAVE_EXCEPTIONS sounds like a good thing.

5. Places that need a try-catch block

There should be very few places handling exceptions in the API itself.

Most of the time, the API consists of virtual methods implemented by the SDK, so there is no code in the API itself.

When code is actually provided inline in the API, it is for very specific places, like:

  • The runtime context
  • Propagation

so I would expect only very few try-catch blocks in the whole API.

6. try-catch block style

Instead of hiding all the details behind macros for the try, and more importantly the catch block, we can have more explicit code.

For example:

```cpp
#if OPENTELEMETRY_HAVE_EXCEPTIONS
try
#endif
{
  // Regular code
}
#if OPENTELEMETRY_HAVE_EXCEPTIONS
catch (...)
{
  // catch code, only used when building with exceptions
}
#endif
```

This will avoid confusing the compiler, clang-tidy, and cpp-check, and avoid complaints that a return code path is missing, code is not reachable, etc., when building with/without exceptions.

Given (5), with very few try-catch blocks needed, I think this will work better.
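To make the style in point 6 concrete, here is a small self-contained sketch of a noexcept function that returns a value on every path in both build flavors. `DEMO_HAVE_EXCEPTIONS` and `ValueOr` are hypothetical names used only for this illustration; the macro is a stand-in for OPENTELEMETRY_HAVE_EXCEPTIONS, and the detection logic shown is an assumption, not how the project defines it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for OPENTELEMETRY_HAVE_EXCEPTIONS (assumed detection).
#if defined(__cpp_exceptions) || defined(__EXCEPTIONS) || defined(_CPPUNWIND)
#  define DEMO_HAVE_EXCEPTIONS 1
#else
#  define DEMO_HAVE_EXCEPTIONS 0
#endif

// Returns values.at(index), or `fallback` if the lookup throws.
// Note: in a build without exceptions, a bad index is not recoverable here;
// the standard library typically terminates instead of throwing.
inline int ValueOr(const std::vector<int> &values, std::size_t index, int fallback) noexcept
{
#if DEMO_HAVE_EXCEPTIONS
  try
#endif
  {
    return values.at(index);  // throws std::out_of_range on a bad index
  }
#if DEMO_HAVE_EXCEPTIONS
  catch (...)
  {
    return fallback;  // only compiled when exceptions are enabled
  }
#endif
}
```

Because the regular block always returns, neither flavor leaves a path without a return statement, and no code after the block becomes unreachable.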

7. What to do when an exception is seen ?

Nothing.

Do not try to log it, do not call abort, nothing.

We cannot log anything from the API (the SDK may not even be present), and we definitely cannot take the application down with an abort.

The worst case is that opentelemetry-cpp will behave like a noop brick if it fails internally.

We can have an assert in debug or in maintainer mode to catch failures in CI, but cannot do more than that.
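One possible shape for "do nothing in production, assert in maintainer mode" is sketched below. `DEMO_MAINTAINER_MODE`, `DEMO_INTERNAL_FAILURE`, `RecordInternal`, and `Record` are all hypothetical names invented for this illustration, and the sketch assumes a build with exceptions enabled (the conditional-compilation guards from point 6 are left out for brevity).

```cpp
#include <cassert>
#include <stdexcept>

// DEMO_MAINTAINER_MODE is a hypothetical build flag, e.g. set only in CI.
#if defined(DEMO_MAINTAINER_MODE)
#  define DEMO_INTERNAL_FAILURE() assert(false && "internal failure in API code")
#else
#  define DEMO_INTERNAL_FAILURE() ((void)0)
#endif

// Hypothetical internal operation that may throw.
inline void RecordInternal(bool fail)
{
  if (fail)
  {
    throw std::runtime_error("internal failure");
  }
}

// Returns true if the data was recorded, false if it was silently dropped.
// The return value exists only to make the sketch observable in a test.
inline bool Record(bool fail) noexcept
{
  try
  {
    RecordInternal(fail);
    return true;
  }
  catch (...)
  {
    DEMO_INTERNAL_FAILURE();  // fires only in maintainer-mode builds with asserts on
    return false;             // otherwise: no log, no abort, behave like a no-op
  }
}
```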

8. Internal supporting code does not need to be noexcept

Code in the actual API surface, used by the application, needs to be noexcept (tracer provider, tracer, span, etc).

Supporting code, like StringUtil::Trim, is not supposed to be used directly; it is there to implement other APIs.

Because of this, I think we can relax internal helpers to not be noexcept, instead of adding try-catch blocks inside the helper.
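A rough sketch of the split described in point 8: the helper is allowed to throw, and the single try-catch lives in the noexcept entry point that calls it. The `detail::Trim` and `TrimmedLength` functions here are hypothetical and use `std::string`; they are not the real `StringUtil::Trim`, and the sketch assumes exceptions are enabled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace detail
{
// Internal helper, deliberately NOT noexcept: str.at() throws
// std::out_of_range when `left` or `right` is outside the string.
inline std::string Trim(const std::string &str, std::size_t left, std::size_t right)
{
  while (left <= right && std::isspace(static_cast<unsigned char>(str.at(left))))
  {
    ++left;
  }
  while (left <= right && std::isspace(static_cast<unsigned char>(str.at(right))))
  {
    --right;
  }
  return str.substr(left, 1 + right - left);
}
}  // namespace detail

// Hypothetical public entry point: noexcept, with the only try-catch block.
// Returns the trimmed length, or 0 if the helper failed for any reason.
inline std::size_t TrimmedLength(const std::string &raw,
                                 std::size_t left,
                                 std::size_t right) noexcept
{
  try
  {
    return detail::Trim(raw, left, right).size();
  }
  catch (...)
  {
    return 0;
  }
}
```

The helper stays simple and honest about what it can throw, while callers of the public function still get the noexcept guarantee.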

@dbarker
Member Author

dbarker commented Apr 13, 2026

Thanks for the feedback @marcalff. The points make sense. I'll remove the macros in the PR and we can follow up in a SIG meeting on next steps more broadly for error handling.

> 1. Otel Instrumentation should not kill the application

Your point is focused on instrumentation (using the API) and makes sense. The spec does a good job of highlighting this, and it also allows cases where the API/SDK may terminate the application on initialization (see error handling):

  1. API methods MUST NOT throw unhandled exceptions when used incorrectly by end users.
    The API and SDK SHOULD provide safe defaults for missing or invalid arguments.
    For instance, a name like empty may be used if the user passes in null as the span name argument during Span construction.
  2. The API or SDK MAY fail fast and cause the application to fail on initialization, e.g. because of a bad user config or environment, but MUST NOT cause the application to fail later at runtime, e.g. due to dynamic config settings received from the Collector.
  3. The SDK MUST NOT throw unhandled exceptions for errors in their own operations.
    For example, an exporter should not throw an exception when it cannot reach the endpoint to which it sends telemetry data.
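The "safe defaults for missing or invalid arguments" rule quoted above could look like the following in C++. This is only a sketch: `SafeSpanName` is a hypothetical helper, not an opentelemetry-cpp function, and treating an empty string the same as null is an extra assumption beyond what the spec text says.

```cpp
// Hypothetical helper: returns a usable span name for any input.
// A null pointer (and, as an added assumption, an empty string) maps to "empty".
inline const char *SafeSpanName(const char *name) noexcept
{
  return (name == nullptr || name[0] == '\0') ? "empty" : name;
}
```

The function does not allocate, so it cannot throw and can be called from a noexcept API method.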

Failing fast on initialization, and perhaps a user-configurable error response (fail loudly or silently), does seem desirable (see the last point below).

> 3. Memory allocation and noexcept

Memory allocation is an area for discussion, and the codebase is mixed. In most cases memory is allocated with plain new, which may throw. Some cases use new (std::nothrow), some use placement new, and some use the modern std::make_unique / std::make_shared.
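For reference, the difference between plain new and new (std::nothrow) can be sketched as below. `Span` and `TryMakeSpan` are hypothetical names for this illustration only. Note that std::nothrow only covers allocation failure; an exception thrown by the constructor itself would still propagate.

```cpp
#include <memory>
#include <new>

// Hypothetical payload type with a constructor that cannot throw.
struct Span
{
  int id = 0;
};

// Plain `new Span()` reports allocation failure by throwing std::bad_alloc.
// `new (std::nothrow) Span()` returns nullptr instead, so this function can
// honestly be noexcept. The caller must then handle the null case, for
// example by substituting a no-op object.
inline std::unique_ptr<Span> TryMakeSpan() noexcept
{
  return std::unique_ptr<Span>(new (std::nothrow) Span());
}
```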

> 4. Different build flavors
>
> Applications can be built with, or without, exception support, and we need to accommodate both.

Is building without exception support a requirement just for the API, or does it also cover the SDK, exporters, and detectors? Given the dependencies the project has, we may not be able to provide an exception-free build of all components that meets the requirements of point 1. The Bazel noexcept build doesn't include all components currently.

> 7. What to do when an exception is seen ?
>
> We can have an assert in debug or in maintainer mode to catch failures in CI

This makes sense, and I think it should be a requirement for the project (CI should fail on errors that would be ignored in production; a maintainer mode flag is a great idea).

The spec gives some guidance that extends this to users who may also want a clear signal that something is going wrong.

> SDK implementations MUST allow end users to change the library's default error handling behavior for relevant errors.
> Application developers may want to run with strict error handling in a staging environment to catch invalid uses of the API, or malformed config.
> Note that configuring a custom error handler in this way is the only exception to the basic error handling principles outlined above.
> The mechanism by which end users set or register a custom error handler should follow language-specific conventions.

class StringUtil
{
public:
static nostd::string_view Trim(nostd::string_view str, size_t left, size_t right) noexcept
Member Author


@marcalff this is a good example of a method that may be considered an implementation detail for the API but is currently public. Would removing noexcept from this method be acceptable?


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CODE HEALTH] clang-tidy reports bugprone-exception-escape warnings in API

3 participants