Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-39898: [C++] Add support for OpenTelemetry logging #39905

Merged
merged 32 commits into from
Jun 10, 2024

Conversation

benibus
Copy link
Collaborator

@benibus benibus commented Feb 2, 2024

Rationale for this change

Supporting OTel logs will help us improve diagnostics/debugging where OTel tracing is utilized.

What changes are included in this PR?

Primary changes:

  • Bumps opentelemetry-cpp version to 1.13.0 to access the stable logs SDK
  • Introduces a new ARROW_TELEMETRY module for these additions (and in anticipation of future OpenMetrics support)
  • Integrates Otel logging facilities, provides telemetry::Logger class that wraps an OTel logger and an API for creating loggers via OTel's global logger provider
  • Adds developer-friendly log record exporters, mimicking the already-existing span exporters

Some auxiliary, but significant additions:

  • Adds an extended/re-imagined version of the current ArrowLog APIs that aims to be more flexible and configurable - which the OTel loggers currently utilize. Notable details:
    • Adds an abstract util::Logger class that enables creating custom loggers.
    • Adds a LoggerRegistry for global access to an arbitrary number of loggers
    • Adds new ARROW_LOG_* macros that take individual loggers as a parameter. Additionally, they can be stripped at compile time based on a minimum log level
    • Definitely looking for opinions on this part of the PR, specifically...
  • Adds ArrowLogLevel::ARROW_TRACE as the lowest log level to mirror the equivalent OTel enums

NOTE: I've added some log statements that utilize the OTel facilities to Flight/FlightSQL - which are driven by the FlightSQL server tests. These are for demonstrative purposes only and I intend to remove them prior to merge

Are these changes tested?

Yes (although the OTel-specific tests are currently light)

Are there any user-facing changes?

This will introduce new APIs that are likely to be public.

@benibus
Copy link
Collaborator Author

benibus commented Mar 29, 2024

I've recently updated the PR description, so check that out for a rundown on all the high-level changes. Let me know if anything needs clarification.

As mentioned, I've added some temporary log statements to Flight/FlightSQL for demonstrative purposes. You can output the logs to the console in JSON format by navigating to the test directory and running these commands

$ export ARROW_LOGGING_BACKEND=arrow_otlp_stderr
$ ./arrow-flight-sql-test --gtest_filter=TestFlightSqlServer.*

For example, here's the output of TestFlightSqlServer.Transactions (sorry for the mass of text)

{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772449837699","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"QFiT8OhS2gEdhuaGKDZknQ==","spanId":"VNA99AMnaN4=","observedTimeUnixNano":"1711739772450081861"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772451542494","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type=BeginTransaction"},"flags":1,"traceId":"QFiT8OhS2gEdhuaGKDZknQ==","spanId":"VNA99AMnaN4=","observedTimeUnixNano":"1711739772451570825"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightSqlServer"},"logRecords":[{"timeUnixNano":"1711739772451960471","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type: BeginTransaction"},"flags":1,"traceId":"QFiT8OhS2gEdhuaGKDZknQ==","spanId":"VNA99AMnaN4=","observedTimeUnixNano":"1711739772451996203"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772453054587","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"pfxIlUKZvYi0wSkmKDfk2g==","spanId":"gSjdSAw4Nrg=","observedTimeUnixNano":"1711739772453082426"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772453222467","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type=BeginSavepoint"},"flags":1,"traceId":"pfxIlUKZvYi0wSkmKDfk2g==","spanId":"gSjdSAw4Nrg=","observedTimeUnixNano":"1711739772453233715"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightSqlServer"},"logRecords":[{"timeUnixNano":"1711739772453444399","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type: BeginSavepoint"},"flags":1,"traceId":"pfxIlUKZvYi0wSkmKDfk2g==","spanId":"gSjdSAw4Nrg=","observedTimeUnixNano":"1711739772453469248"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772454404095","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"Jr4LyoTLAmKcz+gMb8t8ug==","spanId":"nUqWuOYIIvQ=","observedTimeUnixNano":"1711739772454420509"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772456360801","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"MB66GcI+6RabHyV3LJerdg==","spanId":"cAACYu/WDhk=","observedTimeUnixNano":"1711739772456377024"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772458600542","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"Wei17Wf4h+539J0iBIWP9w==","spanId":"wif4cir67v4=","observedTimeUnixNano":"1711739772458616434"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightSqlServer"},"logRecords":[{"timeUnixNano":"1711739772459050588","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoPut] command: \nDtype.googleapis.com/arrow.flight.protocol.sql.CommandStatementUpdate\u0012{\ngINSERT INTO intTable (keyName, value) VALUES ('KEYNAME1', 1001), ('KEYNAME2', 1002), ('KEYNAME3', 1003)\u0012\u001005PCF92MMXFS12FL"},"flags":1,"traceId":"Wei17Wf4h+539J0iBIWP9w==","spanId":"wif4cir67v4=","observedTimeUnixNano":"1711739772459064887"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772459768689","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"MpGqKE8VBaz3WIzDxH/tAQ==","spanId":"GFcj+01NHPM=","observedTimeUnixNano":"1711739772459790415"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772460918014","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"vVoMtgkxxI3XFGXw+b1JgA==","spanId":"h7f9W8F6VK4=","observedTimeUnixNano":"1711739772460938131"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772462046046","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"vwEi19XUsWuLxe30Ig+DBw==","spanId":"mYOcCf5QMuU=","observedTimeUnixNano":"1711739772462059053"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772462301995","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type=EndTransaction"},"flags":1,"traceId":"vwEi19XUsWuLxe30Ig+DBw==","spanId":"mYOcCf5QMuU=","observedTimeUnixNano":"1711739772462322558"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightSqlServer"},"logRecords":[{"timeUnixNano":"1711739772462482344","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type: EndTransaction"},"flags":1,"traceId":"vwEi19XUsWuLxe30Ig+DBw==","spanId":"mYOcCf5QMuU=","observedTimeUnixNano":"1711739772462491832"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772462994109","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"KivBYU3SYBB9eR3DlO5GkA==","spanId":"8Gre7RqQom0=","observedTimeUnixNano":"1711739772463009907"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772463435875","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type=EndTransaction"},"flags":1,"traceId":"KivBYU3SYBB9eR3DlO5GkA==","spanId":"8Gre7RqQom0=","observedTimeUnixNano":"1711739772463449938"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightSqlServer"},"logRecords":[{"timeUnixNano":"1711739772463732086","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type: EndTransaction"},"flags":1,"traceId":"KivBYU3SYBB9eR3DlO5GkA==","spanId":"8Gre7RqQom0=","observedTimeUnixNano":"1711739772463746873"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772464357772","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"1wLLwXzb1LVyRV1yTCuY9w==","spanId":"N69bPyJ93TA=","observedTimeUnixNano":"1711739772464377664"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772464619925","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type=EndTransaction"},"flags":1,"traceId":"1wLLwXzb1LVyRV1yTCuY9w==","spanId":"N69bPyJ93TA=","observedTimeUnixNano":"1711739772464630627"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightSqlServer"},"logRecords":[{"timeUnixNano":"1711739772464742229","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[DoAction] action.type: EndTransaction"},"flags":1,"traceId":"1wLLwXzb1LVyRV1yTCuY9w==","spanId":"N69bPyJ93TA=","observedTimeUnixNano":"1711739772464751408"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772465233033","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"PhIwV9Y1sLt4kHbeWwd7JA==","spanId":"95REOyF4l4Y=","observedTimeUnixNano":"1711739772465246899"}]}]}
{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"arrow.unknown_service"}},{"key":"service.namespace","value":{"stringValue":"org.apache"}},{"key":"service.instance.id","value":{"stringValue":"arrow.unknown_id"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"service.version","value":{"stringValue":"16.0.0-SNAPSHOT"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.13.0"}}]},"scopeLogs":[{"scope":{"name":"FlightGrpcServer"},"logRecords":[{"timeUnixNano":"1711739772465914677","severityNumber":"SEVERITY_NUMBER_INFO","severityText":"INFO","body":{"stringValue":"[MakeCallContext] Started call for middleware: arrow::flight::TracingServerMiddleware"},"flags":1,"traceId":"kYZd96zA63hi/n3GMtT6KA==","spanId":"v22t7SiqeUM=","observedTimeUnixNano":"1711739772465927804"}]}]}

@benibus benibus marked this pull request as ready for review March 29, 2024 21:45
@benibus benibus requested a review from lidavidm as a code owner March 29, 2024 21:45
}

otel::sdk::resource::Resource MakeResource(const ServiceAttributes& service_attributes) {
// TODO: We could also include process info...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could we get to a decision on whether to include these attributes? It seems like it could be useful information but, since users would typically be expected to correlate logs with traces, could this also be superfluous?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since users would typically be expected to correlate logs with traces, could this also be superfluous?

To clarify, is this because there would already be process info associated with the trace?

For the record, I don't think this is an ideal place to handle resource creation now that I'm looking at the big picture. This creates and sets attributes for a global OTel resource that could be shared by both a LoggerProvider and TracerProvider - but it's currently hidden in the logging implementation exclusively.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, is this because there would already be process info associated with the trace?

Yep, that was my thought.

And your point about sharing makes sense to me. 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libraries shouldn't be setting this info. It should be configured once at the top level by the ultimate application. Arrow is just weird because in PyArrow you don't have a way to set it, but I wouldn't want to push this on C++ users.

@github-actions github-actions bot added awaiting committer review Awaiting committer review awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting review Awaiting review awaiting committer review Awaiting committer review awaiting changes Awaiting changes labels Apr 1, 2024
@benibus
Copy link
Collaborator Author

benibus commented Apr 5, 2024

@lidavidm Somewhat related question: I've noticed that arrow::internal::tracing::GetTracer (which, AFAIK, is used to create the vast majority of Arrow's spans) will only ever return the TracerProvider that we implement internally. Is this intended? I believe that would nullify any exporter/processor configuration done by the host application.

@lidavidm
Copy link
Member

lidavidm commented Apr 5, 2024

It probably shouldn't. But problems like this are why I favor just opening up the gRPC part of Flight so that the user can configure everything themselves...

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize I started a lot of this work, but we should more seriously consider exposing the underlying gRPC bits and letting applications configure everything themselves instead of trying to wrap a larger and larger API surface - it will be easier for us and easier for developers, and it means we can properly defer the maintenance to gRPC/OpenTelemetry

-DWITH_OTLP_HTTP=ON
-DWITH_OTLP_GRPC=OFF
# Disabled because it seemed to cause linking errors. May be worth a closer look.
-DWITH_FUNC_TESTS=OFF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we want tests in the first place?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think we can just delete the comment and leave this option off

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wouldn't want them, but it's enabled by default (in this version, at least).

cpp/cmake_modules/DefineOptions.cmake Outdated Show resolved Hide resolved
cpp/src/arrow/flight/flight_test.cc Show resolved Hide resolved
@@ -63,6 +64,7 @@ arrow::Result<FlightDescriptor> GetFlightDescriptorForCommand(
arrow::Result<std::unique_ptr<FlightInfo>> GetFlightInfoForCommand(
FlightSqlClient* client, const FlightCallOptions& options,
const google::protobuf::Message& command) {
ARROW_FLIGHT_OTELLOG_SQL_CLIENT(INFO, "[Example message] func=", __func__);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't removed from the PR?

cpp/src/arrow/flight/sql/server.cc Show resolved Hide resolved
@@ -928,6 +929,7 @@ class GrpcClientImpl : public internal::ClientTransport {
Status GetFlightInfo(const FlightCallOptions& options,
const FlightDescriptor& descriptor,
std::unique_ptr<FlightInfo>* info) override {
ARROW_FLIGHT_OTELLOG_CLIENT(INFO, "[Example message] func=", __func__);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm wary of having any logs in the library in the first place - I don't want to be in a position where we accidentally log a sensitive payload

cpp/src/arrow/telemetry/logging.cc Outdated Show resolved Hide resolved
cpp/src/arrow/telemetry/logging.cc Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we plan to replace GLog with this facade?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I attempted to design it so that it could replace the existing logging system if we wanted to go that route.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels May 14, 2024
-DWITH_OTLP_HTTP=ON
-DWITH_OTLP_GRPC=OFF
# Disabled because it seemed to cause linking errors. May be worth a closer look.
-DWITH_FUNC_TESTS=OFF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think we can just delete the comment and leave this option off

-DWITH_FUNC_TESTS=OFF
# These options are slated for removal in v1.14 and their features are deemed stable
# as of v1.13. However, setting their corresponding ENABLE_* macros in headers seems
# finicky - resulting in build failures or ABI-related runtime errors during HTTP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify what you mean by setting these macros? IIUC they should be defined just by linking to opentelemetry_api https://github.com/open-telemetry/opentelemetry-cpp/pull/2435/files#diff-1186bce31e69b2154d84fff38d88b12365327134f5b3ed31e734889784a948dbL128-L131

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked into it a bit more. The issue seems to be that the macros do get defined for opentelemetry-cpp, but they're not exposed to Arrow - which results in an ABI mismatch in the affected headers. I'm pretty sure we're linking against opentelemetry_api just fine because manually defining the options in ThirdPartyToolchain.cmake (around here) solves the issue, e.g:

target_compile_definitions(opentelemetry-cpp::api
                           INTERFACE
                           ENABLE_HTTP_SSL_PREVIEW
                           ENABLE_OTLP_HTTP_SSL_PREVIEW
                           ENABLE_HTTP_SSL_TLS_PREVIEW
                           ENABLE_OTLP_HTTP_SSL_TLS_PREVIEW)

I haven't found a way to get CMake to automatically pass these options to us, though.

Alternatively, we could further bump the version to 1.14 or 1.15 where this becomes a nonissue. v.1.15 seems to work interchangeably from what I've tested.

cpp/cmake_modules/DefineOptions.cmake Outdated Show resolved Hide resolved
cpp/src/arrow/flight/sql/test_app_cli.cc Outdated Show resolved Hide resolved
cpp/src/arrow/flight/sql/test_server_cli.cc Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels May 22, 2024
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me. I don't really want to wrap so much of a third party API but given the constraints we impose on ourselves we have no real choice.

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels May 24, 2024
@benibus benibus requested a review from bkietz June 4, 2024 15:37
@zeroshade zeroshade merged commit f2057c5 into apache:main Jun 10, 2024
37 of 40 checks passed
@kou
Copy link
Member

kou commented Jun 11, 2024

It seems that this breaks CI:

https://github.com/apache/arrow/actions/runs/9453191775/job/26038038570#step:7:3293

80/96 Test #83: arrow-telemetry-test .........................***Failed    0.08 sec
Running arrow-telemetry-test, redirecting output into /build/cpp/build/test-logs/arrow-telemetry-test.txt (attempt 1/1)
Running main() from _deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from TestLogging
[ RUN      ] TestLogging.Basics
/arrow/cpp/src/arrow/telemetry/telemetry_test.cc:71: Failure
Expected: (logger) != (nullptr), actual: (nullptr) vs (nullptr)
/arrow/cpp/src/arrow/telemetry/telemetry_test.cc:71: Failure
Expected: (logger) != (nullptr), actual: (nullptr) vs (nullptr)
[  FAILED  ] TestLogging.Basics (0 ms)
[----------] 1 test from TestLogging (0 ms total)

[----------] Global test environment tear-down
/arrow/cpp/src/arrow/telemetry/telemetry_test.cc:62: Failure
Value of: internal::ShutdownOtelLoggerProvider()
  Actual: false
Expected: true
[==========] 1 test from 1 test suite ran. (0 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] TestLogging.Basics

 1 FAILED TEST
/build/cpp/src/arrow/telemetry

Could you open a new issue and take a look at this?

Copy link

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit f2057c5.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 42 possible false positives for unstable benchmarks that are known to sometimes produce them.

@mapleFU
Copy link
Member

mapleFU commented Jun 11, 2024

I've meet the problem and created an issue: #42104

@pitrou
Copy link
Member

pitrou commented Jun 12, 2024

I'm not sure why this was merged with some CI jobs explicitly failing on the telemetry tests.

@benibus
Copy link
Collaborator Author

benibus commented Jun 12, 2024

I've opened a PR that should address the CI failures and leftover example logs #42122

kou pushed a commit that referenced this pull request Jun 24, 2024
…2122)

### Rationale for this change

Follow-up to #39905, as some issues popped after merging.

### What changes are included in this PR?

- Fixes tests failing due to the `ARROW_LOGGING_BACKEND` env var not being defined
- Removes example log statements accidentally left in PR

### Are these changes tested?

Yes

### Are there any user-facing changes?

No

* GitHub Issue: #42104

Authored-by: benibus <bpharks@gmx.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] Add support for OpenTelemetry logging
8 participants