Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

otelttrpc: implement unary interceptors for opentelemetry instrumentation. #145

Closed
wants to merge 6 commits into from

Conversation

klihub
Copy link
Member

@klihub klihub commented Jun 21, 2023

This patch series adds a unary client and server interceptors
for opentelemetry tracing instrumentation. The interceptors
can be passed as ttrpc.ClientOpts and ttrpc.ServerOpt to
ttrpc during client and server creation with code like this:

client := ttrpc.NewClient(
    conn,
    ttrpc.UnaryClientInterceptor(
        otelttrpc.UnaryClientInterceptor(),
    ),
)

and

server, err := ttrpc.NewServer(
    ttrpc.WithUnaryServerInterceptor(
        otelttrpc.UnaryServerInterceptor(),
    ),
)

These interceptors will then automatically handle generating
trace Spans for all called and served unary method calls. If
the rest of the code is properly set up to collect and export
tracing data to opentelemetry, these spans should show up as
part of the collected traces.

This code was written by lifting over the corresponding gRPC
instrumentation bits from the opentelemetry contrib go package,
leaving out bits considered unnecessary for an initial version,
then replacing gRPC-specific bits by corresponding ttRPC code.

The left out bits were intercepting filters as unnecessary and
the streaming interceptors as too complex for a first version.
A subsequent PR should implement streaming interceptors.
Issue #147 has been created to track this initial version and
the follow-up with streaming support.

TBH, we also could use more proper tests. Perhaps those can
be addressed with follow-up PRs.

UPDATE 1:
Additionally, since automatic instrumentation is implemented
using instrumentation-specific interceptors, the patch series
adds support for chaining interceptors both for unary client
and unary server interceptors. This should allow users to both
instrument their ttRPC services while keep using any other
interceptors they might have.

Both the client and the server can be be set up with explicitly
chained interceptors using the new WithChainUnaryClientInterceptor
and WithChainUnaryServerInterceptor options. Additionally,
the original WithUnaryClientInterceptor and WithUnaryServerInterceptor
options have been updated to implicitly chain interceptors if they are
used multiple times among the options.

UPDATE 2:
Reworked Makefile to better handle sub-/multiple go modules.

UPDATE 3:
Lifted over the sample instrumentation client/server example from otelgrpc,
gutted it out to exclude features we don't support yet (streaming server and
client), and adjusted it for using ttrpc.

@klihub klihub force-pushed the devel/opentelemetry branch 5 times, most recently from cffb04a to 010a308 Compare June 22, 2023 15:55
@klihub
Copy link
Member Author

klihub commented Jun 22, 2023

Updated PR, turning otelttrpc into a module of its own.

I have one immediate question related to naming. Would it be better to drop the intermediate instrumentation from the directory path and (re)name the final directory to otel ? Then the import path would become github.com/containerd/ttrpc/otel. Folks could import-rename it to otelttrpc or whatever they prefer if they also need to import an opentelemetry package with a conflicting final name component.

@klihub
Copy link
Member Author

klihub commented Jun 23, 2023

Another missing thing I just realized is options for chaining interceptors. I'll take a look at that.

@cpuguy83
Copy link
Member

Would it be better to drop the intermediate instrumentation

This seems fine.

and (re)name the final directory to otel

My issue here is it makes getting the imports right a manual process.
With otelttprc goimports/gopls will know exactly what you are trying to get at.

@klihub klihub force-pushed the devel/opentelemetry branch 14 times, most recently from ff2c52d to b56a384 Compare June 24, 2023 17:27
@klihub klihub requested a review from cpuguy83 June 26, 2023 12:55
@cpuguy83
Copy link
Member

Overall looks great, just a minor comment.

@klihub
Copy link
Member Author

klihub commented Jun 27, 2023

Overall looks great, just a minor comment.

@cpuguy83 I think there is at least one thing left to do in this PR, now that we turned otelttrpc into a (sub)package of its own. We need to update the Makefile so that for selected targets (for instance build and test) it knows to explicitly descend into otelttrpc and do the right thing there...

@klihub klihub force-pushed the devel/opentelemetry branch 2 times, most recently from a9993cc to 9a2787e Compare June 27, 2023 20:35
@klihub klihub force-pushed the devel/opentelemetry branch 3 times, most recently from 4e2a1ac to f3c586e Compare July 13, 2023 14:32
@klihub
Copy link
Member Author

klihub commented Jul 13, 2023

Should these commits be split out to separate PR's of their own:

  • server_test: wait for OnClose in TestClientEOF
  • .github: give more slack for build+tests

They are addressing problems which are not directly related to this PR. Especially the latter one which simply increases a workflow timeout.

@estesp
Copy link
Member

estesp commented Jul 13, 2023

Should these commits be split out to separate PR's of their own:

  • server_test: wait for OnClose in TestClientEOF
  • .github: give more slack for build+tests

They are addressing problems which are not directly related to this PR. Especially the latter one which simply increases a workflow timeout.

I think it would be good to break them out; especially because the rest of the commits are all directly about the otel feature, and if we ever had to migrate that or revert it.. would be better to not have unrelated commits associated with this PR

@estesp
Copy link
Member

estesp commented Jul 13, 2023

I know there is the example tucked under the new module with a README, but it seems like it would be good to have some information on this in the main README, especially as essentially there is a new "library" within the repo with its own build/test, etc. It doesn't need to be exhaustive but maybe one new section in the main project readme to guide anyone looking for OTEL support here.

@klihub
Copy link
Member Author

klihub commented Jul 24, 2023

Should these commits be split out to separate PR's of their own:

  • server_test: wait for OnClose in TestClientEOF
  • .github: give more slack for build+tests

They are addressing problems which are not directly related to this PR. Especially the latter one which simply increases a workflow timeout.

I think it would be good to break them out; especially because the rest of the commits are all directly about the otel feature, and if we ever had to migrate that or revert it.. would be better to not have unrelated commits associated with this PR

@estesp I split them out to to #150 and #151. I also split out, as #152, the following 2 commits which implement client and server side unary interceptor chaining:

  • client: implement UnaryClientInterceptor chaining
  • server: implement UnaryServerInterceptor chaining
    While they are related to this PR, they are self-contained and not a strict directly requirement to implement open-telemetry instrumentation.

@klihub klihub force-pushed the devel/opentelemetry branch 6 times, most recently from 10d82e5 to 99606e6 Compare August 9, 2023 15:43
@klihub
Copy link
Member Author

klihub commented Aug 9, 2023

I know there is the example tucked under the new module with a README, but it seems like it would be good to have some information on this in the main README, especially as essentially there is a new "library" within the repo with its own build/test, etc. It doesn't need to be exhaustive but maybe one new section in the main project readme to guide anyone looking for OTEL support here.

@estesp I added a README under otelttrpc with a brief description and info about usage. Linked to it from the main README under a new 'instrumentation' chapter.

@klihub klihub marked this pull request as draft August 16, 2023 11:51
klihub and others added 6 commits September 14, 2023 15:12
This commit adds a unary client and server interceptors
for opentelemetry tracing instrumentation. The interceptors
can be passed as ttrpc.ClientOpts and ttrpc.ServerOpt to
ttrpc during client and server creation with code like this:

    client := ttrpc.NewClient(
        conn,
        ttrpc.UnaryClientInterceptor(
            otelttrpc.UnaryClientInterceptor(),
        ),
    )

and

    server, err := ttrpc.NewServer(
        ttrpc.WithUnaryServerInterceptor(
            otelttrpc.UnaryServerInterceptor(),
        ),
    )

These interceptors will then automatically handle generating
trace Spans for all called and served unary method calls. If
the rest of the code is properly set up to collect and export
tracing data to opentelemetry, these spans should show up as
part of the collected traces.

This code was written by lifting over the corresponding gRPC
instrumentation bits from the opentelemetry contrib go package,
leaving out bits considered unnecessary for an initial version,
then replacing gRPC-specific bits by corresponding ttRPC code.

The left out bits were intercepting filters as unnecessary and
the streaming interceptors as too complex for a first version.

Co-authored-by: Krisztian Litkey <krisztian.litkey@intel.com>
Co-authored-by: Swagat Bora <sbora@amazon.com>
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Borrow minimal example and instrumentation tests from PR#134.

Co-authored-by: Swagat Bora <sbora@amazon.com>
Co-authored-by: Krisztian Litkey <krisztian.litkey@intel.com>
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Rework Makefile, trying to make subpackage handling easier.
Hook subpackage processing into most of the common targets.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Our CI workflow uses the golangci-lint action for linting
instead of our Makefile. Add steps for running the action
for the otelttrpc subpackage as well.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Lift over the sample client/server code from otelgrpc,
gutting out any bits we don't support yet (streaming),
and adapting it for (otel)ttrpc.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
And reference it under instrumentation from the main README.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub marked this pull request as ready for review September 14, 2023 15:01
@klihub
Copy link
Member Author

klihub commented Jan 13, 2024

@dmcgowan @cpuguy83 I split the remaining commits out into a separate repo. If we could create a new target repo under the containerd umbrella for the ttrpc/otel instrumentation bits, I could close this PR and file one against that.

@klihub
Copy link
Member Author

klihub commented Jan 26, 2024

Essentially the same code with more up-to-date deps is now available as github.com/containerd/otelttrpc. Closing this PR.

@klihub klihub closed this Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants