Add Go gRPC probes for metrics and traces #46

Merged 18 commits on Apr 12, 2023

Contributor

@grcevski grcevski commented Apr 5, 2023

This PR adds support for tracking gRPC calls with the following parameters:

  • gRPC method
  • gRPC status
  • gRPC peer
  • gRPC host and port

The current support relies on us having DWARF symbols for the various fields.

Relates to #20

@grcevski grcevski added the enhancement New feature or request label Apr 5, 2023
@grcevski grcevski requested a review from mariomac April 5, 2023 19:19
@grcevski grcevski self-assigned this Apr 5, 2023
@codecov-commenter

codecov-commenter commented Apr 5, 2023

Codecov Report

Merging #46 (3c7a1c5) into main (abf9c46) will increase coverage by 0.94%.
The diff coverage is 48.32%.

@@            Coverage Diff             @@
##             main      #46      +/-   ##
==========================================
+ Coverage   50.30%   51.24%   +0.94%     
==========================================
  Files          14       14              
  Lines         833      927      +94     
==========================================
+ Hits          419      475      +56     
- Misses        378      415      +37     
- Partials       36       37       +1     
Flag Coverage Δ
unittests 51.24% <48.32%> (+0.94%) ⬆️

Impacted Files                 Coverage Δ
pkg/ebpf/nethttp/nethttp.go    0.00% <0.00%> (ø)
pkg/goexec/offsets.go          0.00% <0.00%> (ø)
pkg/pipe/config.go             53.33% <0.00%> (ø)
pkg/goexec/structmembers.go    66.98% <46.15%> (-2.41%) ⬇️
pkg/transform/spanner.go       80.82% <75.00%> (-3.11%) ⬇️
pkg/export/otel/metrics.go     80.68% <94.73%> (+3.53%) ⬆️
pkg/export/otel/traces.go      77.01% <94.73%> (+4.15%) ⬆️
pkg/pipe/instrumenter.go       82.60% <100.00%> (+3.66%) ⬆️

@mariomac
Contributor

If you want to know why integration tests are failing, remember that here you have the docker compose logs from the last run: https://github.com/grafana/ebpf-autoinstrument/actions/runs/4662697238?pr=46

The instrumenter is failing in the test-suite-nodebug logs:

integration-autoinstrumenter-1  | time=2023-04-11T01:02:56.432Z level=ERROR msg="can't instantiate eBPF tracer" err="loading and assigning BPF objects: field UprobeServerHandleStreamReturn: program uprobe_server_handleStream_return: load program: invalid argument: invalid func unknown#177 (274 line(s) omitted)"

You can probably fix the tests by adding the required gRPC fields here and running make update-offsets: https://github.com/grafana/ebpf-autoinstrument/blob/main/configs/offsets/tracker_input.json

For example (but adding your required structs and fields):

  "google.golang.org/grpc": {
    "versions": ">= 1.3",
    "fields": {
      "google.golang.org/grpc/internal/transport.Stream": [
        "method",
        "id",
        "ctx"
      ],
      "google.golang.org/grpc.ClientConn": [
        "target"
      ],
      "golang.org/x/net/http2.MetaHeadersFrame": [
        "Fields"
      ],
      "golang.org/x/net/http2.FrameHeader": [
        "StreamID"
      ]
    }
  }

Comment on lines +23 to +24
#define EVENT_HTTP_REQUEST 1
#define EVENT_GRPC_REQUEST 2
Contributor

I think that, for now, it's fine. But once we merge the PR and make sure everything is working, we could refactor a bit to provide two eBPF instrumenters separately (one for HTTP and another for GRPC, in different files), and communicate both with the userspace via separate ringbuffers, so we don't need to differentiate the event type.

Contributor Author

Yes, I agree. I'll refactor this in a follow-up, after we successfully merge the goroutine PR.
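
To illustrate what the shared ring buffer costs on the userspace side, here is a small Go sketch of dispatching on the event type. The two constants mirror the #define lines under review. The record layout (event type in the first byte) and the classify function are assumptions made for the example, not the project's actual wire format.

```go
package main

import (
	"errors"
	"fmt"
)

// Values copied from the #define lines under review.
const (
	eventHTTPRequest = 1 // EVENT_HTTP_REQUEST
	eventGRPCRequest = 2 // EVENT_GRPC_REQUEST
)

var errEmptyRecord = errors.New("empty record")

// classify returns the protocol of a raw record read from a shared ring buffer.
// Assumption: the first byte of the record carries the event type.
func classify(record []byte) (string, error) {
	if len(record) == 0 {
		return "", errEmptyRecord
	}
	switch record[0] {
	case eventHTTPRequest:
		return "http", nil
	case eventGRPCRequest:
		return "grpc", nil
	default:
		return "", fmt.Errorf("unknown event type %d", record[0])
	}
}

func main() {
	for _, rec := range [][]byte{{1, 0xAA}, {2, 0xBB}, {9}} {
		proto, err := classify(rec)
		fmt.Println(proto, err)
	}
}
```

With one ring buffer per instrumenter, as proposed above, each reader would already know its protocol and this switch, along with its unknown-type error path, would no longer be needed.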

@mariomac
Contributor

It looks great, thank you! In addition to fixing the integration tests by adding prefetched indices for the GRPC fields, I'd also add a new integration test for GRPC. For example, adding this GRPC server to the docker-compose.yml file and some extra integration test cases:

@grcevski
Contributor Author

After I pushed the fix for the eBPF verification error reporting, I realized that whatever we were hitting there must've been an eBPF loader issue, perhaps related to the host OS kernel version. It was claiming a call to bpf_printk was unknown?!

integration-autoinstrumenter-1  | ; bpf_dbg_printk("st_ptr %lx, offset=%d, remote=%d, local=%d", st_ptr, grpc_stream_st_ptr_pos, grpc_st_remoteaddr_ptr_pos, grpc_st_localaddr_ptr_pos);
integration-autoinstrumenter-1  | 161: (18) r1 = 0xffff97a803817910
integration-autoinstrumenter-1  | 163: (71) r1 = *(u8 *)(r1 +0)
integration-autoinstrumenter-1  |  R0_w=inv(id=0) R1_w=map_value(id=0,off=0,ks=4,vs=1170,imm=0) R6=mem(id=0,ref_obj_id=4,off=0,imm=0) R7_w=inv(id=7) R8_w=map_value(id=0,off=680,ks=4,vs=1170,imm=0) R9=invP0 R10=fp0 fp-8_w=mmmmmmmm fp-16=mmmmmmmm fp-48=mmmmmmmm refs=4
integration-autoinstrumenter-1  | 164: (b7) r2 = 3
integration-autoinstrumenter-1  | ; bpf_dbg_printk("st_ptr %lx, offset=%d, remote=%d, local=%d", st_ptr, grpc_stream_st_ptr_pos, grpc_st_remoteaddr_ptr_pos, grpc_st_localaddr_ptr_pos);
integration-autoinstrumenter-1  | 165: (2d) if r2 > r1 goto pc+19
integration-autoinstrumenter-1  |  R0_w=inv(id=0) R1_w=inv(id=0,umin_value=3,umax_value=255,var_off=(0x0; 0xff)) R2_w=inv3 R6=mem(id=0,ref_obj_id=4,off=0,imm=0) R7_w=inv(id=7) R8_w=map_value(id=0,off=680,ks=4,vs=1170,imm=0) R9=invP0 R10=fp0 fp-8_w=mmmmmmmm fp-16=mmmmmmmm fp-48=mmmmmmmm refs=4
integration-autoinstrumenter-1  | ; bpf_dbg_printk("st_ptr %lx, offset=%d, remote=%d, local=%d", st_ptr, grpc_stream_st_ptr_pos, grpc_st_remoteaddr_ptr_pos, grpc_st_localaddr_ptr_pos);
integration-autoinstrumenter-1  | 166: (79) r1 = *(u64 *)(r10 -8)
integration-autoinstrumenter-1  | 167: (7b) *(u64 *)(r10 -48) = r1
integration-autoinstrumenter-1  | 168: (79) r1 = *(u64 *)(r8 +0)
integration-autoinstrumenter-1  |  R0_w=inv(id=0) R1_w=inv(id=0) R2_w=inv3 R6=mem(id=0,ref_obj_id=4,off=0,imm=0) R7_w=inv(id=7) R8_w=map_value(id=0,off=680,ks=4,vs=1170,imm=0) R9=invP0 R10=fp0 fp-8_w=mmmmmmmm fp-16=mmmmmmmm fp-48_w=mmmmmmmm refs=4
integration-autoinstrumenter-1  | 169: (7b) *(u64 *)(r10 -40) = r1
integration-autoinstrumenter-1  | 170: (18) r1 = 0xffff97a803817bf0
integration-autoinstrumenter-1  | 172: (79) r1 = *(u64 *)(r1 +0)
integration-autoinstrumenter-1  |  R0_w=inv(id=0) R1_w=map_value(id=0,off=736,ks=4,vs=1170,imm=0) R2_w=inv3 R6=mem(id=0,ref_obj_id=4,off=0,imm=0) R7_w=inv(id=7) R8_w=map_value(id=0,off=680,ks=4,vs=1170,imm=0) R9=invP0 R10=fp0 fp-8_w=mmmmmmmm fp-16=mmmmmmmm fp-40_w=mmmmmmmm fp-48_w=mmmmmmmm refs=4
integration-autoinstrumenter-1  | 173: (7b) *(u64 *)(r10 -32) = r1
integration-autoinstrumenter-1  | 174: (18) r1 = 0xffff97a803817bf8
integration-autoinstrumenter-1  | 176: (79) r1 = *(u64 *)(r1 +0)
integration-autoinstrumenter-1  |  R0_w=inv(id=0) R1_w=map_value(id=0,off=744,ks=4,vs=1170,imm=0) R2_w=inv3 R6=mem(id=0,ref_obj_id=4,off=0,imm=0) R7_w=inv(id=7) R8_w=map_value(id=0,off=680,ks=4,vs=1170,imm=0) R9=invP0 R10=fp0 fp-8_w=mmmmmmmm fp-16=mmmmmmmm fp-32_w=mmmmmmmm fp-40_w=mmmmmmmm fp-48_w=mmmmmmmm refs=4
integration-autoinstrumenter-1  | 177: (7b) *(u64 *)(r10 -24) = r1
integration-autoinstrumenter-1  | 178: (bf) r3 = r10
integration-autoinstrumenter-1  | 179: (07) r3 += -48
integration-autoinstrumenter-1  | 180: (18) r1 = 0xffff97a803817bc0
integration-autoinstrumenter-1  | 182: (b7) r2 = 43
integration-autoinstrumenter-1  | 183: (b7) r4 = 32
integration-autoinstrumenter-1  | 184: (85) call unknown#177
integration-autoinstrumenter-1  | invalid func unknown#177

We can see it's trying to generate the code for bpf_dbg_printk, which is just a macro that wraps a log-level check and a call to bpf_printk. The instruction at:

165: (2d) if r2 > r1 goto pc+19

is the log level check, so the call must be bpf_printk. For some reason earlier calls to bpf_dbg_printk work just fine.

To fix the problem, I simply removed those prints. I see no issues when running locally, so it must be kernel specific.

@grcevski grcevski requested a review from mariomac April 12, 2023 00:18
Contributor

@mariomac mariomac left a comment

LGTM. Good job!

@grcevski grcevski merged commit bc2f549 into grafana:main Apr 12, 2023
2 checks passed
@grcevski
Contributor Author

Thanks Mario!

@grcevski grcevski deleted the grpc_probe branch April 12, 2023 15:14