
Splitting eBPF sampler code #84

Merged
mariomac merged 4 commits into grafana:main from modular on Apr 27, 2023

Conversation

Contributor

@mariomac mariomac commented Apr 26, 2023

This will allow us to start adding new code in separate modules (e.g. PR #85). This code works, but it still needs some improvements, which I'd suggest adding in future PRs so we minimize merge conflicts with other tasks going on in parallel.

To be done in following PRs:

  • Document the code better.
  • Refactor/simplify some code parts to make them more understandable.
  • Reinsert the logInstrumentedFeatures functionality (it needs to be reimplemented for the new architecture).
  • Add unit tests for the tracer.go file, which is now more testable since we have removed the eBPF code from it.

@mariomac mariomac changed the title from "WIP: modularising samplers" to "Splitting eBPF sampler code" on Apr 27, 2023

@grcevski grcevski left a comment


I do have one concern that I wanted to discuss, to see if we can mitigate it somehow. With this change we have two separate ring buffers (HTTP and gRPC), each firing its own events. This separation will prevent us from correlating events across the two categories, which I think will limit what we can do in the future.

For example, at present we have no support for gRPC client calls, but I have a PR open for HTTP client calls. So technically, if we were to split the ring buffers, we would not be able to consistently generate nested server/client trace spans.

The trace creation API from OTel requires that we somehow have a parent span when we generate child spans, e.g.:

	ctx, sp := tracer.Start(parentCtx, traceName(span),
		trace2.WithTimestamp(span.RequestStart),
		trace2.WithSpanKind(...),
	)

With a single ring buffer our events are ordered, so with a gRPC server span which wraps an HTTP client span, we are guaranteed to see the server span start before the HTTP client event. But with two ring buffers, those events can be processed out of order.

If we make this change, I think we may not be able to accomplish all nested spans without adding some sort of external shared memory. HTTP server to HTTP client spans will work, but not any combination of gRPC and HTTP.

Even if we had one ring buffer, with two separate channels we would still have the out-of-order problem, I think.

// the autoinstrumenter ends abruptly)
// https://ants-gitlab.inf.um.es/jorgegm/xdp-tutorial/-/blob/master/basic04-pinning-maps/README.org
// we can share them later if we find is worth not including code per duplicate
struct {
Contributor


These two structs and the wakeup_data_bytes will be duplicated. I don't think there's any problem; I just want to make sure it was intentional, since we'll be using double the memory we did before if a program has both gRPC and HTTP. I think with the split there's no way around it: since they are separate programs, they'll have separate maps.

Contributor Author

@mariomac mariomac Apr 27, 2023


Yeah, I was aware of this duplication, but I was thinking of the scenario where 99% of executables have either HTTP or gRPC, but not both at the same time. But maybe I'm wrong.

Maybe, if we think it's a problem, we can just pin the maps to a global namespace, but then we would need to e.g. clean the maps between executions, or check whether we could have any problem if two auto-instrumenters run at the same time. That's why I left it for the future, but we can do it now if you think it's better.
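For reference, sharing a map across the two programs via libbpf's pin-by-name convention might look like the sketch below. The map name, key, and value types are hypothetical, and, as noted above, anything pinned under /sys/fs/bpf would need cleanup between runs:

```c
// Sketch only: a BTF-style map definition using libbpf's pin-by-name
// convention, so both the http and grpc programs would open the same
// pinned map instead of each carrying a private copy.
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u64);                    // goroutine id (illustrative)
    __type(value, struct ongoing_request); // hypothetical value type
    __uint(pinning, LIBBPF_PIN_BY_NAME);   // pinned under the bpffs mount
} ongoing_requests SEC(".maps");
```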

Contributor


OK, good. Yeah I'm not worried about the memory increase.

@mariomac
Contributor Author

@grcevski I see the out-of-order problem. Let me rethink the current implementation to see how we can minimize the chance.

Anyway, I'm not 100% sure we are currently safe against out-of-order processing, as eBPF programs can be executed on different CPUs.

@grcevski
Contributor

grcevski commented Apr 27, 2023

> @grcevski I see the out-of-order problem. Let me rethink the current implementation to see how we can minimize the chance.
>
> Anyway, I'm not 100% sure we are currently safe against out-of-order processing, as eBPF programs can be executed on different CPUs.

Ah, good point! So I was thinking in the meantime: perhaps we need to expose the BPF map with the ongoing HTTP and gRPC requests to the Go space. When it comes time to create the nested span, we look up the maps to see if we should first create the parent span, or simply continue with context.TODO. When the parent span's own event comes through, we check our in-memory cache to see if someone created it before; if they did, we pick it up and end it.

If you think this might work, I think we are good to go with this change; we can expose the map later. It would mean that the trace-span creator code is able to look up the ongoing goroutine IDs. An HTTP client call to, say, an external service would be able to check if there's a gRPC server span with the same goroutine ID and do the right thing.

@mariomac
Contributor Author

@grcevski you've got a good point. The ongoing-requests map can be exposed easily to Go. We just need to make sure that, in case we need to pick the data up from the Go user space, nobody else removes it (e.g. when the ongoing request has ended).

@grcevski
Contributor

Here's an example of what I do now.

https://github.com/grafana/ebpf-autoinstrument/pull/85/files#diff-e524f5b7f4b1cda88b689b542fa19e82ab4b408307f76225385b7c87ecb1f850R228

It's inefficient because I send two events for the server span: when we start tracking and on return. But in theory all we need is a goroutine id -> start time mapping when the client span is processed.

@codecov-commenter

Codecov Report

Merging #84 (905efe3) into main (7ad75d7) will decrease coverage by 7.64%.
The diff coverage is 1.21%.

@@            Coverage Diff             @@
##             main      #84      +/-   ##
==========================================
- Coverage   46.58%   38.94%   -7.64%     
==========================================
  Files          15       20       +5     
  Lines        1172     1330     +158     
==========================================
- Hits          546      518      -28     
- Misses        577      767     +190     
+ Partials       49       45       -4     
Flag Coverage Δ
unittests 38.94% <1.21%> (-7.64%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/ebpf/common/bpf_bpf.go 0.00% <0.00%> (ø)
pkg/ebpf/common/ringbuf.go 0.00% <0.00%> (ø)
pkg/ebpf/grpc/bpf_debug_bpfel_x86.go 0.00% <0.00%> (ø)
pkg/ebpf/grpc/grpc.go 0.00% <0.00%> (ø)
pkg/ebpf/nethttp/bpf_debug_bpfel_x86.go 0.00% <ø> (ø)
pkg/ebpf/nethttp/nethttp.go 0.00% <0.00%> (ø)
pkg/ebpf/tracer.go 0.00% <0.00%> (ø)
pkg/export/otel/metrics.go 78.33% <ø> (ø)
pkg/goexec/instructions.go 0.00% <0.00%> (ø)
pkg/goexec/offsets.go 0.00% <0.00%> (ø)
... and 3 more

... and 1 file with indirect coverage changes

@grcevski
Contributor

> @grcevski you've got a good point. The ongoing-requests map can be exposed easily to Go. We just need to make sure that, in case we need to pick the data up from the Go user space, nobody else removes it (e.g. when the ongoing request has ended).

OK, cool, that's what I thought. So the only remaining problem is if we process the events on one buffer while the other ring buffer is behind, so the map is already cleaned up. I wonder, though: maybe all we need is to send more data with the client span, the parent ID and the start time; perhaps the start time can even be adjusted later.

I think I'm OK with this, we'll find a solution through some of these ideas.

Thanks Mario!


@grcevski grcevski left a comment


LGTM!

@mariomac mariomac merged commit 1842b93 into grafana:main Apr 27, 2023
2 checks passed
@mariomac mariomac deleted the modular branch April 27, 2023 15:55