Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RecursionMisses to access ProgramInfo recursion misses counter #1465

Merged
merged 1 commit into from
May 28, 2024

Conversation

olsajiri
Copy link
Contributor

Adding RecursionMisses to access ProgramInfo recursion misses counter.

It does not depend on whether collection of statistics is enabled, the value is always counted.

@olsajiri olsajiri requested a review from a team as a code owner May 11, 2024 17:45
@olsajiri olsajiri marked this pull request as draft May 11, 2024 17:45
@olsajiri olsajiri marked this pull request as ready for review May 11, 2024 18:49
info.go Outdated
// it keeps the return values pattern as the other stats accessors and returns
// bool (allways true) together with the counter value.
func (pi *ProgramInfo) RecursionMisses() (uint64, bool) {
return pi.stats.recursionMisses, true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which kernel did this appear on though? The boolean is meant to indicate whether the kernel exposes this flag in general, not whether stats were enabled (IIRC).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it got in v5.11 9ed9e9ba2337 bpf: Count the number of times recursion was prevented

I'm actually not sure how is the availability detected, both Runtime and RunCount
return values if pi.stats != nil which seems to be the case always AFAICS

I can keep the same checks as for the Runtime and RunCount

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure how is the availability detected, both Runtime and RunCount
return values if pi.stats != nil which seems to be the case always AFAICS

Can't remember who contributed it, but iirc they were planning to add a probe, and the bool was added to not have to break API later. Looks like the probe was never implemented, and looking back it hardly seems worth it to load a prog, call prog_run and verify if the run counter increases just so we can return that bool correctly.

In the case of RecursionMisses, that one feels unrealistic to try to reproduce reliably at all. I'd just drop the bool and add a comment about it being 5.11+.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That breaks when using newProgramInfoFromProc. So unless we deprecate that we should just stick with the established pattern of returning a bool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, keeping the bool then, thanks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That breaks when using newProgramInfoFromProc. So unless we deprecate that we should just stick with the established pattern of returning a bool.

@lmb Could you elaborate on what breaks exactly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pi.stats is nil and therefore the line panics.

ti-mo
ti-mo previously requested changes May 14, 2024
Copy link
Collaborator

@ti-mo ti-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Left some feedback. Let's just drop the bool, fix up the comment and call it a day.

info.go Show resolved Hide resolved
info.go Outdated Show resolved Hide resolved
info.go Outdated
// it keeps the return values pattern as the other stats accessors and returns
// bool (allways true) together with the counter value.
func (pi *ProgramInfo) RecursionMisses() (uint64, bool) {
return pi.stats.recursionMisses, true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure how is the availability detected, both Runtime and RunCount
return values if pi.stats != nil which seems to be the case always AFAICS

Can't remember who contributed it, but iirc they were planning to add a probe, and the bool was added to not have to break API later. Looks like the probe was never implemented, and looking back it hardly seems worth it to load a prog, call prog_run and verify if the run counter increases just so we can return that bool correctly.

In the case of RecursionMisses, that one feels unrealistic to try to reproduce reliably at all. I'd just drop the bool and add a comment about it being 5.11+.

@lmb
Copy link
Collaborator

lmb commented May 17, 2024

Pushed a fixup, PTAL. Looking at the upstream commit it's not clear to me where the idea that this is related to program statistics comes from? Seems easier to me to just treat these things as completely independent.

Just returning pi.stats.recursionMisses, true doesn't work due to newProgramInfoFromProc. Reworked to just use a separate field which works around this problem.

@olsajiri
Copy link
Contributor Author

hm, it's per program stats collected under same struct in kernel together with run count and run time

struct bpf_prog_stats {
        u64_stats_t cnt;
        u64_stats_t nsecs;
        u64_stats_t misses;
        struct u64_stats_sync syncp;
} __aligned(2 * sizeof(u64));

it's incremented through bpf_prog_inc_misses_counter call any time there's recursion detected on link and we can't run the program

it makes sense to me to place it under programStats

@lmb
Copy link
Collaborator

lmb commented May 23, 2024

I can see your point, moving it out of the struct doesn't really add value. I've changed it back, fixed the panic and added a comment.

info.go Outdated
@@ -83,6 +83,9 @@ type programStats struct {
runtime time.Duration
// Total number of times the program was called.
runCount uint64
// Total number of timer the programm was NOT called.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Total number of timer the programm was NOT called.
// Total number of times the program was NOT called.

Adding RecursionMisses to access ProgramInfo recursion misses counter.

This value is always measured on supported kernels, regardless of
statistics being enabled or not.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
@lmb
Copy link
Collaborator

lmb commented May 28, 2024

CI failures are due to unrelated upstream changes.

@lmb lmb merged commit b8dc0ee into cilium:main May 28, 2024
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants