Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more helpful methods to ProgramInfo #637

Merged
merged 1 commit into from
Aug 14, 2023
Merged

Conversation

astoycos
Copy link
Member

@astoycos astoycos commented Jul 5, 2023

This PR adds some helpful methods to the ProgramInfo aya structure. More specifically we start by adding helpers to extract the following information in usable format from bpf_prog_info

"Name",
"Id",
"Type",
"Name",
"Loaded-At",
"Tag",
"Gpl-compatible",
"Map-ids",
"Btf-id",
"Bytes_xlated",
"Jited",
"Bytes_jited",
"Bytes_memlock",
"Verified_insns",

This is in line with the information extracted by the popular bpftool binary

bpfd is using these commits downstream in

bpfman/bpfman#570


This change is Reviewable

@netlify
Copy link

netlify bot commented Jul 5, 2023

Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit e1a5568
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/64d67a53ebdeff000808742d
😎 Deploy Preview https://deploy-preview-637--aya-rs-docs.netlify.app/src/aya/programs/mod.rs
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Member

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM - you'll need to fix the CI failure, which looks like you've used a u64 somewhere you should have used a usize. I think more generally though we should keep what Aya does as minimal as possible and do the heavier lifting downstream.

aya/src/programs/mod.rs Show resolved Hide resolved
aya/src/programs/mod.rs Outdated Show resolved Hide resolved
aya/src/programs/mod.rs Outdated Show resolved Hide resolved
aya/src/programs/mod.rs Outdated Show resolved Hide resolved
}

/// The program type as an integer.
pub fn program_type(&self) -> u32 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not a nice public-facing enum we can present here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's one internal to Aya (unless I'm missing it)... All we have is

pub enum Program {
which only represents the Programs Aya supports.

Apart from that it's just the bindings that we maintain https://github.com/aya-rs/aya/blob/main/bpf/aya-bpf-bindings/src/aarch64/bindings.rs#L474

aya/src/programs/utils.rs Outdated Show resolved Hide resolved
aya/src/programs/utils.rs Outdated Show resolved Hide resolved
aya/src/programs/utils.rs Outdated Show resolved Hide resolved
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @astoycos and @dave-tucker)


-- commits line 10 at r2:
s/api/API/


aya/src/programs/mod.rs line 500 at r2 (raw file):

        let info =
            bpf_prog_get_info_by_fd(fd, None).map_err(|io_error| ProgramError::SyscallError {

should this have been in the previous commit? could you check that each commit compiles please?


aya/src/programs/mod.rs line 955 at r2 (raw file):

    }

    /// The program type.

these doc comments would be somewhat more useful if they explained (or linked to an explanation of) the concept, particularly when they return an integer that is otherwise unexplained.


aya/src/programs/mod.rs line 1036 at r2 (raw file):

    ///
    /// The returned fd must be closed when no longer needed.
    pub fn fd(&self) -> Result<RawFd, ProgramError> {

should we hold an Option on the ProgramInfo so we can cache the fd?


aya/src/programs/utils.rs line 65 at r2 (raw file):

// Get time since boot.
pub(crate) fn time_since_boot() -> Duration {
    let mut time = libc::timespec {

should this be mem::zeroed or MaybeUninit? seems inconsistent with other usage.


aya/src/programs/utils.rs line 70 at r2 (raw file):

    };
    let ret = unsafe { libc::clock_gettime(libc::CLOCK_BOOTTIME, &mut time) };
    assert!(ret == 0);

can this be assert_eq to emit a better message on failure? should this function be fallible?


aya/src/programs/utils.rs line 77 at r2 (raw file):

// Get the realtime.
pub(crate) fn realtime() -> Duration {

looks like these two methods are only used in one place. can we inline them for now with a TODO that they might need to be extracted? i think the API we want for them will be clearer when there's more than one usage.


aya/src/programs/utils.rs line 101 at r2 (raw file):

                }

                let parts = l.split('\t');

each row is exactly key\tvalue, correct? would this be simpler and less unwrap-y with [rsplit_once](#### rsplit_once)?


aya/src/programs/utils.rs line 103 at r2 (raw file):

                let parts = l.split('\t');

                key_val = Ok(parts.last().unwrap_or("err").parse().unwrap_or(0));

is there a reason not to return early? can keys repeat?


aya/src/sys/bpf.rs line 476 at r1 (raw file):

pub(crate) fn bpf_prog_get_info_by_fd(
    prog_fd: RawFd,
    map_ids: Option<&mut [u32]>,

does this need to be an option? what's the difference between None and Some(&[])?


aya/src/sys/bpf.rs line 479 at r1 (raw file):

) -> Result<bpf_prog_info, io::Error> {
    let mut attr = unsafe { mem::zeroed::<bpf_attr>() };
    let mut info = unsafe { mem::zeroed::<bpf_prog_info>() };

why not use MaybeUninit here? (and above, and generally everywhere)


aya/src/sys/bpf.rs line 482 at r1 (raw file):

    if let Some(i) = map_ids {
        let buf_size = i.len() as u32;

nit: why this local?

@astoycos
Copy link
Member Author

Reviewed 3 of 3 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @astoycos and @dave-tucker)

-- commits line 10 at r2: s/api/API/

Done

aya/src/programs/mod.rs line 500 at r2 (raw file):

        let info =
            bpf_prog_get_info_by_fd(fd, None).map_err(|io_error| ProgramError::SyscallError {

should this have been in the previous commit? could you check that each commit compiles please?

Good catch! Done

aya/src/programs/mod.rs line 955 at r2 (raw file):

    }

    /// The program type.

these doc comments would be somewhat more useful if they explained (or linked to an explanation of) the concept, particularly when they return an integer that is otherwise unexplained.

Tried to add a bit more detail to some of the comments

aya/src/programs/mod.rs line 1036 at r2 (raw file):

    ///
    /// The returned fd must be closed when no longer needed.
    pub fn fd(&self) -> Result<RawFd, ProgramError> {

should we hold an Option on the ProgramInfo so we can cache the fd?

Not sure I didn't add this API, It already existed.. I don't quite follow though, why would an Option help with caching here?

aya/src/programs/utils.rs line 65 at r2 (raw file):

// Get time since boot.
pub(crate) fn time_since_boot() -> Duration {
    let mut time = libc::timespec {

should this be mem::zeroed or MaybeUninit? seems inconsistent with other usage.

Sure I'm fine with that, done for both functions.

aya/src/programs/utils.rs line 70 at r2 (raw file):

    };
    let ret = unsafe { libc::clock_gettime(libc::CLOCK_BOOTTIME, &mut time) };
    assert!(ret == 0);

can this be assert_eq to emit a better message on failure? should this function be fallible?

Done for both functions, I don't believe getting the system Boottime/Realtime should ever fail.

aya/src/programs/utils.rs line 77 at r2 (raw file):

// Get the realtime.
pub(crate) fn realtime() -> Duration {

looks like these two methods are only used in one place. can we inline them for now with a TODO that they might need to be extracted? i think the API we want for them will be clearer when there's more than one usage.

I don't agree, this isn't a public API and the functions are pretty straight-forward, i.e wrap a clock_getttime() syscall. For now I'd rather the code live in utils rather than inline, additionally, I don't think we do any other direct non BPF syscalls in mod.rs.

In the future the only extension I could for-see would be collapsing time_since_boot and realtime into a single function where the user can specify a single clock. For now I think it's fine though :)

aya/src/programs/utils.rs line 101 at r2 (raw file):

                }

                let parts = l.split('\t');

each row is exactly key\tvalue, correct? would this be simpler and less unwrap-y with [rsplit_once](#### rsplit_once)?

It looks so unwrap-y because we're also parsing the string into an integer. Using rsplit_once vs split wouldn't effect that since rsplit_once still returns an Option just like parts.last() which will need to be unwrapped.

However it should be more efficient since iter.last() has to iterate through all the entries to get the last one. I'm using unwrap here to verify that there was two strings on the line separated by a \t char which should always be true based on the docs https://man7.org/linux/man-pages/man5/proc.5.html.

aya/src/programs/utils.rs line 103 at r2 (raw file):

                let parts = l.split('\t');

                key_val = Ok(parts.last().unwrap_or("err").parse().unwrap_or(0));

is there a reason not to return early? can keys repeat?

Nope, good catch, fixed.

aya/src/sys/bpf.rs line 476 at r1 (raw file):

pub(crate) fn bpf_prog_get_info_by_fd(
    prog_fd: RawFd,
    map_ids: Option<&mut [u32]>,

does this need to be an option? what's the difference between None and Some(&[])?

Yes because in many cases across the codebase we don't care to get the "map IDs" back in bpf_prog_info (they only get populated if initialized), and I don't think we should be initializing a buffer if we don't need to.

aya/src/sys/bpf.rs line 479 at r1 (raw file):

) -> Result<bpf_prog_info, io::Error> {
    let mut attr = unsafe { mem::zeroed::<bpf_attr>() };
    let mut info = unsafe { mem::zeroed::<bpf_prog_info>() };

why not use MaybeUninit here? (and above, and generally everywhere)

Hrm I was under the impression that we should only use MaybeUninit when the pointer data is entirely populated within the kernel i.e

    // info gets entirely populated by the kernel
    let info = MaybeUninit::zeroed();

    attr.info.bpf_fd = prog_fd as u32;
    attr.info.info = info.as_ptr() as *const _ as u64;
    attr.info.info_len = mem::size_of::<bpf_map_info>() as u32;

vs

    let mut info = unsafe { mem::zeroed::<bpf_prog_info>() };

    if let Some(i) = map_ids {
        let buf_size = i.len() as u32;
        info.map_ids = i.as_mut_ptr() as u64;
        info.nr_map_ids = buf_size;
    };

Where in the latter case we use mem::zeroed since we have to set info.map_ids and info.nr_map_ids but maybe @alessandrod Can clarify here? I'm happy to tackle that changed (if needed) in a follow up.

aya/src/sys/bpf.rs line 482 at r1 (raw file):

    if let Some(i) = map_ids {
        let buf_size = i.len() as u32;

nit: why this local?

Good catch Doesn't need to be, fixed!

@astoycos astoycos requested a review from tamird July 24, 2023 17:49
@astoycos
Copy link
Member Author

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r4, 2 of 2 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @astoycos and @dave-tucker)


aya/src/programs/mod.rs line 1036 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Not sure I didn't add this API, It already existed.. I don't quite follow though, why would an Option help with caching here?

As-written, some of the new functions you're adding fetch the fd, get some data, and then close the fd. Caching would reduce the number of times we retrieve the FD, which I assume is roughly constant for the life of the program.


aya/src/programs/mod.rs line 953 at r5 (raw file):

    /// The program tag is a SHA sum of the program's instructions which can be
    /// used as another program identifier.

"as another program identifier" - what does that mean?


aya/src/programs/mod.rs line 958 at r5 (raw file):

    }

    /// The program type as defined by the linux kernel enum [`bpf_prog_type`](https://elixir.bootlin.com/linux/v6.4.4/source/include/uapi/linux/bpf.h#L948).

can you reflow please?


aya/src/programs/mod.rs line 975 at r6 (raw file):

/// Provides information about a loaded program, like name, id and statistics
#[derive(Debug, Clone)]

why the new clone impl?


aya/src/programs/utils.rs line 77 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

I don't agree, this isn't a public API and the functions are pretty straight-forward, i.e wrap a clock_getttime() syscall. For now I'd rather the code live in utils rather than inline, additionally, I don't think we do any other direct non BPF syscalls in mod.rs.

In the future the only extension I could for-see would be collapsing time_since_boot and realtime into a single function where the user can specify a single clock. For now I think it's fine though :)

Well, these functions are just about entirely duplicated. I was avoiding suggesting refactoring the common bits out because I think the APIs of these functions are obscuring the caller's intent, which would be more clear when the functions are inline.

In other words, I believe you're using these values for some kind of normalization of other time values retrieved from the kernel, but it's impossible to see that from here, and those difficult to suggest alternative logic.


aya/src/programs/utils.rs line 74 at r5 (raw file):

}

/// Get the system-wide real (wall-clock) time.

this is a duration, when is it measured from? should this return a SystemTime instead?


aya/src/programs/utils.rs line 96 at r5 (raw file):

                }

                let parts = l.rsplit_once('\t');

unwrap and destructure here?


aya/src/sys/bpf.rs line 476 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Yes because in many cases across the codebase we don't care to get the "map IDs" back in bpf_prog_info (they only get populated if initialized), and I don't think we should be initializing a buffer if we don't need to.

What does initializing a buffer mean? An empty slice does not initialize anything.


aya/src/sys/bpf.rs line 479 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Hrm I was under the impression that we should only use MaybeUninit when the pointer data is entirely populated within the kernel i.e

    // info gets entirely populated by the kernel
    let info = MaybeUninit::zeroed();

    attr.info.bpf_fd = prog_fd as u32;
    attr.info.info = info.as_ptr() as *const _ as u64;
    attr.info.info_len = mem::size_of::<bpf_map_info>() as u32;

vs

    let mut info = unsafe { mem::zeroed::<bpf_prog_info>() };

    if let Some(i) = map_ids {
        let buf_size = i.len() as u32;
        info.map_ids = i.as_mut_ptr() as u64;
        info.nr_map_ids = buf_size;
    };

Where in the latter case we use mem::zeroed since we have to set info.map_ids and info.nr_map_ids but maybe @alessandrod Can clarify here? I'm happy to tackle that changed (if needed) in a follow up.

Ah, yeah, sorry - this is an put to a syscall, not an out parameter.

@astoycos
Copy link
Member Author

Reviewed 4 of 4 files at r4, 2 of 2 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @astoycos and @dave-tucker)

aya/src/programs/mod.rs line 1036 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
As-written, some of the new functions you're adding fetch the fd, get some data, and then close the fd. Caching would reduce the number of times we retrieve the FD, which I assume is roughly constant for the life of the program.

Ah ok I gotcha now 👍 so I'd imagine that would require updating ProgramInfo(bpf_prog_info) to look something like

#[derive(Debug)]
pub struct ProgramInfo{
    info: bpf_prog_info;
    // fd stays open while a reference to ProgramInfo is in scope.
    fd: ProgramFd

right? I'm up for doing this but #662 might have some effect as we'd probably want a concept of OwnedFd to ensure things get cleaned up 🤷

aya/src/programs/mod.rs line 953 at r5 (raw file):

    /// The program tag is a SHA sum of the program's instructions which can be
    /// used as another program identifier.

"as another program identifier" - what does that mean?

As an identifier apart from the program's ID :)

I would add that it's another "unique program identifier" but alot of the times I see the same tag for multiple different program instances i.e

93251: cgroup_skb  tag 6deef7357e7b4530  gpl
        loaded_at 2023-07-24T11:32:45-0400  uid 0
        xlated 64B  jited 55B  memlock 4096B
        pids systemd(1875038)
93252: cgroup_skb  tag 6deef7357e7b4530  gpl
        loaded_at 2023-07-24T11:32:45-0400  uid 0
        xlated 64B  jited 55B  memlock 4096B
        pids systemd(1875038)

Changed the comment to

    /// The program tag is a SHA sum of the program's instructions which can be
    /// used as additional program identifier. A program's id can vary every time
    /// it's loaded or unloaded, but the tag will remain the same.

aya/src/programs/mod.rs line 958 at r5 (raw file):

    }

    /// The program type as defined by the linux kernel enum [`bpf_prog_type`](https://elixir.bootlin.com/linux/v6.4.4/source/include/uapi/linux/bpf.h#L948).

can you reflow please?

Not sure what you wanted here but I re-worded to

    /// The type of a bpf program expressed as an integer.  To understand the 
    /// integer to type mappings please see the linux kernel enum
    /// [`bpf_prog_type`](https://elixir.bootlin.com/linux/v6.4.4/source/include/uapi/linux/bpf.h#L948).

The "best practice" longterm here would be to maintain an actual public facing enum which contains all of the program types rather than only the one's we support (i.e

pub enum Program {
)

aya/src/programs/mod.rs line 975 at r6 (raw file):

/// Provides information about a loaded program, like name, id and statistics
#[derive(Debug, Clone)]

why the new clone impl?

Added by mistake during debugging, thanks!

aya/src/programs/utils.rs line 77 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
Well, these functions are just about entirely duplicated. I was avoiding suggesting refactoring the common bits out because I think the APIs of these functions are obscuring the caller's intent, which would be more clear when the functions are inline.

In other words, I believe you're using these values for some kind of normalization of other time values retrieved from the kernel, but it's impossible to see that from here, and those difficult to suggest alternative logic.

Ok I tried to add a bunch more documentation/ comments to clear things up and also renamed realtime to time_since_epoch.

Time in linux can inherently be a bit confusing and the clock_gettime https://linux.die.net/man/2/clock_gettime syscall even more so 🤣. I agree that much of the code is duplicated but keeping separate makes things much easier to parse, otherwise users would need to deeply understand the clock_gettime syscall along with the various ClockIds.

Also this is way more readable compared to bpftool lol https://github.com/libbpf/bpftool/blob/master/src/prog.c#L159

aya/src/programs/utils.rs line 74 at r5 (raw file):

}

/// Get the system-wide real (wall-clock) time.

this is a duration, when is it measured from? should this return a SystemTime instead?

See updated function name and comment.

aya/src/programs/utils.rs line 96 at r5 (raw file):

                }

                let parts = l.rsplit_once('\t');

unwrap and destructure here?

👍

aya/src/sys/bpf.rs line 476 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
What does initializing a buffer mean? An empty slice does not initialize anything.

Yep sorry I was struggling with the right words...I really mean you need to specify the correct buffer length here for the syscall to work. The best way to get the map IDs is with two syscalls 😢 ....... an initial bpf_prog_get_info_by_fd to get the number of mapIDs and then another with the mapIds buffer in order for them to actually be populated. (see bpftool code ).

        let len = self.0.nr_map_ids;
        let mut map_ids = vec![0u32; len as usize];

        bpf_prog_get_info_by_fd(fd, Some(&mut map_ids)).map_err(|io_error| {
            ProgramError::SyscallError {
                call: "bpf_prog_get_info_by_fd",
                io_error,
            }
        })?;

I made map_ids here an option so that the user could be explicit, i.e explicitly express "I've already gotten the program_info, now I want to get the map_ids". I could have made a separate function for this but it seemed relatively simple to just to add it as an Option arg.

The only other place we do anything remotely like this so far in aya is with fetching btf https://github.com/aya-rs/aya/blob/main/aya/src/programs/extension.rs#L174

Here we choose an arbitrary guess and refetch if needed. I thought the explicit length specification was better especially within ProgramInfo since we've already called bpf_prog_get_info_by_fd once 🤷. Either way the operation could end up requiring two syscalls.

aya/src/sys/bpf.rs line 479 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
Ah, yeah, sorry - this is an put to a syscall, not an out parameter.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r7, 1 of 1 files at r8, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @astoycos and @dave-tucker)


aya/src/programs/mod.rs line 1036 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Ah ok I gotcha now 👍 so I'd imagine that would require updating ProgramInfo(bpf_prog_info) to look something like

#[derive(Debug)]
pub struct ProgramInfo{
    info: bpf_prog_info;
    // fd stays open while a reference to ProgramInfo is in scope.
    fd: ProgramFd

right? I'm up for doing this but #662 might have some effect as we'd probably want a concept of OwnedFd to ensure things get cleaned up 🤷

Why don't you just use OwnedFd from std? We don't need to invent ProgramFd.


aya/src/programs/mod.rs line 953 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

As an identifier apart from the program's ID :)

I would add that it's another "unique program identifier" but alot of the times I see the same tag for multiple different program instances i.e

93251: cgroup_skb  tag 6deef7357e7b4530  gpl
        loaded_at 2023-07-24T11:32:45-0400  uid 0
        xlated 64B  jited 55B  memlock 4096B
        pids systemd(1875038)
93252: cgroup_skb  tag 6deef7357e7b4530  gpl
        loaded_at 2023-07-24T11:32:45-0400  uid 0
        xlated 64B  jited 55B  memlock 4096B
        pids systemd(1875038)

Changed the comment to

    /// The program tag is a SHA sum of the program's instructions which can be
    /// used as additional program identifier. A program's id can vary every time
    /// it's loaded or unloaded, but the tag will remain the same.

Thanks. Instead of "another program identifier" can you say "which be used as an alternative to [id]"? That's more specific. The sentence you added is very helpful, thanks.


aya/src/programs/mod.rs line 958 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Not sure what you wanted here but I re-worded to

    /// The type of a bpf program expressed as an integer.  To understand the 
    /// integer to type mappings please see the linux kernel enum
    /// [`bpf_prog_type`](https://elixir.bootlin.com/linux/v6.4.4/source/include/uapi/linux/bpf.h#L948).

The "best practice" longterm here would be to maintain an actual public facing enum which contains all of the program types rather than only the one's we support (i.e

pub enum Program {
)

re-flow just means re-wrap, the line was too long. I think the previous wording was fine. There's a double (or triple?) space after the period now.


aya/src/programs/mod.rs line 975 at r6 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Added by mistake during debugging, thanks!

Looks like it's still here.


aya/src/programs/mod.rs line 1024 at r7 (raw file):

    /// The load time is specified by the kernel as nanoseconds since system boot,
    /// this function converts that u64 value to a [`std::time::SystemTime`]
    /// for easy consumption.  It is calculated by first finding the realtime of system

I don't think we should be spelling out implementation details in a doc comment. do you agree?


aya/src/programs/utils.rs line 77 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Ok I tried to add a bunch more documentation/ comments to clear things up and also renamed realtime to time_since_epoch.

Time in linux can inherently be a bit confusing and the clock_gettime https://linux.die.net/man/2/clock_gettime syscall even more so 🤣. I agree that much of the code is duplicated but keeping separate makes things much easier to parse, otherwise users would need to deeply understand the clock_gettime syscall along with the various ClockIds.

Also this is way more readable compared to bpftool lol https://github.com/libbpf/bpftool/blob/master/src/prog.c#L159

This is somewhat helpful, but I still don't think these functions are the right abstraction. I'm resolving this thread in favor of the one anchored on line 74.


aya/src/programs/utils.rs line 74 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

See updated function name and comment.

Right so my point is that we should not have these two functions; what we really want is a SystemTime that reflects time since boot.


aya/src/programs/utils.rs line 96 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

👍

sorry, destructure would be let (_key, value) = l.rsplit_once('\t').unwrap()


aya/src/programs/utils.rs line 12 at r7 (raw file):

};

// for docs link

what's up with this pattern? can't you just write the full path in the docs?


aya/src/sys/bpf.rs line 476 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Yep sorry I was struggling with the right words...I really mean you need to specify the correct buffer length here for the syscall to work. The best way to get the map IDs is with two syscalls 😢 ....... an initial bpf_prog_get_info_by_fd to get the number of mapIDs and then another with the mapIds buffer in order for them to actually be populated. (see bpftool code ).

        let len = self.0.nr_map_ids;
        let mut map_ids = vec![0u32; len as usize];

        bpf_prog_get_info_by_fd(fd, Some(&mut map_ids)).map_err(|io_error| {
            ProgramError::SyscallError {
                call: "bpf_prog_get_info_by_fd",
                io_error,
            }
        })?;

I made map_ids here an option so that the user could be explicit, i.e explicitly express "I've already gotten the program_info, now I want to get the map_ids". I could have made a separate function for this but it seemed relatively simple to just to add it as an Option arg.

The only other place we do anything remotely like this so far in aya is with fetching btf https://github.com/aya-rs/aya/blob/main/aya/src/programs/extension.rs#L174

Here we choose an arbitrary guess and refetch if needed. I thought the explicit length specification was better especially within ProgramInfo since we've already called bpf_prog_get_info_by_fd once 🤷. Either way the operation could end up requiring two syscalls.

I'm still confused. In all cases where you pass None, why not just pass an empty slice?

@astoycos
Copy link
Member Author

Reviewed 2 of 2 files at r7, 1 of 1 files at r8, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @astoycos and @dave-tucker)

aya/src/programs/mod.rs line 1036 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Ah ok I gotcha now 👍 so I'd imagine that would require updating ProgramInfo(bpf_prog_info) to look something like

#[derive(Debug)]
pub struct ProgramInfo{
    info: bpf_prog_info;
    // fd stays open while a reference to ProgramInfo is in scope.
    fd: ProgramFd

right? I'm up for doing this but #662 might have some effect as we'd probably want a concept of OwnedFd to ensure things get cleaned up 🤷

Why don't you just use OwnedFd from std? We don't need to invent ProgramFd.

Fair enough but I still don't think this is smart... if we store fd as an ownedFd when we list the programs we could potentially be holding open 100s of file descriptors when calling loaded_programs() right? (one for each program loaded on a node). I agree we're doing more opens/closes but I think that's better than holding a bunch open until the each individual reference to a programInfo goes out of scope... What if a user does something dumb and those file descriptors end up being open for a long time?

aya/src/programs/mod.rs line 953 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
Thanks. Instead of "another program identifier" can you say "which be used as an alternative to [id]"? That's more specific. The sentence you added is very helpful, thanks.

Sure!

aya/src/programs/mod.rs line 958 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
re-flow just means re-wrap, the line was too long. I think the previous wording was fine. There's a double (or triple?) space after the period now.

Ah ok cool no problem.

aya/src/programs/mod.rs line 975 at r6 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
Looks like it's still here.

My bad fixed now rebase issue :/

aya/src/programs/mod.rs line 1024 at r7 (raw file):

    /// The load time is specified by the kernel as nanoseconds since system boot,
    /// this function converts that u64 value to a [`std::time::SystemTime`]
    /// for easy consumption.  It is calculated by first finding the realtime of system

I don't think we should be spelling out implementation details in a doc comment. do you agree?

Agreed, changed back.

aya/src/programs/utils.rs line 77 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
This is somewhat helpful, but I still don't think these functions are the right abstraction. I'm resolving this thread in favor of the one anchored on line 74.

aya/src/programs/utils.rs line 74 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
Right so my point is that we should not have these two functions; what we really want is a SystemTime that reflects time since boot.

Ok that recommendation makes more sense I refactored into a single util function that just returns the realtime of a provided duration since boot 👍

aya/src/programs/utils.rs line 96 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
sorry, destructure would be let (_key, value) = l.rsplit_once('\t').unwrap()

Sounds good, Done.

aya/src/programs/utils.rs line 12 at r7 (raw file):

};

// for docs link

what's up with this pattern? can't you just write the full path in the docs?

I was running into the issue before, but now it's not relevant.

aya/src/sys/bpf.rs line 476 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
I'm still confused. In all cases where you pass None, why not just pass an empty slice?

You could But I think passing an option is way more explicit....With it we can specify "yes please give me the map ids I already know the right buffer length" Or "I don't care to get the mapIDs here just give me bpf_prog_info".

Anyways I don't care enough 😆 so I went ahead and changed it in a new commit LMK what you think.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need tests?

Reviewed 1 of 1 files at r9, 2 of 2 files at r10, 4 of 4 files at r11, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @astoycos and @dave-tucker)


aya/src/programs/mod.rs line 1036 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Fair enough but I still don't think this is smart... if we store fd as an ownedFd when we list the programs we could potentially be holding open 100s of file descriptors when calling loaded_programs() right? (one for each program loaded on a node). I agree we're doing more opens/closes but I think that's better than holding a bunch open until the each individual reference to a programInfo goes out of scope... What if a user does something dumb and those file descriptors end up being open for a long time?

Up to you, I suppose. At the very least could you change fd to return an OwnedFd? There's a lot of libc::close being sprinkled in here and I'd rather there wasn't.


aya/src/programs/mod.rs line 1030 at r11 (raw file):

            call: "bpf_prog_get_info_by_fd",
            io_error,
        })?;

here's one example where the fd leaks in the error case. please let's make fd() return an OwnedFd.


aya/src/programs/utils.rs line 74 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Ok that recommendation makes more sense I refactored into a single util function that just returns the realtime of a provided duration since boot 👍

But why not leave the addition to the caller? Can this just return a SystemTime and be called boot_time?


aya/src/programs/utils.rs line 69 at r10 (raw file):

    let ret = unsafe { libc::clock_gettime(libc::CLOCK_BOOTTIME, &mut time) };
    assert_eq!(ret, 0, "failed to get system bootime");
    let dur_since_boot = Duration::new(time.tv_sec as u64, time.tv_nsec as u32);

would you mind destructuring here as well? we'd like to be sure that we used all the fields.

let libc::timespec { tv_sec, tv_nsec } = time;


aya/src/sys/bpf.rs line 476 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

You could But I think passing an option is way more explicit....With it we can specify "yes please give me the map ids I already know the right buffer length" Or "I don't care to get the mapIDs here just give me bpf_prog_info".

Anyways I don't care enough 😆 so I went ahead and changed it in a new commit LMK what you think.

The "i know the right buffer length" part of the statement is where I take issue. The user can intentionally or accidentally give you a buffer that's too short, and we have no way of detecting it, and the kernel will happily populate a too-short buffer. So there's no difference between "i don't want any map ids" and "i want zero map ids".

Let's squash this commit please.

@astoycos
Copy link
Member Author

astoycos commented Jul 25, 2023

Does this need tests?

It wouldn't hurt... I'm not really sure how to do unit tests here, for integration tests I would imagine it'd depend on bpftool which I know you wanted to stay away from. If we used bpftool we could verify "correctness" easily enough. I'd like to get this in since we're depending on it in bpfd but I'm happy to make an issue for adding integration tests (as long as your ok with a bpftool dep)

Reviewed 1 of 1 files at r9, 2 of 2 files at r10, 4 of 4 files at r11, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @astoycos and @dave-tucker)

aya/src/programs/mod.rs line 1036 at r2 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
Up to you, I suppose. At the very least could you change fd to return an OwnedFd? There's a lot of libc::close being sprinkled in here and I'd rather there wasn't.

Cool I know there's a bunch of other places we still need to convert to OwnedFd but I went ahead and did it here 👍 .

aya/src/programs/mod.rs line 1030 at r11 (raw file):

            call: "bpf_prog_get_info_by_fd",
            io_error,
        })?;

here's one example where the fd leaks in the error case. please let's make fd() return an OwnedFd.

aya/src/programs/utils.rs line 74 at r5 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
But why not leave the addition to the caller? Can this just return a SystemTime and be called boot_time?

Sure If that's what you want.

aya/src/programs/utils.rs line 69 at r10 (raw file):

    let ret = unsafe { libc::clock_gettime(libc::CLOCK_BOOTTIME, &mut time) };
    assert_eq!(ret, 0, "failed to get system bootime");
    let dur_since_boot = Duration::new(time.tv_sec as u64, time.tv_nsec as u32);

would you mind destructuring here as well? we'd like to be sure that we used all the fields.

let libc::timespec { tv_sec, tv_nsec } = time;

Sure!

aya/src/sys/bpf.rs line 476 at r1 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…
The "i know the right buffer length" part of the statement is where I take issue. The user can intentionally or accidentally give you a buffer that's too short, and we have no way of detecting it, and the kernel will happily populate a too-short buffer. So there's no difference between "i don't want any map ids" and "i want zero map ids".

Let's squash this commit please.

Yep squashed out that tmp commit (sorry)

prog.gpl_compatible();
prog.map_ids().unwrap();
prog.btf_id();
prog.size_translated();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch 👍

.unwrap();

// Ensure all relevant helper functions don't panic.
prog.name();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably assert the name here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is already done on line 80 to make sure we find the program we loaded (see Tamir's comment)

@mergify
Copy link

mergify bot commented Aug 10, 2023

Hey @alessandrod, this pull request changes the Aya Public API and requires your review.

@mergify
Copy link

mergify bot commented Aug 10, 2023

@astoycos, this pull request is now in conflict and requires a rebase.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you squash this into a single commit? I don't see the value in these being separate.

Reviewed 5 of 5 files at r47, 1 of 2 files at r51, 1 of 1 files at r52, 3 of 7 files at r53, 1 of 3 files at r54, 3 of 3 files at r55, all commit messages.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @alessandrod, @astoycos, and @dave-tucker)


aya/src/programs/mod.rs line 1058 at r55 (raw file):

    /// Returns a file descriptor referencing the program.
    ///
    /// The returned file descriptor can be closed at any time and doing so does not

inconsistent wrap width


test/integration-test/src/tests/smoke.rs line 85 at r55 (raw file):

    // Ensure all relevant helper functions don't panic.
    prog.name();
    assert_eq!(prog.name_as_str().unwrap(), "pass");

aren't you asserting that a few lines up because that's the only program you return?


xtask/public-api/aya.txt line 997 at r55 (raw file):

pub aya::maps::MapError::PinError::name: core::option::Option<alloc::string::String>
pub aya::maps::MapError::ProgramNotLoaded
pub aya::maps::MapError::SyscallError(crate::sys::SyscallError)

this is spurious - please use nightly to generate this.

@astoycos astoycos force-pushed the helpers branch 2 times, most recently from 6c264b1 to 08b2d42 Compare August 11, 2023 16:00
@astoycos
Copy link
Member Author

Done @tamird

@astoycos astoycos requested a review from tamird August 11, 2023 16:01
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r57, all commit messages.
Dismissed @alessandrod and @dave-tucker from 16 discussions.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @alessandrod, @astoycos, and @dave-tucker)


-- commits line 2 at r57:
"aya: ..." per a recent comment from @dave-tucker -- please make sure this line stays under 50 characters per https://github.com/aya-rs/aya/blob/main/CONTRIBUTING.md#commit-message-guidelines


-- commits line 6 at r57:
can you mention the specific command that you're mimicking here?


-- commits line 7 at r57:
remove?


-- commits line 9 at r57:
"its", not "it is"


aya/src/programs/mod.rs line 971 at r52 (raw file):

Previously, astoycos (Andrew Stoycos) wrote…

Hrm I think this has to stay as a RawFd until we remove ProgramFd

which I tried and failed to do in #694 :)

#744 is relevant


aya/src/programs/mod.rs line 919 at r57 (raw file):

                /// Returns the file descriptor of this Program.
                pub fn program_info(&self) -> Result<ProgramInfo, ProgramError> {
                    let fd = self.fd().ok_or(ProgramError::NotLoaded)?;

nit: this is self.data.fd_or_err() to avoid restating the error, yeah?

- Add helper methods to get useful information from the ProgramInfo
object which is returned by the `loaded_programs()` API.  Specifically
this code mirrors the `bpftool prog` command in terms of useful fields.
- Add a new API macro to each aya `Program` type to allow us to fetch
its accompanying `ProgramInfo` metadata after its been loaded.
- Add a new ProgramInfo constructor that builds a new instance using
a raw fd.
- Add a smoke test for the loaded_programs() API as well as
all the relevant methods on the ProgramInfo type.

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
@astoycos
Copy link
Member Author

Done @tamird

@astoycos astoycos requested a review from tamird August 11, 2023 18:17
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r58, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @alessandrod, @astoycos, and @dave-tucker)

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @dave-tucker from a discussion.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @alessandrod and @dave-tucker)

@astoycos astoycos merged commit bcc9743 into aya-rs:main Aug 14, 2023
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/needs-review Makes an API change that needs review aya This is about aya (userspace) test A PR that improves test cases or CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants