Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use OwnedFd for BTF's file descriptor #662

Merged
merged 3 commits into from
Jul 28, 2023
Merged

Conversation

nrxus
Copy link
Contributor

@nrxus nrxus commented Jul 17, 2023

This PR is meant as the first step towards using OwnedFd where possible instead of RawFd to be more foolproof and safer to use. This PR tackles the BTF file descriptor that is loaded when the object has a BTF section and fixes an existing file descriptor leak as that file descriptor was never being closed. Since we are moving to use OwnedFd this close will now happen by default.

The returned file descriptor will be wrapped in an Arc so that it can be shared across programs and maps of the loaded BPF file with no user-visible impact (no breaking changes).

As an add-on I also changed the Cargo/toml to be explicit about the MSRV. This had been previously bumped by #633 by using the std::os::fd module which was only introduced in v1.66.0. This PR only really needs MSRV v1.63.0 to use OwnedFd,

Related to #612


This change is Reviewable

@netlify
Copy link

netlify bot commented Jul 17, 2023

Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit ea96c29
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/64c41d58cba71f0008e1be7c
😎 Deploy Preview https://deploy-preview-662--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@nrxus nrxus force-pushed the use-owned-fd-for-btf branch 2 times, most recently from 3c24470 to 9f88ed3 Compare July 17, 2023 04:12
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 17 of 17 files at r1, 35 of 35 files at r2, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @nrxus)


-- commits line 14 at r2:

  • capitalize please
  • s/leaked/leak
  • period please

-- commits line 23 at r2:
can these be exposed via accessors rather than by being made public?


aya/src/maps/mod.rs line 495 at r2 (raw file):

impl MapData {
    /// Creates a new map with the provided `name`
    pub fn create(&mut self, name: &str, btf_fd: Option<impl AsFd>) -> Result<RawFd, MapError> {

could we pass BorrowedFd<'_>? i'm wearing of generics introducing more monomorphization and slower compilation as a result

here and everywhere else


aya/src/maps/mod.rs line 891 at r2 (raw file):

        let mut map = new_map();
        assert_matches!(map.create("foo", Option::<BorrowedFd>::None), Ok(42));

I think if you remove the generic you won't need all this type annotation

here and a few other places


aya/src/sys/mod.rs line 20 at r2 (raw file):

use crate::generated::{bpf_attr, bpf_cmd, perf_event_attr};

pub(crate) type SysResult<T = c_long> = Result<T, (c_long, io::Error)>;

do we need the default?

Copy link
Contributor Author

@nrxus nrxus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @tamird)


-- commits line 14 at r2:

Previously, tamird (Tamir Duberstein) wrote…
  • capitalize please
  • s/leaked/leak
  • period please

To reword this commit should I just rebase and force push on this branch? I am not sure if that might mess with reviewable or with your own tracking of what you've already seen or not seen but I am not sure how else to reword commits that doesn't involve a rebase + force push.


-- commits line 23 at r2:

Previously, tamird (Tamir Duberstein) wrote…

can these be exposed via accessors rather than by being made public?

Unfortunately not. If it was an accessor like fn(&mut self) -> &mut HashMap then the lifetime of the returned hashmap is aliasing the entire &mut self as there is no way to do partial borrows in functions signatures. By exposing these fields I allow the compiler to be smart enough to detect that when I am mutably borrowing programs I am not borrowing neither btf file descriptor nor the maps so I can still borrow either of them (mutably or not) without issues. Borrowing of both the programs and the file descriptor is needed because loading of the programs needs to mutably borrow a program and (non mutably) borrow the btf file descriptor.


aya/src/maps/mod.rs line 495 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

could we pass BorrowedFd<'_>? i'm wearing of generics introducing more monomorphization and slower compilation as a result

here and everywhere else

They become awkward to use if I use Option<BorrowedFd<'_>> because then usage of it when having a Option<OwnedFd> becomes something like: my_owned_fd.as_ref().map(|f| f.as_fd()). I could use something like Option<&OwnedFd> but that is significantly less flexible because then we need to have a reference to an owned fd. In our case we do always have an OwnedFd but I didn't know if that was too limiting going forwards.


aya/src/sys/mod.rs line 20 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

do we need the default?

We don't and I can remove the default if we don't want to deal with that but since so many functions already returned SysResult without any generics specified I figured adding a default would touch the least amount of code.

@nrxus nrxus requested a review from tamird July 19, 2023 04:35
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @nrxus)


-- commits line 14 at r2:

Previously, nrxus (Andres) wrote…

To reword this commit should I just rebase and force push on this branch? I am not sure if that might mess with reviewable or with your own tracking of what you've already seen or not seen but I am not sure how else to reword commits that doesn't involve a rebase + force push.

Yes, force push is the way to go. You'll need to rebase to resolve some of the clippy failures, and looks like CI is failing because all the tests need updating.


-- commits line 23 at r2:

Previously, nrxus (Andres) wrote…

Unfortunately not. If it was an accessor like fn(&mut self) -> &mut HashMap then the lifetime of the returned hashmap is aliasing the entire &mut self as there is no way to do partial borrows in functions signatures. By exposing these fields I allow the compiler to be smart enough to detect that when I am mutably borrowing programs I am not borrowing neither btf file descriptor nor the maps so I can still borrow either of them (mutably or not) without issues. Borrowing of both the programs and the file descriptor is needed because loading of the programs needs to mutably borrow a program and (non mutably) borrow the btf file descriptor.

Makes sense. We still have accessors for maps and programs; should we remove or deprecate those?


aya/src/maps/mod.rs line 495 at r2 (raw file):

Previously, nrxus (Andres) wrote…

They become awkward to use if I use Option<BorrowedFd<'_>> because then usage of it when having a Option<OwnedFd> becomes something like: my_owned_fd.as_ref().map(|f| f.as_fd()). I could use something like Option<&OwnedFd> but that is significantly less flexible because then we need to have a reference to an owned fd. In our case we do always have an OwnedFd but I didn't know if that was too limiting going forwards.

Ok, let's leave this as is.


aya/src/sys/mod.rs line 20 at r2 (raw file):

Previously, nrxus (Andres) wrote…

We don't and I can remove the default if we don't want to deal with that but since so many functions already returned SysResult without any generics specified I figured adding a default would touch the least amount of code.

Let's just add the generic in a parent commit (without a default, please)

Copy link
Contributor Author

@nrxus nrxus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 7 of 50 files reviewed, 4 unresolved discussions (waiting on @tamird)


-- commits line 14 at r2:

Previously, tamird (Tamir Duberstein) wrote…

Yes, force push is the way to go. You'll need to rebase to resolve some of the clippy failures, and looks like CI is failing because all the tests need updating.

I believe I've fixed the build failures in the integrations tests (sorry I didn't catch those earlier, I usually just run cargo clippy --workspace which wasn't enough in this case. However, the clippy failures seem bogus to me. It looks like they are happening mostly due to a lint that is not yet available in stable and the lint seems to be poorly implemented so far (hence why it's not in stable yet I am guessing): rust-lang/rust-clippy#5546 (comment). Since the clippy errors don't seem to be about this PR but rather code that was here before that's only now starting to alert I am hesitant on addressing them for a nightly lint that is still a WIP. For example it's alerting about using a mutable reference to mapdata where we only need a reference for functions that update the bpf map because it doesn't know that we are modifying something under the hood using the file descriptor. I tried to disregard the warning but still pass clippy by adding #[allow(...)] but it didn't seem to do anything, perhaps because it's not a lint that is in stable yet I am not sure.


-- commits line 23 at r2:

Previously, tamird (Tamir Duberstein) wrote…

Makes sense. We still have accessors for maps and programs; should we remove or deprecate those?

I'd be in favor of just removing them since using the fields already gives more flexibility but I leave that decision up to you all. Let me know what you'd like me to do.

I do think that having a method in the Bpf object to load programs and have it pass the btf file descriptor is probably warranted since it's such a common use case. The problem though is that having a method like Bpf::load_program(&mut self, name: &str) doesn't work for all program types because some of them require more information than just the name (i.e., lsm, BTF trace point, fentry, fexit, and extension). This could be left as a future improvement in another PR though, the focus right now is to fix the file descriptor leak.


aya/src/maps/mod.rs line 891 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I think if you remove the generic you won't need all this type annotation

here and a few other places

Since we've decided to keep this generic I am marking this as done


aya/src/sys/mod.rs line 20 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Let's just add the generic in a parent commit (without a default, please)

Done.

Copy link
Contributor Author

@nrxus nrxus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 50 files reviewed, 4 unresolved discussions (waiting on @tamird)


-- commits line 14 at r2:

Previously, nrxus (Andres) wrote…

I believe I've fixed the build failures in the integrations tests (sorry I didn't catch those earlier, I usually just run cargo clippy --workspace which wasn't enough in this case. However, the clippy failures seem bogus to me. It looks like they are happening mostly due to a lint that is not yet available in stable and the lint seems to be poorly implemented so far (hence why it's not in stable yet I am guessing): rust-lang/rust-clippy#5546 (comment). Since the clippy errors don't seem to be about this PR but rather code that was here before that's only now starting to alert I am hesitant on addressing them for a nightly lint that is still a WIP. For example it's alerting about using a mutable reference to mapdata where we only need a reference for functions that update the bpf map because it doesn't know that we are modifying something under the hood using the file descriptor. I tried to disregard the warning but still pass clippy by adding #[allow(...)] but it didn't seem to do anything, perhaps because it's not a lint that is in stable yet I am not sure.

And just after I sent the last message I see that you all fixed it in a separate PR. I don't quite agree with making some of those &mut be & just because clipper said so but since it's not really relevant to this PR I just rebased to the latest main and the nightly clippy seems happy (on my machine at least).

@nrxus nrxus requested a review from tamird July 20, 2023 07:53
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 38 of 38 files at r3, 7 of 7 files at r4, 40 of 40 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @alessandrod and @nrxus)


-- commits line 23 at r2:

Previously, nrxus (Andres) wrote…

I'd be in favor of just removing them since using the fields already gives more flexibility but I leave that decision up to you all. Let me know what you'd like me to do.

I do think that having a method in the Bpf object to load programs and have it pass the btf file descriptor is probably warranted since it's such a common use case. The problem though is that having a method like Bpf::load_program(&mut self, name: &str) doesn't work for all program types because some of them require more information than just the name (i.e., lsm, BTF trace point, fentry, fexit, and extension). This could be left as a future improvement in another PR though, the focus right now is to fix the file descriptor leak.

I'm in favor of removing as well. I know @alessandrod cares about this sort of thing. @alessandrod can you please weigh in?


test/integration-test/src/tests/bpf_probe_read.rs line 108 at r5 (raw file):

    let mut bpf = Bpf::load(bytes).unwrap();

    let prog: &mut UProbe = bpf.programs.get_mut(prog_name).unwrap().try_into().unwrap();

instead of these point accesses, could we write

let Bpf { programs, btf_fd, .. } = &mut bpf;

that makes it clear that these things are borrowed from the same structure. here and everywhere please.

Copy link
Contributor Author

@nrxus nrxus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 7 of 50 files reviewed, 2 unresolved discussions (waiting on @alessandrod and @tamird)


test/integration-test/src/tests/bpf_probe_read.rs line 108 at r5 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

instead of these point accesses, could we write

let Bpf { programs, btf_fd, .. } = &mut bpf;

that makes it clear that these things are borrowed from the same structure. here and everywhere please.

Done.

@nrxus nrxus requested a review from tamird July 21, 2023 04:36
Copy link
Collaborator

@alessandrod alessandrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done a full review, just commenting on the load() change: I don't think this is an acceptable change API wise.

BTF is largely an implementation detail. As a user, in 99% of the cases, I don't even know that it exists. There's BTF from the kernel, there's BTF in object files produced by the toolchain, they're used by the kernel to do useful things, but as a user I never interact with it. I never create it, I never lookup things inside it, nothing.

With this change, you're forcing everyone to have to be aware of BTF, by having to shuttle this opaque btf fd thing around. Something that is an implementation detail, suddenly becomes a central thing in the API. Seems wrong to me.

@tamird
Copy link
Member

tamird commented Jul 21, 2023

WDYT about introducing a type that wraps Program + BorrowedFd, and making that the return from Bpf::program and Bpf::program_mut? That type could impl std::ops::Deref and std::ops::DerefMut with Target = Program, and the current Program::load function would move to that type. Then the user would never see the BTF file descriptor.

@tamird
Copy link
Member

tamird commented Jul 25, 2023

@nrxus does that sound reasonable? If you like, I can pick this up. We're desperately in need of fixing the FD ownership problems that are all over the codebase.

@nrxus
Copy link
Contributor Author

nrxus commented Jul 25, 2023

@nrxus does that sound reasonable? If you like, I can pick this up. We're desperately in need of fixing the FD ownership problems that are all over the codebase.

Ah my bad I thought your last question was for @alessandrod. I can try to implement that tonight and send it to you all for review. I also don't mind if you want to take over this PR, I also just want the file descriptors ownership fixed, it doesn't matter to me much by who 😅

@tamird
Copy link
Member

tamird commented Jul 25, 2023

Go for it! I'd rather cultivate more contributors :)

@nrxus
Copy link
Contributor Author

nrxus commented Jul 26, 2023

Go for it! I'd rather cultivate more contributors :)

I started down the road of making the intermediary struct (ProgramMutRef) but it gets pretty awkward since the current workflow is:

Bpf -> &mut Program -> &mut SpecificProgramType -> load

Because of this we would have to have a wrapper struct for each individual program type essentially (since not all programs are loaded with the same parameters). Before I start down that rabbit hole I think it may be worth mentioning alternatives:

  • Use Arc<OwnedFd> inside ProgramData. This means we keep the same interface as we do now where the BTF file descriptor is a hidden detail inside ProgramData. The con is that we now have the "cost" of using Arc. I don't think this is a huge deal because Arc is pretty cheap to store and the creation cost is just at load time of Bpf where it'd be minimal compared to the rest. Not really a con nor a plus but it is worth noting that in this approach the mental model of the btf file descriptor is that it isn't owned by anyone but rather just shared by all the programs and maps.

  • Add a Bpf::load_program method and a ProgramLoad trait that'd look something like:

trait ProgramLoad {
    type Data;
    
    fn load(&mut self, btf_fd: Option<impl AsFd>,data: Self::Data) -> Result<(), ProgramError>;
}

impl Bpf {
    /// Loads a program with default data - useful when the programs don't need any extra information (e.g., kprobes, tracepoints)
    fn load_program<P: ProgramLoad>(&mut self, name: &str) -> Result<P, ProgramError>
    where P::Data : Default {
        self.load_program_with(name, Data::Default)
    }
    
    /// Loads a program that needs extra information (e.g., LSM)
    fn load_program_with<P: ProgramLoad>(&mut self, name: &str, data: P::Data) -> Result<P, ProgramError> {
        let program = self.programs.get_mut(name).ok_or(ProgramError::NotFound)?:
        let program: &mut P = program.try_into()?;
        program.load(self.btf_fd.as_ref(), data)
    }
}

Then each program could implement ProgramLoad and define the kind of program-specific data they need. This has the benefit that the Bpf object is the clear owner of the BTF descriptor (the program and maps only gets it at load time) but the con is that it is a breaking change from the current way programs are load (from bpf.program_mut("my_probe").unwrap().load() to bpf.load_program("my_probe"). It is also more code to write but 🤷🏾 .

I am curious to hear what you all think or if you all have a different idea.

@tamird
Copy link
Member

tamird commented Jul 26, 2023

Go for it! I'd rather cultivate more contributors :)

I started down the road of making the intermediary struct (ProgramMutRef) but it gets pretty awkward since the current workflow is:

Bpf -> &mut Program -> &mut SpecificProgramType -> load

Because of this we would have to have a wrapper struct for each individual program type essentially (since not all programs are loaded with the same parameters). Before I start down that rabbit hole I think it may be worth mentioning alternatives:

Are you sure you need a separate type for each one? I think you just need a separate impl block for each program type.

struct WithBtf<'a, T> {
  program: &'a mut T,
  btf: BorrowedFd<'a>,
}

impl WithBtf<'_, Xdp> {
  fn load(...) {
    // this is the existing load method on Xdp
  }
}
  • Use Arc<OwnedFd> inside ProgramData. This means we keep the same interface as we do now where the BTF file descriptor is a hidden detail inside ProgramData. The con is that we now have the "cost" of using Arc. I don't think this is a huge deal because Arc is pretty cheap to store and the creation cost is just at load time of Bpf where it'd be minimal compared to the rest. Not really a con nor a plus but it is worth noting that in this approach the mental model of the btf file descriptor is that it isn't owned by anyone but rather just shared by all the programs and maps.

I'd rather not change performance characteristics or introduce multiple ownership.

  • Add a Bpf::load_program method and a ProgramLoad trait that'd look something like:
trait ProgramLoad {
    type Data;
    
    fn load(&mut self, btf_fd: Option<impl AsFd>,data: Self::Data) -> Result<(), ProgramError>;
}

impl Bpf {
    /// Loads a program with default data - useful when the programs don't need any extra information (e.g., kprobes, tracepoints)
    fn load_program<P: ProgramLoad>(&mut self, name: &str) -> Result<P, ProgramError>
    where P::Data : Default {
        self.load_program_with(name, Data::Default)
    }
    
    /// Loads a program that needs extra information (e.g., LSM)
    fn load_program_with<P: ProgramLoad>(&mut self, name: &str, data: P::Data) -> Result<P, ProgramError> {
        let program = self.programs.get_mut(name).ok_or(ProgramError::NotFound)?:
        let program: &mut P = program.try_into()?;
        program.load(self.btf_fd.as_ref(), data)
    }
}

Then each program could implement ProgramLoad and define the kind of program-specific data they need. This has the benefit that the Bpf object is the clear owner of the BTF descriptor (the program and maps only gets it at load time) but the con is that it is a breaking change from the current way programs are load (from bpf.program_mut("my_probe").unwrap().load() to bpf.load_program("my_probe"). It is also more code to write but 🤷🏾 .

This refactoring seems orthogonal to what we're doing here. I like it, but it seems orthogonal.

@nrxus nrxus force-pushed the use-owned-fd-for-btf branch 2 times, most recently from 44f3c25 to b2b52aa Compare July 27, 2023 10:10
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 43 of 43 files at r6, 7 of 7 files at r7, 40 of 40 files at r8, 43 of 43 files at r9, 8 of 8 files at r10, 7 of 42 files at r11, all commit messages.
Reviewable status: 17 of 52 files reviewed, 9 unresolved discussions (waiting on @nrxus)


-- commits line 20 at r11:
do we want to rewrite this message?


README.md line 85 at r11 (raw file):

// get the `ingress_filter` program compiled into `bpf.o`.
let mut ingress: aya::WithBtfFd<CgroupSkb> = bpf.program_mut("ingress_filter")?.try_into()?;

is it possible to put CgroupSkb elsewhere in this expression so that we don't have to name WithBtfFd?

here and everywhere


aya/src/bpf.rs line 749 at r11 (raw file):

}

/// A program with a BTF file descriptor

comments should be full sentences with punctuation please.

though I wonder if this should be #[doc(hidden)]?


aya/src/bpf.rs line 760 at r11 (raw file):

    fn deref(&self) -> &Self::Target {
        &*self.program

does self.program not work?


aya/src/bpf.rs line 770 at r11 (raw file):

}

impl<'p, P, T> AsRef<T> for WithBtfFd<'p, P>

how come these (AsRef, AsMut) are needed? won't auto deref make downstream code work even without these impls?


aya/src/bpf.rs line 791 at r11 (raw file):

/// The main entry point into the library, used to work with eBPF programs and maps.
#[derive(Debug)]
#[non_exhaustive]

fields are no longer public, so this can back out


aya/src/programs/cgroup_device.rs line 63 at r11 (raw file):

impl CgroupDevice {
    /// Loads the program inside the kernel
    pub fn load(&mut self, btf_fd: Option<impl AsFd>) -> Result<(), ProgramError> {

do we want to remove this, or at least make it not-pub? i.e. move the implementation entirely to WithBtfFd? here and everywhere.


aya/src/programs/mod.rs line 890 at r11 (raw file):

                fn try_from(program: $crate::WithBtfFd<'a, Program>) -> Result<$crate::WithBtfFd<'a, $ty>, ProgramError> {
                    Ok($crate::bpf::WithBtfFd {
                        program: program.program.try_into()?,

could you destructure WIthBtfFd everywhere?


aya/src/sys/bpf.rs line 574 at r11 (raw file):

        u.btf_log_size = log_buf.len() as u32;
    }
    let raw_fd = sys_bpf(bpf_cmd::BPF_BTF_LOAD, &attr)? as RawFd;

can you avoid the as cast here? let's use .try_into (there's an example of this already in the file)

@alessandrod
Copy link
Collaborator

Use Arc inside ProgramData. This means we keep the same interface as we do now where the BTF file descriptor is a hidden detail inside ProgramData. The con is that we now have the "cost" of using Arc. I don't think this is a huge deal because Arc is pretty cheap to store and the creation cost is just at load time of Bpf where it'd be minimal compared to the rest. Not really a con nor a plus but it is worth noting that in this approach the mental model of the btf file descriptor is that it isn't owned by anyone but rather just shared by all the programs and maps.

I think this is what we should do. The cost of this Arc would be insignificant - we only allocate one per object file, the allocation is tiny and is going to fit in the smallest size class provided by the allocator so it's not actually going to allocate anything.

The WithBtf approach doesn't seem viable to me for the same reason that load(..., btf_fd) isn't: the problem is the same, except now you've moved this weird BTF thing in return position, and I need to spell it out anytime I call bpf.program_mut() - from rustdocs:

    /// let mut program: aya::WithBtfFd<UProbe> = bpf.program_mut("SSL_read").unwrap().try_into()?;

@tamird
Copy link
Member

tamird commented Jul 27, 2023

I agree that the WithBtf type should be invisible in user code. I think if we can make that work, this approach remains viable.

@alessandrod
Copy link
Collaborator

I agree that the WithBtf type should be invisible in user code. I think if we can make that work, this approach remains viable.

But how? program and program_mut want you to specify the return type

@nrxus
Copy link
Contributor Author

nrxus commented Jul 28, 2023

I agree that the WithBtf type should be invisible in user code. I think if we can make that work, this approach remains viable.

But how? program and program_mut want you to specify the return type

If you could ascribe using only the program type, then you wouldn't need to care that you're getting a WithBtf<T> rather than a T.

The only way I can think of ascribing only the T is by changing program_mut to something like:

fn program_mut<P>(&mut self, name: &str) -> Result<WithBtfFd<'_, P>, ProgramError> {
   let program = self.programs.get_mut(name).ok_or(ProgramError::NotFound)?;
   program.try_into()
}

/* HOW IT'D BE USED */
let program = bpf.program_mut::<Kprobe>("my_kprobe").unwrap();
program.load()

But then we are requiring users to know statically what program type they are going to want when they ask for one whilst currently that is not a requirement. For example, in the way I've been using aya I have a list of program names that I need to load/attach, some are kprobes, some are tracepoints so my code looks something like:

program_names.iter().try_for_each(|p| {
    let Some(program) = bpf.program_mut(p).ok_or_else(|| Err(/*idk some error*/))?;
    match program {
        Program::KProbe(p) => p.load(),
        Program::Tracepoint(p) => p.load(),
        _ => return Err(/* unsupported program */)
    }
})

If we changed it to where program_mut needs to know the type statically then the above strategy is no longer doable. In my case I could just have a list of kprobe names and a list of tracepoint names since I know what they each are since I wrote the programs but that may not be the case across the board so I am concerned about cutting off potential workflows for users of aya. That all being said, I defer to you all since you all probably know best how aya is being used.

@tamird
Copy link
Member

tamird commented Jul 28, 2023

Agreed. Seems the best we can do is a shared FD then. Does it need to be Arc, or can it be Rc?

@nrxus
Copy link
Contributor Author

nrxus commented Jul 28, 2023

Agreed. Seems the best we can do is a shared FD then. Does it need to be Arc, or can it be Rc?

As far as I can tell ProgramData currently implements Send +Sync so I'd lean towards Arc so as to not remove functionality. I could totally see someone loading a Bpf instance in one thread and then sending it to another thread to manage its lifecycle after the programs are all attached/running.

@tamird
Copy link
Member

tamird commented Jul 28, 2023

Makes sense

This is just taking aya-rs#633 to its
logical conclusion. Because `std::os::fd` was only introduced as a
module in Rust v1.66.0 I have also updated the `Cargo.toml` of the
`aya` package to reflect the true MSRV. Note that this commit is *not*
the cause for this MSRV bump, that was done by a previous commit, this
commit is just making it explicit in the `Cargo.toml`
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 45 of 45 files at r12, 7 of 7 files at r13, 32 of 32 files at r14, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @alessandrod and @nrxus)


aya/src/sys/bpf.rs line 574 at r11 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can you avoid the as cast here? let's use .try_into (there's an example of this already in the file)

Let's address this one please.

aya/src/bpf.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@alessandrod alessandrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See clone comment, then it's ready to go!

Copy link
Contributor Author

@nrxus nrxus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 49 of 52 files reviewed, 2 unresolved discussions (waiting on @alessandrod and @tamird)


aya/src/bpf.rs line 550 at r14 (raw file):

Previously, alessandrod (Alessandro Decina) wrote…

Here and everywhere else, use Arc::clone(&fd) instead of fd.clone()

Done.


aya/src/sys/bpf.rs line 574 at r11 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Let's address this one please.

Done.

Although I personally think it's a bit unnecessary, the error here is impossible unless there's a kernel/rust bug right, it's not really something the user could handle.

@nrxus nrxus requested review from alessandrod and tamird July 28, 2023 18:07
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r15, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @alessandrod and @nrxus)


aya/src/sys/bpf.rs line 574 at r11 (raw file):

Previously, nrxus (Andres) wrote…

Done.

Although I personally think it's a bit unnecessary, the error here is impossible unless there's a kernel/rust bug right, it's not really something the user could handle.

Sure, feel free to panic if you think that's better - I'm just asking that we not cast the sign away.


aya/src/sys/bpf.rs line 467 at r15 (raw file):

    attr.__bindgen_anon_6.__bindgen_anon_1.prog_id = prog_id;
    // SAFETY: BPF_PROG_GET_FD_BY_ID returns a new file descriptor

can we please end comments with periods? here and 3 other places (let's match the style in the pre-existing comment please)

Copy link
Contributor Author

@nrxus nrxus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 51 of 52 files reviewed, 2 unresolved discussions (waiting on @alessandrod and @tamird)


aya/src/sys/bpf.rs line 467 at r15 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we please end comments with periods? here and 3 other places (let's match the style in the pre-existing comment please)

Done.

@nrxus nrxus requested a review from tamird July 28, 2023 19:13
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r16, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @alessandrod)

@tamird
Copy link
Member

tamird commented Jul 28, 2023

Looks like the integration tests are failing, are we dropping a file descriptor somewhere?

This fixes an existing file descriptor leak when there is BTF data in
the loaded object.

To avoid lifetime issues while having minimal impact to UX the
`OwnedFd` returned from the BPF_BTF_LOAD syscall will be wrapped in an
`Arc` and shared accross the programs and maps of the loaded BPF
file.
@nrxus
Copy link
Contributor Author

nrxus commented Jul 28, 2023

Looks like the integration tests are failing, are we dropping a file descriptor somewhere?

Small goof in my refactor to extract the fd_sys_bpf function. Fixed now.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r17, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @alessandrod)

@tamird tamird merged commit 13e83b2 into aya-rs:main Jul 28, 2023
22 checks passed
@tamird
Copy link
Member

tamird commented Jul 28, 2023

Thanks @nrxus!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants