Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

maps,programs: avoid path UTF-8 assumptions #742

Merged
merged 2 commits into from
Aug 24, 2023
Merged

Conversation

tamird
Copy link
Member

@tamird tamird commented Aug 10, 2023

This change is Reviewable

@netlify
Copy link

netlify bot commented Aug 10, 2023

Deploy Preview for aya-rs-docs ready!

Name Link
🔨 Latest commit 0bba9b1
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/64e75f409e516800089ac4a1
😎 Deploy Preview https://deploy-preview-742--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mergify mergify bot added the aya This is about aya (userspace) label Aug 10, 2023
@tamird
Copy link
Member Author

tamird commented Aug 10, 2023

@addisoncrump care to review this as well?

@addisoncrump
Copy link
Contributor

Sure, I'll take a peek in a bit.

@tamird tamird changed the title programs: avoid assumptions that paths are UTF-8 maps,programs: avoid path UTF-8 assumptions Aug 10, 2023
@mergify mergify bot added the aya-obj Relating to the aya-obj crate label Aug 10, 2023
Copy link
Member

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me.x

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dave-tucker and @tamird)


aya/src/programs/uprobe.rs line 295 at r1 (raw file):

        .filter_map(|mut line| {
            loop {
                if let [stripped @ .., c] = line {

Could be the following, but as you noted, it'll have a bounds check.

    let line = line
        .iter()
        .rev()
        .position(|&b| !b.is_ascii_whitespace())
        .and_then(|pos| line.get(..line.len() - pos))
        .unwrap_or(line);

@tamird tamird requested a review from ajwerner August 10, 2023 20:28
Copy link
Member Author

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 9 files reviewed, all discussions resolved (waiting on @ajwerner and @dave-tucker)


aya/src/programs/uprobe.rs line 295 at r1 (raw file):

Previously, ajwerner wrote…

Could be the following, but as you noted, it'll have a bounds check.

    let line = line
        .iter()
        .rev()
        .position(|&b| !b.is_ascii_whitespace())
        .and_then(|pos| line.get(..line.len() - pos))
        .unwrap_or(line);

I think you'd want to protect yourself from bugs by writing

    let line = line
        .iter()
        .rev()
        .position(|&b| !b.is_ascii_whitespace())
        .map(|pos| line.get(..line.len() - pos).unwrap())
        .unwrap_or(line);

at which point this is looking rather ugly.

@@ -68,7 +68,7 @@ impl KProbe {
///
/// The returned value can be used to detach from the given function, see [KProbe::detach].
pub fn attach(&mut self, fn_name: &str, offset: u64) -> Result<KProbeLinkId, ProgramError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly annoying use case, but it is possible to store non-UTF-8 names in symbol string tables, which occasionally shows up in e.g. source-protected code (and may be present in loaded kernel modules). If we want to support this we should probably use the <P: AsRef<Path>> idiom, and I don't think it would break things for downstream. This holds for UProbe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example of this:

$ cat evil.c 
#include <stdio.h>

void thing() {
  puts("hello!");
}

int main(int argc, char **argv) {
  thing();
  return 0;
}
$ gcc -O0 -c evil.c 
$ objcopy --redefine-sym thing="$(printf '\x80')" evil.o 
$ objdump -t evil.o 

evil.o:     file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l    df *ABS*	0000000000000000 evil.c
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000000 l    d  .rodata	0000000000000000 .rodata
0000000000000000 g     F .text	0000000000000011 �
0000000000000000         *UND*	0000000000000000 puts
0000000000000011 g     F .text	0000000000000020 main


$ gcc evil.o -o evil
$ ./evil 
hello!

aya-obj/src/obj.rs Show resolved Hide resolved
@@ -49,6 +49,39 @@ pub struct UProbe {
pub(crate) kind: ProbeKind,
}

trait OsStringExt {
fn starts_with(&self, needle: &OsStr) -> bool;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's a needle? I would have expected pattern

@@ -172,6 +204,7 @@ fn test_resolve_attach_path() {
// Now let's resolve the path to libc. It should exist in the current process's memory map and
// then in the ld.so.cache.
let libc_path = resolve_attach_path(&"libc", Some(pid)).unwrap();
let libc_path = libc_path.to_str().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could panic, because you're making another assumption about UTF-8 paths here.
The assertion below should probably be either that the libc_path.file_name() contains libc.

}

fn find_lib_in_proc_maps(pid: pid_t, lib: &str) -> Result<Option<String>, io::Error> {
fn find_lib_in_proc_maps(pid: pid_t, lib: &Path) -> Result<Option<PathBuf>, io::Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib should be an &OsStr - it's never used as a &Path here.

aya/src/sys/netlink.rs Show resolved Hide resolved
@@ -68,7 +68,7 @@ impl KProbe {
///
/// The returned value can be used to detach from the given function, see [KProbe::detach].
pub fn attach(&mut self, fn_name: &str, offset: u64) -> Result<KProbeLinkId, ProgramError> {
attach(&mut self.data, self.kind, fn_name, offset, None)
attach(&mut self.data, self.kind, Path::new(fn_name), offset, None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub fn attach<T: AsRef<Path>(&mut self, fn_name: T, offset: u64) -> Result<KProbeLinkId, ProgramError> {

This won't break the API since both &str and &OsStr implement AsRef<Path>.
Not sure there's a function we'd kprobe that isn't UTF-8, but 🤷 we may as well be thorough.

@@ -167,6 +167,7 @@ fn create_probe_event(
KRetProbe | URetProbe => 'r',
};

let fn_name = fn_name.to_string_lossy();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it OK to do lossy conversion here but not in other places?
Ultimately it looks like this name is being written to a file as bytes so a lossy conversion is unnecessary.

} else {
lib.to_string()
};
pub fn resolve(&self, lib: &Path) -> Option<&Path> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think lib should be &OsStr here too.

@@ -127,31 +160,30 @@ impl UProbe {
fn resolve_attach_path<T: AsRef<Path>>(
target: &T,
pid: Option<pid_t>,
) -> Result<Cow<'_, str>, UProbeError> {
) -> Result<Cow<'_, Path>, UProbeError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need Cow here? With the ldsocache changes to store PathBuf, 2/3 successful code paths yield a pathbuf, where the final one returns the input if the path is absolute.

key: String,
value: String,
key: OsString,
value: OsString,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Value should be PathBuf

Copy link
Member Author

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 9 files reviewed, 11 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @dave-tucker)


aya/src/programs/kprobe.rs line 70 at r2 (raw file):

Previously, addisoncrump (Addison Crump) wrote…

As an example of this:

$ cat evil.c 
#include <stdio.h>

void thing() {
  puts("hello!");
}

int main(int argc, char **argv) {
  thing();
  return 0;
}
$ gcc -O0 -c evil.c 
$ objcopy --redefine-sym thing="$(printf '\x80')" evil.o 
$ objdump -t evil.o 

evil.o:     file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l    df *ABS*	0000000000000000 evil.c
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000000 l    d  .rodata	0000000000000000 .rodata
0000000000000000 g     F .text	0000000000000011 �
0000000000000000         *UND*	0000000000000000 puts
0000000000000011 g     F .text	0000000000000020 main


$ gcc evil.o -o evil
$ ./evil 
hello!

Sounds reasonable, but I'll leave that for a future change to avoid mixing API and non-API changes.


aya/src/programs/kprobe.rs line 71 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…
pub fn attach<T: AsRef<Path>(&mut self, fn_name: T, offset: u64) -> Result<KProbeLinkId, ProgramError> {

This won't break the API since both &str and &OsStr implement AsRef<Path>.
Not sure there's a function we'd kprobe that isn't UTF-8, but 🤷 we may as well be thorough.

It might break type inference if someone's .into()ing the operand.

I think that's better, but I'd prefer to avoid API review here.


aya/src/programs/probe.rs line 170 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Why is it OK to do lossy conversion here but not in other places?
Ultimately it looks like this name is being written to a file as bytes so a lossy conversion is unnecessary.

The very next line calls str::replace, so my feeling is that this need not match exactly. We can probably avoid the conversion but the code will become more complex. Do you know if we need to?


aya/src/programs/uprobe.rs line 53 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

What's a needle? I would have expected pattern

It's not a pattern though. See the argument name here: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.starts_with


aya/src/programs/uprobe.rs line 163 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Do we still need Cow here? With the ldsocache changes to store PathBuf, 2/3 successful code paths yield a pathbuf, where the final one returns the input if the path is absolute.

This is a fair point, but what would you have this return? Result<Option<PathBuf>, UProbeError>? Is that better?


aya/src/programs/uprobe.rs line 207 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

This could panic, because you're making another assumption about UTF-8 paths here.
The assertion below should probably be either that the libc_path.file_name() contains libc.

This is a test.


aya/src/programs/uprobe.rs line 316 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

lib should be an &OsStr - it's never used as a &Path here.

I tried to keep OsStr out of the various function signatures. The caller has a Path, it doesn't care that we covert to OsStr for "reasons".


aya/src/programs/uprobe.rs line 333 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Value should be PathBuf

Why?


aya/src/programs/uprobe.rs line 440 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Again, I think lib should be &OsStr here too.

See previous reply.


aya/src/sys/netlink.rs line 745 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Again, struggling to see how this needs to be part of this commit.

It can be a separate commit. I was just hunting for unnecessary lossy string conversions.


aya-obj/src/obj.rs line 1650 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

I'm not sure how these changes are relevant to the commit?

It can be a separate commit. I was just hunting for unnecessary lossy string conversions.

Copy link
Member Author

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 9 files reviewed, 11 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @dave-tucker)


aya/src/sys/netlink.rs line 745 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

It can be a separate commit. I was just hunting for unnecessary lossy string conversions.

Extracted into a parent commit.


aya-obj/src/obj.rs line 1650 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

It can be a separate commit. I was just hunting for unnecessary lossy string conversions.

Extracted into a parent commit.

Copy link
Member

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 9 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @tamird)


aya/src/programs/probe.rs line 170 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

The very next line calls str::replace, so my feeling is that this need not match exactly. We can probably avoid the conversion but the code will become more complex. Do you know if we need to?

See:
https://docs.kernel.org/trace/kprobetrace.html
https://docs.kernel.org/trace/uprobetracer.html

Which describes the format of this file for kprobes/uprobes.

Atlet probe... later in this function .. we should be using the path verbatim - not after a lossy conversion. Assuming it follows standard kernel convention. Reference: https://lwn.net/Articles/71472/

The event alias component, where fixed_fn_name is used transforms /bin/bash into bin_bash where it's used to build a unique id - aya_1234_bin_bash. I think we also need to do further due diligence on whether the presence of https://doc.rust-lang.org/std/char/constant.REPLACEMENT_CHARACTER.html in this identifier would cause issues.


aya/src/programs/uprobe.rs line 163 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

This is a fair point, but what would you have this return? Result<Option<PathBuf>, UProbeError>? Is that better?

I feel Result<Option<PathBuf>, UProbeError> is better, but that's personal preference.


aya/src/programs/uprobe.rs line 333 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Why?

You can check the output of ldconfig -p the keys are OsString, the values are PathBuf.

Copy link
Member Author

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 9 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @ajwerner, and @dave-tucker)


aya/src/programs/probe.rs line 170 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

See:
https://docs.kernel.org/trace/kprobetrace.html
https://docs.kernel.org/trace/uprobetracer.html

Which describes the format of this file for kprobes/uprobes.

Atlet probe... later in this function .. we should be using the path verbatim - not after a lossy conversion. Assuming it follows standard kernel convention. Reference: https://lwn.net/Articles/71472/

The event alias component, where fixed_fn_name is used transforms /bin/bash into bin_bash where it's used to build a unique id - aya_1234_bin_bash. I think we also need to do further due diligence on whether the presence of https://doc.rust-lang.org/std/char/constant.REPLACEMENT_CHARACTER.html in this identifier would cause issues.

OK this was pretty involved, but done.


aya/src/programs/uprobe.rs line 163 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

I feel Result<Option<PathBuf>, UProbeError> is better, but that's personal preference.

Looking at this again: it's not correct. 1/3 code paths returns an owned PathBuf, the other 2 return Path. Even if ldsocache held PathBuf, it would return borrowed Path. Left as-is.


aya/src/programs/uprobe.rs line 333 at r2 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

You can check the output of ldconfig -p the keys are OsString, the values are PathBuf.

OsString and PathBuf are the same type, we should choose the one that's easier to work with. In this case it's easier to work with OsString because all the usages (e.g. the various calls to strip_suffix and friends) want the OsString semantics, not the Path semantics.

@mergify
Copy link

mergify bot commented Aug 11, 2023

Hey @alessandrod, this pull request changes the Aya Public API and requires your review.

@mergify mergify bot requested a review from alessandrod August 11, 2023 16:27
@mergify
Copy link

mergify bot commented Aug 14, 2023

@tamird, this pull request is now in conflict and requires a rebase.

Copy link
Member

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 6 files at r2, 7 of 7 files at r5, 6 of 9 files at r6.
Reviewable status: 8 of 11 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)


aya/src/programs/probe.rs line 51 at r6 (raw file):

    bytes.as_ref().split(|b| b == &b'\n').map(|mut line| {
        loop {

nit: while let is a thing

    while let [stripped @ .., c] = line {
        if c.is_ascii_whitespace() {
            line = stripped;
            continue
        }
        break;
    }

aya/src/programs/uprobe.rs line 258 at r6 (raw file):

        .split(|b| b == &b'\n')
        .filter_map(|mut line| {
            loop {

same nit from above: while let exists.

Copy link
Member

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 9 files at r6, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)

Copy link
Member Author

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)


aya/src/programs/probe.rs line 51 at r6 (raw file):

Previously, ajwerner wrote…

nit: while let is a thing

    while let [stripped @ .., c] = line {
        if c.is_ascii_whitespace() {
            line = stripped;
            continue
        }
        break;
    }

Done.


aya/src/programs/uprobe.rs line 258 at r6 (raw file):

Previously, ajwerner wrote…

same nit from above: while let exists.

Done.

Copy link
Member

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 9 files at r7, 1 of 9 files at r8.
Reviewable status: 3 of 11 files reviewed, 7 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)

@mergify
Copy link

mergify bot commented Aug 17, 2023

@tamird, this pull request is now in conflict and requires a rebase.

Copy link
Contributor

@addisoncrump addisoncrump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but we need to also apply similar changes for function names in the future.

@tamird tamird force-pushed the avoid-utf-assumption branch 2 times, most recently from f2f58b1 to 8ad04c7 Compare August 23, 2023 20:27
We can be strict in tests.
@mergify
Copy link

mergify bot commented Aug 24, 2023

@tamird, this pull request is now in conflict and requires a rebase.

@mergify mergify bot added the needs-rebase label Aug 24, 2023
@mergify mergify bot removed the needs-rebase label Aug 24, 2023
@tamird tamird merged commit 8ffd9bb into main Aug 24, 2023
22 checks passed
@tamird tamird deleted the avoid-utf-assumption branch August 24, 2023 15:16
Copy link
Member Author

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @addisoncrump and @dave-tucker from 7 discussions.
Reviewable status: 1 of 15 files reviewed, all discussions resolved


aya/src/programs/kprobe.rs line 70 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Sounds reasonable, but I'll leave that for a future change to avoid mixing API and non-API changes.

#765


aya/src/programs/kprobe.rs line 71 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

It might break type inference if someone's .into()ing the operand.

I think that's better, but I'd prefer to avoid API review here.

#765

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/needs-review Makes an API change that needs review aya This is about aya (userspace) aya-obj Relating to the aya-obj crate test A PR that improves test cases or CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants