Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EndianReader<T> #298

Merged
merged 3 commits into from
May 16, 2018
Merged

Add EndianReader<T> #298

merged 3 commits into from
May 16, 2018

Conversation

fitzgen
Copy link
Member

@fitzgen fitzgen commented Apr 30, 2018

This is a work-in-progress / experimental approach to #294

I don't particularly care if this is the approach we end up taking, but I wanted to try making a generic thing that would work with Rc<[u8]> and Arc<[u8]> and more generally any T: Deref<[u8]> + Clone. The idea being that T might be some custom mmap type.

In particular, check out the table in the gimli::Reader doc comment in addition to the EndianReader<T> definition.

Thoughts @philipc, @koute?

@coveralls
Copy link

coveralls commented Apr 30, 2018

Coverage Status

Coverage increased (+0.07%) to 92.701% when pulling 6f5e049 on fitzgen:endian-reader into 54dd5a3 on gimli-rs:master.

@koute
Copy link
Contributor

koute commented Apr 30, 2018

So far this looks great!

Copy link
Collaborator

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Once the Pin APIs stabilize (for immovable data), we can probably do a more efficient implementation that's closer to what bytes does.

#[inline]
fn read_u16(&mut self) -> Result<u16> {
let slice = self.range.read_slice(2)?;
Ok(self.endian.read_u16(slice))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth adding default implementations to Reader for these, to slightly reduce the burden for implementations of Reader:

        let slice: [u8; 2] = self.read_u8_array()?;
        Ok(self.endian().read_u16(&slice))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can make it work, then totally. I think using read_u8_array would avoid the borrow issues I was running into, since it makes a copy. The copy is of a small size, and I expect llvm should also optimize it away, but verifying that requires digging into the disassembly.

@fitzgen
Copy link
Member Author

fitzgen commented May 1, 2018

This looks good.

Thanks. I know you were also thinking about solving this issue; was the API you were imagining much different?

Once the Pin APIs stabilize (for immovable data), we can probably do a more efficient implementation that's closer to what bytes does.

Can you expand a little more about how you expect Pin to come into play here? I think it would make some variant of EndianSlice easier to use, allowing the owning handle into hash maps without invalidating borrows, but I'd like to know what you're thinking.

@philipc
Copy link
Collaborator

philipc commented May 2, 2018

I know you were also thinking about solving this issue; was the API you were imagining much different?

My initial thoughts were along the lines of struct { bytes: Bytes, endian: Endian }. However, Bytes is restricted to being a Vec internally (whereas we want any Deref), and the internals of Bytes are more complex than I expected so I don't want to be reimplementing it.

Can you expand a little more about how you expect Pin to come into play here?

Sorry, that was based on expecting Pin to allow self referential pointers to [u8], but looking into it further I don't see how it will do that, since [u8]: Unpin. Correct me if I'm wrong, but it seems like Pin ended up being a solution only for specific immovable types (such as used in generators), rather than for the general self reference problem? The problem I want to solve is that using start/end offsets means we need an extra pointer deref due to Arc/Rc, so I was hoping for some way to replace start/end with ptr/len. I think start/end also needs another bounds check,

@fitzgen
Copy link
Member Author

fitzgen commented May 2, 2018

Thanks for the clarification!

My initial thoughts were along the lines of struct { bytes: Bytes, endian: Endian }. However, Bytes is restricted to being a Vec internally (whereas we want any Deref), and the internals of Bytes are more complex than I expected so I don't want to be reimplementing it.

Ok.

Correct me if I'm wrong, but it seems like Pin ended up being a solution only for specific immovable types (such as used in generators), rather than for the general self reference problem?

Sort of. A self-reference is safe as long as the underlying thing is pinned. I think it gives us a vocabulary for talking about this stuff, but not any off-the-shelf solutions for self-reference, if that makes sense. I also haven't read the RFC since it underwent a couple changes, so take this with a grain of salt.

The problem I want to solve is that using start/end offsets means we need an extra pointer deref due to Arc/Rc, so I was hoping for some way to replace start/end with ptr/len. I think start/end also needs another bounds check,

I think we can replace start/end with ptr/len and some unsafe code without changing the public interface, assumign we see this show up in profiles.

@fitzgen
Copy link
Member Author

fitzgen commented May 2, 2018

I think we can replace start/end with ptr/len and some unsafe code without changing the public interface, assumign we see this show up in profiles.

To expand a little here, there would still be ref counting, but instead of doing

fn slice(&self) -> &[u8] {
    bounds_check(*(arc).len, start)
    bounds_check(*(arc).len, end)
    (*(arc.ptr + start), end - start)
}

fn clone(&self) -> Self {
    arc.increment_ref_count()
    Self {
        arc: self.arc,
        start: self.start,
        end: self.end,
    }
}

we would do

fn slice(&self) -> &[u8] {
    (self.ptr, self.len)
}

fn clone(&self) -> Self {
    // Still need to keep the original thing alive.
    self.arc.increment_ref_count();
    Self {
        arc: self.arc,
        ptr: self.ptr,
        len: self.len,
    }
}

Although, come to think of it, I am unsure that this optimization would be valid with arbitrary T: Deref<[u8]> + Clone and it might require an unsafe trait that says "I guarantee to always deref to the same slice, and not to use interior mutability to destroy a slice I handed out previously".

@philipc
Copy link
Collaborator

philipc commented May 2, 2018

Although, come to think of it, I am unsure that this optimization would be valid with arbitrary T: Deref<[u8]> + Clone and it might require an unsafe trait that says "I guarantee to always deref to the same slice, and not to use interior mutability to destroy a slice I handed out previously".

Yes, I should have mentioned that there's https://crates.io/crates/stable_deref_trait for that, I had just thought that T: Pin<[u8]> was a similar thing, but it doesn't seem to be.

@philipc
Copy link
Collaborator

philipc commented May 10, 2018

I think this is going to be useful for converting addr2line to handle the Cow changes in object, since the data needs to live as long as the addr2line::Context, rather than being able to use local variables like in dwarfdump.

@fitzgen
Copy link
Member Author

fitzgen commented May 11, 2018

Picking this back up now.

@fitzgen fitzgen changed the title [WIP] experimental EndianReader<T> Add EndianReader<T> May 11, 2018
@fitzgen
Copy link
Member Author

fitzgen commented May 11, 2018

@philipc ok, I think this is ready for review!

Copy link
Collaborator

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of things to fix.

Also I checked the performance for bench_parsing_debug_info and EndianRcSlice is about 50% slower than EndianSlice.

/// ```
/// use std::rc::Rc;
///
/// let buf = Rc::from(&[1, 2, 3, 4][..]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way that EndianRcSlice and EndianArcSlice will be usable without allocating a copy of the slice like this does?

If you already have an allocation (eg a Vec or Cow) and you want to avoid a copy, then it looks like you need to use Rc<Vec<u8>> instead, but accessing the slice in that requires an extra deref.

Not sure there is anything we can do better here, it just wasn't obvious to me and I hadn't thought about it before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think about it, this is pretty inherent to the operation you're doing, since to avoid the extra deref, you need to put the refcount right before the heap elements, but if you're using a Vec, that means you would have needed to have already reserved that space. This is why you can turn a Vec<u8> into a Box<[u8]> with zero cost (because it is a fat pointer of (ptr, len)) but there is not a zero cost Vec<u8> into Rc<[u8]>.

It wouldn't be difficult to write a custom type that puts the refcount in front of the heap elements, but it would be a bunch of boilerplate and some unsafe code.

/// Say you have an `mmap`ed file that you want to serve as a `gimli::Reader`.
/// You can wrap that `mmap`ed file up in a `MmapFile` type and use
/// `EndianReader<Rc<MmapFile>>` or `EndianReader<Arc<MmapFile>>` as readers as
/// long as `MmapFile` dereferences to the underlying `[u8]` data.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a mmap of the whole file, not the sections, so if we wanted to use this example in practice, we'd need our object parser to tell us the offsets of the sections, or subtract pointers to recover the offsets, or have an object parser that uses gimli::Reader. Nothing to do in this PR, just more stuff I hadn't thought about before.

} else {
let start = self.start;
self.start += len;
Ok(&self.bytes[start..self.end])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be self.bytes[start..][..len].

// let buf = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0];
// let eb = EndianReader::new(&buf, NativeEndian);
// eb.split_at(30);
// }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests? I know the rest of the Reader stuff is lacking in unit tests, but I'd be happy with copying one of the tests in tests/parse_self.rs to use EndianRcSlice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally. I meant to do this and completely forgot. Need to hold myself to a higher standard -_-

if self.len() < len {
Err(Error::UnexpectedEof)
} else {
self.range.start = self.range.end - len;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be self.range.end = self.range.start + len;.

An easy way to define a custom `Reader` implementation with a reference to a
generic buffer of bytes and an associated endianity.

Fix gimli-rs#294
Looks like it takes about 1.58x the time that using `EndianSlice` does:

```
test bench_parsing_debug_info                                  ... bench:   2,261,953 ns/iter (+/- 142,827)
test bench_parsing_debug_info_with_endian_rc_slice             ... bench:   3,564,292 ns/iter (+/- 208,038)
```
@fitzgen
Copy link
Member Author

fitzgen commented May 15, 2018

@philipc ok, I added unit tests for EndianReader as well as a parse-self integration test and a parse-self benchmark.

As far as benchmark timing goes, my findings were similar to yours, where using EndianRcSlice ot parse .debug_info takes about 1.58x the time that using EndianSlice does.

@philipc philipc merged commit da9fad8 into gimli-rs:master May 16, 2018
@philipc
Copy link
Collaborator

philipc commented May 16, 2018

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants