-
Notifications
You must be signed in to change notification settings - Fork 97
[WIP] Take advantage of const generics in the Rust parser #903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| #[derive(Debug, Clone)] | ||
| pub struct BoundedSbpString<const SIZE: usize>([u8; SIZE]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only intentional change in functionality, when the SBP message specifies a fixed length string we store the entire string instead of up to the maximum length.
| vec.push(self.parse()?); | ||
| } | ||
| vec.try_into().map_err(|_| crate::Error::ParseError) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if there is a practical difference, but we could skip the intermediate vec here with something like
impl<'a, T, const SIZE: usize> SbpParse<[T; SIZE]> for &'a [u8]
where
&'a [u8]: SbpParse<T>,
{
fn parse(&mut self) -> Result<[T; SIZE]> {
let mut data: [std::mem::MaybeUninit<T>; SIZE] =
unsafe { std::mem::MaybeUninit::uninit().assume_init() };
for elem in &mut data[..] {
unsafe {
std::ptr::write(elem.as_mut_ptr(), self.parse()?);
}
}
let ptr = &mut data as *mut _ as *mut [T; SIZE];
let res = unsafe { ptr.read() };
core::mem::forget(data);
Ok(res)
}
}Honestly not sure what the last few lines do, I wanted to write
impl<'a, T, const SIZE: usize> SbpParse<[T; SIZE]> for &'a [u8]
where
&'a [u8]: SbpParse<T>,
{
fn parse(&mut self) -> Result<[T; SIZE]> {
let mut data: [std::mem::MaybeUninit<T>; SIZE] =
unsafe { std::mem::MaybeUninit::uninit().assume_init() };
for elem in &mut data[..] {
unsafe {
std::ptr::write(elem.as_mut_ptr(), self.parse()?);
}
}
unsafe { Ok(std::mem::transmute::<_, [T; SIZE]>(data)) }
}
}But apparently you can't transmute with const generics atm. We could probably also avoid the unsafe by adding a Default trait bound so we could init the array like let mut data = [Default::default(), SIZE];
Not sure this is worth the unsafe, but seems like it would avoid putting some stuff on the heap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to figure out how to avoid the heap allocation, but couldn't figure out how to use MaybeUninit without using even more unstable nightly APIs. I think we should consider something like what you've recommended, all of the fixed size arrays are pretty small so a heap allocation seems overly expensive if it can be avoided. Thanks for the starting point! I'll try and work it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have linked this earlier but for reference - https://doc.rust-lang.org/nomicon/unchecked-uninit.html
Explains why MaybeUninit::uninit().assume_init() can work (I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also based on that, I think the loop is better off as -
for i in 0..SIZE {
data[i] = std::mem::MaybeUninit::new(self.parse()?);
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what I have will leak memory if the array fills up partially then errors.MaybeUninit stops drop from getting called. I think that's fine if T is just a scalar - are there messages with nested arrays/vecs? If so I think we need to manually drop whatever got created
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SBP can only nest fixed length arrays (and strings), an unbounded array can't be followed by anything else and has to be made up of fields with a known size. If we change the string type to also used fixed size arrays then we should be fine. That being said it seems like we're asking for trouble by not calling drop on objects when we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in principle, I'm curious if there's a measurable impact on speed or memory efficiency? From the PR description I'm not sure I understand the benefit/motivation of using const generics over the current approach.
|
@silverjam The original argument I had, and what motivated me to propose this change, was to make the Rust API more representative of the SBP specification. When then spec says that there are exactly X elements in an array it seems better and more honest for the Rust types representing those messages to mirror that fact. Performance impact is a good question. In theory I would expect these changes to give a small boost since we are potentially avoiding a system call to allocate memory (assuming that I address @notoriaga's comment about removing the use of To be honest this is a fair amount of change for tweaking about 15 messages by my count. |
|
I'm going to close this. It doesn't seem like a good trade off at the moment, but we should keep this in mind for any future large scale refactors we do of the Rust parser code. |
This is a speculative PR that takes advantage of const generics available in the nightly rust compiler. We might want to consider this (or a similar) change once const generics are stable. This PR is split into two levels of changes with the second completely reworking the first. I'd suggest reviewing the changes in isolation treating the first change as a conservative modification and the second and a more intrusive refactor. We can drop the second set of changes if they are not of interest.
The first change modifies the rust code generator to use fixed size arrays in the message types when the message definition includes a known size. It also changes the existing functions to read fixed size arrays to be generic over array sizes instead of taking array size at runtime. Due to the complexity with our string handling code this change does not modify handling of fixed length strings.
The second change is a much wider refactor of our type parsing code in an attempt to reduce the amount of code that the SBP generator needs to generate. It adds a generic trait called
SbpParsefor parsing types out of something. We implement this trait for all the basic types and for each message, reading out of a&[u8]. We also have a generic implementation for parsing aVec<T>or[T; N]for anyTwhich can be parsed. Due to the trait definition we need to have separate types for fixed length strings and unbounded length strings so there can be two different parsing implementations. With the introduction of two string types a trait is added to give them a common interface. All of this has the end result of having to only generate a singleparsefunction for each message type and remove theparse_arrayandparse_array_limitfunctions from the generated code.