[WIP] Take advantage of const generics in the Rust parser #903

jbangelo · 2021-01-16T18:58:18Z

This is a speculative PR that takes advantage of const generics available in the nightly rust compiler. We might want to consider this (or a similar) change once const generics are stable. This PR is split into two levels of changes with the second completely reworking the first. I'd suggest reviewing the changes in isolation treating the first change as a conservative modification and the second and a more intrusive refactor. We can drop the second set of changes if they are not of interest.

The first change modifies the rust code generator to use fixed size arrays in the message types when the message definition includes a known size. It also changes the existing functions to read fixed size arrays to be generic over array sizes instead of taking array size at runtime. Due to the complexity with our string handling code this change does not modify handling of fixed length strings.

The second change is a much wider refactor of our type parsing code in an attempt to reduce the amount of code that the SBP generator needs to generate. It adds a generic trait called SbpParse for parsing types out of something. We implement this trait for all the basic types and for each message, reading out of a &[u8]. We also have a generic implementation for parsing a Vec<T> or [T; N] for any T which can be parsed. Due to the trait definition we need to have separate types for fixed length strings and unbounded length strings so there can be two different parsing implementations. With the introduction of two string types a trait is added to give them a common interface. All of this has the end result of having to only generate a single parse function for each message type and remove the parse_array and parse_array_limit functions from the generated code.

jbangelo · 2021-01-16T23:37:34Z

rust/sbp/src/lib.rs

+#[derive(Debug, Clone)]
+pub struct BoundedSbpString<const SIZE: usize>([u8; SIZE]);


This is the only intentional change in functionality, when the SBP message specifies a fixed length string we store the entire string instead of up to the maximum length.

notoriaga · 2021-01-19T17:45:44Z

rust/sbp/src/parser/mod.rs

+            vec.push(self.parse()?);
+        }
+        vec.try_into().map_err(|_| crate::Error::ParseError)
+    }


Not sure if there is a practical difference, but we could skip the intermediate vec here with something like

impl<'a, T, const SIZE: usize> SbpParse<[T; SIZE]> for &'a [u8] where &'a [u8]: SbpParse<T>, { fn parse(&mut self) -> Result<[T; SIZE]> { let mut data: [std::mem::MaybeUninit<T>; SIZE] = unsafe { std::mem::MaybeUninit::uninit().assume_init() }; for elem in &mut data[..] { unsafe { std::ptr::write(elem.as_mut_ptr(), self.parse()?); } } let ptr = &mut data as *mut _ as *mut [T; SIZE]; let res = unsafe { ptr.read() }; core::mem::forget(data); Ok(res) } }

Honestly not sure what the last few lines do, I wanted to write

impl<'a, T, const SIZE: usize> SbpParse<[T; SIZE]> for &'a [u8] where &'a [u8]: SbpParse<T>, { fn parse(&mut self) -> Result<[T; SIZE]> { let mut data: [std::mem::MaybeUninit<T>; SIZE] = unsafe { std::mem::MaybeUninit::uninit().assume_init() }; for elem in &mut data[..] { unsafe { std::ptr::write(elem.as_mut_ptr(), self.parse()?); } } unsafe { Ok(std::mem::transmute::<_, [T; SIZE]>(data)) } } }

But apparently you can't transmute with const generics atm. We could probably also avoid the unsafe by adding a Default trait bound so we could init the array like let mut data = [Default::default(), SIZE];

Not sure this is worth the unsafe, but seems like it would avoid putting some stuff on the heap?

I was trying to figure out how to avoid the heap allocation, but couldn't figure out how to use MaybeUninit without using even more unstable nightly APIs. I think we should consider something like what you've recommended, all of the fixed size arrays are pretty small so a heap allocation seems overly expensive if it can be avoided. Thanks for the starting point! I'll try and work it in.

Should have linked this earlier but for reference - https://doc.rust-lang.org/nomicon/unchecked-uninit.html
Explains why MaybeUninit::uninit().assume_init() can work (I think)

Also based on that, I think the loop is better off as -

for i in 0..SIZE { data[i] = std::mem::MaybeUninit::new(self.parse()?); }

I think what I have will leak memory if the array fills up partially then errors.MaybeUninit stops drop from getting called. I think that's fine if T is just a scalar - are there messages with nested arrays/vecs? If so I think we need to manually drop whatever got created

SBP can only nest fixed length arrays (and strings), an unbounded array can't be followed by anything else and has to be made up of fields with a known size. If we change the string type to also used fixed size arrays then we should be fine. That being said it seems like we're asking for trouble by not calling drop on objects when we should.

silverjam

Looks good in principle, I'm curious if there's a measurable impact on speed or memory efficiency? From the PR description I'm not sure I understand the benefit/motivation of using const generics over the current approach.

jbangelo · 2021-02-04T00:23:51Z

@silverjam The original argument I had, and what motivated me to propose this change, was to make the Rust API more representative of the SBP specification. When then spec says that there are exactly X elements in an array it seems better and more honest for the Rust types representing those messages to mirror that fact. [T; X] is self documenting about the length while a Vec<T> could be any length and you need to read through the protocol documentation to know if it's truly variable length or will always be a fixed length. I acknowledge this is a more philosophical argument, and it might not be worth introducing a breaking API change for this.

Performance impact is a good question. In theory I would expect these changes to give a small boost since we are potentially avoiding a system call to allocate memory (assuming that I address @notoriaga's comment about removing the use of Vec in the array parsing function). On the other hand there might be more CPU used for copying when moving an array as compared to moving a vector. In reality I don't think there would be a noticeable change since these are such small arrays and the messages they are in are relatively small. Though we should test/benchmark this assumption to make sure I haven't introduced some slow down.

To be honest this is a fair amount of change for tweaking about 15 messages by my count.

jbangelo · 2021-05-17T18:36:55Z

I'm going to close this. It doesn't seem like a good trade off at the moment, but we should keep this in mind for any future large scale refactors we do of the Rust parser code.

jbangelo added 4 commits January 16, 2021 10:07

Changed rust generator to use fixed size arrays

b0c318a

Regened

d64b101

Refactored parsing code to use traits

0ce8112

Regened

60a6b36

jbangelo added the draft label Jan 16, 2021

jbangelo requested review from notoriaga and silverjam January 16, 2021 18:58

jbangelo commented Jan 16, 2021

View reviewed changes

notoriaga reviewed Jan 19, 2021

View reviewed changes

silverjam reviewed Feb 3, 2021

View reviewed changes

jbangelo closed this May 17, 2021

jbangelo deleted the jbangelo/fixed-size-rust-arrays branch May 17, 2021 18:36

notoriaga mentioned this pull request Sep 27, 2021

rust: release cleanup #1071

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[WIP] Take advantage of const generics in the Rust parser #903

[WIP] Take advantage of const generics in the Rust parser #903

Uh oh!

jbangelo commented Jan 16, 2021

Uh oh!

jbangelo Jan 16, 2021

Uh oh!

notoriaga Jan 19, 2021

Uh oh!

jbangelo Jan 19, 2021

Uh oh!

notoriaga Jan 20, 2021 •

edited

Loading

Uh oh!

notoriaga Jan 20, 2021

Uh oh!

notoriaga Jan 20, 2021 •

edited

Loading

Uh oh!

jbangelo Jan 21, 2021

Uh oh!

silverjam left a comment •

edited

Loading

Uh oh!

jbangelo commented Feb 4, 2021

Uh oh!

jbangelo commented May 17, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

		#[derive(Debug, Clone)]
		pub struct BoundedSbpString<const SIZE: usize>([u8; SIZE]);

[WIP] Take advantage of const generics in the Rust parser #903

[WIP] Take advantage of const generics in the Rust parser #903

Uh oh!

Conversation

jbangelo commented Jan 16, 2021

Uh oh!

jbangelo Jan 16, 2021

Choose a reason for hiding this comment

Uh oh!

notoriaga Jan 19, 2021

Choose a reason for hiding this comment

Uh oh!

jbangelo Jan 19, 2021

Choose a reason for hiding this comment

Uh oh!

notoriaga Jan 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

notoriaga Jan 20, 2021

Choose a reason for hiding this comment

Uh oh!

notoriaga Jan 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jbangelo Jan 21, 2021

Choose a reason for hiding this comment

Uh oh!

silverjam left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jbangelo commented Feb 4, 2021

Uh oh!

jbangelo commented May 17, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

notoriaga Jan 20, 2021 •

edited

Loading

notoriaga Jan 20, 2021 •

edited

Loading

silverjam left a comment •

edited

Loading