Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slice (bulk) volatile write/read methods on VolBlock #28

Open
nicopap opened this issue Sep 25, 2022 · 3 comments
Open

Slice (bulk) volatile write/read methods on VolBlock #28

nicopap opened this issue Sep 25, 2022 · 3 comments

Comments

@nicopap
Copy link
Contributor

nicopap commented Sep 25, 2022

Currently, the only way to write a large amount of data into volatile memory is through iterating through individual VolAddress and writing a single T to memory one after another.

I've not inspected myself the generated code, but it seems reasonable to assume this results in slower than optimal code (it seems fast, so maybe it actually is optimizing it!) I've read your amazing writeup on how to hand optimize load/store ASM code, and it seems reasonable that a write_slice and read_slice methods could make use of better intrinsics such as volatile_copy_memory and volatile_set_memory (Using a pointer cast might just work??)

impl VolBlock {
  pub fn write_slice_at_offset(self, offset: usize, slice: &[T]) {
    //...
  }
  pub fn read_slice_at_offset<const L: usize>(self, offset: usize) -> [T; L] {
    //...
  }
  pub fn write_slice(self, slice: &[T]) {
    //...
  }
  pub fn read_slice<const L: usize>(self) -> [T; L] {
    //...
  }
}

I've written already such methods (though without optimizations, just the API) and I'd be happy to contribute it back if this is wanted.

@Lokathor
Copy link
Member

The problem with the volatile_copy_memory and volatile_set_memory intrinsics is that they're still unstable, and likely to be unstable for a long time. That's why I made the design as simple as possible to kinda work around that intrinsic being missing.

I think some of this sort of thing is in the VolSlice type, which is behind a feature gate right now, but it probably would be fine to stabilize into the main library. And the VolBlock type will deref into the VolSlice type, once things are ready.

@nicopap
Copy link
Contributor Author

nicopap commented Sep 26, 2022

Oh I see now. I completely missed the VolRegion struct. That looks exactly like what I'm describing in this issue. Amazing! I guess it's still possible to featuregate the optimized version and keep the sequential read/slice in the stable implementation.

Looking at the volatile crate, I see that's exactly what they came up with.

Though I wonder if casting to a VolAddress<[T; N]> then writing/reading the array to/from memory compiles down into volatile_copy/set_memory. Which would be an alright stable alternative (although requires specifying the exact length)

@Lokathor
Copy link
Member

I think for careful optimization purposes, a little more context might be necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants