Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fr32: load more than one byte at a time before shifting #460

Closed
schomatis opened this issue Jan 29, 2019 · 1 comment
Closed

fr32: load more than one byte at a time before shifting #460

schomatis opened this issue Jan 29, 2019 · 1 comment

Comments

@schomatis
Copy link
Contributor

The current padding operation has at its core a basic shift function which traverses the input one byte at a time: loads it, shifts it (in a later pass the shift "overflow" is processed, moving it from one byte to the next one) and saves it in the in-progress output.

https://github.com/filecoin-project/rust-proofs/blob/372ed716ce839170d98771f323beef56e826c175/sector-base/src/io/fr32.rs#L464-L470

As discussed with @porcuquine this algorithm could be (easily, I think) extended to load more than one byte at a time, e.g., load 8 bytes in a u64 and perform a single register shift before saving it back again in memory. This can be accomplished with the (now) stabilized from_le (and related variants) that basically performs a normal load in LE architectures (which I'm guessing it's going to be the majority of architectures where this is going to run) or a (more expensive) swap_bytes call to reorder them in a BE architecture (even though that's more expensive I don't know if this actually more expensive that my current implementation).

I think (but haven't dedicated this enough time) that this variation can be patched with relatively low effort (the most significant incorporation to the algorithm would be to handle the cases where the input we're handling is lower than a, say, u64 and we need to perform smaller loads). I'm leaving this in the backlog pipeline (or maybe it should be in the ready one?) as I reminder to myself that there's a potentially low hanging fruit (we have no benchmarks that clearly indicate how much can be gained here) to be taken when there's not a higher priority task at hand.

Also, @dignifiedquire suggested the bitwise crate that even though it's a bit more involved than the first approach it seems like a natural iteration to it if the from_le is indeed useful, to start offloading more logic away and reducing the size of the current ad hoc bit-shifting solution.

@schomatis schomatis added perf P2 Low priority labels Jan 29, 2019
@schomatis schomatis self-assigned this Jan 29, 2019
@schomatis
Copy link
Contributor Author

(Assigning this a fake estimate of 5 -based on other issues with same value- but I have no idea how this valuation system actually works.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants