You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The current padding operation has at its core a basic shift function which traverses the input one byte at a time: loads it, shifts it (in a later pass the shift "overflow" is processed, moving it from one byte to the next one) and saves it in the in-progress output.
As discussed with @porcuquine this algorithm could be (easily, I think) extended to load more than one byte at a time, e.g., load 8 bytes in a u64 and perform a single register shift before saving it back again in memory. This can be accomplished with the (now) stabilized from_le (and related variants) that basically performs a normal load in LE architectures (which I'm guessing it's going to be the majority of architectures where this is going to run) or a (more expensive) swap_bytes call to reorder them in a BE architecture (even though that's more expensive I don't know if this actually more expensive that my current implementation).
I think (but haven't dedicated this enough time) that this variation can be patched with relatively low effort (the most significant incorporation to the algorithm would be to handle the cases where the input we're handling is lower than a, say, u64 and we need to perform smaller loads). I'm leaving this in the backlog pipeline (or maybe it should be in the ready one?) as I reminder to myself that there's a potentially low hanging fruit (we have no benchmarks that clearly indicate how much can be gained here) to be taken when there's not a higher priority task at hand.
Also, @dignifiedquire suggested the bitwise crate that even though it's a bit more involved than the first approach it seems like a natural iteration to it if the from_le is indeed useful, to start offloading more logic away and reducing the size of the current ad hoc bit-shifting solution.
The text was updated successfully, but these errors were encountered:
The current padding operation has at its core a basic shift function which traverses the input one byte at a time: loads it, shifts it (in a later pass the shift "overflow" is processed, moving it from one byte to the next one) and saves it in the in-progress output.
https://github.com/filecoin-project/rust-proofs/blob/372ed716ce839170d98771f323beef56e826c175/sector-base/src/io/fr32.rs#L464-L470
As discussed with @porcuquine this algorithm could be (easily, I think) extended to load more than one byte at a time, e.g., load 8 bytes in a
u64
and perform a single register shift before saving it back again in memory. This can be accomplished with the (now) stabilizedfrom_le
(and related variants) that basically performs a normal load in LE architectures (which I'm guessing it's going to be the majority of architectures where this is going to run) or a (more expensive)swap_bytes
call to reorder them in a BE architecture (even though that's more expensive I don't know if this actually more expensive that my current implementation).I think (but haven't dedicated this enough time) that this variation can be patched with relatively low effort (the most significant incorporation to the algorithm would be to handle the cases where the input we're handling is lower than a, say,
u64
and we need to perform smaller loads). I'm leaving this in thebacklog
pipeline (or maybe it should be in theready
one?) as I reminder to myself that there's a potentially low hanging fruit (we have no benchmarks that clearly indicate how much can be gained here) to be taken when there's not a higher priority task at hand.Also, @dignifiedquire suggested the
bitwise
crate that even though it's a bit more involved than the first approach it seems like a natural iteration to it if thefrom_le
is indeed useful, to start offloading more logic away and reducing the size of the current ad hoc bit-shifting solution.The text was updated successfully, but these errors were encountered: