Message encoding / decoding / logging #18

tkporter · 2022-12-07T14:31:56Z

This is a fun one!

It's super important that the Fuel/Sway Mailboxes encode messages and keccak256 them in the exact same way they do in the EVM/Solidity world. Unfortunately, Fuel/Sway has some quirks that makes it hard to get this out of the box:

No equivalent of abi.encodePacked. The only encoding that sort of exists is the implied encoding of primitive types in structs. However these aren't packed tightly -- they're padded to fit in whole words. A word in Fuel is 64 bits or 8 bytes. So the following struct would look like this:

struct Foo {
  bar: u8,
  baz: u32,
}

let foo = Foo {
  bar: u8::max(),
  baz: u32::max(),
};

// In memory:
00 00 00 00 00 00 00 FF // <-- the u8 max, left padded with zeroes into the first word
00 00 00 00 FF FF FF FF // <-- the u32 max, left padded with zeroes into the second word

Vec<u8> in Sway allocates its elements on the heap. Similar to structs, it doesn't tightly pack its elements-- each u8 element has it's own 64 bit word allocated to it! The leftmost 7 bytes are zeroes, the rightmost byte is the element. You also can't keccak or log the Vec type out of the box (you pretty much need to write assembly to do this)
The new Bytes type hasn't yet been shipped but has been merged. It provides tight packing of bytes but nothing to help with encoding / decoding non u8 types.

So what can we do?

Originally, I wrote some mostly hand-crafted bitwise operations for EncodedMessage to craft each word and read / write to memory directly. This is pretty fragile and doesn't translate well to other use cases

So instead, this PR provides bytes_extended, which is an extension to the Bytes type. It essentially lets you read and write types at specific offsets in the Bytes. It also provides access to keccak256 and logging over these bytes.

yorhodes

great work on the byte packing, super hairy

yorhodes · 2022-12-07T20:45:00Z

hyperlane-mailbox/src/main.sw

+fn msg_sender_b256() -> b256 {
+    match msg_sender().unwrap() {
+        Identity::Address(address) => address.into(),
+        Identity::ContractId(id) => id.into(),
+    }
+}


hyperlane-mailbox/tests/harness.rs

yorhodes · 2022-12-07T20:49:59Z

hyperlane-message-test/src/main.sw

+    fn id(message: Message) -> b256 {
+        EncodedMessage::from(message).id()
+    }


can we replace all of these with

Suggested change

fn id(message: Message) -> b256 {

EncodedMessage::from(message).id()

}

fn encode(message: Message) -> EncodedMessage {

EncodedMessage::from(message)

}

does this even need to be a contract?

I think we need it to be a contract - I don't think the Rust-based tests could otherwise get access to the library methods for EncodedMessage otherwise. If you try to run abigen in a way where it needs to return the struct EncodedMessage in a function, it'll complain because there's no equivalent for the raw_ptr. Also confirmed by running cargo expand --tests that it just gives you naive bindings to struct types but not to any functions of the structs, so even if we could use raw_ptr we wouldn't be able to call library functions

yorhodes · 2022-12-07T21:42:06Z

hyperlane-message-test/tests/harness.rs

+        // each of these words. We only care about the rightmost byte for each
+        // word that's logged.
+        let body: Vec<u8> = body_log_data
+            .chunks(BYTES_PER_WORD)
+            .map(|buf| buf[BYTES_PER_WORD - 1])
+            .collect();


yorhodes · 2022-12-07T21:42:45Z

hyperlane-message/src/main.sw

+        let mut word: u64 = (version << 56);
+        word = word | (nonce << 24) | (origin >> 8);


any reason to assign multiple times?

Unfortunately yes lol - there's some jankiness in the compiler that requires this because otherwise it things word is of a non-u64 type. super weird, I describe it here https://github.com/hyperlane-xyz/fuel-contracts/pull/18/files#diff-29cc6efee7411b24b785a04ecdee0529fdef41ff993f0f37d3f4a97fc50102f6R92

yorhodes · 2022-12-09T15:52:09Z

hyperlane-message/src/main.sw

+        //
+        // recipient_word3  [0:40]  - 5 bytes (continued from previous word)
+        // body             [40:??]
+        word = (recipient_word3 << 24);


I tried to reason through all of this but tbh it seems way too easy to miss something minor here

I do trust your test coverage tho

just wonder if we should wait for struct and array packing
seems like a pretty important feature

yorhodes · 2022-12-09T15:54:18Z

hyperlane-message/src/main.sw

+            (word1 << 8) | (word2 >> 56), // The last 7 bytes of word 1 and the first byte of word 2
+            (word2 << 8) | (word3 >> 56), // The last 7 bytes of word 2 and the first byte of word 3
+            (word3 << 8) | (word4 >> 56), // The last 7 bytes of word 3 and the first byte of word 4
+            (word4 << 8) | (word5 >> 56), // The last 7 bytes of word 4 and the first byte of word 5


this almost makes me want to rethink the layout lol

yorhodes · 2022-12-09T15:55:13Z

hyperlane-message/src/main.sw

+    pub fn body(self) -> Vec<u8> {
+        let body_len = self.buffer.bytes_len - PREFIX_BYTES;
+
+        let mut body: Vec<u8> = Vec::with_capacity(body_len);
+
+        // The body starts in word 9.
+        let mut current_word_index = 9;
+        let mut word = self.buffer.read_word(current_word_index);
+
+        let mut body_index = 0;
+        while body_index < body_len {
+            // Where 0 means the furthest left byte in the word.
+            let byte_index_within_word = (body_index + BODY_START_BYTE_IN_WORD) % BYTES_PER_WORD;
+            let right_shift = ((7 - byte_index_within_word) * BITS_PER_BYTE);
+            // Push the byte to the Vec.
+            let byte: u8 = (word >> right_shift) & 0xff;
+            body.push(byte);
+
+            // If this was the last byte in the word, read the next word.
+            if byte_index_within_word == 7u64 {
+                // Move to the next word.
+                current_word_index += 1;
+
+                word = self.buffer.read_word(current_word_index);
+            }
+
+            body_index += 1;
+        }
+
+        body
+    }


I really would think we can just return a pointer with len into the buffer without looping

Unfortunately can't with the Vec<u8> type because each u8 isn't tightly packed and lives in it's own 8 byte word :/

yorhodes · 2022-12-09T15:56:36Z

test-utils/src/lib.rs

+    // [any prior receipts..., LogData with reason, Revert, ScriptResult]
+    // We want the LogData with the reason, which is utf-8 encoded as the `data`.


yorhodes · 2022-12-09T15:57:20Z

test-utils/src/lib.rs

+        );
+    };
+
+    // Null bytes `\0` will be padded to the end of the revert string, so we remove them.


lol are we sure we want to remove this? isn't this the null terminator

I believe null termination isn't relevant to rust strs (see the first couple sentences here https://docs.rs/c_str/latest/c_str/). The string that fuel was giving was padded with a bunch of null bytes so that the string fit cleanly into words -- I think this is just cuz they pad a bunch of zero bytes and not anything to do with null termination

…revor/message-encoding

tkporter · 2022-12-13T15:39:59Z

bytes-extended/Forc.toml

+name = "bytes_extended"
+
+[dependencies]
+std = { git = 'https://github.com/fuellabs/sway', branch = "master" }


This is required for now to get access to FuelLabs/sway#3454 which has not yet been shipped in a forc release. Once they ship, we should rm

tkporter · 2022-12-13T15:41:24Z

bytes-extended/src/main.sw

+
+impl b256 {
+    /// Returns a pointer to the b256's packed bytes.
+    fn packed_bytes(self) -> raw_ptr {


Note this should probably be a trait but if you see the note here https://fuellabs.github.io/sway/v0.31.1/advanced/traits.html?search=#custom-types-structs-enums, you can't yet have where clauses :( so you can't have functions take in generics that implement a trait yet

yorhodes

super happy where we landed
lets ship

yorhodes · 2022-12-13T17:47:42Z

bytes-extended/src/mem.sw

+    asm(ptr) {
+        move ptr sp; // Copy the stack pointer (sp) register into `ptr`.
+        cfei i8; // Add 8 bytes (1 word) to the stack pointer, giving the memory at `ptr` a size of 8 bytes.
+        ptr: raw_ptr // Return `ptr`.
+    }


why not?

Suggested change

asm(ptr) {

move ptr sp; // Copy the stack pointer (sp) register into `ptr`.

cfei i8; // Add 8 bytes (1 word) to the stack pointer, giving the memory at `ptr` a size of 8 bytes.

ptr: raw_ptr // Return `ptr`.

}

alloc_bytes(BYTES_PER_WORD)

Just a possibly incorrect feeling that we should prefer the stack over the heap because (to my understanding) is more ephemeral than the heap. I'm waiting to hear back from the Fuel guys if we're better off on the heap tho. If it's ok with you let's merge and can address this as we hear from the Fuel ppl

yorhodes · 2022-12-13T17:48:29Z

bytes-extended/src/main.sw

+fn write_value_to_stack(value: u64, byte_count: u64) -> raw_ptr {
+    // Allocate a whole word on the stack.
+    let stack_word_ptr = alloc_stack_word();
+    // Write the value onto the stack.
+    stack_word_ptr.write::<u64>(value);
+    // Move the pointer forward to ignore any left padded zero bytes, and to point
+    // directly to the start of the value's contents.
+    let left_padding_byte_count = BYTES_PER_WORD - byte_count;
+    stack_word_ptr.add_uint_offset(left_padding_byte_count)
+}
+
+/// Reads a value that is `byte_count` bytes in length from `ptr`.
+/// Returns this value as a u64.
+///
+/// ### Arguments
+/// * `ptr` - A pointer to memory where the value begins. The `byte_count` bytes
+///   starting at `ptr` are read.
+/// * `byte_count` - The number of bytes of the original value. E.g. if the value
+///   being read is a u32, this should be 4 bytes.
+fn read_value_from_memory(ptr: raw_ptr, byte_count: u64) -> u64 {
+    // Allocate a whole word on the stack.
+    let stack_word_ptr = alloc_stack_word();
+    // Copy the `byte_count` bytes from `ptr` into `stack_word_ptr`.
+    // Note if e.g. 4 bytes are read from `ptr`, these are copied into the
+    // first 4 bytes of `stack_word_ptr`. These bytes must be shifted to the
+    // right to be correctly read into a 4-byte u32.
+    ptr.copy_bytes_to(stack_word_ptr, byte_count);
+    // Get the word at stack_word_ptr.
+    let word = stack_word_ptr.read::<u64>();
+    // Bit shift as neccesary.
+    word >> (BITS_PER_BYTE * (BYTES_PER_WORD - byte_count))
+}


much more readable now!

yorhodes · 2022-12-13T17:55:47Z

bytes-extended/src/main.sw

+    pub fn read_bytes(ref mut self, offset: u64, len: u64) -> Bytes {
+        let read_ptr = self.get_read_ptr(
+            offset,
+            len,
+        );
+
+        let mut bytes = Bytes::with_length(len);
+        bytes.write_packed_bytes(0u64, read_ptr, len);
+        bytes
+    }


can we return a new Bytes into the existing memory?

Mmm this would be really cool...

I could see this maybe being a bad practice to have contention over the same memory, but I think this would actually be fine. I'm gonna open #25 so we can think about this and just merge this for now, but I think this is probably a good move

bytes-extended/src/main.sw

yorhodes · 2022-12-13T17:59:51Z

hyperlane-message-test/src/main.sw

+    fn id(message: Message) -> b256 {
+        EncodedMessage::from(message).id()
+    }


yorhodes · 2022-12-13T18:00:11Z

hyperlane-message-test/src/main.sw

+        EncodedMessage::from(message).recipient()
+    }
+
+    /// Vec/Bytes return types aren't supported by the Rust SDK.


wonder if they have something WIP here as well

hope so, we should ask for sure

hyperlane-message/src/main.sw

run_tests.sh

tkporter added 10 commits December 5, 2022 12:58

Add EncodedMessage

8ba0c85

Some encoding attempts, works

549199a

Starting to clean up

92df00b

Getting cleaner

06a44ad

Clean, do mailbox tests

ce041bf

Fmt

61b94fe

Nits

0365e62

test utils

913c20d

Clean up mailbox test a little

b5f5348

Remove unnecessary assignments

cb00045

tkporter requested a review from yorhodes December 7, 2022 14:37

tkporter mentioned this pull request Dec 8, 2022

Initial implementation of multisig ism #16

Merged

yorhodes reviewed Dec 9, 2022

View reviewed changes

Base automatically changed from trevor/mailbox-and-merkle to main December 12, 2022 13:19

tkporter added 7 commits December 12, 2022 13:25

Merge branch 'main' of github.com:hyperlane-xyz/fuel-contracts into t…

05bfe39

…revor/message-encoding

Comment for msg_sender_b256

a9d8d68

PR comments

5dfe352

Messy, but starting to move over EncodedMessage

1849a73

Cleaned and tested lfg

c17ab8b

update comment

441d0d0

Delete bytes-extended-test, no need

ef4c464

tkporter commented Dec 13, 2022

View reviewed changes

tkporter added 6 commits December 13, 2022 16:11

Rm some word_buffer stuff

5afa658

Nit

95e3738

Use add_uint_offset instead of add

e4cd794

Clean up

8d742fc

nit

e56ee44

rm more asm

5992b36

yorhodes approved these changes Dec 13, 2022

View reviewed changes

This was referenced Dec 13, 2022

Less hand-crafted version of EncodedMessage #23

Closed

Move away from Vec<u8> to the Bytes type #22

Closed

tkporter mentioned this pull request Dec 13, 2022

Consider changing EncodedMessage.body to not copy the Bytes #25

Closed

PR comments

d3c2564

tkporter merged commit 1c517e4 into main Dec 13, 2022

This was referenced Dec 14, 2022

Move away from using the stack in favor of using structs to get references of non-reference types #27

Merged

Mailbox: correctly log entire dispatched message #3

Closed

Mailbox: correctly encode/format dispatched messages #2

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Message encoding / decoding / logging #18

Message encoding / decoding / logging #18

tkporter commented Dec 7, 2022 •

edited

Loading

yorhodes left a comment

yorhodes Dec 7, 2022

yorhodes Dec 7, 2022

yorhodes Dec 7, 2022

tkporter Dec 12, 2022 •

edited

Loading

yorhodes Dec 13, 2022

yorhodes Dec 7, 2022

tkporter Dec 12, 2022

yorhodes Dec 7, 2022

tkporter Dec 12, 2022

yorhodes Dec 9, 2022

yorhodes Dec 9, 2022

yorhodes Dec 9, 2022

yorhodes Dec 9, 2022

yorhodes Dec 9, 2022

tkporter Dec 12, 2022

yorhodes Dec 9, 2022

yorhodes Dec 9, 2022

tkporter Dec 12, 2022

tkporter Dec 13, 2022

tkporter Dec 13, 2022

yorhodes left a comment

yorhodes Dec 13, 2022

tkporter Dec 13, 2022

yorhodes Dec 13, 2022

yorhodes Dec 13, 2022

tkporter Dec 13, 2022

yorhodes Dec 13, 2022

yorhodes Dec 13, 2022

tkporter Dec 13, 2022

		let mut word: u64 = (version << 56);
		word = word \| (nonce << 24) \| (origin >> 8);

		// [any prior receipts..., LogData with reason, Revert, ScriptResult]
		// We want the LogData with the reason, which is utf-8 encoded as the `data`.

Message encoding / decoding / logging #18

Message encoding / decoding / logging #18

Conversation

tkporter commented Dec 7, 2022 • edited Loading

yorhodes left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tkporter Dec 12, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yorhodes left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tkporter commented Dec 7, 2022 •

edited

Loading

tkporter Dec 12, 2022 •

edited

Loading