-
Notifications
You must be signed in to change notification settings - Fork 98
Changing widths of integers across round-trips breaks relocations #198
Comments
I believe we can just use separate type with custom serialization for such cases |
Is it really about original length? As far as I remember, wabt has a switch like “non canonical lebs” for such cases, and if it set then all lebs will be encoded with it’s maximal length. |
Yes. Only relocatable addresses are being emitted as 5-byte values. Non-address constants are being emitted as the smallest width. See |
I suggest to just add VarInt32NonCan or similar, and let it serialize to full 5 bytes always or remember it's deserialized length (not sure the latter is even required) |
It does mean that the const opcodes like |
Yeah, this is no-go, i thought it was happening in relocation sections only |
Another solution will require a lot of work Not sure if it worth troubles |
How would that help? Like I said earlier, some opcodes contain smallest width and some opcodes contain largest width. It's not always one or always the other. I'm thinking of storing a map of original offset -> opcode index in the code section and having the relocation section serialize differently based on how the code section was serialized. Haven't yet written the code to see how feasible it is because it means I need to have a custom |
something like this: but surely it is not worth it |
|
You just add two width fields (if both operands actually do require so), and you don't need to hold both types of Opcode, you just use width-preserving variant whenever you need it, and use it for all opcodes (since its abilities to serialize match those of regular variant if the representation is the most compact originally). |
I made a prototype of how it would look for individual opcodes to store the original width with the value: https://github.com/Arnavion/parity-wasm/commits/min-width I only did it for That said, I want parity-wasm to be able to relocate modules in the future, which means it needs to be able to update these relocated values arbitrarily. I don't know if I want to depend on lld to always leave enough space like it does now. So I'm still going to try to implement my idea of storing a map in the CodeSection for original offset -> opcode, and changing the offsets in the RelocSection according to how the CodeSection is serialized. |
I pointed out in #196 (comment) that LLVM/lld sometimes compiles integers to larger representations than they need. That particular case was an
i32.const 0
compiled as0x41 0x80 0x80 0x80 0x80 0x00
instead of just0x41 0x00
. parity-wasm can read it just fine, but it serializes it back out as0x41 0x00
instead of the original representation.This is a problem for relocations since relocations are measured in byte offsets from the start of the corresponding section. So a round-trip breaks all relocations after such a value.
It seems to me from testing that lld does this for every relocatable value, probably intentionally so that loaders / linkers doing have space to write the relocated value without needing to move bytes around.
How do you think this should be fixed?
Each
RelocationEntry
could have some reference to the correspondingVar*Int*
(or to the section containing theVar*Int*
. But this crosses section boundaries and be brittle.Or, each
Var*Int
could store the width it originally had, and the serializer could artificially expand it to that width. But this breaks theFrom
andInto
impls for these types.The text was updated successfully, but these errors were encountered: