Arbitrary nesting of heap types in contract returns #1046

digorithm · 2023-07-17T15:22:25Z

Discussed in #944

Originally posted by segfault-magnet April 26, 2023
@hal3e and I have been discussing something, details to follow:

Currently, the Rust SDK cannot handle a nested heap type (e.g. -> Option<Vec<u64>>) in the return type of a contract/script. The limitation stems from the fact that returning a heap type returns only a pointer, which is useless without the VM's memory.

A partial workaround was implemented where we would inject some extra bytecode after the contract call. The injected bytecode would generate a ReturnData receipt containing the previously inaccessible heap data. This allowed for returning a single non-nested heap type (e.g. -> Vec<u64>). The injected bytecode is currently incapable of handling anything more complex.

We suggest taking this to the extreme and supporting arbitrarily nested heap types.

There are three main problems we need to solve: structs, enums, and heap types nested in other heap types. Let's start with structs:

Structs

Let's consider the following example:

struct AnotherStruct {
  a: u64,
  b: Vec<u64>
}

struct SomeStruct { 
  a: u64,
  b: u64,
  c: AnotherStruct
}
// ... some contract method below
fn foo() -> SomeStruct {
    SomeStruct {
        a: 10,
        b: 11,
        c: AnotherStruct {
            a: 12,
            b: vec![13, 14, 15],
        },
    }
}

Encoded, it looks something like this: [Figure: encoded layout of SomeStruct]

In order to support this we would need to inject a retd instruction that will yank the missing heap data into a separate receipt.

The only difference between this and our current workaround is that the vector pointer isn't encoded immediately at the start. We would need to calculate this offset and add it to the address in the RET register in order to yank the heap data.

This can be done easily since we already know the exact size of every part of the struct.

In this case, it doesn't matter how deep the vector is inside nested structs: a single retd with the correct offset can get its data.
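The offset calculation described above can be sketched roughly as follows. This is only an illustration: `Ty`, `inline_size` and `heap_ptr_offsets` are hypothetical names, not SDK types, and the sketch assumes every word is 8 bytes and that a vector is encoded inline as three words (pointer, capacity, length).

```rust
/// Simplified type tree. Hypothetical; not the SDK's own type description.
enum Ty {
    U64,
    /// A vector of the given element type.
    Vector(Box<Ty>),
    /// Struct fields in declaration order.
    Struct(Vec<Ty>),
}

const WORD: usize = 8;

/// Size in bytes of the inline (non-heap) part of a value.
/// ASSUMPTION: a vector occupies three words inline (ptr, cap, len).
fn inline_size(ty: &Ty) -> usize {
    match ty {
        Ty::U64 => WORD,
        Ty::Vector(_) => 3 * WORD,
        Ty::Struct(fields) => fields.iter().map(inline_size).sum(),
    }
}

/// Recursive helper: records the offset of every vector pointer that can be
/// reached from `ty` without following a pointer.
fn walk(ty: &Ty, base: usize, out: &mut Vec<usize>) {
    match ty {
        Ty::U64 => {}
        Ty::Vector(_) => out.push(base),
        Ty::Struct(fields) => {
            let mut offset = base;
            for field in fields {
                walk(field, offset, out);
                offset += inline_size(field);
            }
        }
    }
}

/// Byte offsets of the vector pointers, relative to the start of the value.
fn heap_ptr_offsets(ty: &Ty) -> Vec<usize> {
    let mut out = Vec::new();
    walk(ty, 0, &mut out);
    out
}

/// The `SomeStruct` type from the example above.
fn some_struct() -> Ty {
    let another = Ty::Struct(vec![Ty::U64, Ty::Vector(Box::new(Ty::U64))]);
    Ty::Struct(vec![Ty::U64, Ty::U64, another])
}
```

Under these assumptions, `heap_ptr_offsets(&some_struct())` yields a single offset of 24 bytes (three u64 fields precede the vector), which is the value that would be added to the address in the RET register.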

Enums

Let's consider the following example:

struct WrappingStruct {
    a: u64,
    b: Vec<u64>,
}

enum TheEnum {
    A(u64),
    B(WrappingStruct),
}

fn foo() -> TheEnum {
    TheEnum::B(WrappingStruct {
        a: 10,
        b: vec![11, 12, 13],
    })
}

With enums, we need to inject bytecode that will branch depending on the enum discriminant. Variants that don't contain heap types don't need to be considered, e.g. our example will only check if the enum carries the second variant.

After the variant is determined, the enum data can be used to generate the additional receipt data as before.
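As a rough model of that branching (in plain Rust rather than injected bytecode), the sketch below reads the discriminant and returns the offsets of the pointers that would need a retd. `VariantInfo` and `pointers_to_follow` are hypothetical names. The sketch assumes a big-endian 8-byte discriminant and that the selected variant's payload starts right after it; any padding of smaller variants in the real encoding is ignored here.

```rust
const WORD: usize = 8;

/// Hypothetical description of one enum variant: byte offsets of the heap
/// pointers it contains, relative to the start of the variant's payload.
struct VariantInfo {
    heap_ptr_offsets: Vec<usize>,
}

/// Reads the discriminant and returns the offsets (from the start of the
/// encoded enum) of the pointers to follow for the variant that is present.
/// Returns `None` if the input is too short or the discriminant is unknown.
fn pointers_to_follow(encoded: &[u8], variants: &[VariantInfo]) -> Option<Vec<usize>> {
    let mut word = [0u8; WORD];
    word.copy_from_slice(encoded.get(..WORD)?);
    // ASSUMPTION: the discriminant is a big-endian u64.
    let discriminant = u64::from_be_bytes(word) as usize;
    let variant = variants.get(discriminant)?;
    Some(variant.heap_ptr_offsets.iter().map(|off| WORD + off).collect())
}

/// Variant table for `TheEnum` from the example above.
fn the_enum_variants() -> Vec<VariantInfo> {
    vec![
        // A(u64): no heap data, nothing to fetch.
        VariantInfo { heap_ptr_offsets: vec![] },
        // B(WrappingStruct): the vector `b` follows the u64 field `a`.
        VariantInfo { heap_ptr_offsets: vec![WORD] },
    ]
}
```

For a value carrying variant B, this returns one offset of 16 bytes (discriminant plus the field `a`); for variant A, the list is empty and no extra receipt would be generated.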

Heap types in other heap types

struct WrappingStruct {
    a: u64,
    b: Vec<u64>,
}

struct ParentStruct {
    a: u64,
    b: Vec<WrappingStruct>,
}

fn foo() -> ParentStruct {
    ParentStruct {
        a: 9,
        b: vec![
            WrappingStruct {
                a: 10,
                b: vec![11, 12, 13],
            },
            WrappingStruct {
                a: 14,
                b: vec![15, 16, 17],
            },
        ],
    }
}

In order to get all the necessary data we'd need to issue 3 retd instructions: one for the outer vector and one for each of the two inner vectors.

Notice that this issue is recursive in nature. Once we issue the first and simplest retd, we're right back to the same problem, only now the starting address has been moved along.
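The recursion can be illustrated with a small simulation that treats VM memory as a byte slice and each retd as a pushed receipt. All names are hypothetical, and the layout is an assumption (8-byte big-endian words, vectors stored inline as ptr, cap, len). The emission order used here (outer buffer first, then each element) is just one possible choice; any order works as long as the decoder consumes receipts in the same order.

```rust
const WORD: usize = 8;

/// Simplified type tree. Hypothetical; not the SDK's own type description.
enum Ty {
    U64,
    Vector(Box<Ty>),
    Struct(Vec<Ty>),
}

/// ASSUMPTION: a vector occupies three words inline (ptr, cap, len).
fn inline_size(ty: &Ty) -> usize {
    match ty {
        Ty::U64 => WORD,
        Ty::Vector(_) => 3 * WORD,
        Ty::Struct(fields) => fields.iter().map(inline_size).sum(),
    }
}

fn read_word(mem: &[u8], addr: usize) -> usize {
    let mut w = [0u8; WORD];
    w.copy_from_slice(&mem[addr..addr + WORD]);
    u64::from_be_bytes(w) as usize
}

/// Emulates the injected bytecode: one "retd" per vector reachable from `addr`.
fn collect_receipts(mem: &[u8], ty: &Ty, addr: usize, receipts: &mut Vec<Vec<u8>>) {
    match ty {
        Ty::U64 => {}
        Ty::Struct(fields) => {
            let mut offset = addr;
            for field in fields {
                collect_receipts(mem, field, offset, receipts);
                offset += inline_size(field);
            }
        }
        Ty::Vector(elem) => {
            let ptr = read_word(mem, addr);
            let len = read_word(mem, addr + 2 * WORD);
            let elem_size = inline_size(elem);
            // The "retd": copy the vector's buffer out of memory.
            receipts.push(mem[ptr..ptr + len * elem_size].to_vec());
            // Same problem again, with the start address moved into the buffer.
            for i in 0..len {
                collect_receipts(mem, elem, ptr + i * elem_size, receipts);
            }
        }
    }
}

/// Encodes u64 values as consecutive big-endian words.
fn words(vals: &[u64]) -> Vec<u8> {
    vals.iter().flat_map(|v| v.to_be_bytes().to_vec()).collect()
}

fn parent_struct_ty() -> Ty {
    let wrapping = Ty::Struct(vec![Ty::U64, Ty::Vector(Box::new(Ty::U64))]);
    Ty::Struct(vec![Ty::U64, Ty::Vector(Box::new(wrapping))])
}

/// Hand-built memory image for the example; returns (memory, address of ParentStruct).
fn example_memory() -> (Vec<u8>, usize) {
    //   0: [11, 12, 13]                          buffer of the first inner vector
    //  24: [15, 16, 17]                          buffer of the second inner vector
    //  48: (10, ptr=0, 3, 3), (14, ptr=24, 3, 3) buffer of the outer vector
    // 112: (9, ptr=48, 2, 2)                     the returned ParentStruct
    let mem = words(&[11, 12, 13, 15, 16, 17, 10, 0, 3, 3, 14, 24, 3, 3, 9, 48, 2, 2]);
    (mem, 112)
}
```

Running `collect_receipts` on this memory image with `parent_struct_ty()` produces three receipts, matching the three retd instructions mentioned above.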

After collecting the receipts

Due to the structured and deterministic way we've approached walking the type tree (a post-order traversal), we can now use the extra receipts to decode the return type.

Since the ABI encoder also does a post-order traversal, we can adapt it to accept a stack of receipts, popping one each time a heap type is to be decoded.

Or we can merge all the receipts into one, taking care to update the pointers to point to their respective data. Decoding would then be trivial, as though we had the VM's memory loaded.

But that is an implementation detail, not that relevant right now. The point is, decoding is possible.
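To make the first option a little more concrete, here is a minimal sketch of a decoder that consumes one receipt each time it meets a vector. It is not the SDK's decoder: `Ty`, `Value` and `decode` are hypothetical, the layout (8-byte big-endian words, vectors inline as ptr, cap, len) is an assumption, and a FIFO queue stands in for the stack, on the assumption that receipts arrive outer buffer first.

```rust
use std::collections::VecDeque;

const WORD: usize = 8;

/// Simplified type tree. Hypothetical; not the SDK's own type description.
enum Ty {
    U64,
    Vector(Box<Ty>),
    Struct(Vec<Ty>),
}

/// Decoded value, mirroring `Ty`.
#[derive(Debug, PartialEq)]
enum Value {
    U64(u64),
    Vector(Vec<Value>),
    Struct(Vec<Value>),
}

/// ASSUMPTION: a vector occupies three words inline (ptr, cap, len).
fn inline_size(ty: &Ty) -> usize {
    match ty {
        Ty::U64 => WORD,
        Ty::Vector(_) => 3 * WORD,
        Ty::Struct(fields) => fields.iter().map(inline_size).sum(),
    }
}

fn read_word(bytes: &[u8], at: usize) -> Option<u64> {
    let mut w = [0u8; WORD];
    w.copy_from_slice(bytes.get(at..at + WORD)?);
    Some(u64::from_be_bytes(w))
}

/// Decodes `ty` from `bytes`, taking the next receipt whenever a vector is met.
/// Returns `None` if bytes or receipts run out.
fn decode(ty: &Ty, bytes: &[u8], receipts: &mut VecDeque<Vec<u8>>) -> Option<Value> {
    match ty {
        Ty::U64 => Some(Value::U64(read_word(bytes, 0)?)),
        Ty::Struct(fields) => {
            let mut offset = 0;
            let mut out = Vec::new();
            for field in fields {
                out.push(decode(field, bytes.get(offset..)?, receipts)?);
                offset += inline_size(field);
            }
            Some(Value::Struct(out))
        }
        Ty::Vector(elem) => {
            // The pointer (word 0) is meaningless without VM memory; only the
            // length (word 2) is used, and the data comes from the next receipt.
            let len = read_word(bytes, 2 * WORD)? as usize;
            let data = receipts.pop_front()?;
            let size = inline_size(elem);
            let mut out = Vec::new();
            for i in 0..len {
                out.push(decode(elem, data.get(i * size..)?, receipts)?);
            }
            Some(Value::Vector(out))
        }
    }
}

/// Encodes u64 values as consecutive big-endian words.
fn words(vals: &[u64]) -> Vec<u8> {
    vals.iter().flat_map(|v| v.to_be_bytes().to_vec()).collect()
}

/// The `ParentStruct` type from the nested example above.
fn parent_struct_ty() -> Ty {
    let wrapping = Ty::Struct(vec![Ty::U64, Ty::Vector(Box::new(Ty::U64))]);
    Ty::Struct(vec![Ty::U64, Ty::Vector(Box::new(wrapping))])
}
```

Given the ParentStruct return data and three receipts (the outer buffer, then the two inner buffers), `decode` rebuilds the full value without ever dereferencing a pointer. A true stack would work equally well if the receipts were pushed in reverse order.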

Other technical challenges also exist, such as register management, but nothing unsolvable as far as we can see.

Considerations

The indexer

The indexer will not be able to rely on this in every case, just as it could not rely on the current injection approach. If the contract wasn't called directly through the SDK, then the bytecode was never injected and the additional receipts were never generated. The indexer could still handle returned raw or typed slices, but nested heap types would not be possible in that case.

Logging

Logging would not be supported, since we cannot inject bytecode at the appropriate places. Logging heap types will only be possible through typed slices, without support for nested heap types.

Script support

We need to investigate the possibility of injecting the bytecode at the end of user-provided scripts. If it transpires that we'll always have a deterministic way of reaching the script return data, then the approach will be the same as for contracts.

segfault-magnet · 2023-08-19T04:06:56Z

Waiting to see how far away encoding support is in Sway.

cold-briu · 2023-11-06T18:27:51Z

Hi there!
A user is facing an instance of this issue.

closes: #1278, #1279, #1046

This PR adds support for the new encoding scheme for contracts, scripts and predicates. I have added a new `ExperimentalBoundedEncoder` which can be activated with the `experimental` cfg flag. I have tried to minimize the impact of the new encoder as much as possible to make it easier for review. A full refactor of the whole sdk is necessary once the new encoding becomes the default one.

- The function selector changed and now it is the name of the method.
- The `CALL` opcode changed with the new encoding and is expecting the following call data: ContractID, pointer to fn_selector (name of the method), pointer to encoded arguments, number of coins, asset_id, gas_forwarded.

digorithm · 2024-04-04T13:59:36Z

Closed by #1303.

digorithm added the enhancement label Jul 17, 2023

digorithm assigned hal3e and segfault-magnet Jul 17, 2023

segfault-magnet mentioned this issue Jul 19, 2023

Heap data corruption in scripts FuelLabs/sway#4828

Open

segfault-magnet added the blocked label Aug 19, 2023

iqdecay self-assigned this Nov 6, 2023

cold-briu mentioned this issue Nov 9, 2023

Can't invoke SRC7 metadata method from the rust SDK due to arbitrary nesting of heap types FuelLabs/sway-standards#38

Closed

DefiCake mentioned this issue Nov 21, 2023

Add SRC-8 FuelLabs/fuel-bridge#103

Closed

segfault-magnet removed their assignment Nov 21, 2023

cold-briu mentioned this issue Nov 23, 2023

Can't invoke SRC7 metadata method from the rust SDK due to arbitrary nesting of heap types #1213

Closed

digorithm mentioned this issue Jan 11, 2024

Encoder/decoder revamp #1246

Closed

kamyar-tm added the epic label Feb 15, 2024

hal3e mentioned this issue Mar 21, 2024

feat: experimental encoding for contracts, scripts and predicates #1303

Merged

digorithm closed this as completed Apr 4, 2024
