This repository has been archived by the owner on Apr 8, 2022. It is now read-only.

DO NOT MERGE: Let's talk about tokens #67

Closed
wants to merge 4 commits into from

Conversation

sinitcin
Contributor

@sinitcin sinitcin commented Oct 14, 2021

Description

We cannot sign transactions on the node side; all transactions must be signed on the client side. In the future there will be changes to the transaction structures, and we will be required to track them. The ability to submit arbitrary input data should be covered by tests.

Enum TransactionOutput.data

Each transaction output must carry a data field. This field designates the purpose of the output; at the moment we must distinguish the following cases:

  • Transfer of tokens or NFTs
  • Token issuance
  • Token burning
  • NFT creation

This field must be an enum. The whole TransactionOutput structure and its related types must derive Encode and Decode, which lets the SCALE codec decode the data field correctly. So that there is no confusion when the layout changes, each variant of the type must be marked with an explicit index: #[codec(index = NUM)].

There is also a testing requirement: each version must be covered with tests that are guaranteed to break if someone changes the internal data of any version.

    #[derive(Encode, Decode...)]
    pub struct TransactionOutput<AccountId> {
        #[codec(index = 1)]
        pub(crate) value: Value,
        #[codec(index = 2)]
        pub(crate) destination: Destination<AccountId>,
        #[codec(index = 3)]
        pub(crate) data: Option<TxData>,
    }

    #[derive(Encode, Decode...)]
    pub enum TxData {
        // TokenTransfer data to another user. If it is a token, then the token data must also be transferred to the recipient. 
        #[codec(index = 1)]
        TokenTransferV1(..),
        // A new token creation
        #[codec(index = 2)]
        TokenIssuanceV1(..),
        // Burning a token or NFT
        #[codec(index = 3)]
        TokenBurnV1(..),
        // A new NFT creation
        #[codec(index = 4)]
        NftMintV1(..),
        ...
        // We do not change the data in TokenTransferV1, we just create a new version 
        #[codec(index = 5)]
        TokenTransferV2(..),
    }

Detailed description of Enum TxData fields

TokenTransferV1


Transfers funds to another person in a given UTXO. To send MLT we use TokenID::MLT, which is in fact zero. If token_id equals the ID of an MLS-01 token, then that amount of the token is transferred to the recipient. The commission is taken only in MLT. If token_id equals the ID of an NFT, then the NFT's data should be transferred to the recipient without changing the creator field. The UTXO model itself allows you to determine the owner of the NFT.

    pub enum TxData {
        TokenTransferV1{
            token_id: TokenID,
            amount: Value,
        }
        ...
    }

TokenIssuanceV1


When issuing a new token, we specify the data for creating a new token in the transaction input, where the token_id is:

    let token_id = hash(&(inputs[0].hash, inputs[0].index));

token_ticker - need not be unique. It should be limited to 5 characters and is, in fact, an ASCII string.

    pub enum TxData {
        TokenIssuanceV1{
            token_id: TokenID,
            token_ticker: Vec<u8>,
            amount_to_issue: Value,
            // Should be no more than 18 decimal places
            number_of_decimals: u8,
            metadata_URI: Vec<u8>,
        }
        ...
    }

See the metadata_URI format below.
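
A minimal sketch of the issuance checks implied above (ticker of at most 5 ASCII characters, no more than 18 decimals, non-zero issue amount); the function and its error strings are illustrative only, not part of the actual pallet:

    fn validate_issuance(
        token_ticker: &[u8],
        amount_to_issue: u128,
        number_of_decimals: u8,
    ) -> Result<(), &'static str> {
        // token_ticker: a non-empty ASCII string of at most 5 characters.
        if token_ticker.is_empty() || token_ticker.len() > 5 || !token_ticker.is_ascii() {
            return Err("invalid token ticker");
        }
        // Creating a token with a 0 issue amount must fail.
        if amount_to_issue == 0 {
            return Err("zero issue amount");
        }
        // number_of_decimals should be no more than 18.
        if number_of_decimals > 18 {
            return Err("too many decimals");
        }
        Ok(())
    }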

TokenBurnV1


Token burning: the input is a UTXO containing tokens, and the output's data field should contain the TokenBurnV1 arm. If the amount being burned in the output is less than the amount in the input, there must be at least one output returning the change. In this way you can burn any existing number of tokens, and after this operation you can spend a UTXO for the remaining amount of tokens.

    type String = Vec<u8>;
    pub enum TxData {
        TokenBurnV1{
            token_id: TokenID,
            amount_to_burn: Value,
        }
        ...
    }
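
A hedged sketch of the burn accounting described above; the helper and its parameters are illustrative, not the actual pallet logic:

    fn burn_is_balanced(input_amount: u128, amount_to_burn: u128, change_output: u128) -> bool {
        // Burning less than the input requires a change output carrying the difference,
        // which stays spendable as a regular token UTXO afterwards.
        amount_to_burn <= input_amount && input_amount - amount_to_burn == change_output
    }

    #[test]
    fn burning_part_of_the_input_returns_change() {
        // 30 of 100 tokens are burned, so 70 must come back as change.
        assert!(burn_is_balanced(100, 30, 70));
        assert!(!burn_is_balanced(100, 30, 0));
    }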

NftMintV1


When minting a new NFT token, we specify the data for creating a new token in the transaction input, where the token_id is:

    let token_id = hash(&(inputs[0].hash, inputs[0].index));

To look up the creation UTXO, we should add a new storage map where:

  • Key - token_id
  • Value - hash of UTXO

This allows us to find all the information about the NFT, including the creator, and it won't change. It is suitable for MLS-01 tokens too.
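
A minimal sketch of this lookup using a plain HashMap stand-in; in the runtime it would be a FRAME storage map, and the 32-byte key/value types here are assumptions:

    use std::collections::HashMap;

    type TokenId = [u8; 32];
    type UtxoHash = [u8; 32];

    fn main() {
        // Key: token_id, Value: hash of the UTXO that created the token/NFT.
        let mut creation_utxos: HashMap<TokenId, UtxoHash> = HashMap::new();

        let token_id: TokenId = [1u8; 32];
        let creation_utxo_hash: UtxoHash = [2u8; 32];
        creation_utxos.insert(token_id, creation_utxo_hash);

        // Later, the creation UTXO (and hence the immutable creator and data_hash)
        // can be found from the token_id alone. The same works for MLS-01 tokens.
        assert_eq!(creation_utxos.get(&token_id), Some(&creation_utxo_hash));
    }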

The data_hash field is a hash of external data, which should be taken from the digital data for which the NFT is being created. This field should also not be changed when sent to a new owner.

The metadata_URI field points to data that can contain the name of the asset and its description, as well as an image for preview.

It is also possible to add different types of hashes and owners.

    #[derive(Encode, Decode, ...)]
    pub enum TxData {
        NftMintV1{
            token_id: TokenID,
            data_hash: NftDataHash,
            metadata_URI: Vec<u8>,
        }
        ...
    }

    #[derive(Encode, Decode, ...)]
    pub enum NftDataHash {
        #[codec(index = 1)]
        Hash32([u8; 32]),
        #[codec(index = 2)]
        Raw(Vec<u8>),
        // Or any type that you want to implement
    }

Error Handling

We should use "chain-error" feature for the SCALE Codec. It allows us to get a more detailed description of errors.

    [dependencies.codec]
    default-features = false
    features = ["derive", "chain-error"]

However, this kind of error only shows the place in the data that failed to encode or decode correctly. Example:

"Could not decode `TransactionOutputV1::data`:\n\tCould not decode `TxDataV1`, variant doesn't exist\n"

In any case, the correctness of the decoded data still has to be checked separately.

Adding a new version

Adding a new version of data is, in fact, adding a new variant to the enum; if the names match, append the version number at the end, for example:

  • TokenTransferV1
  • TokenTransferV2
  • etc

The order of the variants is not important, but each variant must be marked with a unique codec index: #[codec(index = your_index)]. Example:

    #[derive(Encode, Decode, ...)]
    pub enum TxDataV2 {
        #[codec(index = 2)]
        NftMintV2 {
            id: u64,
            token_name: Vec<u8>,
            // other fields that you wish
        },
        #[codec(index = 1)]
        NftMintV1 { id: u64 },
    }

You also need to add an appropriate test to track changes.

Example: check_immutability test

This test compares the encoding against a structure template; if someone accidentally changes the data fields, the test will indicate this.
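
A minimal sketch of what such a test could look like, assuming parity-scale-codec (imported as codec); the variant, its fields, and the hard-coded byte template are illustrative, not the real on-chain layout. In the actual check_immutability test the template bytes would be hard-coded at release time:

    use codec::{Decode, Encode};

    #[derive(Encode, Decode, PartialEq, Debug)]
    enum TxData {
        #[codec(index = 1)]
        TokenTransferV1 { token_id: u64, amount: u128 },
    }

    #[test]
    fn check_immutability() {
        let data = TxData::TokenTransferV1 { token_id: 1, amount: 100 };
        // Byte template captured when the variant was released: codec index,
        // then token_id (u64 LE), then amount (u128 LE). Reordering fields,
        // changing types, or changing the codec index breaks this assertion.
        let template: Vec<u8> = vec![
            0x01,
            0x01, 0, 0, 0, 0, 0, 0, 0,
            0x64, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        ];
        assert_eq!(data.encode(), template);
        // The template must also still decode back into the same value.
        assert_eq!(TxData::decode(&mut &template[..]).unwrap(), data);
    }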

What happens if the old version of the node reads the new transaction format?

The transaction cannot be processed. Proof: the an_old_node_read_a_new_data test.
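
A hedged sketch of the idea behind that test (the actual test lives in the repo); the enums and fields here are illustrative only:

    use codec::{Decode, Encode};

    // The "old" node only knows codec index 1.
    #[derive(Encode, Decode)]
    enum OldTxData {
        #[codec(index = 1)]
        TokenTransferV1 { amount: u128 },
    }

    // The "new" node adds a variant with codec index 2.
    #[derive(Encode, Decode)]
    enum NewTxData {
        #[codec(index = 1)]
        TokenTransferV1 { amount: u128 },
        #[codec(index = 2)]
        TokenTransferV2 { amount: u128, memo: Vec<u8> },
    }

    #[test]
    fn an_old_node_read_a_new_data() {
        let encoded = NewTxData::TokenTransferV2 { amount: 1, memo: vec![] }.encode();
        // The old enum has no variant with index 2, so decoding fails cleanly
        // and the transaction cannot be processed.
        assert!(OldTxData::decode(&mut &encoded[..]).is_err());
    }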

What happens if the new version of the node reads the old transaction format?

Transaction data will be correctly read and, depending on the blockchain logic, interpreted or discarded. Example:

    match data {
        TokenTransferV1(...) => pallet::transfer_v1(...),
        TokenTransferV2(...) => pallet::transfer_v2(...),
    }

Proof: the a_new_node_read_an_old_data test.

Format of data located by reference metadata_URI

This is a link to a third-party server that will serve JSON in a format similar to the "ERC721 Metadata JSON Schema":

    {
        "title": "Asset Metadata",
        "properties": {
            "name": {
                "type": "string",
                "description": "Identifies the asset to which this token represents"
            },
            "description": {
                "type": "string",
                "description": "Describes the asset to which this token represents"
            },
            "image": {
                "type": "string",
                "description": "A URI pointing to a resource with mime type image/* representing the asset to which this token represents. Consider making any images at a width between 320 and 1080 pixels and aspect ratio between 1.91:1 and 4:5 inclusive."
            }
        }
    }

This file will be used by blockchain explorers.

Unit testing

Here is a suggested test plan for tokens and the data field:

  • All tests must apply to all possible versions of the data field.
  • Tests should be carried out both with one input and one output and with multiple inputs and multiple outputs.
  • The tests below should also be applied to cases without tokens, for MLS-01, and for NFT.

General checks to be repeated for each type of token:

  1. Testing token creation:

    • Creating a token with a pre-existing ID, i.e. re-creating an already created token.

      • The action must not succeed, and the error must be handled correctly.
    • Creating a token with corrupted data

      • Data field of zero length
      • The data field of the maximum allowed length filled with random garbage
      • Creation of a token with 0 issue amount
      • Generating a token with a long URI string
    • Creation of a token without an MLT input to pay the commission

      • Test tx where the input has a token and no MLT, and the output has the token (no MLT)
      • Test tx where the input has a token and no MLT, and the output has MLT (no token)
      • Test tx where the input has no token but has MLT, and the output has MLT and a token
      • Test tx where there are no inputs for the token
      • Test where the input has less MLT than is needed to pay the commission
      • Test tx where the input and output carry a token but with zero value
  2. Testing token transfer

    • Standard creation of a token, sending it through a chain of persons, then collecting the token back into one UTXO and checking that the token data has not changed and has not been burned or lost in any way.
      • All data must be correct for this test.
      • The token must be sent through multiple account groups.
      • The total amount of the token must be equal to the created one.
    • Incorrect amount of token in one input and one output
      • The input contains the correct amount of the token, but the output is incorrect
      • The input amount is incorrect, but the output is correct
      • Both input and output carry an incorrect amount of the token
    • Testing a token UTXO and the return of change
    • Using a single MLT input to pay the commission and transfer the token at the same time
    • Check the possibility of causing an overflow. For example, take a few inputs whose values are the maximum possible number; one output should then carry the sum of these input values (see the sketch after this list).
  3. Testing the compatibility of the old version with the new one

    • Testing the compatibility of the new version with the old one
      • Testing data encoding in a loop
      • Testing the processing of junk and random data
      • Testing the processing of fields that are written in a different order
      • Testing the immutability of old versions
  4. Testing burning tokens

    • Trying to burn a non-existent token
    • Trying to burn more token value than exists in the inputs
    • Trying to burn MLT
    • Trying to burn MLS-01
    • Trying to burn NFT
    • Trying to burn a token without the inputs for it
    • Trying to burn an existing token that is not present in the input
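
A minimal sketch of the overflow check mentioned in item 2 above; the helper is illustrative and simply uses checked arithmetic when summing input values:

    fn sum_input_values(values: &[u128]) -> Option<u128> {
        // Returns None instead of wrapping around if the inputs overflow u128.
        values.iter().try_fold(0u128, |acc, v| acc.checked_add(*v))
    }

    #[test]
    fn summing_overflowing_inputs_is_detected() {
        // Two inputs at the maximum possible value cannot be merged into one output.
        assert_eq!(sum_input_values(&[u128::MAX, u128::MAX]), None);
        assert_eq!(sum_input_values(&[1, 2, 3]), Some(6));
    }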

What should we test additionally?

  1. I can't impose any limits on data field sizes through SCALE. I'm pretty sure that Substrate checks size limits for whole transactions, because I get already-decoded structures from the framework and can't see the raw bytes. But I can't prove it without testing.
  2. All math with u128/i128 should be double-checked and tested to prevent overflow and underflow.
  3. Functional tests - will be planned later.

Fix functional tests to work with latest staging
Signed-off-by: sinitcin <antony@email.su>
README_eng.md Outdated
When issuing a new token, we specify the data for creating a new token in the transaction input, where the `token_id` is:

```rust
let token_id = hash(&(inputs[0].hash, inputs[0].index));
```
Contributor

One thing that probably needs specification is how we're gonna create a string out of this. We can either make it just hex, which is ugly and very long. We can use base58check to make it look good, which is OK but will make a long string. We can also crop the byte array then use base58check. Or we can do what bitcoin does. Use RIPEMD160 hash, then use base58check. Just something to think about.

Contributor Author

@sinitcin sinitcin Oct 15, 2021

I prefer using <T as frame_system::Config>::Hashing, which is in fact BlakeTwo256, meaning we can change this line to:

 let token_id = BlakeTwo256::hash_of(&(&tx.inputs[0], &tx.inputs[0].index));

But I found out that we do not have an index field in inputs, and it seems we have to add it too.

What do you think?

Contributor

@TheQuantumPhysicist TheQuantumPhysicist Oct 18, 2021

So, two things:

  • please understand that hashing is one thing, and converting the hash to a string is another thing. The former converts an arbitrary set of data to a set of bytes, every byte can be in the range [0,255]. The other one converts the bytes to ASCII readable characters. If you look into bitcoin, the address is basically the public key, hashed with SHA256, then hashed with RIPEMD160, then they use Base58Check with a given prefix to convert it to a string (which you use to send tokens to). Token ID can be treated the same. We can even do like Ethereum, and just crop the last 20 bytes of a hash and then convert it to hex. I'll talk to Ben about this.

  • I just checked inputs in the source code, and I can say that the formulation I see for inputs in the outpoint is kind of flawed. When you spend an output, the output is specifically defined by a transaction hash AND the output index, because you don't have to spend all outputs of a transaction in one future transaction. We'll have to discuss this, as there seems to be either something wrong or something I don't understand. I'll find the guys and talk with them about this.

Contributor

  • About the string representation. I just finished talking to Ben about it. Let's take the last 20 bytes and use base58check on them. Pick any prefix you want for now (if you don't know what a base58check prefix/version is, let's talk about it).

  • About the input index, this is still unresolved. For now as a temporary solution, just take the hash of the first outpoint without an index.

Contributor Author

Accepted; for now we will use:

 let token_id = BlakeTwo256::hash_of(&tx.inputs[0]);

When we have a use case for the string representation, we'll use the approach above.

Contributor Author

The latest update from DM: as token_id we'll use the approach above.


**What we shall test additionally?**
1. I can't make any limits on data fields sizes through SCALE. I'm pretty sure that Substrate checks size limits for the whole transactions because I take from framework already decoded structures. I can't see raw bytes. But I can't prove it without testing.
2. All maths with `u128`/`i128` should be double-checked and tested to prevent overflow and underflow.
Contributor

I would really prefer that we find some arbitrary precision integers to use... it's much easier to deal with, and most of the time it even serializes to smaller data since not everyone uses 16 bytes to represent their transaction. I'll look into that and see what I find.

Contributor

If we stick with u128, we can use SCALE compact encoding to avoid having to store all the unused precision
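
For illustration, a hedged sketch of that point using parity-scale-codec's Compact wrapper (assuming the crate is imported as codec):

    use codec::{Compact, Encode};

    fn main() {
        // Fixed-width u128 always takes 16 bytes...
        assert_eq!(100u128.encode().len(), 16);
        // ...while the compact encoding of the same value takes 2 bytes.
        assert_eq!(Compact(100u128).encode().len(), 2);
    }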

Contributor

Oh, does SCALE trim integers' LSB zeros when serialized? That's cool stuff. Didn't know that.

Contributor

My biggest fear, btw, is overflow and underflow errors... people make these mistakes too often; it's scary. Maybe we should write an Amount wrapper or something. Or maybe Substrate did that already. We'll see.

Contributor

What I would find nice is something like the following:

  • struct Amount(#[codec(compact)] u128)
    • impl Add, Sub etc. check for overflows (returning Option<Amount>)
    • impl Eq, PartialEq, Ord, PartialOrd
    • we don't need Mul
    • div_mod(self, u128) -> (Amount, Amount) might be useful for dividing loot among block signers/stakers
  • struct Asset, roughly equivalent to a pair of Amount and TokenType
    • impl Add, Sub check for overflows and whether token type matches
    • Alternatively disallow arithmetic for this type
  • struct AssetBag, roughly a Map<TokenType, Amount>
    • Useful for accumulating multiple assets, e.g. when adding up assets corresponding to all transaction inputs
    • Add, Sub etc. operate by matching token types and adding/subbing them
    • PartialEq, Eq, PartialOrd (but not Ord)
  • Maybe extra specialized types for assets that are supported natively like Mlt and Btc
    • Just thin wrappers around Amount, with arithmetic ops delegated to it
    • Like Asset but gives compilation time checking for token types rather than runtime checking
  • Some conversion utilities
    • impl From<u128> for Amount
    • impl From<Asset> for AssetBag
    • impl From<{Btc, Mlt}> for Asset
    • impl TryFrom<Asset> for {Btc, Mlt}
    • etc...

An alternative design would be to generalize Amount, Asset and the native assets into something like Asset<Tok>, where Tok would distinguish between tokens at the type level; it could be TokenType for dynamic token types or phantom types for natively supported tokens. That may result in less boilerplate by making various operations polymorphic in Tok.
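
For concreteness, a minimal sketch of the Amount piece of this outline, assuming parity-scale-codec imported as codec; the checked_* helpers stand in for the Add/Sub impls returning Option<Amount>:

    use codec::{Decode, Encode};

    /// Thin wrapper over u128 with compact encoding and overflow-checked arithmetic.
    #[derive(Encode, Decode, Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
    pub struct Amount(#[codec(compact)] u128);

    impl Amount {
        pub fn checked_add(self, rhs: Amount) -> Option<Amount> {
            self.0.checked_add(rhs.0).map(Amount)
        }
        pub fn checked_sub(self, rhs: Amount) -> Option<Amount> {
            self.0.checked_sub(rhs.0).map(Amount)
        }
    }

    impl From<u128> for Amount {
        fn from(value: u128) -> Self {
            Amount(value)
        }
    }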

Thoughts/comments?

Contributor

I was hoping the amount thing was already implemented by Substrate, but we'll see. Your outline seems reasonable and is what we're probably going to pursue.

…part

Signed-off-by: sinitcin <antony@email.su>