Inaccurate documentation for bytes encoding #15303

Open
@Amit0617

Description

Page

https://docs.soliditylang.org/en/latest/abi-spec.html#examples

Abstract

The encoding of bytes and string values in the examples is confusing (wrong). In the first example, bar(bytes3[2] memory) is called with ["abc", "def"], and both elements are encoded as their ASCII bytes and right-padded. This suggests that the function signature should instead be bar(string[2] memory), as described for string in the Formal Specification of the Encoding, except for the length part:

  • string:

    enc(X) = enc(enc_utf8(X)), i.e. X is UTF-8 encoded and this value is interpreted as of bytes type and encoded further. Note that the length used in this subsequent encoding is the number of bytes of the UTF-8 encoded string, not its number of characters.
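
To illustrate the last point of the quoted rule, here is a minimal Python sketch that computes the 32-byte length word for a string from its UTF-8 byte count rather than its character count. The helper name utf8_length_word is hypothetical and not part of any library.

```python
def utf8_length_word(text: str) -> bytes:
    """Return the 32-byte big-endian length word for a string, counted in UTF-8 bytes."""
    return len(text.encode("utf-8")).to_bytes(32, "big")


sample = "h\u00e9llo"  # 5 characters, but 6 bytes once UTF-8 encoded
print(len(sample))                     # character count
print(len(sample.encode("utf-8")))     # byte count used for the length word
print(utf8_length_word(sample).hex())  # length word as hex
```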

Otherwise, if the function signature is bar(bytes3[2] memory), the bytes themselves should be passed to the function, i.e. 616263 and 646566. After encoding, these become 0x6162630000000000000000000000000000000000000000000000000000000000 and 0x6465660000000000000000000000000000000000000000000000000000000000.
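
The difference between the two readings can be sketched in Python. This is only an illustration of the padding rules quoted from the specification, with hypothetical helper names (encode_bytes3, encode_string_tail); it ignores the offset word that a dynamic type also needs in the head of the encoding.

```python
def encode_bytes3(value: bytes) -> bytes:
    """Static bytes3: the 3 raw bytes, right-padded with zeros to 32 bytes."""
    if len(value) != 3:
        raise ValueError("bytes3 requires exactly 3 bytes")
    return value.ljust(32, b"\x00")


def encode_string_tail(text: str) -> bytes:
    """Dynamic string: 32-byte length word, then UTF-8 bytes padded to a multiple of 32."""
    data = text.encode("utf-8")
    padded_size = (len(data) + 31) // 32 * 32
    return len(data).to_bytes(32, "big") + data.ljust(padded_size, b"\x00")


# Raw bytes 616263 and 646566 passed as bytes3 values.
print(encode_bytes3(bytes.fromhex("616263")).hex())
print(encode_bytes3(bytes.fromhex("646566")).hex())
# The text "abc" encoded as a string carries an extra length word in front.
print(encode_string_tail("abc").hex())
```

Under these assumptions, the bytes3 encoding is a single 32-byte word, while the string encoding is two words: the length, then the padded data.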

Activity

Amit0617 (Author) commented on Jul 26, 2024

The third example has a similar issue with the "dave" argument.

github-actions commented on Oct 28, 2024

This issue has been marked as stale due to inactivity for the last 90 days.
It will be automatically closed in 7 days.

Added the "stale" label on Oct 28, 2024.
Title changed from "Unclear documentation for bytes encoding" to "Inaccurate documentation for bytes encoding" on Oct 29, 2024.

Amit0617 (Author) commented on Oct 29, 2024

It is simply wrong: string values are passed as bytes in those examples. I also raised this issue with Foundry, in case it is missing functionality or is not compliant with the specification. Either way, this is technically problematic, because mixing string and bytes produces inconsistent encodings. For example, if a function receives abc as input, it could be an already hex-encoded bytes value, or it could be a valid string. It is then unclear whether the value is already encoded and only needs padding, or whether it still needs to be encoded.
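
A short Python sketch of this ambiguity, assuming a tool has to guess how to read an unquoted argument. The input abcd is used here instead of abc only because a hex string needs an even number of digits; both helper names are hypothetical.

```python
def pad_as_hex_bytes(arg: str) -> bytes:
    """Read arg as hex digits (already-encoded bytes) and right-pad to 32 bytes."""
    return bytes.fromhex(arg).ljust(32, b"\x00")


def pad_as_text(arg: str) -> bytes:
    """Read arg as text, encode it to ASCII bytes, then right-pad to 32 bytes."""
    return arg.encode("ascii").ljust(32, b"\x00")


arg = "abcd"
print(pad_as_hex_bytes(arg).hex())  # begins with abcd
print(pad_as_text(arg).hex())       # begins with 61626364
```

The same four characters yield two different 32-byte words depending on which interpretation is chosen.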

Removed the "stale" label on Oct 29, 2024.