Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERC: Signed Data Standard #191

Closed
holiman opened this issue Jan 20, 2017 · 21 comments
Closed

ERC: Signed Data Standard #191

holiman opened this issue Jan 20, 2017 · 21 comments

Comments

@holiman
Copy link
Contributor

holiman commented Jan 20, 2017

ERC: 191
Title: Signed Data Standard
Authors: Martin Holst Swende, Nick Johnson
Status: Draft
Type: Informational
Created: 2016-01-20

Abstract

This ERC proposes a specification about how to handle signed data in Etherum contracts.

Motivation

Several multisignature wallet implementations have been created which accepts presigned transactions. A presigned transaction is a chunk of binary signed_data, along with signature (r, s and v). The interpretation of the signed_data has not been specified, leading to several problems:

  • Standard Ethereum transactions can be submitted as signed_data. An Ethereum transaction can be unpacked, into the following components: RLP<nonce, gasPrice, startGas, to, value, data> (hereby called RLPdata), r, s and v. If there are no syntactical constraints on signed_data, this means that RLPdata can be used as a syntactically valid presigned transaction.
  • Multisignature wallets have also had the problem that a presigned transaction has not been tied to a particular validator, i.e a specific wallet. Example:
    1. Users A, B and C have the 2/3-wallet X
    2. Users A, B and D have the 2/3-wallet Y
    3. User A and B submites presigned transaction to X.
    4. Attacker can now reuse their presigned transactions to X, and submit to Y.

Specification

We propose the following format for signed_data

0x19 <1 byte version> <version specific data> <data to sign>. 

Version 0 has <20 byte address> for the version specific data, and the address is the intended validator. In the case of a Multisig wallet, that is the wallet's own address .

The initial 0x19 byte is intended to ensure that the signed_data is not valid RLP

For a single byte whose value is in the [0x00, 0x7f] range, that byte is its own RLP encoding.

That means that any signed_data cannot be one RLP-structure, but a 1-byte RLP payload followed by something else. Thus, any ERC-191 signed_data can never be an Ethereum transaction.

Additionally, 0x19 has been chosen because since ethereum/go-ethereum#2940 , the following is prepended before hashing in personal_sign:

"\x19Ethereum Signed Message:\n" + len(message).

Using 0x19 thus makes it possible to extend the scheme by defining a version 0x45 (E) to handle these kinds of signatures.

Example

function submitTransactionPreSigned(address destination, uint value, bytes data, uint nonce, uint8 v, bytes32 r, bytes32 s)
    public
    returns (bytes32 transactionHash)
{
   // Arguments when calculating hash to validate
    // 1: byte(0x19) - the initial 0x19 byte
    // 2: byte(0) - the version byte 
    // 4: this - the validator address
    // 4-7 : Application specific data
    transactionHash = keccak256(byte(0x19),byte(0),this,destination, value, data, nonce);
    sender = ecrecover(transactionHash, v, r, s);
    // ...
}
@holiman
Copy link
Contributor Author

holiman commented Jan 20, 2017

@holiman
Copy link
Contributor Author

holiman commented Jan 20, 2017

I'm thinking that it may be a good idea to define the initial byte as 0x19 instead. Since ethereum/go-ethereum#2940 , the following is prepended before hashing in personal_sign:

"\x19Ethereum Signed Message:\n" + len(message).

Using 0x19 would make it possible to extend the scheme by defining a version 0x45 (E) to handle these kinds of signatures.

@Arachnid
Copy link
Contributor

That makes sense, since that still corresponds to an RLP-encoded small integer and thus isn't a valid RLP transaction.

@holiman
Copy link
Contributor Author

holiman commented Jan 20, 2017

I updated the proposal to use 0x19 instead of 0x00

@subtly
Copy link
Member

subtly commented Jan 22, 2017

I would propose this to be scoped to / maintained within Solidity. Can this also be simplified to a suggestion for developers who use ecrecover (below)? Might also make sense as a part of metropolis.

txHash = keccak256(uniqueHash, data); // uniqueHash can be one of many schemes, we SUGGEST ?, as described in solidity documentation
addr = ecrecover(txhash, v, r, s);

Thus, any ERC-191 signed_data can never be an Ethereum transaction.

This may not be true in metropolis (see: #86). In metropolis account model, it could be that internal calls between accounts are the same as transactions without the '0x0' signature - which is merely a mechanism for backwards compatibility. External transactions may have signed_data as 0x0, but internal transactions may enumerate or pass signed_data for other schemes where the payload begins with 0x19.

Overall Concerns:

  • Other systems/standards have different schemes (x509, NIST, IPFS)
  • Nonce is application specific
  • Nonce requires state and introduces complexity

Nonces introduces other problems and without further defining the nonce mechanism, the proposal doesn't address replay attacks.

@Arachnid
Copy link
Contributor

I would propose this to be scoped to / maintained within Solidity.

Can you clarify what you mean by this? This is a standard that defines how to encode strings for hashing and signing; while it would be theoretically possible to add this as a feature in Solidity, there doesn't seem a lot of point.

This may not be true in metropolis (see: #86). In metropolis account model, it could be that internal calls between accounts are the same as transactions without the '0x0' signature - which is merely a mechanism for backwards compatibility.

As long as transactions remain RLP-encoded, the prefix will ensure that they can't be replayed against contracts using this scheme.

Nonce is application specific

A nonce isn't part of this EIP.

@msacks
Copy link

msacks commented Jan 22, 2017 via email

@coder5876
Copy link

@holiman Nice initiative! I've also been playing around with this kind access control based on presigned messages.

Question about the versioning: Are you considering a central repository of versions and their definitions, once other uses emerge besides multisig?

@Georgi87
Copy link

I am wondering if the Signed Data Standard should allow to define different call types.

Using proxy contracts (e.g. multisigs) it would make sense to allow not only .call but also .delegatecall and .callcode.

One example: Assuming a multisig contract wants to approve tokens before executing a contract function, it would have to do 2 transactions using .call. Using .delegatecall those two transactions could be combined into one using a library call, which will first approve tokens and then calls the contract function.

@MicahZoltu
Copy link
Contributor

Why does this need to be a standard? I think it is a reasonable best practice and contracts that validate pre-signed input should validate that the input was intended for that contract, but I don't see value in enforcing any kind of cross-contract standard. In fact, this standard is effectively a standard meant to ensure that no two signatures are compatible. :)

Is the purpose of this EIP issue just to open up some discussion and recommend a best practice or is the intent that it will be formalized/standardized in some way? If the latter, then I'm against it without substantially more argument as to why we need a standard. If the former, I would really love it if there was a better place to put "best practices and recommendations" rather than EIPs. :/

@carver
Copy link
Contributor

carver commented Sep 28, 2017

Is the intention for this signing format to apply in contexts like signing messages in state channels? If so, there could be a benefit to avoiding replay attacks using EIP 155 values of v.

Even if not, it would be good for the signed data standard to explicitly name valid values of v.

ie~ Is {0, 1} valid, or only {27, 28}?

Edit: Also, a standard serialization format with the signature could be valuable, for passing around one blob of a signed message. Probably either: the simple concatenation of bytes: message_hash + r + s + v, or the RLP encoding of [message_hash, r, s, v]. Since all values are fixed size, RLP seems unnecessary, though.

@alex-miller-0
Copy link

What about adding a chainId parameter here to prevent cross-chain replays?

@3esmit
Copy link
Contributor

3esmit commented Jan 28, 2018

@holiman Thanks for this ERC, I think that your version byte means that the wallet softwares that provide UX could understand one, or several subtypes of signatures. I think 255 types are more then we need, but 0xFF can be used to extend this to the second byte, just like if this type contain subtypes.
0x45 (hex of char E) should be also a supported type by wallets, as is the legacy support, so it could also sign to old contracts. 0x00 could be the pattern you specified.

@MicahZoltu The standarization is important so signatures can be verified as safe and understood by wallets. Also through natspec or some compilation magic a signature interface could be created and used in wallets together with (or inside) ABI to interact with a smart-contract, so user can sign a message using UI provided by the wallet software.

@carver @alex-miller-0 Will chainId will ever be available in smart-contract? Otherwise we will need to hardcode the chainId in smart contract or passing as a variable in constructor.
I think is interesting to have chainId available as a global variable to then wallets automatically sign messages with chainId included.

I agree that reply attack is somewhat important, I see that the reply attack can be performed in 2 cases:

  1. A forked chain (like ETC) and people keep using both sides of an smart contract which uses signed messages. This is likely to happen in case of a DAO which controls native token (or a token backed by a native token DAO).
  2. A malicious deployer deploy the same address in completly different ethereum-compatible networks and user (wrongly) use the same private key for both networks.

@3esmit
Copy link
Contributor

3esmit commented Jan 28, 2018

Consider the following example:

pragma solidity ^0.4.18;

contract PreSignedExample {

    function getFooBarHash(bytes data, uint nonce) 
        internal
        pure
        returns (bytes32 unsignedDataHash)
    {
        return keccak256(byte(0x19),byte(0), address(this), data);
    }

    function submitPreSigned(bytes data, uint nonce, uint8 v, bytes32 r, bytes32 s)
    public
    {
        // ...
        sender = ecrecover(getFooBarHash(data, nonce v, r, s);
        // ...
    }

}

getApplicationUnsignedDataHash(bytes data, uint nonce) would be used by UI to help signing a transaction.
This could be improved within solidity, by adding a new keyword to specify that function purpose is to build a signed term to the contract.
With this, a interface could be created that could be used with ABI to wallets able to sign a message to that contract. This also will be converted into a public constant function.

I would suggest something like this:

contract PreSignedExample {

    mapping (address => uint256) public nonce;

    /** 
    * @notice signs `data` for foo and bar
    * @param signer the account being signed (needed to get nonce), maybe msg.signer?
    * @param data the data to sign
    * @signed header Safety header
    * @signed version signature version type
    * @signed signAtAddress `address(this)` the address of the contract receving the signature
    * @signed data the data provided
    * @signed nonce `this.nonce(msg.signer)` unique incremental identifier of this interaction 
    **/
    signature signFooBar(address signer, bytes data) 
        signs(byte header, byte version, address signAtAddress, bytes data, uint256 nonce)
    {
        sign(byte(0x19),byte(0), address(this), nonce[signer], data); 
    }


    function submitPreSigned(address signer, bytes data, uint8 v, bytes32 r, bytes32 s)
    public
    {
        // ...
        sender = ecrecover(signFooBar(signer, data), v, r, s);
        // ...
    }

}

There are some problems, in the example msg.signer is known at signing moment, but when recovering this information is unknown yet, so we should know what nonce is or provide the signer to findout the nonce. So I needed to provide address signer there.
This example have lots of room for improvement, such as fixed prefixes byte header, byte version, by modifiers.

I think this is interesting to help wallets understand correctly what they are signing, and also to provide a seamless development environment.

@CBDotGuru
Copy link
Contributor

CBDotGuru commented Jan 28, 2018

@alex-miller-0 and @3esmit I was just messing with this today. I think the chainId is an important addition, we have to assume 1) There will be multiple public chains for which the same data packet would be useful and 2) Users prefer to have the same addresses cross chain. I defined the chainId in the contract constructor, this is ideal since you could deploy the same contract in any EVM network, therefore it doesn't make sense to hardcode this value. However, the chainId storage variable should be immutable after being set, so the setter should only appear in the constructor. With that said, adding chainId into the keccak256() function actually results in a stack depth failure when you do:
bytes32 txHash = keccak256(byte(0x19), byte(0), chainId, this, destination, value, data, nonce);
but I was able to get it to work by simply combining the first two bytes:
bytes32 txHash = keccak256(bytes2(0x1900), chainId, this, destination, value, data, nonce);

With that said, I suggest adding the chainId in that position because it fits in the hierarchy: version->chain->address->data.

@ligi
Copy link
Member

ligi commented Apr 6, 2018

I think this issue can be closed now as of #973

@Arachnid Arachnid closed this as completed Apr 6, 2018
Arachnid added a commit that referenced this issue Apr 9, 2018
Arachnid added a commit to Arachnid/EIPs that referenced this issue May 2, 2018
@fulldecent
Copy link
Contributor

Please reopen this issue and set it as the Discussions-To address for the EIP during Last Call.

@fulldecent
Copy link
Contributor

Also, the code in the example does not compile.

Also, the code is using an unsupported version of Solidity. The Solidity project recommends you to use the latest version (now 0.8.4) and does not support older versions.

@benjreinhart
Copy link

benjreinhart commented Jan 27, 2023

I have a question re: length of signed message. If I'm reading the spec right, it seems like length of message is ambiguous in two ways:

  1. There seems to be no separator between the length and the message itself. What if the message starts with digit(s)?
  2. Should this be the length of the message string, or the length of the resulting message bytes? What encoding should be used? UTF8 is variable width encoding. Do we normalize the representation in any way?

Using JS in examples, but similar situations exist across languages.

Separator

For 1) above, consider the following:

const message = '0mg'
const length = message.length // 3
const eip191 = '\x19Ethereum Signed Message:\n30mg'

The length is 3 but it's ambiguous because the message starts with a digit as well. How do we know what's right in this case? Does it matter?

Encoding

For 2) above, the string length and resulting bytes length depend on encoding and normalization.

const message1 = 'héllo'
const slength1 = message1.length // 6
const bytesLength1 = new TextEncoder().encode(message1).byteLength // 7

const message2 =  'héllo'.normalize('NFKC')
const slength2 = message2.length // 5
const bytesLength2 = new TextEncoder().encode(message2).byteLength // 6

Depending on multiple factors, we get different results. Different languages have different representations. What is the correct representation that we want to use for the message length?

Final note

The spec also doesn't make it clear (though, maybe i'm missing something) whether or not this matters. Is length taken into account anywhere? Is it just ignored? Can this cause issues? At a minimum, seems like the same message can easily end up encoded differently and result in different signatures depending on language / implementation.

@carver
Copy link
Contributor

carver commented Feb 1, 2023

This is a kind of meta-standard that defines a few different signed message standards. Both issues you are referring to specifically only apply the "E" or 0x45 version.

I consider this version deprecated, because of these issues, and we warn against using it in the web3.py docs and the eth-account docs.

You can always replace a 0x45 with a 0x00 version, so that's my recommendation. I'd be in support of some kind of broader public confirmation of the deprecated status.

@arifulone
Copy link

arifulone commented Mar 15, 2024

https://eips.ethereum.org/EIPS/eip-191#:~:text=NB%3A%20The%20E%20in%20Ethereum%20Signed%20Message%20refers%20to%20the%20version%20byte%200x45.%20The%20character%20E%20is%200x45%20in%20hexadecimal%20which%20makes%20the%20remainder%2C%20thereum%20Signed%20Message%3A%5Cn%20%2B%20len(message)%2C%20the%20version%2Dspecific%20data.

NB: The E in Ethereum Signed Message refers to the version byte 0x45. The character E is 0x45 in hexadecimal which makes the remainder, thereum Signed Message:\n + len(message), the version-specific data

How E is 0x45 in hex . i couldn't get the point . can u plz explain me sir ....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

19 participants