Yet another cool checksum address encoding #55

Closed
vbuterin opened this Issue Jan 14, 2016 · 75 comments

Collaborator

vbuterin commented Jan 14, 2016

EDITOR UPDATE (2017-08-24): This EIP is now located at https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md. Please go there for the correct specification. The text below may be incorrect or outdated, and is not maintained.

Code:

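from ethereum import utils  # pyethereum helpers (Python 2)

def checksum_encode(addr): # Takes a 20-byte binary address as input
    o = ''
    # Hash the raw address bytes and read the digest as a 256-bit integer
    v = utils.big_endian_to_int(utils.sha3(addr))
    for i, c in enumerate(addr.encode('hex')):
        if c in '0123456789':
            # Digits have no case, so they carry no check bit
            o += c
        else:
            # Letter i is uppercase iff bit i of the hash (most significant bit first) is 1
            o += c.upper() if (v & (2**(255 - i))) else c.lower()
    return '0x'+o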

In English: convert the address to hex, but if the ith digit is a letter (i.e. one of abcdef), print it in uppercase if the ith bit of the hash of the address (in binary form) is 1, and in lowercase otherwise.

Benefits:

  • Backwards compatible with many hex parsers that accept mixed case, allowing it to be easily introduced over time
  • Keeps the length at 40 characters
  • The average address will have 60 check bits, and less than 1 in 1 million addresses will have less than 32 check bits; this is stronger performance than nearly all other check schemes. Note that the very tiny chance that a given address will have very few check bits is dwarfed by the chance in any scheme that a bad address will randomly pass a check

UPDATE: I was actually wrong in my math above. I forgot that the check bits are per-hex-character, not per-bit (facepalm). On average there will be 15 check bits per address, and the net probability that a randomly generated address if mistyped will accidentally pass a check is 0.0247%. This is a ~50x improvement over ICAP, but not as good as a 4-byte check code.
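
The corrected figures can be reproduced with a short calculation. The sketch below rests on two assumptions that are not spelled out in the post: each of the 40 hex characters is uniformly random (so it is a letter with probability 6/16), and a mistyped address behaves like a random string, where a digit always passes and a letter passes only if its case happens to match (probability 1/2).

```python
# Assumption: hex characters are uniformly distributed over 0-9a-f.
HEX_CHARS = 40
P_LETTER = 6 / 16          # a-f out of 16 possible hex characters

# Each letter carries one check bit (its case); digits carry none.
expected_check_bits = HEX_CHARS * P_LETTER

# A random character passes if it is a digit (10/16) or a letter whose
# case happens to match the expected one (6/16 * 1/2).
p_char_passes = 10 / 16 + P_LETTER / 2     # = 13/16
p_address_passes = p_char_passes ** HEX_CHARS

print(expected_check_bits)                 # 15.0
print(f"{p_address_passes:.4%}")           # 0.0247%
```

Under these assumptions the result is (13/16)^40, which matches the 0.0247% quoted in the update.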

Examples:

  • 0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826 (the "cow" address)
  • 0x9Ca0e998dF92c5351cEcbBb6Dba82Ac2266f7e0C
  • 0xcB16D0E54450Cdd2368476E762B09D147972b637

Contributor

chfast commented Jan 14, 2016

This is a very nice idea.

I wonder about the 0x prefix. Is it mandatory or preferred?

Collaborator

vbuterin commented Jan 15, 2016

Hmm, I'm fine either way, though I definitely see the rationale for standardizing one way or the other.

Member

Souptacular commented Jan 15, 2016

I saw comments on the TurboEthereum guide that suggested that we were moving away from raw hex keys into ICAP keys:

ICAP: XE472EVKU3CGMJF2YQ0J9RO1Y90BC0LDFZ
Raw hex: 0092e965928626f8880629cec353d3fd7ca5974f

"Notice the last two lines there. One is the ICAP address, the other is the raw hexadecimal address. The latter is an older representation of address that you'll sometimes see and is being phased out in favour of the shorter ICAP address which also includes a checksum to avoid problems with mistyping. All normal (aka direct) ICAP addresses begin with XE so you should be able to recognise them easily."

My concern is that, if there was a previous decision to start moving to ICAP, this may add confusion. However, if this helps give raw hex addresses a checksum, I guess that can only be beneficial, even if everyone wants to move to ICAP eventually.

Member

tgerring commented Jan 17, 2016

My preference is that a checksum-enabled Ethereum address is immediately recognizable as such.

The proposed solution is not immediately recognizable as being distinct from a standard Ethereum address and could be confused for being a strangely-cased version of non-checksummed addresses. Although it offers superior backwards compatibility, I believe it will only cause additional confusion to the end-user.

Since the change in format serves to make the address less error prone through checksums, I posit they should also be immediately recognizable through a fixed prefix or otherwise obvious identifier. One reason why I prefer ICAP over this proposed solution is that it signals to the user clearly that this is an Ethereum address and cannot be confused with a transaction/block hash.

Contributor

alexvandesande commented Feb 17, 2016

Just saw this proposal now.

I disagree, @tgerring, that it will cause confusion: to a layman, it will be indistinguishable from a normal address. This approach is very easy to implement on the client side and doesn't require much. I would say this could be adopted as a great intermediary before ICAP; it would also be a good alternative if ICAPs don't catch on.

Contributor

alexvandesande commented Feb 18, 2016

I did a rudimentary implementation in JavaScript in the web3 object:

var isAddress = function (address) {
    if (!/^(0x)?[0-9a-f]{40}$/i.test(address)) {
        // check if it has the basic requirements of an address
        return false;
    } else if (/^(0x)?[0-9a-f]{40}$/.test(address) || /^(0x)?[0-9A-F]{40}$/.test(address)) {
        // If it's all lowercase or all uppercase, return true
        return true;
    } else {
        // Otherwise check each case
        address = address.replace('0x','');

        // creates the case map using the binary form of the hash of the address
        var caseMap = parseInt(web3.sha3('0x'+address.toLowerCase()),16).toString(2).substring(0, 40);

        for (var i = 0; i < 40; i++ ) { 
            // the nth letter should be uppercase if the nth digit of casemap is 1
            if ((caseMap[i] == '1' && address[i].toUpperCase() != address[i])|| (caseMap[i] == '0' && address[i].toLowerCase() != address[i])) {
                return false;
            }
        }
        return true;
    }
};


/**
 * Makes a checksum address
 *
 * @method toChecksumAddress
 * @param {String} address the given HEX address
 * @return {String}
*/
var toChecksumAddress = function (address) {

    var checksumAddress = '0x';
    address = address.toLowerCase().replace('0x','');

    // creates the case map using the binary form of the hash of the address
    var caseMap = parseInt(web3.sha3('0x'+address),16).toString(2).substring(0, 40);

    for (var i = 0; i < address.length; i++ ) {  
        if (caseMap[i] == '1') {
          checksumAddress += address[i].toUpperCase();
        } else {
            checksumAddress += address[i];
        }
    }

    console.log('create: ', address, caseMap, checksumAddress)
    return checksumAddress;
};

It works internally and it's almost invisible to the user. I don't really see a good reason not to implement it.
My results don't match yours, @vbuterin; it might be interesting to figure out why. Here are my results omitting the '0x' before hashing the address:

  • 0xCD2a3d9F938e13cd947Ec05AbC7fE734dF8dd826
  • 0x9CA0E998df92C5351CeCBBb6DBa82Ac2266F7E0c
  • 0xCb16d0e54450cDd2368476E762b09D147972B637

And here the results including 0x on the hash of the address:

  • 0xCd2A3D9f938e13CD947ec05abC7Fe734dF8dd826
  • 0x9Ca0e998DF92c5351CECbBb6DbA82ac2266F7e0C
  • 0xCB16D0e54450cDD2368476e762b09d147972B637

frozeman added the ERC label Feb 19, 2016

Contributor

alexvandesande commented Feb 19, 2016

Just pushed an experimental branch to web3.js and the wallet.

I would love feedback from anyone on those.

Collaborator

vbuterin commented Feb 19, 2016

> web3.sha3('0x'+address)

You're hashing the hex and not the binary.

Contributor

alexvandesande commented Feb 19, 2016

Good catch. I switched to the sha3 of the binary, but the results still won't match. I'm a bit confused about what you meant by "# Takes a 20-byte binary address as input". Ethereum addresses are 160 bits.

For example:

  • address: 0xcd2a3d9f938e13cd947ec05abc7fe734df8dd826 (why cow? What joke did I miss?)
  • To binary: 1100110100101010001111011001111110010011100011100001001111001101100101000111111011000000010110101011110001111111111001110011010011011111100011011101100000100110
  • First 40 binary digits of the sha3 of the binary: 1110011101011010000000000110010011011010
  • That means it should start with three uppercase letters, followed by 2 lowercases, followed by 3 upper etc: 0xCD2a3D9F938E13Cd947ec05abC7fe734DF8DD826 not 0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826 in your example.

I suppose I am misunderstanding what you are using as input.

PS: you can probably simplify your example by not checking for letters: you can do uppercase conversions on numbers, and although there is such a thing as lowercase digits, they are represented the same.

Collaborator

vbuterin commented Feb 19, 2016

>>> from ethereum import utils
>>> base_addr = utils.privtoaddr(utils.sha3('cow'))
>>> base_addr
'\xcd*=\x9f\x93\x8e\x13\xcd\x94~\xc0Z\xbc\x7f\xe74\xdf\x8d\xd8&'
>>> utils.sha3(base_addr)
'\xa2\x86)\xe4\x18A\xcc^p(\x99"z\x10\xd8\xfd}\xeb\xed\x9c\xe8\x7fG\xa9]\xcc;\xed\xd9\xa8\xa4\xef'

By "binary" I meant "just the raw bytes, not any kind of encoded representation". There's also the special chars ¹²³⁴⁵⁶⁷⁸⁹⁰ I suppose, but that's not backwards-compatible anymore.
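
The rule can be checked against the transcript above without a Keccak library. The following Python 3 sketch is only an illustration, not the reference implementation: `apply_case_map` is a hypothetical helper name, and the digest is copied from the REPL output rather than computed, because Python's standard `hashlib.sha3_256` implements the standardized SHA-3, which is a different function from the Keccak-256 that `utils.sha3` refers to.

```python
def apply_case_map(addr: bytes, digest: bytes) -> str:
    """Uppercase hex letter i of addr iff bit i of digest is 1."""
    v = int.from_bytes(digest, "big")   # digest as a 256-bit integer
    out = []
    for i, c in enumerate(addr.hex()):  # 40 lowercase hex characters
        bit = (v >> (255 - i)) & 1      # bit i, counted from the most significant end
        out.append(c.upper() if c in "abcdef" and bit else c)
    return "0x" + "".join(out)

# Raw 20-byte "cow" address and its hash, both taken from the transcript above.
cow_addr = bytes.fromhex("cd2a3d9f938e13cd947ec05abc7fe734df8dd826")
cow_digest = bytes.fromhex(
    "a28629e41841cc5e702899227a10d8fd7debed9ce87f47a95dcc3bedd9a8a4ef"
)

print(apply_case_map(cow_addr, cow_digest))
# 0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826
```

The output agrees with the "cow" example in the opening post, which suggests that hashing the raw bytes (rather than the hex string) is what produces those examples.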

Member

pipermerriam commented Feb 19, 2016

I initially like this quite a bit. All of the cons that I see are extreme edge cases, and I think it's pretty trivial for library authors to handle them gracefully. I like the backwards compatibility and the compatibility with existing hex parsing utilities.

Contributor

alexvandesande commented Feb 19, 2016

> base_addr '\xcd*=\x9f\x93\x8e\x13\xcd\x94~\xc0Z\xbc\x7f\xe74\xdf\x8d\xd8&'

I'm not sure if web3.js converts to bytes. Also, pure JavaScript only supports binary conversion up to a hard limit; any larger and I had to use the BigNumber library. Wouldn't it be simpler just to use sha3(address)?

Collaborator

vbuterin commented Feb 19, 2016

> Wouldn't it be simpler just to use sha3(address)?

Mathematically speaking it would be a bit ugly imo.

> I'm not sure if web3.js converts to bytes

Yeah, I had this problem; for one of my example gambling dapps where I was using a hash-commit-reveal protocol I took an existing sha3 impl; you could do the same: https://github.com/ethereum/dapp-bin/blob/master/serpent_gamble/scripts/sha3.min.js

simenfd commented Feb 19, 2016

I see some problems with ICAP's variable length and low checksum bit size:

"XE7338O073KYGTWWZN0F2WZ0R8PX5ZPPZS": This is a 30-character address, IBAN compatible, based on the "Direct approach" from https://github.com/ethereum/wiki/wiki/ICAP:-Inter-exchange-Client-Address-Protocol

Now, if you enter such an address and accidentally add another character somewhere, you have created a "Basic" address (IBAN-incompatible, but allowed and valid in the Ethereum ICAP implementation). The problem is that, naively, without knowing all the properties of the checksum algorithm, there is a 1% chance this will pass validation, and consequently you are sending money into a black hole.

On the topic of checksums in hex addresses:
0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826

I agree that there should be some easy identification mechanism to separate it from an unchecked address. Alternatives might include:
XxCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826
ExCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826
#Cd2a3d9f938e13Cd947eC05ABC7fe734df8DD826

This makes it not completely backwards compatible, but incredibly easy to edit to satisfy a legacy system without any checksums.

Contributor

alexvandesande commented Feb 19, 2016

@simenfd

> there should be some easy identification mechanism to separate it from an unchecked address.

I disagree. I think the whole point of this scheme is that it's completely backwards compatible. There's no point in separating them. In my implementation, if the address is all uppercase or all lowercase, it is assumed to be an unchecksummed address. In a 40-character address there will be on average 15 letters, so the chance of all of them being the same case is 1:16384, which I guess is strong enough.

Member

pipermerriam commented Feb 19, 2016

> the chances of all of them being the same case is 1:16384

That was exactly my line of thinking as well. It's safe enough to assume that all caps or all lower addresses are not checksummed.

christianlundkvist commented Feb 19, 2016

The backwards compatibility is nice but IMO presents a clear danger: If the user believes that the address has a checksum she might be willing to input an address by hand. If she then happens to use an old version of transaction handling that just parses the hex ignoring the case then her funds are lost in the case of a typo.

For this reason my feeling is that I prefer a scheme that would make a normal hex parser throw an error, rather than a user thinking she's protected by a checksum when in fact she is not.

Contributor

alexvandesande commented Feb 19, 2016

@christianlundkvist that's a good point, which can be solved with UI: show red when it fails, show yellow when it's not checksummed.
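
The red/yellow idea amounts to a three-state check. Here is a minimal Python sketch of it, under stated assumptions: `classify` and `stub_keccak256` are hypothetical names, the hash function is passed in by the caller because the Python standard library has no Keccak-256, and the stub only knows the "cow" address whose digest was posted earlier in the thread (so it can demonstrate case errors, but not mistyped characters).

```python
import re
from typing import Callable

def classify(address: str, keccak256: Callable[[bytes], bytes]) -> str:
    """Return 'invalid', 'unchecksummed', or 'checksummed' for a hex address."""
    m = re.fullmatch(r"(?:0x)?([0-9a-fA-F]{40})", address)
    if m is None:
        return "invalid"              # not 40 hex characters: show red
    body = m.group(1)
    if body == body.lower() or body == body.upper():
        return "unchecksummed"        # single-case input: show yellow
    v = int.from_bytes(keccak256(bytes.fromhex(body)), "big")
    for i, c in enumerate(body):
        if c.isalpha():
            want_upper = bool((v >> (255 - i)) & 1)
            if c.isupper() != want_upper:
                return "invalid"      # case disagrees with the hash: show red
    return "checksummed"

# Stub standing in for a real Keccak-256 (illustration only).
_KNOWN = {
    bytes.fromhex("cd2a3d9f938e13cd947ec05abc7fe734df8dd826"):
        bytes.fromhex("a28629e41841cc5e702899227a10d8fd7debed9ce87f47a95dcc3bedd9a8a4ef"),
}

def stub_keccak256(data: bytes) -> bytes:
    return _KNOWN[data]

print(classify("0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826", stub_keccak256))  # checksummed
print(classify("0xcd2a3d9f938e13cd947ec05abc7fe734df8dd826", stub_keccak256))  # unchecksummed
print(classify("0xCD2a3d9f938e13Cd947eC05ABC7fe734df8DD826", stub_keccak256))  # invalid
```

In a real wallet the stub would be replaced by an actual Keccak-256 implementation; the three return values map to the green, yellow, and red states a UI might show.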

simenfd commented Feb 19, 2016

@christianlundkvist Exactly my point: false security might be more dangerous than no security. E.g. when I enter a Bitcoin address by hand (yeah, quite rarely), I am quite confident that the system will catch an error thanks to the 32-bit checksum that is universally implemented there; I wish I could have the same confidence in Ethereum.

For fun, I tried to make some ICAP addresses using the functions in the go-ethereum implementation. The first line shows the original address and its ICAP form; the remaining lines are all mutated ICAP addresses that validate but are, of course, different addresses.
0x11c5496aee77c1ba1f0854206a26dda82a81d6d8 == XE1222Q908LN1QBBU6XUQSO1OHWJIOS46OO

XE1222Q908LN1QBBU6XUQSO1OHWJIOS4603
XE1222Q908LN1QBBU6XUQSO1OHWJIOS4700
XE1222Q908LN1QBBU6XUQSO1OHWJIOS48IO
XE1222Q908LN1QBBU6XUQSO1OHWJIOS49FO
XE1222Q908LN1QBBU6XUQSO1OHWJIOS5AO5
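
To see why mutations like these can slip through, it helps to look at how a two-digit mod-97 check behaves. The sketch below assumes ICAP reuses the standard IBAN check (which is what "IBAN compatible" suggests); the function names are hypothetical, and the 30-character body is borrowed from the earlier example purely as test data.

```python
from itertools import product

def _to_digits(s: str) -> str:
    """Map 0-9 to themselves and A-Z to 10-35, as IBAN does."""
    return "".join(str(int(ch, 36)) for ch in s.upper())

def iban_check_digits(country: str, bban: str) -> str:
    """Two check digits chosen so that the full string is 1 modulo 97."""
    remainder = int(_to_digits(bban + country + "00")) % 97
    return f"{98 - remainder:02d}"

def iban_is_valid(code: str) -> bool:
    """Move the first four characters to the end and test mod 97 == 1."""
    return int(_to_digits(code[4:] + code[:4])) % 97 == 1

# Illustrative body; the check digits are computed here, not copied.
bban = "38O073KYGTWWZN0F2WZ0R8PX5ZPPZS"
code = "XE" + iban_check_digits("XE", bban) + bban
print(iban_is_valid(code))   # True

# Count how many variants of the last two characters still validate.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
variants = [code[:-2] + a + b for a, b in product(ALPHABET, repeat=2)]
passing = sum(iban_is_valid(v) for v in variants)
print(passing, "of", len(variants), "two-character variants pass")
```

Because the check only has 97 possible remainders, roughly one random variant in 97 would be expected to pass, which is consistent with the "1% chance" mentioned above.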

christianlundkvist commented Feb 19, 2016

@alexvandesande: My main point was that backwards compatibility allows you to use the address in a dapp that was created before this EIP. So the UI in this case wouldn't know anything about checksummed addresses and wouldn't give the user any specific warning. If the user receives an address like 0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826 they would think "sweet, it's checksummed!" and type it by hand into an app which hasn't been updated, and lose Ether when they make a typo.

Member

pipermerriam commented Feb 19, 2016

I'd like to challenge the idea that we should pay much attention to the "type it in by hand" use cases. If the ecosystem matures then we'll have good tooling around QR-code based transmission of addresses or something else that's even better UX.

> If the user receives an address like 0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826 they would think "sweet, it's checksummed!" and type it by hand into an app which hasn't been updated, and lose Ether

The only way to avoid this situation is to have checksummed addresses be backwards incompatible. I'm of the opinion that backwards incompatibility is worse than cases where someone burns ether using an app that doesn't implement checksumming using an address that "looks" like it's checksummed. I think this situation is likely to be rare and to largely apply to using old software from before the checksum days, or poorly written software.

christianlundkvist commented Feb 19, 2016

@pipermerriam:

> I'd like to challenge the idea that we should pay much attention to the "type it in by hand" use cases.

In that case do you think we should not worry about checksumming at all? Are there other scenarios where checksums are used?

> The only way to avoid this situation is to have checksummed addresses be backwards incompatible.

I feel like this would be preferred.

> I think this situation is likely to be rare and to largely apply to using old software from before the checksum days, or poorly written software.

My view is that the moment the checksum is introduced a majority of software becomes old software, and people are notoriously slow at updating too...


alexvandesande Feb 20, 2016

Contributor

I don't believe we can expect users to notice the difference between a checksummed address and a normal one (most people don't notice this even for bank accounts where the last digit is separated, like 12345-7). That is not the point of the checksum.

The point of backwards compatibility is that transactions between checksum-enabled wallets are safer. If you make a typo in a wallet without checksum support you'll lose your ether, just like you do now, and it's that wallet developer's job to make the client more secure.

Also, I don't think copying by hand is the main situation here; if we were trying to optimize for that, we should be talking about pseudo-word seeds and name registries. Checksums are just extra protection against accidental typos and characters cut off when copying, and they are an extra assurance to the user that the address is still intact, just like the icon is.

I don't really see any disadvantage to adding these, as they were very simple to implement in web3.js.

Although I still haven't matched the initial implementation, probably because the basic primitives in Python are very different from what JavaScript offers. Since a lot of implementations will be in JavaScript, I still think it makes more sense to use the sha of the hex string, since that's the form in which the address reaches the library.


christianlundkvist Feb 20, 2016

I don't really feel very strongly either way TBH and the design of this particular checksum scheme is actually super cool. 😊
Thinking about my own interactions it's the need to always tell people to NEVER EVER type in an address by hand that gets annoying. But you are right @alexvandesande that as long as I update my own tools to use checksums I don't have to give people this advice anymore when advising them on using the tools that I build. 😊


ethernomad Feb 20, 2016

Any reason we don't use good old base 58?


alexvandesande Feb 20, 2016

Contributor

Jonathan: This would break backwards compatibility. We already have a proposed standard without backwards compatibility that adopts more characters; it's called IBAN.


taoteh1221 Feb 20, 2016

Just chiming in as a web2 dev, mostly an observer (of your work and of end-user discussions): if you look at the Ethereum subreddit these days there are a ton of new adopters with no tech experience at all trying to find out how to use Ethereum. In short, I believe anything, including typing addresses by hand, should be expected. I remember seeing pinned tweets on Twitter in 2014 with images (not text) of dogecoin addresses for charities etc. A lot of adopters may barely know their way around a computer at all, and I think if you manage to retain them you are a raging success and have what is needed for mass adoption.


alexvandesande Feb 20, 2016

Contributor

Agreed. And adding a case-sensitive checksum increases security for those cases, while being invisible to implementations that don't support it.


jprichardson Feb 20, 2016

Jonathan: This would break backwards compatibility. We already have a proposed standard without backwards compatibility that adopts more characters; it's called IBAN.

The IBAN / ICAP proposal is pretty bad though for those of us who want to use any wallet w/ HD capabilities. If you want HD, you can't have compatible IBAN/ICAP addresses, in which case you might as well just pick something else. Base58-check encoding is a good compromise that's familiar and easy to implement.


alexvandesande Feb 21, 2016

Contributor

Jonathan, this thread is not about IBAN. There are substantial enough roadblocks and criticisms of IBAN that it may be hard to make it the new standard. These can be addressed elsewhere, but suffice to say this is a good argument for having a backwards-compatible checksum right now.


alexvandesande Feb 22, 2016

Contributor

Doing binary byte conversion in JavaScript would be a burden and add a lot of unnecessary complexity to the code, IMHO. Instead I was able to simplify the code a lot by using sha3(address.toLowerCase()) and then checking each nth character of the hash. If it's 9 or upper (including a-f) then it should be uppercase, otherwise, it should be lowercase.

// Make a checksum address
var toChecksumAddress = function (address) {    
    address = address.toLowerCase().replace('0x','');
    var addressHash = web3.sha3(address);
    var checksumAddress = '0x';

    for (var i = 0; i < address.length; i++ ) { 
        // If ith character is 9 to f then make it uppercase 
        if (parseInt(addressHash[i], 16) > 8) {
          checksumAddress += address[i].toUpperCase();
        } else {
            checksumAddress += address[i];
        }
    }
    return checksumAddress;
};

//Check if address is checksum
var isAddress = function (address) {
    if (!/^(0x)?[0-9a-f]{40}$/i.test(address)) {
        // check if it has the basic requirements of an address
        return false;
    } else if (/^(0x)?[0-9a-f]{40}$/.test(address) || /^(0x)?[0-9A-F]{40}$/.test(address)) {
        // If it's all lowercase or all uppercase, return true
        return true;
    } else {
        // Otherwise check each case
        address = address.replace('0x','');
        var addressHash = web3.sha3(address.toLowerCase());

        for (var i = 0; i < 40; i++ ) { 
            // the nth letter should be uppercase if the nth hash digit is above 8, lowercase otherwise
            if ((parseInt(addressHash[i], 16) > 8 && address[i].toUpperCase() != address[i]) || (parseInt(addressHash[i], 16) <= 8 && address[i].toLowerCase() != address[i])) {
                return false;
            }
        }
        return true;
    }
};

It's very simple code with very low impact, and I don't see a good reason not to push it in the next release.


pipermerriam Feb 22, 2016

Member

@alexvandesande

If it's 9 or upper (including a-f) then it should be uppercase, otherwise, it should be lowercase.

Shouldn't this be "8 or upper" to make the split 8/8 instead of 7/9 for upper/lowercase? Was this intentional, an off-by-one error, or something else?

Also, can you post some example checksums so that I can test a python implementation against your implementation?
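To make the split in question concrete, here is a tiny Python count of how many of the 16 possible hash digits map to uppercase under each rule. It is only an illustration; the variable names are mine and nothing here comes from web3.js.

```python
HEX_DIGITS = "0123456789abcdef"

# "9 or upper": hash digit > 8 means uppercase
upper_if_gt_8 = [d for d in HEX_DIGITS if int(d, 16) > 8]
# "8 or upper": hash digit >= 8 means uppercase
upper_if_ge_8 = [d for d in HEX_DIGITS if int(d, 16) >= 8]

print(len(upper_if_gt_8), 16 - len(upper_if_gt_8))   # 7 9  (upper, lower)
print(len(upper_if_ge_8), 16 - len(upper_if_ge_8))   # 8 8  (upper, lower)
```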


jprichardson Feb 22, 2016

Also, can you post some example checksums so that I can test a python implementation against your implementation?

Yes, I think we're gonna do this as well, so test vectors would be appreciated!


pipermerriam Feb 22, 2016

Member

Here is a gist with a Python implementation.

https://gist.github.com/pipermerriam/f7633dc2657b1292860c

It differs from @alexvandesande's JS implementation in using 0-7 to mean lowercase and 8-f to mean uppercase. This results in approximately 1 in 1100 addresses being either all uppercase or all lowercase in their checksum format.

Using a 0-8/9-f lowercase/uppercase split changes that rate to about 1 in 700.


alexvandesande Feb 23, 2016

Contributor

@pipermerriam is completely right, I was off by one; it should be 8 or higher.

  1. Lowercase the address and remove the 0x prefix
  2. sha3 the address
  3. Set the case of the nth letter of the address according to the nth character of the hash:
  • 0, 1, 2, 3, 4, 5, 6, 7 → Lowercase
  • 8, 9, a, b, c, d, e, f → Uppercase

Here are some examples:

  • 0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed
  • 0xfB6916095ca1df60bB79Ce92cE3Ea74c37c5d359
  • 0xdbF03B407c01E7cD3CBea99509d93f8DDDC8C6FB
  • 0xD1220A0cf47c7B9Be7A2E6BA89F429762e7b9aDb

Would love to see if it matches anyone else's implementation.
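For anyone cross-checking in Python, the three steps above can be sketched as follows. This is a sketch under assumptions, not the web3.js code: it assumes "sha3" in this thread means original Keccak-256 (0x01 padding, not NIST SHA3-256), it ships a small pure-Python `keccak256` so the block runs without dependencies, and the names `keccak256` and `to_checksum_address` are mine.

```python
MASK64 = (1 << 64) - 1

def _rol(x, n):
    """Rotate a 64-bit lane left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64

def _keccak_f(a):
    """Keccak-f[1600] permutation on 25 lanes, indexed a[x + 5*y]."""
    lfsr = 1  # LFSR state that generates the round constants
    for _ in range(24):
        # theta
        c = [a[x] ^ a[x + 5] ^ a[x + 10] ^ a[x + 15] ^ a[x + 20] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol(c[(x + 1) % 5], 1) for x in range(5)]
        a = [a[i] ^ d[i % 5] for i in range(25)]
        # rho and pi
        x, y, cur = 1, 0, a[1]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, a[x + 5 * y] = a[x + 5 * y], _rol(cur, (t + 1) * (t + 2) // 2)
        # chi
        for row in range(0, 25, 5):
            t5 = a[row:row + 5]
            for x in range(5):
                a[row + x] = t5[x] ^ (~t5[(x + 1) % 5] & t5[(x + 2) % 5])
        # iota
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                a[0] ^= 1 << ((1 << j) - 1)
    return a

def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding 0x01 ... 0x80), pure Python."""
    rate = 136  # bytes absorbed per block for a 256-bit digest
    padded = bytearray(data) + b"\x01" + b"\x00" * (rate - 1 - len(data) % rate)
    padded[-1] |= 0x80
    state = [0] * 25
    for off in range(0, len(padded), rate):
        for i in range(rate // 8):
            state[i] ^= int.from_bytes(padded[off + 8 * i: off + 8 * i + 8], "little")
        state = _keccak_f(state)
    return b"".join(lane.to_bytes(8, "little") for lane in state[:4])

def to_checksum_address(address: str) -> str:
    # Step 1: lowercase and strip the 0x prefix
    addr = address.lower()
    if addr.startswith("0x"):
        addr = addr[2:]
    # Step 2: hash the ASCII hex string (not the 20 raw bytes)
    digest = keccak256(addr.encode("ascii")).hex()
    # Step 3: uppercase the nth character when the nth hash digit is 8..f
    return "0x" + "".join(
        ch.upper() if int(h, 16) >= 8 else ch for ch, h in zip(addr, digest)
    )

print(to_checksum_address("0x5aaeb6053f3e94c9b9a09f33669435e7ef1beaed"))
```

Feeding each of the examples above through `to_checksum_address(example.lower())` and comparing with the original string is a quick way to check another implementation against this one.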


vbuterin Feb 23, 2016

Collaborator

UPDATE: I was actually wrong in my math above. I forgot that the check bits are per-hex-character, not per-bit (facepalm). On average there will be 15 check bits per address, and the net probability that a randomly generated address if mistyped will accidentally pass a check is 0.0247%. This is a ~50x improvement over ICAP, but not as good as a 4-byte check code.
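One way to reproduce those two figures, as a back-of-the-envelope sketch: assume each of the 40 hex characters is independently a letter with probability 6/16, and that after a typo the case of each letter agrees with the new hash by chance with probability 1/2. The independence assumption and the variable names are mine, not part of the proposal.

```python
from fractions import Fraction

HEX_CHARS = 40
p_letter = Fraction(6, 16)                      # a-f out of 0-f

# One check bit per letter, so the expected count is 40 * 6/16.
expected_check_bits = HEX_CHARS * p_letter

# A character fails the check only if it is a letter AND its case is wrong.
p_char_passes = 1 - p_letter * Fraction(1, 2)   # 13/16
p_address_passes = float(p_char_passes ** HEX_CHARS)

print(expected_check_bits)                      # 15
print(f"{p_address_passes:.4%}")                # 0.0247%
```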


pipermerriam Feb 23, 2016

Member

@avsa our code agrees on those checksums. I added 4 more test vectors for all-upper and all-lower checksums. Here's the updated list.

# All caps
0x52908400098527886E0F7030069857D2E4169EE7
0x8617E340B3D01FA5F11F306F4090FD50E238070D
# All Lower
0xde709f2102306220921060314715629080e2fb77
0x27b1fdb04752bbc536007a920d24acb045561c26
# Normal
0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed
0xfB6916095ca1df60bB79Ce92cE3Ea74c37c5d359
0xdbF03B407c01E7cD3CBea99509d93f8DDDC8C6FB
0xD1220A0cf47c7B9Be7A2E6BA89F429762e7b9aDb

pipermerriam Feb 23, 2016

Member

Also, I had the idea of using the x in 0x... as another bit of checksum data by using either an upper or lowercase X/x. Thoughts? Worth the complexity?


axic Oct 21, 2016

Member

@chevdor the main tree is at https://github.com/ethereumjs/ethereumjs-util/ and passes the tests listed in #55 (comment)


pipermerriam Oct 25, 2016

Member

8-f is 50/50.

On Thu, Oct 20, 2016, 2:24 PM Chevdor notifications@github.com wrote:

Short summary because it seems that implementations have evolved and I
chased the correct implementation.



pipermerriam Oct 25, 2016

Member

Capitalising for >= 8 or >= a should be identical as 8 and 9 cannot be
capitalised anyway.

This isn't correct. You capitalize based on the digit in the sha3 of the
lowercased 40-character (20-byte) hexadecimal representation of the address.
The capitalization is applied to the characters of the address itself,
so there is a difference between >=8 and >=9. >=8 is the correct
implementation.

Another python implementation here:
https://github.com/pipermerriam/web3.py/blob/master/web3/utils/address.py#L45
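A small sketch of the point above: the threshold is compared against the hash digit, and the result is applied to the address character at the same position. The helper name `apply_case` and the hash digits are made up for illustration; `fake_hash` is not a real sha3 output.

```python
def apply_case(addr_hex: str, hash_hex: str, threshold: int) -> str:
    """Uppercase addr_hex[i] when the i-th HASH digit is >= threshold."""
    return "".join(
        a.upper() if int(h, 16) >= threshold else a.lower()
        for a, h in zip(addr_hex, hash_hex)
    )

addr = "ab12"
fake_hash = "8f98"   # made-up digits, not a real sha3 output

print(apply_case(addr, fake_hash, 8))   # AB12
print(apply_case(addr, fake_hash, 9))   # aB12
```

The first address character is a letter whose hash digit is 8, so the two thresholds give different results even though "8" and "9" themselves have no case.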


chevdor Oct 25, 2016

@pipermerriam I think you are commenting on old comments. I discussed this with @axic and the topic is clear.
I do agree with your comment about >=8 not being the same as >=9, since it is based on the hash.


pipermerriam Oct 25, 2016

Member

@chevdor not sure what happened there. I must have been looking at really old email notifications or something. 😄 carry on.. nothing to see here...


axic Oct 25, 2016

Member

@pipermerriam I've commented that without reading the implementation from months ago 😃


Recmo Nov 18, 2016

Initially, @vbuterin suggested to capitalise whenever the hash character is a..f

No. The original proposal capitalizes the n-th hex digit whenever the n-th bit in the hash of the address is set. So the first 40 bits of the 256-bit hash are used.

The current implementation modifies this by taking the hash of the lowercase hexadecimal encoding of the address and then using every fourth bit for capitalization (so 1st bit, 5th bit, etc.). The main reason for this extra complexity is that JavaScript or its libraries are bad at handling binary data, and this is somehow easier.

Here is @vbuterin's original implementation updated with these changes. It passes @alexvandesande's test vectors:

from ethereum import utils

def checksum_encode2(addr): # Takes a 20-byte binary address as input
    o = ''
    v = utils.big_endian_to_int(utils.sha3(addr.hex()))
    for i, c in enumerate(addr.hex()):
        if c in '0123456789':
            o += c
        else:
            o += c.upper() if (v & (2**(255 - 4*i))) else c.lower()
    return '0x'+o

def test(addrstr):
    assert(addrstr == checksum_encode2(bytes.fromhex(addrstr[2:])))

test('0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed')
test('0xfB6916095ca1df60bB79Ce92cE3Ea74c37c5d359')
test('0xdbF03B407c01E7cD3CBea99509d93f8DDDC8C6FB')
test('0xD1220A0cf47c7B9Be7A2E6BA89F429762e7b9aDb')
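As a sanity check on the "every fourth bit" description, the sketch below compares the bit test used in the code above with the "hash digit is 8 or higher" wording used earlier in the thread. It only checks that the two rules agree; the digest is a stand-in produced with SHA-256 from the standard library (any 32 bytes would do), not the hash the scheme actually uses.

```python
import hashlib

# Stand-in 256-bit digest; any 32 bytes work for this comparison.
digest = hashlib.sha256(b"placeholder input, not an address hash").digest()
v = int.from_bytes(digest, "big")
hex_digits = digest.hex()

mismatches = []
for i in range(40):
    bit_rule = bool(v & (2 ** (255 - 4 * i)))    # every fourth bit, from the top
    nibble_rule = int(hex_digits[i], 16) >= 8    # i-th hash digit is 8..f
    if bit_rule != nibble_rule:
        mismatches.append(i)

print(mismatches)   # []
```

Bit `255 - 4*i` is the most significant bit of the i-th hex digit, which is set exactly when that digit is 8 or higher.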

Recmo commented Nov 18, 2016

Initially, @vbuterin suggested to capitalise whenever the hash character is a..f

No. The original proposal capitalizes the n-th hex-digit whenever the n-th bit in the hash of the address is set. So the first 40 bits of the 224 bit hash are used.

The current implementation modifies this by taking the hash of the lowercase hexadecimal encoding of the address and then it uses every fourth bit for capitalization (so 1st bit, 5th bit, etc.). The main reason for this extra complexity is that Javascript or it's libraries are bad at handling binary data, and this is somehow easier.

Here is @vbuterin original implementation updated with these changes. It passes @alexvandesande's test vectors:

from ethereum import utils

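def checksum_encode2(addr): # Takes a 20-byte binary address as input
    o = ''
    # Hash the lowercase hex string of the address rather than the raw bytes
    v = utils.big_endian_to_int(utils.sha3(addr.hex()))
    for i, c in enumerate(addr.hex()):
        if c in '0123456789':
            o += c
        else:
            # Hex digit i is cased by bit 4*i of the hash, counted from the most significant bit
            o += c.upper() if (v & (2**(255 - 4*i))) else c.lower()
    return '0x'+o

def test(addrstr):
    # Re-encode the address bytes and compare with the expected mixed-case string
    assert addrstr == checksum_encode2(bytes.fromhex(addrstr[2:]))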

test('0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed')
test('0xfB6916095ca1df60bB79Ce92cE3Ea74c37c5d359')
test('0xdbF03B407c01E7cD3CBea99509d93f8DDDC8C6FB')
test('0xD1220A0cf47c7B9Be7A2E6BA89F429762e7b9aDb')
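
One way to read "every fourth bit": bit 4*i of the hash is the top bit of its i-th hex digit, so the rule should be equivalent to "uppercase hex digit i of the address when hex digit i of the hash is 8 or higher". The sketch below only illustrates that equivalence. It assumes the caller passes in the 32-byte digest (in the code above it comes from utils.sha3(addr.hex())). The function names apply_case_bits and apply_case_nibbles, and the hand-made digests, are hypothetical and not part of any library.

```python
def apply_case_bits(hex_addr: str, digest: bytes) -> str:
    """Uppercase hex digit i when bit 4*i of the digest (MSB first) is set."""
    v = int.from_bytes(digest, "big")
    top = 8 * len(digest) - 1  # index of the most significant bit (255 for 32 bytes)
    out = ""
    for i, c in enumerate(hex_addr.lower()):
        # str.upper() leaves the digits 0-9 unchanged, so no separate branch is needed
        out += c.upper() if v & (1 << (top - 4 * i)) else c
    return "0x" + out


def apply_case_nibbles(hex_addr: str, digest: bytes) -> str:
    """Same rule phrased on hex digits: uppercase when digest digit i is >= 8."""
    h = digest.hex()
    return "0x" + "".join(
        c.upper() if int(h[i], 16) >= 8 else c
        for i, c in enumerate(hex_addr.lower())
    )


# Hand-made 32-byte digests (hypothetical, NOT real hashes of this address)
addr = "ab" * 20                   # 40 hex digits, no 0x prefix
d1 = bytes.fromhex("80" * 32)      # top bit set in every even-indexed hex digit
d2 = bytes.fromhex("7f" * 32)      # top bit set in every odd-indexed hex digit
print(apply_case_bits(addr, d1))
print(apply_case_bits(addr, d2))
```

With d2, the seven low bits set in each even-indexed position have no effect on the casing, since only the top bit of each hex digit is consulted.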

sathishvj commented Dec 26, 2016

Is there a valid, up-to-date Go implementation of this that you could recommend?

Contributor

almindor commented Jun 19, 2017

Could someone please finally specify the hash algorithm used to hash the address and derive the check bits?

There are at least 3 different hashes mentioned and even used in various implementations.

My understanding is that the correct hash is supposed to be SHA3-256, but it seems some implementations are using SHA3-224 and others use Keccak-256 and Keccak-224.

vaib999 commented Jun 27, 2017

I am curious: what would a Java implementation of this look like?

Member

cdetrio commented Jun 27, 2017

@almindor

You'll find the correct specification and example implementations in this file: https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md. The file also includes an adoption table to help track the adoption of EIP-55 checksums in the ecosystem.

We're going to close this issue now. If any corrections need to be made (or to update the adoption table), please open a PR on the file.

prusnak commented Jul 11, 2017

You should edit the example code and test vectors in the first post. They are wrong, and someone who does not read the whole conversation will use the incorrect implementation.

Member

cdetrio commented Aug 24, 2017

This EIP is now located at https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md. Please go there for the correct specification. The text in this issue may be incorrect or outdated, and is not maintained.

Member

axic commented Nov 16, 2017

@cdetrio can you push the "official test suite" into the EIP?

I believe it is this one: #55 (comment)

adyliu commented Aug 3, 2018

voron commented Aug 6, 2018

The current Python 3 eth-utils implementation:

python3 -c "from eth_utils import address; import sys; print(address.to_checksum_address(sys.argv[1]));" 0x5aaeb6053f3e94c9b9a09f33669435e7ef1beaed

Output is

0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed