Add NFKD normalization + Fix for Japanese #27

dabura667 · 2015-07-12T11:34:24Z

This library was not normalizing at all, which would have produced invalid seeds for Spanish and Japanese. (So anyone who used this library to generate wallets from Spanish or Japanese phrases must use the old non-normalizing version to recover their funds. I doubt anyone is in that situation... but a warning might be necessary? Or maybe add a new function that generates the non-normalized seed?)

Chinese and English were OK: their wordlists were already normalized (so unorm.nfkd(words) == words), so they are fine as is.
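
To illustrate the "pre-normalized" check above, here is a minimal sketch. It uses the built-in `String.prototype.normalize` instead of `unorm`, `isPreNormalized` is a hypothetical helper name, and the sample words are illustrative rather than copied from the real wordlists.

```javascript
// Returns true when every word is already in NFKD form,
// i.e. normalizing it again changes nothing.
function isPreNormalized(words) {
  return words.every(function (w) { return w.normalize('NFKD') === w; });
}

// Illustrative samples only (assumption: not taken from the actual wordlists).
var english = ['abandon', 'ability', 'able'];
// 'á' stored as one precomposed code point (U+00E1), i.e. NFC form.
var spanishNfc = ['\u00e1baco', 'abdomen'];
// The same words with 'á' decomposed into 'a' + U+0301, i.e. NFKD form.
var spanishNfkd = spanishNfc.map(function (w) { return w.normalize('NFKD'); });

console.log(isPreNormalized(english));     // true
console.log(isPreNormalized(spanishNfc));  // false
console.log(isPreNormalized(spanishNfkd)); // true
```

A list that fails this check yields different PBKDF2 input bytes depending on whether normalization is applied, which is why the Spanish list needed fixing.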

Also, I added one change for Japanese, as mentioned in BIP39:
Japanese phrases must be shown to the user with the words separated by an ideographic space. This is crucial to ensure users don't accidentally read 2 words as 1 word.

Ex.

"a part" // this is a normal ASCII space
"a　part" // this is an ideographic space

It doesn't seem that necessary when the letters are this small, but look at Japanese:

"あさ ごはん" // this is a normal ASCII space
"あさ　ごはん" // this is an ideographic space

It makes a huge difference, and the latter must be shown to the user.

Also note that an ideographic space is replaced by an ASCII space under NFKD normalization. So while the words themselves are NFKD in the wordlist, the Japanese "phrase" string needs non-NFKD ideographic spaces, and I have therefore placed a catch-all NFKD around this.phrase in the call to pbkdf2.
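
As a rough sketch of why the catch-all NFKD matters, the snippet below derives a BIP39 seed (PBKDF2-HMAC-SHA512, 2048 iterations, 64 bytes, salt "mnemonic" + passphrase) with Node's `crypto` module. `toSeed` is a hypothetical name and not this library's actual code, and the two-word phrase is shortened for illustration.

```javascript
var crypto = require('crypto');

// Hypothetical helper: BIP39 seed derivation with NFKD applied to both
// the phrase and the passphrase before they reach PBKDF2.
function toSeed(phrase, passphrase) {
  var password = Buffer.from(phrase.normalize('NFKD'), 'utf8');
  var salt = Buffer.from('mnemonic' + (passphrase || '').normalize('NFKD'), 'utf8');
  return crypto.pbkdf2Sync(password, salt, 2048, 64, 'sha512');
}

// Shortened, illustrative phrase (a real mnemonic has 12 or more words).
var displayPhrase = 'あさ\u3000ごはん'; // ideographic space, as shown to the user
var asciiPhrase = 'あさ ごはん';        // ASCII space

// NFKD maps U+3000 to U+0020, so both forms produce the same seed.
console.log(toSeed(displayPhrase, 'TREZOR').equals(toSeed(asciiPhrase, 'TREZOR'))); // true
```

Without the normalization step, the two phrases would hash different bytes and produce two different wallets.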

Users in Japan will also likely type the phrase with ideographic spaces, so I NFKD-normalize the mnemonic input of the outward-facing functions.
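
One way to picture the input/output split described here: normalize whatever the user types, and only reintroduce ideographic spaces when displaying. Both function names below are hypothetical and are not part of the library.

```javascript
// Hypothetical helper: canonical form of a user-entered mnemonic.
// NFKD turns ideographic spaces into ASCII spaces; runs of whitespace
// are then collapsed to a single ASCII space.
function normalizeInput(mnemonic) {
  return mnemonic.normalize('NFKD').trim().split(/\s+/).join(' ');
}

// Hypothetical helper: display form. Japanese phrases are joined with
// U+3000 (ideographic space), everything else with an ASCII space.
function toDisplay(words, isJapanese) {
  return words.join(isJapanese ? '\u3000' : ' ');
}

var typed = ' あさ\u3000\u3000ごはん '; // what a user might type
var canonical = normalizeInput(typed);
console.log(canonical.split(' ').length);            // 2
console.log(toDisplay(canonical.split(' '), true));  // words separated by U+3000
```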

dabura667 · 2015-07-12T11:54:39Z

My main motivation for this fix is that Copay might use BIP39 for backups in some way, so I was checking the ideographic space usage. Luckily I did, as Spanish would have been generating bad (non-BIP39-standard) wallets.

Also note that Japanese phrases are unique in that ALL Japanese characters are "breakable", so text wrapping will break a word in the middle at a line break. Care must be taken that no word shown to the user is split across a line break.
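
A simple way to avoid mid-word breaks is to build the lines yourself, treating each word as atomic, instead of relying on the renderer's wrapping. The greedy sketch below is only an illustration (`wrapWords` is a hypothetical name, and width is counted in UTF-16 code units, which is an assumption that suits kana but not every script).

```javascript
// Greedy wrap: words are never split; each line holds at most maxChars
// characters unless a single word is longer than that on its own.
function wrapWords(words, maxChars, sep) {
  var lines = [];
  var current = '';
  words.forEach(function (word) {
    var candidate = current ? current + sep + word : word;
    if (current && candidate.length > maxChars) {
      lines.push(current); // line is full, start a new one with this word
      current = word;
    } else {
      current = candidate;
    }
  });
  if (current) lines.push(current);
  return lines;
}

// Illustrative words, joined with an ideographic space (U+3000).
var words = ['あさ', 'ごはん', 'あいこくしん'];
var lines = wrapWords(words, 7, '\u3000');
console.log(lines.join('\n'));
```

Each resulting line still has to be rendered without further wrapping, otherwise the display layer can break it again.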

Reference: (breadwallet is still trying to get it right... it is difficult)
voisine/breadwallet-ios#231
voisine/breadwallet-ios@dd1bfae

dabura667 · 2015-07-12T12:02:59Z

Wait a sec, I will add Japanese test vectors.

dabura667 · 2015-07-12T12:24:51Z

I tried to alter the test vectors to also use a Japanese passphrase with pbkdf2 instead of just "TREZOR", so that they would test passphrase normalization as well.

I realized how long that would take, and I don't have enough time, so I will just submit this PR as is.

Here are the Japanese test vectors I have prepared:
non-normalized strings: https://raw.githubusercontent.com/bip32JP/bip32JP.github.io/6f6090b49bb718711904468bce99a73770e09071/test_JP_BIP39.json
normalized (except for spaces) strings: https://raw.githubusercontent.com/bip32JP/bip32JP.github.io/377f72c5087533c34c79ba02335d1fbc5509dfa5/test_JP_BIP39.json

matiu · 2015-07-13T13:32:21Z

LGTM, great work.

pnagurny · 2015-07-13T15:47:08Z

LGTM

Add NFKD normalization + Fix for Japanese

dabura667 added 5 commits July 12, 2015 19:40

Fix wordlist to pre-normalized list

d8bc908

Add NFKD normalization + ideographic space for JP

d7e2ea1

Add unorm to dependencies

79a3838

NFKD normalize spanish

6599c4f

Fixed test to use normalized spanish

07df984

dabura667 mentioned this pull request Jul 12, 2015

Better backups bitpay/wallet#2955

Closed

dabura667 closed this Jul 12, 2015

dabura667 reopened this Jul 12, 2015

matiu added a commit that referenced this pull request Jul 13, 2015

Merge pull request #27 from dabura667/master

b7ad160

Add NFKD normalization + Fix for Japanese

matiu merged commit b7ad160 into bitpay:master Jul 13, 2015
