Add proposed Ukrainian wordlist (bitcoin/bips#442)
Notice: This is hidden behind the -W flag; see 8aaa6f3.

This is not exactly the wordlist proposed in the pull request. The file
ukrainian.txt from Bohdat/bips@152fc59 has a bug, in addition to the
usual normalization and sorting concerns: A trailing space (0x20) and
tab (0x09, '\t') after the word at original index 1393, 1-based line
number 1394, and before the newline '\n'.

The problem was first identified by failure of easyseed's extensive
internal self-tests, followed by examination of the problem with cmp(1)
and hex dumps to diagnose the difference between the wordlist in my
source tree, and the wordlist printed on stdout by
`easyseed -W -P -l uk`.

The following is edited for line length limits in the git log, but it
adequately shows the problem:

    $ grep -E '[[:space:]]$' ukrainian.txt | hd
    00000000 d0 bf d1 96 d1 81 d0 bd d1 8f 20 09 0a
    $ grep -En '[[:space:]]$' ukrainian.txt
    1394:пісня <*end of line is here*>

It is fixed with the following command:

    $ sed -E -e 's/[[:space:]]+$//' < ukrainian.txt > ukfix1/uk_fixed0.txt

After verification that this command made no other changes, it is
normalized and sorted:

    $ ls -l ukrainian.txt ukfix1/uk_fixed0.txt
    -rw-r--r-- 1 user user 24550 Jan 7 21:26 ukfix1/uk_fixed0.txt
    -rw-r--r-- 1 user user 24552 Jan 7 20:31 ukrainian.txt
    $ diff -u3 ukrainian.txt ukfix1/uk_fixed0.txt
    [...showing only the desired line changed...]
    $ uconv -f utf-8 -t utf-8 -x '::nfkd;' < uk_fixed0.txt | \
        LC_ALL=C LANG=C sort -s > uk_fixed1.txt
    $ mv -i uk_fixed1.txt ../../easyseed/wordlist/ukrainian.txt
    mv: overwrite '../../easyseed/wordlist/ukrainian.txt'? y

(Note with ref to 234c66c: When normalizing and sorting the russian.txt
list, I forgot to force the locale for `sort(1)`. I verified that this
makes no difference, and the 234c66c russian.txt is correct. It *does*
make a very large difference for the Ukrainian wordlist.)

SHA-256 hash for the resulting ukrainian.txt:

    612ee29e1fa13dc38c9e1b31c7ef980db8f3c8dd30f1c9377170d1b10e895dc9
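The detect, strip, and sort steps described above could be wrapped in a few small shell functions, sketched here for illustration. The function names and the file names in the usage comment are hypothetical and are not part of easyseed. The NFKD normalization step is left out on purpose, since it depends on uconv from ICU being installed; only the grep, sed, and sort invocations mirror the commands in the message.

```shell
#!/bin/sh
# Illustrative sketch only; function names are hypothetical, not easyseed's.

# Print "lineno:content" for every line of file $1 that ends in whitespace.
# Exits non-zero when no such line exists (grep's usual convention).
find_trailing_ws() {
    grep -En '[[:space:]]$' "$1"
}

# Copy file $1 to file $2 with trailing whitespace removed from each line.
strip_trailing_ws() {
    sed -E -e 's/[[:space:]]+$//' < "$1" > "$2"
}

# Sort file $1 into file $2 by raw byte values. Forcing LC_ALL=C keeps the
# result independent of the caller's locale; -s requests a stable sort.
sort_bytewise() {
    LC_ALL=C sort -s < "$1" > "$2"
}

# Example use (hypothetical file names):
#   find_trailing_ws ukrainian.txt
#   strip_trailing_ws ukrainian.txt uk_stripped.txt
#   sort_bytewise uk_stripped.txt uk_sorted.txt
```

Forcing the C locale inside the sort helper, rather than relying on the caller's environment, is what guards against the omission mentioned in the note about 234c66c.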