Update README.md

timwarnock · Jun 7, 2019 · 473e828
1 parent 2e3e473
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/README.md b/README.md
@@ -81,7 +81,7 @@ Scan through a 10,000 x 10,000 character grid. As expected, datrie outperformed
     0inputs+0outputs (0major+76902minor)pagefaults 0swaps
 
 ## Trie vs set() for Japanese (N=1,000,000)
-Interestingly, for Japanese characters (scanning through a random grid of mostly Kanji), the performance difference was more pronounced. This is useful to know because Japanese (like Chinese) does not use obvious word boundaries and would benefit from using set() rather than Trie for Japanese language parsers.
+Interestingly, for Japanese characters (scanning through a random grid of mostly Kanji), the performance difference was more pronounced. This is useful to know because Japanese (like Chinese) does not use obvious word boundaries and would benefit from using set() rather than Trie for Japanese language parsers. For 日本.txt, I extracted all kanji and kana from [EDICT](http://edrdg.org/jmdict/edict.html).
 
     $ /usr/bin/time ./test_j_set.py
     4140
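
Note on the added line: the commit says the Japanese word list was built by extracting all kanji and kana from EDICT, but the extraction script itself is not part of this diff. The following is only a minimal sketch of one way that step could look, not the author's method. The name `extract_japanese`, the choice to collect maximal runs of characters rather than single characters, and the Unicode ranges used (hiragana and katakana at U+3040 to U+30FF, CJK Unified Ideographs at U+4E00 to U+9FFF) are all assumptions made for illustration.

```python
import re

# Assumed character classes: hiragana + katakana (U+3040-U+30FF)
# and CJK Unified Ideographs (U+4E00-U+9FFF). Rarer extension blocks are ignored.
JAPANESE_RUN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]+")


def extract_japanese(lines):
    """Return the set of maximal kanji/kana runs found in an iterable of strings.

    Hypothetical helper: the real extraction used for the word list may differ.
    """
    found = set()
    for line in lines:
        # findall returns every non-overlapping run matching the character class
        found.update(JAPANESE_RUN.findall(line))
    return found
```

Given already-decoded dictionary lines, `extract_japanese(lines)` returns a plain `set` of strings that could be written out one per line. Decoding the dictionary file is left to the caller, since the encoding depends on which EDICT distribution is used.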