Summary: Some Japanese phrases have millions of possible
pronunciations, most likely because a critical phrase is missing from
the mecab dictionary system.  Presently this is handled by splitting
the phrase on punctuation, getting the translations for each part, and
truncating the list if it exceeds 10,000 entries.  The result will
probably be wrong, but the index process will always finish rather
than consuming all RAM and swap until it fails.
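
A minimal Python sketch of the truncation step, assuming the candidate
lists for each part have already been obtained.  The function name and
the way candidates are joined are illustrative assumptions, not taken
from the actual code; only the 10,000 cap comes from the text above.

```python
from itertools import islice, product

MAX_ENTRIES = 10_000  # cap mentioned in the summary


def capped_combinations(per_part, limit=MAX_ENTRIES):
    """Join one candidate from each part, stopping after `limit` results.

    `per_part` is a list of candidate lists, one list per part of the
    phrase (hypothetical input shape).
    """
    # product() yields combinations lazily, so islice() stops the
    # enumeration early instead of building millions of entries.
    return ["".join(combo) for combo in islice(product(*per_part), limit)]
```

With 20 parts of two candidates each there are 2**20 combinations, but
the call returns after producing the first 10,000.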

Fixing a bug introduced by the above requires retaining supported
punctuation, as it is needed to search for certain Japanese words.
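
One way to split on punctuation without losing it, sketched in Python.
The punctuation set and the function name here are assumptions for
illustration; the real list of supported punctuation is not given in
this note.

```python
import re

# Hypothetical set of supported punctuation marks.
SUPPORTED_PUNCTUATION = re.compile(r"([、。・])")


def split_keep_punctuation(phrase):
    """Split `phrase` on punctuation, keeping each mark as its own part."""
    # The capturing group makes re.split() return the delimiters too;
    # empty strings between adjacent marks are dropped.
    return [part for part in SUPPORTED_PUNCTUATION.split(phrase) if part]
```

Joining the returned parts reproduces the original phrase, so a word
that contains punctuation can still be matched after processing.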