An implementation of a Japanese version of "Unsupervised joke generation from big data" (ACL 2013).
You need the Japanese WordNet SQLite3 database. Download it from http://nlpwww.nict.go.jp/wn-ja/ and put wnjpn.db in the same directory as the scripts.
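As a quick sanity check that wnjpn.db is in place and readable, you can list its tables with Python's standard sqlite3 module. This is only a sketch for verification; the exact table names printed depend on the Japanese WordNet release you downloaded.

```python
import sqlite3

def list_tables(db_path):
    """Return the names of all tables in an SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()

if __name__ == "__main__":
    # Expects wnjpn.db in the current directory, as the scripts do.
    print(list_tables("wnjpn.db"))
```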
To train the model, run this command.
$ python joke.py --corpus [your n-gram file]
Each line of the n-gram file should have the following format: [token-1][token-2]...[token-n][count]
The Google N-gram corpus follows this format, so you can use it as the corpus.
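For illustration, a line in this format can be parsed by splitting on whitespace and treating the last field as the count. This is a hedged sketch of the expected input, not code from joke.py; the helper name is hypothetical.

```python
def parse_ngram_line(line):
    """Split an n-gram line into (tokens, count).

    Assumes whitespace-separated fields with the count as the
    last field, as in the Google N-gram corpus.
    """
    fields = line.split()
    return fields[:-1], int(fields[-1])

tokens, count = parse_ngram_line("ceremony was held 1245")
print(tokens, count)  # ['ceremony', 'was', 'held'] 1245
```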
To generate a nazokake (a traditional Japanese riddle-style pun), run this command.
$ python joke.py --model [model generated by training mode]
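To convey the idea behind the model, here is a toy sketch of the paper's core heuristic: an attribute z that co-occurs strongly with both topics x and y, according to counts from the n-gram corpus, is a good pivot for the pun. This is an illustrative simplification with made-up data, not this repository's actual implementation, which also weighs factors such as word ambiguity.

```python
from collections import defaultdict

def best_pivot(cooc, x, y):
    """Return the attribute z maximizing cooc[x][z] * cooc[y][z].

    cooc maps a noun to a dict of {attribute: co-occurrence count}.
    A word that relates strongly to both halves of the riddle
    makes a plausible punchline.
    """
    candidates = set(cooc[x]) & set(cooc[y])
    return max(candidates, key=lambda z: cooc[x][z] * cooc[y][z])

# Illustrative counts, not real corpus data.
cooc = defaultdict(dict)
cooc["coffee"] = {"hot": 50, "black": 30, "strong": 40}
cooc["argument"] = {"heated": 20, "strong": 35, "long": 10}
print(best_pivot(cooc, "coffee", "argument"))  # strong
```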
Saša Petrović and David Matthews, "Unsupervised joke generation from big data," Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Sofia, Bulgaria, August 4-9, 2013.
yanbe.diff, "Frontend program to search the Japanese WordNet database." http://subtech.g.hatena.ne.jp/y_yanbe/20090314/p2