Implementation of "Unsupervised joke generation from big data" (ACL 2013)
This is an implementation of a Japanese version of "Unsupervised joke generation from big data" (ACL 2013).


You need the Japanese WordNet sqlite3 database. Download it and put wnjpn.db in the same directory as the scripts.
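
As a quick sanity check before training, something like the following sketch could confirm that wnjpn.db is in place and is a readable sqlite3 file. The helper name `list_tables` is hypothetical and not part of this repository. It assumes nothing about the WordNet schema and only lists whatever tables the file contains.

```python
import sqlite3
from pathlib import Path
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the table names in a sqlite3 file (hypothetical helper)."""
    path = Path(db_path)
    # Check first: sqlite3.connect() would silently create an empty
    # database if the file were missing.
    if not path.is_file():
        raise FileNotFoundError(f"database not found: {path}")
    conn = sqlite3.connect(str(path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `list_tables("wnjpn.db")` from the script directory should return a non-empty list if the download succeeded. A `FileNotFoundError` means the file is not where the scripts expect it.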

To train the model, run this command.

$ python --corpus [your n-gram file]

Each line of the n-gram file should have the following format: [token-1][token-2]...[token-n][count]

The Google N-gram corpus follows this format, so you can use it as the corpus.
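
To make the line format concrete, here is a minimal sketch of how one corpus line could be split into tokens and a count. It is not the parser used by this repository. The function name `parse_ngram_line` is hypothetical, and the sketch assumes that fields are separated by whitespace (spaces or tabs) with the count as the last field, which the format description above does not spell out.

```python
from typing import List, Tuple


def parse_ngram_line(line: str) -> Tuple[List[str], int]:
    """Split one corpus line into (tokens, count).

    Assumption: fields are whitespace-separated and the last field
    is the integer count.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least one token and a count: {line!r}")
    *tokens, count = fields
    # int() raises ValueError if the last field is not a number.
    return tokens, int(count)
```

Under these assumptions, a line with three tokens followed by a count yields a three-element token list and that count as an integer. If your corpus uses a different delimiter, adjust the `split()` call to match.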


To generate the nazokake, run this command.

$ python --model [model generated by training mode]

Example Output


Sasa Petrovic and David Matthews, "Unsupervised joke generation from big data," The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013), Sofia, Bulgaria, August 4-9, 2013

yanbe.diff, "Frontend program to search from Japanese WordNet database." (Japanese: 日本語WordNetのデータベースを探索するフロントエンドプログラム)