Script used to create the dictionary: yomichan_fre_dict_from_tsv.py
Source data: Corpus of Everyday Japanese Conversation https://www2.ninjal.ac.jp/conversation/cejc/cejc-wc.html (second zip archive, file 3_cejc_frequencylist_suw_token.xlsx)
https://github.com/FooSoft/yomichan
https://www.ninjal.ac.jp/english/research/cr-project/project-3/institute/spoken-language/
The Corpus of Everyday Japanese Conversation (CEJC) frequency list is a vocabulary and word count table based on 200 hours of recorded conversation (collected approximately from April 2016 to 2020).
"Our project will develop a large-scale corpus of Japanese everyday conversation in a balanced manner. Since informants record their conversations in everyday situations by themselves, naturally occurring conversations can be collected. To build an empirical foundation for the corpus design, we conducted a survey of ordinary conversational behavior of about 250 adults."
The file contains several rank columns; the overall rank was used to generate this frequency dictionary.
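
The conversion can be sketched roughly as follows. This is a minimal illustration, not the actual yomichan_fre_dict_from_tsv.py script. It assumes the spreadsheet has been exported to TSV, and the column names lemma and overall_rank, the dictionary title, and the revision string are hypothetical placeholders. The output layout (index.json plus term_meta_bank_1.json with [term, "freq", value] entries) follows the Yomichan format 3 dictionary schema as commonly used for frequency dictionaries; check it against the Yomichan repository before relying on it.

```python
import csv
import io
import json
import zipfile


def rows_to_term_meta(tsv_text, term_col="lemma", rank_col="overall_rank"):
    """Build Yomichan term-meta entries of the form [term, "freq", rank].

    Column names are assumptions; adjust them to the real file's headers.
    """
    best = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        term = row[term_col].strip()
        if not term:
            continue
        rank = int(row[rank_col])
        # If a term appears more than once, keep its best (lowest) rank.
        if term not in best or rank < best[term]:
            best[term] = rank
    # Sort by rank so the output order is stable.
    return [[term, "freq", rank] for term, rank in sorted(best.items(), key=lambda kv: kv[1])]


def write_dictionary(entries, target, title="CEJC (example)"):
    """Write a zip archive containing index.json and term_meta_bank_1.json."""
    index = {"title": title, "format": 3, "revision": "example-1"}
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("index.json", json.dumps(index, ensure_ascii=False))
        zf.writestr("term_meta_bank_1.json", json.dumps(entries, ensure_ascii=False))


# Made-up rows for illustration only; they are not taken from the CEJC list.
SAMPLE_TSV = (
    "lemma\toverall_rank\tother_rank\n"
    "うん\t1\t2\n"
    "そう\t2\t1\n"
    "うん\t3\t5\n"
)

entries = rows_to_term_meta(SAMPLE_TSV)
print(entries)
```

To produce an importable file, call write_dictionary(entries, "cejc_freq.zip") (the file name is arbitrary) and import the resulting archive through the Yomichan dictionary settings. Only the overall rank column is read here; the other rank columns are ignored.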