
Frequency dictionaries for yomitan based on the Corpus of Spontaneous Japanese and NINJAL Web Japanese Corpus datasets


Maltesaa/CSJ_and_NWJC_yomitan_freq_dict

 
 


Corpus of Spontaneous Japanese and NINJAL Web Japanese Corpus Yomitan Frequency Dictionaries

A fork of forsakeninfinity’s script, extended to convert the CSJ and NWJC frequency lists into Yomitan frequency dictionaries. See his repo for more information.
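For readers unfamiliar with what such a conversion produces, here is a minimal sketch of packaging a ranked word list as a Yomitan-style frequency dictionary. It is not the script from this fork or from forsakeninfinity’s repo. The function name `build_freq_dictionary`, the `(term, reading, rank)` input shape, and the sample entries are hypothetical. The archive layout (`index.json` plus `term_meta_bank_1.json` with `"freq"` entries) reflects my understanding of Yomitan’s dictionary format and should be checked against its schemas.

```python
import io
import json
import zipfile


def build_freq_dictionary(title, ranked_terms, revision="1"):
    """Return the bytes of a Yomitan-style frequency dictionary zip.

    ranked_terms: iterable of (term, reading, rank) tuples (hypothetical
    input shape; the real corpora ship in their own tabular formats).
    """
    # Dictionary metadata; "rank-based" means lower numbers are more common.
    index = {
        "title": title,
        "format": 3,
        "revision": revision,
        "frequencyMode": "rank-based",
    }
    # One meta entry per term: [term, "freq", {reading, frequency}].
    bank = [
        [term, "freq", {"reading": reading, "frequency": rank}]
        for term, reading, rank in ranked_terms
    ]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("index.json", json.dumps(index, ensure_ascii=False))
        archive.writestr(
            "term_meta_bank_1.json", json.dumps(bank, ensure_ascii=False)
        )
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample rows, not taken from CSJ or NWJC.
    sample = [("日本", "にほん", 1), ("言葉", "ことば", 2)]
    with open("sample_freq.zip", "wb") as handle:
        handle.write(build_freq_dictionary("Sample Freq", sample))
```

Building the archive in memory keeps the function free of side effects, so the caller decides where the zip is written.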

The Corpus of Spontaneous Japanese - CSJ

Frequency values go up to 31,605.

Download here

“The Corpus of Spontaneous Japanese” (or CSJ) is a database containing a large collection of Japanese spoken language data and information for use in linguistic research; jointly developed by NINJAL, NICT and the Tokyo Institute of Technology, the CSJ is world-class in both the quantity and quality of the available data.

The CSJ covers several domains; dictionaries for the individual domains can be downloaded from the CSJ Releases folder.

More information can be found here

NINJAL Web Japanese Corpus - NWJC

Frequency values go up to 106,762.

Download here

More information can be found here (in Japanese)
