Evaluation Dataset for Japanese Lexical Simplification
The sentences were selected from BCCWJ, so they cannot be published here.
Instead, the program that extracts the sentences is published.
The program is written in Python 2.7.
git clone https://github.com/KodairaTomonori/EvaluationDataset
python get_sent_from_BCCWJ.py xxxx/BCCWJ/SUW/
The substitution rankings are in the substitutes folder.
subs.csv: target word list
ave_rank.csv and mle_rank.csv: the substitutes in these files are sorted by average score and by MLE score, respectively.
A comma separates different ranks, and a space separates substitutes that share the same rank.
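The ranking notation can be read with a few lines of Python. The following is only a minimal sketch: the function name parse_ranking and the sample words w1 to w5 are hypothetical and not part of this repository, and the sketch assumes a line contains nothing but the ranked substitutes. The actual column layout of ave_rank.csv and mle_rank.csv may differ, so adjust it after checking the files.

```python
def parse_ranking(line):
    """Split one ranking string into rank groups (hypothetical helper).

    A comma separates different ranks; a space separates substitutes
    that share the same rank.
    """
    groups = []
    for chunk in line.strip().split(","):
        words = chunk.split()
        if words:  # skip empty chunks, e.g. from a trailing comma
            groups.append(words)
    return groups


# Made-up sample: w1 and w2 tie for first, w3 is second, w4 and w5 tie for third.
example = "w1 w2,w3,w4 w5"
print(parse_ranking(example))
```

Each inner list holds the substitutes that share one rank, and the outer list is ordered from the best rank downward. The sketch runs on Python 2.7 as well as Python 3.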
Tokyo Metropolitan University
System Design - Komachi Lab
Name: Kodaira Tomonori