Parse words using a Lark context-free-grammar file
parse_files.sh
parse_with_lark.py
This readme file
README.md
sed script to convert from Stolfi's format to Lark
stolfi_to_lark.sh
Generate a parse tree .png image
tree_img.py
tree_label.sh
trees.sh
The revised syllable-based grammar used in the experiments
grammar/EMS_edited.lark
A context-free grammar closer to Emma May Smith's original
grammar/EMS_orig.lark
Lark version of Stolfi's grammar
grammar/NormalWord.grx.lark
Stolfi's grammar used in the experiments: some rare rules were removed.
grammar/Stolfi_reduced.lark
Most frequent 1000 words in the four files listed below
in/1000_neg_king_james_36500.freq
in/1000_neg_TT_shifted_good_chars_no_vms_36500.freq
in/1000_prob_obs_txt.n_word.frq_NO_bh.csv
in/1000_TT_freq.csv
Non-VMS words from King James bible
in/neg_king_james_36500.freq
Non-VMS words generated by randomly moving characters in Takahashi's transcription
in/neg_TT_shifted_good_chars_no_vms_36500.freq
List of VMS words from Stolfi's site, with bh converted back to ee
in/prob_obs_txt.n_word.frq_NO_bh.csv
List of VMS words from Takahashi's transcription
in/TT_freq.csv
stolfi/urls.txt
stolfi/word.frq
stolfi/word.grx
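
The "most frequent 1000 words" files listed above could be derived from the full frequency lists roughly as sketched below. This is a minimal illustration, not the script used in this repository: it assumes each input line has the form `count word` separated by whitespace, which may not match the actual layout of the `.freq` and `.csv` files, and the function names `parse_freq_lines` and `top_words` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def parse_freq_lines(lines: Iterable[str]) -> Counter:
    """Read 'count word' lines into a Counter.

    ASSUMPTION: whitespace-separated 'count word' per line. Lines that do
    not have exactly two fields, or whose first field is not an integer,
    are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        count, word = parts
        if not count.isdigit():
            continue
        counts[word] += int(count)
    return counts


def top_words(counts: Counter, n: int = 1000) -> List[Tuple[str, int]]:
    """Return the n most frequent words.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


if __name__ == "__main__":
    # Sample lines with made-up counts, for illustration only.
    sample = ["12 daiin", "7 ol", "7 chedy", "3 qokeey"]
    for word, count in top_words(parse_freq_lines(sample), n=3):
        print(count, word)
```

To apply it to a real file, pass an open file handle to `parse_freq_lines` and adjust the field order or separator to the file's actual format.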