marcoponzi/vms_word_models

vms_word_models

Parse words using a Lark context-free-grammar file

parse_files.sh parse_with_lark.py

This readme file

README.md

A sed script to convert a grammar from Stolfi's format to Lark's

stolfi_to_lark.sh

Generate a parse tree .png image

tree_img.py tree_label.sh trees.sh

Lark grammar files

The revised syllable-based grammar used in the experiments

grammar/EMS_edited.lark

A context-free grammar closer to Emma May Smith's original

grammar/EMS_orig.lark

Lark version of Stolfi's grammar

grammar/NormalWord.grx.lark

Stolfi's grammar as used in the experiments, with some rare rules removed

grammar/Stolfi_reduced.lark

Input files

The 1000 most frequent words from each of the four files listed below

in/1000_neg_king_james_36500.freq in/1000_neg_TT_shifted_good_chars_no_vms_36500.freq in/1000_prob_obs_txt.n_word.frq_NO_bh.csv in/1000_TT_freq.csv
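Selecting the most frequent words can be sketched as follows. The layout of the `.freq` and `.csv` files is not described here, so this sketch works on in-memory `(word, count)` pairs; the function name `top_n` and the tie-breaking rule are assumptions made for illustration.

```python
def top_n(freq_pairs, n=1000):
    """Return the n pairs with the highest counts.

    freq_pairs: iterable of (word, count) tuples.
    Ties are broken alphabetically so the result is deterministic
    (an assumption; the original files may order ties differently).
    """
    return sorted(freq_pairs, key=lambda pair: (-pair[1], pair[0]))[:n]


if __name__ == "__main__":
    sample = [("ol", 5), ("daiin", 9), ("chedy", 5), ("qokeedy", 2)]
    print(top_n(sample, n=3))
```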

Non-VMS words from the King James Bible

in/neg_king_james_36500.freq

Non-VMS words generated by randomly moving characters in Takahashi's transcription

in/neg_TT_shifted_good_chars_no_vms_36500.freq
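One way to produce such negative examples is sketched below: pick a known word, move one of its characters to another position, and keep the result only if it is not itself a known word. This is a guess at the general idea, not the procedure actually used to build the file; the function names, the single-character move, and the fixed seed are all assumptions.

```python
import random


def move_one_char(word, rng):
    """Remove one character and reinsert it at a different index."""
    if len(word) < 2:
        return word
    chars = list(word)
    src = rng.randrange(len(chars))
    ch = chars.pop(src)
    # Any insertion index except the one the character came from.
    dst = rng.choice([i for i in range(len(chars) + 1) if i != src])
    chars.insert(dst, ch)
    return "".join(chars)


def negative_words(vocabulary, n, seed=0, max_tries=10000):
    """Return up to n shuffled words that do not occur in vocabulary."""
    rng = random.Random(seed)
    vocab = sorted(vocabulary)  # sorted so a given seed is reproducible
    if not vocab:
        return []
    known = set(vocab)
    found = set()
    for _ in range(max_tries):
        if len(found) >= n:
            break
        candidate = move_one_char(rng.choice(vocab), rng)
        # Reject real words; this also rejects moves that changed nothing.
        if candidate not in known:
            found.add(candidate)
    return sorted(found)


if __name__ == "__main__":
    print(negative_words({"daiin", "chedy", "qokeedy", "ol"}, n=5))
```

Because each candidate is a rearrangement of a real word, the negative set keeps the character inventory and word lengths of the source vocabulary.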

List of VMS words from Stolfi's site, with bh converted back to ee

in/prob_obs_txt.n_word.frq_NO_bh.csv

List of VMS words from Takahashi's transcription

in/TT_freq.csv

Original files by Stolfi

stolfi/urls.txt stolfi/word.frq stolfi/word.grx
