predict.py : runs every step of the frequency model, the word-level model, and the char-level model. (Input : all XML files under the XML directory.)
freq_get_input.py : combines all XML files into a single text, extracts that text with all special characters removed, and extracts the list of words containing dots.
freq_train.py : produces the list of predictions for the dot words.
freq_get_output.py : writes out the corrected XML files.
word_get_input.py : combines all XML files into a single text. (Input : the XML output of the frequency model.)
word_preprocess.py : generates the training and test data as a NumPy array of embeddings.
word_train.py : trains the word-level language model and produces the list of predictions for the dot words.
word_get_output.py : splits the combined text back into the individual XML files. (word_lstm_model.py : defines the word-level language model; word_corpus_data.py : tokenizes all the file contents.)
char_get_input.py : combines all XML files into a single text. (Input : the XML output of the word-level language model.)
char_preprocess.py : generates the training and test data as a NumPy array of embeddings.
char_train.py : trains the char-level language model and produces the list of predictions for the dot words.
char_get_output.py : splits the combined text back into the individual XML files. (char_lstm_model.py : defines the char-level language model.)
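
The three stages above run one after another, each consuming the XML output of the previous one. The sketch below shows one way such a chain could be driven. It is not the actual contents of predict.py: the script names come from the list above, but the order of the steps inside each stage and the use of subprocess are assumptions made for illustration.

```python
import subprocess
import sys

# Script names are taken from the file list above. The order of the steps
# inside each stage (get_input -> preprocess -> train -> get_output) is an
# assumption, as is the idea that each script runs as a separate process.
STAGES = {
    "freq": ["get_input", "train", "get_output"],
    "word": ["get_input", "preprocess", "train", "get_output"],
    "char": ["get_input", "preprocess", "train", "get_output"],
}


def pipeline_scripts():
    """Return the script file names in the assumed execution order."""
    return [
        f"{stage}_{step}.py"
        for stage, steps in STAGES.items()
        for step in steps
    ]


def run_pipeline(dry_run=True):
    """Run each script in turn; stop at the first one that fails."""
    for script in pipeline_scripts():
        if dry_run:
            print("would run:", script)
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run([sys.executable, script], check=True)
```

Calling run_pipeline() with the default dry_run=True only prints the order, which is a cheap way to check the sequence before starting the two training steps.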
Directories
XML : (INITIAL INPUT) all XML files
freq_data : text files and data file for the frequency model
freq_output : fixed XML files from the frequency model
word_data : text files and data file for the word-level model
word_ptb_models : word-level language model
word_output : fixed XML files from the word-level model
char_data : text files and data file for the char-level model
char_ptb_models : char-level language model
FIXED_XML : (FINAL OUTPUT) fixed XML files from the char-level model
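
Read together, the directories form a chain: XML feeds the frequency model, freq_output feeds the word-level model, and word_output feeds the char-level model, which writes FIXED_XML. The sketch below is a hypothetical helper, not part of the repository, that checks the initial input exists and creates any missing working and output directories. The mapping of directories to stages is inferred from the descriptions above rather than from the scripts themselves.

```python
from pathlib import Path

# Directory names come from the list above. Which stage reads and writes
# which directory is inferred from the descriptions, not from the scripts.
STAGE_DIRS = [
    # (stage, input dir, working dirs, output dir)
    ("freq", "XML", ["freq_data"], "freq_output"),
    ("word", "freq_output", ["word_data", "word_ptb_models"], "word_output"),
    ("char", "word_output", ["char_data", "char_ptb_models"], "FIXED_XML"),
]


def prepare_dirs(root):
    """Create missing working/output directories under root.

    Returns the names of the directories that were created. Raises
    FileNotFoundError if the initial input directory XML is absent.
    """
    root = Path(root)
    if not (root / "XML").is_dir():
        raise FileNotFoundError(f"initial input directory not found: {root / 'XML'}")

    created = []
    for _stage, _input_dir, working_dirs, output_dir in STAGE_DIRS:
        for name in [*working_dirs, output_dir]:
            path = root / name
            if not path.exists():
                path.mkdir(parents=True)
                created.append(name)
    return created
```

Running prepare_dirs(".") from the repository root before the pipeline would leave the XML directory untouched and only add what is missing.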