cmp files not generated #1

Closed
candlewill opened this issue Aug 24, 2017 · 8 comments

Comments

@candlewill
Contributor

I tried to run the rss_toy_demo example step by step, following README.md. However, I get errors when executing python ./scripts/train.py -s rss_toy_demo -l rm naive_01_nn. Digging into the code, I found that there are no *.cmp files under the /train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/ folder.

Could anyone tell me why this happens, and how to generate these files?

sys.exit('set_up_data.py: No matching data files found in %s and %s'%( \
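A quick way to confirm that condition outside the training script is to compare the two folders directly. The following is only a minimal sketch: `matching_basenames` is a hypothetical helper, not part of Ossian, and it assumes that label and cmp files are paired by sharing a file name once the extension is removed.

```python
import os


def matching_basenames(dir_a, dir_b):
    """Return the sorted basenames (extension stripped) found in both folders.

    Assumption: files are paired purely by basename; this is a guess at the
    pairing rule, not the logic used by set_up_data.py.
    """
    def stems(path):
        # A missing folder is treated the same as an empty one.
        if not os.path.isdir(path):
            return set()
        return {
            os.path.splitext(name)[0]
            for name in os.listdir(path)
            if os.path.isfile(os.path.join(path, name))
        }

    return sorted(stems(dir_a) & stems(dir_b))
```

Calling it with the align_lab and cmp paths from the error message and getting an empty list would be consistent with the failure reported here; which folder is empty then points at the processor that did not write its output.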

Here is the detailed output:

root@de-3879-ng-1-034425-3089955241-qx7zj:~/workspace/Projects/Ossian# python ./scripts/train.py -s rss_toy_demo -l rm naive_01_nn
 -- Gather corpus
 -- Train voice
/root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn
/root/workspace/Projects/Ossian/voices//rm/rss_toy_demo/naive_01_nn
try loading config from python...
/root/workspace/Projects/Ossian/recipes/naive_01_nn.cfg
{'state_contexts': [('start_time', './attribute::start'), ('end_time', './attribute::end'), ('htk_state', 'count(./preceding-sibling::state) + 1'), ('htk_monophone', './ancestor::segment/attribute::pronunciation'), ('ll_segment', './ancestor::segment/preceding::segment[2]/attribute::pronunciation'), ('l_segment', './ancestor::segment/preceding::segment[1]/attribute::pronunciation'), ('c_segment', './ancestor::segment/attribute::pronunciation'), ('r_segment', './ancestor::segment/following::segment[1]/attribute::pronunciation'), ('rr_segment', './ancestor::segment/following::segment[2]/attribute::pronunciation'), ('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', "count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), ('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', 
\'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', 
"./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'dur_label_maker': <FeatureDumper.FeatureDumper object at 0x7fd779ceb6d0>, 'SKLDecisionTreePausePredictor': <class 'SKLProcessors.SKLDecisionTreePausePredictor'>, 'train_stages': [[<Tokenisers.RegexTokeniser object at 0x7fd77a645b90>, <Phonetisers.NaivePhonetiser object at 0x7fd779cd5d50>, <VSMTagger.VSMTagger object at 0x7fd779ceb710>], [<FeatureDumper.FeatureDumper object at 0x7fd779ceb790>, <FeatureExtractor.WorldExtractor object at 0x7fd779ceb510>, <Aligner.StateAligner object at 0x7fd779ceb590>, <SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fd779ceb610>, <PhraseMaker.PhraseMaker object at 0x7fd779ceb7d0>, <FeatureDumper.FeatureDumper object at 0x7fd779ceb690>], [<FeatureDumper.FeatureDumper object at 0x7fd779ceb6d0>, <NN.NNDurationPredictor object at 0x7fd77ea2bb50>, 
<FeatureDumper.FeatureDumper object at 0x7fd779cd5d90>, <NN.NNAcousticPredictor object at 0x7fd779cd5ed0>]], 'PUNC_PATT': '[\\p{C}||\\p{P}||\\p{S}]', 'JUNCTURE_NODES': "//token[@token_class='space'] | //token[@token_class='punctuation']", 'WorldExtractor': <class 'FeatureExtractor.WorldExtractor'>, 'RegexTokeniser': <class 'Tokenisers.RegexTokeniser'>, 'current_dir': '/root/workspace/Projects/Ossian/recipes', 'phrase_adder': <PhraseMaker.PhraseMaker object at 0x7fd779ceb7d0>, 'dur_data_maker': <FeatureDumper.FeatureDumper object at 0x7fd779ceb690>, 'pause_predictor': <SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fd779ceb610>, 'speech_generation': [<FeatureDumper.FeatureDumper object at 0x7fd779ceb6d0>, <NN.NNDurationPredictor object at 0x7fd77ea2bb50>, <FeatureDumper.FeatureDumper object at 0x7fd779cd5d90>, <NN.NNAcousticPredictor object at 0x7fd779cd5ed0>], 'runtime_stages': [[<Tokenisers.RegexTokeniser object at 0x7fd77a645b90>, <Phonetisers.NaivePhonetiser object at 0x7fd779cd5d50>, <VSMTagger.VSMTagger object at 0x7fd779ceb710>], [<SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fd779ceb610>, <PhraseMaker.PhraseMaker object at 0x7fd779ceb7d0>], [<FeatureDumper.FeatureDumper object at 0x7fd779ceb6d0>, <NN.NNDurationPredictor object at 0x7fd77ea2bb50>, <FeatureDumper.FeatureDumper object at 0x7fd779cd5d90>, <NN.NNAcousticPredictor object at 0x7fd779cd5ed0>]], 'text_proc': [<Tokenisers.RegexTokeniser object at 0x7fd77a645b90>, <Phonetisers.NaivePhonetiser object at 0x7fd779cd5d50>, <VSMTagger.VSMTagger object at 0x7fd779ceb710>], 'acoustic_predictor': <NN.NNAcousticPredictor object at 0x7fd779cd5ed0>, 'duration_predictor': <NN.NNDurationPredictor object at 0x7fd77ea2bb50>, 'NNAcousticPredictor': <class 'NN.NNAcousticPredictor'>, 'align_label_dumper': <FeatureDumper.FeatureDumper object at 0x7fd779ceb790>, 'pause_predictor_features': [('response', './attribute::has_silence="yes"'), ('token_is_punctuation', 
'./attribute::token_class="punctuation"'), ('since_start_utterance_in_words', "count(preceding::token[@token_class='word'])"), ('till_end_utterance_in_words', "count(following::token[@token_class='word'])"), ('L_vsm_d1', "./preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './attribute::vsm_d1'), ('R_vsm_d1', "./following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './attribute::vsm_d2'), ('R_vsm_d2', "./following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './attribute::vsm_d3'), ('R_vsm_d3', "./following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './attribute::vsm_d4'), ('R_vsm_d4', "./following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './attribute::vsm_d5'), ('R_vsm_d5', "./following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './attribute::vsm_d6'), ('R_vsm_d6', "./following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './attribute::vsm_d7'), ('R_vsm_d7', "./following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './attribute::vsm_d8'), ('R_vsm_d8', "./following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './attribute::vsm_d9'), ('R_vsm_d9', "./following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', 
"./preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './attribute::vsm_d10'), ('R_vsm_d10', "./following::token[@token_class='word'][1]/attribute::vsm_d10")], 'PhraseMaker': <class 'PhraseMaker.PhraseMaker'>, 'alignment': [<FeatureDumper.FeatureDumper object at 0x7fd779ceb790>, <FeatureExtractor.WorldExtractor object at 0x7fd779ceb510>, <Aligner.StateAligner object at 0x7fd779ceb590>, <SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fd779ceb610>, <PhraseMaker.PhraseMaker object at 0x7fd779ceb7d0>, <FeatureDumper.FeatureDumper object at 0x7fd779ceb690>], 'VSMTagger': <class 'VSMTagger.VSMTagger'>, 'duration_data_contexts': [('state_1_nframes', '(./state[1]/attribute::end - ./state[1]/attribute::start) div 5'), ('state_2_nframes', '(./state[2]/attribute::end - ./state[2]/attribute::start) div 5'), ('state_3_nframes', '(./state[3]/attribute::end - ./state[3]/attribute::start) div 5'), ('state_4_nframes', '(./state[4]/attribute::end - ./state[4]/attribute::start) div 5'), ('state_5_nframes', '(./state[5]/attribute::end - ./state[5]/attribute::start) div 5')], 'phone_and_state_contexts': [('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', "count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), 
('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', 
"./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'word_vector_tagger': <VSMTagger.VSMTagger object at 0x7fd779ceb710>, 'inspect': <module 'inspect' from '/usr/lib/python2.7/inspect.pyc'>, 'tokenisation_pattern': 
'(\\p{Z}*[\\p{C}||\\p{P}||\\p{S}]*\\p{Z}+|\\p{Z}*[\\p{C}||\\p{P}||\\p{S}]+\\Z)', 'aligner': <Aligner.StateAligner object at 0x7fd779ceb590>, 'sys': <module 'sys' (built-in)>, 'dnn_label_maker': <FeatureDumper.FeatureDumper object at 0x7fd779cd5d90>, 'NaivePhonetiser': <class 'Phonetisers.NaivePhonetiser'>, 'tokeniser': <Tokenisers.RegexTokeniser object at 0x7fd77a645b90>, 'LETTER_PATT': '[\\p{L}||\\p{N}||\\p{M}]', 'AcousticModelWorld': <class 'AcousticModel.AcousticModelWorld'>, 'speech_feature_extractor': <FeatureExtractor.WorldExtractor object at 0x7fd779ceb510>, 'dim': 10, 'c': <module 'default.const' from '/root/workspace/Projects/Ossian/scripts/default/const.pyc'>, 'FeatureDumper': <class 'FeatureDumper.FeatureDumper'>, 'word_vsm_dim': 10, 'speech_coding_config': {'delta_delta_window': '1.0 -2.0 1.0', 'static_window': '1', 'order': 59, 'delta_window': '-0.5 0.0 0.5'}, 'pause_prediction': [<SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fd779ceb610>, <PhraseMaker.PhraseMaker object at 0x7fd779ceb7d0>], 'i': 5, 'SPACE_PATT': '\\p{Z}', 'PUNC_OR_SPACE_PATT': '[\\p{Z}||\\p{C}||\\p{P}||\\p{S}]', 'phonetiser': <Phonetisers.NaivePhonetiser object at 0x7fd779cd5d50>, 'NNDurationPredictor': <class 'NN.NNDurationPredictor'>, 'os': <module 'os' from '/usr/lib/python2.7/os.pyc'>, 'phone_contexts': [('htk_monophone', './attribute::pronunciation'), ('start_time', './attribute::start'), ('end_time', './attribute::end'), ('ll_segment', 'preceding::segment[2]/attribute::pronunciation'), ('l_segment', 'preceding::segment[1]/attribute::pronunciation'), ('c_segment', './attribute::pronunciation'), ('r_segment', 'following::segment[1]/attribute::pronunciation'), ('rr_segment', 'following::segment[2]/attribute::pronunciation'), ('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', 
"count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), ('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', 
"./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', 
"./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'StateAligner': <class 'Aligner.StateAligner'>}
train
Cannot load NN model from model_dir: /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/duration_predictor -- not trained yet
Cannot load NN model from model_dir: /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/acoustic_predictor -- not trained yet


== Train voice (proc no. 1 (word_splitter))  ==
Train processor word_splitter
RegexTokeniser requires no training
          Applying processor word_splitter
uuuuuuuuuuuuuuuuuuuuuuuuuuuuu

== Train voice (proc no. 2 (segment_adder))  ==
Train processor segment_adder
NaivePhonetiser requires no training
          Applying processor segment_adder
uuuuuuuuuuuuuuuuuuuuuuuuuuuuu

== Train voice (proc no. 3 (word_vector_tagger))  ==
          Applying processor word_vector_tagger
u u u u u u u u u u u u u u u u u u u u u u u u u u u u u 

== Train voice (proc no. 4 (feature_dumper))  ==
Train processor feature_dumper
FeatureDumper already trained -- questions exist:
/root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn//SomeFileName
          Applying processor feature_dumper
uuuuuuuuuuuuuuuuuuuuuuuuuuuuu

== Train voice (proc no. 5 (acoustic_feature_extractor))  ==
Train processor acoustic_feature_extractor
          Applying processor acoustic_feature_extractor
uuuuuuuuuuuuuuuuuuuuuuuuuuuuu

== Train voice (proc no. 6 (aligner))  ==
Train processor aligner

          Training aligner -- see /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/aligner/training/log.txt

set_up_data.py: No matching data files found in /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/align_lab and /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp
Aligner training failed

Thanks very much.

@oliverwatts
Collaborator

oliverwatts commented Aug 24, 2017 via email

@oliverwatts
Collaborator

oliverwatts commented Aug 24, 2017 via email

@candlewill
Contributor Author

Thanks very much for your reply, @oliverwatts.

The contents of the two folders are pasted at the end, along with the current train folder structure.

Then I tried to start from scratch by removing the train folder. There is no folder called voices.

rm train -rf

Train:

python ./scripts/train.py -s rss_toy_demo -l rm naive_01_nn

Here is the screen output from executing this command:

 -- Gather corpus
 -- Train voice
/root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn
/root/workspace/Projects/Ossian/voices//rm/rss_toy_demo/naive_01_nn
try loading config from python...
/root/workspace/Projects/Ossian/recipes/naive_01_nn.cfg
{'state_contexts': [('start_time', './attribute::start'), ('end_time', './attribute::end'), ('htk_state', 'count(./preceding-sibling::state) + 1'), ('htk_monophone', './ancestor::segment/attribute::pronunciation'), ('ll_segment', './ancestor::segment/preceding::segment[2]/attribute::pronunciation'), ('l_segment', './ancestor::segment/preceding::segment[1]/attribute::pronunciation'), ('c_segment', './ancestor::segment/attribute::pronunciation'), ('r_segment', './ancestor::segment/following::segment[1]/attribute::pronunciation'), ('rr_segment', './ancestor::segment/following::segment[2]/attribute::pronunciation'), ('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', "count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), ('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', 
\'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', 
"./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'dur_label_maker': <FeatureDumper.FeatureDumper object at 0x7fb28da786d0>, 'SKLDecisionTreePausePredictor': <class 'SKLProcessors.SKLDecisionTreePausePredictor'>, 'train_stages': [[<Tokenisers.RegexTokeniser object at 0x7fb28e3d3b90>, <Phonetisers.NaivePhonetiser object at 0x7fb28da62d50>, <VSMTagger.VSMTagger object at 0x7fb28da78710>], [<FeatureDumper.FeatureDumper object at 0x7fb28da78790>, <FeatureExtractor.WorldExtractor object at 0x7fb28da78510>, <Aligner.StateAligner object at 0x7fb28da78590>, <SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fb28da78610>, <PhraseMaker.PhraseMaker object at 0x7fb28da787d0>, <FeatureDumper.FeatureDumper object at 0x7fb28da78690>], [<FeatureDumper.FeatureDumper object at 0x7fb28da786d0>, <NN.NNDurationPredictor object at 0x7fb2927b8b50>, 
<FeatureDumper.FeatureDumper object at 0x7fb28da62d90>, <NN.NNAcousticPredictor object at 0x7fb28da62ed0>]], 'PUNC_PATT': '[\\p{C}||\\p{P}||\\p{S}]', 'JUNCTURE_NODES': "//token[@token_class='space'] | //token[@token_class='punctuation']", 'WorldExtractor': <class 'FeatureExtractor.WorldExtractor'>, 'RegexTokeniser': <class 'Tokenisers.RegexTokeniser'>, 'current_dir': '/root/workspace/Projects/Ossian/recipes', 'phrase_adder': <PhraseMaker.PhraseMaker object at 0x7fb28da787d0>, 'dur_data_maker': <FeatureDumper.FeatureDumper object at 0x7fb28da78690>, 'pause_predictor': <SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fb28da78610>, 'speech_generation': [<FeatureDumper.FeatureDumper object at 0x7fb28da786d0>, <NN.NNDurationPredictor object at 0x7fb2927b8b50>, <FeatureDumper.FeatureDumper object at 0x7fb28da62d90>, <NN.NNAcousticPredictor object at 0x7fb28da62ed0>], 'runtime_stages': [[<Tokenisers.RegexTokeniser object at 0x7fb28e3d3b90>, <Phonetisers.NaivePhonetiser object at 0x7fb28da62d50>, <VSMTagger.VSMTagger object at 0x7fb28da78710>], [<SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fb28da78610>, <PhraseMaker.PhraseMaker object at 0x7fb28da787d0>], [<FeatureDumper.FeatureDumper object at 0x7fb28da786d0>, <NN.NNDurationPredictor object at 0x7fb2927b8b50>, <FeatureDumper.FeatureDumper object at 0x7fb28da62d90>, <NN.NNAcousticPredictor object at 0x7fb28da62ed0>]], 'text_proc': [<Tokenisers.RegexTokeniser object at 0x7fb28e3d3b90>, <Phonetisers.NaivePhonetiser object at 0x7fb28da62d50>, <VSMTagger.VSMTagger object at 0x7fb28da78710>], 'acoustic_predictor': <NN.NNAcousticPredictor object at 0x7fb28da62ed0>, 'duration_predictor': <NN.NNDurationPredictor object at 0x7fb2927b8b50>, 'NNAcousticPredictor': <class 'NN.NNAcousticPredictor'>, 'align_label_dumper': <FeatureDumper.FeatureDumper object at 0x7fb28da78790>, 'pause_predictor_features': [('response', './attribute::has_silence="yes"'), ('token_is_punctuation', 
'./attribute::token_class="punctuation"'), ('since_start_utterance_in_words', "count(preceding::token[@token_class='word'])"), ('till_end_utterance_in_words', "count(following::token[@token_class='word'])"), ('L_vsm_d1', "./preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './attribute::vsm_d1'), ('R_vsm_d1', "./following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './attribute::vsm_d2'), ('R_vsm_d2', "./following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './attribute::vsm_d3'), ('R_vsm_d3', "./following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './attribute::vsm_d4'), ('R_vsm_d4', "./following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './attribute::vsm_d5'), ('R_vsm_d5', "./following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './attribute::vsm_d6'), ('R_vsm_d6', "./following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './attribute::vsm_d7'), ('R_vsm_d7', "./following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './attribute::vsm_d8'), ('R_vsm_d8', "./following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './attribute::vsm_d9'), ('R_vsm_d9', "./following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', 
"./preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './attribute::vsm_d10'), ('R_vsm_d10', "./following::token[@token_class='word'][1]/attribute::vsm_d10")], 'PhraseMaker': <class 'PhraseMaker.PhraseMaker'>, 'alignment': [<FeatureDumper.FeatureDumper object at 0x7fb28da78790>, <FeatureExtractor.WorldExtractor object at 0x7fb28da78510>, <Aligner.StateAligner object at 0x7fb28da78590>, <SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fb28da78610>, <PhraseMaker.PhraseMaker object at 0x7fb28da787d0>, <FeatureDumper.FeatureDumper object at 0x7fb28da78690>], 'VSMTagger': <class 'VSMTagger.VSMTagger'>, 'duration_data_contexts': [('state_1_nframes', '(./state[1]/attribute::end - ./state[1]/attribute::start) div 5'), ('state_2_nframes', '(./state[2]/attribute::end - ./state[2]/attribute::start) div 5'), ('state_3_nframes', '(./state[3]/attribute::end - ./state[3]/attribute::start) div 5'), ('state_4_nframes', '(./state[4]/attribute::end - ./state[4]/attribute::start) div 5'), ('state_5_nframes', '(./state[5]/attribute::end - ./state[5]/attribute::start) div 5')], 'phone_and_state_contexts': [('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', "count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), 
('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', 
"./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'word_vector_tagger': <VSMTagger.VSMTagger object at 0x7fb28da78710>, 'inspect': <module 'inspect' from '/usr/lib/python2.7/inspect.pyc'>, 'tokenisation_pattern': 
'(\\p{Z}*[\\p{C}||\\p{P}||\\p{S}]*\\p{Z}+|\\p{Z}*[\\p{C}||\\p{P}||\\p{S}]+\\Z)', 'aligner': <Aligner.StateAligner object at 0x7fb28da78590>, 'sys': <module 'sys' (built-in)>, 'dnn_label_maker': <FeatureDumper.FeatureDumper object at 0x7fb28da62d90>, 'NaivePhonetiser': <class 'Phonetisers.NaivePhonetiser'>, 'tokeniser': <Tokenisers.RegexTokeniser object at 0x7fb28e3d3b90>, 'LETTER_PATT': '[\\p{L}||\\p{N}||\\p{M}]', 'AcousticModelWorld': <class 'AcousticModel.AcousticModelWorld'>, 'speech_feature_extractor': <FeatureExtractor.WorldExtractor object at 0x7fb28da78510>, 'dim': 10, 'c': <module 'default.const' from '/root/workspace/Projects/Ossian/scripts/default/const.pyc'>, 'FeatureDumper': <class 'FeatureDumper.FeatureDumper'>, 'word_vsm_dim': 10, 'speech_coding_config': {'delta_delta_window': '1.0 -2.0 1.0', 'static_window': '1', 'order': 59, 'delta_window': '-0.5 0.0 0.5'}, 'pause_prediction': [<SKLProcessors.SKLDecisionTreePausePredictor object at 0x7fb28da78610>, <PhraseMaker.PhraseMaker object at 0x7fb28da787d0>], 'i': 5, 'SPACE_PATT': '\\p{Z}', 'PUNC_OR_SPACE_PATT': '[\\p{Z}||\\p{C}||\\p{P}||\\p{S}]', 'phonetiser': <Phonetisers.NaivePhonetiser object at 0x7fb28da62d50>, 'NNDurationPredictor': <class 'NN.NNDurationPredictor'>, 'os': <module 'os' from '/usr/lib/python2.7/os.pyc'>, 'phone_contexts': [('htk_monophone', './attribute::pronunciation'), ('start_time', './attribute::start'), ('end_time', './attribute::end'), ('ll_segment', 'preceding::segment[2]/attribute::pronunciation'), ('l_segment', 'preceding::segment[1]/attribute::pronunciation'), ('c_segment', './attribute::pronunciation'), ('r_segment', 'following::segment[1]/attribute::pronunciation'), ('rr_segment', 'following::segment[2]/attribute::pronunciation'), ('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', 
"count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), ('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', 
"./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', 
"./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'StateAligner': <class 'Aligner.StateAligner'>}
train
Cannot load NN model from model_dir: /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/duration_predictor -- not trained yet
Cannot load NN model from model_dir: /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/acoustic_predictor -- not trained yet


== Train voice (proc no. 1 (word_splitter))  ==
Train processor word_splitter
RegexTokeniser requires no training
          Applying processor word_splitter
ppppppppppppppppppppppppppp

== Train voice (proc no. 2 (segment_adder))  ==
Train processor segment_adder
NaivePhonetiser requires no training
          Applying processor segment_adder
pppppppppppppppppppppppppp

== Train voice (proc no. 3 (word_vector_tagger))  ==
Train processor word_vector_tagger
Count types...
Assemble cooccurance matrix...
Factorise cooccurance matrix...
Write output to /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/word_vector_tagger/table_file.table
          Applying processor word_vector_tagger
p p p p p p p p p p p p p p p p p p p p p p p p p p p p p 

== Train voice (proc no. 4 (feature_dumper))  ==
Train processor feature_dumper
          Applying processor feature_dumper
ppppppppppppppppppppppppppp

== Train voice (proc no. 5 (acoustic_feature_extractor))  ==
Train processor acoustic_feature_extractor
          Applying processor acoustic_feature_extractor
p double -> float conversion (stream: bap) failed on utterance adr_diph1_001
p double -> float conversion (stream: bap) failed on utterance adr_diph1_002
p double -> float conversion (stream: bap) failed on utterance adr_diph1_005
p double -> float conversion (stream: bap) failed on utterance adr_diph1_004
p double -> float conversion (stream: bap) failed on utterance adr_diph1_007
p double -> float conversion (stream: bap) failed on utterance adr_diph1_009
p double -> float conversion (stream: bap) failed on utterance adr_diph1_003
p double -> float conversion (stream: bap) failed on utterance adr_diph1_008
p double -> float conversion (stream: bap) failed on utterance adr_diph1_013
p double -> float conversion (stream: bap) failed on utterance adr_diph1_015
p double -> float conversion (stream: bap) failed on utterance adr_diph1_016
p double -> float conversion (stream: bap) failed on utterance adr_diph1_017
p double -> float conversion (stream: bap) failed on utterance adr_diph1_019
p double -> float conversion (stream: bap) failed on utterance adr_diph1_020
p double -> float conversion (stream: bap) failed on utterance adr_diph1_018
p double -> float conversion (stream: bap) failed on utterance adr_diph1_022
p double -> float conversion (stream: bap) failed on utterance adr_diph1_023
p double -> float conversion (stream: bap) failed on utterance adr_diph1_026
p double -> float conversion (stream: bap) failed on utterance adr_diph1_010
p double -> float conversion (stream: bap) failed on utterance adr_diph1_024
p double -> float conversion (stream: bap) failed on utterance adr_diph1_028
p double -> float conversion (stream: bap) failed on utterance adr_diph1_027
p double -> float conversion (stream: bap) failed on utterance adr_diph1_025
p double -> float conversion (stream: bap) failed on utterance adr_diph1_029


== Train voice (proc no. 6 (aligner))  ==
Train processor aligner

          Training aligner -- see /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/aligner/training/log.txt

File information
Sampling : 48000 Hz 16 Bit
Length 70456 [sample]
Length 1.467833 [sec]

Analysis
DIO: 120 [msec]
StoneMask: 102 [msec]
CheapTrick: 318 [msec]
D4C: 378 [msec]
294 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 83729 [sample]
Length 1.744354 [sec]

Analysis
DIO: 168 [msec]
StoneMask: 93 [msec]
CheapTrick: 324 [msec]
D4C: 408 [msec]
349 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 87229 [sample]
Length 1.817271 [sec]

Analysis
DIO: 113 [msec]
StoneMask: 106 [msec]
CheapTrick: 394 [msec]
D4C: 437 [msec]
364 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 92057 [sample]
Length 1.917854 [sec]

Analysis
DIO: 122 [msec]
StoneMask: 104 [msec]
CheapTrick: 474 [msec]
D4C: 404 [msec]
384 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 82708 [sample]
Length 1.723083 [sec]

Analysis
DIO: 191 [msec]
StoneMask: 112 [msec]
CheapTrick: 371 [msec]
D4C: 411 [msec]
345 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 90878 [sample]
Length 1.893292 [sec]

Analysis
DIO: 127 [msec]
StoneMask: 169 [msec]
CheapTrick: 392 [msec]
D4C: 425 [msec]
379 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 112013 [sample]
Length 2.333604 [sec]

Analysis
DIO: 60 [msec]
StoneMask: 159 [msec]
CheapTrick: 500 [msec]
D4C: 507 [msec]
467 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 103000 [sample]
Length 2.145833 [sec]

Analysis
DIO: 120 [msec]
StoneMask: 106 [msec]
CheapTrick: 464 [msec]
D4C: 512 [msec]
430 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 110450 [sample]
Length 2.301042 [sec]

Analysis
DIO: 116 [msec]
StoneMask: 118 [msec]
CheapTrick: 512 [msec]
D4C: 498 [msec]
461 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 107215 [sample]
Length 2.233646 [sec]

Analysis
DIO: 113 [msec]
StoneMask: 115 [msec]
CheapTrick: 480 [msec]
D4C: 526 [msec]
447 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 111048 [sample]
Length 2.313500 [sec]

Analysis
DIO: 112 [msec]
StoneMask: 111 [msec]
CheapTrick: 507 [msec]
D4C: 595 [msec]
463 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 107215 [sample]
Length 2.233646 [sec]

Analysis
DIO: 118 [msec]
StoneMask: 184 [msec]
CheapTrick: 421 [msec]
D4C: 606 [msec]
447 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 111299 [sample]
Length 2.318729 [sec]

Analysis
DIO: 204 [msec]
StoneMask: 115 [msec]
CheapTrick: 581 [msec]
D4C: 383 [msec]
464 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 130360 [sample]
Length 2.715833 [sec]

Analysis
DIO: 230 [msec]
StoneMask: 187 [msec]
CheapTrick: 537 [msec]
D4C: 484 [msec]
544 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 121510 [sample]
Length 2.531458 [sec]

Analysis
DIO: 194 [msec]
StoneMask: 196 [msec]
CheapTrick: 497 [msec]
D4C: 519 [msec]
507 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 115876 [sample]
Length 2.414083 [sec]

Analysis
DIO: 200 [msec]
StoneMask: 183 [msec]
CheapTrick: 513 [msec]
D4C: 520 [msec]
483 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 124573 [sample]
Length 2.595271 [sec]

Analysis
DIO: 170 [msec]
StoneMask: 194 [msec]
CheapTrick: 519 [msec]
D4C: 536 [msec]
520 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 121510 [sample]
Length 2.531458 [sec]

Analysis
DIO: 259 [msec]
StoneMask: 130 [msec]
CheapTrick: 603 [msec]
D4C: 468 [msec]
507 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 142160 [sample]
Length 2.961667 [sec]

Analysis
DIO: 310 [msec]
StoneMask: 125 [msec]
CheapTrick: 602 [msec]
D4C: 527 [msec]
593 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 146776 [sample]
Length 3.057833 [sec]

Analysis
DIO: 297 [msec]
StoneMask: 137 [msec]
CheapTrick: 600 [msec]
D4C: 575 [msec]
612 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 156372 [sample]
Length 3.257750 [sec]

Analysis
DIO: 235 [msec]
StoneMask: 176 [msec]
CheapTrick: 628 [msec]
D4C: 483 [msec]
652 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 139879 [sample]
Length 2.914146 [sec]

Analysis
DIO: 333 [msec]
StoneMask: 200 [msec]
CheapTrick: 586 [msec]
D4C: 496 [msec]
583 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 207282 [sample]
Length 4.318375 [sec]

Analysis
DIO: 293 [msec]
StoneMask: 219 [msec]
CheapTrick: 791 [msec]
D4C: 643 [msec]
864 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 202177 [sample]
Length 4.212021 [sec]

Analysis
DIO: 364 [msec]
StoneMask: 218 [msec]
CheapTrick: 733 [msec]
D4C: 638 [msec]
843 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 198092 [sample]
Length 4.126917 [sec]

Analysis
DIO: 225 [msec]
StoneMask: 265 [msec]
CheapTrick: 767 [msec]
D4C: 633 [msec]
826 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 229499 [sample]
Length 4.781229 [sec]

Analysis
DIO: 385 [msec]
StoneMask: 235 [msec]
CheapTrick: 813 [msec]
D4C: 720 [msec]
957 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 236894 [sample]
Length 4.935292 [sec]

Analysis
DIO: 301 [msec]
StoneMask: 248 [msec]
CheapTrick: 849 [msec]
D4C: 757 [msec]
988 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 315102 [sample]
Length 6.564625 [sec]

Analysis
DIO: 492 [msec]
StoneMask: 327 [msec]
CheapTrick: 920 [msec]
D4C: 942 [msec]
1313 2048 5
complete.
File information
Sampling : 48000 Hz 16 Bit
Length 303852 [sample]
Length 6.330250 [sec]

Analysis
DIO: 534 [msec]
StoneMask: 379 [msec]
CheapTrick: 831 [msec]
D4C: 962 [msec]
1267 2048 5
complete.
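
The log above is long, and the failure lines from the acoustic_feature_extractor step are easy to lose in it. A minimal sketch, assuming Python 3 and that the console output has been saved as text, of how those lines could be collected per stream. The function name `failed_conversions` and the `sample` string are hypothetical and not part of Ossian; the pattern is copied from the message format shown in the log.

```python
import re

# Pattern for the failure lines printed while applying acoustic_feature_extractor.
# The wording is taken verbatim from the log; nothing here calls into Ossian.
FAILED = re.compile(
    r"double -> float conversion \(stream: (\w+)\) failed on utterance (\S+)"
)


def failed_conversions(log_text):
    """Return {stream: sorted utterance names} for every failure line in log_text."""
    failures = {}
    for stream, utt in FAILED.findall(log_text):
        failures.setdefault(stream, set()).add(utt)
    return {stream: sorted(utts) for stream, utts in failures.items()}


# Hypothetical excerpt; in practice, read the saved console output from a file.
sample = """\
p double -> float conversion (stream: bap) failed on utterance adr_diph1_001
p double -> float conversion (stream: bap) failed on utterance adr_diph1_002
ppppppppppppppppppppppppppp
"""

print(failed_conversions(sample))
```

Grouping by stream makes it clear whether only `bap` is affected, as in the log above, or other streams as well.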

The contents of the two folders:

root@de-3879-ng-1-034425-3089955241-qx7zj:~/workspace/Projects/Ossian# ls /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/align_lab
adr_diph1_001.align_lab  adr_diph1_006.align_lab  adr_diph1_011.align_lab  adr_diph1_016.align_lab  adr_diph1_021.align_lab  adr_diph1_026.align_lab
adr_diph1_002.align_lab  adr_diph1_007.align_lab  adr_diph1_012.align_lab  adr_diph1_017.align_lab  adr_diph1_022.align_lab  adr_diph1_027.align_lab
adr_diph1_003.align_lab  adr_diph1_008.align_lab  adr_diph1_013.align_lab  adr_diph1_018.align_lab  adr_diph1_023.align_lab  adr_diph1_028.align_lab
adr_diph1_004.align_lab  adr_diph1_009.align_lab  adr_diph1_014.align_lab  adr_diph1_019.align_lab  adr_diph1_024.align_lab  adr_diph1_029.align_lab
adr_diph1_005.align_lab  adr_diph1_010.align_lab  adr_diph1_015.align_lab  adr_diph1_020.align_lab  adr_diph1_025.align_lab
root@de-3879-ng-1-034425-3089955241-qx7zj:~/workspace/Projects/Ossian# ls /root/workspace/Projects/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp
adr_diph1_001.bap         adr_diph1_005.wav         adr_diph1_010.sp.double   adr_diph1_015.mgc         adr_diph1_020.log         adr_diph1_025.f0.double
adr_diph1_001.bap.double  adr_diph1_006.bap         adr_diph1_010.wav         adr_diph1_015.sp.double   adr_diph1_020.mgc         adr_diph1_025.log
adr_diph1_001.f0.double   adr_diph1_006.bap.double  adr_diph1_011.bap         adr_diph1_015.wav         adr_diph1_020.sp.double   adr_diph1_025.mgc
adr_diph1_001.log         adr_diph1_006.f0.double   adr_diph1_011.bap.double  adr_diph1_016.bap         adr_diph1_020.wav         adr_diph1_025.sp.double
adr_diph1_001.mgc         adr_diph1_006.log         adr_diph1_011.f0.double   adr_diph1_016.bap.double  adr_diph1_021.bap         adr_diph1_025.wav
adr_diph1_001.sp.double   adr_diph1_006.mgc         adr_diph1_011.log         adr_diph1_016.f0.double   adr_diph1_021.bap.double  adr_diph1_026.bap
adr_diph1_001.wav         adr_diph1_006.sp.double   adr_diph1_011.mgc         adr_diph1_016.log         adr_diph1_021.f0.double   adr_diph1_026.bap.double
adr_diph1_002.bap         adr_diph1_006.wav         adr_diph1_011.sp.double   adr_diph1_016.mgc         adr_diph1_021.log         adr_diph1_026.f0.double
adr_diph1_002.bap.double  adr_diph1_007.bap         adr_diph1_011.wav         adr_diph1_016.sp.double   adr_diph1_021.mgc         adr_diph1_026.log
adr_diph1_002.f0.double   adr_diph1_007.bap.double  adr_diph1_012.bap         adr_diph1_016.wav         adr_diph1_021.sp.double   adr_diph1_026.mgc
adr_diph1_002.log         adr_diph1_007.f0.double   adr_diph1_012.bap.double  adr_diph1_017.bap         adr_diph1_021.wav         adr_diph1_026.sp.double
adr_diph1_002.mgc         adr_diph1_007.log         adr_diph1_012.f0.double   adr_diph1_017.bap.double  adr_diph1_022.bap         adr_diph1_026.wav
adr_diph1_002.sp.double   adr_diph1_007.mgc         adr_diph1_012.log         adr_diph1_017.f0.double   adr_diph1_022.bap.double  adr_diph1_027.bap
adr_diph1_002.wav         adr_diph1_007.sp.double   adr_diph1_012.mgc         adr_diph1_017.log         adr_diph1_022.f0.double   adr_diph1_027.bap.double
adr_diph1_003.bap         adr_diph1_007.wav         adr_diph1_012.sp.double   adr_diph1_017.mgc         adr_diph1_022.log         adr_diph1_027.f0.double
adr_diph1_003.bap.double  adr_diph1_008.bap         adr_diph1_012.wav         adr_diph1_017.sp.double   adr_diph1_022.mgc         adr_diph1_027.log
adr_diph1_003.f0.double   adr_diph1_008.bap.double  adr_diph1_013.bap         adr_diph1_017.wav         adr_diph1_022.sp.double   adr_diph1_027.mgc
adr_diph1_003.log         adr_diph1_008.f0.double   adr_diph1_013.bap.double  adr_diph1_018.bap         adr_diph1_022.wav         adr_diph1_027.sp.double
adr_diph1_003.mgc         adr_diph1_008.log         adr_diph1_013.f0.double   adr_diph1_018.bap.double  adr_diph1_023.bap         adr_diph1_027.wav
adr_diph1_003.sp.double   adr_diph1_008.mgc         adr_diph1_013.log         adr_diph1_018.f0.double   adr_diph1_023.bap.double  adr_diph1_028.bap
adr_diph1_003.wav         adr_diph1_008.sp.double   adr_diph1_013.mgc         adr_diph1_018.log         adr_diph1_023.f0.double   adr_diph1_028.bap.double
adr_diph1_004.bap         adr_diph1_008.wav         adr_diph1_013.sp.double   adr_diph1_018.mgc         adr_diph1_023.log         adr_diph1_028.f0.double
adr_diph1_004.bap.double  adr_diph1_009.bap         adr_diph1_013.wav         adr_diph1_018.sp.double   adr_diph1_023.mgc         adr_diph1_028.log
adr_diph1_004.f0.double   adr_diph1_009.bap.double  adr_diph1_014.bap         adr_diph1_018.wav         adr_diph1_023.sp.double   adr_diph1_028.mgc
adr_diph1_004.log         adr_diph1_009.f0.double   adr_diph1_014.bap.double  adr_diph1_019.bap         adr_diph1_023.wav         adr_diph1_028.sp.double
adr_diph1_004.mgc         adr_diph1_009.log         adr_diph1_014.f0.double   adr_diph1_019.bap.double  adr_diph1_024.bap         adr_diph1_028.wav
adr_diph1_004.sp.double   adr_diph1_009.mgc         adr_diph1_014.log         adr_diph1_019.f0.double   adr_diph1_024.bap.double  adr_diph1_029.bap
adr_diph1_004.wav         adr_diph1_009.sp.double   adr_diph1_014.mgc         adr_diph1_019.log         adr_diph1_024.f0.double   adr_diph1_029.bap.double
adr_diph1_005.bap         adr_diph1_009.wav         adr_diph1_014.sp.double   adr_diph1_019.mgc         adr_diph1_024.log         adr_diph1_029.f0.double
adr_diph1_005.bap.double  adr_diph1_010.bap         adr_diph1_014.wav         adr_diph1_019.sp.double   adr_diph1_024.mgc         adr_diph1_029.log
adr_diph1_005.f0.double   adr_diph1_010.bap.double  adr_diph1_015.bap         adr_diph1_019.wav         adr_diph1_024.sp.double   adr_diph1_029.mgc
adr_diph1_005.log         adr_diph1_010.f0.double   adr_diph1_015.bap.double  adr_diph1_020.bap         adr_diph1_024.wav         adr_diph1_029.sp.double
adr_diph1_005.mgc         adr_diph1_010.log         adr_diph1_015.f0.double   adr_diph1_020.bap.double  adr_diph1_025.bap         adr_diph1_029.wav
adr_diph1_005.sp.double   adr_diph1_010.mgc         adr_diph1_015.log         adr_diph1_020.f0.double   adr_diph1_025.bap.double
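
The cmp listing above shows .bap, .bap.double, .f0.double, .log, .mgc, .sp.double and .wav files, but nothing ending in .cmp. A minimal sketch, assuming Python 3, of how the same check could be scripted instead of reading the `ls` output by eye. The helper names `extensions_by_utterance` and `missing` are hypothetical and not part of Ossian, and the relative path in the usage part assumes the script is run from the Ossian root directory.

```python
import os
from collections import defaultdict


def extensions_by_utterance(directory):
    """Map each utterance name to the set of file extensions present for it."""
    found = defaultdict(set)
    for name in os.listdir(directory):
        # Split at the first dot so 'adr_diph1_001.bap.double' gives 'bap.double'.
        base, _, ext = name.partition(".")
        if ext:
            found[base].add(ext)
    return dict(found)


def missing(directory, ext):
    """Return the sorted utterance names that have no file with extension ext."""
    by_utt = extensions_by_utterance(directory)
    return sorted(utt for utt, exts in by_utt.items() if ext not in exts)


# Assumed location, taken from the paths printed in this issue.
cmp_dir = "train/rm/speakers/rss_toy_demo/naive_01_nn/cmp"
if os.path.isdir(cmp_dir):
    for ext in ("mgc", "bap", "cmp"):
        print(ext, "missing for", len(missing(cmp_dir, ext)), "utterances")
```

Which extensions the later training step actually expects is not something this sketch knows; the tuple in the loop is only an example and should be adjusted to whatever set_up_data.py looks for.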

Folder Structure:

root@de-3879-ng-1-034425-3089955241-qx7zj:~/workspace/Projects/Ossian# tree train/     
train/
`-- rm
    `-- speakers
        `-- rss_toy_demo
            `-- naive_01_nn
                |-- SomeFileName
                |-- SomeFileName.cont
                |-- SomeFileName.key
                |-- SomeFileName.values
                |-- align_lab
                |   |-- adr_diph1_001.align_lab
                |   |-- adr_diph1_002.align_lab
                |   |-- adr_diph1_003.align_lab
                |   |-- adr_diph1_004.align_lab
                |   |-- adr_diph1_005.align_lab
                |   |-- adr_diph1_006.align_lab
                |   |-- adr_diph1_007.align_lab
                |   |-- adr_diph1_008.align_lab
                |   |-- adr_diph1_009.align_lab
                |   |-- adr_diph1_010.align_lab
                |   |-- adr_diph1_011.align_lab
                |   |-- adr_diph1_012.align_lab
                |   |-- adr_diph1_013.align_lab
                |   |-- adr_diph1_014.align_lab
                |   |-- adr_diph1_015.align_lab
                |   |-- adr_diph1_016.align_lab
                |   |-- adr_diph1_017.align_lab
                |   |-- adr_diph1_018.align_lab
                |   |-- adr_diph1_019.align_lab
                |   |-- adr_diph1_020.align_lab
                |   |-- adr_diph1_021.align_lab
                |   |-- adr_diph1_022.align_lab
                |   |-- adr_diph1_023.align_lab
                |   |-- adr_diph1_024.align_lab
                |   |-- adr_diph1_025.align_lab
                |   |-- adr_diph1_026.align_lab
                |   |-- adr_diph1_027.align_lab
                |   |-- adr_diph1_028.align_lab
                |   `-- adr_diph1_029.align_lab
                |-- cmp
                |   |-- adr_diph1_001.bap
                |   |-- adr_diph1_001.bap.double
                |   |-- adr_diph1_001.f0.double
                |   |-- adr_diph1_001.log
                |   |-- adr_diph1_001.mgc
                |   |-- adr_diph1_001.sp.double
                |   |-- adr_diph1_001.wav
                |   |-- adr_diph1_002.bap
                |   |-- adr_diph1_002.bap.double
                |   |-- adr_diph1_002.f0.double
                |   |-- adr_diph1_002.log
                |   |-- adr_diph1_002.mgc
                |   |-- adr_diph1_002.sp.double
                |   |-- adr_diph1_002.wav
                |   |-- adr_diph1_003.bap
                |   |-- adr_diph1_003.bap.double
                |   |-- adr_diph1_003.f0.double
                |   |-- adr_diph1_003.log
                |   |-- adr_diph1_003.mgc
                |   |-- adr_diph1_003.sp.double
                |   |-- adr_diph1_003.wav
                |   |-- adr_diph1_004.bap
                |   |-- adr_diph1_004.bap.double
                |   |-- adr_diph1_004.f0.double
                |   |-- adr_diph1_004.log
                |   |-- adr_diph1_004.mgc
                |   |-- adr_diph1_004.sp.double
                |   |-- adr_diph1_004.wav
                |   |-- adr_diph1_005.bap
                |   |-- adr_diph1_005.bap.double
                |   |-- adr_diph1_005.f0.double
                |   |-- adr_diph1_005.log
                |   |-- adr_diph1_005.mgc
                |   |-- adr_diph1_005.sp.double
                |   |-- adr_diph1_005.wav
                |   |-- adr_diph1_006.bap
                |   |-- adr_diph1_006.bap.double
                |   |-- adr_diph1_006.f0.double
                |   |-- adr_diph1_006.log
                |   |-- adr_diph1_006.mgc
                |   |-- adr_diph1_006.sp.double
                |   |-- adr_diph1_006.wav
                |   |-- adr_diph1_007.bap
                |   |-- adr_diph1_007.bap.double
                |   |-- adr_diph1_007.f0.double
                |   |-- adr_diph1_007.log
                |   |-- adr_diph1_007.mgc
                |   |-- adr_diph1_007.sp.double
                |   |-- adr_diph1_007.wav
                |   |-- adr_diph1_008.bap
                |   |-- adr_diph1_008.bap.double
                |   |-- adr_diph1_008.f0.double
                |   |-- adr_diph1_008.log
                |   |-- adr_diph1_008.mgc
                |   |-- adr_diph1_008.sp.double
                |   |-- adr_diph1_008.wav
                |   |-- adr_diph1_009.bap
                |   |-- adr_diph1_009.bap.double
                |   |-- adr_diph1_009.f0.double
                |   |-- adr_diph1_009.log
                |   |-- adr_diph1_009.mgc
                |   |-- adr_diph1_009.sp.double
                |   |-- adr_diph1_009.wav
                |   |-- adr_diph1_010.bap
                |   |-- adr_diph1_010.bap.double
                |   |-- adr_diph1_010.f0.double
                |   |-- adr_diph1_010.log
                |   |-- adr_diph1_010.mgc
                |   |-- adr_diph1_010.sp.double
                |   |-- adr_diph1_010.wav
                |   |-- adr_diph1_011.bap
                |   |-- adr_diph1_011.bap.double
                |   |-- adr_diph1_011.f0.double
                |   |-- adr_diph1_011.log
                |   |-- adr_diph1_011.mgc
                |   |-- adr_diph1_011.sp.double
                |   |-- adr_diph1_011.wav
                |   |-- adr_diph1_012.bap
                |   |-- adr_diph1_012.bap.double
                |   |-- adr_diph1_012.f0.double
                |   |-- adr_diph1_012.log
                |   |-- adr_diph1_012.mgc
                |   |-- adr_diph1_012.sp.double
                |   |-- adr_diph1_012.wav
                |   |-- adr_diph1_013.bap
                |   |-- adr_diph1_013.bap.double
                |   |-- adr_diph1_013.f0.double
                |   |-- adr_diph1_013.log
                |   |-- adr_diph1_013.mgc
                |   |-- adr_diph1_013.sp.double
                |   |-- adr_diph1_013.wav
                |   |-- adr_diph1_014.bap
                |   |-- adr_diph1_014.bap.double
                |   |-- adr_diph1_014.f0.double
                |   |-- adr_diph1_014.log
                |   |-- adr_diph1_014.mgc
                |   |-- adr_diph1_014.sp.double
                |   |-- adr_diph1_014.wav
                |   |-- adr_diph1_015.bap
                |   |-- adr_diph1_015.bap.double
                |   |-- adr_diph1_015.f0.double
                |   |-- adr_diph1_015.log
                |   |-- adr_diph1_015.mgc
                |   |-- adr_diph1_015.sp.double
                |   |-- adr_diph1_015.wav
                |   |-- adr_diph1_016.bap
                |   |-- adr_diph1_016.bap.double
                |   |-- adr_diph1_016.f0.double
                |   |-- adr_diph1_016.log
                |   |-- adr_diph1_016.mgc
                |   |-- adr_diph1_016.sp.double
                |   |-- adr_diph1_016.wav
                |   |-- adr_diph1_017.bap
                |   |-- adr_diph1_017.bap.double
                |   |-- adr_diph1_017.f0.double
                |   |-- adr_diph1_017.log
                |   |-- adr_diph1_017.mgc
                |   |-- adr_diph1_017.sp.double
                |   |-- adr_diph1_017.wav
                |   |-- adr_diph1_018.bap
                |   |-- adr_diph1_018.bap.double
                |   |-- adr_diph1_018.f0.double
                |   |-- adr_diph1_018.log
                |   |-- adr_diph1_018.mgc
                |   |-- adr_diph1_018.sp.double
                |   |-- adr_diph1_018.wav
                |   |-- adr_diph1_019.bap
                |   |-- adr_diph1_019.bap.double
                |   |-- adr_diph1_019.f0.double
                |   |-- adr_diph1_019.log
                |   |-- adr_diph1_019.mgc
                |   |-- adr_diph1_019.sp.double
                |   |-- adr_diph1_019.wav
                |   |-- adr_diph1_020.bap
                |   |-- adr_diph1_020.bap.double
                |   |-- adr_diph1_020.f0.double
                |   |-- adr_diph1_020.log
                |   |-- adr_diph1_020.mgc
                |   |-- adr_diph1_020.sp.double
                |   |-- adr_diph1_020.wav
                |   |-- adr_diph1_021.bap
                |   |-- adr_diph1_021.bap.double
                |   |-- adr_diph1_021.f0.double
                |   |-- adr_diph1_021.log
                |   |-- adr_diph1_021.mgc
                |   |-- adr_diph1_021.sp.double
                |   |-- adr_diph1_021.wav
                |   |-- adr_diph1_022.bap
                |   |-- adr_diph1_022.bap.double
                |   |-- adr_diph1_022.f0.double
                |   |-- adr_diph1_022.log
                |   |-- adr_diph1_022.mgc
                |   |-- adr_diph1_022.sp.double
                |   |-- adr_diph1_022.wav
                |   |-- adr_diph1_023.bap
                |   |-- adr_diph1_023.bap.double
                |   |-- adr_diph1_023.f0.double
                |   |-- adr_diph1_023.log
                |   |-- adr_diph1_023.mgc
                |   |-- adr_diph1_023.sp.double
                |   |-- adr_diph1_023.wav
                |   |-- adr_diph1_024.bap
                |   |-- adr_diph1_024.bap.double
                |   |-- adr_diph1_024.f0.double
                |   |-- adr_diph1_024.log
                |   |-- adr_diph1_024.mgc
                |   |-- adr_diph1_024.sp.double
                |   |-- adr_diph1_024.wav
                |   |-- adr_diph1_025.bap
                |   |-- adr_diph1_025.bap.double
                |   |-- adr_diph1_025.f0.double
                |   |-- adr_diph1_025.log
                |   |-- adr_diph1_025.mgc
                |   |-- adr_diph1_025.sp.double
                |   |-- adr_diph1_025.wav
                |   |-- adr_diph1_026.bap
                |   |-- adr_diph1_026.bap.double
                |   |-- adr_diph1_026.f0.double
                |   |-- adr_diph1_026.log
                |   |-- adr_diph1_026.mgc
                |   |-- adr_diph1_026.sp.double
                |   |-- adr_diph1_026.wav
                |   |-- adr_diph1_027.bap
                |   |-- adr_diph1_027.bap.double
                |   |-- adr_diph1_027.f0.double
                |   |-- adr_diph1_027.log
                |   |-- adr_diph1_027.mgc
                |   |-- adr_diph1_027.sp.double
                |   |-- adr_diph1_027.wav
                |   |-- adr_diph1_028.bap
                |   |-- adr_diph1_028.bap.double
                |   |-- adr_diph1_028.f0.double
                |   |-- adr_diph1_028.log
                |   |-- adr_diph1_028.mgc
                |   |-- adr_diph1_028.sp.double
                |   |-- adr_diph1_028.wav
                |   |-- adr_diph1_029.bap
                |   |-- adr_diph1_029.bap.double
                |   |-- adr_diph1_029.f0.double
                |   |-- adr_diph1_029.log
                |   |-- adr_diph1_029.mgc
                |   |-- adr_diph1_029.sp.double
                |   `-- adr_diph1_029.wav
                |-- processors
                |   |-- acoustic_feature_extractor
                |   |   |-- acoustic_feats.cfg
                |   |   `-- training
                |   |       |-- delta_delta_window.win
                |   |       |-- delta_window.win
                |   |       `-- static_window.win
                |   |-- acoustic_predictor
                |   |-- aligner
                |   |   |-- extra_substitutions.txt
                |   |   `-- training
                |   |       |-- 1
                |   |       |   `-- data
                |   |       |-- log.txt
                |   |       `-- train.cfg
                |   |-- duration_predictor
                |   |-- pause_predictor
                |   `-- word_vector_tagger
                |       |-- table_file.table
                |       `-- training
                |           `-- train_data.txt
                `-- utt
                    |-- adr_diph1_001.utt
                    |-- adr_diph1_002.utt
                    |-- adr_diph1_003.utt
                    |-- adr_diph1_004.utt
                    |-- adr_diph1_005.utt
                    |-- adr_diph1_006.utt
                    |-- adr_diph1_007.utt
                    |-- adr_diph1_008.utt
                    |-- adr_diph1_009.utt
                    |-- adr_diph1_010.utt
                    |-- adr_diph1_011.utt
                    |-- adr_diph1_012.utt
                    |-- adr_diph1_013.utt
                    |-- adr_diph1_014.utt
                    |-- adr_diph1_015.utt
                    |-- adr_diph1_016.utt
                    |-- adr_diph1_017.utt
                    |-- adr_diph1_018.utt
                    |-- adr_diph1_019.utt
                    |-- adr_diph1_020.utt
                    |-- adr_diph1_021.utt
                    |-- adr_diph1_022.utt
                    |-- adr_diph1_023.utt
                    |-- adr_diph1_024.utt
                    |-- adr_diph1_025.utt
                    |-- adr_diph1_026.utt
                    |-- adr_diph1_027.utt
                    |-- adr_diph1_028.utt
                    `-- adr_diph1_029.utt
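
To confirm which utterances lack a composed feature file, one option is to compare file extensions in the cmp folder. The following is a minimal sketch and not part of Ossian: the function name `utterances_missing` and its parameters are hypothetical, and it assumes every utterance has a .wav file in the same folder, as in the listing above.

```python
import os

def utterances_missing(cmp_dir, wanted_ext=".cmp", reference_ext=".wav"):
    """Return sorted utterance names that have a reference file but no wanted file.

    Hypothetical helper: cmp_dir is any folder laid out like the cmp/ folder
    in the listing, with one <name><ext> file per utterance and feature type.
    """
    names = os.listdir(cmp_dir)
    # Utterances for which the reference file (e.g. the .wav) exists.
    have_ref = {n[:-len(reference_ext)] for n in names if n.endswith(reference_ext)}
    # Utterances for which the wanted file (e.g. the .cmp) exists.
    have_wanted = {n[:-len(wanted_ext)] for n in names if n.endswith(wanted_ext)}
    return sorted(have_ref - have_wanted)
```

Called on train/rm/speakers/rss_toy_demo/naive_01_nn/cmp, this would list every utterance for the tree shown here, since no .cmp file is present.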

@oliverwatts
Collaborator

Please attach or mail me your version of the file:

train/rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_001.bap.double

@candlewill
Contributor Author

Here are just the *.bap.double files: cmp_double_files.tar.gz.

All the files in this folder are here: naive_01_nn_cmp_folder.tar.gz

Thanks.

@oliverwatts
Collaborator

oliverwatts commented Aug 25, 2017

It seems that the call to x2x at line 203 of Ossian/scripts/processors/FeatureExtractor.py gave a non-zero exit code, even though the data produced seems fine. Please comment out these lines at 203ff:

    if success != 0:
        print 'conversion of world spectrum to mel cepstra failed on utterance ' + utt.get("utterance_name")
        return          

and replace with:

    print 'bap exit code: %s'%( success)

The same might be necessary with other calls to x2x at lines 211, 218 and 232.

Then clean up:

rm -r $OSSIAN/train/rm/speakers/rss_toy_demo/naive_01_nn/ $OSSIAN/voices/rm/rss_toy_demo/naive_01_nn/

and try again, turning parallelisation off with flag -p 1:

python ./scripts/train.py -s rss_toy_demo -l rm -p 1 naive_01_nn

Let me know if this runs OK, and please paste output here.

@candlewill
Contributor Author

Thanks. Following your advice, I checked each system command called via os.system() and found that the HTK tools were not installed correctly. After reinstalling them, the *.cmp files are generated.
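
For anyone tracing the same problem, the check described above can be sketched roughly as follows. This is not Ossian code: `run_and_report` is a hypothetical helper, it assumes Python 3 (the snippets quoted earlier in this thread are Python 2), and it uses `subprocess.run` rather than `os.system()` because the latter returns an encoded wait status on Unix rather than the plain exit code. POSIX shells conventionally report 127 when a command cannot be found, which is the case that points to a missing tool.

```python
import subprocess

def run_and_report(command):
    """Run a shell command, print any non-zero exit code, and return the code.

    Hypothetical helper for debugging; it never raises on failure.
    """
    result = subprocess.run(command, shell=True)
    code = result.returncode
    if code == 127:
        # Conventional shell exit code for "command not found".
        print("command not found (exit 127): %s" % command)
    elif code != 0:
        print("exit code %s: %s" % (code, command))
    return code
```

Running each external command through a wrapper like this, with parallelisation off, makes it easier to see which tool fails first.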

Thanks again.

@oliverwatts
Collaborator

Commit f071281 addresses this by checking that the necessary executables are in place.
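
The general idea of such a check can be illustrated with a short sketch. This is not the code from the commit: `missing_executables` and its `tool_dir` parameter are hypothetical names, and the sketch assumes Python 3.

```python
import os
import shutil

def missing_executables(names, tool_dir=None):
    """Return the entries of `names` that cannot be found as executables.

    If tool_dir is given, look only in that directory; otherwise search PATH.
    Hypothetical helper, shown for illustration only.
    """
    missing = []
    for name in names:
        if tool_dir is not None:
            path = os.path.join(tool_dir, name)
            # The file must exist and be marked executable.
            found = os.path.isfile(path) and os.access(path, os.X_OK)
        else:
            found = shutil.which(name) is not None
        if not found:
            missing.append(name)
    return missing
```

A training script could call this before doing any work, for example with `["x2x"]` plus whichever HTK binaries the recipe needs, and exit with a clear message if the returned list is non-empty.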
