ecooper@kucing /proj/tts/tools/Ossian $ python ./scripts/train.py -s rss_toy_demo -l rm -p 1 naive_01_nn -- Gather corpus -- Train voice /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn /proj/tts/tools/Ossian/voices//rm/rss_toy_demo/naive_01_nn try loading config from python... /proj/tts/tools/Ossian/recipes/naive_01_nn.cfg {'state_contexts': [('start_time', './attribute::start'), ('end_time', './attribute::end'), ('htk_state', 'count(./preceding-sibling::state) + 1'), ('htk_monophone', './ancestor::segment/attribute::pronunciation'), ('ll_segment', './ancestor::segment/preceding::segment[2]/attribute::pronunciation'), ('l_segment', './ancestor::segment/preceding::segment[1]/attribute::pronunciation'), ('c_segment', './ancestor::segment/attribute::pronunciation'), ('r_segment', './ancestor::segment/following::segment[1]/attribute::pronunciation'), ('rr_segment', './ancestor::segment/following::segment[2]/attribute::pronunciation'), ('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', "count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), ('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 
'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), 
('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'dur_label_maker': , 'SKLDecisionTreePausePredictor': , 'train_stages': [[, , ], [, , , , , ], [, , , ]], 'PUNC_PATT': '[\\p{C}||\\p{P}||\\p{S}]', 'JUNCTURE_NODES': "//token[@token_class='space'] | //token[@token_class='punctuation']", 'WorldExtractor': , 'RegexTokeniser': , 'current_dir': '/proj/tts/tools/Ossian/recipes', 'phrase_adder': , 'dur_data_maker': , 'pause_predictor': , 
'speech_generation': [, , , ], 'runtime_stages': [[, , ], [, ], [, , , ]], 'text_proc': [, , ], 'acoustic_predictor': , 'duration_predictor': , 'NNAcousticPredictor': , 'align_label_dumper': , 'pause_predictor_features': [('response', './attribute::has_silence="yes"'), ('token_is_punctuation', './attribute::token_class="punctuation"'), ('since_start_utterance_in_words', "count(preceding::token[@token_class='word'])"), ('till_end_utterance_in_words', "count(following::token[@token_class='word'])"), ('L_vsm_d1', "./preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './attribute::vsm_d1'), ('R_vsm_d1', "./following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './attribute::vsm_d2'), ('R_vsm_d2', "./following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './attribute::vsm_d3'), ('R_vsm_d3', "./following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './attribute::vsm_d4'), ('R_vsm_d4', "./following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './attribute::vsm_d5'), ('R_vsm_d5', "./following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './attribute::vsm_d6'), ('R_vsm_d6', "./following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './attribute::vsm_d7'), ('R_vsm_d7', "./following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './attribute::vsm_d8'), ('R_vsm_d8', 
"./following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './attribute::vsm_d9'), ('R_vsm_d9', "./following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './attribute::vsm_d10'), ('R_vsm_d10', "./following::token[@token_class='word'][1]/attribute::vsm_d10")], 'PhraseMaker': , 'alignment': [, , , , , ], 'VSMTagger': , 'duration_data_contexts': [('state_1_nframes', '(./state[1]/attribute::end - ./state[1]/attribute::start) div 5'), ('state_2_nframes', '(./state[2]/attribute::end - ./state[2]/attribute::start) div 5'), ('state_3_nframes', '(./state[3]/attribute::end - ./state[3]/attribute::start) div 5'), ('state_4_nframes', '(./state[4]/attribute::end - ./state[4]/attribute::start) div 5'), ('state_5_nframes', '(./state[5]/attribute::end - ./state[5]/attribute::start) div 5')], 'phone_and_state_contexts': [('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', "count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), ('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 
'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), 
('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', './ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'word_vector_tagger': , 'inspect': , 'tokenisation_pattern': '(\\p{Z}*[\\p{C}||\\p{P}||\\p{S}]*\\p{Z}+|\\p{Z}*[\\p{C}||\\p{P}||\\p{S}]+\\Z)', 'aligner': , 'sys': , 'dnn_label_maker': , 'NaivePhonetiser': , 'tokeniser': , 'LETTER_PATT': '[\\p{L}||\\p{N}||\\p{M}]', 'AcousticModelWorld': , 'speech_feature_extractor': , 'dim': 10, 'c': , 'FeatureDumper': , 'word_vsm_dim': 10, 'speech_coding_config': 
{'delta_delta_window': '1.0 -2.0 1.0', 'static_window': '1', 'order': 59, 'delta_window': '-0.5 0.0 0.5'}, 'pause_prediction': [, ], 'i': 5, 'SPACE_PATT': '\\p{Z}', 'PUNC_OR_SPACE_PATT': '[\\p{Z}||\\p{C}||\\p{P}||\\p{S}]', 'phonetiser': , 'NNDurationPredictor': , 'os': , 'phone_contexts': [('htk_monophone', './attribute::pronunciation'), ('start_time', './attribute::start'), ('end_time', './attribute::end'), ('ll_segment', 'preceding::segment[2]/attribute::pronunciation'), ('l_segment', 'preceding::segment[1]/attribute::pronunciation'), ('c_segment', './attribute::pronunciation'), ('r_segment', 'following::segment[1]/attribute::pronunciation'), ('rr_segment', 'following::segment[2]/attribute::pronunciation'), ('length_left_word', "count(ancestor::token/preceding::token[@token_class='word'][1]/descendant::segment)"), ('length_current_word', 'count(ancestor::token/descendant::segment)'), ('length_right_word', "count(ancestor::token/following::token[@token_class='word'][1]/descendant::segment)"), ('since_beginning_of_word', "count_Xs_since_start_Y('segment', 'token')"), ('till_end_of_word', "count_Xs_till_end_Y('segment', 'token')"), ('length_l_phrase_in_words', "count(ancestor::phrase/preceding::phrase[1]/descendant::token[@token_class='word'])"), ('length_c_phrase_in_words', "count(ancestor::phrase/descendant::token[@token_class='word'])"), ('length_r_phrase_in_words', "count(ancestor::phrase/following::phrase[1]/descendant::token[@token_class='word'])"), ('length_l_phrase_in_segments', 'count(ancestor::phrase/preceding::phrase[1]/descendant::segment)'), ('length_c_phrase_in_segments', 'count(ancestor::phrase/descendant::segment)'), ('length_r_phrase_in_segments', 'count(ancestor::phrase/following::phrase[1]/descendant::segment)'), ('since_phrase_start_in_segs', "count_Xs_since_start_Y('segment', 'phrase')"), ('till_phrase_end_in_segs', "count_Xs_till_end_Y('segment', 'phrase')"), ('since_phrase_start_in_words', 
'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'phrase\')'), ('till_phrase_end_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'phrase\')'), ('since_start_sentence_in_segments', "count_Xs_since_start_Y('segment', 'utt')"), ('since_start_sentence_in_words', 'count_Xs_since_start_Y(\'token[@token_class="word"]\', \'utt\')'), ('since_start_sentence_in_phrases', "count_Xs_since_start_Y('phrase', 'utt')"), ('till_end_sentence_in_segments', "count_Xs_till_end_Y('segment', 'utt')"), ('till_end_sentence_in_words', 'count_Xs_till_end_Y(\'token[@token_class="word"]\', \'utt\')'), ('till_end_sentence_in_phrases', "count_Xs_till_end_Y('phrase', 'utt')"), ('length_sentence_in_segments', 'count(ancestor::utt/descendant::segment)'), ('length_sentence_in_words', "count(ancestor::utt/descendant::token[@token_class='word'])"), ('length_sentence_in_phrases', 'count(ancestor::utt/descendant::phrase)'), ('L_vsm_d1', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d1"), ('C_vsm_d1', './ancestor::token/attribute::vsm_d1'), ('R_vsm_d1', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d1"), ('L_vsm_d2', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d2"), ('C_vsm_d2', './ancestor::token/attribute::vsm_d2'), ('R_vsm_d2', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d2"), ('L_vsm_d3', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d3"), ('C_vsm_d3', './ancestor::token/attribute::vsm_d3'), ('R_vsm_d3', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d3"), ('L_vsm_d4', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d4"), ('C_vsm_d4', './ancestor::token/attribute::vsm_d4'), ('R_vsm_d4', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d4"), ('L_vsm_d5', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d5"), ('C_vsm_d5', 
'./ancestor::token/attribute::vsm_d5'), ('R_vsm_d5', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d5"), ('L_vsm_d6', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d6"), ('C_vsm_d6', './ancestor::token/attribute::vsm_d6'), ('R_vsm_d6', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d6"), ('L_vsm_d7', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d7"), ('C_vsm_d7', './ancestor::token/attribute::vsm_d7'), ('R_vsm_d7', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d7"), ('L_vsm_d8', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d8"), ('C_vsm_d8', './ancestor::token/attribute::vsm_d8'), ('R_vsm_d8', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d8"), ('L_vsm_d9', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d9"), ('C_vsm_d9', './ancestor::token/attribute::vsm_d9'), ('R_vsm_d9', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d9"), ('L_vsm_d10', "./ancestor::token/preceding::token[@token_class='word'][1]/attribute::vsm_d10"), ('C_vsm_d10', './ancestor::token/attribute::vsm_d10'), ('R_vsm_d10', "./ancestor::token/following::token[@token_class='word'][1]/attribute::vsm_d10")], 'StateAligner': } train Cannot load NN model from model_dir: /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/duration_predictor -- not trained yet Cannot load NN model from model_dir: /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/acoustic_predictor -- not trained yet == Train voice (proc no. 1 (word_splitter)) == Train processor word_splitter RegexTokeniser requires no training Applying processor word_splitter p p p p p p p p p p p p p p p p p p p p p p p p p p p p p == Train voice (proc no. 
2 (segment_adder)) == Train processor segment_adder NaivePhonetiser requires no training Applying processor segment_adder p p p p p p p p p p p p p p p p p p p p p p p p p p p p p == Train voice (proc no. 3 (word_vector_tagger)) == Train processor word_vector_tagger Count types... Assemble cooccurance matrix... Factorise cooccurance matrix... Write output to /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/word_vector_tagger/table_file.table Applying processor word_vector_tagger p p p p p p p p p p p p p p p p p p p p p p p p p p p p p == Train voice (proc no. 4 (feature_dumper)) == Train processor feature_dumper Applying processor feature_dumper p p p p p p p p p p p p p p p p p p p p p p p p p p p p p == Train voice (proc no. 5 (acoustic_feature_extractor)) == Train processor acoustic_feature_extractor Applying processor acoustic_feature_extractor Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_001.sp.double! File information Sampling : 48000 Hz 16 Bit Length 112013 [sample] Length 2.333604 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_001.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_001 DIO: 106 [msec] StoneMask: 49 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_002.sp.double! File information Sampling : 48000 Hz 16 Bit Length 87229 [sample] Length 1.817271 [sec] Analysis CheapTrick: 185 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_002.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_002 DIO: 70 [msec] StoneMask: 40 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_003.sp.double! 
File information Sampling : 48000 Hz 16 Bit Length 130360 [sample] Length 2.715833 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_003.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_003 CheapTrick: 183 [msec] D4C: 301 [msec] DIO: 131 [msec] StoneMask: 53 [msec] 467 2048 5 complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_004.sp.double! D4C: 263 [msec] CheapTrick: 205 [msec] File information Sampling : 48000 Hz 16 Bit Length 303852 [sample] Length 6.330250 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_004.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_004 364 2048 5 complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_005.sp.double! Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_005.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_005 File information Sampling : 48000 Hz 16 Bit Length 315102 [sample] Length 6.564625 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_006.sp.double! File information Sampling : 48000 Hz 16 Bit Length 142160 [sample] Length 2.961667 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_006.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_006 Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_007.sp.double! D4C: 282 [msec] File information Sampling : 48000 Hz 16 Bit Length 110450 [sample] Length 2.301042 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_007.bap.double! 
p double -> float conversion (stream: bap) failed on utterance adr_diph1_007 544 2048 5 complete. DIO: 93 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_008.sp.double! File information Sampling : 48000 Hz 16 Bit Length 111048 [sample] Length 2.313500 [sec] Analysis StoneMask: 46 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_008.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_008 Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_009.sp.double! DIO: 78 [msec] File information Sampling : 48000 Hz 16 Bit Length 92057 [sample] Length 1.917854 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_009.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_009 DIO: 337 [msec] StoneMask: 51 [msec] DIO: 69 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_010.sp.double! StoneMask: 66 [msec] CheapTrick: 187 [msec] StoneMask: 36 [msec] File information Sampling : 48000 Hz 16 Bit Length 229499 [sample] Length 4.781229 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_010.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_010 DIO: 656 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_011.sp.double! File information Sampling : 48000 Hz 16 Bit Length 146776 [sample] Length 3.057833 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_011.bap.double! 
p double -> float conversion (stream: bap) failed on utterance adr_diph1_011 CheapTrick: 190 [msec] DIO: 138 [msec] CheapTrick: 154 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_012.sp.double! File information Sampling : 48000 Hz 16 Bit Length 139879 [sample] Length 2.914146 [sec] Analysis CheapTrick: 217 [msec] StoneMask: 160 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_012.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_012 StoneMask: 119 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_013.sp.double! D4C: 294 [msec] File information Sampling : 48000 Hz 16 Bit Length 103000 [sample] Length 2.145833 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_013.bap.double! DIO: 819 [msec] D4C: 218 [msec] p double -> float conversion (stream: bap) failed on utterance adr_diph1_013 DIO: 90 [msec] 461 2048 5 complete. D4C: 280 [msec] DIO: 326 [msec] StoneMask: 50 [msec] 384 2048 5 complete. StoneMask: 139 [msec] StoneMask: 76 [msec] DIO: 317 [msec] 463 2048 5 complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_014.sp.double! File information Sampling : 48000 Hz 16 Bit Length 115876 [sample] Length 2.414083 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_014.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_014 D4C: 383 [msec] StoneMask: 72 [msec] CheapTrick: 343 [msec] CheapTrick: 167 [msec] DIO: 74 [msec] CheapTrick: 448 [msec] 593 2048 5 StoneMask: 59 [msec] complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_015.sp.double! 
File information Sampling : 48000 Hz 16 Bit Length 107215 [sample] Length 2.233646 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_015.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_015 CheapTrick: 237 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_016.sp.double! File information Sampling : 48000 Hz 16 Bit Length 90878 [sample] Length 1.893292 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_016.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_016 CheapTrick: 212 [msec] DIO: 119 [msec] DIO: 71 [msec] StoneMask: 49 [msec] CheapTrick: 191 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_017.sp.double! File information Sampling : 48000 Hz 16 Bit Length 121510 [sample] Length 2.531458 [sec] Analysis StoneMask: 36 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_017.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_017 D4C: 316 [msec] DIO: 68 [msec] 430 2048 5 complete. StoneMask: 51 [msec] CheapTrick: 525 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_018.sp.double! CheapTrick: 133 [msec] File information Sampling : 48000 Hz 16 Bit Length 207282 [sample] Length 4.318375 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_018.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_018 CheapTrick: 175 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_019.sp.double! 
File information Sampling : 48000 Hz 16 Bit Length 107215 [sample] Length 2.233646 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_019.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_019 DIO: 130 [msec] CheapTrick: 179 [msec] DIO: 66 [msec] D4C: 447 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_020.sp.double! D4C: 398 [msec] File information Sampling : 48000 Hz 16 Bit Length 121510 [sample] Length 2.531458 [sec] Analysis D4C: 348 [msec] StoneMask: 53 [msec] StoneMask: 94 [msec] D4C: 234 [msec] D4C: 676 [msec] DIO: 100 [msec] D4C: 280 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_020.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_020 612 2048 5 complete. StoneMask: 64 [msec] 583 2048 5 complete. CheapTrick: 154 [msec] 483 2048 5 complete. D4C: 334 [msec] D4C: 873 [msec] CheapTrick: 300 [msec] 379 2048 5 complete. 447 2048 5 complete. CheapTrick: 196 [msec] 957 2048 5 complete. D4C: 324 [msec] 507 2048 5 complete. 1267 2048 5 complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_021.sp.double! 447 2048 5 complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_021.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_021 File information Sampling : 48000 Hz 16 Bit Length 124573 [sample] Length 2.595271 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_022.sp.double! D4C: 841 [msec] Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_022.bap.double! 
p double -> float conversion (stream: bap) failed on utterance adr_diph1_022 File information Sampling : 48000 Hz 16 Bit Length 82708 [sample] Length 1.723083 [sec] Analysis DIO: 90 [msec] DIO: 86 [msec] D4C: 383 [msec] StoneMask: 59 [msec] StoneMask: 40 [msec] 1313 2048 5 complete. 507 2048 5 complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_023.sp.double! Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_023.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_023 File information Sampling : 48000 Hz 16 Bit Length 202177 [sample] Length 4.212021 [sec] Analysis CheapTrick: 135 [msec] D4C: 574 [msec] CheapTrick: 233 [msec] 864 2048 5 complete. Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_024.sp.double! Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_024.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_024 File information Sampling : 48000 Hz 16 Bit Length 236894 [sample] Length 4.935292 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_025.sp.double! Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_025.bap.double! p double -> float conversion (stream: bap) failed on utterance adr_diph1_025 File information Sampling : 48000 Hz 16 Bit Length 156372 [sample] Length 3.257750 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_026.sp.double! File information Sampling : 48000 Hz 16 Bit Length 83729 [sample] Length 1.744354 [sec] Analysis Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_026.bap.double! 
p double -> float conversion (stream: bap) failed on utterance adr_diph1_026
D4C: 244 [msec]
DIO: 132 [msec]
File information
Sampling : 48000 Hz 16 Bit
Length 111299 [sample]
Length 2.318729 [sec]
Analysis
Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_027.sp.double!
DIO: 68 [msec]
DIO: 290 [msec]
Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_027.bap.double!
345 2048 5 complete.
StoneMask: 32 [msec]
p double -> float conversion (stream: bap) failed on utterance adr_diph1_027
DIO: 143 [msec]
DIO: 73 [msec]
StoneMask: 106 [msec]
StoneMask: 59 [msec]
StoneMask: 42 [msec]
Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_028.sp.double!
StoneMask: 103 [msec]
File information
Sampling : 48000 Hz 16 Bit
Length 70456 [sample]
Length 1.467833 [sec]
Analysis
Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_028.bap.double!
p double -> float conversion (stream: bap) failed on utterance adr_diph1_028
CheapTrick: 113 [msec]
DIO: 59 [msec]
StoneMask: 27 [msec]
D4C: 403 [msec]
520 2048 5 complete.
CheapTrick: 181 [msec]
CheapTrick: 99 [msec]
CheapTrick: 219 [msec]
Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_029.sp.double!
Cannot open file /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp/adr_diph1_029.bap.double!
D4C: 190 [msec]
p double -> float conversion (stream: bap) failed on utterance adr_diph1_029
File information
Sampling : 48000 Hz 16 Bit
Length 198092 [sample]
Length 4.126917 [sec]
Analysis
== Train voice (proc no. 6 (aligner)) ==
Train processor aligner
349 2048 5 complete.
CheapTrick: 293 [msec]
CheapTrick: 360 [msec]
D4C: 186 [msec]
Training aligner -- see /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/processors/aligner/training/log.txt
294 2048 5 complete.
D4C: 271 [msec]
DIO: 258 [msec]
464 2048 5 complete.
StoneMask: 77 [msec]
D4C: 379 [msec]
652 2048 5 complete.
set_up_data.py: No matching data files found in /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/align_lab and /proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn/cmp
Aligner training failed
CheapTrick: 237 [msec]
ecooper@kucing /proj/tts/tools/Ossian $ D4C: 564 [msec]
843 2048 5 complete.
D4C: 796 [msec]
988 2048 5 complete.
D4C: 521 [msec]
826 2048 5 complete.
ecooper@kucing /proj/tts/tools/Ossian $
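
The last error, from set_up_data.py, says no matching data files were found in the align_lab and cmp directories. One way to check this by hand is to list both directories and compare utterance names. The sketch below is a standalone diagnostic and not part of Ossian. The function name `matching_utterances` is hypothetical, and it assumes that files are paired by the part of the filename before the first dot (for example `adr_diph1_001`), which is a guess about how set_up_data.py matches them.

```python
import os


def utterance_names(directory):
    """Return the set of utterance names found in `directory`.

    Assumption: the utterance name is everything before the first dot,
    so 'adr_diph1_001.sp.double' -> 'adr_diph1_001'.
    A missing directory is treated as empty.
    """
    if not os.path.isdir(directory):
        return set()
    names = set()
    for entry in os.listdir(directory):
        if os.path.isfile(os.path.join(directory, entry)):
            names.add(entry.split(".", 1)[0])
    return names


def matching_utterances(dir_a, dir_b):
    """Return a sorted list of utterance names present in both directories."""
    return sorted(utterance_names(dir_a) & utterance_names(dir_b))


if __name__ == "__main__":
    # Paths copied from the log above.
    base = "/proj/tts/tools/Ossian/train//rm/speakers/rss_toy_demo/naive_01_nn"
    lab_dir = os.path.join(base, "align_lab")
    cmp_dir = os.path.join(base, "cmp")
    print("align_lab utterances: {0}".format(len(utterance_names(lab_dir))))
    print("cmp utterances: {0}".format(len(utterance_names(cmp_dir))))
    print("in both: {0}".format(len(matching_utterances(lab_dir, cmp_dir))))
```

If either count is zero, the earlier "Cannot open file ... .sp.double!" and "double -> float conversion ... failed" messages are the likely place to look, since they suggest the acoustic feature files were never written to cmp.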