fastspeech2 training error #203

Closed
mataym opened this issue Aug 11, 2020 · 44 comments
Labels
bug 🐛 Something isn't working question ❓ Further information is requested

Comments

@mataym

mataym commented Aug 11, 2020

I have already created durations with MFA and ran the two preprocessing scripts (tensorflow-tts-preprocess, tensorflow-tts-normalize) without errors. But when I ran the training script, the following error occurred:
2020-08-12 02:19:06,034 (train_fastspeech2:289) INFO: batch_size = 16
2020-08-12 02:19:06,034 (train_fastspeech2:289) INFO: remove_short_samples = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: allow_cache = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: mel_length_threshold = 32
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: is_shuffle = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: optimizer_params = {'initial_learning_rate': 0.001, 'end_learning_rate': 5e-05, 'decay_steps': 150000, 'warmup_proportion': 0.02, 'weight_decay': 0.001}
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: train_max_steps = 200000
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: save_interval_steps = 5000
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: eval_interval_steps = 500
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: log_interval_steps = 200
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: num_save_intermediate_results = 1
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: train_dir = ./dump/train/
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: dev_dir = ./dump/valid/
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: use_norm = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: f0_stat = ./dump/stats_f0.npy
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: energy_stat = ./dump/stats_energy.npy
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: outdir = ./examples/fastspeech2/exp/train.fastspeech2.v1/
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: config = ./examples/fastspeech2/conf/fastspeech2.v1.yaml
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: resume =
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: verbose = 1
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: mixed_precision = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: version = 0.6.1
Traceback (most recent call last):
  File "examples/fastspeech2/train_fastspeech2.py", line 400, in <module>
    main()
  File "examples/fastspeech2/train_fastspeech2.py", line 316, in main
    mel_length_threshold=mel_length_threshold,
  File "/home/speechlab/TensorflowTTS/examples/fastspeech2/fastspeech2_dataset.py", line 104, in __init__
    ), f"Number of charactor, mel, duration, f0 and energy files are different"
AssertionError: Number of charactor, mel, duration, f0 and energy files are different

How do I solve this problem? Can anybody help me? Thanks a lot!

@machineko
Contributor

machineko commented Aug 11, 2020

Run fix_mismatch.py to fix the few-frame differences between the audio and duration files and to save the durations into the dump directory (it's the last step in the MFA example):

https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/mfa_extraction

python examples/mfa_extraction/fix_mismatch.py \
  --base_path ./dump \
  --trimmed_dur_path ./dataset/trimmed-durations \
  --dur_path ./dataset/durations
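
For readers wondering what "fixing a few frames of difference" amounts to, here is a minimal NumPy sketch. It is not the code of fix_mismatch.py; the helper name `fix_duration_mismatch` and the strategy of adjusting durations from the end of the utterance are assumptions made for illustration only.

```python
import numpy as np


def fix_duration_mismatch(durations, mel_length):
    """Return a copy of `durations` whose sum equals `mel_length`.

    Hypothetical helper: surplus frames are removed starting from the last
    symbol, and missing frames are added to the last symbol.
    """
    durs = np.asarray(durations, dtype=np.int64).copy()
    diff = int(durs.sum()) - int(mel_length)
    if diff < 0:
        # Durations are too short: give the missing frames to the last symbol.
        durs[-1] += -diff
        return durs
    # Durations are too long: walk backwards, trimming without going below 0.
    for i in range(len(durs) - 1, -1, -1):
        if diff == 0:
            break
        cut = min(int(durs[i]), diff)
        durs[i] -= cut
        diff -= cut
    return durs
```

After such an adjustment, `durs.sum()` matches the number of mel frames, which is the condition the training code relies on.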

@dathudeptrai
Collaborator

@mataym make sure all duration/charactor/mel/f0/energy files are in ./dump/train and ./dump/valid, and that the number of files is the same for each input type.
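
A quick way to compare the file counts is sketched below. The suffix list is an assumption taken from the file names shown later in this thread (for example `-durations.npy` and `-raw-f0.npy`), and the helper names `count_feature_files` and `counts_match` are hypothetical.

```python
from pathlib import Path

# Suffixes assumed from the file names that appear in this thread.
FEATURE_SUFFIXES = (
    "-ids.npy",
    "-norm-feats.npy",
    "-durations.npy",
    "-raw-f0.npy",
    "-raw-energy.npy",
)


def count_feature_files(root, suffixes=FEATURE_SUFFIXES):
    """Count files under `root` (recursively) for each feature suffix."""
    root = Path(root)
    return {s: sum(1 for _ in root.rglob(f"*{s}")) for s in suffixes}


def counts_match(counts):
    """True when every feature type has the same, non-zero number of files."""
    values = set(counts.values())
    return len(values) == 1 and 0 not in values
```

Calling `count_feature_files("./dump/train")` and `count_feature_files("./dump/valid")` shows at a glance which feature type is missing files.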

@mataym
Author

mataym commented Aug 12, 2020

After I fixed the durations with the mfa_extraction script, I ran the training script (train_fastspeech2.py), but a fatal error occurred:
(tensorflowtts) [speechlab@localhost TensorflowTTS]$ CUDA_VISIBLE_DEVICES=0 python examples/fastspeech2/train_fastspeech2.py \
  --train-dir ./dump/train/ \
  --dev-dir ./dump/valid/ \
  --outdir ./examples/fastspeech2/exp/train.fastspeech2.v1/ \
  --config ./examples/fastspeech2/conf/fastspeech2.v1.yaml \
  --use-norm 1 \
  --f0-stat ./dump/stats_f0.npy \
  --energy-stat ./dump/stats_energy.npy \
  --mixed_precision 1 \
  --resume ""
2020-08-12 11:13:02.070006: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-12 11:13:03.297113: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-08-12 11:13:07.236504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2020-08-12 11:13:07.236595: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-12 11:13:07.239380: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-12 11:13:07.241273: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-12 11:13:07.241620: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-12 11:13:07.244302: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-12 11:13:07.246252: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-12 11:13:07.251879: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-12 11:13:07.255859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow_addons/utils/ensure_tf_install.py:68: UserWarning: Tensorflow Addons supports using Python ops for all Tensorflow versions above or equal to 2.2.0 and strictly below 2.3.0 (nightly versions are not supported).
The versions of TensorFlow you are currently using is 2.3.0 and is not supported.
Some things might work, some things might not.
If you were to encounter a bug, do not file an issue.
If you want to make sure you're using a tested and supported configuration, either change the TensorFlow version or the TensorFlow Addons's version.
You can find the compatibility matrix in TensorFlow Addon's readme:
https://github.com/tensorflow/addons
UserWarning,
2020-08-12 11:13:08.041127: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-12 11:13:08.059890: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2399950000 Hz
2020-08-12 11:13:08.061938: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b70c762e80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-12 11:13:08.062005: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-08-12 11:13:08.232431: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b70c7cf5f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-08-12 11:13:08.232483: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla V100-PCIE-16GB, Compute Capability 7.0
2020-08-12 11:13:08.234590: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2020-08-12 11:13:08.234658: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-12 11:13:08.234700: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-12 11:13:08.234729: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-12 11:13:08.234756: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-12 11:13:08.234783: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-12 11:13:08.234806: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-12 11:13:08.234831: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-12 11:13:08.238585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-12 11:13:08.238629: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-12 11:13:09.155030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-12 11:13:09.155104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-08-12 11:13:09.155144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-08-12 11:13:09.160381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14729 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0000:2d:00.0, compute capability: 7.0)
2020-08-12 11:13:09,179 (train_fastspeech2:289) INFO: hop_size = 256
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: format = npy
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: model_type = fastspeech2
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: fastspeech2_params = {'n_speakers': 1, 'encoder_hidden_size': 384, 'encoder_num_hidden_layers': 4, 'encoder_num_attention_heads': 2, 'encoder_attention_head_size': 192, 'encoder_intermediate_size': 1024, 'encoder_intermediate_kernel_size': 3, 'encoder_hidden_act': 'mish', 'decoder_hidden_size': 384, 'decoder_num_hidden_layers': 4, 'decoder_num_attention_heads': 2, 'decoder_attention_head_size': 192, 'decoder_intermediate_size': 1024, 'decoder_intermediate_kernel_size': 3, 'decoder_hidden_act': 'mish', 'variant_prediction_num_conv_layers': 2, 'variant_predictor_filter': 256, 'variant_predictor_kernel_size': 3, 'variant_predictor_dropout_rate': 0.5, 'num_mels': 80, 'hidden_dropout_prob': 0.2, 'attention_probs_dropout_prob': 0.1, 'max_position_embeddings': 2048, 'initializer_range': 0.02, 'output_attentions': False, 'output_hidden_states': False}
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: batch_size = 16
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: remove_short_samples = True
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: allow_cache = True
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: mel_length_threshold = 32
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: is_shuffle = True
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: optimizer_params = {'initial_learning_rate': 0.001, 'end_learning_rate': 5e-05, 'decay_steps': 150000, 'warmup_proportion': 0.02, 'weight_decay': 0.001}
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: train_max_steps = 200000
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: save_interval_steps = 5000
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: eval_interval_steps = 500
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: log_interval_steps = 200
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: num_save_intermediate_results = 1
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: train_dir = ./dump/train/
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: dev_dir = ./dump/valid/
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: use_norm = True
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: f0_stat = ./dump/stats_f0.npy
2020-08-12 11:13:09,180 (train_fastspeech2:289) INFO: energy_stat = ./dump/stats_energy.npy
2020-08-12 11:13:09,181 (train_fastspeech2:289) INFO: outdir = ./examples/fastspeech2/exp/train.fastspeech2.v1/
2020-08-12 11:13:09,181 (train_fastspeech2:289) INFO: config = ./examples/fastspeech2/conf/fastspeech2.v1.yaml
2020-08-12 11:13:09,181 (train_fastspeech2:289) INFO: resume =
2020-08-12 11:13:09,181 (train_fastspeech2:289) INFO: verbose = 1
2020-08-12 11:13:09,181 (train_fastspeech2:289) INFO: mixed_precision = True
2020-08-12 11:13:09,181 (train_fastspeech2:289) INFO: version = 0.6.1
2020-08-12 11:13:16.322486: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-12 11:13:17.858452: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
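Model: "tf_fast_speech2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embeddings (TFFastSpeechEmbe multiple                  844032
_________________________________________________________________
encoder (TFFastSpeechEncoder multiple                  11814400
_________________________________________________________________
length_regulator (TFFastSpee multiple                  0
_________________________________________________________________
decoder (TFFastSpeechDecoder multiple                  12601216
_________________________________________________________________
mel_before (Dense)           multiple                  30800
_________________________________________________________________
postnet (TFTacotronPostnet)  multiple                  4352400
_________________________________________________________________
f0_predictor (TFFastSpeechVa multiple                  493313
_________________________________________________________________
energy_predictor (TFFastSpee multiple                  493313
_________________________________________________________________
duration_predictor (TFFastSp multiple                  493313
_________________________________________________________________
f0_embeddings (Conv1D)       multiple                  3840
_________________________________________________________________
dropout_32 (Dropout)         multiple                  0
_________________________________________________________________
energy_embeddings (Conv1D)   multiple                  3840
_________________________________________________________________
dropout_33 (Dropout)         multiple                  0
=================================================================
Total params: 31,130,467
Trainable params: 29,552,579
Non-trainable params: 1,577,888
_________________________________________________________________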

[train]: 0%| | 0/200000 [00:00<?, ?it/s]2020-08-12 11:13:24.628420: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1345] No whitelist ops found, nothing to do
2020-08-12 11:13:24.643382: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1345] No whitelist ops found, nothing to do
2020-08-12 11:13:34.486241: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:172] Filling up shuffle buffer (this may take a while): 631 of 2050
2020-08-12 11:13:44.492164: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:172] Filling up shuffle buffer (this may take a while): 1286 of 2050
2020-08-12 11:13:54.491977: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:172] Filling up shuffle buffer (this may take a while): 1912 of 2050
2020-08-12 11:13:57.385583: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:221] Shuffle buffer filled.
/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/framework/indexed_slices.py:432: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-08-12 11:14:20.627617: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1924] Converted 1123/9897 nodes to float16 precision using 113 cast(s) to float16 (excluding Const and Variable casts)
2020-08-12 11:14:24.429316: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1924] Converted 0/8317 nodes to float16 precision using 0 cast(s) to float16 (excluding Const and Variable casts)
Traceback (most recent call last):
  File "examples/fastspeech2/train_fastspeech2.py", line 400, in <module>
    main()
  File "examples/fastspeech2/train_fastspeech2.py", line 392, in main
    resume=args.resume,
  File "/home/speechlab/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 852, in fit
    self.run()
  File "/home/speechlab/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 101, in run
    self._train_epoch()
  File "/home/speechlab/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 123, in _train_epoch
    self._train_step(batch)
  File "/home/speechlab/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 666, in _train_step
    self.one_step_forward(batch)
  File "/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 550, in call
    ctx=ctx)
  File "/home/speechlab/anaconda3/envs/tensorflowtts/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Incompatible shapes: [16,98,384] vs. [16,113,384]
         [[node tf_fast_speech2/add_1 (defined at /home/speechlab/TensorflowTTS/tensorflow_tts/models/fastspeech2.py:181) ]]
         [[tf_fast_speech2/length_regulator/while/LoopCond/_92/_132]]
  (1) Invalid argument: Incompatible shapes: [16,98,384] vs. [16,113,384]
         [[node tf_fast_speech2/add_1 (defined at /home/speechlab/TensorflowTTS/tensorflow_tts/models/fastspeech2.py:181) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference__one_step_forward_45935]

Errors may have originated from an input operation.
Input Source operations connected to node tf_fast_speech2/add_1:
 tf_fast_speech2/encoder/layer_._3/mul (defined at /home/speechlab/TensorflowTTS/tensorflow_tts/models/fastspeech.py:380)

Input Source operations connected to node tf_fast_speech2/add_1:
 tf_fast_speech2/encoder/layer_._3/mul (defined at /home/speechlab/TensorflowTTS/tensorflow_tts/models/fastspeech.py:380)

Function call stack:
_one_step_forward -> _one_step_forward

[train]: 0%| | 0/200000 [01:03<?, ?it/s]

@dathudeptrai
Collaborator

@mataym can you check the 2 things below:

  1. sum(duration) == len(mel) in all files.
  2. every element in each duration file is non-negative (>= 0)

@mataym
Author

mataym commented Aug 12, 2020

@mataym can you check the 2 things below:

  1. sum(duration) == len(mel) in all files.
  2. every element in each duration file is non-negative (>= 0)

Sorry, I'm very new to TTS and I don't know how to do that. Can you explain it in detail or give me a link? Thanks.

@dathudeptrai
Collaborator

@mataym can you show me the file structure of ./dump/train and ./dump/valid?

@mataym
Author

mataym commented Aug 12, 2020

@mataym can you show me the file structure of ./dump/train and ./dump/valid?

The structure of the dump folder in my workspace is as follows:
dump/
├── --config
├── --dev-dir
├── --energy-stat
├── --f0-stat
├── --mixed_precision
├── _o
├── --outdir
├── --resume
├── stats_energy.npy
├── stats_f0.npy
├── stats.npy
├── train
│   ├── fix_dur
│   │   ├── 00000001-durations.npy
│   │   ...
│   │   └── 00002158-durations.npy
│   ├── ids
│   │   ├── 00000001-ids.npy
│   │   ...
│   │   └── 00002158-ids.npy
│   ├── norm-feats
│   │   ├── 00000001-norm-feats.npy
│   │   ...
│   │   └── 00002158-norm-feats.npy
│   ├── raw-energies
│   │   ├── 00000001-raw-energy.npy
│   │   ...
│   │   └── 00002158-raw-energy.npy
│   ├── raw-f0
│   │   ├── 00000001-raw-f0.npy
│   │   ...
│   │   └── 00002158-raw-f0.npy
│   ├── raw-feats
│   │   ├── 00000001-raw-feats.npy
│   │   ...
│   │   └── 00002158-raw-feats.npy
│   └── wavs
│       ├── 00000001-wave.npy
│       ...
│       └── 00002158-wave.npy
├── --train-dir
├── train_utt_ids.npy
├── --use-norm
├── valid
│   ├── fix_dur
│   │   ├── 00000030-durations.npy
│   │   ...
│   │   └── 00002135-durations.npy
│   ├── ids
│   │   ├── 00000030-ids.npy
│   │   ...
│   │   └── 00002135-ids.npy
│   ├── norm-feats
│   │   ├── 00000030-norm-feats.npy
│   │   ...
│   │   └── 00002135-norm-feats.npy
│   ├── raw-energies
│   │   ├── 00000030-raw-energy.npy
│   │   ...
│   │   └── 00002135-raw-energy.npy
│   ├── raw-f0
│   │   ├── 00000030-raw-f0.npy
│   │   ...
│   │   └── 00002135-raw-f0.npy
│   ├── raw-feats
│   │   ├── 00000030-raw-feats.npy
│   │   ...
│   │   └── 00002135-raw-feats.npy
│   └── wavs
│       ├── 00000030-wave.npy
│       ...
│       └── 00002135-wave.npy
└── valid_utt_ids.npy
I have 2158 wav and txt files in my corpus.

@dathudeptrai
Collaborator

dathudeptrai commented Aug 12, 2020

@mataym in your fix_dur, can you load each file and check that every element of each file is a positive value? Also check that sum(np.load("./fix_dur/...-durations.npy")) == len(np.load("./norm-feats/...-norm-feats.npy"))
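
A sketch of how both checks could be run over a whole split is shown below. It assumes the directory layout posted above (`fix_dur/<utt>-durations.npy` next to `norm-feats/<utt>-norm-feats.npy`); the function name `check_durations` and its parameters are hypothetical.

```python
from pathlib import Path

import numpy as np


def check_durations(split_dir, dur_dir="fix_dur", mel_dir="norm-feats"):
    """Return a list of (utt_id, problem) for one split such as ./dump/train.

    Assumes mel files are named "<utt_id>-<mel_dir>.npy", as in the posted tree.
    """
    split_dir = Path(split_dir)
    suffix = "-durations.npy"
    problems = []
    for dur_path in sorted((split_dir / dur_dir).glob(f"*{suffix}")):
        utt_id = dur_path.name[: -len(suffix)]
        mel_path = split_dir / mel_dir / f"{utt_id}-{mel_dir}.npy"
        if not mel_path.exists():
            problems.append((utt_id, "missing mel file"))
            continue
        durations = np.load(dur_path)
        mel = np.load(mel_path)
        if (durations < 0).any():
            problems.append((utt_id, "negative duration"))
        total = int(durations.sum())
        if total != len(mel):
            problems.append((utt_id, f"sum(duration)={total} != len(mel)={len(mel)}"))
    return problems
```

Unlike a bare `assert`, this collects every offending utterance instead of stopping at the first one. Passing `mel_dir="raw-feats"` repeats the check against the unnormalized features.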

@mataym
Author

mataym commented Aug 12, 2020

@mataym in your fix_dur, can you load each file and check that every element of each file is a positive value? Also check that sum(np.load("./fix_dur/...-durations.npy")) == len(np.load("./norm-feats/...-norm-feats.npy"))

  1. I checked the values in the ...-durations.npy files under fix_dur in both train and valid; all of them are positive.
  2. I checked sum(np.load("./fix_dur/...-durations.npy")) == len(np.load("./norm-feats/...-norm-feats.npy")) in the train and valid folders; the sum of each ...-durations.npy equals the length of the matching ...-norm-feats.npy, with no mismatches.

What can I do next?

@machineko
Contributor

machineko commented Aug 12, 2020

Upload the duration and norm-feats files somewhere and send them to me; I'll check them locally

or just run

import os

import numpy as np

for i in ["train", "valid"]:
    for j in os.listdir(f"{i}/fix_dur"):
        assert np.sum(np.load(f"{i}/fix_dur/{j}")) == len(np.load(f"{i}/norm-feats/{j.split('-')[0]}-raw-feats.npy"))

@mataym
Author

mataym commented Aug 12, 2020

for i in ["train", "valid"]:
    for j in os.listdir(f"{i}/fix_dur"):
        assert np.sum(np.load(f"{i}/fix_dur/{j}")) == len(np.load(f"{i}/norm-feats/{j.split('-')[0]}-raw-feats.npy"))

Thanks. raw-feats.npy in your code should be changed to norm-feats.npy. After I ran the program, the assert passed. Does that mean the data preprocessing is OK?

@machineko
Contributor

Yes, it is. Are you using some sort of debugger? If so, check the values in the debugger just before training breaks.

@mataym
Author

mataym commented Aug 12, 2020

In the fix_mismatch.py script, what is --trimmed_dur_path ./trimmed-durations? I have no ./trimmed-durations folder.
python examples/mfa_extraction/fix_mismatch.py \
  --base_path ./dump \
  --trimmed_dur_path ./trimmed-durations \
  --dur_path ./durations

@machineko
Contributor

machineko commented Aug 12, 2020

I don't know where you saved the trimmed durations; that's up to you. If you just follow the MFA extraction steps, everything is in the dataset/ (or libritts) folder.

@mataym
Author

mataym commented Aug 13, 2020

How do I generate the trimmed-durations folder in this project?

@dathudeptrai
Collaborator

@mataym see here (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/preprocess/preprocess_libritts.yaml#L19). You should add trim_mfa: true to your preprocessing config. The trimmed-durations directory is created automatically when you run the preprocessing script (see https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/bin/preprocess.py#L129).

@manmay-nakhashi

manmay-nakhashi commented Aug 13, 2020

@dathudeptrai I got the same error in FastSpeech2 when attempting symbol-based training on my dataset.
When the preprocess.py phonemes and the MFA phonemes are the same, it works fine, but when I switch from phoneme-based to symbol-based training with MFA-extracted durations, it throws this error:

Traceback (most recent call last):
  File "examples/fastspeech2/train_fastspeech2.py", line 411, in <module>
    main()
  File "examples/fastspeech2/train_fastspeech2.py", line 403, in main
    resume=args.resume,
  File "/mnt/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 852, in fit
    self.run()
  File "/mnt/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 101, in run
    self._train_epoch()
  File "/mnt/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 123, in _train_epoch
    self._train_step(batch)
  File "/mnt/TensorflowTTS/tensorflow_tts/trainers/base_trainer.py", line 666, in _train_step
    self.one_step_forward(batch)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 550, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  Incompatible shapes: [16,132,256] vs. [16,156,256]
         [[node tf_fast_speech2/add_1 (defined at /mnt/TensorflowTTS/tensorflow_tts/models/fastspeech2.py:181) ]]
         [[tf_fast_speech2/length_regulator/while/LoopCond/_92/_108]]
  (1) Invalid argument:  Incompatible shapes: [16,132,256] vs. [16,156,256]
         [[node tf_fast_speech2/add_1 (defined at /mnt/TensorflowTTS/tensorflow_tts/models/fastspeech2.py:181) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference__one_step_forward_41923]

Errors may have originated from an input operation.
Input Source operations connected to node tf_fast_speech2/add_1:
 tf_fast_speech2/encoder/layer_._2/mul (defined at /mnt/TensorflowTTS/tensorflow_tts/models/fastspeech.py:380)

Input Source operations connected to node tf_fast_speech2/add_1:
 tf_fast_speech2/encoder/layer_._2/mul (defined at /mnt/TensorflowTTS/tensorflow_tts/models/fastspeech.py:380)

Function call stack:
_one_step_forward -> _one_step_forward

@dathudeptrai
Collaborator

@manmay-nakhashi did you follow the instructions? Please add trim_mfa: true to the preprocessing config; you should also run the fix_mismatch script.

@manmay-nakhashi

@dathudeptrai still same error

@dathudeptrai
Collaborator

dathudeptrai commented Aug 13, 2020

@manmay-nakhashi again, this bug is caused only by a mismatch between the durations and the mel length. Check for yourself that the sum of the durations equals len(mel) (for both raw-feats and norm-feats), and make sure every element in each duration file is positive. You should also check which duration files you are using for training: the durations extracted from the TextGrids or the durations after fix_mismatch. In your log, the bug is here (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/models/fastspeech2.py#L181), which means you should check len(ids), len(f0s) and len(energys). In (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/examples/fastspeech2/fastspeech2_dataset.py#L158) please add the code below:

assert len(charactor) == len(f0) == len(energy)
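
To make the shape error more concrete: the two shapes in the log differ only in the middle (time) axis, which is what you would expect if the number of input ids and the number of duration entries disagree for some utterances. The sketch below reproduces the failure with plain NumPy and adds a hypothetical helper, `find_length_mismatches`, for locating such utterances; loading the arrays from disk is left to the caller.

```python
import numpy as np


def find_length_mismatches(samples):
    """Yield (utt_id, n_ids, n_durations) where the two lengths differ.

    `samples` is any iterable of (utt_id, ids, durations) tuples.
    """
    for utt_id, ids, durations in samples:
        if len(ids) != len(durations):
            yield utt_id, len(ids), len(durations)


# Adding two batches whose time axes differ fails in NumPy as well, which
# mirrors the "Incompatible shapes" error from the training log.
encoder_out = np.zeros((16, 98, 384), dtype=np.float32)
pitch_embedding = np.zeros((16, 113, 384), dtype=np.float32)
try:
    encoder_out + pitch_embedding
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
```

Any utterance reported by `find_length_mismatches` has ids and durations that were produced from different symbol sequences, for example characters on one side and MFA phones on the other.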

@manmay-nakhashi

manmay-nakhashi commented Aug 13, 2020

@dathudeptrai OK, I'll check that. We apply tf_average_by_duration to f0 and energy; do we also have to apply it to charactor?
After doing that it doesn't throw the error, but is it the right thing to do?
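
For readers unfamiliar with the averaging step mentioned here, the following NumPy sketch shows the general idea of reducing a frame-level contour to one value per duration entry. It is not the repository's tf_average_by_duration; the function name `average_by_duration` and the choice to ignore zero-valued (unvoiced) frames are assumptions.

```python
import numpy as np


def average_by_duration(values, durations):
    """Average a frame-level contour over each symbol's span of frames.

    Returns one value per duration entry. Zero-valued frames are ignored;
    a span with no non-zero frames yields 0.
    """
    values = np.asarray(values, dtype=np.float32)
    durations = np.asarray(durations, dtype=np.int64)
    ends = np.cumsum(durations)
    starts = ends - durations
    out = np.zeros(len(durations), dtype=np.float32)
    for i, (start, end) in enumerate(zip(starts, ends)):
        span = values[start:end]
        nonzero = span[span > 0]
        if nonzero.size:
            out[i] = nonzero.mean()
    return out
```

The output length equals `len(durations)`, so the averaged f0 and energy line up with the input ids only when there is exactly one duration entry per id.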

@mataym
Author

mataym commented Aug 13, 2020

In the ljspeech.py file, I cannot understand the valid_symbols list. It has 84 elements. What are they? Are they English phones? As far as I know, English has only 39 phones.
In my case, I have 32 different phones in my-lexicon.txt.
The meta.yaml file in test-g2p-model.zip (the G2P model generated from test-lexicon.txt with MFA) looks as follows:
architecture: phonetisaurus
graphemes: [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z]
phones: [a, b, c, d, ddd, e, f, fff, g, ggg, h, hhh, i, j, jjj, k, kkk, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z]
version: 1.0.0
What configuration should I modify before preprocessing and training the model? Should I replace the valid_symbols elements with my 32 phones? Any advice from the experts would be appreciated. Thanks a lot!

@dathudeptrai
Collaborator

@mataym you just need to change the symbols here (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/processor/ljspeech.py#L110) to your symbols :)). Note that the pad symbol always has id = 0 :D.
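
As an illustration of the "pad has id 0" rule, here is a minimal sketch of a symbol table built from the 32 phones listed in the meta.yaml excerpt above. The names `PAD`, `symbols`, `symbol_to_id` and `phones_to_ids` are hypothetical and are not claimed to match the variables in ljspeech.py.

```python
# The 32 phones listed in the meta.yaml excerpt earlier in this thread.
PHONES = [
    "a", "b", "c", "d", "ddd", "e", "f", "fff", "g", "ggg", "h", "hhh",
    "i", "j", "jjj", "k", "kkk", "l", "m", "n", "o", "p", "q", "r",
    "s", "t", "u", "v", "w", "x", "y", "z",
]

PAD = "pad"  # hypothetical name for the padding symbol

# The padding symbol goes first so that it receives id 0.
symbols = [PAD] + PHONES
symbol_to_id = {s: i for i, s in enumerate(symbols)}
id_to_symbol = {i: s for s, i in symbol_to_id.items()}


def phones_to_ids(phones):
    """Map a sequence of phone strings to integer ids."""
    return [symbol_to_id[p] for p in phones]
```

With this table, padded positions in a batch can be filled with 0 without colliding with any real phone.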

@mataym
Author

mataym commented Aug 13, 2020

@mataym you just need to change the symbols here (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/processor/ljspeech.py#L110) to your symbols :)). Note that the pad symbol always has id = 0 :D.

Hi @dathudeptrai, all of my symbols are already in _letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz", so it is not necessary to add my symbols there, is it? But my phones are different from the English phones.

@dathudeptrai
Collaborator

@mataym will you use phones or characters? You just need to make sure the symbols in the code cover all of your symbols :)).
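
One way to verify that coverage is sketched below. It assumes a lexicon in which each line is a word followed by its whitespace-separated phones; the helper name `find_uncovered_symbols` is hypothetical.

```python
def find_uncovered_symbols(lexicon_lines, symbols):
    """Return the sorted phones used in a lexicon but absent from `symbols`.

    Each lexicon line is assumed to look like "word phone1 phone2 ...".
    """
    known = set(symbols)
    missing = set()
    for line in lexicon_lines:
        parts = line.split()
        # Blank lines give an empty list; parts[0] is the word itself.
        missing.update(p for p in parts[1:] if p not in known)
    return sorted(missing)
```

An empty result means every phone in the lexicon has an entry in the symbol table; anything else lists the symbols that still need to be added.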

@deepConnectionism

i have already created durations with MFA, and also ran well two preprocess script(tensorflow-tts-preprocess, tensorflow-tts-normalize) with no error. but when i ran the train script, there is an error occurred as follows:
2020-08-12 02:19:06,034 (train_fastspeech2:289) INFO: batch_size = 16
2020-08-12 02:19:06,034 (train_fastspeech2:289) INFO: remove_short_samples = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: allow_cache = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: mel_length_threshold = 32
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: is_shuffle = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: optimizer_params = {'initial_learning_rate': 0.001, 'end_learning_rate': 5e-05, 'decay_steps': 150000, 'warmup_proportion': 0.02, 'weight_decay': 0.001}
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: train_max_steps = 200000
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: save_interval_steps = 5000
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: eval_interval_steps = 500
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: log_interval_steps = 200
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: num_save_intermediate_results = 1
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: train_dir = ./dump/train/
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: dev_dir = ./dump/valid/
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: use_norm = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: f0_stat = ./dump/stats_f0.npy
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: energy_stat = ./dump/stats_energy.npy
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: outdir = ./examples/fastspeech2/exp/train.fastspeech2.v1/
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: config = ./examples/fastspeech2/conf/fastspeech2.v1.yaml
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: resume =
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: verbose = 1
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: mixed_precision = True
2020-08-12 02:19:06,035 (train_fastspeech2:289) INFO: version = 0.6.1
Traceback (most recent call last):
  File "examples/fastspeech2/train_fastspeech2.py", line 400, in <module>
    main()
  File "examples/fastspeech2/train_fastspeech2.py", line 316, in main
    mel_length_threshold=mel_length_threshold,
  File "/home/speechlab/TensorflowTTS/examples/fastspeech2/fastspeech2_dataset.py", line 104, in __init__
    ), f"Number of charactor, mel, duration, f0 and energy files are different"
AssertionError: Number of charactor, mel, duration, f0 and energy files are different
How do I solve this problem? Can anybody help me? Thanks a lot!

You can also download the durations extracted at 40k steps at link. Then put them in the appropriate folders.
You can refer to the folder placement here. Step 4: Extract duration from alignments for FastSpeech
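
To see which feature type is short before training, a small sketch like the following can list the utterances missing from each set. The file-name suffixes in `QUERIES` are assumptions about how the dump folder is laid out, and `utterance_ids` and `report_missing` are hypothetical helper names; adjust the patterns to whatever your preprocessing actually wrote.

```python
from pathlib import Path

# Assumed file-name patterns for each feature type; change them to match your dump folder.
QUERIES = {
    "ids": "*-ids.npy",
    "mel": "*-norm-feats.npy",
    "duration": "*-durations.npy",
    "f0": "*-raw-f0.npy",
    "energy": "*-raw-energy.npy",
}


def utterance_ids(root, pattern):
    """Collect utterance ids (file name minus the suffix) found anywhere under root."""
    suffix = pattern.lstrip("*")
    return {p.name[: -len(suffix)] for p in Path(root).rglob(pattern)}


def report_missing(root, queries=QUERIES):
    """For each feature type, list the utterance ids that other types have but it lacks."""
    found = {name: utterance_ids(root, pat) for name, pat in queries.items()}
    all_ids = set().union(*found.values())
    return {name: sorted(all_ids - ids) for name, ids in found.items()}
```

Calling `report_missing("./dump/train/")` returns a dict; a non-empty list under `"duration"`, for example, would mean some utterances have mels but no duration file, which is one way the counts can differ.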

@machineko
Contributor

@Hymnhyz If you're using your own dataset, you need to retrain Tacotron2 for extraction (otherwise it works pretty badly)

@mataym
Author

mataym commented Aug 18, 2020

@manmay-nakhashi again, this bug is only caused by a mismatch between duration and mel length. Check for yourself that the sum of the durations equals len(mel) (for both raw-feats and norm-feats), and make sure every element in each duration file is positive. You should check which duration files you are using for training: the durations extracted from TextGrids or the durations after fixing the mismatch. In your log, the bug is here (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/models/fastspeech2.py#L181), which means you should check len(ids), len(f0s) and len(energys). In (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/examples/fastspeech2/fastspeech2_dataset.py#L158) please add the code below:

assert len(charactor) == len(f0) == len(energy)
  1. I checked that sum(duration) == len(mel) holds in all files; there is no problem.
  2. I checked that every value in each duration file is non-negative (>= 0); there is no problem either.
    Then, following your suggestion, I put assert len(charactor) == len(f0) == len(energy) in the fastspeech2_dataset generator function, and every file raised an error; none of the files passed this assert line.
    I suspect that the problem lies in the generation of the TextGrid files. I don't have an acoustic model for my working language, so I can't use the mfa_align script (bin/mfa_align corpus_directory dictionary_path acoustic_model_path output_directory). I used the mfa_train_and_align script (bin/mfa_train_and_align corpus_directory dictionary_path output_directory) instead, because it can generate TextGrid files without an acoustic model. All other operations are the same as the official method.
    I really don't know how to solve this problem.
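
The checks discussed above can be gathered into one per-utterance helper. This is only a sketch: `check_sample` and its `allow_zero` flag are hypothetical names, and the arrays are assumed to be already loaded (token ids, integer durations, and a mel of shape `[frames, n_mels]`).

```python
import numpy as np


def check_sample(ids, durations, mel, allow_zero=True):
    """Return a list of human-readable problems found in one utterance."""
    durations = np.asarray(durations)
    problems = []
    # One duration per input token.
    if len(ids) != len(durations):
        problems.append(f"len(ids)={len(ids)} != len(durations)={len(durations)}")
    # Durations must add up to the number of mel frames.
    if int(durations.sum()) != len(mel):
        problems.append(f"sum(durations)={int(durations.sum())} != len(mel)={len(mel)}")
    if (durations < 0).any():
        problems.append("negative duration found")
    # Zero-length durations are reported only when the caller asks for strict > 0.
    if not allow_zero and (durations == 0).any():
        problems.append(f"{int((durations == 0).sum())} zero-length duration(s)")
    return problems
```

An empty list means the utterance passed; printing the list for each failing file usually shows quickly whether the ids, the durations or the mel is the odd one out.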

@dathudeptrai
Collaborator

This bug occurs because there is a zero value in the durations :))). So the condition is not >= 0, it should be > 0.

@gbaian10

gbaian10 commented Sep 1, 2020

@mataym can you check the 2 things below:

  1. sum(duration) == len(mel) in all files.
  2. all elements in each duration file are positive (>=0)

@dathudeptrai If my sum(duration) != len(mel), how can I fix it?
I use LJSpeech and your Tacotron2 extracted durations from Google Drive.

@machineko
Contributor

@gbaian10 Just pad the data or change the value of the last duration token to match len(mel)

@gbaian10

gbaian10 commented Sep 1, 2020

@gbaian10 Just pad the data or change the value of the last duration token to match len(mel)
@machineko I don't know how to do it, but I found that their lengths differ by only 1

@machineko
Contributor

@gbaian10 add 1 to the last element in the duration array, or just follow https://github.com/TensorSpeech/TensorFlowTTS/blob/master/examples/mfa_extraction/fix_mismatch.py

@gbaian10

gbaian10 commented Sep 1, 2020

@gbaian10 add 1 to the last element in the duration array, or just follow https://github.com/TensorSpeech/TensorFlowTTS/blob/master/examples/mfa_extraction/fix_mismatch.py

Add 1 to any element?

@machineko
Contributor

@gbaian10 yes, you can add it to any element
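
A rough sketch of that adjustment, assuming integer durations and a known mel length. `match_durations_to_mel` is a hypothetical helper written for this thread; it does not claim to reproduce what fix_mismatch.py does.

```python
import numpy as np


def match_durations_to_mel(durations, mel_len):
    """Return a copy of durations whose sum equals mel_len."""
    durations = np.array(durations, dtype=np.int32)  # np.array copies, input stays untouched
    diff = mel_len - int(durations.sum())
    if diff >= 0:
        # Too short (or already equal): give the missing frames to the last token.
        durations[-1] += diff
        return durations
    # Too long: take frames away starting from the end, never going below zero.
    remaining = -diff
    for i in range(len(durations) - 1, -1, -1):
        take = min(remaining, int(durations[i]))
        durations[i] -= take
        remaining -= take
        if remaining == 0:
            break
    return durations
```

For an off-by-one mismatch this just adds 1 to (or removes 1 from) the last element. Note that trimming can leave a trailing token with duration 0 when the excess is larger than the last duration.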

@dathudeptrai dathudeptrai self-assigned this Sep 2, 2020
@dathudeptrai dathudeptrai added bug 🐛 Something isn't working question ❓ Further information is requested labels Sep 2, 2020
@gbaian10

gbaian10 commented Sep 2, 2020

This bug occurs because there is a zero value in the durations :))). So the condition is not >= 0, it should be > 0.

@dathudeptrai
Including index 0? I have run "fix_mismatch.py".

I downloaded your duration files and ran the fix.
I checked that sum(duration) == len(mel) for all the duration files.
"fix_mismatch.py" fixed the last element by adding 1.
But the element at index 0 is still 0 in some of the .npy files. Even some middle elements are still 0.
How can I fix it?

@dathudeptrai
Collaborator

Can you paste a sample .npy that includes a zero value here?

@gbaian10

gbaian10 commented Sep 2, 2020

How can I send it to you?

@dathudeptrai
Collaborator

Just paste a numpy array here (print it and paste the output here)

@gbaian10

gbaian10 commented Sep 2, 2020

np.load("LJ001-0001-durations.npy")
array([ 1, 5, 4, 6, 7, 6, 15, 3, 9, 18, 2, 8, 1, 2, 2, 3, 5,
7, 8, 6, 4, 5, 8, 9, 13, 4, 3, 4, 3, 4, 2, 2, 4, 2,
2, 4, 7, 3, 3, 4, 5, 5, 4, 5, 5, 3, 4, 5, 3, 5, 2,
7, 5, 6, 5, 2, 2, 6, 1, 7, 7, 13, 8, 8, 5, 7, 10, 20,
11, 6, 4, 3, 3, 13, 6, 9, 10, 4, 4, 2, 3, 7, 9, 9, 7,
5, 3, 4, 6, 6, 6, 7, 4, 3, 5, 3, 3, 7, 7, 9, 6, 6,
1, 1, 3, 6, 8, 5, 6, 2, 4, 3, 1, 4, 4, 4, 6, 11, 9,
10, 2, 6, 5, 10, 5, 3, 6, 4, 6, 4, 4, 5, 4, 9, 5, 5,
1, 2, 2, 3, 3, 10, 8, 6, 4, 5, 6, 5, 3, 6, 25],
dtype=int32)
np.load("LJ001-0002-durations.npy")
array([ 1, 8, 2, 5, 5, 6, 5, 1, 3, 4, 4, 2, 7, 10, 5, 7, 6,
8, 3, 4, 6, 5, 4, 6, 11, 8, 3, 8, 6, 12], dtype=int32)
np.load("LJ001-0003-durations.npy")
array([ 1, 3, 6, 7, 4, 4, 3, 4, 2, 4, 3, 3, 3, 3, 0, 1, 3,
7, 6, 11, 4, 15, 4, 2, 7, 5, 4, 4, 3, 2, 9, 2, 7, 3,
6, 5, 3, 2, 2, 8, 10, 10, 10, 5, 2, 2, 4, 5, 8, 5, 2,
3, 6, 6, 11, 9, 8, 6, 32, 8, 6, 8, 4, 8, 12, 1, 6, 4,
4, 4, 3, 3, 5, 5, 7, 6, 4, 6, 7, 5, 2, 3, 7, 8, 8,
4, 6, 3, 8, 5, 1, 5, 4, 3, 6, 6, 3, 5, 3, 1, 1, 1,
3, 3, 7, 5, 4, 6, 10, 5, 3, 2, 11, 3, 4, 5, 3, 3, 1,
2, 1, 6, 5, 7, 4, 1, 3, 4, 8, 12, 1, 8, 8, 2, 20, 10,
8, 8, 1, 3, 6, 7, 4, 4, 5, 4, 3, 4, 5, 2, 15, 8, 12,
7, 20], dtype=int32)
np.load("LJ001-0004-durations.npy")
array([ 1, 4, 3, 5, 11, 8, 4, 5, 4, 4, 4, 0, 5, 3, 5, 9, 4,
4, 5, 4, 4, 8, 8, 10, 16, 5, 11, 1, 2, 4, 3, 3, 2, 2,
2, 3, 5, 2, 2, 3, 8, 6, 2, 9, 7, 5, 4, 6, 4, 1, 3,
6, 2, 8, 4, 7, 6, 8, 4, 5, 5, 10, 2, 5, 4, 2, 2, 1,
2, 2, 5, 6, 3, 10, 5, 5, 5, 3, 5, 5, 5, 6, 5, 3, 6,
4, 8, 11, 16], dtype=int32)
np.load("LJ001-0005-durations.npy")
array([ 1, 1, 3, 2, 6, 4, 7, 5, 7, 3, 2, 5, 4, 3, 4, 3, 3,
9, 6, 6, 5, 4, 4, 3, 5, 9, 5, 7, 3, 2, 5, 7, 6, 6,
2, 4, 13, 3, 6, 2, 4, 2, 0, 1, 3, 2, 5, 5, 4, 2, 3,
4, 3, 3, 2, 3, 2, 1, 2, 3, 6, 5, 6, 10, 5, 5, 5, 4,
1, 4, 5, 6, 7, 4, 3, 8, 20, 15, 12, 3, 3, 6, 7, 10, 4,
7, 7, 5, 5, 5, 3, 7, 3, 2, 9, 5, 9, 5, 2, 3, 10, 6,
24, 9, 4, 5, 1, 3, 3, 5, 5, 3, 6, 7, 7, 4, 2, 8, 1,
3, 5, 1, 3, 1, 2, 3, 7, 5, 6, 5, 2, 3, 2, 3, 7, 3,
8, 4, 5, 8, 7, 1, 13], dtype=int32)
np.load("LJ001-0006-durations.npy")
array([ 1, 17, 12, 22, 2, 3, 2, 6, 3, 5, 4, 6, 7, 7, 0, 5, 5,
3, 7, 3, 4, 5, 7, 4, 5, 5, 4, 5, 9, 11, 5, 6, 14, 5,
10, 22, 2, 3, 12, 2, 13, 6, 2, 4, 4, 2, 3, 9, 8, 10, 9,
6, 3, 3, 12, 4, 5, 1, 8, 13, 7, 5, 3, 3, 12, 7, 16, 4,
3, 5, 5, 5, 11, 20], dtype=int32)
np.load("LJ001-0007-durations.npy")
array([ 1, 0, 2, 7, 1, 7, 9, 7, 5, 9, 5, 5, 5, 6, 3, 14, 7,
7, 9, 3, 4, 5, 7, 5, 5, 3, 3, 3, 5, 2, 4, 8, 6, 6,
6, 2, 4, 4, 6, 6, 17, 9, 6, 18, 11, 0, 11, 0, 2, 5, 10,
10, 8, 3, 3, 6, 4, 15, 5, 1, 14, 12, 6, 11, 6, 8, 6, 5,
5, 5, 6, 8, 4, 6, 6, 16, 4, 4, 2, 9, 14, 6, 6, 9, 21,
5, 4, 2, 5, 5, 8, 3, 5, 4, 5, 5, 5, 3, 9, 8, 7, 5,
3, 6, 5, 5, 8, 4, 6, 7, 21, 8, 1, 13], dtype=int32)
np.load("LJ001-0008-durations.npy")
array([ 1, 8, 0, 8, 6, 6, 5, 2, 3, 7, 4, 2, 2, 7, 4, 6, 6,
5, 5, 27, 8, 2, 15, 4, 12], dtype=int32)

np.load("LJ001-0009-durations.npy")
array([ 0, 5, 5, 7, 9, 4, 5, 1, 1, 3, 5, 5, 16, 7, 3, 1, 6,
3, 3, 4, 11, 7, 6, 3, 6, 9, 4, 7, 10, 2, 9, 24, 16, 3,
3, 3, 5, 3, 4, 6, 5, 3, 6, 10, 8, 4, 3, 1, 7, 2, 2,
5, 7, 3, 2, 3, 3, 9, 7, 10, 5, 9, 4, 4, 4, 7, 8, 7,
3, 4, 3, 4, 6, 7, 10, 6, 18, 24, 8, 11, 4, 6, 4, 8, 5,
8, 2, 5, 5, 5, 10, 7, 7, 5, 5, 6, 6, 7, 7, 14, 6, 6,
27, 1], dtype=int32)
np.load("LJ001-0036-durations.npy")
array([18, 5, 3, 30, 6, 7, 6, 5, 3, 0, 4, 0, 3, 6, 8, 11, 6,
2, 5, 5, 7, 10, 19, 16, 21, 8, 5, 5, 6, 10, 9, 4, 2, 9,
2, 4, 7, 4, 9, 11, 7, 6, 4, 14, 6, 13, 6, 6, 4, 17, 2,
4, 3, 3, 4, 7, 3, 9, 4, 6, 8, 4, 3, 2, 4, 6, 5, 15,
1, 5, 3, 2, 1, 3, 4, 2, 3, 5, 9, 4, 3, 8, 4, 5, 9,
4, 4, 8, 5, 6, 6, 7, 11, 7, 7, 20, 1], dtype=int32)
np.load("LJ001-0048-durations.npy")
array([ 0, 4, 9, 5, 5, 5, 5, 3, 2, 6, 4, 10, 3, 10, 15, 4, 6,
3, 5, 3, 4, 8, 7, 5, 6, 6, 7, 5, 4, 8, 9, 4, 6, 4,
7, 9, 6, 6, 4, 8, 10, 18, 11, 6, 8, 6, 12, 4, 4, 5, 6,
4, 7, 5, 6, 1, 5, 6, 3, 6, 4, 6, 4, 5, 7, 5, 5, 10,
2, 8, 4, 4, 5, 4, 4, 4, 6, 3, 4, 3, 1, 4, 5, 12, 5,
4, 3, 6, 6, 16, 3, 19, 2], dtype=int32)
np.load("LJ001-0051-durations.npy")
array([10, 7, 6, 4, 7, 7, 4, 3, 5, 2, 4, 3, 3, 4, 6, 5, 0,
6, 2, 5, 8, 5, 3, 2, 5, 10, 2, 5, 3, 1, 2, 1, 4, 3,
6, 6, 10, 1, 6, 4, 2, 6, 11, 7, 31, 2, 6, 6, 14, 3, 6,
5, 4, 4, 6, 6, 10, 12, 6, 6, 5, 3, 5, 2, 5, 4, 4, 7,
5, 7, 7, 9, 6, 17, 1], dtype=int32)
np.load("LJ001-0064-durations.npy")
array([ 5, 3, 12, 8, 3, 4, 3, 1, 3, 4, 4, 7, 5, 4, 5, 15, 9,
5, 8, 1, 0, 5, 4, 5, 14, 9, 3, 2, 4, 2, 4, 5, 2, 3,
3, 2, 9, 4, 1, 4, 2, 1, 4, 0, 3, 5, 9, 7, 7, 11, 8,
7, 7, 7, 1, 8, 10, 8, 5, 20, 23, 2, 5, 3, 3, 3, 2, 3,
4, 3, 5, 4, 9, 6, 9, 7, 3, 5, 4, 5, 5, 4, 5, 3, 5,
6, 4, 7, 5, 6, 8, 17, 1], dtype=int32)
np.load("LJ001-0100-durations.npy")
array([ 2, 3, 3, 3, 2, 3, 7, 7, 4, 12, 7, 6, 4, 2, 4, 2, 3,
6, 4, 11, 7, 12, 2, 12, 8, 7, 3, 4, 6, 6, 7, 28, 7, 10,
3, 13, 8, 4, 3, 4, 5, 4, 5, 2, 7, 5, 4, 6, 1, 5, 1,
2, 2, 4, 5, 3, 8, 8, 3, 3, 4, 5, 4, 13, 5, 5, 10, 3,
20, 12, 2, 4, 3, 3, 3, 3, 2, 1, 0, 4, 3, 5, 8, 7, 4,
5, 5, 6, 4, 4, 13, 4, 6, 7, 2, 4, 3, 4, 3, 5, 4, 6,
4, 2, 4, 6, 4, 23, 10, 3, 5, 6, 3, 2, 3, 6, 2, 10, 4,
4, 4, 2, 10, 5, 4, 4, 5, 9, 4, 42, 0, 1], dtype=int32)
np.load("LJ001-0120-durations.npy")
array([ 0, 7, 2, 2, 3, 3, 5, 6, 11, 6, 4, 5, 5, 11, 10, 7, 39,
4, 12, 2, 5, 11, 2, 6, 5, 3, 4, 9, 8, 5, 6, 5, 4, 3,
6, 2, 6, 7, 12, 3, 5, 3, 5, 3, 4, 5, 4, 4, 3, 4, 5,
7, 6, 6, 4, 15, 11, 9, 14, 7, 12, 17, 3, 6, 4, 5, 6, 6,
6, 5, 4, 7, 1, 9, 5, 4, 2, 4, 4, 4, 6, 5, 5, 11, 5,
5, 2, 5, 4, 4, 3, 3, 2, 2, 4, 5, 10, 2, 2, 4, 18, 1],
dtype=int32)

@dathudeptrai
Collaborator

@gbaian10 can you calculate how many samples have a zero value? If it's not many, you can ignore those samples :D

@gbaian10

gbaian10 commented Sep 3, 2020

can you calculate how many samples have a zero value? If it's not many, you can ignore those samples :D

Out of 12445 training files, a total of 7237 have a zero value.
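
A count like this can be produced with a few lines of NumPy. The sketch below assumes the duration files sit in one directory and end in `-durations.npy`, as in the file names printed earlier; `count_zero_duration_files` is a hypothetical helper name.

```python
from pathlib import Path

import numpy as np


def count_zero_duration_files(duration_dir, pattern="*-durations.npy"):
    """Return (number of duration files, names of the files containing a zero)."""
    files = sorted(Path(duration_dir).glob(pattern))
    with_zero = [f.name for f in files if (np.load(f) == 0).any()]
    return len(files), with_zero
```

The second return value is the list of affected file names, so its length is the number of samples with at least one zero-length duration.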

@dathudeptrai
Collaborator

@gbaian10 can you paste the error you get when training fs?

@dathudeptrai
Collaborator

dathudeptrai commented Sep 3, 2020

@gbaian10 sorry, I think everything is OK: even if the duration of some characters is zero, the shapes still match :)). I will close the issue now; please open a new issue if needed.
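
Why a zero duration does not break the shapes can be illustrated with a NumPy analogue of duration-based expansion. This is a sketch of the general idea only, not the model's own length regulator; `expand_by_duration` is a hypothetical name.

```python
import numpy as np


def expand_by_duration(char_features, durations):
    """Repeat each character-level row durations[i] times along the time axis."""
    return np.repeat(char_features, durations, axis=0)
```

A row with duration 0 is simply repeated zero times, so the expanded length is still `sum(durations)`, which is what has to equal `len(mel)`.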
