Training had worked, I stopped it midway, and restarting it raises a new error: AssertionError: If capturable=False, state_steps should not be CUDA tensors. #672

Closed
pzhyyd opened this issue Jul 22, 2022 · 5 comments

Comments

@pzhyyd

pzhyyd commented Jul 22, 2022

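Summary [one-sentence description of the problem]
Training had been running successfully. I stopped it midway, and when I started it again a new error appeared: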

assert not step_t.is_cuda, "If capturable=False, state_steps should not be CUDA tensors."

AssertionError: If capturable=False, state_steps should not be CUDA tensors.

Env & To Reproduce [environment and steps to reproduce]
Python 3.9

Screenshots [if any]
image

@VERT2022

This is a PyTorch version issue. Downgrade CUDA to 11.5 and install the matching PyTorch version, 1.11.0.

@pzhyyd
Author

pzhyyd commented Jul 23, 2022

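> This is a PyTorch version issue. Downgrade CUDA to 11.5 and install the matching PyTorch version, 1.11.0.

Training had already started successfully: it ran from step 75K to 78K, and then I stopped it myself. I updated nothing in between, yet the error appeared as soon as I started training again.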

@pzhyyd
Author

pzhyyd commented Jul 23, 2022

Arguments:
run_id: cjgbtest
syn_dir: E:\cjgb\SV2TTS\synthesizer
models_dir: synthesizer/saved_models/
save_every: 1000
backup_every: 25000
log_every: 200
force_restart: False
hparams:

Checkpoint path: synthesizer\saved_models\cjgbtest\cjgbtest.pt
Loading training data from: E:\cjgb\SV2TTS\synthesizer\train.txt
Using model: Tacotron
Using device: cuda

Initialising Tacotron Model...

\Loading the json with %s
{'sample_rate': 16000, 'n_fft': 800, 'num_mels': 80, 'hop_size': 200, 'win_size': 800, 'fmin': 55, 'min_level_db': -100, 'ref_level_db': 20, 'max_abs_value': 4.0, 'preemphasis': 0.97, 'preemphasize': True, 'tts_embed_dims': 512, 'tts_encoder_dims': 256, 'tts_decoder_dims': 128, 'tts_postnet_dims': 512, 'tts_encoder_K': 5, 'tts_lstm_dims': 1024, 'tts_postnet_K': 5, 'tts_num_highways': 4, 'tts_dropout': 0.5, 'tts_cleaner_names': ['basic_cleaners'], 'tts_stop_threshold': -3.4, 'tts_schedule': [[2, 0.001, 10000, 12], [2, 0.0005, 15000, 12], [2, 0.0002, 20000, 12], [2, 0.0001, 30000, 12], [2, 5e-05, 40000, 12], [2, 1e-05, 60000, 12], [2, 5e-06, 160000, 12], [2, 3e-06, 320000, 12], [2, 1e-06, 640000, 12]], 'tts_clip_grad_norm': 1.0, 'tts_eval_interval': 500, 'tts_eval_num_samples': 1, 'tts_finetune_layers': [], 'max_mel_frames': 900, 'rescale': True, 'rescaling_max': 0.9, 'synthesis_batch_size': 16, 'signal_normalization': True, 'power': 1.5, 'griffin_lim_iters': 60, 'fmax': 7600, 'allow_clipping_in_normalization': True, 'clip_mels_length': True, 'use_lws': False, 'symmetric_mels': True, 'trim_silence': True, 'speaker_embedding_size': 256, 'silence_min_duration_split': 0.4, 'utterance_min_duration': 1.6, 'use_gst': True, 'use_ser_for_gst': True}
Trainable Parameters: 0.000M

Loading weights at synthesizer\saved_models\cjgbtest\cjgbtest.pt
Tacotron weights loaded from step 78000
Using inputs from:
E:\cjgb\SV2TTS\synthesizer\train.txt
E:\cjgb\SV2TTS\synthesizer\mels
E:\cjgb\SV2TTS\synthesizer\embeds
Found 47 samples
+----------------+------------+---------------+------------------+
| Steps with r=2 | Batch Size | Learning Rate | Outputs/Step (r) |
+----------------+------------+---------------+------------------+
| 82k Steps | 6 | 5e-06 | 2 |
+----------------+------------+---------------+------------------+

Traceback (most recent call last):
File "E:\MockingBird-main\synthesizer_train.py", line 37, in <module>
train(**vars(args))
File "E:\MockingBird-main\synthesizer\train.py", line 215, in train
optimizer.step()
File "G:\Anaconda3\lib\site-packages\torch\optim\optimizer.py", line 109, in wrapper
return func(*args, **kwargs)
File "G:\Anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "G:\Anaconda3\lib\site-packages\torch\optim\adam.py", line 157, in step
adam(params_with_grad,
File "G:\Anaconda3\lib\site-packages\torch\optim\adam.py", line 213, in adam
func(params,
File "G:\Anaconda3\lib\site-packages\torch\optim\adam.py", line 255, in _single_tensor_adam
assert not step_t.is_cuda, "If capturable=False, state_steps should not be CUDA tensors."
AssertionError: If capturable=False, state_steps should not be CUDA tensors.
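
The traceback shows the assertion firing inside Adam's `step` on the first update after the checkpoint is restored, which suggests the restored `step` counters ended up on the GPU. As an alternative to reinstalling PyTorch, one possible workaround is to move those counters back to the CPU right after `optimizer.load_state_dict(...)`. The sketch below is not MockingBird's code: the helper name `move_adam_steps_to_cpu` and the checkpoint key in the usage comment are hypothetical, and it assumes the optimizer keeps a per-parameter `step` entry in `optimizer.state`, as `torch.optim.Adam` does.

```python
def move_adam_steps_to_cpu(optimizer):
    """Move any CUDA-resident `step` counters in the optimizer state to the CPU.

    Returns the number of entries that were moved.
    """
    moved = 0
    for state in optimizer.state.values():
        step = state.get("step")
        # Some PyTorch versions store `step` as a plain int, which has no
        # `is_cuda` attribute; getattr() with a default skips those entries.
        if getattr(step, "is_cuda", False):
            state["step"] = step.cpu()
            moved += 1
    return moved


# Hypothetical usage when resuming training (names are placeholders):
#   optimizer.load_state_dict(checkpoint["optimizer_state"])
#   move_adam_steps_to_cpu(optimizer)
```

The helper only touches the `step` entries, so the moment estimates (`exp_avg`, `exp_avg_sq`) stay on the same device as the parameters.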

@pzhyyd
Author

pzhyyd commented Jul 23, 2022

I found a fix.

First, uninstall PyTorch:

In cmd, run pip uninstall torch.

Then reinstall PyTorch. I used an older build, the one for CUDA 11.1
(run the following in cmd):
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
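
After reinstalling, it can help to confirm which build Python actually imports. The sketch below parses a version string of the form reported by `torch.__version__` (for example `1.9.0+cu111`) and checks whether it is older than 1.12. The helper names are hypothetical, and the 1.12 cutoff is an assumption that the `capturable` option and its assertion first appear in that release, which is consistent with the 1.9.0 and 1.11.0 downgrades suggested in this thread.

```python
import re


def parse_torch_version(version):
    """Split a version string like '1.9.0+cu111' into ((1, 9, 0), 'cu111')."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    release = tuple(int(part) for part in match.groups())
    # Everything after '+' is the local build tag (e.g. the CUDA variant), if any.
    _, _, build = version.partition("+")
    return release, build or None


def predates_capturable(version):
    """True if the release is older than 1.12 (assumed cutoff, see the text above)."""
    release, _ = parse_torch_version(version)
    return release < (1, 12, 0)


# With PyTorch installed, pass torch.__version__ instead of a literal string.
print(parse_torch_version("1.9.0+cu111"))   # ((1, 9, 0), 'cu111')
print(predates_capturable("1.9.0+cu111"))   # True
```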

pzhyyd closed this as completed Jul 23, 2022