Is GPU acceleration being used? #95

Open
xiatian815 opened this issue Dec 18, 2023 · 5 comments
Comments

@xiatian815

Args in experiment:
Namespace(activation='gelu', affine=0, batch_size=128, c_out=7, checkpoints='./checkpoints/', d_ff=256, d_layers=1, d_model=128, data='custom', data_path='weather.csv', dec_in=7, decomposition=0, des='Exp', devices='0,1,2,3', distil=True, do_predict=False, dropout=0.2, e_layers=3, embed='timeF', embed_type=0, enc_in=21, factor=1, fc_dropout=0.2, features='M', freq='h', gpu=0, head_dropout=0.0, individual=0, is_training=1, itr=1, kernel_size=25, label_len=48, learning_rate=0.0001, loss='mse', lradj='type3', model='PatchTST', model_id='336_96', moving_avg=25, n_heads=16, num_workers=10, output_attention=False, padding_patch='end', patch_len=16, patience=20, pct_start=0.3, pred_len=96, random_seed=2021, revin=1, root_path='./dataset/', seq_len=336, stride=8, subtract_last=0, target='OT', test_flop=False, train_epochs=100, use_amp=False, use_gpu=True, use_multi_gpu=False)
Use GPU: cuda:0

start training : 336_96_PatchTST_custom_ftM_sl336_ll48_pl96_dm128_nh16_el3_dl1_df256_fc1_ebtimeF_dtTrue_Exp_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 36456
val 5175
test 10444
iters: 100, epoch: 1 | loss: 0.6968859
speed: 0.8145s/iter; left time: 23051.8019s
iters: 200, epoch: 1 | loss: 0.6983635
speed: 0.6318s/iter; left time: 17817.0645s
Epoch: 1 cost time: 198.10548210144043
Epoch: 1, Steps: 284 | Train Loss: 0.7457502 Vali Loss: 0.5418562 Test Loss: 0.2217388
Validation loss decreased (inf --> 0.541856). Saving model ...
Updating learning rate to 0.0001
iters: 100, epoch: 2 | loss: 0.3526905
speed: 1.9244s/iter; left time: 53915.4483s
iters: 200, epoch: 2 | loss: 0.4079723
speed: 0.6331s/iter; left time: 17674.2574s
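
To answer the title question from a log like the one above: "Use GPU: cuda:0" means the model has been placed on the GPU, but a high s/iter can still come from the CPU side (data loading, worker startup) rather than the GPU itself. Below is a minimal, generic PyTorch sanity check (not part of the PatchTST code) to confirm the GPU is visible and doing work; if this runs fast while training stays slow, the bottleneck is probably the DataLoader.

```python
import time
import torch

# Generic PyTorch checks (not PatchTST-specific): confirm the GPU is visible and usable.
print(torch.cuda.is_available())        # should print True
print(torch.cuda.get_device_name(0))    # name of cuda:0
print(torch.version.cuda)               # CUDA version PyTorch was built against

# Time a large matmul on the GPU; if this is fast but training is slow,
# the bottleneck is more likely data loading than GPU compute.
x = torch.randn(4096, 4096, device="cuda")
torch.cuda.synchronize()
t0 = time.time()
y = x @ x
torch.cuda.synchronize()
print(f"matmul took {time.time() - t0:.4f}s, "
      f"{torch.cuda.memory_allocated(0) / 2**20:.0f} MiB allocated")
```

Watching nvidia-smi while training runs is another quick way to see whether the GPU is actually being utilized.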

@lileishitou

How do I use learner.distributed() in the self-supervised pretraining code?
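
The Learner class from the self-supervised code is not shown in this thread, so the exact learner.distributed() usage cannot be confirmed here. As background only, the standard PyTorch way to spread (pre)training across GPUs is DistributedDataParallel; the sketch below is generic PyTorch launched with torchrun, not the PatchTST Learner API, and nn.Linear stands in for the actual model.

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Generic PyTorch DDP sketch (NOT the PatchTST Learner API).
# Launch with: torchrun --nproc_per_node=<num_gpus> pretrain_ddp.py
def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    # Stand-in model; replace with the PatchTST model you want to pretrain.
    model = nn.Linear(16, 16).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    # One dummy step just to show the loop shape; each process would see its own
    # shard of data when a DistributedSampler is attached to the DataLoader.
    x = torch.randn(32, 16, device=local_rank)
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```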

@WHU-EE

WHU-EE commented Feb 22, 2024

I have the same question. Have you solved it?
My GPU is an RTX 3060.
Here is a partial log:
iters: 100, epoch: 1 | loss: 1.0063255
speed: 0.7884s/iter; left time: 22311.7481s
iters: 200, epoch: 1 | loss: 0.6996360
speed: 0.3332s/iter; left time: 9397.4480s
Epoch: 1 cost time: 141.5194652080536

@WHU-EE

WHU-EE commented Feb 22, 2024

I found that setting num_workers to 0 is helpful.

@codeNiuMa

It looks like you are using the GPU (Use GPU: cuda:0), but it is very slow.
If you are running on Windows, try setting args.num_workers=0; it may help (a sketch of the change follows below).
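
Some background on why this helps: on Windows, DataLoader worker processes are created with the spawn start method, so each of the num_workers=10 workers re-imports the script and receives a pickled copy of the dataset, and that overhead can dominate the s/iter numbers. Setting num_workers=0 keeps loading in the main process. A minimal generic illustration follows; the tensor shape only mirrors seq_len=336 / enc_in=21 from the log above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset; the shape only mirrors seq_len=336 and enc_in=21 from the log above.
dataset = TensorDataset(torch.randn(1024, 336, 21))

# Can be slow or unstable on Windows: 10 spawned worker processes,
# each re-importing the script and getting a pickled copy of the dataset.
loader_workers = DataLoader(dataset, batch_size=128, shuffle=True, num_workers=10)

# Often faster and more predictable on Windows: load in the main process.
loader_main = DataLoader(dataset, batch_size=128, shuffle=True, num_workers=0)
```

With the scripts in this repo, the equivalent change should be passing --num_workers 0 on the command line (the argument appears as num_workers=10 in the Namespace above).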

@syrGitHub

I have a similar problem: as the commands in the .sh file run one after another, the code suddenly slows down. Have you solved it?
