> > Why does every run take one or two days to finish? Asking for help #33

Closed
comprehensiveMap opened this issue Apr 18, 2020 · 0 comments

Comments

@comprehensiveMap

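Why does every run take one or two days to finish? Could someone please help?
Because the evaluation is done on the CPU. Try moving the evaluation to the GPU and computing it on tensors.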

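Thanks for the answer. Is the concrete fix just to change every `.cpu()` in train_eval.py to `.gpu()` (the GPU environment is already set up)? The original document only says "training time: 30 minutes", and I have no idea how it trains that fast, hmph 😕 (just venting a little, hehe)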

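Actually, at this point the tensor is already on the GPU, so you only need to remove the `cpu()` call, and replace `sklearn.metrics`, which takes ndarray arguments, with a computation that works directly on tensors. Nothing else needs to change. Below is the block as I modified it, for reference.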
    if total_batch % 100 == 0:
        # Every 100 batches, report performance on the training and validation sets
        true = labels.data
        predic = torch.max(outputs.data, 1)[1]
        total = true.size(0)
        # Compare on the GPU; .item() copies only the scalar count to the host
        correct = (predic == true).sum().item()
        train_acc = correct / total
        dev_acc, dev_loss = evaluate(config, model, dev_iter)
        if dev_loss < dev_best_loss:
            dev_best_loss = dev_loss
            torch.save(model.state_dict(), config.save_path)
            improve = '*'
            last_improve = total_batch
        else:
            improve = ''
        time_dif = get_time_dif(start_time)
        msg = 'Iter: {0:>6}, Train Loss: {1:>5.2}, Train Acc: {2:>6.2%}, Val Loss: {3:>5.2}, Val Acc: {4:>6.2%}, Time: {5} {6}'
        print(msg.format(total_batch, loss.item(), train_acc, dev_loss, dev_acc, time_dif, improve))
        model.train()
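
The snippet above only covers the training-accuracy part; the same idea can be applied inside the evaluation loop. What follows is a minimal sketch, not the repository's actual `evaluate`: the function name `evaluate_on_device`, its signature, and the assumption that the iterator yields plain `(inputs, labels)` tensor pairs are all hypothetical. Note also that PyTorch tensors have no `.gpu()` method; a tensor is moved with `.cuda()` or `.to(device)`.

```python
import torch
import torch.nn.functional as F


def evaluate_on_device(model, data_iter, device):
    """Return (accuracy, mean loss), accumulating both on `device`.

    Hypothetical helper: assumes `data_iter` yields (inputs, labels) tensors
    and that `model(inputs)` returns logits of shape (batch, num_classes).
    """
    model.eval()
    # Accumulators live on the device, so no per-batch host transfer is needed.
    correct = torch.zeros((), dtype=torch.long, device=device)
    loss_sum = torch.zeros((), device=device)
    total = 0
    with torch.no_grad():
        for inputs, labels in data_iter:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            # Sum per-sample losses so the mean stays exact for a smaller last batch.
            loss_sum += F.cross_entropy(outputs, labels, reduction="sum")
            correct += (outputs.argmax(dim=1) == labels).sum()
            total += labels.size(0)
    # The two .item() calls are the only device-to-host copies.
    return correct.item() / total, loss_sum.item() / total
```

A caller would typically pick the device with `torch.device("cuda" if torch.cuda.is_available() else "cpu")` and switch back to `model.train()` afterwards, as the block above does. If per-class metrics from `sklearn.metrics` are still wanted for the final test report, one option is to collect predictions as tensors and convert them to NumPy once at the end rather than per batch.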
