DeepCoral error #59

WHUzhusihan96 · 2019-03-12T06:34:32Z

RuntimeError Traceback (most recent call last)
in
115 model = load_pretrain(model)
116 for epoch in range(1, epochs + 1):
--> 117 train(epoch, model)
118 t_correct = test(model)
119 if t_correct > correct:

in train(epoch, model)
79 gamma = 2 / (1 + math.exp(-10 * (epoch) / epochs)) - 1
80 loss = loss_cls + gamma * loss_coral
---> 81 loss.backward()
82 optimizer.step()
83 if i % log_interval == 0:

~/anaconda3/envs/py36/lib/python3.6/site-packages/torch/autograd/variable.py in backward(self, gradient, retain_graph, create_graph, retain_variables)
165 Variable.
166 """
--> 167 torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
168
169 def register_hook(self, hook):

~/anaconda3/envs/py36/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(variables, grad_variables, retain_graph, create_graph, retain_variables)
97
98 Variable._execution_engine.run_backward(
---> 99 variables, grad_variables, retain_graph)
100
101
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCTensorMath.cu:26
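For context, the `gamma` factor in frames 79 and 80 of the traceback ramps the weight of the CORAL term from near 0 to near 1 over training. A minimal sketch of that schedule follows; the function name `coral_weight` and the value `epochs=100` are assumptions for illustration, not taken from the repository.

```python
import math


def coral_weight(epoch, epochs):
    """Weight applied to the CORAL loss, same formula as `gamma` in the traceback."""
    return 2 / (1 + math.exp(-10 * epoch / epochs)) - 1


# Hypothetical run length of 100 epochs, chosen only for illustration.
for epoch in (1, 50, 100):
    print(epoch, round(coral_weight(epoch, 100), 4))
```

Early in training the classification loss dominates; the adaptation term only reaches close to full weight in later epochs.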

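How can I resolve this error? The code runs in the same environment as DDC and DAN. The solutions I found attribute it to a class label of -1, but I don't think that is the problem in this code. Has anyone else run into the same problem?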
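One way to rule the label hypothesis in or out is to check every batch's labels against the number of output classes before computing the loss. The sketch below is plain Python so it runs without a GPU; `find_bad_labels` is a hypothetical helper, and the 31-class setting is an assumption, not something stated in this thread. Labels held in a tensor would need to be converted to a plain list first.

```python
def find_bad_labels(labels, num_classes):
    """Return (index, label) pairs whose label lies outside [0, num_classes)."""
    return [(i, y) for i, y in enumerate(labels) if not 0 <= y < num_classes]


# Hypothetical batch of labels for a 31-class problem (assumed, for illustration).
batch_labels = [0, 3, 30, -1, 31]
print(find_bad_labels(batch_labels, num_classes=31))  # → [(3, -1), (4, 31)]
```

Because CUDA errors are reported asynchronously, running with the environment variable `CUDA_LAUNCH_BLOCKING=1`, or running the same batch on the CPU, usually gives a more precise error location than the traceback above.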

jindongwang · 2019-04-02T07:57:26Z

@WUzhusihan Hi, has the problem been resolved?

WHUzhusihan96 · 2019-04-02T08:05:48Z

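@jindongwang Not yet. I've been working on my graduation project lately and looking at traditional methods. I searched quite a lot about that problem earlier without solving it, so I've set it aside for now.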

jindongwang · 2019-04-02T08:10:48Z

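@WUzhusihan I really haven't run into this problem on my side. Could it be a PyTorch version issue? I wrote this code fairly early, so it may not support newer versions well.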

WHUzhusihan96 · 2019-04-02T08:13:14Z

@jindongwang I used the versions you specified. DAN and DDC run on the same versions: pytorch 0.3.1, torch 0.2, python 3.6.

jindongwang · 2019-04-02T08:24:59Z

@WUzhusihan OK, I'll update the code when I have time.

WHUzhusihan96 · 2019-04-02T08:27:14Z

@jindongwang OK, thank you for taking the time to answer. I'll keep looking into it as well.

jindongwang · 2019-04-16T02:18:20Z

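@WUzhusihan I updated the CORAL loss code and found that the earlier version had a problem. Please try running the code again and see whether the problem persists.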

WHUzhusihan96 · 2019-04-16T04:06:51Z

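@jindongwang Thank you very much. My undergraduate graduation project is nearly done, and I happened to be looking at DeepCoral today. In the other link you provided I found the AlexNet version of DeepCoral, and that one runs. So my problem is probably not caused by the CORAL loss: with only the loss code changed, the error still occurs. I suspect my own configuration, yet all the other code runs.
Because DAN, DDC, and DeepCoral share a similar framework in your code, I added CoralLoss to the DAN code, and with minor modifications it runs. I can't explain why yet. So far I've run 100 iterations; the accuracy is slightly higher than the number you posted, and the CORAL loss is very small.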
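On the small CORAL loss values: in the standard Deep CORAL formulation, the loss is the squared Frobenius norm of the difference between the source and target feature covariances, divided by 4d², where d is the feature dimension. With a large d that normalization alone can make the value small. The sketch below follows that textbook definition in plain Python; it is not the repository's implementation, and the helper names `covariance` and `coral_loss` are illustrative.

```python
def covariance(rows):
    """Unbiased covariance matrix of a batch given as a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def coral_loss(source, target):
    """Squared Frobenius distance between the two covariances, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cs, ct = covariance(source), covariance(target)
    sq_diff = sum((cs[i][j] - ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return sq_diff / (4 * d * d)


# Toy 2-dimensional batches (made up for illustration).
source = [[0.0, 0.0], [2.0, 0.0]]
target = [[0.0, 0.0], [0.0, 2.0]]
print(coral_loss(source, target))  # → 0.5
```

The same covariance mismatch spread over a much larger feature dimension would give a far smaller value, so a tiny CORAL loss is not necessarily a sign of a bug.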

WHUzhusihan96 closed this as completed Apr 16, 2019
