Segmentation fault (core dumped) #21

Closed
22wei22 opened this issue Jun 3, 2018 · 8 comments

Comments

@22wei22

22wei22 commented Jun 3, 2018

==> Test
Extracted features for query set, obtained 3368-by-2048 matrix
Extracted features for gallery set, obtained 15913-by-2048 matrix
==> BatchTime(s)/BatchSize(img): 0.014/64
Segmentation fault (core dumped)
This problem occurs during testing. Thanks.

@KaiyangZhou
Owner

Can you provide more detailed info? (e.g. python version, which script/command you are using)

@22wei22
Author

22wei22 commented Jun 3, 2018

Sorry about that. I am using Python 2.7.13, PyTorch 0.4.0, and torchvision 0.2.1.
The command I ran was:
python train_img_model_xent_htri.py -d market1501 -a resnet50 --max-epoch 60 --train-batch 32 --test-batch 32 --stepsize 20 --eval-step 5 --save-dir log/resnet50-xent-market1501 --gpu-devices 2
The output was:
Epoch: [5][50/93] Time 0.131 (0.143) Data 0.002 (0.015) Loss 6.8060 (6.7926)
Epoch: [5][60/93] Time 0.126 (0.141) Data 0.002 (0.013) Loss 6.9414 (6.8131)
Epoch: [5][70/93] Time 0.141 (0.141) Data 0.002 (0.012) Loss 6.8047 (6.8204)
Epoch: [5][80/93] Time 0.129 (0.140) Data 0.002 (0.010) Loss 6.7707 (6.8251)
Epoch: [5][90/93] Time 0.141 (0.139) Data 0.001 (0.010) Loss 6.8328 (6.8303)
==> Test
Extracted features for query set, obtained 3368-by-2048 matrix
Extracted features for gallery set, obtained 15913-by-2048 matrix
==> BatchTime(s)/BatchSize(img): 0.013/32
Segmentation fault (core dumped)
Training itself works. The problem only appears when evaluation runs during training.

@KaiyangZhou
Owner

Hmm... I guess your Python code crashed, but I have no idea why it happened on your machine. Perhaps this, this, and this would help? Try those answers and let me know.
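One general way to narrow down which Python line triggers a crash like this is the `faulthandler` module, which dumps the Python traceback when the process receives a fatal signal such as SIGSEGV. The sketch below is only an illustration: `run_with_crash_traceback` and the log file name are hypothetical, and `faulthandler` is in the standard library only from Python 3.3 onward (on Python 2.7, as used in this thread, it would have to be installed separately as a backport package).

```python
import faulthandler
import os
import tempfile


def run_with_crash_traceback(fn, log_path):
    """Hypothetical helper: call fn() with faulthandler writing to log_path.

    If the interpreter segfaults inside fn(), the Python-level traceback
    of every thread is written to log_path before the process dies.
    """
    # The file object must stay open for as long as faulthandler is enabled.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    try:
        return fn()
    finally:
        faulthandler.disable()
        log_file.close()


# Placeholder log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "segfault_traceback.log")

# In practice fn would be the evaluation step that crashes; here a trivial
# callable simply reports whether the handler is active while it runs.
handler_active = run_with_crash_traceback(faulthandler.is_enabled, log_path)
print("faulthandler active during call:", handler_active)
```

After a crash, the last frames listed in the log file point at the Python statement that was executing when the signal arrived.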

@KaiyangZhou KaiyangZhou changed the title from "寻求帮助" (Asking for help) to "Segmentation fault (core dumped)" Jun 3, 2018
@luzai
Contributor

luzai commented Jun 3, 2018

I ran into a segmentation fault with Python 2.7.14 and PyTorch 0.5.0a0.
I found that distmat.addmm_(1, -2, qf, gf.t()) is the line that crashes the program.
My workaround is to change

    m, n = qf.size(0), gf.size(0)
    distmat = torch.pow(qf, 2).sum(dim=1, keepdim=True).expand(m, n) + \
              torch.pow(gf, 2).sum(dim=1, keepdim=True).expand(n, m).t()
    distmat.addmm_(1, -2, qf, gf.t())
    distmat = distmat.numpy()

to

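    from scipy.spatial.distance import cdist

    # Convert the feature tensors to NumPy arrays
    qf = qf.numpy()
    gf = gf.numpy()
    # cdist defaults to (non-squared) Euclidean distance; the ranking is unchanged
    distmat = cdist(qf, gf)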

I am not sure whether you are hitting the same problem, but I hope this helps. (I also wonder whether it is a bug in PyTorch 0.5.0a0.)
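For reference, the torch snippet being replaced computes squared Euclidean distances through the expansion ||q||^2 + ||g||^2 - 2 q·g, whereas `cdist` with its default metric returns plain Euclidean distances, i.e. the square root of that quantity. Because the square root is monotonic, the gallery ranking for each query should not change. The following is a minimal NumPy sketch of that relationship on random data; the function names and array sizes are made up for illustration and are not taken from the repository.

```python
import numpy as np


def sq_euclidean_expanded(qf, gf):
    """Squared distances via ||q||^2 + ||g||^2 - 2 * q.g (the addmm_-style form)."""
    q_sq = np.sum(qf ** 2, axis=1, keepdims=True)    # shape (m, 1)
    g_sq = np.sum(gf ** 2, axis=1, keepdims=True).T  # shape (1, n)
    distmat = q_sq + g_sq - 2.0 * qf.dot(gf.T)       # broadcasts to (m, n)
    return np.maximum(distmat, 0.0)                  # clamp round-off negatives


def sq_euclidean_direct(qf, gf):
    """Squared distances computed directly from pairwise differences."""
    diff = qf[:, None, :] - gf[None, :, :]           # shape (m, n, d)
    return np.sum(diff ** 2, axis=2)


# Small random stand-ins for the query and gallery feature matrices.
rng = np.random.RandomState(0)
qf = rng.randn(5, 16)
gf = rng.randn(7, 16)

expanded = sq_euclidean_expanded(qf, gf)
direct = sq_euclidean_direct(qf, gf)
euclidean = np.sqrt(expanded)  # the quantity a Euclidean cdist would return

same_values = np.allclose(expanded, direct)
same_ranking = np.array_equal(
    np.argsort(expanded, axis=1), np.argsort(euclidean, axis=1)
)
print("expansion matches direct computation:", same_values)
print("ranking unchanged by square root:", same_ranking)
```

If the exact squared values matter downstream, `cdist(qf, gf, 'sqeuclidean')` would be the closer drop-in.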

@22wei22
Author

22wei22 commented Jun 4, 2018

@luzai @KaiyangZhou Thank you very much. Following your approach solved the problem:

    qf = qf.numpy()
    gf = gf.numpy()
    from scipy.spatial.distance import cdist
    distmat = cdist(qf, gf)

@22wei22
Author

22wei22 commented Jun 4, 2018

python train_img_model_xent_htri.py -d market1501 -a resnet50 --max-epoch 60 --train-batch 32 --test-batch 32 --stepsize 20 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 2
However, after running it with this change, I get:
mAP: 59.8%
CMC curve
Rank-1 : 77.0%
Rank-5 : 90.6%
Rank-10 : 93.8%
Rank-20 : 96.3%
==> Best Rank-1 77.0%, achieved at epoch 60
The results seem poor.
How can I find out which hyperparameters each script/command uses?

@KaiyangZhou
Owner

You can run python train_img_model_xent_htri.py -h, or check the code for more info.
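As a rough illustration of why `-h` answers the question: scripts built on `argparse` print every option together with its help text, and with `ArgumentDefaultsHelpFormatter` the default values as well. The parser below is a hypothetical stand-in, not the repository's actual one. The short flags and option names are copied from the command in this thread, while the long forms `--dataset` and `--arch`, the help strings, and the default values are placeholders.

```python
import argparse


def build_parser():
    """Hypothetical parser mimicking a few flags seen in this thread."""
    parser = argparse.ArgumentParser(
        description="Stand-in for a training script's command line",
        # Appends "(default: ...)" to every option that has help text.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # All defaults below are placeholders, not the repository's real values.
    parser.add_argument("-d", "--dataset", type=str, default="market1501",
                        help="dataset name")
    parser.add_argument("-a", "--arch", type=str, default="resnet50",
                        help="model architecture")
    parser.add_argument("--max-epoch", type=int, default=60,
                        help="number of training epochs")
    parser.add_argument("--train-batch", type=int, default=32,
                        help="training batch size")
    parser.add_argument("--stepsize", type=int, default=20,
                        help="epochs between learning-rate decays")
    parser.add_argument("--eval-step", type=int, default=20,
                        help="run evaluation every N epochs")
    return parser


parser = build_parser()

# Parsing an empty argument list yields exactly the default hyperparameters.
defaults = vars(parser.parse_args([]))
for name in sorted(defaults):
    print("{} = {}".format(name, defaults[name]))

# This is the text that "-h" would print before exiting.
help_text = parser.format_help()
```

Values passed on the command line override these defaults, so the effective hyperparameters are the defaults shown by `-h` plus whatever the command specifies.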

@22wei22
Author

22wei22 commented Jun 4, 2018

OK, thanks.
