
Code Error #7

Closed
AmingWu opened this issue Jun 16, 2019 · 15 comments
Labels
enhancement (New feature or request), question (Further information is requested)

Comments


AmingWu commented Jun 16, 2019

Hello,
When I run python main.py --config ./config/Places_LT/stage_2_meta_embedding.py, there is an error.

File "./models/MetaEmbeddingClassifier.py", line 33, in forward
dist_cur = torch.norm(x_expand - centroids_expand, 2, 2)
RuntimeError: The size of tensor a (365) must match the size of tensor b (122) at non-singleton dimension 1

Here, I print the shape of x_expand and centroids_expand.

torch.Size([86, 365, 512])
torch.Size([86, 122, 512])

Could you give some advice to solve this problem?
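For reference, the mismatch can be reproduced outside the repo. The shapes below are copied from the traceback; NumPy is used in place of torch here (both apply the same elementwise shape rule at dimension 1, where 365 and 122 disagree):

```python
import numpy as np

# Shapes taken from the report above.
x_expand = np.zeros((86, 365, 512))
centroids_expand = np.zeros((86, 122, 512))

try:
    # The subtraction inside torch.norm(x_expand - centroids_expand, 2, 2)
    # is what actually fails: dimension 1 (365 vs 122) cannot broadcast.
    dist_cur = np.linalg.norm(x_expand - centroids_expand, axis=2)
except ValueError as err:
    print("shape mismatch:", err)
```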


AmingWu commented Jun 16, 2019

Ok, I have solved this problem.

@xavieryxie

> When I run python main.py --config ./config/Places_LT/stage_2_meta_embedding.py, there is an error. [...] Could you give some advice to solve this problem?

I also met this problem, and it is odd that I get different errors when I run python main.py multiple times. Could you tell me how you solved the problem? Thanks.


AmingWu commented Jun 17, 2019

Use a single GPU. For example, CUDA_VISIBLE_DEVICES=0 python main.py --config ./config/Places_LT/stage_2_meta_embedding.py

@xavieryxie

> Use a single GPU. For example, CUDA_VISIBLE_DEVICES=0 python main.py --config ./config/Places_LT/stage_2_meta_embedding.py

Which Python version do you use, 2.7 or 3.5? I still get the same error; it is a little weird.

@xavieryxie

> Use a single GPU. For example, CUDA_VISIBLE_DEVICES=0 python main.py --config ./config/Places_LT/stage_2_meta_embedding.py

The problem has been solved. Thanks for your advice.


AmingWu commented Jun 17, 2019

OK. When you have trained the model on the Places365 dataset, could you share your results with me?

@xavieryxie

OK, no problem.


zhmiao (owner) commented Jun 17, 2019

@AmingWu @onexxp Thank you very much for asking. The problem you encountered is caused by multi-GPU use; we have hit it as well. PyTorch splits the batch across the available GPUs, so the calculations in the code can break because they assume a fixed batch size (e.g., with 2 GPUs and batch_size=256, each GPU most likely receives only 128 samples, while the rest of the code expects 256). We did not prepare the code to be compatible with multi-GPU training/testing. We are sorry about this; it may take some extra effort to make it work.
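The splitting described above can be sketched in plain Python. This is only a simplification of how nn.DataParallel scatters the batch dimension (it chunks it into near-equal pieces); chunk_batch is a hypothetical helper, not code from this repo:

```python
import math

def chunk_batch(batch_size, n_gpus):
    """Mimic chunking on the batch dimension: near-equal chunks,
    the last one possibly smaller."""
    step = math.ceil(batch_size / n_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        sizes.append(min(step, remaining))
        remaining -= sizes[-1]
    return sizes

# With 2 GPUs and batch_size=256, each replica sees only 128 samples,
# so any forward() logic that hard-codes 256 will fail.
print(chunk_batch(256, 2))  # [128, 128]
print(chunk_batch(256, 1))  # [256] -- why CUDA_VISIBLE_DEVICES=0 avoids the error
```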

@zhmiao added the enhancement and question labels on Jun 17, 2019
@xavieryxie

> The problem you have encountered was caused by the use of multi-GPU. [...]

Thanks for your answer and the awesome work. I ran into another problem when running the code with Python 3.5: when the models are initialized, the feat/classifier parameters lose their order because they are defined in a plain dict, and the code fails. I changed it to an OrderedDict() and it works. I don't know whether I am the only one hitting this; just a small question.
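For readers on Python < 3.7, where plain dicts do not guarantee insertion order, the fix described above looks roughly like this. The key names are illustrative, not necessarily the repo's exact ones:

```python
from collections import OrderedDict

# Before (order-less on Python 3.5): networks = {...} as a plain dict.
# After: an OrderedDict preserves the order the modules were registered in,
# so iterating over it (e.g., to load weights or build optimizers) is stable.
networks = OrderedDict()
networks['feat_model'] = 'feature extractor'          # placeholder value
networks['classifier'] = 'meta-embedding classifier'  # placeholder value

assert list(networks.keys()) == ['feat_model', 'classifier']
```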

@xavieryxie

> OK. When you have trained the model on the Places365 dataset, could you share your results with me?

Have you trained the model? I used the default parameters, but my result seems a little lower than the paper reports: Many_shot_accuracy_top1: 0.412, Median_shot_accuracy_top1: 0.369, Low_shot_accuracy_top1: 0.218 on the closed set.


AmingWu commented Jun 18, 2019

My result is lower than yours.


AmingWu commented Jun 28, 2019

Hello, for Places_LT the open-set size is 6,600, but when I run the open-set test I find the number is 43,100. Why?


AmingWu commented Jun 28, 2019

Hello, I have understood your setting.


zhmiao (owner) commented Jul 31, 2019

@AmingWu As mentioned here: #17 (comment), we think we have found why the inference results are a little lower than reported. We will fix this ASAP. Thank you very much.


zhmiao (owner) commented Aug 5, 2019

@AmingWu #17 (comment)

@zhmiao zhmiao closed this as completed Aug 26, 2019