RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2) #27

Closed
ekremcet opened this issue Oct 7, 2018 · 3 comments


@ekremcet

ekremcet commented Oct 7, 2018

I am trying to train CornerNet on my machine, but it produces the following error.

loading all datasets...
using 4 threads
loading from cache file: ./cache/coco_trainval2014.pkl
loading annotations into memory...
Done (t=8.28s)
creating index...
index created!
loading from cache file: ./cache/coco_trainval2014.pkl
loading annotations into memory...
Done (t=8.38s)
creating index...
index created!
loading from cache file: ./cache/coco_trainval2014.pkl
loading annotations into memory...
Done (t=9.22s)
creating index...
index created!
loading from cache file: ./cache/coco_trainval2014.pkl
loading annotations into memory...
Done (t=8.18s)
creating index...
index created!
loading from cache file: ./cache/coco_minival2014.pkl
loading annotations into memory...
Done (t=0.23s)
creating index...
index created!
system config...
{'batch_size': 1,
'cache_dir': './cache',
'chunk_sizes': [1],
'config_dir': './config',
'data_dir': './data',
'data_rng': <mtrand.RandomState object at 0x7fd5039ad4c8>,
'dataset': 'MSCOCO',
'decay_rate': 10,
'display': 5,
'learning_rate': 0.00025,
'max_iter': 500000,
'nnet_rng': <mtrand.RandomState object at 0x7fd5039ad510>,
'opt_algo': 'adam',
'prefetch_size': 5,
'pretrain': None,
'result_dir': './results',
'sampling_function': 'kp_detection',
'snapshot': 100,
'snapshot_name': 'CornerNet',
'stepsize': 450000,
'test_split': 'testdev',
'train_split': 'trainval',
'val_iter': 100,
'val_split': 'minival',
'weight_decay': False,
'weight_decay_rate': 1e-05,
'weight_decay_type': 'l2'}
db config...
{'ae_threshold': 0.5,
'border': 128,
'categories': 80,
'data_aug': True,
'gaussian_bump': True,
'gaussian_iou': 0.7,
'gaussian_radius': -1,
'input_size': [511, 511],
'lighting': True,
'max_per_image': 100,
'merge_bbox': False,
'nms_algorithm': 'exp_soft_nms',
'nms_kernel': 3,
'nms_threshold': 0.5,
'output_sizes': [[128, 128]],
'rand_color': True,
'rand_crop': True,
'rand_pushes': False,
'rand_samples': False,
'rand_scale_max': 1.4,
'rand_scale_min': 0.6,
'rand_scale_step': 0.1,
'rand_scales': array([0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2, 1.3]),
'special_crop': False,
'test_scales': [1],
'top_k': 100,
'weight_exp': 8}
len of db: 118287
start prefetching data...
shuffling indices...
start prefetching data...
shuffling indices...
start prefetching data...
shuffling indices...
start prefetching data...
shuffling indices...
start prefetching data...
building model...
module_file: models.CornerNet
shuffling indices...
total parameters: 201035212
setting learning rate to: 0.00025
training start...
0%| | 0/500000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train.py", line 195, in
train(training_dbs, validation_db, args.start_iter)
File "train.py", line 137, in train
training_loss = nnet.train(**training)
File "/home/ekrem/PycharmProjects/CornerNet/nnet/py_factory.py", line 81, in train
loss = self.network(xs, ys)
File "/home/ekrem/anaconda3/envs/CornerNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/ekrem/PycharmProjects/CornerNet/models/py_utils/data_parallel.py", line 68, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/ekrem/anaconda3/envs/CornerNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/ekrem/PycharmProjects/CornerNet/nnet/py_factory.py", line 20, in forward
loss = self.loss(preds, ys, **kwargs)
File "/home/ekrem/anaconda3/envs/CornerNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/ekrem/PycharmProjects/CornerNet/models/py_utils/kp.py", line 297, in forward
pull, push = self.ae_loss(tl_tag, br_tag, gt_mask)
File "/home/ekrem/PycharmProjects/CornerNet/models/py_utils/kp_utils.py", line 197, in _ae_loss
dist = tag_mean.unsqueeze(1) - tag_mean.unsqueeze(2)
RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)

I have set batch_size = 1 and chunk_sizes = [1] to get rid of the out-of-memory error mentioned in #4.
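
The error message itself shows that tag_mean is 1-D at the failing line (for a 1-D tensor the valid unsqueeze dims are exactly [-2, 1]). A minimal sketch that reproduces the same failure in isolation (the size below is arbitrary):

```python
import torch

# With a per-GPU chunk of a single sample, tag_mean can arrive as a 1-D tensor
# of shape (num_objects,) instead of 2-D (batch, num_objects), so unsqueeze(2)
# asks for a dimension that does not exist.
tag_mean = torch.randn(4)                              # 1-D tensor
dist = tag_mean.unsqueeze(1) - tag_mean.unsqueeze(2)   # raises the "dimension out of range" error above
```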

Any suggestions?
Thanks

@lianbobo

lianbobo commented Oct 7, 2018

In CornerNet/models/py_utils/kp_utils.py, above line 193 (mask = mask.unsqueeze(1) + mask.unsqueeze(2)), add a line of code: tag_mean = tag_mean.unsqueeze(0)
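
For anyone applying this, a sketch of the patched region: the added line plus the two existing lines referenced above and in the traceback (verify the exact line numbers against your checkout):

```python
# models/py_utils/kp_utils.py, inside _ae_loss -- sketch of the suggested patch
tag_mean = tag_mean.unsqueeze(0)                      # new line: restore the batch dimension
mask = mask.unsqueeze(1) + mask.unsqueeze(2)          # existing line 193
# ... a few lines later (line 197 per the traceback) ...
dist = tag_mean.unsqueeze(1) - tag_mean.unsqueeze(2)  # valid again once tag_mean is 2-D
```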

@ekremcet
Author

ekremcet commented Oct 7, 2018

In CornerNet/models/py_utils/kp_utils.py, above line 193 (mask = mask.unsqueeze(1) + mask.unsqueeze(2)), add a line of code: tag_mean = tag_mean.unsqueeze(0)

This solved the problem. Thanks!

@duanqipeng

In CornerNet/models/py_utils/kp_utils.py, above line 193 (mask = mask.unsqueeze(1) + mask.unsqueeze(2)), add a line of code: tag_mean = tag_mean.unsqueeze(0)

This solved the problem. Thanks!

Hey! I got the same error you did and applied this method to solve it, but now I get another problem:

IndexError: The shape of the mask [2, 128, 128] at index 0 does not match the shape of the indexed tensor [1, 2, 2, 128] at index 0
I have 4 GPUs and use batch_size = 6, chunk_sizes = [1, 1, 2, 2].
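
One way to narrow this down is to print the shapes that reach the loss, since the error reports a mismatch between the mask and the tensor it indexes. A hypothetical debug print at the call site shown in the traceback above (models/py_utils/kp.py, line 297):

```python
# temporary debugging just before the ae_loss call in kp.py (hypothetical)
print("tl_tag:", tl_tag.shape, "br_tag:", br_tag.shape, "gt_mask:", gt_mask.shape)
pull, push = self.ae_loss(tl_tag, br_tag, gt_mask)
```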
