RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2) with CornerNet_Saccade #34
Comments
Are you training on multiple GPUs?
No, only one GPU.
Try increasing the batch size to 8. If the error still persists, please let us know.
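In this pipeline the batch size is passed to the dataset call, so (assuming the training script shared later in this thread) the change is a single line:

gtf.Train_Dataset(root_dir, coco_dir, img_dir, set_dir, batch_size=8, num_workers=4)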
I changed the batch size to 8 and it's working fine now. Thank you!
It ran for some time, but now it's throwing the error below with batch_size = 8.
It ran the first time, but running the same code a second time throws this RuntimeError.
The issue is with CUDA. I got it.
Once again a runtime error, roughly 10,000 iterations in (progress log truncated):
0%| | 9998/6900000 [20:39:29<2940:58:51, 1.54s/it]
Traceback (most recent call last):
Which dataset are you working on? Please point us to the dataset and share your code so that we can reproduce the error.
I'm working on a real-time project, so I'm unable to share the data.
Please check whether there is any discrepancy in the annotation files. Since training started and ran for several hours, the issue could be traced back to an image with no labels or bounding boxes, or with box shapes crossing the image boundaries. A quick sanity-check script is sketched below.
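A minimal sketch of such a check, assuming pycocotools and a COCO-style annotation file (the path below is hypothetical; point it at your own annotation JSON):

from pycocotools.coco import COCO

ann_file = "data/validation_menu/annotations/instances_Images.json"  # hypothetical path
coco = COCO(ann_file)

for img_id in coco.getImgIds():
    img = coco.loadImgs(img_id)[0]
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
    if not anns:
        print("no annotations:", img["file_name"])
    for ann in anns:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if w <= 0 or h <= 0:
            print("degenerate box:", img["file_name"], ann["bbox"])
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            print("box outside image:", img["file_name"], ann["bbox"])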
Yes, I cross-checked both the training and validation data a couple of times for bounding boxes and labels; all are good. But when I train with only the training data, without validation data, everything works fine.
Thank you for the detailed analysis. We will check the validation code.
We have run multiple tests on the CornerNet pipeline, yet the error wasn't reproduced. Have you reached a solution yet?
Closing due to inactivity.
I'm running:
from train_detector import Detector

gtf = Detector()

# Training dataset (COCO-style layout)
root_dir = "/home/SK00495085/monk/Monk_Object_Detection/data"
coco_dir = "training_menu"
img_dir = "/"
set_dir = "Images"
gtf.Train_Dataset(root_dir, coco_dir, img_dir, set_dir, batch_size=4, num_workers=4)

# Validation dataset
root_dir = "/home/SK00495085/monk/Monk_Object_Detection/data"
coco_dir = "validation_menu"
img_dir = "/"
set_dir = "Images"
gtf.Val_Dataset(root_dir, coco_dir, img_dir, set_dir)

gtf.Model(model_name="CornerNet_Saccade")
gtf.Hyper_Params(lr=0.00025, total_iterations=6900000, val_interval=10000)
gtf.Setup()
gtf.Train()
I got this error:
loading annotations into memory...
Done (t=0.59s)
creating index...
index created!
loading annotations into memory...
Done (t=0.22s)
creating index...
index created!
Loading Model - core.models.CornerNet_Saccade
Model Loaded
start_iter = 0
distributed = False
world_size = 0
initialize = False
batch_size = 1
learning_rate = 0.00025
max_iteration = 6900000
stepsize = 5520000
snapshot = 3450000
val_iter = 10000
display = 100
decay_rate = 10
Process 0: building model...
total parameters: 116967797
start prefetching data...
shuffling indices...
setting learning rate to: 0.00025
training start...
start prefetching data...
shuffling indices...
0%| | 0/6900000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "training_saccade.py", line 31, in
gtf.Train();
File "/home/SK00495085/monk/Monk_Object_Detection/6_cornernet_lite/lib/train_detector.py", line 298, in Train
training_loss = self.system_dict["local"]["nnet"].train(**training)
File "/home/SK00495085/monk/Monk_Object_Detection/6_cornernet_lite/lib/core/nnet/py_factory.py", line 93, in train
loss = self.network(xs, ys)
File "/home/SK00495085/.conda/envs/monk_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/SK00495085/monk/Monk_Object_Detection/6_cornernet_lite/lib/core/models/py_utils/data_parallel.py", line 68, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/SK00495085/.conda/envs/monk_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/SK00495085/monk/Monk_Object_Detection/6_cornernet_lite/lib/core/nnet/py_factory.py", line 20, in forward
loss = self.loss(preds, ys, **kwargs)
File "/home/SK00495085/.conda/envs/monk_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/SK00495085/monk/Monk_Object_Detection/6_cornernet_lite/lib/core/models/py_utils/losses.py", line 150, in forward
pull, push = self.ae_loss(tl_tag, br_tag, gt_mask)
File "/home/SK00495085/monk/Monk_Object_Detection/6_cornernet_lite/lib/core/models/py_utils/losses.py", line 26, in _ae_loss
dist = tag_mean.unsqueeze(1) - tag_mean.unsqueeze(2)
RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
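For what it's worth, the message itself indicates that tag_mean reached _ae_loss as a 1-D tensor rather than the expected 2-D (batch, max_objects) tensor, which would be consistent with a batch whose images carry no ground-truth boxes. A minimal sketch of the dimension rule in plain PyTorch (shapes here are illustrative, not taken from the pipeline):

import torch

# Normal case: tag_mean is 2-D, (batch, max_objects); both unsqueezes are in range.
tag_mean = torch.zeros(4, 10)                         # illustrative shape
dist = tag_mean.unsqueeze(1) - tag_mean.unsqueeze(2)  # (4, 1, 10) - (4, 10, 1) -> (4, 10, 10)

# Failure case: if tag_mean collapses to 1-D, unsqueeze accepts dims in
# [-(ndim + 1), ndim] = [-2, 1], so unsqueeze(2) raises exactly this error.
tag_mean_1d = torch.zeros(10)
try:
    tag_mean_1d.unsqueeze(2)
except RuntimeError as e:
    print(e)  # Dimension out of range (expected to be in range of [-2, 1], but got 2)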