When I ran the train.py file, I encountered a problem, I hope you can answer it. Thank you. #7

Closed
wry980105 opened this issue Oct 16, 2020 · 7 comments

Comments

@wry980105

  File "train.py", line 604, in <module>
    main(args)
  File "train.py", line 473, in main
    trainer.run(train_loader,max_epochs=args.epoch)
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 446, in run
    self._handle_exception(e)
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 410, in _handle_exception
    raise e
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 433, in run
    hours, mins, secs = self._run_once_on_dataset()
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 399, in _run_once_on_dataset
    self._handle_exception(e)
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 410, in _handle_exception
    raise e
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 387, in _run_once_on_dataset
    for batch in self.state.dataloader:
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 3.
Original Traceback (most recent call last):
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zhaoliuqing/wry/Graph2plan-master/Network/model/floorplan.py", line 223, in __getitem__
    return fp.get_train_data()
  File "/home/zhaoliuqing/wry/Graph2plan-master/Network/model/floorplan.py", line 202, in get_train_data
    boxes = self.get_boxes(tensor=tensor)
  File "/home/zhaoliuqing/wry/Graph2plan-master/Network/model/floorplan.py", line 159, in get_boxes
    boxes = np.apply_along_axis(norm,1,boxes)
  File "/home/zhaoliuqing/anaconda3/lib/python3.7/site-packages/numpy/lib/shape_base.py", line 380, in apply_along_axis
    res = asanyarray(func1d(inarr_view[ind0], *args, **kwargs))
  File "/home/zhaoliuqing/wry/Graph2plan-master/Network/model/floorplan.py", line 158, in <lambda>
    norm = lambda box:np.array([X[box[1]],Y[box[0]],X[box[3]-1],Y[box[2]-1]])
IndexError: index 208 is out of bounds for axis 0 with size 208
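
The last frame looks up box coordinates in X and Y, so the message means one lookup asked for index 208 in an array whose last valid index is 207. That happens when a start coordinate (box[0] or box[1]) equals 208 or an end coordinate (box[2] or box[3]) equals 209. A minimal sketch of a bounds check that mirrors the lambda's index expressions; `find_bad_boxes` and the sample boxes are hypothetical and not part of Graph2plan:

```python
def find_bad_boxes(boxes, x_size, y_size):
    """Return (row, column, index) for every lookup that would be out of range.

    Mirrors the index expressions in the `norm` lambda from the traceback:
    X[box[1]], Y[box[0]], X[box[3]-1], Y[box[2]-1].
    """
    bad = []
    for row, box in enumerate(boxes):
        lookups = [
            (1, box[1], x_size),      # X[box[1]]
            (0, box[0], y_size),      # Y[box[0]]
            (3, box[3] - 1, x_size),  # X[box[3]-1]
            (2, box[2] - 1, y_size),  # Y[box[2]-1]
        ]
        for column, index, size in lookups:
            # Negative indices are flagged too: NumPy would accept them
            # silently and read from the end of the array.
            if not 0 <= index < size:
                bad.append((row, column, index))
    return bad


# Hypothetical boxes; the second one has coordinates past the valid range.
boxes = [[10, 20, 100, 120], [30, 208, 60, 209]]
print(find_bad_boxes(boxes, x_size=208, y_size=208))
```

Running a check like this over the stored boxes of each data item would show which item and which coordinate is off, without waiting for the training loop to crash.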

@zzilch
Collaborator

zzilch commented Oct 17, 2020

Can you provide the name of the data item that triggers the error? You can print it here.

@wry980105
Author

Thank you for your reply!
This is the error that occurred: "norm = lambda box:np.array([X[box[1]],Y[box[0]],X[box[3]-1],Y[box[2]-1]])
IndexError: index 208 is out of bounds for axis 0 with size 208". I looked at the data item, and gtBoxNew may be the problem, but I am not sure and don't know how to handle it. Could you suggest a way to fix it?

@zzilch
Collaborator

zzilch commented Oct 17, 2020

Yeah, I think there is something wrong with the training data. I need you to print the name of the faulty data item.
The FloorPlanDataset class calls get_train_data to get the training data for a floorplan.
get_train_data in turn calls get_boxes, which caused the error you encountered.
You can print the name just before the get_boxes call, at line 193 of floorplan.py.
If you give me the name of the problem item, I can figure out what happened.
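
An alternative to a print statement is to attach the item name to the exception itself, so that it appears in the traceback even when the failure happens inside a DataLoader worker. A minimal sketch of the idea; `NamedItemDataset`, `names`, and `load_item` are hypothetical stand-ins, not the actual FloorPlanDataset code:

```python
class NamedItemDataset:
    """Toy stand-in for a dataset whose items can fail while loading."""

    def __init__(self, names, load_item):
        self.names = names          # one name per data item
        self.load_item = load_item  # callable that builds the training data

    def __len__(self):
        return len(self.names)

    def __getitem__(self, index):
        name = self.names[index]
        try:
            return self.load_item(name)
        except IndexError as exc:
            # Re-raise with the item name in the message. RuntimeError is used
            # so the error is never mistaken for "end of sequence".
            raise RuntimeError(f"data item {name!r} (index {index}): {exc}") from exc


def load_item(name):
    # Hypothetical loader: pretend the item named "bad" has a broken box.
    if name == "bad":
        raise IndexError("index 208 is out of bounds for axis 0 with size 208")
    return name.upper()


dataset = NamedItemDataset(["a", "bad", "c"], load_item)
try:
    dataset[1]
except RuntimeError as exc:
    print(exc)
```

The same try/except could be placed around the get_boxes call in get_train_data, assuming the floorplan object exposes its name at that point.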

@wry980105
Author

wry980105 commented Oct 18, 2020 via email

@zzilch
Collaborator

zzilch commented Oct 19, 2020

Sorry, did you paste any images? I cannot see any of them.

@wry980105
Author

wry980105 commented Oct 19, 2020 via email

@zzilch zzilch closed this as completed Nov 7, 2020
@LucaJeevanjee

Hello, I am running into this issue as well.
The index at which the error occurs changes when the batch size and the number of workers are changed.

image

This is the log from when it crashes.
It looks like the failing training item is either 27184 or 75591.
Thanks in advance!
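
The crash position probably shifts because items are grouped into batches and fetched by several worker processes, so the point where a bad item is hit depends on both settings. Indexing the dataset item by item in a single process gives a stable list of failing items instead. A minimal sketch, assuming a dataset that supports len() and integer indexing; `scan_dataset` and `ToyDataset` are hypothetical:

```python
def scan_dataset(dataset):
    """Index every item in order and collect the ones that raise."""
    failures = []
    for index in range(len(dataset)):
        try:
            dataset[index]
        except Exception as exc:  # keep going so all bad items are reported
            failures.append((index, type(exc).__name__, str(exc)))
    return failures


class ToyDataset:
    """Hypothetical dataset in which items 2 and 5 are broken."""

    def __len__(self):
        return 6

    def __getitem__(self, index):
        if index in (2, 5):
            raise IndexError(f"index {index} failed")
        return index


for index, kind, message in scan_dataset(ToyDataset()):
    print(index, kind, message)
```

With the real dataset, the collected indices could then be mapped back to item names and compared against the two candidates above.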
