There is something wrong when I run "bash train.sh" #1

Closed
lyziew opened this issue Jul 3, 2018 · 4 comments

lyziew commented Jul 3, 2018

Hi, I'm very interested in your work on segmentation, and I am new to deep learning.
My system is Ubuntu 16.04 with 16 GB RAM and an Nvidia GeForce GTX 1080 Ti.
My environment is CUDA 8.0 and Anaconda 4.20 with Python 3.5 and Python 2.7 (virtual environment).
I followed your steps: I ran "setup_env.sh" to set up the environment and then ran train.sh, but it did not work.
I have not been able to solve this myself. Could you share a detailed Python environment and config settings, or any other guidance?

~/Segmentation/kaggle_carvana_segmentation/asanakoy ~/Segmentation/kaggle_carvana_segmentation
TRAIN SCRATCH
---
==========
FOLD 0
BATCH 1
gacc 4
epochs 250
==========

train_scratch.sh: line 57:  5912 Segmentation fault      (core dumped) python run_train.py -b=$BATCH -gacc=$gacc -f=$FOLD -nf=7 -fv=1 --lr=0.005 -opt=sgd --decay_step=100 --decay_gamma=0.5 -aug=2 --weight_decay=0.0005 -o="${o_dir}" --epochs=$epochs --no_cudnn

Then I ran "ternaus/train.sh", which also did not work:

Traceback (most recent call last):
  File "src/train.py", line 196, in <module>
    main()
  File "src/train.py", line 191, in main
    fold=args.fold
  File "/home/ubuntu/Segmentation/kaggle_carvana_segmentation/ternaus/src/utils.py", line 112, in train
    for i, (inputs, targets) in enumerate(tl):
  File "/home/ubuntu/anaconda3/envs/py35_ternaus/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 301, in __iter__
    return DataLoaderIter(self)
  File "/home/ubuntu/anaconda3/envs/py35_ternaus/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 171, in __init__
    self._put_indices()
  File "/home/ubuntu/anaconda3/envs/py35_ternaus/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 210, in _put_indices
    indices = next(self.sample_iter, None)
  File "/home/ubuntu/anaconda3/envs/py35_ternaus/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 115, in __iter__
    for idx in self.sampler:
  File "/home/ubuntu/anaconda3/envs/py35_ternaus/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
    return iter(torch.randperm(len(self.data_source)).long())
RuntimeError: invalid argument 1: must be strictly positive at /pytorch/torch/lib/TH/generic/THTensorMath.c:2033
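The last frame above is torch.randperm(len(self.data_source)), and the message says its argument must be strictly positive, so the dataset most likely has length 0 (for example, the image directory is empty or the path is wrong). A minimal sketch of a check that fails early with a clearer message is below. The function name list_images and the "*.jpg" pattern are assumptions for illustration and are not taken from the ternaus code.

```python
from pathlib import Path


def list_images(image_dir, pattern="*.jpg"):
    """Return the sorted image paths in image_dir, failing early if none match.

    Hypothetical helper: the directory layout and file pattern are assumptions.
    """
    paths = sorted(Path(image_dir).glob(pattern))
    if not paths:
        # An empty list here would later surface as the randperm error above.
        raise FileNotFoundError(
            "no files matching {!r} in {}".format(pattern, image_dir))
    return paths
```

Calling such a check on the training image directory before building the DataLoader would show whether the data was found at all.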

I also ran "albu/train.sh":

~/Segmentation/kaggle_carvana_segmentation/albu/src ~/Segmentation/kaggle_carvana_segmentation/albu
train.sh: line 6:  6834 Segmentation fault      (core dumped) PYTHONPATH=$(pwd):$PYTHONPATH python train.py

asanakoy (Owner) commented Jul 3, 2018

How do you run the scripts?
You should use bash: bash setup_env.sh and bash train.sh.
Maybe you are using sh instead of bash.

lyziew (Author) commented Jul 4, 2018

Thank you very much.
I found the reason for the segmentation fault: my PyTorch does not work well with tensorboardX when PyTorch is imported first, so I now import tensorboardX first.
asanakoy's solution now runs well, but albu's solution reports that there is not enough memory. Should I reduce the batch size or reduce the image size?
ternaus's solution still fails with the same error; something seems to go wrong when enumerate(tl) is called.
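
The import-order workaround described in this comment can be sketched as follows. This is only an illustration under assumptions: import_in_order is a hypothetical helper, and modules that are not installed are recorded as None instead of raising, so the snippet also runs on machines without tensorboardX or torch.

```python
import importlib


def import_in_order(names):
    """Import modules one by one in the given order.

    Returns a dict mapping each name to the imported module,
    or to None if that module is not installed.
    """
    modules = {}
    for name in names:
        try:
            modules[name] = importlib.import_module(name)
        except ImportError:
            modules[name] = None
    return modules


if __name__ == "__main__":
    # Order reported to avoid the segmentation fault: tensorboardX before torch.
    loaded = import_in_order(["tensorboardX", "torch"])
    print({name: module is not None for name, module in loaded.items()})
```

In a training script the same effect comes from placing the plain import of tensorboardX above the import of torch at the top of the entry-point file.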

asanakoy (Owner) commented Jul 4, 2018

Try to decrease the batch size.
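
A smaller batch lowers peak memory, and averaging gradients over several small batches before each optimizer step can keep the effective batch size unchanged. The "gacc 4" value in the log at the top of this thread presumably refers to such gradient accumulation, but that reading is an assumption. The toy example below uses a 1-D model y = w * x in plain Python (not the repository's training code) to show that the averaged micro-batch gradients match the full-batch gradient.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error with respect to w for y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch):
    """Average the gradients of equally sized micro-batches."""
    assert len(xs) % micro_batch == 0, "batch must split evenly"
    steps = len(xs) // micro_batch
    total = 0.0
    for i in range(0, len(xs), micro_batch):
        # Each micro-batch contributes 1/steps of the final gradient.
        total += grad_mse(w, xs[i:i + micro_batch], ys[i:i + micro_batch]) / steps
    return total


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full = grad_mse(0.5, xs, ys)                          # one batch of 4
accum = accumulated_grad(0.5, xs, ys, micro_batch=1)  # 4 batches of 1
```

The equality holds for losses that are averaged over samples and for micro-batches of equal size; layers that depend on batch statistics, such as batch normalization, will still behave differently at small batch sizes.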

lyziew (Author) commented Jul 4, 2018

Thanks for your help :)

lyziew closed this as completed Jul 4, 2018