Training not moving after printing '0 training images in the /data/ocr/icdar2015' #150

Closed
saharudra opened this issue May 3, 2018 · 9 comments

Comments

@saharudra

The training is not moving after the following output.

python3 multigpu_train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=14 --checkpoint_path=./tmp/east_icdar2015_resnet_v1_50_rbox/ --text_scale=512 --training_data_path=/data/ocr/icdar2015/ --geometry=RBOX --learning_rate=0.0001 --num_readers=24 --pretrained_model_path=./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
/usr/local/lib/python3.5/dist-packages/matplotlib/backends/backend_gtk3agg.py:18: UserWarning: The Gtk3Agg backend is known to not work on Python 3.x with pycairo. Try installing cairocffi.
  "The Gtk3Agg backend is known to not work on Python 3.x with pycairo. "
resnet_v1_50/block1 (?, ?, ?, 256)
resnet_v1_50/block2 (?, ?, ?, 512)
resnet_v1_50/block3 (?, ?, ?, 1024)
resnet_v1_50/block4 (?, ?, ?, 2048)
Shape of f_0 (?, ?, ?, 2048)
Shape of f_1 (?, ?, ?, 512)
Shape of f_2 (?, ?, ?, 256)
Shape of f_3 (?, ?, ?, 64)
Shape of h_0 (?, ?, ?, 2048), g_0 (?, ?, ?, 2048)
Shape of h_1 (?, ?, ?, 128), g_1 (?, ?, ?, 128)
Shape of h_2 (?, ?, ?, 64), g_2 (?, ?, ?, 64)
Shape of h_3 (?, ?, ?, 32), g_3 (?, ?, ?, 32)
WARNING:tensorflow:Variable feature_fusion/Conv_1/BatchNorm/gamma missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/BatchNorm/beta missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/BatchNorm/gamma missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/BatchNorm/gamma missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/BatchNorm/beta missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_8/biases missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_9/biases missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/BatchNorm/beta missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/BatchNorm/gamma missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_1/BatchNorm/beta missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv/BatchNorm/gamma missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_2/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/BatchNorm/gamma missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_8/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_3/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_7/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_9/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_4/BatchNorm/gamma missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_5/BatchNorm/beta missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_7/biases missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv/BatchNorm/beta missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv/weights missing in checkpoint ./tmp/resnet_v1_50.ckpt
WARNING:tensorflow:Variable feature_fusion/Conv_6/BatchNorm/beta missing in checkpoint ./tmp/resnet_v1_50.ckpt
2018-05-03 20:40:28.506116: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Generator use 10 batches for buffering, this may take a while, you can tune this yourself.
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/
0 training images in /data/ocr/icdar2015/

Not sure what I am doing wrong. Any pointers on how to fix this, or is this running fine?

@saharudra
Author

saharudra commented May 3, 2018

Got the 0 training images in /data/ocr/icdar2015/ issue fixed by changing --training_data_path=/data/ocr/icdar2015/ to --training_data_path=./data/ocr/icdar2015/. The training still seems to be stuck.
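
One way to confirm which path the loader actually sees is to count the matching files before launching training. The sketch below assumes the data loader globs the top level of --training_data_path for common image extensions; the function name `count_training_images` and the extension list are illustrative assumptions, not taken from the repository.

```python
import glob
import os

# Assumed extension list; adjust to whatever the data loader really accepts.
IMAGE_EXTS = ('jpg', 'png', 'jpeg', 'JPG')


def count_training_images(training_data_path, exts=IMAGE_EXTS):
    """Count image files directly inside training_data_path (non-recursive)."""
    files = set()  # a set avoids double counting on case-insensitive matches
    for ext in exts:
        files.update(glob.glob(os.path.join(training_data_path, '*.' + ext)))
    return len(files)


if __name__ == '__main__':
    # Compare the absolute path with the path relative to the working directory.
    for path in ('/data/ocr/icdar2015/', './data/ocr/icdar2015/'):
        print(os.path.abspath(path), count_training_images(path))
```

If the absolute path prints 0 and the relative one does not, the dataset lives under the current working directory rather than at the filesystem root, which would match the fix described above.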

@saharudra
Author

Got it to work by reducing num_readers.
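
The thread does not say why fewer readers helped. One plausible reading, assumed here, is that 24 reader processes oversubscribed the machine. A conservative starting point under that assumption is to cap the reader count at the number of available CPU cores. `capped_num_readers` is a hypothetical helper, not part of the training script.

```python
import os


def capped_num_readers(requested, available_cpus=None):
    """Return a reader count between 1 and the number of CPU cores.

    This is a heuristic, not a rule from the project: it only prevents
    asking for more reader processes than there are cores.
    """
    if available_cpus is None:
        available_cpus = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(requested, available_cpus))


if __name__ == '__main__':
    print('--num_readers=%d' % capped_num_readers(24))
```

The printed value can be passed to --num_readers in place of 24. If training still stalls, lowering it further is cheap to try.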

@xiaodao1990

@saharudra Hello, I have the same problem. How did you solve it?

@jiaying96

@saharudra I reduced num_readers, but it had no effect.

@GuoxingYan

--training_data_path=./data/ocr/** (note the leading "." in the path)

@magicxiaobai

@GuoxingYan I changed all the paths, but it made no difference; it still reports that the path does not exist.

@AaaGss

AaaGss commented Oct 23, 2019

I ran into the same problem.

@Eatzhy

Eatzhy commented Oct 13, 2020

The image path is wrong. For example, if the training images are under data/ocr/icdar2015/train/, change the path accordingly.
After that you may hit another error: text file ./data/ocr/icdar2015/train/imgs/gt_img_211.txt does not exists
In that case, change line 604 of icdar.py to:
txt_fn = im_fn.replace(os.path.basename(im_fn), 'gt_%s.txt' % os.path.basename(im_fn).split('.')[0])
txt_fn = txt_fn.replace('imgs','gt')

@Eatzhy

Eatzhy commented Oct 13, 2020

The image path is wrong. For example, if the training images are in "data/ocr/icdar2015/train/", change the training data path to match.
You may then get another error:
text file ./data/ocr/icdar2015/train/imgs/gt_img_211.txt does not exists
Then change line 604 in icdar.py to:

txt_fn = im_fn.replace(os.path.basename(im_fn), 'gt_%s.txt' % os.path.basename(im_fn).split('.')[0])
txt_fn = txt_fn.replace('imgs','gt')
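
To see what those two lines do to a path, they can be wrapped in a small standalone function. This is a minimal sketch: the function name `gt_path_for_image` and the directory parameters are hypothetical, and it assumes a layout where images sit in an `imgs` directory and annotations in a sibling `gt` directory, as the error message above suggests.

```python
import os


def gt_path_for_image(im_fn, img_dir_name='imgs', gt_dir_name='gt'):
    """Map an image path to its annotation path, e.g. img_211.jpg -> gt_img_211.txt."""
    base = os.path.basename(im_fn)   # 'img_211.jpg'
    stem = base.split('.')[0]        # 'img_211'
    # Swap the file name for the ground-truth file name in the same directory.
    txt_fn = im_fn.replace(base, 'gt_%s.txt' % stem)
    # Redirect from the image directory to the annotation directory.
    # Note: str.replace changes every occurrence of img_dir_name in the path.
    return txt_fn.replace(img_dir_name, gt_dir_name)


if __name__ == '__main__':
    print(gt_path_for_image('./data/ocr/icdar2015/train/imgs/img_211.jpg'))
```

Because `str.replace` rewrites every occurrence, a dataset path that contains `imgs` in more than one component would need a more careful substitution.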


7 participants