Some problems about this code #8

Open
980044579 opened this issue Dec 12, 2017 · 12 comments
Comments

@980044579

980044579 commented Dec 12, 2017

In fact, "max_stepsize" in this code shouldn't be 64. It should be 12, which is the original "image_width" (180) halved four times, rounding up at each step: 180/2/2/2/2 ≈ 12. Remember, the core idea of CRNN+CTC is that we split the image vertically into many slices, predict the class of each slice, and finally use CTC to decode the predicted sequence into the corresponding result. For example, "aaa_bb_c_" and "a__b_ccc" both correspond to the same label "abc". You can also read the paper for more details.

But when I ran the uncorrected code on the author's dataset, I got 98% accuracy, while I got a bad result on the VGGWord dataset. After changing the code, I finally got a good result.

So why does this code work in your situation? I am very curious about this. Thank you.
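The collapsing rule in the example above can be sketched in a few lines. This is only an illustration of the CTC decoding rule (merge consecutive repeats, then drop blanks), not code from this repository; the function name `ctc_collapse` is made up here, and "_" is assumed to stand for the blank symbol as in the comment.

```python
from itertools import groupby

BLANK = "_"  # assumption: "_" marks the CTC blank, as in the example above


def ctc_collapse(path, blank=BLANK):
    """Collapse a per-slice prediction: merge consecutive repeats, then drop blanks."""
    merged = (ch for ch, _ in groupby(path))
    return "".join(ch for ch in merged if ch != blank)


print(ctc_collapse("aaa_bb_c_"))  # abc
print(ctc_collapse("a__b_ccc"))   # abc
```

Note that a blank between two identical characters keeps them apart, so "a_a" collapses to "aa" rather than "a".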

@LevinJ

LevinJ commented Dec 26, 2017

@980044579, thanks for sharing your observations and experience.

  1. With the great source code in this project and the data provided, I was able to reproduce the author's result, getting 0.997 at the 50th epoch.
  2. I agree with you on max_stepsize: it should run along the "image_width" direction, which gives 12 in this project. I also plan to correct this and see how it affects the final result. If it's okay, can you share your code changes in this area?
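As a rough check of the number 12, the width that survives the CNN can be computed directly. The sketch below assumes four stride-2 downsampling steps with 'SAME'-style (round up) padding, which matches the 180/2/2/2/2 reasoning in this thread but has not been verified against the repository; `time_steps` is a hypothetical helper name.

```python
import math


def time_steps(image_width, num_pools=4, stride=2):
    """Width remaining after num_pools stride-2 steps, rounding up each time."""
    width = image_width
    for _ in range(num_pools):
        width = math.ceil(width / stride)
    return width


print(time_steps(180))  # 180 -> 90 -> 45 -> 23 -> 12
```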

@980044579
Author

Just change the code between the CNN and the RNN in cnn_lstm_otc_ocr.py, and make sure the shape of the RNN input is [batch_size, max_stepsize, num_features].
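To make the target shape concrete, here is a framework-free sketch of the rearrangement for a single image: each image column becomes one time step, and the values from all rows and channels in that column become its feature vector. The sizes (height 4, width 12, 64 channels) are assumptions for illustration, and `to_sequence` is a hypothetical name, not a function from the repository.

```python
def to_sequence(feature_map):
    """Turn a [height][width][channels] nested list into [width][height * channels]."""
    height = len(feature_map)
    width = len(feature_map[0])
    return [
        [value for h in range(height) for value in feature_map[h][w]]
        for w in range(width)
    ]


# Assumed post-CNN sizes for one 180x60 image: height 4, width 12, 64 channels.
H, W, C = 4, 12, 64
fmap = [[[(h, w, c) for c in range(C)] for w in range(W)] for h in range(H)]
seq = to_sequence(fmap)
print(len(seq), len(seq[0]))  # 12 256
```

In TensorFlow the same idea would usually be a transpose that moves the width axis next to the batch axis, followed by a reshape that merges height and channels. The exact axes depend on the layout the CNN produces.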

@LevinJ

LevinJ commented Dec 27, 2017

Hi @980044579, thanks a lot for your kind reply. I made the code changes yesterday too and found that the model can reach 0.999 accuracy at the 12th epoch. So the model converges faster and performs better after this bug is fixed.

For those who are interested, here are my code changes.

@980044579
Author

Good job~

@anubhavrohatgi

I am getting an error: Failed precondition: sequence_length(0) <= 12

For inference, I had already trained the model up to

model_checkpoint_path: "ocr-model-21001"
all_model_checkpoint_paths: "ocr-model-21001"

on a set of 80000 train and 20 val images as provided in the dataset. I took a few images from the val set and created a folder infer (40 images named 1.png .. 40.png). I tried to run the inference code using the command given in the readme.

INFO:tensorflow:Restoring parameters from ./checkpoint/ocr-model-20001
restore from ckpt./checkpoint/ocr-model-20001
2018-01-23 11:16:17.305360: W tensorflow/core/framework/op_kernel.cc:1192] Failed precondition: sequence_length(0) <= 12
Traceback (most recent call last):
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1323, in _do_call
return fn(*args)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
status, run_metadata)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: sequence_length(0) <= 12
[[Node: CTCBeamSearchDecoder = CTCBeamSearchDecoder[beam_width=100, merge_repeated=false, top_paths=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lstm/transpose_2, _arg_lstm/Fill_0_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./main.py", line 184, in <module>
tf.app.run()
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./main.py", line 179, in main
infer(FLAGS.infer_dir, FLAGS.mode)
File "./main.py", line 155, in infer
dense_decoded_code = sess.run(model.dense_decoded, feed)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: sequence_length(0) <= 12
[[Node: CTCBeamSearchDecoder = CTCBeamSearchDecoder[beam_width=100, merge_repeated=false, top_paths=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lstm/transpose_2, _arg_lstm/Fill_0_1)]]

Caused by op 'CTCBeamSearchDecoder', defined at:
File "./main.py", line 184, in <module>
tf.app.run()
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./main.py", line 179, in main
infer(FLAGS.infer_dir, FLAGS.mode)
File "./main.py", line 115, in infer
model.build_graph()
File "/home/anubhav/Downloads/Manish Sir/CNN_LSTM_CTC_Tensorflow-master (2)/cnn_lstm_otc_ocr.py", line 24, in build_graph
self._build_train_op()
File "/home/anubhav/Downloads/Manish Sir/CNN_LSTM_CTC_Tensorflow-master (2)/cnn_lstm_otc_ocr.py", line 158, in _build_train_op
merge_repeated=False)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 269, in ctc_beam_search_decoder
merge_repeated=merge_repeated))
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 76, in _ctc_beam_search_decoder
top_paths=top_paths, merge_repeated=merge_repeated, name=name)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

FailedPreconditionError (see above for traceback): sequence_length(0) <= 12
[[Node: CTCBeamSearchDecoder = CTCBeamSearchDecoder[beam_width=100, merge_repeated=false, top_paths=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lstm/transpose_2, _arg_lstm/Fill_0_1)]]

@980044579
Author

@anubhavrohatgi make sure the max length of the labels in your dataset is <= max_stepsize
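A small pre-flight check along these lines might help narrow the problem down. It is a sketch only: `check_batch` is a hypothetical helper, and the reading of the error is an assumption, namely that "sequence_length(0) <= 12" is a check that each fed sequence length is at most the time dimension of the logits (12 here), so feeding 64 from an unchanged utils.py would trip it.

```python
def check_batch(labels, seq_lens, max_stepsize):
    """Return a list of problems; an empty list means the batch looks consistent."""
    problems = []
    for i, n in enumerate(seq_lens):
        if n > max_stepsize:
            problems.append(
                "seq_len[%d]=%d exceeds max_stepsize=%d" % (i, n, max_stepsize)
            )
    for i, label in enumerate(labels):
        if len(label) > max_stepsize:
            problems.append(
                "label[%d] has length %d, more than max_stepsize=%d"
                % (i, len(label), max_stepsize)
            )
    return problems


# Hypothetical batch: labels fit, but the fed sequence lengths are still 64.
for problem in check_batch(["1234", "98765"], [64, 64], 12):
    print(problem)
```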

@anubhavrohatgi

anubhavrohatgi commented Jan 23, 2018

@980044579 Please explain a bit; I am quite new to this stuff in Python. What is the max length of a label?

Currently I am using the dataset provided in the link given in the repo. max_stepsize = 64, I guess, as stated in utils.py.

All images are 180x60.

The error occurs somewhere here:
dense_decoded_code = sess.run(model.dense_decoded, feed)

Below are the contents of my infer folder:
screen2

@anubhavrohatgi

anubhavrohatgi commented Jan 23, 2018

Are you talking about labels.txt?

Correct me if I am wrong here: by "infer" we mean testing on our own real data, is that right?
If not, please help me: how can I use the model to predict the values of a given input image?

@fanw52

fanw52 commented Feb 6, 2018

@anubhavrohatgi @980044579, hello, I ran into the same problem, but I inspected the labels and found that the max label length is not greater than maxT in [maxT, batch_size, num_char]. Have you solved it? I don't know how to fix it.

@980044579
Author

@anubhavrohatgi @kstys make sure you understand how the "CNN + RNN + CTC" framework works; there are some bugs in this code. You should change not only "maxsteps" in utils.py but also the code between the CNN and the RNN in cnn_lstm_otc_ocr.py.

@lovebobo

I have a question. In cnn_lstm_otc_ocr.py, after the CNN, is x.set_shape([FLAGS.batch_size, filters[3], 24]) right? The time sequence fed to the LSTM should run along the width, but the code uses the number of channels as its length.

@lovebobo

I changed the code as @LevinJ did, but I got an error: "tensorflow/core/util/ctc/ctc_loss_calculator.cc:144] No valid path found."

I set max_step to 128 and my input image is 32*192.
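One common cause of "No valid path found" is that a label cannot fit into the available time steps. CTC needs at least one step per character plus one blank between every pair of identical neighbouring characters. The sketch below checks that condition; the helper names are hypothetical, and the figure of 12 steps assumes the network still shrinks the width by four stride-2 steps (192 -> 12), in which case a max_step of 128 would not match the real sequence length.

```python
def min_ctc_steps(label):
    """Fewest time steps for which CTC has any valid alignment of the label."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def has_valid_path(label, steps):
    """True if the label can be aligned within the given number of time steps."""
    return steps >= min_ctc_steps(label)


print(min_ctc_steps("hello"))              # 6: the two l's need a blank between them
print(has_valid_path("hello", 12))         # True
print(has_valid_path("aabbccddeeff", 12))  # False: 12 characters + 6 blanks = 18
```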
