
lstm + ctc for mnist #2

Closed
anxingle opened this issue Aug 28, 2016 · 16 comments

@anxingle

Hi, igormq. It was very helpful to read your blog post about CTC in TensorFlow. Thanks a million. But I have some confusion about the CTC module:
1. If the sequence is A B B * B * B (* is blank), tf.ctc.ctc_greedy_decoder() should return A B B B, but the docs say the result is A B if merge_repeated=True.
2. My code uses an LSTM to classify MNIST data, with just one layer and 28 time steps, but the CTC loss doesn't work at all. Can you help me define the right call style? The code is very simple; I promise you'll get it as soon as you see it.
Thanks again.

@anxingle
Author

I wrote it just as your code shows, and it works well if I comment out the CTC functions. I really don't know what's wrong with it.

@igormq
Owner

igormq commented Aug 29, 2016

Thank you @anxingle, I'm very glad that you liked my post. Answering your questions:

  1. Yes, you are absolutely right; that is the default behavior of TensorFlow's implementation. But in Graves' thesis he writes that you first have to merge the repeated labels and only then remove the blank labels, as we can see on page 57 of his thesis (see the sketch after this list). I have no clue why the TensorFlow team implemented it that way.
  2. I read your code, but it would be better if you sent me your error log and your code with the CTC implementation (not as a comment), because in your code I didn't see the seq_len placeholder or the sparse placeholder for y. Could you do that?
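
To make the difference concrete, here is a tiny sketch of the two decoding rules (my own illustration, not code from the post or from TensorFlow):

    def graves_collapse(path, blank='*'):
        # Graves (thesis, p. 57): merge repeated labels first, then drop blanks.
        out, prev = [], None
        for label in path:
            if label != prev and label != blank:
                out.append(label)
            prev = label
        return out

    def tf_merge_repeated(path, blank='*'):
        # The behavior described in the TensorFlow docs for merge_repeated=True:
        # drop blanks first, then merge any adjacent repeats that remain.
        no_blanks = [label for label in path if label != blank]
        return [label for i, label in enumerate(no_blanks)
                if i == 0 or label != no_blanks[i - 1]]

    print(graves_collapse(list('ABB*B*B')))    # ['A', 'B', 'B', 'B']
    print(tf_merge_repeated(list('ABB*B*B')))  # ['A', 'B']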

@anxingle
Author

Thank you very much. I will do what you suggested as soon as I can.

@anxingle
Author

I've added the entire code and the error log (error.txt).

@anxingle
Author

I tried tf.int64.

@igormq
Owner

igormq commented Aug 29, 2016

Could you send me your dataset?

@anxingle
Author

I have pushed the MNIST dataset into the data directory; you can just git clone the repository.
I am really grateful to you.

@anxingle
Author

It took almost an hour. Thanks, GFW.

@igormq
Owner

igormq commented Aug 29, 2016

Why are you trying to use CTC as a cost function? CTC is used when you don't have an alignment between your input and output and/or when the output length varies across samples. So for a one-to-one relationship (like one image, one digit), CTC probably isn't the best solution for you. But if you intend to use this code for continuous handwriting recognition, CTC will work better. I'm looking at your code and making some changes; I'll give you feedback as soon as possible, ok?

@anxingle
Author

Thank you for your reply. But in this code I have 28 inputs, so it's a problem of many inputs (maybe later I'll add multiple labels) mapping to one label. My senior implemented a multi-label recognition framework with mxnet warpctc, and he told me it should be the best solution.
So nice!

@igormq
Owner

igormq commented Aug 29, 2016

Yes, but CTC only works when there is more than one label. I'll show you working code, but I don't think CTC will outperform the softmax layer for this example.

@anxingle
Author

Got it! I'll change to another dataset!

@igormq
Owner

igormq commented Aug 29, 2016

I made a working version and put it in a gist. Your major issue was the sparse placeholder and the sequence length placeholder. The targets required by CTC must not be one-hot encoded; you must provide them as plain labels, and you must feed the sparse placeholder with a tuple of (indices, values, shape) (which is generated by sparse_tuple_from). In the case of MNIST, for a batch you will have a target like:

    y = (
        [[0, 0], [1, 0], [2, 0], ..., [batch_size-1, 0]],    # indices
        [label_1, label_2, label_3, ..., label_batch_size],  # values
        [batch_size, 1]                                      # shape
    )

And the seq_len placeholder tells the run the length of each sample in the batch; for MNIST, the network is fed 28 inputs of length 28, so:

    seq_len = [28 for _ in xrange(batch_size)]
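
For reference, here is a minimal sketch of what a sparse_tuple_from helper can look like (my own reconstruction; the version in the gist may differ):

    import numpy as np

    def sparse_tuple_from(sequences, dtype=np.int32):
        # Build the (indices, values, shape) tuple expected by a
        # tf.sparse_placeholder from a list of label sequences.
        indices, values = [], []
        for n, seq in enumerate(sequences):
            indices.extend(zip([n] * len(seq), range(len(seq))))
            values.extend(seq)
        indices = np.asarray(indices, dtype=np.int64)
        values = np.asarray(values, dtype=dtype)
        shape = np.asarray([len(sequences), indices[:, 1].max() + 1],
                           dtype=np.int64)
        return indices, values, shape

    # Single-digit MNIST targets, e.g. labels 3, 7 and 1:
    # sparse_tuple_from([[3], [7], [1]]) ->
    #   indices [[0, 0], [1, 0], [2, 0]], values [3, 7, 1], shape [3, 1]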

I hope this helps. If you have any questions, I'll be happy to answer them.

@igormq
Owner

igormq commented Aug 29, 2016

You can use this dataset, whose images have more than one digit, where the number of digits differs from image to image. CTC may work better with this dataset.

@anxingle
Author

I don't even know how to express my appreciation! Thanks a lot.

@igormq
Owner

igormq commented Aug 30, 2016

You're welcome. If you have any questions, please feel free to ask.

@igormq igormq closed this as completed Aug 30, 2016