lstm + ctc for mnist #2
I wrote the code just as your post describes, and it works well if I comment out the CTC functions. I really don't know what's wrong with it.
Thank you @anxingle, I'm very glad that you liked my post. Answering your questions:
Thank you very much. I will do what you told me as soon as I can.
I tried tf.int64.
Could you send me your dataset?
I have pushed the MNIST dataset into the data directory; you can just git clone the repository.
It takes almost 1 hour. Thanks, GFW.
Why are you trying to use CTC as a cost function? CTC is used when you don't have an alignment between your input and output, and/or the output length varies across samples. So, for a one-to-one relationship (like one image, one digit), CTC probably isn't the best solution for you. But if you intend to use this code for continuous handwriting recognition, CTC will work better. I'm looking at your code and making some changes. I'll give you feedback as soon as possible, ok?
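To make the "no alignment" point concrete, here is a small sketch (mine, not from the thread): it enumerates every length-3 frame sequence over {A, B, blank} that collapses to the target "AB" under the standard CTC rule (merge consecutive repeats, then drop blanks). CTC sums the probability over all such alignments.

```python
from itertools import product

def collapse(path, blank="*"):
    """Standard CTC collapse: merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for c in path:
        if c != prev and c != blank:
            out.append(c)
        prev = c
    return "".join(out)

# All length-3 paths over {A, B, *} that collapse to the target "AB".
paths = ["".join(p) for p in product("AB*", repeat=3) if collapse(p) == "AB"]
print(sorted(paths))  # ['*AB', 'A*B', 'AAB', 'AB*', 'ABB']
```

Five different frame-level alignments all decode to the same two-character label, which is exactly the ambiguity CTC is built to marginalize over.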
Thank you for your reply. But in this code I have 28 inputs, so it's a problem of many inputs (maybe later I'll add multiple labels) mapping to one label. My senior implemented a multi-label recognition framework with mxnet warpctc, and he told me it should be the best solution.
Yes, but CTC only works for more than one label. I'll show you a working code, but I don't think that CTC will outperform the softmax layer for this example.
Got it! I'll change to another dataset!
I made a working version and put it on gist. Your major issue was using the sparse placeholder and the sequence length placeholder. The targets required by CTC must not be encoded; you must provide them as labels and feed the sparse placeholder as a tuple:

```python
y = (
    [[0, 0], [1, 0], [2, 0], ..., [batch_size - 1, 0]],  # indices
    [label_1, label_2, label_3, ..., label_batch_size],  # values
    [batch_size, 1]                                      # dense shape
)
seq_len = [28 for _ in xrange(batch_size)]
```

I hope I could help you. If you have any question, I'll be happy to answer.
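The tuple above can be built in plain Python. A minimal sketch, assuming one digit label per image (the helper name `to_sparse_tuple` is mine, not from the gist):

```python
def to_sparse_tuple(labels):
    """Build the (indices, values, dense_shape) tuple that a sparse
    placeholder expects, for one label per sample (hypothetical helper)."""
    indices = [[i, 0] for i in range(len(labels))]  # [[0, 0], [1, 0], ...]
    values = list(labels)                           # the raw digit labels
    dense_shape = [len(labels), 1]                  # [batch_size, 1]
    return indices, values, dense_shape

batch_labels = [7, 2, 1, 0]
indices, values, dense_shape = to_sparse_tuple(batch_labels)
print(indices)      # [[0, 0], [1, 0], [2, 0], [3, 0]]
print(values)       # [7, 2, 1, 0]
print(dense_shape)  # [4, 1]
seq_len = [28 for _ in range(len(batch_labels))]  # one 28-step sequence per image
```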
You can use this dataset, whose images contain more than one digit and where the number of digits differs from image to image. CTC may work better with this dataset.
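With variable-length labels, the sparse tuple gains one row index per digit rather than one per image. A sketch of the general conversion, assuming a ragged list of label sequences (the helper name `labels_to_sparse` is mine):

```python
def labels_to_sparse(label_seqs):
    """Flatten variable-length label sequences into the (indices, values,
    dense_shape) form used for a CTC targets placeholder (hypothetical helper)."""
    indices, values = [], []
    for b, seq in enumerate(label_seqs):
        for t, v in enumerate(seq):
            indices.append([b, t])  # (batch index, position within label)
            values.append(v)
    dense_shape = [len(label_seqs), max(len(s) for s in label_seqs)]
    return indices, values, dense_shape

# Three images containing 2, 3, and 1 digits respectively.
idx, vals, shape = labels_to_sparse([[4, 2], [1, 0, 9], [7]])
print(shape)  # [3, 3]
```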
I don't even know how to express my appreciation! Thanks a lot.
You're welcome. If you have any questions, please feel free to ask.
Hi, igormq. Your blog post about CTC on TensorFlow is very helpful. Thank you a million. But I have some confusion about the CTC module:
1. If the sequence is A B B * B * B (* is blank), tf.ctc.ctc_greedy_decoder() should return ABBB. But the docs say the result is A B if merge_repeated=True.
2. My code uses an LSTM to classify MNIST data, just one layer and 28 time steps, but ctc_loss doesn't work at all. Can you help me define the right call? The code is so simple, I promise you'll get it when you see it.
Thanks again.
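The two results in question 1 come from two different collapse conventions. A sketch of both (my own illustration of the behavior described in the TensorFlow docs, not code from this repository):

```python
def collapse_standard(path, blank="*"):
    """Merge consecutive repeats first, then remove blanks."""
    out, prev = [], None
    for c in path:
        if c != prev and c != blank:
            out.append(c)
        prev = c
    return "".join(out)

def collapse_merge_repeated(path, blank="*"):
    """Remove blanks first, then merge consecutive repeats, so repeated
    labels separated only by blanks are merged too (merge_repeated=True)."""
    no_blanks = [c for c in path if c != blank]
    out = [c for i, c in enumerate(no_blanks) if i == 0 or c != no_blanks[i - 1]]
    return "".join(out)

print(collapse_standard("ABB*B*B"))        # ABBB
print(collapse_merge_repeated("ABB*B*B"))  # AB
```

So both answers are consistent: ABBB is the standard CTC decoding, while A B is what you get when repeats are merged across blanks.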