Training accuracy #10
Comments
Hi, did the training loss decrease? If so, be patient; the accuracy should improve after several epochs.
Hi, yes, it decreases very slowly. I will give it more time. Thanks.
Hi, how is the training accuracy? If it keeps going up and down, you can turn down the
Hi, yes, exactly, it keeps fluctuating. I will change lr and see. Thanks.
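For anyone reading along, one common way to act on a fluctuating metric is to lower the learning rate automatically when the loss stops improving. Below is a minimal, framework-free sketch of that idea. The class name `PlateauLR` and its parameters are hypothetical and are not part of this repository; PyTorch ships a comparable scheduler, `torch.optim.lr_scheduler.ReduceLROnPlateau`, which is probably the better choice in a real training loop.

```python
class PlateauLR:
    """Hypothetical helper: scale the learning rate down when the loss stalls."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr = lr                # current learning rate
        self.factor = factor        # multiplier applied on each reduction
        self.patience = patience    # checks without improvement that are tolerated
        self.min_lr = min_lr        # lower bound for the learning rate
        self.best = float("inf")    # best (lowest) loss seen so far
        self.bad_checks = 0         # consecutive checks without improvement

    def step(self, loss):
        """Record one loss value and return the (possibly reduced) learning rate."""
        if loss < self.best:
            self.best = loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
            if self.bad_checks > self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_checks = 0
        return self.lr


scheduler = PlateauLR(lr=0.01, patience=2)
for loss in [1.0, 0.9, 0.95, 0.93, 0.92]:
    current_lr = scheduler.step(loss)
print(current_lr)  # 0.005: three checks in a row failed to beat 0.9
```

The `patience` value keeps a single noisy check from triggering a reduction, which matters when the metric is already jumping around.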
Hi Holmeyoung, I was able to train the model after generating a clean dataset. However, my question is: when should I stop the training? The accuracy is increasing and is about to reach 100%, which may cause the model to overfit. So, when should I stop? Thanks,
Hi, that's good news.
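On the question of when to stop, a widely used rule of thumb is early stopping: track accuracy on a held-out validation set (not the training set) and stop once it has not improved for a number of epochs. The sketch below only illustrates that rule; the function name and the `patience` and `min_delta` parameters are hypothetical, and nothing here is taken from the repository's training script.

```python
def best_epoch_with_early_stopping(val_accuracies, patience=5, min_delta=0.0):
    """Return (best_epoch, best_accuracy) under a simple early-stopping rule.

    val_accuracies: validation accuracy per epoch, in order.
    patience: epochs without improvement after which training would stop.
    min_delta: minimum gain that counts as an improvement.
    """
    best_acc = float("-inf")
    best_epoch = -1
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc + min_delta:
            best_acc = acc
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_epoch, best_acc


history = [0.5, 0.7, 0.9, 0.95, 0.94, 0.95, 0.93]
print(best_epoch_with_early_stopping(history, patience=3))  # (3, 0.95)
```

In practice one would keep the checkpoint saved at `best_epoch` rather than the weights from the final epoch.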
Thank you again for your cooperation. Yes, the accuracy became stable at 95% after 30 epochs. However, when I test the model on new samples, it cannot recognise the words. When I use samples similar to the ones used for training, it recognises them.
Hi, since you want to predict on noisy images, you shouldn't train on clean data. Try to make your training samples look like the images you actually want to predict on. If you just want to train on a small dataset, you can change nn.LSTM(nIn, nHidden, bidirectional=True) to nn.LSTM(nIn, nHidden, bidirectional=True, dropout=0.5)
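As one illustration of making clean training samples look more like noisy test images, the sketch below adds Gaussian pixel noise to a grayscale image. It is only an assumption about what kind of degradation the real images have; blur, rotation, or background texture may matter more in a given dataset. The function name `add_gaussian_noise` and the list-of-rows image representation are hypothetical and chosen to keep the example free of dependencies.

```python
import random


def add_gaussian_noise(image, sigma=20.0, seed=None):
    """Return a noisy copy of a grayscale image.

    image: list of rows, each a list of integer pixel values in 0..255.
    sigma: standard deviation of the zero-mean Gaussian noise.
    seed: optional seed so that the augmentation is reproducible.
    """
    rng = random.Random(seed)
    return [
        # Add noise to each pixel, then round and clamp back into 0..255.
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


clean = [[0, 128, 255], [255, 128, 0]]
noisy = add_gaussian_noise(clean, sigma=20.0, seed=0)
print(noisy)
```

Applying the noise with a fresh seed each epoch gives the model a slightly different version of every sample, which tends to help on small datasets.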
Sure, I will try using dropout. Thanks.
Hi guys, I have a question. I want to train on my own dataset (Farsi language), but I cannot understand what the shape of the dataset should be. In my problem, every image in the dataset contains 5 sentences (3 numbers, 2 names). Can you tell me how to prepare it for training? Do I have to crop all of them, or can I train on them as one image (I mean all 5 sentences in one image)? For example: Best,
Hi, from my experience, you have to crop all of your images. The dataset that I used contained sentences, but I cropped all the images to build my model. The shape of the dataset should be as Holmeyoung explains in the Readme file. absolute/path/to/image/一身转战_0.jpg
Thanks for your response. There is no problem with training, but for testing I want to give the model the whole image (I mean all 5 sentences). Is this model able to detect all 5 sentences in the test phase?
Yes, it can recognize sentences. However, the words will be joined together. For example, Niddal Imam will be recognized as niddalimam.
So there is no way to correct this problem? Also, I have another question: in Farsi, as in Arabic, when two letters are joined, their shapes change, for example "ح" versus "حا". What should I do about this? Again, thanks for being kind.
I have not found a solution for this problem yet. However, in my project, I used a text detection model that detects words instead of sentences. After that, the recognition model can extract the words correctly. Regarding the second question, you are right: the model will recognize "ح" and "ا" as "حا" because the letters are connected. I do not know how to solve this problem, because CTC separates characters by blanks.
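To make the remark about CTC blanks concrete, here is a minimal sketch of greedy CTC decoding: consecutive repeated labels are collapsed, then blanks are dropped. It assumes label 0 is the blank and label `i` maps to `alphabet[i - 1]`; that indexing convention is an assumption for illustration, not something confirmed by this thread. One consequence worth noting is that the decoder can only output characters that are in the alphabet, so a space can appear between words only if the alphabet and the training labels include one.

```python
def ctc_greedy_decode(frame_labels, alphabet, blank=0):
    """Collapse repeated labels and remove blanks (greedy CTC decoding).

    frame_labels: the most likely label index for each time step.
    alphabet: string of characters; label i maps to alphabet[i - 1] (assumed).
    blank: index reserved for the CTC blank symbol (assumed to be 0).
    """
    chars = []
    previous = blank
    for label in frame_labels:
        # Emit a character only when it is not blank and not a repeat.
        if label != blank and label != previous:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)


alphabet = "ab "  # includes a space, so the decoder is able to emit one
frames = [1, 1, 0, 1, 2, 2, 0, 3, 0, 1]
print(repr(ctc_greedy_decode(frames, alphabet)))  # 'aab a'
```

The blank between the first two `1` labels is what lets the decoder output a doubled letter instead of merging it into one.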
I have trained the network, but the model is not being saved. The path in "params.py" is expr, and this folder is created in my project folder, but no weights are saved in it. Do you know how to fix this?
Hi, you need to change these parameters: For more information, please refer to #3.
@aaobscure was your issue resolved? I am also unable to save my model to the expr folder, even with these parameters. @niddal-imam, can you help me here?
You can either change these parameters or use a larger training dataset. For example: If it does not work, lower these parameters.
Hi Holmeyoung,
I have 20,000 training samples and 30 characters. I have been trying to train the model, but the accuracy does not increase. How should I set the parameters?