Training accuracy #10

niddal-imam · 2019-05-19T12:57:07Z

Hi Holmeyoung,

I have 20,000 training samples and 30 characters. I have been trying to train the model, but the accuracy does not go up. How should I set the parameters?

Holmeyoung · 2019-05-19T14:04:17Z

Hi, is the training loss decreasing? If so, be patient; the accuracy should improve after several epochs.

niddal-imam · 2019-05-19T14:11:53Z

Hi,

Yes, it decreases very slowly. I will give it more time.

Thanks

Holmeyoung · 2019-05-20T01:09:55Z

Hi, how is the training accuracy? If it keeps going up and down, you can lower the lr to 0.0001, as I suggested in #2.

niddal-imam · 2019-05-20T08:46:31Z

Hi,

Yes, exactly, it keeps fluctuating. I will change the lr and see.

Thanks

niddal-imam · 2019-05-21T14:44:24Z

Hi Holmeyoung,

I was able to train the model after generating a clean dataset. However, my question is when to stop training. The accuracy is increasing and about to reach 100%, which may cause the model to overfit. So, when should I stop?

Thanks,

Holmeyoung · 2019-05-22T01:51:01Z

Hi, that's good news.
As for your question, this is exactly what the val dataset is for. The common strategy is "no improvement in n": keep track of the best accuracy, and if the model's performance on the val dataset has not improved after several epochs, it's time to stop.
But how many is "several"? If the accuracy takes around 100 or 1000 epochs to become stable, wait about 10 to 20 epochs. If the model is already well trained after 10 epochs (in that case it usually reaches about 80% accuracy within 1 epoch), waiting about 3 to 5 epochs is enough.
One more thing: this is also why the model is saved at a fixed interval, so that you don't miss the best one.
Hope this helps.
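
The "no improvement in n" rule described above can be sketched in a few lines of plain Python. This is only an illustration, not code from this repository: the `EarlyStopping` class, its `patience` parameter, and the accuracy values in `history` are all hypothetical, and in a real run `val_acc` would come from evaluating on the val dataset after each epoch.

```python
class EarlyStopping:
    """Stop when validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=5):
        self.patience = patience          # how many non-improving epochs to tolerate
        self.best_acc = float("-inf")     # best validation accuracy seen so far
        self.best_epoch = -1              # epoch at which best_acc was reached
        self.bad_epochs = 0               # consecutive epochs without improvement

    def step(self, epoch, val_acc):
        """Record one epoch; return True when training should stop."""
        if val_acc > self.best_acc:
            self.best_acc = val_acc
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation accuracies, one per epoch, for illustration only.
history = [0.60, 0.80, 0.90, 0.95, 0.94, 0.95, 0.93, 0.94]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, acc in enumerate(history):
    if stopper.step(epoch, acc):
        stopped_at = epoch
        break

print("stopped at epoch", stopped_at, "best epoch", stopper.best_epoch)
```

The checkpoint worth keeping is the one from `best_epoch`, not the one from the epoch where training stopped, which is why saving at a regular interval matters.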

niddal-imam · 2019-05-22T02:32:12Z

Thank you again for your cooperation. Yes, the accuracy became stable at 95% after 30 epochs. However, when I test the model on new samples, it cannot recognise the words. When I use samples similar to the ones used for training, it recognises them.

Holmeyoung · 2019-05-22T06:24:18Z

Hi, since you want to predict on noisy images, you shouldn't train on clean data. Try to make your training samples look like the images you actually want to predict on. If you just want to train on a small dataset, you can add dropout to the net to avoid overfitting.
In models/crnn.py, change

nn.LSTM(nIn, nHidden, bidirectional=True)

to

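nn.LSTM(nIn, nHidden, bidirectional=True, dropout=0.5)

Note that PyTorch applies the dropout argument of nn.LSTM only between stacked layers, so it takes effect only when num_layers is greater than 1.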

niddal-imam · 2019-05-22T07:14:15Z

Sure, I will try using the dropout.

Thanks

aaobscure · 2019-06-18T18:54:01Z

@Holmeyoung @niddal-imam

Hi guys,

I have a question.

I want to train on my own dataset (Farsi), but I don't understand what format the dataset should have.

In my case, every image in the dataset contains 5 sentences (3 numbers, 2 names). Can you tell me how to prepare it for training?

Do I have to crop all of them, or can I train on the whole image (I mean all 5 sentences in one image)?

For example:
Sentence 1: niddal
Sentence 2: 10
Sentence 3: Imam
.....

Best,

niddal-imam · 2019-06-18T19:37:54Z

Hi,

From my experience, you have to crop all of your images. The dataset that I used contained sentences, but I cropped all the images to build my model. The format of the dataset should be as Holmeyoung explains in the Readme file:

absolute/path/to/image/一身转战_0.jpg
一身转战
absolute/path/to/image/三千里_1.jpg
三千里
absolute/path/to/image/一剑曾当百万师_2.jpg
一剑曾当百万师
absolute/path/to/image/3.jpg
一剑曾当百万师
absolute/path/to/image/一剑曾当百万师_4.jpg
一剑曾当百万师
absolute/path/to/image/niddal.jpg
niddal
absolute/path/to/image/imam.jpg
imam
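
A label file in the layout shown above (one line with the image path, followed by one line with its text) can be produced and checked with a short script. This is a sketch under assumptions: the helper names `write_label_file` and `read_label_file`, the file name `train.txt`, and the `/data/crops/...` paths are hypothetical and are not part of the repository.

```python
import tempfile
from pathlib import Path


def write_label_file(samples, out_path):
    """Write (image_path, label) pairs as alternating lines: path, then label."""
    lines = []
    for image_path, label in samples:
        lines.append(str(image_path))
        lines.append(label)
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_label_file(path):
    """Read the file back into (image_path, label) pairs and check its shape."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    if len(lines) % 2 != 0:
        raise ValueError("expected an even number of lines (path, label, ...)")
    return list(zip(lines[0::2], lines[1::2]))


# Hypothetical cropped images, one text line per crop.
samples = [
    ("/data/crops/niddal.jpg", "niddal"),
    ("/data/crops/10.jpg", "10"),
    ("/data/crops/imam.jpg", "imam"),
]

with tempfile.TemporaryDirectory() as tmp:
    label_path = Path(tmp) / "train.txt"
    write_label_file(samples, label_path)
    loaded = read_label_file(label_path)

print(loaded)
```

Writing the file as UTF-8 matters for non-Latin labels such as Chinese or Farsi text.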

aaobscure · 2019-06-18T19:41:59Z

@niddal-imam

Thanks for your response.

Training is not a problem, but at test time I want to feed in the whole image (all 5 sentences). Is this model able to detect all 5 sentences in the test phase?

niddal-imam · 2019-06-18T19:54:09Z

Yes, it can recognize sentences. However, words are going to be connected to each other. For example, Niddal Imam will be recognized as niddalimam.

aaobscure · 2019-06-18T19:59:00Z

@niddal-imam

So there is no way to correct this problem?

Also, I have another question:

In Farsi, as in Arabic, a letter changes shape when it connects to the next one, for example "ح" on its own versus in "حا". What should I do about this?

Again thanks for being kind.

niddal-imam · 2019-06-18T20:19:10Z

I have not found a solution for this problem yet. However, in my project, I used a text detection model that detects words instead of sentences. After that, the recognition model can extract the words correctly. Regarding the second question, you are right: the model will recognize "ح" and "ا" as "حا" because the letters are connected. I do not know how to solve this problem, because CTC separates characters by blanks.

aaobscure · 2019-06-19T02:24:50Z

@niddal-imam

I have trained the network, but the model is not being saved.

The path in "params.py" is expr, and this folder does appear in my project folder, but no weights are saved in it.

Do you know how to fix this?

niddal-imam · 2019-06-19T03:41:03Z

Hi

You need to change these parameters:
displayInterval = 100
valInterval = 1000
saveInterval = 1000

For more information, please refer to #3.

SreenijaK · 2019-07-18T09:55:10Z

@aaobscure was your issue resolved? I'm also not able to save my model to the expr folder, even with these parameters:
displayInterval = 100
valInterval = 1000
saveInterval = 1000

@niddal-imam can you help me here?

niddal-imam · 2019-07-18T10:01:38Z

You can either lower these parameters or use a larger training dataset. For example:
displayInterval = 10
valInterval = 50
saveInterval = 50

If it does not work, lower these parameters.
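
The reason smaller intervals or a larger dataset can help is easier to see with a little arithmetic. The sketch below assumes that the training loop counts batches within each epoch and writes a checkpoint whenever that count is a multiple of saveInterval; this is an assumption about train.py that is not confirmed in this thread. The helper names and the batch size of 64 are hypothetical; the 20,000 samples come from the first post.

```python
import math


def batches_per_epoch(n_samples, batch_size):
    """Number of batches in one epoch when the last partial batch is kept."""
    return math.ceil(n_samples / batch_size)


def save_steps_in_epoch(n_samples, batch_size, save_interval):
    """Batch counts within one epoch at which a checkpoint would be written."""
    total = batches_per_epoch(n_samples, batch_size)
    return [i for i in range(1, total + 1) if i % save_interval == 0]


n_samples = 20000   # dataset size mentioned at the top of this thread
batch_size = 64     # assumed value, for illustration only

total = batches_per_epoch(n_samples, batch_size)
never_saved = save_steps_in_epoch(n_samples, batch_size, save_interval=1000)
saved = save_steps_in_epoch(n_samples, batch_size, save_interval=50)

print(total, never_saved, saved)
```

If the list of save steps is empty, the counter never reaches saveInterval within an epoch, so under this assumption nothing would be written to the expr folder.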

Holmeyoung closed this as completed Jun 24, 2019
