throwing errors #2
Comments
I also get a very high validation CER of around 43 when training. I'd like to know why.
@Sammul40619 is there any way to reduce the CER, or do you have any other ideas related to handwritten text recognition?
@rpro91 I am trying. I am checking the model and the input dataset, but haven't found anything yet. How about you?
@Sammul40619 Not yet. I have tried every method. I am now running the Line HTR model by lamhoangtung (the parent version of this model) with the IAM dataset. Let's see if it works. If you find any solution or any other model, please let me know.
@rpro91 OK, I also want to try lamhoangtung's model. Are you Chinese? If you find any solution or any other model, please let me know too.
@Sammul40619 I am Indian :) :) I will surely let you know!!
@rpro91 OK! Keep in touch!
Has anyone found a solution to this?
@rpro91 are you using the IAM word dataset or the IAM line dataset? That determines whether lines.txt or words.txt is parsed. I have no idea why the CER is so high. Try batch size = 50, and try both with and without normalizing the images before training. Even without data augmentation the CER should be around 23%.
Hi Sushant,
Great work, kudos sir!!
I am facing the following issues while running this model.
In DataLoader.py, where the data is read from the ground-truth text file:
GT text are columns starting at 10
gtText_list = lineSplit[9].split('|')
gtText = self.truncateLabel(' '.join(gtText_list), maxTextLen)
This throws an "index out of range" error. I corrected it to:
gtText_list = lineSplit[8].split('|')
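For anyone hitting the same index error, here is a minimal sketch of a parser that does not depend on one hard-coded column. It assumes the ground-truth file has 8 space-separated metadata fields followed by the transcription, with words joined by '|'. That layout, the helper name parse_gt_line and the sample line are my assumptions for illustration, not code from this repository, so check them against your own lines.txt or words.txt.

```python
def parse_gt_line(line, n_meta_fields=8):
    """Split one ground-truth line into (sample_id, text).

    n_meta_fields is an assumption about the file layout: the number of
    space-separated metadata columns that precede the transcription.
    """
    fields = line.strip().split(' ')
    if len(fields) <= n_meta_fields:
        # Fail with a readable message instead of a bare IndexError.
        raise ValueError('expected more than %d fields, got %d: %r'
                         % (n_meta_fields, len(fields), line))
    # Everything after the metadata belongs to the transcription,
    # even if the transcription itself contains spaces.
    raw_text = ' '.join(fields[n_meta_fields:])
    # Words are assumed to be separated by '|' in the file.
    text = raw_text.replace('|', ' ')
    return fields[0], text


# Illustrative line in the assumed layout.
sample = 'a01-000u-00 ok 154 19 408 746 1661 89 A|MOVE|to|stop|Mr.|Gaitskell|from'
print(parse_gt_line(sample))
```

If the parsed text looks shifted by one column (for example it starts with a number), the metadata count is wrong for your file and the labels fed to training will be wrong as well, which would also hurt the CER.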
Also, in main.py:
totalEpoch = loader.trainSamples//Model.batchSize # loader.numTrainSamplesPerEpoch
while True:
    epoch += 1
    print('Epoch:', epoch, '/', totalEpoch)
This also throws an error. I worked around it by commenting out the totalEpoch line and passing only epoch to the print statement:
#totalEpoch = loader.trainSamples//Model.batchSize # loader.numTrainSamplesPerEpoch
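The exact error message is not shown here, so this is only a guess: if loader.trainSamples is a list of samples rather than a count, the // operator raises a TypeError. A minimal sketch of a helper that handles both cases follows. The Loader and Model classes below are stand-ins I made up so the snippet runs on its own; they are not the repository's classes.

```python
class Model:
    # Stand-in for the real model class; only the batch size matters here.
    batchSize = 50


class Loader:
    # Stand-in for the real data loader (hypothetical, for illustration).
    def __init__(self, samples):
        self.trainSamples = samples


def batches_per_epoch(loader, batch_size):
    """Return the number of full batches in one pass over the training set."""
    n = loader.trainSamples
    if not isinstance(n, int):
        # trainSamples is a collection, so count its elements first.
        n = len(n)
    return n // batch_size


loader = Loader(['sample'] * 120)
print(batches_per_epoch(loader, Model.batchSize))
```

Note that this value is the number of batches in one epoch, not a total number of epochs, so printing it after 'Epoch:' can be misleading even once the error is gone.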
Also, Autocorrect in spellchecker.py is reported as deprecated, so I changed it to pyspellchecker v.4.0.
With these changes I am able to run the model, but when training from scratch it shows a very high validation CER of around 43.
Please let me know whether the spellchecker change or the other changes could lead to this. Also let me know if some other approach should be taken to train this model on the IAM line-based dataset.
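To compare numbers independently of the training script, the CER can be recomputed from (recognized, ground truth) pairs. The sketch below uses the common definition, total edit distance divided by total ground-truth length, expressed in percent. I am assuming this matches how the repository reports its validation CER, which may not be the case. The function names are my own.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(pairs):
    """Character error rate in percent over (recognized, ground_truth) pairs."""
    errors = sum(edit_distance(rec, gt) for rec, gt in pairs)
    total_chars = sum(len(gt) for _, gt in pairs)
    return 100.0 * errors / total_chars


print(cer([('helo', 'hello')]))
```

Running this on the raw decoder output and again on the spell-corrected output would show whether the spellchecker change moves the CER at all.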