data prepare error #4

Closed
justinner opened this issue Dec 29, 2018 · 1 comment

Comments

@justinner

Hi ajgallego,
I tested the DIBCO datasets successfully, so I wanted to train on my own dataset. I created a folder following your code in "train.py" and ran it, but it reported that there was no aug_ folder, so I created the corresponding ones. Then I trained, and the error is
"'NoneType' object has no attribute 'shape'".
The number of training samples is 105 and the number of test samples is 256, but I do not have that much test data. Maybe I should run augmentation first, but I do not know how. Could you help me? Thanks very much!

@ajgallego
Owner

Hi @justinner,

That error indicates that the training images are not being loaded correctly. The folder structure may be incorrect; check the "load_dataset_folds" function in the "train.py" file.
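A small guard can surface this kind of failure before training starts. The sketch below is not code from "train.py"; it assumes the images are read by a function that returns None on failure instead of raising (OpenCV's cv2.imread behaves this way), and `load_checked` is a hypothetical helper name.

```python
from pathlib import Path


def load_checked(path, reader):
    """Read an image with `reader` and fail loudly if nothing was loaded.

    `reader` is any callable that returns None on failure, e.g. cv2.imread
    (an assumption; check what train.py actually uses).
    """
    path = Path(path)
    if not path.is_file():
        # The most common cause: a wrong folder name or a missing file.
        raise FileNotFoundError(f"Image not found: {path}")
    img = reader(str(path))
    if img is None:
        # The file is there, but the reader could not decode it.
        raise ValueError(f"File exists but could not be decoded: {path}")
    return img
```

With a guard like this, a bad path is reported together with the file name instead of as a later attribute error on None.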

For example, to use Dibco 2016 as the test set and Dibco 2009-2014 as the training sets, run:

python -u train.py -path datasets -db dibco -dbp 6 --aug -w 256 -s 128 -f 64 -k 5 -e 200 -b 10 -th -1 -stride 2 -page 64

Inside your working folder, create a subfolder called "datasets", and inside it another one called "Dibco", with the following structure:

  • Dibco/2009/handwritten_GR
  • Dibco/2009/printed_GR
  • Dibco/2010/handwritten_GR
  • Dibco/2011/handwritten_GR
  • Dibco/2011/printed_GR
  • Dibco/2012/handwritten_GR
  • Dibco/2013/handwritten_GR
  • Dibco/2013/printed_GR
  • Dibco/2014/handwritten_GR
  • Dibco/2016/handwritten_GR
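To verify this layout before training, something like the following could be used. It is only a sketch: `EXPECTED` simply repeats the folders listed above, and `missing_folders` is a hypothetical helper, not part of "train.py".

```python
from pathlib import Path

# Sub-folders listed above, relative to datasets/Dibco.
EXPECTED = [
    "2009/handwritten_GR", "2009/printed_GR",
    "2010/handwritten_GR",
    "2011/handwritten_GR", "2011/printed_GR",
    "2012/handwritten_GR",
    "2013/handwritten_GR", "2013/printed_GR",
    "2014/handwritten_GR",
    "2016/handwritten_GR",
]


def missing_folders(root, expected=EXPECTED):
    """Return the expected sub-folders that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the folder that contains "datasets".
    for rel in missing_folders("datasets/Dibco"):
        print("missing:", rel)
```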

If you do not want to use augmented images, remove the "--aug" option from the command. If you do want to use them, you have to create the same folders with the prefix "aug_". For example, for Dibco 2009 it would be:

  • Dibco/2009/aug_handwritten_GR
  • Dibco/2009/aug_printed_GR
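The naming rule can be written down as a one-line transformation. This is an illustrative sketch; `aug_folder` is a hypothetical name and not a function from "train.py".

```python
from pathlib import PurePosixPath


def aug_folder(folder):
    """Return the augmented counterpart of a folder: same parent, 'aug_' prefix."""
    p = PurePosixPath(folder)
    return str(p.with_name("aug_" + p.name))


print(aug_folder("Dibco/2009/handwritten_GR"))  # Dibco/2009/aug_handwritten_GR
```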

Remember also that the images have to be in PNG format.
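A quick way to spot files in another format is to list everything in a folder that does not end in ".png". The helper below is a hypothetical sketch; it treats the extension case-insensitively, which is an assumption about how strict the loader is.

```python
from pathlib import Path


def non_png_files(folder):
    """Return the names of files in `folder` whose extension is not .png."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() != ".png"
    )
```

For example, `non_png_files("datasets/Dibco/2009/handwritten_GR")` should return an empty list once every image has been converted.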

Another important point is that each input image and its corresponding ground truth must have the same filename. For example, for the following training images of Dibco 2009:

  • datasets/Dibco/2009/handwritten_GR/H01.png
  • datasets/Dibco/2009/handwritten_GR/H02.png
  • datasets/Dibco/2009/handwritten_GR/H03.png
  • datasets/Dibco/2009/handwritten_GR/H04.png
  • datasets/Dibco/2009/handwritten_GR/H05.png

We would have the following corresponding ground-truth images:

  • datasets/Dibco/2009/handwritten_GT/H01.png
  • datasets/Dibco/2009/handwritten_GT/H02.png
  • datasets/Dibco/2009/handwritten_GT/H03.png
  • datasets/Dibco/2009/handwritten_GT/H04.png
  • datasets/Dibco/2009/handwritten_GT/H05.png
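The filename correspondence between a "_GR" folder and its "_GT" folder can be checked with a short script. This is a sketch under the naming shown above; `unmatched_names` is a hypothetical helper and does not come from "train.py".

```python
from pathlib import Path


def unmatched_names(gr_dir, gt_dir):
    """Compare PNG filenames in an input folder and its ground-truth folder.

    Returns (only_in_gr, only_in_gt) as sorted lists of filenames.
    """
    gr = {p.name for p in Path(gr_dir).glob("*.png")}
    gt = {p.name for p in Path(gt_dir).glob("*.png")}
    return sorted(gr - gt), sorted(gt - gr)
```

Calling `unmatched_names("datasets/Dibco/2009/handwritten_GR", "datasets/Dibco/2009/handwritten_GT")` should return two empty lists when every input image has a ground-truth image of the same name.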
