
Training (or Fine-Tuning) the Model #64

Open
martholomew opened this issue Oct 14, 2023 · 1 comment
Comments

@martholomew

martholomew commented Oct 14, 2023

I would like to fine-tune the model towards the data that I will be feeding it. My pipeline would be to binarize the images using sbb_binarize, then manually edit them to be high-quality ground-truth, then feed a large amount of these images back into the model.

  1. Would the end result be better binarization on my dataset?
  2. How would this be accomplished?

A link to point me in the right direction would be a great help.

@vahidrezanezhad
Member

Dear @martholomew,

Of course. Pseudo-labeling can be effective, and we have used this technique ourselves to improve our models. You can use https://github.com/qurator-spk/sbb_pixelwise_segmentation for training. First, binarize your dataset with our models, then select the documents with satisfactory results to build your custom training dataset. Sometimes a prediction is only locally good; in such cases you can crop those regions to prepare your ground truth (GT).
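The cropping step described above can be sketched as follows. This is a minimal illustration, not part of sbb_pixelwise_segmentation itself: the toy image, label values, and the chosen crop window are all placeholders. The key point is that the image and its pseudo-label must be cropped with the same window so the pair stays pixel-aligned for training.

```python
# Hedged sketch: extracting an aligned image/pseudo-label patch for GT preparation.
# The data, window coordinates, and patch size here are hypothetical examples.

def crop_patch(grid, top, left, height, width):
    """Extract a rectangular patch from a 2D pixel grid (list of rows)."""
    return [row[left:left + width] for row in grid[top:top + height]]

# Toy 4x4 grayscale "image" and the binary pseudo-label the model produced for it.
image = [[10,  20,  30,  40],
         [50,  60,  70,  80],
         [90, 100, 110, 120],
         [130, 140, 150, 160]]
label = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]

# Suppose only the top-left 2x2 region was predicted well. Crop the SAME
# window from both grids so image and ground truth remain aligned.
img_patch = crop_patch(image, 0, 0, 2, 2)  # [[10, 20], [50, 60]]
gt_patch = crop_patch(label, 0, 0, 2, 2)   # [[0, 0], [0, 0]]
```

In practice you would do this with an image library on the real page scans and save each patch pair in whatever directory layout your training setup expects.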
