Classification model for historical handwritten documents.
- Find a model pre-trained on a dataset closer to our domain
- Include embeddings of words extracted from first iteration of ocr tool (kraken)
Correct # In[0]
comments with:
sed -i.bak -e "s/\#\s*In\[[0-9 ]*\]/#In\[\]/" Prototype.py