🔤 removing unnecessary info from docs
rbturnbull committed Feb 22, 2024
1 parent 26db792 commit 6170ec5
Showing 2 changed files with 12 additions and 5 deletions.
3 changes: 3 additions & 0 deletions README.rst
@@ -98,6 +98,9 @@ Credits
Robert Turnbull, Emily Fitzgerald, Karen Thompson and Jo Birch from the University of Melbourne.

This research was supported by The University of Melbourne’s Research Computing Services and the Petascale Campus Initiative.
The authors thank collaborators Niels Klazenga, Heroen Verbruggen, Nunzio Knerr, Noel Faux, Simon Mutch, Babak Shaban, Andrew Drinnan, Michael Bayly and Hannah Turnbull.

This pipeline depends on `YOLOv8 <https://github.com/ultralytics/ultralytics>`_,
`torchapp <https://github.com/rbturnbull/torchapp>`_ and
Microsoft's `TrOCR <https://www.microsoft.com/en-us/research/publication/trocr-transformer-based-optical-character-recognition-with-pre-trained-models/>`_.
14 changes: 9 additions & 5 deletions docs/pipeline.rst
@@ -57,14 +57,18 @@ and detects bounding boxes for the following fields:
Label Classifier
================

We have trained a classifier using `torchapp <https://github.com/rbturnbull/torchapp>`_ to detect the following types of writing on the institutional label:
We have trained a classifier to detect the following types of writing on the institutional label:

#. typewriter
#. printed
#. handwritten
#. mixed
#. combination

.. These were annotated to the 3,152 images from the MELU dataset.
.. This was partitioned into 2,521 training images and 631 validation images.
.. The pretrained `ResNet-101 model <https://doi.org/10.1109/CVPR.2016.90>`_ was trained using `torchapp <https://github.com/rbturnbull/torchapp>`_ for 20 epochs on this dataset.
.. It achieved an accuracy of 98.3% on the validation set.
These were annotated to the XXX images in the MELU dataset. An image classifier based on a pretrained ResNet-18 was used \citep{resnet}. This achieved an accuracy of XXX\% on the validation set.
Text Recognition
================
@@ -75,8 +79,8 @@ If the text was determined to be printed or written using a typewriter,
then the Text Recognition module uses the `Tesseract <https://github.com/tesseract-ocr/tesseract>`_ Optical Character Recognition (OCR) engine.
If the text was determined to be handwritten or a mixture, then the `TrOCR <https://www.microsoft.com/en-us/research/publication/trocr-transformer-based-optical-character-recognition-with-pre-trained-models/>`_ Handwritten Text Recognition (HTR) model is used.
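
The routing rule described above could be sketched roughly as follows. This is only an illustration: the function name ``choose_engine`` and the string constants are hypothetical and are not taken from the Hespi code base. It also assumes that the "combination" category is handled the same way as "mixed".

```python
# Hypothetical sketch of the engine-selection rule; these names are
# illustrative assumptions, not Hespi's actual API.
TESSERACT = "tesseract"
TROCR = "trocr"

# Categories produced by the label classifier, grouped by engine.
# Treating "combination" like "mixed" is an assumption.
OCR_CATEGORIES = {"printed", "typewriter"}
HTR_CATEGORIES = {"handwritten", "mixed", "combination"}


def choose_engine(writing_type: str) -> str:
    """Return the name of the recognition engine for a writing category."""
    kind = writing_type.strip().lower()
    if kind in OCR_CATEGORIES:
        # Printed or typewritten text goes to the OCR engine.
        return TESSERACT
    if kind in HTR_CATEGORIES:
        # Handwritten or mixed text goes to the HTR model.
        return TROCR
    raise ValueError(f"Unknown writing type: {writing_type!r}")
```

Raising an error for an unrecognised category, rather than silently falling back to one engine, is a design choice of this sketch.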

Postprocessing and Outputs
==========================
Post-processing and Outputs
===========================

After the text for each field is recognized, Hespi performs some postprocessing steps.
These involve ensuring that the family and genus are capitalized and the species is not.
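
A minimal sketch of the capitalization step is shown below. The function name ``normalise_taxon_case`` and the dictionary keys are assumptions made for illustration and may not match how Hespi stores its fields. The sketch also assumes that "capitalized" means an upper-case first letter followed by lower-case letters.

```python
# Hypothetical sketch of the capitalization rule; the function name and
# field keys are assumptions, not Hespi's actual implementation.
def normalise_taxon_case(fields: dict) -> dict:
    """Return a copy of ``fields`` with taxon names in conventional case."""
    result = dict(fields)  # leave the caller's dictionary untouched
    for key in ("family", "genus"):
        if result.get(key):
            # Upper-case first letter, lower-case remainder (an assumption).
            result[key] = result[key].capitalize()
    if result.get("species"):
        # The species epithet is written entirely in lower case.
        result["species"] = result["species"].lower()
    return result
```

For example, under these assumptions a recognized genus of ``EUCALYPTUS`` becomes ``Eucalyptus`` and a species of ``Globulus`` becomes ``globulus``.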