A deep learning toolkit specialized for handwritten document analysis

PyLaia

PyLaia is a device-agnostic, PyTorch-based deep learning toolkit for handwritten document analysis.

It is also a successor to Laia.

Python: 3.9 | 3.10. PyTorch: 1.13.0 | 1.13.1. pre-commit: enabled. Code style: black, Ruff.

Get started by having a look at our Documentation!

Installation

To install PyLaia from PyPI:

pip install pylaia

Please note that the CUDA version of nnutils (nnutils-pytorch-cuda) is installed by default. If you do not have a GPU, you should install the CPU version (nnutils-pytorch).
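
As a rough illustration of the note above, the following sketch prints the pip command for the nnutils variant that matches the current machine. The two package names come from the note itself; the helper functions `nnutils_package` and `gpu_available` are hypothetical and are not part of PyLaia. Whether the CUDA package must be uninstalled before installing the CPU one is not stated here, so treat the printed command as a starting point only.

```python
import importlib.util

# Package names as given in the installation note above.
CUDA_PACKAGE = "nnutils-pytorch-cuda"
CPU_PACKAGE = "nnutils-pytorch"


def nnutils_package(has_gpu: bool) -> str:
    """Return the nnutils package name matching the available hardware."""
    return CUDA_PACKAGE if has_gpu else CPU_PACKAGE


def gpu_available() -> bool:
    """Report whether PyTorch sees a CUDA device (False if torch is absent)."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the script also runs without PyTorch

    return torch.cuda.is_available()


if __name__ == "__main__":
    # Hypothetical helper output: the command to run by hand.
    print(f"pip install {nnutils_package(gpu_available())}")
```

The hardware check is kept separate from the name lookup so the lookup can be reused in environments where PyTorch is not yet installed.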

The following Python scripts will be installed in your system:

Contributing

If you want to contribute a new feature or have found a problem with PyLaia, please head to CONTRIBUTING.md to learn more, then follow these steps.

  1. Fork it ( https://gitlab.teklia.com/atr/pylaia/-/forks/new )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Merge Request ( https://gitlab.teklia.com/atr/pylaia/-/merge_requests/new )

Code of conduct

We are committed to providing a friendly, safe and welcoming environment for all. Please read and respect the PyLaia Code of Conduct.

Acknowledgments

Work in this toolkit was financially supported by the Pattern Recognition and Human Language Technology (PRHLT) Research Center.

Citation

  • Article describing the latest contributions to PyLaia
@inproceedings{pylaia2024,
    author = "Tarride, Solène and Schneider, Yoann and Generali, Marie and Boillet, Melodie and Abadie, Bastien and Kermorvant, Christopher",
    title = "Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library",
    booktitle = "Submitted at ICDAR",
    year = "2024"
}
  • Original article
@inproceedings{laia2017,
  author={Puigcerver, Joan},
  booktitle={2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
  title={Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?},
  year={2017},
  volume={01},
  number={},
  pages={67-72},
  doi={10.1109/ICDAR.2017.20}}
  • GitLab repository
@misc{puigcerver2018pylaia,
  author = {Joan Puigcerver and Carlos Mocholí},
  title = {PyLaia},
  year = {2018},
  publisher = {GitLab},
  journal = {GitLab repository},
  howpublished = {\url{https://gitlab.teklia.com/atr/pylaia/}},
  commit = {commit SHA}
}
