pcschreiber1/PDF_Extraction-Translation

Translate long PDF-Reports in Python.

Translate many large PDF reports for free using Python. You can find the corresponding Towards Data Science article here, or follow the Jupyter Notebook Article_PDF-Translation. The Central Bank Report is stored in src/examples.

This repo contains a pipeline developed for work, where a large number of official reports from different OECD countries had to be translated. To keep translation free of charge, it uses the Google Translate API. The main Python packages are pdfplumber, deep_translator, and pyfpdf2.
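
As a rough illustration of how such a pipeline can fit together, here is a minimal sketch. It is not the code from this repo: the function names (`split_into_chunks`, `extract_text`, `translate_text`) and the 4,500-character chunk size are assumptions made for the example. The chunking step is there because, to my understanding, the free Google Translate endpoint used by deep_translator rejects inputs beyond roughly 5,000 characters per request, so long pages need to be split first.

```python
from typing import Callable, Optional


def split_into_chunks(text: str, max_chars: int = 4500) -> list[str]:
    """Split text on whitespace into chunks of at most max_chars characters.

    max_chars=4500 is an assumed safety margin, not a documented limit.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single token longer than the limit is cut into fixed-size pieces.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def extract_text(pdf_path: str) -> list[str]:
    """Return the text of each page; pages without extractable text yield ''."""
    import pdfplumber  # imported lazily so the chunking helper has no dependencies

    with pdfplumber.open(pdf_path) as pdf:
        return [page.extract_text() or "" for page in pdf.pages]


def translate_text(
    text: str,
    source: str = "auto",
    target: str = "en",
    translate: Optional[Callable[[str], str]] = None,
) -> str:
    """Translate text chunk by chunk and join the results with spaces.

    `translate` is a hypothetical hook for swapping in another backend;
    by default deep_translator's GoogleTranslator is used.
    """
    if translate is None:
        from deep_translator import GoogleTranslator

        translate = GoogleTranslator(source=source, target=target).translate
    return " ".join(translate(chunk) for chunk in split_into_chunks(text))
```

A possible use would be `pages = [translate_text(p) for p in extract_text("report.pdf")]`, where `report.pdf` is a placeholder path. Writing the translated pages back to a PDF is left out here; note that splitting on whitespace discards the original line breaks, which may or may not matter for the output layout.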
