This project uses the Tesseract OCR engine, which can be found here.

Steps:

1. Install the OS-specific Tesseract package.
2. Run `pip install -r requirements.txt` to install the Python dependencies listed in the requirements.txt file.

Currently, all files in the input folder have been processed into the output document type.
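As a rough illustration of the setup and processing described above, here is a minimal sketch that checks whether Tesseract is installed and then runs it over a folder of files. It is not the project's actual code: the function names (`tesseract_available`, `plan_outputs`, `run_ocr`), the `input` and `output` folder names, and the choice to call the `tesseract` command-line tool directly are all assumptions made for the example.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_available() -> bool:
    """Return True if the `tesseract` executable is on PATH."""
    return shutil.which("tesseract") is not None


def plan_outputs(input_dir, output_dir):
    """Map each file in input_dir to an output base path (no extension).

    Hypothetical helper: subdirectories are skipped, files are sorted by name.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    return {
        src: output_dir / src.stem
        for src in sorted(input_dir.iterdir())
        if src.is_file()
    }


def run_ocr(input_dir="input", output_dir="output"):
    """Run the tesseract CLI on every file in input_dir (assumed folder names)."""
    if not tesseract_available():
        raise RuntimeError(
            "tesseract not found; install the OS-specific package first"
        )
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for src, out_base in plan_outputs(input_dir, output_dir).items():
        # `tesseract <image> <outputbase>` writes <outputbase>.txt by default.
        subprocess.run(["tesseract", str(src), str(out_base)], check=True)
```

Calling `run_ocr()` from the project root would, under these assumptions, write one text file per input file into `output`. A project that needs a different output type would pass the corresponding Tesseract output option or use a Python wrapper instead.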