OCR-Tesseract

Convert Image To HTML Text Format

# Prerequisites

1. Install a recent version of Python.

2. Install OCR-Tesseract as per your system platform.
https://github.com/UB-Mannheim/tesseract/wiki

# Setup
For Windows: **After installing Tesseract, add the path of its installation folder to the PATH environment variable.**

**You have to make a few changes in the script**
Paste the full path to tesseract.exe on this line:
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\iam_kazi\AppData\Local\Tesseract-OCR\tesseract.exe'
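
If you would rather not hard-code the path, the lookup can be sketched as below. This is only an illustration: `find_tesseract` is a hypothetical helper that is not part of Image_To_HTML.py, and it assumes the executable is either at the path you pass in or reachable through PATH.

```python
import os
import shutil


def find_tesseract(explicit_path=None):
    """Return a path to the tesseract executable, or None if it is not found.

    Hypothetical helper (not in the original script): it prefers an explicit
    path when that file exists and otherwise searches PATH.
    """
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # shutil.which returns None when the command is not on PATH
    return shutil.which("tesseract")


# Possible use in the script (requires pytesseract to be installed):
#   import pytesseract
#   cmd = find_tesseract(r'C:\Users\iam_kazi\AppData\Local\Tesseract-OCR\tesseract.exe')
#   if cmd is not None:
#       pytesseract.pytesseract.tesseract_cmd = cmd
```

If the function returns `None`, Tesseract is most likely not installed or its folder has not been added to PATH yet.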

Provide the path of your image folder as the source (**the script iterates over all images in it at once, so do not provide an image file name**)
listfiles = os.listdir("C:\\Data")

Paste the same path again on the line that opens each image
img=Image.open("C:\\Data\\"+x) **make sure to use double backslashes**
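
As an alternative to pasting the folder path twice, the listing step could be sketched as follows. `list_images` and `IMAGE_EXTENSIONS` are hypothetical names introduced here for illustration, and the extension list is an assumption; the original script simply passes every entry returned by `os.listdir` to `Image.open`.

```python
import os

# Assumed set of extensions to treat as images; adjust to your data
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def list_images(folder):
    """Return full paths of the image files in `folder`, sorted by file name."""
    paths = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)  # no manual backslashes needed
        _, ext = os.path.splitext(name)
        if os.path.isfile(full_path) and ext.lower() in IMAGE_EXTENSIONS:
            paths.append(full_path)
    return paths


# Possible use in the script (requires Pillow to be installed):
#   from PIL import Image
#   for path in list_images(r"C:\Data"):
#       img = Image.open(path)
```

Because `os.path.join` builds each path, the folder only has to be written once, and non-image files in the folder are skipped.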

In my script, a br tag is added after every line of recognized text.
You can change this as needed.
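
The br step can be illustrated with a small function like the one below. This is a sketch, not the code from Image_To_HTML.py: `text_to_html` is a hypothetical name, and escaping the text with `html.escape` is an added assumption so that characters such as `<` and `&` in the OCR output do not break the HTML.

```python
import html


def text_to_html(text):
    """Append a <br> tag to every line of `text` and escape HTML characters."""
    lines = text.splitlines()
    return "".join(html.escape(line) + "<br>\n" for line in lines)


# Possible use in the script (requires pytesseract and an opened image):
#   text = pytesseract.image_to_string(img)
#   html_text = text_to_html(text)
```

To change the formatting, for example to wrap each line in a paragraph tag instead, only the expression inside the `join` call needs to be edited.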

Thank You! Enjoy!
