Install OCRmyPDF on Windows

Step1. Install Chocolatey

Open Windows PowerShell as Administrator and run the following command:

Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

Step2. Install Required Packages Using Chocolatey

choco install --pre tesseract
choco install ghostscript
choco install pngquant
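
To confirm that the packages ended up on your PATH, a small check like the one below may help. It is only a sketch: the executable names are assumptions (in particular, `gswin64c` is the name Ghostscript commonly uses on 64-bit Windows), and a missing entry may just mean the PowerShell window needs to be reopened so PATH is refreshed.

```python
import shutil

# Assumed executable names; adjust if your installation differs.
# Ghostscript on 64-bit Windows is commonly exposed as gswin64c.
REQUIRED_TOOLS = ["tesseract", "gswin64c", "pngquant"]


def find_tools(names, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def missing_tools(names, which=shutil.which):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_tools(names, which).items() if path is None]


# Print a short report for the tools installed in this step.
for tool, path in find_tools(REQUIRED_TOOLS).items():
    print(f"{tool}: {path or 'NOT FOUND'}")
```

If a tool is reported as not found, check its install folder and add it to PATH manually.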

Step3. Install OCRmyPDF Using pip

python -m pip install ocrmypdf
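
A quick way to check that pip installed the package into the interpreter you are actually using is to look the module up from Python. This is a minimal sketch; the helper name `is_installed` is made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can locate the given module."""
    return importlib.util.find_spec(module_name) is not None


# Reports False if pip installed into a different Python than the one running this.
print("ocrmypdf installed:", is_installed("ocrmypdf"))
```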

Step4. Download Required Language Data

Download

  • chi_tra.traineddata
  • chi_sim.traineddata
  • jpn.traineddata
  • deu.traineddata
  • spa.traineddata

from https://github.com/tesseract-ocr/tessdata/ and place them in C:\Program Files\Tesseract-OCR\tessdata (or wherever Tesseract OCR is installed).
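
If you would rather not download the files by hand, the step above could be scripted roughly as follows. This is a sketch under assumptions: it assumes the files are served from the repository's `main` branch at the raw URL shown, and that Tesseract lives in the default folder. The function names are hypothetical.

```python
from pathlib import Path
from urllib.request import urlretrieve

LANGUAGES = ["chi_tra", "chi_sim", "jpn", "deu", "spa"]
# Assumption: raw files are available from the repository's main branch.
BASE_URL = "https://github.com/tesseract-ocr/tessdata/raw/main"
# Default location from this guide; change it if Tesseract is installed elsewhere.
TESSDATA_DIR = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def traineddata_url(lang):
    """Build the download URL for one language file."""
    return f"{BASE_URL}/{lang}.traineddata"


def download_languages(langs=LANGUAGES, dest=TESSDATA_DIR):
    """Download each missing .traineddata file into dest and return all target paths."""
    dest = Path(dest)
    saved = []
    for lang in langs:
        target = dest / f"{lang}.traineddata"
        if not target.exists():  # skip files that are already present
            urlretrieve(traineddata_url(lang), target)
        saved.append(target)
    return saved


# Example call (needs network access and write access to the folder):
# download_languages()
```

Writing into C:\Program Files normally requires an elevated prompt, so run the script from the Administrator PowerShell window opened earlier.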


That's it. You have successfully set up OCRmyPDF on Windows!
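
As a smoke test of the finished setup, you could run OCR on a sample file with some of the languages installed above. The sketch below only builds the command. The file names `scan.pdf` and `scan_ocr.pdf` are placeholders, and the helper `build_ocr_command` is hypothetical. OCRmyPDF's `-l` option takes language codes joined with `+`.

```python
import sys


def build_ocr_command(input_pdf, output_pdf, languages):
    """Build an OCRmyPDF command line for the current Python interpreter."""
    return [
        sys.executable, "-m", "ocrmypdf",
        "-l", "+".join(languages),  # e.g. chi_tra+jpn
        str(input_pdf), str(output_pdf),
    ]


# Placeholder file names; replace them with a real scanned PDF.
cmd = build_ocr_command("scan.pdf", "scan_ocr.pdf", ["chi_tra", "jpn"])
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```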