Skip to content

OCR, classify, create searchable PDF and upload to Google Drive

Notifications You must be signed in to change notification settings

wouterdebie/scanit_cloud

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

scanit_cloud

Google Cloud function that triggers on a GCS bucket google.storage.object.finalize of specific JPG files, then OCRs, classifies, creates a PDF and uploads the PDF to Google Drive.

Flow:

  • OCR image and convert Vision API output to HOCR
  • Detect owner
  • Create Searchable PDF
  • Upload to Google Drive

Requirements:

  • Python3
  • Google Cloud Functions

Deploy

gcloud functions deploy scanit --runtime python37 \
--trigger-resource [INCOMING_BUCKET]  \
--trigger-event google.storage.object.finalize \
--timeout 540

Acknowledgements

To Konstantin Baierer for gcv2hocr.py

About

OCR, classify, create searchable PDF and upload to Google Drive

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages