ocr-docker is a small, Flask-powered web app that extracts text from images and PDF documents using OCR.

t0mer/ocr-docker

OCR-Docker

Extract text from images & PDF files

OCR-Docker is an easy-to-use system, built with Python and Flask, that extracts text from images and PDF files in multiple languages.

Features

  • Extract text from images (PNG, JPG, TIFF).
  • Extract text from PDF files (single or multiple pages).

Components and Frameworks used in OCR-Docker

The OCR (Optical Character Recognition) feature is free thanks to tesseract-ocr, an open-source OCR project.
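
To give a feel for what tesseract-ocr does underneath, here is a minimal sketch of calling the tesseract command-line tool from Python. It is not taken from the OCR-Docker source, and the app may well invoke tesseract differently. The function names build_tesseract_command and extract_text are hypothetical, and the sketch assumes the tesseract executable and the requested language packs are installed.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, languages=("eng",)):
    """Build a tesseract CLI call that prints recognized text to stdout."""
    # Several languages are combined with "+", for example "eng+deu".
    lang_arg = "+".join(languages)
    # "stdout" as the output base tells tesseract to print instead of writing a file.
    return ["tesseract", image_path, "stdout", "-l", lang_arg]


def extract_text(image_path, languages=("eng",)):
    """Run tesseract on one image and return the recognized text."""
    # Assumption: tesseract is installed locally and available on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the command construction separate from the subprocess call makes the language handling easy to check without having tesseract installed.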

Installation

docker-compose from Docker Hub

version: "3.7"
services:
  ocr:
    image: techblog/ocr-docker:latest
    ports:
      - "8080:8080"
    container_name: ocr-docker
    labels:
      - "com.ouroboros.enable=true"
    networks:
      - default
    restart: unless-stopped

Now run docker-compose up -d to pull and start the container. Then open your browser and navigate to your container's IP address on port 8080. You should see the following screen.

[Screenshot: OCR]
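
If you would rather check from a script than from a browser, the following is a rough sketch that polls the published port until the web app answers. The function name wait_until_up is hypothetical, and the sketch assumes the container is reachable at the address you pass in (for example http://localhost:8080/ with the port mapping above) and that the root path returns some HTTP response.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url, attempts=10, delay=1.0):
    """Return True once `url` answers with any HTTP response, else False."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server replied, even if with an error status, so it is running.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the container may still be starting.
            time.sleep(delay)
    return False
```

Calling wait_until_up("http://localhost:8080/") after docker-compose up -d should return True once the container has finished starting, under the assumptions above.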