kill-2/ocrb

Ocrb

OCR for Ruby: extract text from an image by sending it to a vision model together with a prompt.

Installation

Install the gem and add it to the application's Gemfile by executing:

bundle add ocrb

If bundler is not being used to manage dependencies, install the gem by executing:

gem install ocrb

Usage

Require the gem and call Ocrb.run with an image path and prompt:

require "ocrb"

text = Ocrb.run("receipt.jpg", "Extract the text from this image.")
puts text

By default, Ocrb.run uses the ollama CLI with the glm-ocr:bf16 model, so no extractor argument is needed:

Ocrb.run("receipt.jpg", "Summarize the line items.")

If you want to use an OpenAI-compatible API instead, pass the built-in extractor explicitly:

require "ocrb"

text = Ocrb.run(
  "receipt.jpg",
  "Recognize total amount.",
  extractor: Ocrb::Extractors::OpenAi.new(
    url: "http://127.0.0.1:1234/v1",
    model: "zai-org/glm-4.6v-flash",
    api_key: ENV.fetch("OPENAI_API_KEY", "asdf"),
    # json may be nil, true, or a JSON schema (the value of response_format.json_schema.schema)
    json: {type: "object", properties: {amount: {type: "string"}}}
  )
)
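When a JSON schema is passed as above, the result presumably comes back as a JSON string that still needs parsing. The sketch below shows one way to do that with Ruby's standard library. The `raw` value is a hand-written stand-in for what Ocrb.run might return; the actual return format is an assumption and is not confirmed by this README.

```ruby
require "json"

# Stand-in for the string Ocrb.run might return when a JSON schema is set.
# ASSUMPTION: the gem returns the model's JSON output as a String.
raw = '{"amount": "12.50"}'

# Parse the JSON and read the field declared in the schema.
# fetch raises KeyError if the model omitted the key.
parsed = JSON.parse(raw)
amount = parsed.fetch("amount")

puts amount
```

Using fetch rather than [] makes a missing field fail loudly instead of silently producing nil.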

You can also resize the image before OCR by passing a resizer:

require "ocrb"

text = Ocrb.run(
  "receipt.jpg",
  "Extract all visible text.",
  resizer: Ocrb::Resizers::Sips.new(resample_width: 1024)
)

Both extractor and resizer are duck-typed. An extractor is any object that responds to extract(image_path, prompt), and a resizer is any object that responds to resize(image_path).
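As a minimal sketch of this duck typing, the following defines a hypothetical extractor and resizer. The class names CannedExtractor and PassthroughResizer are illustrative and not part of the gem. The return conventions are also assumptions: that extract returns the recognized text as a String, and that resize returns the path of the image to run OCR on.

```ruby
# Hypothetical extractor: returns a fixed string instead of calling a model,
# which can be handy as a test double.
# ASSUMPTION: extract should return the recognized text as a String.
class CannedExtractor
  def initialize(text)
    @text = text
  end

  def extract(image_path, prompt)
    @text
  end
end

# Hypothetical resizer: leaves the image untouched.
# ASSUMPTION: resize should return the path of the image to run OCR on.
class PassthroughResizer
  def resize(image_path)
    image_path
  end
end

extractor = CannedExtractor.new("TOTAL 12.50")
resizer = PassthroughResizer.new

# With the gem loaded, these could be passed in like the built-in ones:
#   Ocrb.run("receipt.jpg", "Recognize total amount.",
#            extractor: extractor, resizer: resizer)
```

Because only the method names matter, neither class needs to inherit from anything in the gem.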

License

The gem is available as open source under the terms of the MIT License.
