Convert image to PDF (#5)
dannnylo committed Oct 20, 2020
1 parent 78f0723 commit c04df01
Showing 2 changed files with 35 additions and 0 deletions.
24 changes: 24 additions & 0 deletions lib/tesseract_ocr/pdf.ex
@@ -0,0 +1,24 @@
defmodule TesseractOcr.PDF do
  @moduledoc """
  Converts images to PDF files using Tesseract OCR.
  """

  import TesseractOcr.Utils

  @doc """
  Runs OCR on the image at `path`, writes the result to `output <> ".pdf"`,
  and returns the path of the generated PDF file.

  ## Examples

      iex> TesseractOcr.PDF.read("test/resources/world.png", "/tmp/test")
      "/tmp/test.pdf"

  """
  def read(path, output, options \\ %{}) when is_binary(path) do
    # Force PDF output; this replaces any :c value supplied by the caller.
    options = Map.merge(options, %{c: "tessedit_create_pdf=1"})

    command(path, output, options)

    # Tesseract appends the ".pdf" extension to the output base name.
    "#{output}.pdf"
  end
end
11 changes: 11 additions & 0 deletions test/tesseract_ocr/pdf_test.exs
@@ -0,0 +1,11 @@
defmodule TesseractOcr.PDFTest do
  use ExUnit.Case
  doctest TesseractOcr.PDF

  test "reads an image and saves it as a PDF" do
    options = %{lang: "eng", psm: 7, oem: 1}
    pdf_path = TesseractOcr.PDF.read("test/resources/world.png", "test/test", options)

    assert pdf_path === "test/test.pdf"

    # Remove the generated file so the test leaves no artifacts behind.
    File.rm(pdf_path)
  end
end
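
The diff does not show how `command/3` from `TesseractOcr.Utils` turns the options map into a command line, so the following is only a rough sketch of what a call such as `read/3` might end up passing to the `tesseract` binary. The module `PdfArgsSketch`, the function `build_args/3`, and the key-to-flag mapping are all hypothetical and are not part of this commit; only the flags themselves (`-l`, `--psm`, `--oem`, `-c`) and the `tesseract imagename outputbase [options]` argument order come from the Tesseract CLI.

```elixir
defmodule PdfArgsSketch do
  # Hypothetical mapping from option keys to Tesseract CLI flags.
  # The real TesseractOcr.Utils.command/3 may build its arguments differently.
  @flags %{lang: "-l", psm: "--psm", oem: "--oem", c: "-c"}

  # Returns the argument list for: tesseract <path> <output> [flags...]
  def build_args(path, output, options \\ %{}) when is_binary(path) do
    flag_args =
      options
      # Force PDF output, mirroring the Map.merge/2 call in read/3.
      |> Map.put(:c, "tessedit_create_pdf=1")
      # Sort by key so the argument order is deterministic.
      |> Enum.sort()
      |> Enum.flat_map(fn {key, value} ->
        [Map.fetch!(@flags, key), to_string(value)]
      end)

    [path, output | flag_args]
  end
end
```

Under these assumptions, the resulting list could be handed to `System.cmd("tesseract", args)`, after which Tesseract would write the file `output <> ".pdf"`. An unknown option key raises a `KeyError` in this sketch, which is a deliberate simplification.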
