Grim is a simple gem for extracting a page from a PDF and converting it to an image, as well as extracting the text from the page as a string. It gives you an easy-to-use API to Ghostscript, ImageMagick, and pdftotext for this specific use case.
You will need Ghostscript, ImageMagick, and xpdf installed. On the Mac (OS X) I highly recommend using Homebrew to install them; it's as simple as “brew install ghostscript”, “brew install imagemagick”, and “brew install xpdf”.
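If you want to confirm the prerequisites are in place before using the gem, a small check along these lines may help. This is only a sketch: `missing_tools` is a hypothetical helper, not part of Grim, and the binary names `gs`, `convert`, and `pdftotext` are assumptions about what a typical install puts on your PATH.

```ruby
# Hypothetical helper, not part of Grim: returns the names of the
# command-line tools that cannot be found in any directory on the PATH.
def missing_tools(names, path = ENV.fetch("PATH", ""))
  dirs = path.split(File::PATH_SEPARATOR)
  names.reject do |name|
    dirs.any? do |dir|
      candidate = File.join(dir, name)
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end

# Assumed binary names: gs (Ghostscript), convert (ImageMagick),
# pdftotext (xpdf). Adjust them if your install uses different names.
missing = missing_tools(%w[gs convert pdftotext])
if missing.empty?
  puts "All required tools were found."
else
  puts "Missing tools: #{missing.join(', ')}"
end
```

Anything reported as missing can then be installed with the matching “brew install” command.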
instance = Grim.new("/path/to/pdf")
page_count = instance.page_count # returns the number of pages in the pdf
png = instance.page(1).to_image("/path/to/save/image.png") # saves png to path and returns File instance
jpeg = instance.page(2).to_image("/path/to/save/image.jpeg") # saves jpeg to path and returns File instance
text = instance.page(3).text # returns text as a string
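The calls above can be combined to process every page of a document. The following is a minimal sketch rather than part of the gem: `export_pages` is a hypothetical helper, the output file names are made up for illustration, and it assumes page numbers start at 1, as the example above suggests.

```ruby
require "fileutils"

# Hypothetical helper, not part of Grim. `pdf` is any object that responds
# to page_count and page(n), where each page responds to to_image(path) and
# text. Page numbers are assumed to start at 1.
def export_pages(pdf, out_dir)
  FileUtils.mkdir_p(out_dir)
  (1..pdf.page_count).map do |number|
    page = pdf.page(number)
    image_path = File.join(out_dir, "page-#{number}.png")
    page.to_image(image_path) # writes the image for this page
    { number: number, image: image_path, text: page.text }
  end
end

# Usage, assuming the paths exist:
#   pdf = Grim.new("/path/to/pdf")
#   export_pages(pdf, "/path/to/save").each do |result|
#     puts "Page #{result[:number]}: #{result[:text].length} characters"
#   end
```

Returning one hash per page keeps the image path and the extracted text together, which is convenient if you want to index the text alongside a thumbnail.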