pdfRuler is a small set of utilities for working with pdf2json:
- pdfRuler.localCacheList() populates pdfRuler.files with filenames found in pdfRuler.pdfCache (filters for .pdf extension)
- pdfRuler.pdf2JSON(fileName) is a promise wrapper for pdf2json
- pdfRuler.findCoords(pdf_json, text, page) returns an array of x,y coordinates for page elements with matching text
- pdfRuler.extractLines(pdf_json, x_range, y_range, page) extracts rows and columns within the target area
- pdfRuler.pdfCache = "./"; sets the directory that pdfRuler.localCacheList() scans for .pdf files
- pdfRuler.tolerance = .2; controls the range within which pdfRuler.extractLines() considers page elements to be part of the same row/column
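
To make the coordinate lookup more concrete, here is a minimal sketch of how a function like pdfRuler.findCoords() might search pdf2json output. It is not pdfRuler's actual source. It assumes the pdf2json layout of Pages[] containing Texts[], where each text element has x, y and an R array of runs with URI-encoded text in T. The name findCoordsSketch, the zero-based page index, the {x, y} object return shape and the sample data are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for pdfRuler.findCoords(); not the library's source.
// Assumed pdf2json shape: Pages[] -> Texts[] -> { x, y, R: [{ T }] },
// where T holds URI-encoded text.
function findCoordsSketch(pdfJson, text, page) {
  // Older pdf2json releases nest the pages under formImage.
  const pages =
    pdfJson.Pages || (pdfJson.formImage && pdfJson.formImage.Pages) || [];
  // Assumption: page is a zero-based index into the Pages array.
  const texts = (pages[page] && pages[page].Texts) || [];
  return texts
    .filter((t) => t.R.some((run) => decodeURIComponent(run.T) === text))
    .map((t) => ({ x: t.x, y: t.y }));
}

// Hand-made sample in the same shape (not real pdf2json output).
const samplePdfJson = {
  Pages: [
    {
      Texts: [
        { x: 2.5, y: 4.0, R: [{ T: "Invoice%20No" }] },
        { x: 10.0, y: 4.0, R: [{ T: "Total" }] },
        { x: 10.0, y: 12.5, R: [{ T: "Total" }] },
      ],
    },
  ],
};

const totalCoords = findCoordsSketch(samplePdfJson, "Total", 0);
console.log(totalCoords);
```

With the sample data, the call returns one coordinate pair for each of the two "Total" elements on the first page. Coordinates like these could then be used to choose the x_range and y_range passed to pdfRuler.extractLines().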
Please see the attached example.
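
Separately from the attached example, the effect of pdfRuler.tolerance can be pictured with a rough sketch of tolerance-based row grouping. This is not pdfRuler's implementation. The groupIntoRows helper, the element shape {x, y, text} and the sample values are assumptions. Elements whose y values lie within the tolerance of a row's first element are treated as part of that row.

```javascript
// Hypothetical illustration of tolerance-based row grouping; not pdfRuler's source.
const tolerance = 0.2; // same value as pdfRuler.tolerance in the list above

function groupIntoRows(elements, tol) {
  // Sort top to bottom, then left to right for equal y.
  const sorted = [...elements].sort((a, b) => a.y - b.y || a.x - b.x);
  const rows = [];
  for (const el of sorted) {
    const current = rows[rows.length - 1];
    // Join the current row when y is within tol of that row's first element.
    if (current && Math.abs(el.y - current[0].y) <= tol) {
      current.push(el);
    } else {
      rows.push([el]);
    }
  }
  // Order each row left to right so columns line up.
  return rows.map((row) => row.sort((a, b) => a.x - b.x));
}

// Made-up page elements: two visual rows with slightly uneven y values.
const elements = [
  { x: 1.0, y: 5.0, text: "Item" },
  { x: 8.0, y: 5.1, text: "Qty" },
  { x: 1.0, y: 6.5, text: "Widget" },
  { x: 8.0, y: 6.4, text: "3" },
];

const rows = groupIntoRows(elements, tolerance);
console.log(rows.map((row) => row.map((el) => el.text)));
// → [ [ 'Item', 'Qty' ], [ 'Widget', '3' ] ]
```

A larger tolerance merges elements that sit further apart vertically into one row, while a smaller one splits slightly misaligned text into separate rows.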