TennisVisuals/pdfRuler

pdfRuler

pdfRuler is a small set of utilities for working with pdf2json

functions:

  • pdfRuler.localCacheList() populates pdfRuler.files with the names of the .pdf files found in the pdfRuler.pdfCache directory
  • pdfRuler.pdf2JSON(fileName) is a promise wrapper around pdf2json
  • pdfRuler.findCoords(pdf_json, text, page) returns an array of x,y coordinates for the elements on the given page whose text matches
  • pdfRuler.extractLines(pdf_json, x_range, y_range, page) extracts the rows and columns that fall within the target area defined by x_range and y_range
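
To make the idea behind findCoords concrete, here is a minimal sketch of how a text lookup over pdf2json output could work. It is not pdfRuler's actual implementation. The name findCoordsSketch, the exact-match comparison, the zero-based page index, and the hand-written sample data are all assumptions made for illustration. The sketch relies only on the pdf2json output layout, where each page has a Texts array whose entries carry x, y, and URI-encoded text runs in R[].T.

```javascript
// Hypothetical stand-in for pdfRuler.findCoords(); not the library's code.
// Assumes pdf2json-style output: Pages[] -> Texts[] -> { x, y, R: [{ T }] },
// where T holds URI-encoded text.
function findCoordsSketch(pdfJson, text, page = 0) {
  // Older pdf2json releases nest Pages under formImage; newer ones do not.
  const pages = (pdfJson.formImage || pdfJson).Pages || [];
  const texts = (pages[page] && pages[page].Texts) || [];
  return texts
    // Keep elements where any text run decodes to exactly the target text.
    .filter((el) => el.R.some((run) => decodeURIComponent(run.T) === text))
    .map((el) => ({ x: el.x, y: el.y }));
}

// Hand-written sample shaped like one page of pdf2json output (assumed data).
const samplePdfJson = {
  Pages: [
    {
      Texts: [
        { x: 2.5, y: 4.0, R: [{ T: 'Player%20Name' }] },
        { x: 10.25, y: 4.0, R: [{ T: 'Score' }] },
        { x: 2.5, y: 5.5, R: [{ T: 'Score' }] },
      ],
    },
  ],
};

const scoreCoords = findCoordsSketch(samplePdfJson, 'Score', 0);
console.log(scoreCoords);
```

Coordinates found this way for a header label could then serve as the basis for the x_range and y_range passed to extractLines.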

configuration:

  • pdfRuler.pdfCache = "./"; the directory that pdfRuler.localCacheList() scans for .pdf files
  • pdfRuler.tolerance = 0.2; the range within which pdfRuler.extractLines() considers page elements to be part of the same row/column
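
The effect of a tolerance setting can be illustrated with a small sketch that groups page elements into rows when their y coordinates lie within the tolerance of each other. This is one plausible approach, not necessarily how pdfRuler.extractLines() does it. The function groupIntoRows, the element shape, and the sample values are hypothetical.

```javascript
// Hypothetical illustration of tolerance-based row grouping;
// not pdfRuler's actual code.
const tolerance = 0.2; // same value as the pdfRuler.tolerance setting above

function groupIntoRows(elements, tol) {
  const rows = [];
  // Process elements top-to-bottom so each row is built in a single pass.
  const sorted = [...elements].sort((a, b) => a.y - b.y);
  for (const el of sorted) {
    const current = rows[rows.length - 1];
    // An element joins the current row if its y is within tol of the
    // y of the row's first element; otherwise it starts a new row.
    if (current && Math.abs(el.y - current.y) <= tol) {
      current.cells.push(el);
    } else {
      rows.push({ y: el.y, cells: [el] });
    }
  }
  // Within each row, order cells left-to-right and keep only their text.
  return rows.map((row) =>
    row.cells.sort((a, b) => a.x - b.x).map((cell) => cell.text)
  );
}

// Assumed sample data: two visual rows whose y values differ slightly.
const elements = [
  { x: 10, y: 4.1, text: 'Score' },
  { x: 2, y: 4.0, text: 'Name' },
  { x: 2, y: 5.5, text: 'Smith' },
  { x: 10, y: 5.6, text: '6-4' },
];

const rows = groupIntoRows(elements, tolerance);
console.log(rows);
```

A larger tolerance merges elements that sit further apart vertically into one row. A smaller one splits slightly misaligned elements into separate rows.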

Please see the attached example.
