LA-PDFText is a system for extracting accurate text from PDF-based research articles, with an interface for improving performance where needed. The system is open-source and provides a simple baseline function for extracting text from primary research articles using rules that developers can customize. This means that the system works qu…
This branch is 71 commits ahead of GullyAPCBurns:master.