PDF-TextString

This is a nodejs modules that will extract the text from a pdf. If there is no text attached to the pdf, then you will get the null value back. To use this module, you have to install pdftotext and pdffonts from Xpdf.

Installation

Linux

pdftotext and pdffonts are included in the poppler-utils library. To install poppler-utils execute

apt-get install poppler-utils

Windows

For Windows you can simply download the executables from Xpdf

Usage

the path to the pdf must be the absolute path

Windows

first add :

var pdftext = require('pdf-textstring');

And then just do the following :

pdftext.pdftotext(pdf_path, function (err, data) {
  if(err){
    console.log(err);
  }else{
    console.log(data)
  }
}

if pdftotext and/or pdffonts aren't in the PATH of Windows Then you can simply tell the module where the executables are located.

pdftext.setBinaryPath_PdfToText("AbsolutePath/To/Binary");
pdftext.setBinaryPath_PdfFont("AbsolutePath/To/Binary");
pdftext.pdftotext(pdf_path, function (err, data) {
  if(err){
    console.log(err);
  }else{
    console.log(data)
  }
}

i recommed the usage of the path module to get the absolute path :

var path = require('path');
var AbsolutePathToApp = path.dirname(process.mainModule.filename);
var pathToPdftotext = AbsolutePathToApp + "/binaries/pdftotext.exe";
var pathToPdffonts = AbsolutePathToApp + "/binaries/pdffonts.exe";

Linux

first add :

var pdftext = require('pdf-textstring');

And then just do the following :

pdftext.pdftotext(pdf_path, function (err, data) {
  if(err){
    console.log(err);
  }else{
    console.log(data)
  }
}

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
lib		lib
.travis.yml		.travis.yml
README.md		README.md
index.js		index.js
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

PDF-TextString

Installation

Linux

Windows

Usage

Windows

Linux

About

Uh oh!

Releases

Packages

Uh oh!

Languages

dmoerman/PDF-TextString

Folders and files

Latest commit

History

Repository files navigation

PDF-TextString

Installation

Linux

Windows

Usage

Windows

Linux

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages