A simple npm package to extract text content from PDF files. It supports both local file paths and remote URLs.
Requirements:
- Node.js
- node-fetch package
- pdftotext utility from the Poppler library
- Install Node.js (if not already installed):
  - Download and install it from the official Node.js website.
- Install pdftotext:
  - On macOS, install via Homebrew:

        brew install poppler

  - On Ubuntu/Debian-based systems:

        sudo apt-get update
        sudo apt-get install poppler-utils

  - On Windows, download and install Poppler from Poppler for Windows. Ensure the directory containing pdftotext.exe is in your PATH.
- Install PDF Extractor:

      npm i node-react-pdf-extractor
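Since the package relies on the pdftotext binary, it can help to confirm that the binary is reachable before extracting anything. The following is a minimal sketch, not part of the package: `isOnPath` is a hypothetical helper that tries to launch a command with Node's built-in `child_process` module and reports whether the launch succeeded.

```javascript
import { spawnSync } from "node:child_process";

// Hypothetical helper (not part of node-react-pdf-extractor):
// returns true if `command` can be launched from the current PATH.
function isOnPath(command) {
  // "-v" asks pdftotext to print its version; the output is discarded,
  // because only the success or failure of the launch matters here.
  const result = spawnSync(command, ["-v"], { stdio: "ignore" });
  // `error` is only set when the process could not be spawned (e.g. ENOENT).
  return result.error === undefined;
}

if (isOnPath("pdftotext")) {
  console.log("pdftotext found on PATH");
} else {
  console.error("pdftotext not found; install Poppler and check your PATH");
}
```

The check only looks at whether the process could be started, not at its exit code, so it should not depend on how a particular Poppler build reports its version.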
Extract text from a remote URL:

    import { extractPdf } from "node-react-pdf-extractor";

    const url =
      "https://file-examples.com/storage/fed5266c9966708dcaeaea6/2017/10/file-example_PDF_500_kB.pdf";

    try {
      // `await` is required if extractPdf returns a Promise and harmless if it does not.
      const data = await extractPdf(url);
      console.log("============== DATA", data);
    } catch (error) {
      console.error("============== ERROR", error);
    }
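Extract text from a local file:

    import { extractPdf } from "node-react-pdf-extractor";

    const filePath = "./test.pdf";

    try {
      // `await` is required if extractPdf returns a Promise and harmless if it does not.
      const data = await extractPdf(filePath);
      console.log("============== DATA", data);
    } catch (error) {
      console.error("============== ERROR", error);
    }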
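The two examples pass a remote URL and a local path to the same function. If calling code needs to tell the two apart, for instance to log the source type or to check that a local file exists first, a small helper along these lines may be useful. This is an illustrative sketch only: `isRemoteUrl` is a hypothetical name and does not describe how the package itself distinguishes its inputs.

```javascript
// Hypothetical helper (not part of node-react-pdf-extractor):
// returns true for absolute http(s) URLs, false for anything else.
function isRemoteUrl(source) {
  try {
    const { protocol } = new URL(source);
    return protocol === "http:" || protocol === "https:";
  } catch {
    // `new URL` throws for relative or bare paths such as "./test.pdf",
    // so those are treated as local files.
    return false;
  }
}

console.log(isRemoteUrl("https://example.com/sample.pdf")); // true
console.log(isRemoteUrl("./test.pdf")); // false
```

Checking the protocol explicitly, rather than only whether the string parses as a URL, keeps Windows paths such as `C:\docs\file.pdf` (which parse with a `c:` scheme) on the local side.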