pdf-split is a package that splits a PDF into individual pages and caches the results on the file system. It relies on pdfcpu for the split operation and @chriscdn/file-cache for file caching and cleanup.
Using npm:
npm install @chriscdn/pdf-split
Using yarn:
yarn add @chriscdn/pdf-split
Create a PDFSplitFileCache instance:
import { PDFSplitFileCache, Rotate } from "@chriscdn/pdf-split";
import { Duration } from "@chriscdn/duration";
const splitCache = new PDFSplitFileCache({
  cachePath: "/path/to/cache/directory",
  ttl: Duration.toMilliseconds({ days: 7 }),
});
This assumes that pdfcpu is available on the system PATH. If it is not, provide the path to the binary with the pdfcpu parameter in the constructor:
const splitCache = new PDFSplitFileCache({
  cachePath: "/path/to/cache/directory",
  ttl: Duration.toMilliseconds({ days: 7 }),
  pdfcpu: "/opt/homebrew/bin/pdfcpu",
});
The PDFSplitFileCache class extends FileCache from @chriscdn/file-cache. All constructor arguments from FileCache are supported except for cb and ext. The cache, including automatic cleanup of expired files, is managed by FileCache.
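When it is not known in advance whether pdfcpu is on the PATH, the value for the pdfcpu parameter can be resolved at startup. The following is a minimal sketch, not part of @chriscdn/pdf-split: findOnPath is a hypothetical helper, and the fallback location is only the example path used above. It checks POSIX execute permission and does not handle Windows executable extensions.

```typescript
import { accessSync, constants } from "node:fs";
import { delimiter, join } from "node:path";

// Hypothetical helper (not provided by the package): returns the first
// executable named `binary` found in the directories of `pathVar`,
// or undefined when none is found.
function findOnPath(
  binary: string,
  pathVar: string = process.env.PATH ?? "",
): string | undefined {
  for (const dir of pathVar.split(delimiter)) {
    if (dir === "") continue; // skip empty PATH entries
    const candidate = join(dir, binary);
    try {
      // Throws if the file is missing or not executable by this process.
      accessSync(candidate, constants.X_OK);
      return candidate;
    } catch {
      // Not in this directory; keep looking.
    }
  }
  return undefined;
}

// Assumed fallback location; adjust for your system.
const fallbackPdfcpuPath = "/opt/homebrew/bin/pdfcpu";
const pdfcpu = findOnPath("pdfcpu") ?? fallbackPdfcpuPath;
```

The resulting pdfcpu value can then be passed to the constructor alongside cachePath and ttl.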
Retrieve the file path to a PDF page:
const firstPageFilePath = await splitCache.getFile({
  pdfFilePath: "/path/to/your/pdf/file.pdf",
  pageIndex: 0,
  rotate: Rotate.DEG_0,
});
Notes:
- pageIndex is 0-based.
- rotate is optional, and can be set to DEG_0 (default), DEG_90, DEG_180, or DEG_270.
- The cache key is based on pdfFilePath and pageIndex. Ensure that unique PDFs have unique names to avoid cache collisions.
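One way to keep names unique is to derive the stored file name from the PDF's contents, so that two different documents never share a name. The sketch below is an assumption about how a caller might do this, not something the package provides: contentAddressedName is a hypothetical helper, and SHA-256 is an arbitrary choice of digest.

```typescript
import { createHash } from "node:crypto";
import { extname } from "node:path";

// Hypothetical helper (not part of @chriscdn/pdf-split): builds a file name
// from the SHA-256 digest of the PDF bytes. Identical content always maps to
// the same name; different content maps to a different name.
function contentAddressedName(
  pdfBytes: Uint8Array,
  originalName: string,
): string {
  const digest = createHash("sha256").update(pdfBytes).digest("hex");
  // Keep the original extension, defaulting to ".pdf" when there is none.
  const ext = extname(originalName) || ".pdf";
  return `${digest}${ext}`;
}
```

Saving each incoming PDF under such a name before passing its path as pdfFilePath means that a replaced or re-uploaded document with different content gets a different path, and so should not reuse pages cached for the old one.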
Get the page count:
const pageCount = await splitCache.pageCount("/path/to/your/pdf/file.pdf");
Return an array containing the full path to each page of the split PDF. The length of the array should match pageCount:
const pages = await splitCache.pages("/path/to/your/pdf/file.pdf");
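To picture how pageCount, getFile, and pages relate, the sketch below builds the same kind of array by hand from the two lower-level calls. It illustrates the relationship only and makes no claim about how the package implements pages. PageSource and collectPagePaths are hypothetical names, and the interface lists just the two method shapes inferred from the examples above.

```typescript
// Minimal structural type covering only the two calls used below.
// The shapes are inferred from the usage examples, not from the package types.
interface PageSource {
  pageCount(pdfFilePath: string): Promise<number>;
  getFile(args: { pdfFilePath: string; pageIndex: number }): Promise<string>;
}

// Hypothetical helper: requests every page in order, using 0-based indices,
// and returns the resulting file paths.
async function collectPagePaths(
  source: PageSource,
  pdfFilePath: string,
): Promise<string[]> {
  const count = await source.pageCount(pdfFilePath);
  const paths: string[] = [];
  for (let pageIndex = 0; pageIndex < count; pageIndex++) {
    paths.push(await source.getFile({ pdfFilePath, pageIndex }));
  }
  return paths;
}
```

Because the loop runs from 0 to count - 1, the returned array has exactly one entry per page, which is the same length guarantee described for pages.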