This repository has been archived by the owner on May 3, 2024. It is now read-only.

XpdfText

Mark Geurts edited this page Mar 20, 2015 · 9 revisions

XpdfText() returns the text of a PDF file by executing the Xpdf pdftotext command. The function is a MATLAB wrapper for pdftotext and returns a cell array with one cell per page, each containing a cell array of that page's text lines. It was written against pdftotext version 3.04.

Syntax

text = XpdfText(filename)
text = XpdfText(..., filename)

Example

% Retrieve the text from test.pdf
text = XpdfText('/path/to/file/', 'test.pdf');

% Print the text from the first page
fprintf('%s\n', text{1}{1:end});

Input Arguments

| Variable | Type | Description |
|----------|------|-------------|
| filename | cell array of strings | The full PDF file name. The name can be provided either as a single string (in filename) containing the path and file, or as separate strings. If separate strings are provided (in varargin), they are concatenated using fullfile(). |
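
As a rough illustration of the two calling forms described above, the sketch below shows how separate path components and a single full path can resolve to the same name through fullfile(). The `resolveName` handle is a hypothetical stand-in introduced for this example only; it is not taken from the XpdfText source, which may handle its arguments differently.

```matlab
% Hypothetical helper (not part of XpdfText): pass every input argument
% to fullfile(), which joins them with the platform's file separator.
resolveName = @(varargin) fullfile(varargin{:});

% Single string containing both the path and the file
single = resolveName('/path/to/file/test.pdf');

% Separate strings for the path and the file
joined = resolveName('/path/to/file/', 'test.pdf');

% Both forms are expected to name the same file
fprintf('%s\n%s\n', single, joined);
```

Because fullfile() inserts or collapses file separators as needed, the trailing slash on the path component should not matter.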

Output Arguments

Variable Type Description
| Variable | Type | Description |
|----------|------|-------------|
| text | n x 1 cell array | Text data for n pages. Each cell contains an m x 1 cell array of strings, where m is the number of lines on that page. Note that pdftotext is executed with the -table flag, so the lines are optimized for tables. |
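
The nested layout described above can be explored with standard cell array functions. The following is a minimal sketch that uses made-up sample data in place of a real XpdfText() call, so the page contents and the search term 'Dose' are assumptions for illustration only.

```matlab
% Hypothetical sample data with the documented shape: an n x 1 cell array
% of pages, each holding an m x 1 cell array of line strings. In real use
% this would come from text = XpdfText('/path/to/file/', 'test.pdf');
text = {
    {'Plan Summary'; 'Dose   Volume'; '10.0   95.2'}
    {'Notes'; 'No further data'}
    };

% Number of lines on each page (n x 1 vector)
linesPerPage = cellfun(@numel, text);

% Join the lines of the first page into one newline-delimited string
firstPage = strjoin(text{1}', sprintf('\n'));

% Flag the pages where any line contains a given substring
hasDose = cellfun(@(page) any(~cellfun(@isempty, ...
    strfind(page, 'Dose'))), text);

% Report the line count of every page
fprintf('Page %d has %d line(s)\n', [1:numel(text); linesPerPage']);
```

Since the lines come from pdftotext's table mode, column values on a line are typically separated by runs of spaces, so a later step could split each line on whitespace if tabular values are needed.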
