Laravel package for extracting text or HTML from a PDF, or converting it to images (PNG, JPEG).
We invest a lot of time and give our hearts to work in Open Source.
You can install the package via composer:

composer require hub-io/laravel-pdf-to

You can publish the config file with:

php artisan vendor:publish --tag="laravel-pdf-to-config"

This is the content of the published config file:

return [
    /**
     * Set the pdftotext binary path manually
     */
    'pdftotext_bin' => env('PDF_TO_TEXT_PATH'),

    /**
     * Set the pdftohtml binary path manually
     */
    'pdftohtml_bin' => env('PDF_TO_HTML_PATH'),

    /**
     * Set the pdftoppm binary path manually
     */
    'pdftoppm_bin' => env('PDF_TO_PPM_PATH'),

    /**
     * Set the pdftocairo binary path manually
     */
    'pdftocairo_bin' => env('PDF_TO_CAIRO_PATH'),

    /**
     * Set the default output directory
     */
    'output_dir' => env('PDF_TO_OUTPUT_DIR', storage_path('app/pdf-to')),
];

This package relies on the following external tools:
- pdftotext: For extracting text from PDFs.
- pdftohtml: For converting PDFs to HTML.
- pdftoppm: For generating images from PDFs.
Make sure these tools are installed and available in your system's PATH. On macOS, you can install them via Homebrew:

brew install poppler

By default, the package attempts to locate the required binaries (pdftotext, pdftohtml, pdftoppm, or pdftocairo) automatically. If they are not found in your system's PATH, you will need to set their paths manually in the configuration file.
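
To see which of these tools PHP can actually find before you touch the configuration, a quick check along these lines may help. This is a minimal sketch, not the package's own lookup logic: `findBinaryInPath` is a hypothetical helper written for this example, and it assumes a Unix-like system where executables have no file extension.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package): look for an executable
 * named $name in each directory of a PATH-style string.
 */
function findBinaryInPath(string $name, ?string $pathEnv = null): ?string
{
    // Use the current process's PATH when no explicit value is passed in.
    $pathEnv ??= (string) getenv('PATH');

    foreach (explode(PATH_SEPARATOR, $pathEnv) as $dir) {
        if ($dir === '') {
            continue;
        }

        $candidate = rtrim($dir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $name;

        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null;
}

// Report which Poppler tools are visible to PHP.
foreach (['pdftotext', 'pdftohtml', 'pdftoppm', 'pdftocairo'] as $tool) {
    $path = findBinaryInPath($tool);
    echo str_pad($tool, 12) . ($path ?? 'not found, set its path in config/pdf-to.php') . PHP_EOL;
}
```

Any path this prints can be copied into the matching environment variable (for example `PDF_TO_TEXT_PATH`). Keep in mind that the PATH seen by your web server or queue worker may differ from the one in your shell.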
You can update the configuration file config/pdf-to.php as follows:

return [
    'pdftotext_bin' => env('PDF_TO_TEXT_PATH'),
    'pdftohtml_bin' => env('PDF_TO_HTML_PATH'),
    'pdftoppm_bin' => env('PDF_TO_PPM_PATH'),
    'pdftocairo_bin' => env('PDF_TO_CAIRO_PATH'),
];

For text extraction, this package uses spatie/pdf-to-text. For image generation, if you don't have pdftoppm or pdftocairo installed, you can use spatie/pdf-to-image instead. This fallback keeps image generation working even without those binaries.
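
The fallback for image generation can be pictured roughly as follows. This is only an illustrative sketch under stated assumptions: `chooseImageBackend` is a hypothetical function, the preference order (pdftoppm, then pdftocairo) is assumed rather than taken from the package, and the returned strings are just labels.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: choose an image backend from a map of
 * binary name => resolved path (null when the binary is missing).
 */
function chooseImageBackend(array $binaries): string
{
    // Assumed order: prefer the Poppler tools when either one is available.
    foreach (['pdftoppm', 'pdftocairo'] as $tool) {
        if (!empty($binaries[$tool])) {
            return $tool;
        }
    }

    // Neither binary is available: fall back to the Spatie library.
    return 'spatie/pdf-to-image';
}

echo chooseImageBackend(['pdftoppm' => null, 'pdftocairo' => '/usr/bin/pdftocairo']) . PHP_EOL; // pdftocairo
echo chooseImageBackend(['pdftoppm' => null, 'pdftocairo' => null]) . PHP_EOL; // spatie/pdf-to-image
```
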
use Hubio\LaravelPdfTo\Facades\LaravelPdfTo;
$text = LaravelPdfTo::setFile('path/to/your/file.pdf')
    ->setTimeout(120) // optional
    ->result('txt');
echo $text;

use Hubio\LaravelPdfTo\Facades\LaravelPdfTo;
$html = LaravelPdfTo::setFile('path/to/your/file.pdf')
    ->setConfig(['options' => [...]]) // optional
    ->saveAs('output-file') // optional; if you want to store the output as a file, result() then returns its path
    ->result('html');
echo $html;

use Hubio\LaravelPdfTo\Facades\LaravelPdfTo;
$image = LaravelPdfTo::setFile('path/to/your/file.pdf')
    ->setTimeout(180) // optional
    ->result('png');
echo $image; // Path to the generated image

To run the tests, use the following command:

composer test

The tests cover text extraction, HTML conversion, and image generation from PDFs. Example test files are located in the tests/ directory.
Please see CHANGELOG for more information on what has changed recently.
Please see CONTRIBUTING for details.
Please review our security policy on how to report security vulnerabilities.
The MIT License (MIT). Please see License File for more information.