gettext empty result #652

bigmoney99 · 2023-11-14T14:12:56Z

Hello, I want to extract the text from this PDF, but the result is empty.
https://www.mediafire.com/file/azb7yddqo2ry55j/123.pdf/file

This is my code:

// Parse the PDF file using the PdfParser library
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile($file);
$metaData = $pdf->getDetails();
print_r($metaData);
$pages = $pdf->getPages();
foreach ($pages as $page) {
    $text = $page->getText();
    echo "<div>".$text."</div>";
}
echo $file;

The result is just:

Array
(
    [Producer] => cairo 1.17.4 (https://cairographics.org
    [Pages] => 1
)
<div></div>D:\web\D\public\pdf_po/123.pdf

GreyWyvern · 2023-11-15T16:17:46Z

The issue seems to appear in both 2.7.0 and 2.8.0rc. For some reason, no text content sections are found and passed to formatContent() for parsing. The text is selectable in a PDF reader, so the file does contain text. More research is needed.
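
One way to narrow this down is to list every object the parser found and how much decoded content each one carries. The sketch below is only a diagnostic aid, not a fix: summarizeObjects() is a hypothetical helper (not part of the library), the default file name is taken from the report above, and it assumes that Document::getObjects() and PDFObject::getContent() behave as in the 2.x releases and that the library was installed with Composer.

```php
<?php
// Hypothetical helper (not part of smalot/pdfparser): maps each object id
// to its class name and the byte length of its decoded content.
function summarizeObjects(array $objects): array
{
    $summary = [];
    foreach ($objects as $id => $object) {
        // getContent() may return null for objects without a stream.
        $content = (string) $object->getContent();
        $summary[$id] = [
            'class' => get_class($object),
            'bytes' => strlen($content),
        ];
    }
    return $summary;
}

// Assumption: Composer autoloader sits next to this script.
$autoload = __DIR__ . '/vendor/autoload.php';
// Assumption: the PDF path is passed as the first CLI argument.
$file = $argv[1] ?? '123.pdf';

if (is_file($autoload) && is_file($file)) {
    require $autoload;
    $pdf = (new \Smalot\PdfParser\Parser())->parseFile($file);
    foreach (summarizeObjects($pdf->getObjects()) as $id => $info) {
        printf("%s  %s  %d bytes\n", $id, $info['class'], $info['bytes']);
    }
}
```

If the page's content object shows up with 0 bytes, or is missing from the list, the text is probably being lost before formatContent() is reached rather than inside it.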

k00ni added the bug label Nov 15, 2023
