Text missing from PDF with large ratio #1202

### Bug
I am testing various PDF files to improve a solution I am building on top of docling. I came across a PDF with an unusual aspect ratio that was saved from a website. When loaded with docling, many characters that should be accessible as raw text are omitted. The file is attached to this issue. Loading the same file with other libraries, e.g. PyPDF, works as intended.

Problematic PDF:
[Baldur's Gate III Guide - IGN (1).pdf](https://github.com/user-attachments/files/19339342/Baldur.s.Gate.III.Guide.-.IGN.1.pdf)

PyPDF Parsing Output:
[PyPDF_Parsing_Output.txt](https://github.com/user-attachments/files/19339407/PyPDF_Parsing_Output.txt)

Docling Parsing Output:
[Docling_Parsing_Output.txt](https://github.com/user-attachments/files/19339406/Docling_Parsing_Output.txt)


### Steps to reproduce
Run the basic example with the problematic PDF:

```python
from docling.document_converter import DocumentConverter

# Path to the attached PDF, downloaded locally
source = "Baldur.s.Gate.III.Guide.-.IGN.1.pdf"
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())
```
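
To put a number on how much text is lost, the two outputs can be compared word by word. The following is only a rough sketch and not part of docling: the helpers `missing_words`, `coverage`, and `extract_both` are hypothetical names introduced here, and it assumes the `pypdf` package is the PyPDF library used for the reference output.

```python
import re
from collections import Counter


def _tokens(text: str) -> list[str]:
    # Lowercased word tokens; punctuation and Markdown syntax are ignored.
    return re.findall(r"\w+", text.lower())


def missing_words(reference_text: str, candidate_text: str) -> Counter:
    """Words that occur more often in the reference than in the candidate."""
    return Counter(_tokens(reference_text)) - Counter(_tokens(candidate_text))


def coverage(reference_text: str, candidate_text: str) -> float:
    """Fraction of reference word occurrences also found in the candidate."""
    total = len(_tokens(reference_text))
    if total == 0:
        return 1.0
    missing = sum(missing_words(reference_text, candidate_text).values())
    return 1 - missing / total


def extract_both(pdf_path: str) -> tuple[str, str]:
    """Return (pypdf_text, docling_text) for the same file.

    Imports are local so the comparison helpers work without either library.
    Assumption: `pypdf` is installed and is the "PyPDF" referred to above.
    """
    from pypdf import PdfReader
    from docling.document_converter import DocumentConverter

    reader = PdfReader(pdf_path)
    pypdf_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    docling_text = (
        DocumentConverter().convert(pdf_path).document.export_to_markdown()
    )
    return pypdf_text, docling_text


# Example usage (requires pypdf, docling, and the attached PDF):
# pypdf_text, docling_text = extract_both("Baldur.s.Gate.III.Guide.-.IGN.1.pdf")
# print(f"coverage: {coverage(pypdf_text, docling_text):.1%}")
# print(missing_words(pypdf_text, docling_text).most_common(20))
```

A coverage value well below 1.0, with ordinary words in the `missing_words` output, would confirm that docling drops text that PyPDF can read from this file.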

### Docling version
Docling version: 2.27.0
Docling Core version: 2.23.2
Docling IBM Models version: 3.4.0
Docling Parse version: 4.0.0
Python: cpython-311 (3.11.10)
Platform: Linux-5.15.0-94-generic-x86_64-with-glibc2.35

### Python version
Python 3.11.10


