Excessive spacing when extracting text from PDF

Hello,

Thanks for this awesome work. I'm using PyMuPDF to provide blind people with an accessible eBook reading experience through [Bookworm](https://github.com/mush42/bookworm/).

## Describe the bug (mandatory)
PyMuPDF v1.16.0 and later introduced a weird bug with certain PDF documents. When using the text extraction functions, excessive spaces are inserted between characters, so the extracted text is just a blob of characters. I have to admit that this problem is rare: it only happens with certain kinds of PDF documents. In the v1.14.x versions the text is extracted correctly.

## To Reproduce (mandatory)
Here is the text extraction function straight from the FitzDocument backend:

```python
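def _text_from_page(page):
    # Each block is a tuple; index 4 holds the block's text.
    blocks = page.getTextBlocks()
    # Flatten each block to one line, then join blocks as paragraphs.
    text = [blk[4].replace("\n", " ") for blk in blocks]
    return "\r\n".join(text)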
```
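In case it helps with triage, here is a minimal sketch of how affected pages could be flagged from the string returned by `_text_from_page`. The helper name `looks_over_spaced` and the 0.6 threshold are my own assumptions for illustration, not part of PyMuPDF or Bookworm; it only checks whether most whitespace-separated tokens are single characters, which is the symptom described above.

```python
def looks_over_spaced(text, threshold=0.6):
    """Heuristic: True if most whitespace-separated tokens are one character long.

    Hypothetical helper; the threshold is an arbitrary assumption.
    """
    tokens = text.split()
    if not tokens:
        # Empty or whitespace-only text is not treated as affected.
        return False
    single_char_tokens = sum(1 for tok in tokens if len(tok) == 1)
    return single_char_tokens / len(tokens) >= threshold
```

Running it over the text of each page in a document should make it easy to see whether only some pages, or the whole file, are affected.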

## Expected behavior (optional)
The text is extracted preserving its basic structure (words and paragraphs).

## Your configuration (mandatory)
 - Operating system: Windows 10 Pro, 64-bit
 - Python: Python 3.7.4 64-bit/32-bit
 - PyMuPDF version: v1.16.0, installed from PyPI


## Additional context (optional)
The problem still happens with the latest version of PyMuPDF (v1.16.2). Playing with the text extraction flags didn't help: it gives several variations, none of which solves the issue.

This problem happens with the Dart language specification and several other documents. As an example, the Dart language specification is attached.
[DartLangSpec-v2.2.pdf](https://github.com/pymupdf/PyMuPDF/files/3613623/DartLangSpec-v2.2.pdf)

