`PageObject.extract_text`'s `visitor_text` reports a wrong matrix for some text nodes

While trying to extract lemmas from a PDF page (attached below), I found that some text "nodes" (I'm not sure what the technical term is, so I'll call them nodes in this issue) are passed to `visitor_text` with seemingly wrong `matrix` values.

## Environment
```bash
$ python -m platform
Linux-6.5.0-21-generic-x86_64-with-glibc2.35
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==4.1.0, crypt_provider=('cryptography', '3.4.8'), PIL=9.0.1
```

## Code + PDF

This is a minimal, complete example that shows the issue. In a PDF reader, the nodes `ZURRA˓A, KHIRBE` and `T EL` appear next to each other. Save the script below (for example as `example.py`) and run it, passing the path to the attached PDF as the first argument.

```python
import pypdf
import sys

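def main():
    reader = pypdf.PdfReader(sys.argv[1], strict=True)
    page = reader.pages[0]

    # pypdf calls the visitor with: text, current transformation matrix,
    # text matrix, font dictionary and font size.
    def text_visitor(text, transform, matrix, font_dict, font_size):
        # Only report the two nodes that sit next to each other on the page.
        if "T EL" in text or "ZURRA˓A, KHIRBE" in text:
            print(f"{text!r} has matrix {matrix}")

    page.extract_text(visitor_text=text_visitor)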

if __name__ == "__main__":
    main()
```

Observe that the output is:
```bash
$ python example.py ./zurra_page.pdf 
'ZURRA˓A, KHIRBE' has matrix [1.0, 0.0, 0.0, 1.0, 50.4, 687.12]
' T EL' has matrix [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```

I expected the last two elements of the matrix reported for the `T EL` node to be the node's x and y position (which pdfbox shows to be `177.92` and `687.12`, respectively).
I also noticed that pdfbox reports the node's text as `T EL`, whereas pypdf reports ` T EL` (note the leading space). Is pypdf mistakenly adding a leading space?
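
To make explicit how I am reading a position out of the reported values, here is a minimal sketch. It assumes that both arguments are 6-element lists in the PDF `[a, b, c, d, e, f]` order, that `transform` is the current transformation matrix, and that the text matrix is applied first. The helper names `concat` and `origin` are my own and are not part of pypdf.

```python
def concat(m, n):
    """Return m x n for two PDF affine matrices given as [a, b, c, d, e, f]."""
    return [
        m[0] * n[0] + m[1] * n[2],
        m[0] * n[1] + m[1] * n[3],
        m[2] * n[0] + m[3] * n[2],
        m[2] * n[1] + m[3] * n[3],
        m[4] * n[0] + m[5] * n[2] + n[4],
        m[4] * n[1] + m[5] * n[3] + n[5],
    ]


def origin(transform, matrix):
    """Return the (x, y) origin of a text node (hypothetical helper).

    Assumption: the text matrix is applied first, then the current
    transformation matrix, so the translation part of the product is
    the position of the node on the page.
    """
    combined = concat(matrix, transform)
    return combined[4], combined[5]


identity = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]

# Matrix reported for 'ZURRA˓A, KHIRBE': the position looks plausible.
print(origin(identity, [1.0, 0.0, 0.0, 1.0, 50.4, 687.12]))

# Matrix reported for ' T EL': an identity text matrix puts the node at
# the page origin, which is the symptom described in this issue.
print(origin(identity, identity))
```

With an identity transformation matrix this reduces to reading `matrix[4]` and `matrix[5]`, which is why I expected those two elements to match the coordinates that pdfbox shows.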

### Files

The sample PDF used with this is a page from a PDF version of the Anchor Bible Dictionary: [zurra_page.pdf](https://github.com/py-pdf/pypdf/files/14551057/zurra_page.pdf)

This is the same page in pdfbox's debugger, which clearly shows the coordinates of the `T EL` node:
![image](https://github.com/py-pdf/pypdf/assets/32435769/bf26e818-3f45-4af5-b6b7-f82d5a382772)

## Traceback
There is no exception raised, so there also is no traceback.

