Support rotated pages with extraction_mode="layout"

## Explanation

When extracting text from rotated pages in layout mode, neither of the current options produces useful output:
- If `layout_mode_strip_rotated=True`, a warning is issued and there is no output.
- If `layout_mode_strip_rotated=False`, a warning is issued and the output is garbled.

I propose adding an optional keyword argument `orientation: {"infer", 0, 90, 180, 270} = "infer"` to `PageObject.extract_text`. `"infer"` could use either `page['/Rotate']` or the actual rotation of the text. I have no strong preference about the name: `orientation`, `layout_mode_orientation`, `rotation`, etc. would all work for me.
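To make the `"infer"` behavior concrete, here is a minimal sketch of how the argument might be resolved to an angle, assuming the `/Rotate`-based variant. The helper `resolve_orientation` and its `page_rotate` parameter are hypothetical names for illustration; nothing like this exists in pypdf today.

```python
VALID_ANGLES = (0, 90, 180, 270)


def resolve_orientation(orientation="infer", page_rotate=0):
    """Hypothetical helper: map the proposed `orientation` argument to an angle.

    `page_rotate` stands in for the value of page['/Rotate'] (a multiple of 90,
    possibly negative or >= 360, or None when the key is absent).
    """
    if orientation == "infer":
        # Assumption: "infer" falls back to the page's /Rotate entry.
        angle = (page_rotate or 0) % 360
    else:
        angle = orientation
    if angle not in VALID_ANGLES:
        raise ValueError(
            f"orientation must be 'infer' or one of {VALID_ANGLES}, got {orientation!r}"
        )
    return angle
```

With this shape, an explicit angle always wins over the page's `/Rotate`, which is what allows the header and body of the same page to be extracted separately.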

I think it is better to add a keyword argument than to implicitly use `page['/Rotate']`, so that different groups of rotated text can be extracted from the same page. For example, the page header/footer may have 0 rotation while the page content is rotated 90 degrees. There is value in being able to extract each.
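As a rough illustration of the "actual rotation of the text" idea, the sketch below buckets text fragments by the rotation implied by their text matrix `[a, b, c, d, e, f]`. The `(text, tm)` fragment shape and both function names are assumptions made for this example. The angle is measured counterclockwise in PDF user space and ignores the current transformation matrix, so a real implementation would need to combine both matrices and reconcile the sign with `/Rotate`, which is clockwise.

```python
import math
from collections import defaultdict


def text_rotation(tm):
    """Return the rotation of a text matrix, snapped to 0, 90, 180 or 270."""
    a, b = tm[0], tm[1]
    # For a pure rotation by theta, a = cos(theta) and b = sin(theta).
    angle = math.degrees(math.atan2(b, a))
    return int(round(angle / 90.0)) * 90 % 360


def group_by_rotation(fragments):
    """Group (text, tm) pairs by snapped rotation, preserving input order."""
    groups = defaultdict(list)
    for text, tm in fragments:
        groups[text_rotation(tm)].append(text)
    return dict(groups)


fragments = [
    ("Page header", [1, 0, 0, 1, 72, 760]),   # unrotated text
    ("Body line 1", [0, 1, -1, 0, 100, 72]),  # rotated 90 degrees
    ("Body line 2", [0, 1, -1, 0, 120, 72]),
]
groups = group_by_rotation(fragments)
```

Each group could then be laid out independently, which is the per-orientation extraction the proposal is asking for.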

[rotated-page.pdf](https://github.com/user-attachments/files/19981120/rotated-page.pdf)

## Code Example


```python
from pypdf import PdfReader

reader = PdfReader("./rotated-page.pdf")
page = reader.pages[0]

# Proposed API: the `orientation` keyword does not exist in pypdf yet.
# For a 90-degree rotated page, all three calls would have the same effect.
page.extract_text(extraction_mode="layout")
page.extract_text(extraction_mode="layout", orientation="infer")
page.extract_text(extraction_mode="layout", orientation=90)

# Collect different sections of a page while preserving the layout of each.
header = page.extract_text(extraction_mode="layout", orientation=0)
body = page.extract_text(extraction_mode="layout", orientation=90)
```

Support rotated pages with extraction_mode="layout" #3270


Activity

stefan6419846 commented on Apr 30, 2025

hackowitz-af commented on Apr 30, 2025

hackowitz-af commented on Apr 30, 2025

stefan6419846 commented on May 1, 2025

shartzog commented on May 20, 2025
