PdfReader - Extract images from specific pages #2535
Environment
Python 3.8
WSL Ubuntu 22.04
Windows 11
pypdf 4.1.0
Issue
I generated a very simple PDF with LibreOffice Writer:
test_image.pdf
This PDF has two pages: the first contains a short text, the second contains an image.
![](https://private-user-images.githubusercontent.com/26071804/316262630-67f68340-a8a0-4b7d-aca7-17f243f7e4fe.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE2MjY4NDYsIm5iZiI6MTcyMTYyNjU0NiwicGF0aCI6Ii8yNjA3MTgwNC8zMTYyNjI2MzAtNjdmNjgzNDAtYThhMC00YjdkLWFjYTctMTdmMjQzZjdlNGZlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzIyVDA1MzU0NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTE3ZDc5MTc3NTY4NzZiYWViYjhmMGI1MmM5ODIxYWRlYWFhYjY2ZDJmZmIwZmE1NTViNTlmMWUxMWUyNWEyZTgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.w2a26rEczXiSoXxepFS9hHhEgKUZHS2pGvLyKVjSiBw)
![](https://private-user-images.githubusercontent.com/26071804/316262640-0b80958a-7d76-40c3-a3a1-4795983b3039.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE2MjY4NDYsIm5iZiI6MTcyMTYyNjU0NiwicGF0aCI6Ii8yNjA3MTgwNC8zMTYyNjI2NDAtMGI4MDk1OGEtN2Q3Ni00MGMzLWEzYTEtNDc5NTk4M2IzMDM5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzIyVDA1MzU0NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTIxNDFjYjIxNjg3ZmJmZDMzZmYzOGE5YTNiYzVhYjVhMWFjOTgzZTRiZGU4YmJlMjE5MTY3OWIyNTY1MTFkYWYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.3EkBB7wKWYBRzm8H9-BqKKwjfnRThp1cRVfgjyW_yfE)
I want to go through the PDF page by page and get the image only from the second page.
The code to reproduce the issue is here:
The result is below:
I expected page 0 to report 0 images, so that the image is extracted only from the second page.
I don't know whether this is normal behaviour.
How can I get the result I am after?
Thanks,
Regards
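For reference, a minimal sketch of the per-page extraction described above. It assumes pypdf's `PdfReader.pages` and the `images` property of a page, whose items expose `name` and `data`; the helper names `extract_page_images` and `run` and the output folder `images` are hypothetical and not part of pypdf. It only shows how images are listed per page and does not explain why page 0 reports an image.

```python
from pathlib import Path


def extract_page_images(reader, page_index, out_dir):
    """Save every image reported for one page and return the written paths.

    `reader` is expected to behave like pypdf.PdfReader: `reader.pages[i].images`
    yields objects with a `name` (file name) and `data` (raw bytes).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image in reader.pages[page_index].images:
        # Prefix with the page index so images from different pages do not clash.
        target = out_dir / f"page{page_index}_{image.name}"
        target.write_bytes(image.data)
        written.append(target)
    return written


def run(pdf_path="test_image.pdf", out_dir="images"):
    """Print how many images each page reports (hypothetical driver)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(pdf_path)
    for index in range(len(reader.pages)):
        saved = extract_page_images(reader, index, out_dir)
        print(f"page {index}: {len(saved)} image(s)")
```

Calling `run()` next to `test_image.pdf` prints one line per page, which makes it easy to compare what each page reports with what is visible on it.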