Running the script normally seems to work, printing out the full file.
However, if I redirect the output to a file or pipe it to Tee-Object:
python .\pdf2txt.py file.pdf > file.txt
or
python .\pdf2txt.py file.pdf | Tee-Object file.txt
I get the following error (in both Command Prompt and PowerShell):
Traceback (most recent call last):
  File "C:\Users\user\Downloads\pdfminer-env\Scripts\pdf2txt.py", line 317, in <module>
    sys.exit(main())
             ^^^^^^
  File "C:\Users\user\Downloads\pdfminer-env\Scripts\pdf2txt.py", line 311, in main
    outfp = extract_text(**vars(parsed_args))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\Downloads\pdfminer-env\Scripts\pdf2txt.py", line 62, in extract_text
    pdfminer.high_level.extract_text_to_fp(fp, **locals())
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\high_level.py", line 132, in extract_text_to_fp
    interpreter.process_page(page)
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\pdfinterp.py", line 998, in process_page
    self.device.end_page(page)
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\converter.py", line 81, in end_page
    self.receive_layout(self.cur_item)
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\converter.py", line 352, in receive_layout
    render(ltpage)
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\converter.py", line 341, in render
    render(child)
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\converter.py", line 341, in render
    render(child)
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\converter.py", line 341, in render
    render(child)
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\converter.py", line 343, in render
    self.write_text(item.get_text())
  File "C:\Users\user\Downloads\pdfminer-env\Lib\site-packages\pdfminer\converter.py", line 335, in write_text
    cast(TextIO, self.outfp).write(text)
  File "C:\Program Files\Python311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\x83' in position 0: character maps to <undefined>
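
For anyone trying to reproduce this without the PDF, here is a minimal sketch of what the last frames of the traceback appear to show: text containing '\x83' being written to a stream that encodes with cp1252. The helper name write_with_encoding is hypothetical and is not part of pdfminer. It only imitates a redirected stdout by wrapping an in-memory buffer, on the assumption that redirected output on this machine falls back to the cp1252 code page.

```python
import io


def write_with_encoding(text: str, encoding: str, errors: str = "strict") -> bytes:
    """Hypothetical helper: write text through a text stream with the given
    encoding and return the bytes that reach the underlying buffer."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # release the buffer without closing it
    return data


if __name__ == "__main__":
    sample = "\x83"  # the character named in the error message

    # UTF-8 can represent the character.
    print(write_with_encoding(sample, "utf-8"))

    # cp1252 cannot, which raises the same UnicodeEncodeError as above.
    try:
        write_with_encoding(sample, "cp1252")
    except UnicodeEncodeError as exc:
        print("cp1252 failed:", exc)

    # A lossy alternative: substitute '?' for characters cp1252 lacks.
    print(write_with_encoding(sample, "cp1252", errors="replace"))
```

If this is indeed the cause, setting the environment variable PYTHONIOENCODING=utf-8 (or PYTHONUTF8=1) before running the script, or calling sys.stdout.reconfigure(encoding="utf-8") early in the script, may avoid the error. I have not confirmed either against this particular file.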