UnicodeDecodeError When Reading Non-ANSI Characters in PPTX Files #63

Open
ayoubelmhamdi opened this issue Jun 21, 2024 · 0 comments

@ayoubelmhamdi

I'm encountering a UnicodeDecodeError when attempting to read PPTX files that contain non-ANSI characters using pptx2md. Below is the error traceback:

pptx2md Radiobiologie.txt -t pptx.pptx 
Traceback (most recent call last):
  File "/tmp/new_1/venv/bin/pptx2md", line 8, in <module>
    sys.exit(main())
  File "/tmp/new_1/venv/lib/python3.12/site-packages/pptx2md/__main__.py", line 66, in main
    prepare_titles(args.title)
  File "/tmp/new_1/venv/lib/python3.12/site-packages/pptx2md/__main__.py", line 17, in prepare_titles
    for line in f.readlines():
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 14: invalid start byte

The traceback shows that the failure happens in prepare_titles, while the file passed with -t is being read line by line. That file is decoded as UTF-8, and byte 0xa1 at position 14 is not a valid UTF-8 start byte, so its contents are not valid UTF-8 text.
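
One way to make the reading of a text file, such as the one passed with -t, tolerant of non-UTF-8 input is to try several encodings in turn. The sketch below is not pptx2md's code: read_title_lines is a hypothetical helper, and the fallback to cp1252 is only an assumption about how such a file might have been saved.

```python
from pathlib import Path


def read_title_lines(path, encodings=("utf-8", "cp1252")):
    """Return the non-empty, stripped lines of a text file.

    Hypothetical helper for illustration only. Each encoding is tried in
    order; the first one that decodes the whole file without error wins.
    """
    data = Path(path).read_bytes()
    for encoding in encodings:
        try:
            text = data.decode(encoding)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
        break
    else:
        # No candidate worked: decode anyway, replacing undecodable bytes
        # with U+FFFD instead of raising.
        text = data.decode(encodings[0], errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Under these assumptions, a file containing the byte 0xa1 would be read through the cp1252 fallback instead of raising UnicodeDecodeError. A file that is not text at all would still decode to meaningless characters, so this only helps when the input really is a text file in a legacy encoding.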
