Error in Chinese character encoding #463

Merged on Apr 9, 2022 (4 commits)

@gitzjm (Contributor) commented Oct 5, 2018

generic.py, line 492: PyPDF2.utils.PdfReadError: Illegal character in Name Object
utils.py, line 238: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)

Bug fix: a Name Object containing GBK-encoded bytes raised PyPDF2.utils.PdfReadError: Illegal character in Name Object
Bug fix: text containing Chinese characters could not be encoded as latin-1 and raised UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)
Chinese characters cannot be encoded with 'latin-1'
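
The second error can be reproduced outside PyPDF2 with plain Python. The sketch below is only an illustration: `encode_pdf_text` is a hypothetical helper name, and the UTF-8 fallback is an assumption about one reasonable way to handle the failure, not necessarily what this PR changes in utils.py.

```python
def encode_pdf_text(text: str) -> bytes:
    """Hypothetical helper: encode as latin-1, falling back to UTF-8.

    latin-1 only covers code points 0-255, so Chinese characters
    raise UnicodeEncodeError and take the fallback path.
    """
    try:
        return text.encode("latin-1")
    except UnicodeEncodeError:
        # Assumption: UTF-8 is an acceptable fallback for this text.
        return text.encode("utf-8")


# Plain ASCII names still go through latin-1 unchanged.
ascii_bytes = encode_pdf_text("/Name")

# Chinese characters cannot be represented in latin-1.
try:
    "/Name中文".encode("latin-1")
    latin1_failed = False
except UnicodeEncodeError:
    latin1_failed = True

# With the fallback, the same text is encoded as UTF-8 instead.
chinese_bytes = encode_pdf_text("/Name中文")
```

Trying latin-1 first keeps the output byte-for-byte identical for text that was already working, so only the previously failing inputs take the new path.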

@muideen muideen left a comment

I just ran into this same issue and was about to submit a change request to fix it. I think this bug fix should be approved to save other developers headaches in the future.

@ysinsane

I was wondering why all those pull requests were ignored for so long...

@MartinThoma added the is-bug and Tiny labels Apr 6, 2022
@MartinThoma merged commit b76ffcd into py-pdf:master Apr 9, 2022
@MartinThoma (Member)

Thank you for your contribution!

It might take until the end of the month to make the next release. This will be on PyPI by 01.05.2022 at the latest :-)

MartinThoma added a commit that referenced this pull request Apr 10, 2022
- PKG: Make Tests not a subpackage (#728)
- BUG: Fix ASCII85Decode.decode assertion (#729)
- BUG: Error in Chinese character encoding (#463)
- BUG: Code duplication in Scripts/2-up.py
- ROBUST: Guard 'obj.writeToStream' with 'if obj is not None'
- ROBUST: Ignore a /Prev entry with the value 0 in the trailer
- MAINT: Remove Sample_Code (#726)
- TST: Close file handle in test_writer (#722)
- TST: Fix test_get_images (#730)
- DEV: Make tox use pytest and add more Python versions (#721)
- DOC: Many (#720, #723-725, #469)
@pubpub-zz (Collaborator)

@gitzjm / @ysinsane
Sorry to bother you about this very old PR. To strengthen the regression tests, I'm looking for a PDF file where the GBK encoding is required. Can you help me with this?
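
Until a real sample file turns up, the failing input can be approximated in memory. The following is a minimal sketch, assuming a name whose raw bytes are GBK-encoded; `decode_name` is a hypothetical helper for illustration and the UTF-8-then-GBK order is an assumption, not a statement of how generic.py is implemented.

```python
def decode_name(raw: bytes) -> str:
    """Hypothetical helper: decode name bytes, trying UTF-8 before GBK."""
    for codec in ("utf-8", "gbk"):
        try:
            return raw.decode(codec)
        except UnicodeDecodeError:
            continue
    # Neither codec could decode the bytes.
    raise ValueError("Illegal character in Name Object")


# Bytes of a name containing Chinese characters, encoded as GBK.
gbk_name = "/中文".encode("gbk")

# These bytes are not valid UTF-8, which is what triggers the fallback.
try:
    gbk_name.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

decoded = decode_name(gbk_name)
```

A regression test built this way only exercises the decoding step; a real GBK-encoded PDF would still be needed to cover the full read path.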

Labels
is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
Tiny: Pull requests that make a tiny change - and thus should be easy to merge
5 participants