Error in Chinese character encoding #463

gitzjm · 2018-10-05T05:01:05Z

generic.py, line 492 PyPDF2.utils.PdfReadError: Illegal character in Name Object
utils.py, line 238 UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)

Bug fix: when a Name Object contains GBK-encoded bytes, reading it raises PyPDF2.utils.PdfReadError: Illegal character in Name Object.
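The decoding failure can be reproduced without a PDF file: a name written as GBK bytes is usually not valid UTF-8, so a strict UTF-8 decode fails. The following is a minimal sketch of a decode-with-fallback approach, not the actual change made to generic.py. The helper name `decode_name`, the order of encodings, and the `ValueError` are assumptions made for illustration.

```python
def decode_name(raw: bytes, encodings=("utf-8", "gbk")) -> str:
    """Hypothetical helper (not a PyPDF2 API): try each encoding in turn."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    raise ValueError("Illegal character in Name Object")


# Name bytes as a GBK-producing tool might write them (assumed example data).
gbk_name = "/中文".encode("gbk")
utf8_name = "/中文".encode("utf-8")

print(decode_name(gbk_name))
print(decode_name(utf8_name))
```

Trying UTF-8 first matters in this sketch: GBK accepts many byte sequences, so using it as the first choice could silently mis-decode names that were written in another encoding.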

Bug fix: Chinese characters cannot be encoded with 'latin-1', so encoding text that contains them raises UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256).
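The UnicodeEncodeError is easy to reproduce: latin-1 covers only code points 0 to 255, so any Chinese character is out of range. Below is a minimal sketch of an encode-with-fallback, assuming UTF-8 is an acceptable fallback. The name `to_bytes` is hypothetical, and this is not the actual change made to utils.py.

```python
def to_bytes(text: str) -> bytes:
    """Hypothetical helper: encode as latin-1 when possible, else UTF-8."""
    try:
        return text.encode("latin-1")
    except UnicodeEncodeError:
        # Characters above U+00FF (such as Chinese) cannot be latin-1 encoded.
        return text.encode("utf-8")


print(to_bytes("abc"))
print(to_bytes("中文"))
```

One consequence of this design is that the output encoding depends on the input, so a reader of those bytes has to be prepared for both latin-1 and UTF-8.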

Update utils.py

muideen

I just ran into this same issue and was about to submit a change request to fix it. I think this bug fix should be approved to save other developers headaches in the future.

ysinsane · 2020-03-20T03:26:41Z

I was wondering why all those pull requests were ignored for so long...

MartinThoma · 2022-04-09T20:01:47Z

Thank you for your contribution!

It might take until the end of the month to make the next release. This will be on PyPI by 01.05.2022 at the latest :-)

- PKG: Make Tests not a subpackage (#728)
- BUG: Fix ASCII85Decode.decode assertion (#729)
- BUG: Error in Chinese character encoding (#463)
- BUG: Code duplication in Scripts/2-up.py
- ROBUST: Guard 'obj.writeToStream' with 'if obj is not None'
- ROBUST: Ignore a /Prev entry with the value 0 in the trailer
- MAINT: Remove Sample_Code (#726)
- TST: Close file handle in test_writer (#722)
- TST: Fix test_get_images (#730)
- DEV: Make tox use pytest and add more Python versions (#721)
- DOC: Many (#720, #723-725, #469)

pubpub-zz · 2024-04-20T07:18:09Z

@gitzjm / @ysinsane
Sorry to bother you about this very old PR. To strengthen the regression tests, I'm looking for a PDF file where GBK encoding is required. Can you help me with this?

gitzjm added 3 commits June 23, 2018 14:42

Update generic.py

f923840

Bug fix: when a Name Object contains GBK-encoded bytes, reading it raises PyPDF2.utils.PdfReadError: Illegal character in Name Object.

Update utils.py

d46e5fa

Bug fix: Chinese characters cannot be encoded with 'latin-1', so encoding text that contains them raises UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256).

Merge pull request #1 from gitzjm/gitzjm-patch-2

69866e1

Update utils.py

muideen approved these changes Dec 6, 2019


MartinThoma added the is-bug (from a user's perspective, this is a bug: a violation of the expected behavior with a compliant PDF) and Tiny (pull requests that make a tiny change and thus should be easy to merge) labels Apr 6, 2022

Merge branch 'master' into gitzjm-patch-1

b5bd652

MartinThoma merged commit b76ffcd into py-pdf:master Apr 9, 2022
