words able to decode but unable to encode in GB18030 #45292

zaex · 2007-08-09T01:34:16Z

BPO	1770551
Nosy	@hyeshik
Files	python25_GB18030_cant_encode: The file containing the words able to decode but unable to encode in GB18030

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/hyeshik'
closed_at = <Date 2007-08-12.15:18:46.000>
created_at = <Date 2007-08-09.01:34:16.000>
labels = ['expert-unicode']
title = 'words able to decode but unable to encode in GB18030'
updated_at = <Date 2007-08-12.15:18:46.000>
user = 'https://bugs.python.org/zaex'

bugs.python.org fields:

activity = <Date 2007-08-12.15:18:46.000>
actor = 'hyeshik.chang'
assignee = 'hyeshik.chang'
closed = True
closed_date = None
closer = None
components = ['Unicode']
creation = <Date 2007-08-09.01:34:16.000>
creator = 'zaex'
dependencies = []
files = ['2431']
hgrepos = []
issue_num = 1770551
keywords = []
message_count = 4.0
messages = ['32611', '32612', '32613', '32614']
nosy_count = 3.0
nosy_names = ['nnorwitz', 'hyeshik.chang', 'zaex']
pr_nums = []
priority = 'normal'
resolution = 'duplicate'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1770551'
versions = ['Python 2.5']

zaex · 2007-08-09T01:34:16Z

Here is a list of Chinese characters that can be read from a file in GB18030 encoding but cannot be encoded back to GB18030.

Details:
I used codecs.open(r'file name', encoding='GB18030') to read the characters from a file, then tried to encode them one by one into GB18030 with word.encode('GB18030'). This raised an exception with the message 'illegal multibyte sequence'.

The attachment contains the same list.
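
The reproduction described above can be sketched roughly as follows. This is an illustration, not the reporter's actual script: the helper name find_unencodable, the temporary sample file, and the sample character (U+415F, picked here as an example from the CJK Extension A range) are all assumptions. It uses io.open rather than codecs.open so that it runs on current interpreters, and it writes the sample file by encoding, which only works on an interpreter where the bug is already fixed.

```python
import io
import os
import tempfile


def find_unencodable(path, encoding="gb18030"):
    """Return the characters in `path` that decode but fail to re-encode.

    Hypothetical helper: reads the whole file as `encoding`, then tries
    to encode each character individually, as the report describes.
    """
    with io.open(path, encoding=encoding) as handle:
        text = handle.read()
    failures = []
    for char in text:
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            # On an affected build the message is 'illegal multibyte sequence'.
            failures.append(char)
    return failures


# Assumed sample input: one CJK Extension A character stored as GB18030.
fd, sample_path = tempfile.mkstemp()
os.close(fd)
with open(sample_path, "wb") as out:
    out.write(u"\u415f".encode("gb18030"))

failures = find_unencodable(sample_path)
os.remove(sample_path)
print(failures)
```

On an interpreter that includes the fix, the list of failures should be empty; on an affected build, the characters from the attachment would be expected to show up in it.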

list:
ä�¬ä�±ä…ŸäŒ·ä¦Ÿä¦·ä² ã§�ã�ã˜šã˜�ã±®ä´”ä´–ä´—ä¦†ã§Ÿä™¡ä™Œä´•ä�–ä�¬ä´™ä¥½ä�¼ä��ä“–ä²¡ä¥‡ä¦‚ä¦…ä´“ã©³ã§�ã³ ä²¢ä´˜ã–�äœ£ä¥ºä¶®äœ©ä¥ºä²Ÿä²£ä¦›ä¦¶ã‘³ã‘‡ã¥®ã¤˜ä��ä¦ƒ

zaex · 2007-08-09T01:37:14Z

The Python version is 2.5, and my OS is Windows XP Professional SP2 (Version 2002).

nnorwitz · 2007-08-10T03:35:28Z

This seems like a cjk problem. Hye-Shik, could you take a look?

hyeshik · 2007-08-12T15:18:46Z

The problem was fixed about a week ago (r56727-8).
It will work correctly in the forthcoming Python releases. Thank you for reporting!

zaex mannequin closed this as completed Aug 9, 2007

zaex mannequin assigned hyeshik Aug 9, 2007

zaex mannequin added the topic-unicode label Aug 9, 2007

ezio-melotti transferred this issue from another repository Apr 10, 2022
