words able to decode but unable to encode in GB18030 #45292

Closed
zaex mannequin opened this issue Aug 9, 2007 · 4 comments

@zaex
Mannequin

zaex mannequin commented Aug 9, 2007

BPO 1770551
Nosy @hyeshik
Files
  • python25_GB18030_cant_encode: The file containing the words able to decode but unable to encode in GB18030
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/hyeshik'
    closed_at = <Date 2007-08-12.15:18:46.000>
    created_at = <Date 2007-08-09.01:34:16.000>
    labels = ['expert-unicode']
    title = 'words able to decode but unable to encode in GB18030'
    updated_at = <Date 2007-08-12.15:18:46.000>
    user = 'https://bugs.python.org/zaex'

    bugs.python.org fields:

    activity = <Date 2007-08-12.15:18:46.000>
    actor = 'hyeshik.chang'
    assignee = 'hyeshik.chang'
    closed = True
    closed_date = None
    closer = None
    components = ['Unicode']
    creation = <Date 2007-08-09.01:34:16.000>
    creator = 'zaex'
    dependencies = []
    files = ['2431']
    hgrepos = []
    issue_num = 1770551
    keywords = []
    message_count = 4.0
    messages = ['32611', '32612', '32613', '32614']
    nosy_count = 3.0
    nosy_names = ['nnorwitz', 'hyeshik.chang', 'zaex']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1770551'
    versions = ['Python 2.5']

    @zaex
    Mannequin Author

    zaex mannequin commented Aug 9, 2007

    Here is a list of Chinese characters that can be read (decoded) from a GB18030-encoded file but cannot be encoded back to GB18030.

    Details:
    I used codecs.open(r'file name', encoding='GB18030') to read the characters from a file, then tried to encode them one by one back into GB18030 with word.encode('GB18030'). This raised an exception with the message 'illegal multibyte sequence'.

    The attachment contains the same list.
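
The round trip described above can be sketched roughly as follows. This is a minimal illustration written for Python 3, not the reporter's original Python 2.5 script; the helper names `find_unencodable` and `check_file` are hypothetical, and the sample string is just a few characters taken from the list in this report. On interpreters that include the fix, the returned list is expected to be empty.

```python
import codecs
import os
import tempfile


def find_unencodable(text, encoding="gb18030"):
    """Return the characters in `text` that fail to encode to `encoding`."""
    failures = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # On the affected build this is where the
            # 'illegal multibyte sequence' error surfaced.
            failures.append(ch)
    return failures


def check_file(path, encoding="gb18030"):
    """Decode `path` with `encoding`, then try to re-encode each character."""
    with codecs.open(path, encoding=encoding) as f:
        return find_unencodable(f.read(), encoding)


if __name__ == "__main__":
    # Hypothetical sample data: a few characters from the reported list.
    sample = "䶮䦟䲠"
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(sample.encode("gb18030"))
        print(check_file(path))
    finally:
        os.remove(path)
```

Checking one character at a time, as in the report, pinpoints exactly which code points the encoder rejects instead of stopping at the first failure.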

    list:
    ��䅟䌷䦟䦷䲠��㘚�㱮䴔䴖䴗䦆㧟䙡䙌䴕��䴙䥽��䓖䲡䥇䦂䦅䴓㩳�㳠䲢䴘�䜣䥺䶮䜩䥺䲟䲣䦛䦶㑳㑇㥮㤘�䦃

    @zaex zaex mannequin closed this as completed Aug 9, 2007
    @zaex zaex mannequin assigned hyeshik Aug 9, 2007
    @zaex zaex mannequin added the topic-unicode label Aug 9, 2007
    @zaex
    Mannequin Author

    zaex mannequin commented Aug 9, 2007

    The Python version is 2.5; my OS is Windows XP Professional SP2, Version 2002.

    @nnorwitz
    Mannequin

    nnorwitz mannequin commented Aug 10, 2007

    This seems like a CJK codec problem. Hye-Shik, could you take a look?

    @hyeshik
    Contributor

    hyeshik commented Aug 12, 2007

    The problem was fixed about a week ago (r56727-8).
    The fix will be included in the forthcoming Python releases. Thank you for reporting!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022