MacRoman Encoding Bug (OHM vs. OMEGA) #42700

Closed
seanbpalmer mannequin opened this issue Dec 16, 2005 · 2 comments

@seanbpalmer
Mannequin

seanbpalmer mannequin commented Dec 16, 2005

BPO 1382096
Nosy @malemburg

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/malemburg'
closed_at = <Date 2005-12-16.14:47:09.000>
created_at = <Date 2005-12-16.02:22:35.000>
labels = ['expert-unicode']
title = 'MacRoman Encoding Bug (OHM vs. OMEGA)'
updated_at = <Date 2005-12-16.14:47:09.000>
user = 'https://bugs.python.org/seanbpalmer'

bugs.python.org fields:

activity = <Date 2005-12-16.14:47:09.000>
actor = 'lemburg'
assignee = 'lemburg'
closed = True
closed_date = None
closer = None
components = ['Unicode']
creation = <Date 2005-12-16.02:22:35.000>
creator = 'seanbpalmer'
dependencies = []
files = []
hgrepos = []
issue_num = 1382096
keywords = []
message_count = 2.0
messages = ['27087', '27088']
nosy_count = 2.0
nosy_names = ['lemburg', 'seanbpalmer']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1382096'
versions = ['Python 2.4']

@seanbpalmer
Mannequin Author

seanbpalmer mannequin commented Dec 16, 2005

The file encodings/mac_roman.py in Python 2.4.1
contains the following incorrect character definition
on line 96:

    0x00bd: 0x2126, # OHM SIGN

This should read:

    0x00bd: 0x03A9, # GREEK CAPITAL LETTER OMEGA

Presumably this bug occurred due to a misreading, given
that OHM and OMEGA share the same glyph. Evidence that
the OMEGA interpretation is correct:

0xBD 0x03A9 # GREEK CAPITAL LETTER OMEGA
-http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ROMAN.TXT
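
The shared glyph has a formal basis in the Unicode character database: OHM SIGN (U+2126) has a canonical decomposition to GREEK CAPITAL LETTER OMEGA (U+03A9). A minimal sketch of checking this with the standard `unicodedata` module follows. It assumes Python 3 syntax (this report targets Python 2.4, where the literals would need a `u` prefix), and the variable names are illustrative only.

```python
import unicodedata

OHM = "\u2126"    # the code point the buggy table maps 0xBD to
OMEGA = "\u03a9"  # the code point Apple's ROMAN.TXT lists for 0xBD

ohm_name = unicodedata.name(OHM)
omega_name = unicodedata.name(OMEGA)

# Canonical decomposition of OHM SIGN, as a hex code point string.
decomp = unicodedata.decomposition(OHM)

# Normalization replaces OHM SIGN with OMEGA.
normalized = unicodedata.normalize("NFC", OHM)

print(ohm_name, "->", decomp, "(" + omega_name + ")")
print(normalized == OMEGA)
```

Normalizing text before encoding would sidestep the error for OHM SIGN input, but it does not fix the table itself, which is what this report is about.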

Further evidence can be found by Googling for MacRoman
tables. Because of this bug, the following code, for
example, raises a UnicodeEncodeError when it should not:

>>> u'\u03a9'.encode('macroman')

For a workaround, I've been using the following code:

>>> import codecs
>>> from encodings import mac_roman
>>> mac_roman.decoding_map[0xBD] = 0x03A9
>>> mac_roman.encoding_map = codecs.make_encoding_map(mac_roman.decoding_map)

And then, to use the example above:

>>> u'\u03a9'.encode('macroman')
'\xbd'
>>> 
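
The workaround has to rebuild `encoding_map` because that map is derived from `decoding_map` by inversion, so patching the decoding side alone leaves encoding unchanged. The sketch below illustrates this with a toy two-entry table (an assumption for brevity, not the real MacRoman table) and uses Python 3 syntax; `buggy_decoding_map` and `fixed_decoding_map` are hypothetical names.

```python
import codecs

# Toy decoding maps (byte value -> code point); NOT the full MacRoman table.
buggy_decoding_map = {0x41: 0x0041, 0xBD: 0x2126}  # 0xBD -> OHM SIGN
fixed_decoding_map = dict(buggy_decoding_map)
fixed_decoding_map[0xBD] = 0x03A9                  # 0xBD -> OMEGA

# make_encoding_map inverts a decoding map (code point -> byte value).
buggy_encoding_map = codecs.make_encoding_map(buggy_decoding_map)
fixed_encoding_map = codecs.make_encoding_map(fixed_decoding_map)

# OMEGA is only encodable once the inverted map has been rebuilt.
omega_in_buggy = 0x03A9 in buggy_encoding_map
omega_byte = fixed_encoding_map[0x03A9]

# charmap_encode accepts such a dict as its mapping argument.
encoded, consumed = codecs.charmap_encode("\u03a9", "strict", fixed_encoding_map)
print(omega_in_buggy, hex(omega_byte), encoded)
```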

Thanks,

--
Sean B. Palmer

@seanbpalmer seanbpalmer mannequin closed this as completed Dec 16, 2005
@seanbpalmer seanbpalmer mannequin assigned malemburg Dec 16, 2005
@seanbpalmer seanbpalmer mannequin added the topic-unicode label Dec 16, 2005
@malemburg
Member

This has been fixed in CVS and Python 2.5 will include the fix.

A backport is not possible, because we've changed the way
charmap codecs work in 2.5.
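
For readers on a current interpreter, the fix can be checked directly. This is a sketch assuming CPython 3, where `encodings.mac_roman` exposes a `decoding_table` string rather than the `decoding_map` dict used in the workaround above; the details may differ from the 2.5 implementation referred to here.

```python
from encodings import mac_roman

# In CPython 3 the codec module holds a 256-character decoding table.
table_entry = mac_roman.decoding_table[0xBD]

encoded = "\u03a9".encode("mac_roman")  # GREEK CAPITAL LETTER OMEGA
decoded = b"\xbd".decode("mac_roman")

# OHM SIGN is not in the table, so encoding it is expected to fail.
try:
    "\u2126".encode("mac_roman")
    ohm_encodable = True
except UnicodeEncodeError:
    ohm_encodable = False

print(hex(ord(table_entry)), encoded, ohm_encodable)
```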

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022