codecs.lookup can raise exceptions other than LookupError #40295

Closed
jpe mannequin opened this issue May 26, 2004 · 8 comments

@jpe
Mannequin

jpe mannequin commented May 26, 2004

BPO 960874
Nosy @mwhudson, @malemburg

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/malemburg'
closed_at = <Date 2004-05-26.19:33:50.000>
created_at = <Date 2004-05-26.14:37:36.000>
labels = ['expert-unicode']
title = 'codecs.lookup can raise exceptions other than LookupError'
updated_at = <Date 2004-05-26.19:33:50.000>
user = 'https://bugs.python.org/jpe'

bugs.python.org fields:

activity = <Date 2004-05-26.19:33:50.000>
actor = 'jpe'
assignee = 'lemburg'
closed = True
closed_date = None
closer = None
components = ['Unicode']
creation = <Date 2004-05-26.14:37:36.000>
creator = 'jpe'
dependencies = []
files = []
hgrepos = []
issue_num = 960874
keywords = []
message_count = 8.0
messages = ['20893', '20894', '20895', '20896', '20897', '20898', '20899', '20900']
nosy_count = 3.0
nosy_names = ['mwh', 'lemburg', 'jpe']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue960874'
versions = []

@jpe
Mannequin Author

jpe mannequin commented May 26, 2004

codecs.lookup raises ValueError when given an empty
string and UnicodeEncodeError when given a unicode
object that can't be converted to a str in the default
encoding. I'd expect it to raise LookupError when
passed any basestring instance.

For example:
Python 2.3.3 (#51, Dec 18 2003, 20:22:39) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> codecs.lookup('')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\python23\lib\encodings\__init__.py", line 84, in search_function
    globals(), locals(), _import_tail)
ValueError: Empty module name
>>> codecs.lookup(u'\uabcd')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\uabcd' in position 0: ordinal not in range(128)
>>>

@jpe jpe mannequin closed this as completed May 26, 2004
@jpe jpe mannequin assigned malemburg May 26, 2004
@jpe jpe mannequin added the topic-unicode label May 26, 2004
@mwhudson

Logged In: YES
user_id=6656

What exactly are you complaining about? I'd expect codecs.lookup
to raise TypeError if called with no arguments or an integer.

I believe it's documented somewhere that encoding names must
be ascii only, but I must admit I don't recall where.

@jpe
Mannequin Author

jpe mannequin commented May 26, 2004

Logged In: YES
user_id=22785

The other exceptions occur when strings or unicode objects
are passed in as an argument. The string that it fails on is
the empty string (''). I can see disallowing non-ascii names,
but '' should raise a LookupError.

My use case is to check whether a user-supplied unicode string is a
valid encoding, so any check that the lookup function does
not do, I will need to do before calling it.
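
The use case above could be sketched roughly as follows. This is only an illustration, written for Python 3 rather than the Python 2.3 used in the report; `is_known_encoding` is a hypothetical helper name, not something in the standard library. It does the two checks that the report shows `codecs.lookup` not turning into `LookupError` (empty name, non-ASCII name) before calling it, and then catches only `LookupError`.

```python
import codecs


def is_known_encoding(name):
    """Return True if codecs.lookup() can find a codec called `name`.

    Hypothetical helper for illustration; `name` is assumed to be a str.
    """
    # An empty name is rejected up front instead of being passed to lookup().
    if not name:
        return False
    # Codec names map to module names, so require plain ASCII.
    try:
        name.encode("ascii")
    except UnicodeEncodeError:
        return False
    # Only the "unknown encoding" case is handled here; anything else
    # (for example MemoryError) is deliberately left to propagate.
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True
```

Other exceptions coming from a codec search function would still propagate to the caller, so this narrows the problem rather than removing it.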

@mwhudson

Logged In: YES
user_id=6656

This much seems to be fixed in CVS, actually :-)

@jpe
Mannequin Author

jpe mannequin commented May 26, 2004

Logged In: YES
user_id=22785

Yes, it does look like lookup('') is fixed in CVS. So the
question is whether lookup() of something that isn't
convertible in the current encoding to a char* should raise a
LookupError. I can live with it not, though if it did, it would
make it a bit easier to determine if an arbitrary unicode string
is a name of a supported encoding.

I'm willing to put together a patch to raise LookupError if
that's what the behavior should be.

@mwhudson

Logged In: YES
user_id=6656

Well, *I* don't think that's a particularly good idea. I don't know if
Marc-André feels differently.

@malemburg
Member

Logged In: YES
user_id=38388

I don't think we should change anything.

First of all, the lookup function interfaces to a codec
search function and these can raise all kinds of errors, so
it is not guaranteed that you will only see LookupErrors
(the same is true for most other Python APIs, e.g. most can
generate MemoryErrors). Possible other errors are
ValueErrors, NameErrors, ImportErrors, etc. etc. depending
on the search function that happens to process your request.

Second, the name you enter as argument usually maps to a
Python module and/or package name, so it *has* to be ASCII.
The fact that you can enter Unicode names for the codec name
is only by virtue of the automagical conversion of Unicode
to strings. Again, this happens in a lot of places in Python
and is not specific to lookup().

Closing this request.
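
The first point can be illustrated with a small sketch, assuming Python 3. The search function `failing_search`, the codec name `brokencodecdemo` and the helper `lookup_outcome` are all made up for this example. An error raised inside a registered search function comes out of `codecs.lookup` unchanged rather than being converted to `LookupError`.

```python
import codecs


def failing_search(name):
    # Hypothetical search function: fails for one made-up codec name.
    if name == "brokencodecdemo":
        raise ValueError("search function failed for %r" % name)
    # Not our codec; returning None lets lookup() continue or give up.
    return None


# Note: this registration stays in effect for the rest of the process.
codecs.register(failing_search)


def lookup_outcome(name):
    """Return the exception class name raised by codecs.lookup, or 'ok'."""
    try:
        codecs.lookup(name)
    except Exception as exc:  # catch-all used only to report the type
        return type(exc).__name__
    return "ok"


print(lookup_outcome("brokencodecdemo"))
print(lookup_outcome("nosuchcodecdemo"))
print(lookup_outcome("utf-8"))
```

The name is kept free of hyphens and uppercase letters because `codecs.lookup` normalizes names before passing them to search functions.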

@jpe
Mannequin Author

jpe mannequin commented May 26, 2004

Logged In: YES
user_id=22785

Okay, that works for me. We might want to update the
documentation, which seems to imply that LookupError will be
raised if the name is invalid -- my mental model was that it
acted more like a dictionary. I was just trying to avoid a
catch all handler to catch expected failures (an encoding
being unavailable is expected because I know I may be feeding
junk to it; but out of memory wouldn't be, though I know it
can happen anywhere).

Thanks for the quick response :).

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022