Handling unallowed unicode value #280

Closed
inirudebwoy opened this Issue Nov 11, 2016 · 3 comments

@inirudebwoy
Contributor

inirudebwoy commented Nov 11, 2016

Used Cerberus version: 1.0.1

I'm using Cerberus to validate data sent by users, and the data can be unicode. One of the validated fields must match one of a fixed list of allowed values. The value itself can be a unicode string, but the allowed list does not contain any unicode elements.

This is a simple illustration of the problem.

from cerberus import Validator
schema = {'name': {'type': 'string', 'allowed': ['Joe']}}
v = Validator(schema)

In [1]: v.validate({'name': 'Jo'})
Out[1]: False

In [2]: v.errors
Out[2]: {'name': ['unallowed value Jo']}

In [3]: v.validate({'name': u'Michał'})
Out[3]: False

In [4]: v.errors
---------------------------------------------------------------------------
/home/majki/.virtualenvs/cerberus-playground/lib/python2.7/site-packages/cerberus/errors.pyc in format_message(self, field, error)
    462         return self.messages[error.code]\
    463             .format(*error.info, constraint=error.constraint,
--> 464                     field=field, value=error.value)
    465
    466     def insert_error(self, path, node):

UnicodeEncodeError: 'ascii' codec can't encode character u'\u0142' in position 5: ordinal not in range(128)

As you can see, an exception occurs when I try to display the error message.
At first I was encoding the data before validation, which worked, but then it would have to be decoded back before rendering a response, so that seemed a bit off.
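
A minimal sketch of the two behaviours described above, using plain Python and no Cerberus. It assumes the failure comes from the value being implicitly encoded to ASCII when the error message is formatted on Python 2; here that encode step is written out explicitly so the snippet behaves the same on Python 2 and 3. The variable names are hypothetical.

```python
# The value from the report, written with an escape for the non-ASCII letter.
name = u'Micha\u0142'

# Encoding to ASCII fails on the sixth character (index 5), which is the
# position reported in the traceback.
try:
    name.encode('ascii')
except UnicodeEncodeError as exc:
    failed_at = exc.start        # index of the offending character
    offending = name[exc.start]  # the character that could not be encoded

# Encoding to UTF-8 before validation avoids the error, but the bytes
# must be decoded again before they can be rendered in a response.
encoded = name.encode('utf-8')
decoded = encoded.decode('utf-8')
```

The round trip is lossless, but every validated field would need the same encode and decode steps, which is the overhead mentioned above.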

Since my allowed list does not contain any unicode strings, as a workaround I have created a validator that rejects non-ASCII strings with a proper message and does a manual check for allowed values.
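
A rough sketch of what such a workaround could look like, written as a plain Python function rather than as a Cerberus validator. The names `ALLOWED_NAMES`, `is_ascii` and `name_errors`, and the wording of the ASCII message, are assumptions for illustration and are not taken from the actual code.

```python
# Hypothetical stand-in for the 'allowed' list from the schema.
ALLOWED_NAMES = [u'Joe']


def is_ascii(text):
    """Return True if text can be encoded as ASCII."""
    try:
        text.encode('ascii')
    except UnicodeError:
        return False
    return True


def name_errors(value, allowed=ALLOWED_NAMES):
    """Return a list of error messages for value; empty if it is allowed."""
    if not is_ascii(value):
        # Reject early with a message that does not embed the value, so
        # nothing non-ASCII has to be formatted into the message.
        return [u'value must contain only ASCII characters']
    if value not in allowed:
        return [u'unallowed value {0}'.format(value)]
    return []
```

Checking for ASCII first means the allowed-values message is only ever built from ASCII input, which sidesteps the encoding error at the cost of a less specific message for non-ASCII names.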

Is this a bug? Or am I doing something horribly wrong with unicode?

@funkyfuture

Member

funkyfuture commented Nov 11, 2016

though this could probably be fixed, i suggest you use Python 3.

@nicolaiarocci

Member

nicolaiarocci commented Nov 12, 2016

What @funkyfuture said. However it would be nice if Cerberus would run smoothly over this, even on Py2. So @inirudebwoy, feel free to submit a patch 😄

nicolaiarocci added this to the Unreleased milestone Nov 12, 2016

@inirudebwoy

Contributor

inirudebwoy commented Nov 14, 2016

Cheers guys.
Since I unfortunately can't move to Python 3, I'll have to submit a patch.
