Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix int_validator #1692

Merged
merged 12 commits into from
May 2, 2014
34 changes: 27 additions & 7 deletions ckan/logic/validators.py
Original file line number Diff line number Diff line change
Expand Up @@ -62,14 +62,34 @@ def package_id_not_changed(value, context):
return value

def int_validator(value, context):
if isinstance(value, int):
return value
"""
Return an integer for value, which may be a string in base 10 or
a numeric type (e.g. int, long, float, Decimal, Fraction). Return
None for None or empty/all-whitespace string values.

:raises: ckan.lib.navl.dictization_functions.Invalid for other
inputs or non-whole values
"""
if value is None:
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested it, but is this a new behavior? Just by looking at the diff, it seems that we used to raise Invalid if called with None.

Just making sure it was intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is new behaviour.

The current implementation is not idempotent. If you call it once passing in an empty string it will convert that to None, but calling a second time will raise an Invalid exception. The same is true if you pass a long value as a string: it fails on the second call. This PR fixes those cases and some others.

I believe that all validators should be idempotent: it shouldn't matter if you call them once or twenty times in a row, you should always get the same result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made me realize that I had the wrong definition of idempotent. I thought it was simply that a function doesn't have side effects, and if I call it n times with the same parameters, it'll give me the same result. Basically, f(x) == f(x) == f(x), but checking Wikipedia tells me that it's actually f(f(f(x))) == f(x). Thanks!

if hasattr(value, 'strip') and not value.strip():
return None

try:
if value.strip() == '':
return None
return int(value)
except (AttributeError, ValueError), e:
raise Invalid(_('Invalid integer'))
whole, part = divmod(value, 1)
except TypeError:
try:
return int(value)
except ValueError:
pass
else:
if not part:
try:
return int(whole)
except TypeError:
pass # complex number: fail like int(complex) does

raise Invalid(_('Invalid integer'))

def natural_number_validator(value, context):
value = int_validator(value, context)
Expand Down
53 changes: 53 additions & 0 deletions ckan/new_tests/logic/test_validators.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@
import ckan.new_tests.lib.navl.test_validators as t


assert_equals = nose.tools.assert_equals


def returns_arg(function):
'''A decorator that tests that the decorated function returns the argument
that it is called with, unmodified.
Expand Down Expand Up @@ -444,5 +447,55 @@ def call_validator(*args, **kwargs):
*args, **kwargs)
call_validator(key, data, errors, context={'model': mock_model})

def test_int_validator_idempotent(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name wasn't clear to me at first sight. I guess something more explicit, like test_int_validator_doesnt_change_already_converted_values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but.. that's what idempotent means! idempotent is a good word!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a native English speaker, so I might have misunderstood. Idempotent for me means something that doesn't have side effects, which wasn't what you were testing. The new names you wrote are much better in my opinion 👍

import ckan.logic.validators as validators
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we use this pattern of importing what the tests need in the test itself everywhere in this file, but I haven't seen it used in other places. Do you know if it's something we should start doing? If not, I would prefer to move those imports to the beginning of the file, following the relative coding standards


unchanged_values = [
42,
0,
3948756923874659827346598,
None,
]
for v in unchanged_values:
returns_arg(validators.int_validator)(v)

def test_int_validator_convert(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be more explicit as well. I'm out of ideas, but our naming guidelines might help.

To be honest, even though I understand that it'll lead to quite a bit of code duplication, I would do something along the lines of:

class TestIntValidator(object):
    # ...

    def test_it_converts_fractions(self):
        value, expected_result = (Fraction(2, 1), 2)
        assert_equals(validators.int_validator(value, None), expected_result)

    def test_it_converts_decimals(self):
        value, expected_result = (Decimal("19.00"), 19)
        assert_equals(validators.int_validator(value, None), expected_result)

    # ...

This follows the pattern I'm proposing at #1672. I'm not religious about it, though. I'm happy if you prefer to do this way, just make the test name more explicit please.

import ckan.logic.validators as validators
from fractions import Fraction
from decimal import Decimal

converted_values = [
(42.0, 42),
(Fraction(2, 1), 2),
(Decimal("19.00"), 19),
("528735648764587235684376", 528735648764587235684376),
("", None),
(" \n", None),
]
for arg, result in converted_values:
assert_equals(validators.int_validator(arg, None), result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that every validator needs to accept a context, even though it doesn't need it. But maybe you could set int_validator to have a default context value? I mean, def int_validator(value, context={}), so you don't need to keep calling it with None all over the place.

If you don't agree with it, please change the calls to pass an empty dictionary. Even though we don't use the context now, it's still better to pass a "mock" that belongs to the class that the method expects IMHO.


def test_int_validator_invalid(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change it to something like def test_int_validator_raises_Invalid_if_called_with_invalid_values.

import ckan.logic.validators as validators
from fractions import Fraction
from decimal import Decimal
import warnings

invalid_values = [
42.5,
"42.5",
"1e6",
"text",
Fraction(3, 2),
Decimal("19.99"),
1 + 1j,
1 + 0j, # int(complex) fails, so expect the same
]
with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=DeprecationWarning)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I guess this deprecation warning isn't too bad, I still feel a bit nervous ignoring it.

Using divmod() on complex numbers was removed in Python 3. I know we won't be moving to Python 3 any time soon, but maybe it's simple enough to test if the number is complex and don't run divmod() in this case?

If you can't find an easy way to do so, I guess I'd rather see the deprecation message than ignore it (although I can be convinced otherwise).

for v in invalid_values:
raises_Invalid(validators.int_validator)(v, None)


#TODO: Need to test when you are not providing owner_org and the validator
# queries for the dataset with package_show