StrictBool #579

Open
cazgp opened this issue Jun 5, 2019 · 16 comments

4 participants
@cazgp
Contributor

commented Jun 5, 2019

Thank you for bringing types into python -- it makes daily life so much easier :)

However, I think there's a pretty serious bug with the bool validator.

Minimal working example:

from pydantic import BaseModel

class Model(BaseModel):
    a: bool

Model(a=1) # should cause exception
Model(a="snap") # should cause exception

a should be a bool and only a bool. If a user provides any other value to the endpoint (using fastapi), the database throws because it's the wrong type.

I cannot think of a single type that can be passed to bool(X) that won't return a boolean, and yet bool_validator returns that coercion. e.g. bool({"moo": 1}) == True. This makes the validator completely useless :(
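To illustrate the point with plain Python (no pydantic involved), here is a small sketch showing that `bool()` returns a boolean for every kind of input, so a validator that ends in that coercion can never reject anything. The sample values are arbitrary choices for illustration.

```python
# Every one of these calls returns a bool, so coercing with bool() never fails.
samples = [1, "snap", {"moo": 1}, [], None, object()]
results = [bool(value) for value in samples]

print(results)  # [True, True, True, False, False, True]
```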

Is there a reason behind this? And if it needs to stay as is, do you think adding a strict mode to the validator is an option?

Also, as an aside, there are no unit tests for the function.

I'm very happy to make a PR (adding some unit tests as well!) because it shouldn't take long, but I need to understand if there's a reason for this first.

Thanks!

@samuelcolvin

Owner

commented Jun 5, 2019

Yes there's a reason for this.

Please read #578, #284 and #360. You are not the first person to ask about this - pydantic is a parsing library and not a validation library.

There definitely are tests for this, hence the 100% coverage, here are 13 at least.

If you do not want this behaviour:

  • you can implement a validator with pre=True that only accepts a set list of things
  • you can implement your own StrictBool type.
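As a rough sketch of the check either option might perform, the function below accepts only real `bool` instances. It is plain Python: the name `strict_bool` and the choice of `TypeError` are assumptions made for illustration, and wiring it into a pydantic validator or custom type is deliberately not shown.

```python
def strict_bool(value):
    """Return value unchanged if it is a real bool, otherwise raise.

    Hypothetical helper for illustration; not part of pydantic.
    """
    # isinstance(1, bool) is False, so ints are rejected here even though
    # bool itself is a subclass of int.
    if isinstance(value, bool):
        return value
    raise TypeError(f"value is not a valid boolean: {value!r}")


print(strict_bool(True))  # True
```

With this check, `strict_bool(1)` and `strict_bool("snap")` both raise instead of being coerced.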

Regarding your examples, since bool in Python is a subtype of int, I think most people would expect 1 (and therefore probably '1') to be valid as a boolean value.

Other strings are trickier. In Python bool('any string with length > 0') == True, but that's awkward since bool('false') == True, so it's not trivial to decide which values are truthy; e.g. DRF and YAML accept values like yes.

I'd accept a PR to add support for StrictBool.

@samuelcolvin samuelcolvin changed the title Type coercion in bool validator is far too liberal StrictBool Jun 5, 2019

@samuelcolvin

Owner

commented Jun 5, 2019

There's also the case of non-strings: most people would (I imagine) expect any positive int to be coerced to True, while "special" values like -1 are trickier.

What about dictionaries and lists? Same behaviour as python? Or maybe same as javascript if we've parsed JSON - it's not that simple.

What about arbitrary classes? I imagine most people would expect most of them to be truthy like in Python, but that can get tricky.

@cazgp

Contributor Author

commented Jun 5, 2019

Thanks for responding so quickly!

Thanks for pointing out the tests. I grepped an awful lot and couldn't find anything, and then was in test_validators for a very long time.

StrictBool sounds like it could be useful, however I still think that the coercion at the end is not expected behaviour. I cannot imagine that someone would expect {"test": 1} to be parsed as a bool. It really seems like that's user error which should be caught.

So, in your examples, I would say throw an error for all of them! Unless the value is a bool, or it matches a particular set of strings, I say reject it.

@cazgp

Contributor Author

commented Jun 5, 2019

Oh and 1 and 0 should probably be exceptions too

@samuelcolvin

Owner

commented Jun 5, 2019

I can kind of understand that {"test": "john doe"} being parsed to {'test': False} would be confusing for a lot of people.

However this is a breaking change, and people don't like breaking changes.

So what do other people think?

In particular:

  • which strings should be true/false?
  • what should we do with numbers other than 0/1?
  • what should we do with lists and dicts, what about if they're empty? true/false/error?
  • what should we do with other classes? true/false/empty? Should we look for the __bool__ method?

@cazgp cazgp referenced this issue Jun 5, 2019

Merged

StrictBool #580

4 of 4 tasks complete
@jasonkuhrt

Contributor

commented Jun 11, 2019

which strings should be true/false?

We could parse based on Yaml's boolean syntax spec;

Regexp:
 y|Y|yes|Yes|YES|n|N|no|No|NO
|true|True|TRUE|false|False|FALSE
|on|On|ON|off|Off|OFF

what should we do with numbers other than 0/1?

Error. Or, if going with the above, accept no integers whatsoever.

what should we do with lists and dicts, what about if they're empty? true/false/error?

Error

what should we do with other classes?

Error

@jasonkuhrt jasonkuhrt referenced this issue Jun 11, 2019

Open

Version 1 Release #576

0 of 5 tasks complete
@jasonkuhrt

Contributor

commented Jun 11, 2019

Oh and 1 and 0 should probably be exceptions too

Not 100% sure about this, but I get it. I think for a StrictBool type it is less acceptable than a bool type.

@samuelcolvin

Owner

commented Jun 11, 2019

Definitely 0 and 1 should be allowed.

And also the strings of those values.

@samuelcolvin samuelcolvin added this to the Version 1 milestone Jun 11, 2019

@jasonkuhrt

Contributor

commented Jun 11, 2019

Sure, and yeah the string variants too then

@dmontagu

Contributor

commented Jun 11, 2019

My two cents regarding how I'd personally like to see these classes implemented:


  • For StrictBool:
    • bools pass validation (duh)
    • The only strings that don't raise validation errors are the yaml strings and "0" and "1"
    • The only ints that don't raise validation errors are 0 and 1
    • Everything else is a validation error -- if you use the StrictBool type, then you take responsibility for explicitly casting to bool.
      • I think checking for __bool__ would be okay, but if used, should probably require it to be explicitly implemented by the user (rather than inherited). But I don't know if there is a clean way to accomplish this, and would probably rather not have to worry about it.

  • For regular bool:
    • Follow standard python bool behavior except parse "0" and the yaml false strings as False.
    • (If this differs from the current implementation, e.g., other special cases are handled, I'm fine keeping it as is.)
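For the regular bool proposal, a minimal sketch might look like the following. It assumes the "yaml false strings" are the false-like alternatives from the regexp quoted earlier in the thread; the names `YAML_FALSE_STRINGS` and `lenient_bool` are hypothetical and this is not pydantic's actual implementation.

```python
# False-like spellings taken from the YAML regexp quoted earlier in the thread.
YAML_FALSE_STRINGS = {
    "n", "N", "no", "No", "NO",
    "false", "False", "FALSE",
    "off", "Off", "OFF",
}


def lenient_bool(value):
    """Standard Python truthiness, except "0" and the YAML false strings."""
    # Only strings get special treatment; everything else falls through
    # to Python's built-in bool().
    if isinstance(value, str) and (value == "0" or value in YAML_FALSE_STRINGS):
        return False
    return bool(value)


print(lenient_bool("false"))     # False
print(lenient_bool("john doe"))  # True
```

Note that under this rule a dict such as {"test": 1} still parses as True, which is exactly the behaviour being debated here.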

I can kind of understand that {"test": "john doe"} being parsed to {'test': False} would be confusing for a lot of people.

Yeah, I would definitely vote that this case be treated as a validation error before it gets parsed as False. True would be okay if parsed as a regular bool.

@cazgp

Contributor Author

commented Jun 12, 2019

Follow standard python bool behavior except parse "0" and the yaml false strings as False.

Standard python behaviour is to parse literally anything as True, so I really don't think that's sensible.

class Test:
    pass

def test():
    pass

bool(Test())       # True
bool({"test": 1})  # True
bool(test)         # True

It all but renders this type as useless IMHO.

@samuelcolvin

Owner

commented Jun 12, 2019

It all but renders this type as useless IMHO.

Happy to have a discussion and hear opposing views, but please let's not resort to hysterics. 99.9% of pydantic usage is in parsing data from "dumb" formats, eg. JSON, YAML, form data, URL arguments where classes and functions can't be defined, so parsing the above as True would manifestly NOT render the type "useless".


In answer to the question, I think @jasonkuhrt's answer plus 0, 1, '0', '1' is probably the best option.
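A rough sketch of that rule in plain Python could look like this: bools pass through, the YAML strings plus '0' and '1' are mapped, the ints 0 and 1 are accepted, and everything else is an error. The names `TRUE_STRINGS`, `FALSE_STRINGS` and `parse_bool`, and the use of `ValueError`, are assumptions for illustration rather than the eventual pydantic code.

```python
# String spellings from the YAML regexp quoted earlier, plus "1" and "0".
TRUE_STRINGS = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE",
                "on", "On", "ON", "1"}
FALSE_STRINGS = {"n", "N", "no", "No", "NO", "false", "False", "FALSE",
                 "off", "Off", "OFF", "0"}


def parse_bool(value):
    """Accept bools, the ints 0 and 1, and a fixed set of strings; else raise."""
    if isinstance(value, bool):
        return value
    # Checking the type first keeps floats such as 1.0 out and avoids
    # hashing lists or dicts against the string sets below.
    if isinstance(value, int) and value in (0, 1):
        return value == 1
    if isinstance(value, str):
        if value in TRUE_STRINGS:
            return True
        if value in FALSE_STRINGS:
            return False
    raise ValueError(f"value could not be parsed to a boolean: {value!r}")


print(parse_bool("Yes"))  # True
print(parse_bool(0))      # False
```

Under this sketch `parse_bool("john doe")`, `parse_bool(2)` and `parse_bool({"test": 1})` all raise rather than being coerced.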

Have a look here for how DRF does this.

@cazgp

Contributor Author

commented Jun 12, 2019

hysterics

That wasn't a wildly exaggerated and emotional reaction. I believe that a bool parser which permits nearly anything to be coerced into a boolean (particularly one named after a portmanteau of "pedantic") is not that useful. I think the behaviour is unexpected and leads to problems for people relying on the library to act as a type guard between their python functions. In my case, a non-boolean value was hitting the database, causing it to throw. Luckily it was caught in development.

Unfortunately github is unhappy with the DRF link.

Happy to go with @jasonkuhrt + 0, 1 as it seems straightforward and expected.

@samuelcolvin

Owner

commented Jun 12, 2019

Weird, the link works fine for me. It was line 636 and below in /rest_framework/fields.py.

@cazgp

Contributor Author

commented Jun 12, 2019

Sweet, that works now!

Yeah that looks sound.

I might not get a chance to do it this week, but happy to have a pop next week :)

@dmontagu

Contributor

commented Jun 12, 2019

It all but renders this type as useless IMHO.

My reasoning here was that if StrictBool were an alternative option, then I would expect the non-strict version to just follow python's built-in behavior (seems like the least surprising option, and is consistent with how str vs StrictStr works). If StrictBool were an option, and I cared to be strict, I would just use that.

Validation isn't pydantic's only capability -- I would still want to annotate fields as bool for type hinting / schema building, etc., even if I didn't care much about how it was validated (more complex validation logic may incur unwanted overhead; if that were the case I'd be fine with minimal validation in most cases). And I may care more about validating other fields, while still wanting type hints, etc. So I do think this implementation of this type could be useful.

But this is a weakly held opinion, and I'm okay with a stricter implementation.

@dmontagu dmontagu referenced this issue Jun 23, 2019

Open

Make bool_validator strict #617

1 of 4 tasks complete