Value 'None' is not of type 'str' #77

nowox · 2016-10-24T09:57:40Z

With this YAML:


---
foo: # This field is empty and despite YAML parser sees it as a None value, I am expecting an empty string

And this schema:

type: map
mapping: 
  foo:
    type: str

I get the error "- Value 'None' is not of type 'str'". However, a string can be empty.

A key allownone should be supported for this particular case.

The text was updated successfully, but these errors were encountered:

Grokzen · 2016-10-24T10:40:04Z

@nowox This is not an issue becuase yaml handles that data field properly because if i do the following in python

>>> yaml.load("foo: # foobar")
{'foo': None}

i get back a None value and that is correct one as i see it.

The only way you can solve this is to explicitly cast the value in the data part to a string by doing the following:

>>> yaml.load("foo: "" # foobar")
{'foo': ""}

This is only considered a problem if yaml would interpret the yaml file above a string but it would still output the error then that would be a problem. If yaml do not handle this case where is expects to be an empty string when you define your data that way then you need to take it up with the YAML specification directly or pyyaml/ruamel yaml if they should change their specs (that i do not think they will do)

About the allownone way of doing it, i do not currently see that as a solution to the problem because then you can very easily argue that you should do that for all possible types that is supported becuase there will always be someone who has that special need and then we have a problem with argument creep that will never stop.

The other solution that exists is to basically rewrite the entire spec and implementation to support multiple types for keys by doing the same solution as we already have in sequences (read up on the last part of this section https://github.com/Grokzen/pykwalify/blob/unstable/docs/validation-rules.rst#sequence) where you could define something like:

type: map
mapping: 
  foo:
    - type: none
    - type: str

The third solution is to implement the default rule that was mentioned in #62 and that could help to solve some cases where you want to support some sane default and where you can implement to support None values.

Currently i am leaning towards implementing nr 2 and nr 3 but not the first one.

nowox · 2016-10-24T18:35:55Z

Thanks for your complete answer. Unfortunately I realized that most of my issues with PyKwalify are in fact due to limitations of YAML itself. The None issue is recurrent but other ambiguities such as {foo: 3.14e-3} which is not interpreted as number but as a string cannot be used with range. PyKwalify offers no mechanisms to cast or transform a value prior the validation. I've tried to dig into the code, but it is a bit harder that I initially thought. PyKwalify is generating a lot of debug info, but not the useful one and error messages are not all the time useful because exceptions are not raised where and when they should but later in the process. I feel it is quite difficult to trace the errors back.

So I had to make a decision and I switched for voluptuous. The philosophy is very different and I like the way I can clearly separate the YAML parsing itself and the validation. The core idea of PyKwalify which is to write the validation schema in YAML is pretty sexy, but at the end, It feels more convenient to define the schema directly in Python because in most cases I would need to write extensions in pure Python anyway.

That said I think Voluptuous and Pyparsing could work well together. I also looked at yamlious which looks pretty similar to PyKwalify.

As noticed before in #29 I still think YAML really deserve an official validation language that is part of the YAML standard and I will go for that one once it is announced. Last but not least, design software such as Entreprise Architect can easily export XSD shemas. Voluptuous, PyKwalify or even http://www.schematype.org/ would be very useful once they can be generated from XSD or directly from such software.

Grokzen · 2016-10-25T00:26:44Z

@nowox I got no problems with that :)

One thing i can add about some intentions with pykwalify that i might not be that open with or have stated somewhere more in details is that the original idea of pykwalify was to make a port of kwalify lib and the kwalify specification that was done a while back. I did not intent for this lib to grow anything beyond the original kwalify spec (even if it has in some areas) and my plan was to make it so that people could just drop this one into the same place where they used the original kwalify lib and still have some way to validate their kwalify schemas they already had.

I never intended to implement new things into the spec or to solve that many new problems based on that statement. If pykwalify is missing some element to what it can do, then it is probably intentional and probably will be that way.

Some comments on the features you mentioned. You are correct that there is no explicit transformation or some kind of tests for different value types but that is intentional in some ways becuase it was not part of the kwalify spec and inside the validation schema it serves no real purpose and it falls back to how YAML deals with the values and what is casted where. To help with this you need to do some preprocessing of your data before you send it into validation if you have some explicit casting needs.
However you can still say that pykwalify still should support this becuase the above mentioned value should be declarabe as a type: number and pykwalify should be able to deal with that value if it thinks it is a valid number type. If it passes the type validation and you have defined a range attribute that you want to validate, then the implementation should try all the known supported types like int, float, string and handle all cases transparently and only fail if the number can't be casted in all cases back to a float and validate your range attributes. This case might be a bug if you think like that.

On the code part i should say that because it is basically a port of the original code with the main logic of the classes and how things is validated, some of the problems and debugging output is a legacy from the kwalify implementation because of how they did it and not so much about how i converted it. I do think it is really hard to trac the debug output myself and i only use it when doing deep debugging of problems.

Exceptions is raised in the end of the code and not where they happen because you want to collect all validation errors from the entire schema and not when it happens becuase then you would only show 1 error and not all errors at the same time.

You mention that defining the validation directly in python, that is something that is possible in pykwalify already with the added feature extendsions https://github.com/Grokzen/pykwalify/blob/unstable/docs/extensions.rst#extensions

The main idea with extensions was to give the user some easy way to make custom validation code themself to help patch some validation that was not good enough in the core validation logic. In you case with the number it might have been a way to solve the multiple types of validation as you could just take the value of the key and it would get passed into a custom validation method where you could do some extra work on values and validate them yourself. It would basically server as a buffer for me to not have to rush to implement new stuff that would be requested but still not stop the users from solving their problems while i was implementing or fixing the thing they have done in the custom validation method.

Grokzen · 2016-11-05T15:27:20Z

@nowox I will close this issue becuase there is no real reason for keeping this open becuase there is nothing i can really do with this issue becuase pykwalify will never add the ability to alter how pyyaml or ruamelyaml internals work and to patch around the problems that they bring to the table.

The only known solution right now is to either use another type of validation type for the value you want or that you define a custom extension for just this case where it can deal with the problem that None values should be interpreted as something else.

Because of this i see no reason to keep this open becuase there is no action or bug that i am willing to commit to fix so i see. If you find that you have some smaller part that you want changed based on the discussion we had above then please open a new issue.

nowox changed the title ~~Empty strings are not considered as strings~~ Value 'None' is not of type 'str' Oct 24, 2016

Grokzen closed this as completed Nov 5, 2016

jmchilton mentioned this issue Sep 7, 2018

[18.01] Allow schema to validate galaxyproject/galaxy#5919

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Value 'None' is not of type 'str' #77

Value 'None' is not of type 'str' #77

nowox commented Oct 24, 2016

Grokzen commented Oct 24, 2016

nowox commented Oct 24, 2016 •

edited

Loading

Grokzen commented Oct 25, 2016

Grokzen commented Nov 5, 2016

Value 'None' is not of type 'str' #77

Value 'None' is not of type 'str' #77

Comments

nowox commented Oct 24, 2016

Grokzen commented Oct 24, 2016

nowox commented Oct 24, 2016 • edited Loading

Grokzen commented Oct 25, 2016

Grokzen commented Nov 5, 2016

nowox commented Oct 24, 2016 •

edited

Loading