New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to create schema with optional fields #213

Closed
Gr1N opened this Issue Jun 29, 2018 · 14 comments

Comments

3 participants
@Gr1N
Collaborator

Gr1N commented Jun 29, 2018

Just tried to use schema creation and it's awesome, but I found, I think, bug with optional fields.

Here is my model:

from typing import Optional
from pydantic import BaseModel

class TestModel(BaseModel):
    foo: Optional[int] = None

print(TestModel.schema_json(indent=2))

Output:

Traceback (most recent call last):
  File "test_pydantic_schema.py", line 9, in <module>
    print(TestModel.schema_json(indent=2))
  File "/pydantic/pydantic/main.py", line 288, in schema_json
    return json.dumps(cls.schema(by_alias=by_alias), default=pydantic_encoder, **dumps_kwargs)
  File "/pydantic/pydantic/main.py", line 278, in schema
    s['properties'] = {f.alias: f.schema(by_alias) for f in cls.__fields__.values()}
  File "/pydantic/pydantic/main.py", line 278, in <dictcomp>
    s['properties'] = {f.alias: f.schema(by_alias) for f in cls.__fields__.values()}
  File "/pydantic/pydantic/fields.py", line 146, in schema
    if issubclass(self.type_, Enum):
TypeError: issubclass() arg 1 must be a class

Same issue with Union:

from typing import Union
from pydantic import BaseModel

class TestModel(BaseModel):
    foo: Union[str, int]

print(TestModel.schema_json(indent=2))

I think pydantic should cover cases like this, I don't know what is the best way to solve issue with Optional, but in case of Union oneOf or anyOf rule can be used http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.7.

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jun 29, 2018

yes.

I guess Optional is just equivalent of Union[None, Foobar].

As discussed on #129 I'm not that bothered by json-schema or it's equivalents, but yes, something like that makes sense.

Perhaps we just have "type": ["int", "float"] and an array for type just means "union"?

@Gr1N

This comment has been minimized.

Collaborator

Gr1N commented Jun 30, 2018

For JSON Schemas, "union" means:

{
    "type": "object",
    "properties": {
        "foo": {
            "anyOf": [
                { "type": "int },
                { "type": "float" },
            ]
        }
    }
}
@Gr1N

This comment has been minimized.

Collaborator

Gr1N commented Jun 30, 2018

Also I don't think that we need to use anyOf for Optional, I think better to convert this case to property with default=null.

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jul 1, 2018

I'm still not sure how to display unions, I don't think we should return multiple instances of the entire field definition, eg. if we have

  "Gender": {
    "type": "int",
    "title": "Gender",
    "required": false,
  }

and the value can also be a string, I think

  "Gender": {
    "type": ["int", "str"],
    "title": "Gender",
    "required": false,
  }

or

  "Gender": {
    "type": {"anyOf": ["int", "str"]},
    "title": "Gender",
    "required": false,
  }

is clearer than

  "Gender": {
    "anyOf": [
      {
        "type": "int",
        "title": "Gender",
        "required": false,
      },
      {
        "type": "str",
        "title": "Gender",
        "required": false,
      },
    ]
  }

But that get's more complicated in strange situations like Union[str, FoobarModel]

What do you think?

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jul 2, 2018

ok, after looking at #218 I propose the following way to deal with this:

type becomes a dictionary.

For "simple" fields, eg. just an int or float, type like:

'type': {
  'item_type': 'float',
}

For unions we'd have:

'type': {
  'mode': 'any_of',
  'item_type': ['str', 'float'],
}

For things like List[int] we'd have:

'type': {
  'item_type': 'int',
  'shape': 'list',
}

or List[Union[int, str]] we'd have:

'type': {
  'mode': 'any_of',
  'item_type': ['int', 'str'],
  'shape': 'list',
}

For Dict[str, int] we'd have:

'type': {
  'item_type': 'int',
  'key_type': 'str',
  'shape': 'dict',
}

(we could add key_mode if the key is a union, but that's a very rare case)

The names mode, item_type and shape are not great (suggestions very welcome), but I think this is the best way to go.

Questions:

  • are their any case's I've missed?
  • do we do type: {'item_type': 'int', ...} or just add all these items to the main dict?
  • what do we do when implement tuples #12, I guess something like properties on sub models?
  • what do we do with Union[int, FoobarModel] - I guess we have type as above, then properties. What do we do with Union[FooModel, BarModel] - maybe this is just banned?

@Gr1N @x1ah

@x1ah

This comment has been minimized.

x1ah commented Jul 2, 2018

May missed enum type ?

In my opinion, we should carefully change the field structure. and typing.Union 's field info looks good.

from typing import Union
from pydantic import BaseModel


class Foo(BaseModel):
    ubar = Union[int, str]

print(Foo.__fields__['r'].info)
# {'type': 'typing.Union[int, str]',
#  'required': True,
#  'sub_fields': [<Field r_int: type='int', required=True, validators=['int_validator']>,
#   <Field r_str: type='str', required=True, validators=['not_none_validator', 'str_validator','anystr_strip_whitespace', 'anystr_length_validator']>]}

and similar for List[int, str], we get

{'type': 'typing.List[int]',
 'required': True,
 'sub_fields': [<Field r_int: type='int', ...>]}

by this way, this problem will be solved without changing the original fields struct, just put into sub_fields.

What do you think? @samuelcolvin

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jul 2, 2018

Just copying python's syntax is ok with me, it'll make it slightly harder for people to read programmatically, but it'll be more concise and extendible.

The case of enums is already implemented via choices.

@Gr1N

This comment has been minimized.

Collaborator

Gr1N commented Jul 2, 2018

are their any case's I've missed?

What about Optional[any type]?

And one question, why we want to invent new format? Why not use JSON Schema spec? I understand that spec in some cases can be very explicit, but it covers all needs to my mind. Also it's good to follow specification in cases if we want to generate API specs from pydantic models and give them to another team who works, for example, with different language, and if generated specs follow JSON schema spec we can use them to generate code for any language we want.

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jul 2, 2018

I think your suggestion of default=null for Optional is good.

And one question, why we want to invent new format? Why not use JSON Schema spec?

see #129 (comment), also:

  1. It would be masses of work (perhaps more than the rest of pydantic combined) to fully support JSON schema, which is only draft anyway.
  2. pydantic isn't tied to JSON and I think we would run into lots of conflicts, eg. (AFAIK) json schema doesn't cover lots of types with pydantic supports, eg. decimal, uuid, dict keys have to be strings in json
@Gr1N

This comment has been minimized.

Collaborator

Gr1N commented Jul 2, 2018

Understand. Can you please share how cases Union[int, FoobarModel] and Union[FooModel, BarModel] will be looks in proposed format?

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jul 2, 2018

Ok, I've redesigned it, (maybe we're now back at JSON schema? 😄):

simple:

'type': {
  'type': 'float',
}

Union[float, int]:

'type': {
  'type': 'any_of',
  'types': [
    {
      'type': 'float',
    },
    {
      'type': 'int',
    },
  ],
}

List[int]:

'type': {
  'type': 'list',
  'item_type': {
    'type': 'int',
  },
}

List[Union[int, str]]:

'type': {
  'type': 'list',
  'item_type': {
    'mode': 'any_of',
    'types': [
      {
      'type': 'float',
      },
      {
      'type': 'int',
      },
    ],
  }
}

Dict[str, int]:

'type': {
  'type': 'dict',
  'item_type': {
    'type': 'int',
  },
  'key_type': {
    'type': 'str',
  },
  'shape': 'dict',
}

FooModel:

'type': {
  'type': 'object',
  'properties': {
    ...
  }
}

Enum(int):

'type': {
  'type': 'int',
  'choices': [
      [1, 'Male'],
      [2, 'Female'],
      [3, 'Other'],
  ]
}

Then Union[int, FoobarModel] is just:

'type': {
  'mode': 'any_of',
  'types': [
    {
      'type': 'int',
    },
      'type': 'object',
      'properties': {
        ...
      }
    }
  ],
}

and Union[FooModel, BarModel] is:

'type': {
  'type': 'any_of',
  'types': [
    {
      'type': 'object',
      'properties': {
        ...
      }
    },
      'type': 'object',
      'properties': {
        ...
      }
    }
  ],
}

In other words types are recursive; any_of, list and dict are just types of type.

For a whole field, we then get:

'age': {
  'type': {
    'type': 'float',
  },
  'name': 'Age',
  'required': True,
}

What do you think?


Update: correct object.
Update 2: mode > type

@Gr1N

This comment has been minimized.

Collaborator

Gr1N commented Jul 2, 2018

LGTM, let's try to go with your proposal.

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jul 2, 2018

as per discussion on gitter:

  • the top level property should change type > schema
  • there should be a shortened display of just 'schema': {'type': 'int'} which is just 'schema': 'int', this applies both at the top level for simple types and where simple types are nested within more complex types.

Overal the design is encapsulated by:

'foobar': {
  'schema': {
    'type': 'list',
    'item_type': {
      'mode': 'any_of',
      'types': [
        'float', 'int',
      ],
    }
  },
  'name': 'Foobar',
  'required': True,
}

@samuelcolvin samuelcolvin added the bug label Jul 4, 2018

@samuelcolvin samuelcolvin self-assigned this Jul 4, 2018

samuelcolvin added a commit that referenced this issue Jul 20, 2018

samuelcolvin added a commit that referenced this issue Jul 20, 2018

@samuelcolvin

This comment has been minimized.

Owner

samuelcolvin commented Jul 20, 2018

I've implemented this in #232, it's not exactly as described above, but similar. The nice thing is that this doesn't break much of what's already implemented in schema()

samuelcolvin added a commit that referenced this issue Jul 31, 2018

improving schema (#232)
* improving schema, fix #213

* tweask and history
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment