Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Any for schemas with no type #389

Closed
forest-benchling opened this issue Apr 14, 2021 · 12 comments
Closed

Use Any for schemas with no type #389

forest-benchling opened this issue Apr 14, 2021 · 12 comments
Labels
✨ enhancement New feature or improvement

Comments

@forest-benchling
Copy link
Collaborator

Describe the bug
When there's a model property with no type field, it causes a NoneProperty to be generated. I believe that according to JSONSchema, if there is no type specified, it should accept any type (though I'm not 100% sure; see conflicting answers in this thread https://stackoverflow.com/questions/29230349/is-type-optional-in-json-schema). It does seem more logical to compile to Any rather than None, since there's already type: null in JSONSchema.

To Reproduce
Add a model with a property with no type, such as:

    Field:
      type: object
      properties:
        value:
		  description: Field value

Expected behavior
Expected the generated field.py model to have value: Any.

OpenAPI Spec File
See above

Desktop (please complete the following information):

  • OS: [e.g. macOS 10.15.1]
  • Python Version: [e.g. 3.8.0]
  • openapi-python-client version [e.g. 0.1.0]
@forest-benchling forest-benchling added the 🐞bug Something isn't working label Apr 14, 2021
@forest-benchling
Copy link
Collaborator Author

@dbanty @emann I'm curious to get your thoughts on this, also if there's any workaround. cc @packyg @bowenwr @dtkav

@dbanty
Copy link
Collaborator

dbanty commented Apr 14, 2021

I believe you're right when it comes to JSONSchema, however OpenAPI differs in a few ways that can be seen in their documentation.

null is not supported as a type

That's not to say that we should interpret no type as null, but type: null is not allowed in OpenAPI. This is near the top of the spec (wish they had perma-links to sections...)

type - Value MUST be a string. Multiple types via an array are not supported.

This is under Schema Object (the relevant section for where type is used) where it's noted that type does not strictly follow JSONSchema, but is instead modified to better accommodate OpenAPI.


I don't think that any of this is a clear indication of the right behavior for generators, it mostly just is evidence of there not being clarity (like with a lot of OpenAPI's spec). My bigger concern is around practicality. If we were to support Any, how would we go about serializing or deserializing that object? We can't just pass it directly through to httpx without knowing what sort of data it is or who knows what would happen. I'm also not sure what benefit an Any-like type provides to an API. If consumers have no idea what to pass, what can the field be used for?

In your particular use-case, what is actually supposed to be passed to value? Is it possible to do a oneOf with the possible concrete types?

@forest-benchling
Copy link
Collaborator Author

forest-benchling commented Apr 14, 2021

@dbanty Thanks for clarifying the OpenAPI spec difference, that's helpful to know.

Our use case might be pretty specific to us...we would basically like Any to act as a JSON type, but with no type-checking. The serializer/deserializer would just be the identity function.

We tried defining it as a oneOf of string, number, array, etc. The issue is that in our use case, when we get a response from the API, we generally know exactly which type it is, and that forces us to use cast or type: ignore.

As an illustration, let's say you had an API which took in a type string, and returned an example of that type:

"string" -> API returns "foo"
"number" -> API returns 3.4

In your calling code you want to do something like this:

my_string: str = API("string")  # type error

This wouldn't be a problem if it were a one-off case, but unfortunately it's one of the most common API operations in our system.

@forest-benchling
Copy link
Collaborator Author

forest-benchling commented Apr 14, 2021

Also, just noting that we do already support "identity" serialization/deserialization of JSON objects with

additionalProperties: true

, which compiles to Dict[str, Any], so it wouldn't be a novel concept.

@forest-benchling
Copy link
Collaborator Author

@dbanty Any thoughts on the above?

@dbanty
Copy link
Collaborator

dbanty commented Apr 19, 2021

I'm not opposed to the idea completely. I'm personally of the believe that any input or output should be typed so clients know what they're looking at and can avoid runtime errors... but I'm not so pedantic that I'll prevent others from doing it 😆. I just want to make sure we're adhering to the OpenAPI spec as closely as possible. It's unclear to me what the expected behavior is of such an endpoint. Without clear guidance in the spec, I tend to turn to the openapi-tools project as they're sort of the "reference implementation". What do they do when there's no type field?

@forest-benchling
Copy link
Collaborator Author

forest-benchling commented Apr 19, 2021

@dbanty I searched openapi-tools but I couldn't find any results on GitHub except https://github.com/apisyouwonthate/openapi.tools, which just seems to link out to other tools. Would you mind linking me to the repo you're referring to?

I tried out swagger-codegen, which is another one of the tools we were initially evaluating. Using this spec,

"components": {
      "schemas": {
          "AASequence": {
              "type": "object",
              "properties": {
                  "aliases": {
                      "description": "string"
                  }
              }
          }
      }
},

this was the model class that was generated. As far as I can tell, it supports any object type.

# coding: utf-8

"""


    
    
    Generated by: https://github.com/swagger-api/swagger-codegen.git
"""

import pprint
import re  # noqa: F401

import six

class AASequence(object):
    """NOTE: This class is auto generated by the swagger code generator program.

    Do not edit the class manually.
    """
    """
    Attributes:
      swagger_types (dict): The key is attribute name
                            and the value is attribute type.
      attribute_map (dict): The key is attribute name
                            and the value is json key in definition.
    """
    swagger_types = {
        'aliases': 'Object'
    }

    attribute_map = {
        'aliases': 'aliases'
    }

    def __init__(self, aliases=None):  # noqa: E501
        """AASequence - a model defined in Swagger"""  # noqa: E501
        self._aliases = None
        self.discriminator = None
        if aliases is not None:
            self.aliases = aliases

    @property
    def aliases(self):
        """Gets the aliases of this AASequence.  # noqa: E501

        string  # noqa: E501

        :return: The aliases of this AASequence.  # noqa: E501
        :rtype: Object
        """
        return self._aliases

    @aliases.setter
    def aliases(self, aliases):
        """Sets the aliases of this AASequence.

        string  # noqa: E501

        :param aliases: The aliases of this AASequence.  # noqa: E501
        :type: Object
        """

        self._aliases = aliases

    def to_dict(self):
        """Returns the model properties as a dict"""
        result = {}

        for attr, _ in six.iteritems(self.swagger_types):
            value = getattr(self, attr)
            if isinstance(value, list):
                result[attr] = list(map(
                    lambda x: x.to_dict() if hasattr(x, "to_dict") else x,
                    value
                ))
            elif hasattr(value, "to_dict"):
                result[attr] = value.to_dict()
            elif isinstance(value, dict):
                result[attr] = dict(map(
                    lambda item: (item[0], item[1].to_dict())
                    if hasattr(item[1], "to_dict") else item,
                    value.items()
                ))
            else:
                result[attr] = value
        if issubclass(AASequence, dict):
            for key, value in self.items():
                result[key] = value

        return result

    def to_str(self):
        """Returns the string representation of the model"""
        return pprint.pformat(self.to_dict())

    def __repr__(self):
        """For `print` and `pprint`"""
        return self.to_str()

    def __eq__(self, other):
        """Returns true if both objects are equal"""
        if not isinstance(other, AASequence):
            return False

        return self.__dict__ == other.__dict__

    def __ne__(self, other):
        """Returns true if both objects are not equal"""
        return not self == other

@dbanty
Copy link
Collaborator

dbanty commented Apr 19, 2021

Whoops, sorry I got the name wrong. Here's a link https://github.com/OpenAPITools/openapi-generator .

Looks like swagger-codegen is trying to serialize everything dynamically at runtime instead of using static typing. Works fine as long as no one puts a non-json-serializable type in there (e.g. a datetime) which will be the case for our implementation as well.

@forest-benchling
Copy link
Collaborator Author

forest-benchling commented Apr 19, 2021

@dbanty Gotcha, thanks. Just tried out openapi-generator with the same schema, here's the generated code:

"""
    Benchling API

    No description provided (generated by Openapi Generator https://github.com/openapitools/openapi-generator)  # noqa: E501

    The version of the OpenAPI document: 2.0.0
    Generated by: https://openapi-generator.tech
"""


import re  # noqa: F401
import sys  # noqa: F401

from openapi_client.model_utils import (  # noqa: F401
    ApiTypeError,
    ModelComposed,
    ModelNormal,
    ModelSimple,
    cached_property,
    change_keys_js_to_python,
    convert_js_args_to_python_args,
    date,
    datetime,
    file_type,
    none_type,
    validate_get_composed_info,
)


class AaSequence(ModelNormal):
    """NOTE: This class is auto generated by OpenAPI Generator.
    Ref: https://openapi-generator.tech

    Do not edit the class manually.

    Attributes:
      allowed_values (dict): The key is the tuple path to the attribute
          and the for var_name this is (var_name,). The value is a dict
          with a capitalized key describing the allowed value and an allowed
          value. These dicts store the allowed enum values.
      attribute_map (dict): The key is attribute name
          and the value is json key in definition.
      discriminator_value_class_map (dict): A dict to go from the discriminator
          variable value to the discriminator class name.
      validations (dict): The key is the tuple path to the attribute
          and the for var_name this is (var_name,). The value is a dict
          that stores validations for max_length, min_length, max_items,
          min_items, exclusive_maximum, inclusive_maximum, exclusive_minimum,
          inclusive_minimum, and regex.
      additional_properties_type (tuple): A tuple of classes accepted
          as additional properties values.
    """

    allowed_values = {
    }

    validations = {
    }

    additional_properties_type = None

    _nullable = False

    @cached_property
    def openapi_types():
        """
        This must be a method because a model may have properties that are
        of type self, this must run after the class is loaded

        Returns
            openapi_types (dict): The key is attribute name
                and the value is attribute type.
        """
        return {
            'aliases': (bool, date, datetime, dict, float, int, list, str, none_type,),  # noqa: E501
        }

    @cached_property
    def discriminator():
        return None


    attribute_map = {
        'aliases': 'aliases',  # noqa: E501
    }

    _composed_schemas = {}

    required_properties = set([
        '_data_store',
        '_check_type',
        '_spec_property_naming',
        '_path_to_item',
        '_configuration',
        '_visited_composed_classes',
    ])

    @convert_js_args_to_python_args
    def __init__(self, *args, **kwargs):  # noqa: E501
        """AaSequence - a model defined in OpenAPI

        Keyword Args:
            _check_type (bool): if True, values for parameters in openapi_types
                                will be type checked and a TypeError will be
                                raised if the wrong type is input.
                                Defaults to True
            _path_to_item (tuple/list): This is a list of keys or values to
                                drill down to the model in received_data
                                when deserializing a response
            _spec_property_naming (bool): True if the variable names in the input data
                                are serialized names, as specified in the OpenAPI document.
                                False if the variable names in the input data
                                are pythonic names, e.g. snake case (default)
            _configuration (Configuration): the instance to use when
                                deserializing a file_type parameter.
                                If passed, type conversion is attempted
                                If omitted no type conversion is done.
            _visited_composed_classes (tuple): This stores a tuple of
                                classes that we have traveled through so that
                                if we see that class again we will not use its
                                discriminator again.
                                When traveling through a discriminator, the
                                composed schema that is
                                is traveled through is added to this set.
                                For example if Animal has a discriminator
                                petType and we pass in "Dog", and the class Dog
                                allOf includes Animal, we move through Animal
                                once using the discriminator, and pick Dog.
                                Then in Dog, we will make an instance of the
                                Animal class but this time we won't travel
                                through its discriminator because we passed in
                                _visited_composed_classes = (Animal,)
            aliases (bool, date, datetime, dict, float, int, list, str, none_type): string. [optional]  # noqa: E501
        """

        _check_type = kwargs.pop('_check_type', True)
        _spec_property_naming = kwargs.pop('_spec_property_naming', False)
        _path_to_item = kwargs.pop('_path_to_item', ())
        _configuration = kwargs.pop('_configuration', None)
        _visited_composed_classes = kwargs.pop('_visited_composed_classes', ())

        if args:
            raise ApiTypeError(
                "Invalid positional arguments=%s passed to %s. Remove those invalid positional arguments." % (
                    args,
                    self.__class__.__name__,
                ),
                path_to_item=_path_to_item,
                valid_classes=(self.__class__,),
            )

        self._data_store = {}
        self._check_type = _check_type
        self._spec_property_naming = _spec_property_naming
        self._path_to_item = _path_to_item
        self._configuration = _configuration
        self._visited_composed_classes = _visited_composed_classes + (self.__class__,)

        for var_name, var_value in kwargs.items():
            if var_name not in self.attribute_map and \
                        self._configuration is not None and \
                        self._configuration.discard_unknown_keys and \
                        self.additional_properties_type is None:
                # discard variable.
                continue
            setattr(self, var_name, var_value)

The code is pretty hard to read so I'm not exactly sure, but based on this method

    @cached_property
    def openapi_types():
        """
        This must be a method because a model may have properties that are
        of type self, this must run after the class is loaded

        Returns
            openapi_types (dict): The key is attribute name
                and the value is attribute type.
        """
        return {
            'aliases': (bool, date, datetime, dict, float, int, list, str, none_type,),  # noqa: E501
        }

it looks like they are treating it as an any type. But I'm not sure how de/serialization is being done.

@dbanty
Copy link
Collaborator

dbanty commented Apr 19, 2021

The code is pretty hard to read so I'm not exactly sure, but based on this method

That's why openapi-python-client exists 😆.

it looks like they are treating it as an any type. But I'm not sure how de/serialization is being done.

Looks like they are setting it to the list of builtins that they know how to serialize / deserialize. My guess is that the function they use to transform that into actual args in requests is doing a runtime type check to turn date/datetime into strings (or whatever JSON serializer they are using).

Any is probably the right call based on those two examples then. I would say without us trying to do runtime validation/serialization for now, just pass it through like you said.

@forest-benchling
Copy link
Collaborator Author

forest-benchling commented Apr 20, 2021

Sounds good, thank you for the feedback!

Thinking about backwards-compatibility, it seems like if someone were relying on NoneProperty, their code should still work, so that's good, although they would lose the type specificity.

@dbanty dbanty added this to To do in OpenAPI 3.0 Compliance via automation Apr 28, 2021
@dbanty dbanty added ✨ enhancement New feature or improvement and removed 🐞bug Something isn't working labels May 2, 2021
@dbanty dbanty changed the title No type causes NoneProperty to be generated Use Any for schemas with no type May 2, 2021
@dbanty
Copy link
Collaborator

dbanty commented Jul 5, 2021

Closed by #445

@dbanty dbanty closed this as completed Jul 5, 2021
OpenAPI 3.0 Compliance automation moved this from To do to Done Jul 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
✨ enhancement New feature or improvement
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

2 participants