v6 annotation: named enumerations #57

Closed
handrews opened this issue Sep 18, 2016 · 11 comments

@handrews
Contributor

handrews commented Sep 18, 2016

The Problem

Enumerations are often cryptic, particularly when they exist to match legacy systems that valued storage efficiency over readability. While it is possible to include more information with the title and description fields at the same level as the enum, it is not possible to associate any additional information with each enum value.

There are two use cases:

Documentation

This falls squarely within JSON Schema’s goals, and is simply about providing an easily-understood-by-humans string for each enum value.

UI Generation

This is analogous to the value+label tuples common in web application frameworks geared towards producing select widgets. While JSON Schema is intended to help build UIs, it is debatable whether this is enough of a core goal to motivate features on its own. See also issue #55.

The Proposals

There have been several proposals to address this. The options so far are:

  • A parallel array of human-readable names under a different keyword adjacent to "enum"
  • A parallel-ish array of [enumValue, humanName] tuples under a different keyword adjacent to "enum"
  • Replacing the current "enum" array with an array of tuples of (enumValue, humanName)

Because "enum" values can be of any JSON type, while JSON object keys must be strings, it is not possible to have a JSON object mapping values to names. This is why lists of tuples are proposed instead.

@geraintluff proposed the parallel array of names, under the keyword "enumNames": https://github.com/json-schema/json-schema/wiki/enumNames-(v5-proposal)

@nemesisdesign proposed replacing "enum" with a tuple array, using the keyword "choices", drawn from web app frameworks: https://github.com/json-schema/json-schema/wiki/choices-(v5-proposal-to-enhance-enum)

@sam-at-github proposed the parallel-ish array of tuples, under the keyword "enumLut" (although this is more or less the same as the proposed transitional period for moving to "choices"). See the comments in the issue filed for "choices" at the old repository (and also for a discussion of the validity of UI generation as a goal): json-schema/json-schema#211
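
To make the three shapes concrete, here is a minimal Python sketch. The keyword names come from the proposals, but the exact layouts, the status codes, and the two helper functions are assumptions made up for illustration; the linked wiki pages are the authority on the real syntax.

```python
# Hypothetical legacy status codes. The layouts below are assumptions based
# on the descriptions in this issue, not copies of the actual proposals.
parallel_names = {
    "enum": ["A", "I", "P"],
    "enumNames": ["Active", "Inactive", "Pending"],  # matched by position
}

keyed_tuples = {
    "enum": ["A", "I", "P"],
    "enumLut": [["P", "Pending"], ["A", "Active"]],  # matched by value
}

replacement_tuples = {
    "choices": [["A", "Active"], ["I", "Inactive"], ["P", "Pending"]],
}


def label_from_parallel(schema, value):
    """Look up a label by position; fall back to the raw value."""
    values = schema["enum"]
    names = schema.get("enumNames", [])
    index = values.index(value)  # raises ValueError if value is not listed
    return names[index] if index < len(names) else str(value)


def label_from_tuples(pairs, value):
    """Look up a label by comparing values; fall back to the raw value.

    Uses Python equality, which only approximates JSON equality
    (for example, Python treats True == 1 as true).
    """
    for candidate, name in pairs:
        if candidate == value:
            return name
    return str(value)
```

Note that the value-keyed list can name only some of the values and list them in any order, while the positional list depends entirely on the two arrays staying aligned.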

Pros and cons

  • Separate keywords for enum values and human-readable names preserve our existing distinction between validation keywords and annotation keywords
  • Parallel arrays are error-prone and very difficult to manage with anything but a very short enumeration
  • Making the array hold tuples so that the order is irrelevant makes it more robust, but involves duplication. If the enum value itself is a complex object or list, the duplication can get non-trivial
  • Replacing "enum" with a new keyword that holds tuples is disruptive, and combines validation and annotation into one keyword, which we've otherwise avoided
  • A list of tuples, whether in addition to or in place of "enum", matches how many web development frameworks set up <select> inputs in forms.

In terms of schema design purity, the parallel array of names is the best solution. "enum" remains a validation property, and "enumNames" (or whatever we call the parallel array) is an annotation property.

In terms of ease of use, replacing the current value list with a tuple list is the best option. It removes any possibility of mis-matching values and names, and avoids any duplication. The cost is some syntactic noise for unnamed enums as the entries need to be tuples whether there are names or not.

In terms of flexibility, the parallel-ish array of tuples, which is keyed by the value rather than matched strictly by order, is the best option. It allows unnamed enums to continue to work exactly as they already do. We also preserve the validation vs annotation property separation. And it is not vulnerable to mismatches by miscounting. The cost is needing to duplicate the enum values, and then the values can get out of sync.

Steps towards a resolution

We should decide whether the separation of validation and annotation keywords is a fundamental part of the JSON Schema approach (again, see issue #55). If it is, then we can discard the "replace with a list of tuples" option, as it would be used for both validation and annotation. It would be the only annotation that leaves noise in the validation syntax even when it is not used. The value itself may be a tuple, so the top level must always be a tuple in order to avoid ambiguity, even if there is no name present.

If we do settle on the validation/annotation split principle, we're down to either adding a list of names that must be strictly parallel to the list of values, or we must add a list of tuples that are correlated by the value in the tuple. The former option is likely to get out of order or end up with the wrong number of entries, while the latter is likely to end up with values out of sync.

For simple values, keeping the values in sync should be pretty easy, but if enums supply complex data structure values, bugs are likely. I suspect that complex values in enums are quite rare.

For small sets of values, keeping lists in parallel should be easy, but long enums will lead to bugs. I suspect that long lists are more common than complex values.

If long lists are more common than complex values, we should choose the option that is more robust for long lists, which is the list of tuples. I'd appropriate the "enumNames" keyword for it, even though that was proposed for the list of names, because it clearly ties the list of tuples to the "enum" property.

One mitigation for bugs involving values getting out of sync is that a debug mode could easily check that every value in the tuple list is an actual value of the corresponding enum. I am NOT proposing this as a step in validating instances; JSON Schema seems to generally be fine with nonsensical schemas (although that's another principle that we should confirm in issue #55). I am just speculating about an additional tool, like a linter for JSON Schema.

The point being that it would be possible to detect the most likely bugs from using a list of tuples with a theoretical linter, but the only thing such a linter could check with the list of names is that it is not longer than the enumeration. I think this, plus the likelihood of long enumerations vs complex values, gives the list of tuples alongside the existing "enum" list the edge.
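
A rough sketch of what such a theoretical linter could check, in Python. Both function names are hypothetical, and the tuple list is assumed to be a list of [value, name] pairs sitting next to "enum".

```python
def lint_tuple_list(enum_values, pairs):
    """Return the values named in the tuple list that are not in the enum."""
    return [value for value, _name in pairs if value not in enum_values]


def lint_parallel_names(enum_values, names):
    """Return True when the names list is no longer than the enum.

    This is the only structural check available for a parallel list;
    swapped or shifted names cannot be detected.
    """
    return len(names) <= len(enum_values)
```

For example, `lint_tuple_list(["A", "I"], [["A", "Active"], ["X", "Unknown"]])` reports `["X"]`, whereas `lint_parallel_names(["A", "I"], ["Inactive", "Active"])` passes even though the two names are swapped.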

@awwright
Member

Alternative solution: just use {oneOf: [...]} with a set of schemas, that can themselves contain title/description/constant

@handrews
Contributor Author

handrews commented Sep 18, 2016

My first reaction was "that's a cumbersome mess" and then I wrote it out and now I agree with you :-)

{
    "oneOf": [
        {"enum": ["foo"], "title": "Pick Foo"},
        {"enum": ["bar"], "title": "Pick Bar"},
        {"enum": ["whatever"], "title": "Don't Care"}
    ]
}

Not much more cumbersome than the list of tuples, and avoids both the duplication and validation vs annotation concerns. The one real (minor) problem is that the intent is not overly clear. I think that if we combine this with const (issue #58), the intent becomes much clearer:

{
    "oneOf": [
        {"const": "foo", "title": "Pick Foo"},
        {"const": "bar", "title": "Pick Bar"},
        {"const": "whatever", "title": "Don't Care"}
    ]
}

Anyway, +1 to this idea.
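
A minimal sketch of how a documentation or UI tool might read value/label pairs back out of either form above. The function name is hypothetical, and it assumes each branch carries either "const" or a single-value "enum", with "title" as the label.

```python
def choices_from_one_of(schema):
    """Collect (value, label) pairs from oneOf branches.

    A branch counts if it has "const" or an "enum" with exactly one value.
    The label is the branch's "title", or the value as a string if absent.
    """
    choices = []
    for sub in schema.get("oneOf", []):
        if "const" in sub:
            value = sub["const"]
        elif len(sub.get("enum", [])) == 1:
            value = sub["enum"][0]
        else:
            continue  # not a single-value branch; skip it
        choices.append((value, sub.get("title", str(value))))
    return choices
```

Because the value and its label live in the same subschema, there is nothing to keep in sync and nothing to count.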

@ruifortes

ruifortes commented Oct 7, 2016

Maybe the proposal for using an object is not out. It seems ES6 Object.getOwnPropertyNames() maintains the order.

Of course, the UI library that reads the schema would have to iterate using this method.

Maybe even using translations like this:

{
    "enum": {
        "UTC": {"$ref": "url://schemas/clock/translations#timezone/UTC"},
        "GMT0": {"$ref": "url://schemas/clock/translations#timezone/GMT0"},
        "EAT-3": {"$ref": "url://schemas/clock/translations#timezone/EAT-3"}
    }
}

@handrews
Contributor Author

const has been added in PR #139 which has already gotten one approval and is waiting out the review period. This will make the oneOf proposal pretty readable and concise.

@ruifortes unless/until JSON requires maintaining property order, we can't rely on it for JSON Schema features. JSON is separate from JavaScript, and its spec prevents any reliance on object property ordering.
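
As a small illustration using Python's standard json module (one parser among many, so this shows one implementation's behavior rather than the spec itself), two objects that differ only in member order compare as equal, while two arrays in different order do not:

```python
import json

# Same members, different order: equal as parsed objects.
a = json.loads('{"UTC": 1, "GMT0": 2}')
b = json.loads('{"GMT0": 2, "UTC": 1}')
same_object = (a == b)

# Arrays are ordered, so the same elements in a different order differ.
list_a = json.loads('["UTC", "GMT0"]')
list_b = json.loads('["GMT0", "UTC"]')
same_array = (list_a == list_b)
```

Here `same_object` is `True` and `same_array` is `False`, which is why ordered choices need an array.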

Since I filed this, I'm closing it on the grounds that we should see how const + oneOf works out. Additionally, now that there is a separate project for a UI-oriented schema extension (#67), the original use cases really belong with that project rather than with JSON Schema validation. If anyone wants to pursue further work for the UI use cases, it should be done with JSON UI Schema.

@adjenks

adjenks commented Jan 22, 2020

Why was this so hard to find? I spent a long time trying to figure out how to annotate enums. The example should be put in the docs. I was searching online for quite a while to get here; I found a link to this page from https://groups.google.com/forum/#!topic/json-schema/w_5mVYB7OHg

@tillias

tillias commented Apr 7, 2021

I'm very sorry for hijacking this thread, but what is the outcome of this ticket? There are billions of mentions and I'm not sure what the implemented solution is. What is the proposed way to use named enumerations with multiple fields?

@gregsdennis
Member

The workaround is to use a oneOf with const subschemas as mentioned here.

@tillias

tillias commented Apr 7, 2021

We're trying to achieve: OpenAPITools/openapi-generator#9140

@gregsdennis what would be valid JSON for

{
    "oneOf": [
        {"const": "001", "name": "first"},
        {"const": "002", "name": "second"},
        {"const": "003", "name": "third"}
    ]
}

I need some particular property to be "oneOf" and unfortunately the example provided is not self-descriptive :(

@karenetheridge
Member

karenetheridge commented Apr 7, 2021

"name" is not a valid keyword there, but "description" is. So your tool should be able to use this subschema to generate the choices (using "anyOf" instead of "oneOf" should work as well, since all the consts are mutually exclusive):

  "properties": {
    "businessId": {
      "oneOf": [
        {"const": "001", "description": "first"},
        {"const": "002", "description": "second"},
        {"const": "003", "description": "third"}
      ]
    }
  }
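
A minimal sketch of why "anyOf" and "oneOf" agree here, in Python. It handles only subschemas that consist of "const" plus annotations, which is an assumption of this sketch and not a general validator; the function names are hypothetical.

```python
def count_const_matches(subschemas, instance):
    """Count the const-only subschemas that the instance satisfies."""
    return sum(
        1 for sub in subschemas
        if "const" in sub and sub["const"] == instance
    )


def valid_one_of(subschemas, instance):
    """oneOf: exactly one subschema must match."""
    return count_const_matches(subschemas, instance) == 1


def valid_any_of(subschemas, instance):
    """anyOf: at least one subschema must match."""
    return count_const_matches(subschemas, instance) >= 1
```

With distinct const values, an instance can match at most one branch, so "exactly one" and "at least one" give the same answer: "002" passes both checks and "004" fails both. The "description" values play no part in validation and are only there for tools to read.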

@juliusdanek

@handrews, 4.5 years later is this still the accepted solution? Or have there been other attempts made to make this clearer?

@hgeldenhuys

hgeldenhuys commented Apr 27, 2021

oneOf and const only work with value-label pairs and simple annotations, but not with translations.

I wouldn't mix data and UI concerns in the same schema.

Data should only store values; UI validation should have its own schema.

For example, your data schema could be:

{
  "$id": "http://acme.corp/animal-data.json",
  "properties": {
    "animal": {
      "$ref": "#/definitions/animal"
    }
  },
  "definitions": {
    "animal": {
      "enum": ["dog", "cat", "mouse"]
    }
  }
}

...and your UI schema could contain the translation values using properties or patternProperties:

{
  "$id": "http://acme.corp/animal-ui.json",
  "type": "array",
  "items": {
    "$ref": "#/definitions/translation"
  },
  "definitions": {
    "translation": {
      "type": "object",
      "properties": {
        "enum": {
          "$ref": "http://acme.corp/animal-data.json#/definitions/animal"
        },
        "translations": {
          "patternProperties": {
            "^[a-z][a-z]_[A-Z][A-Z]$": { "type": "string" }
          }
        }
      }
    }
  }
}

That means your data could look like this:

{
   "animal": "dog"
}

and your translation for your UI could look like this:

[
  {
    "enum": "cat",
    "translations": {
      "en_US": "Cat",
      "fr_CA": "Chat",
      "af_ZA": "Kat"
    }
  },
  {
    "enum": "dog",
    "translations": {
      "en_US": "Dog",
      "fr_CA": "Chien",
      "af_ZA": "Hond"
    }
  },
  {
    "enum": "mouse",
    "translations": {
      "en_US": "Mouse",
      "fr_CA": "Souris",
      "af_ZA": "Muis"
    }
  }
]
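
A minimal sketch of how a UI layer might look up a label in that translation data, in Python. The function name and the fallback to "en_US" are assumptions made for illustration and are not part of the schemas above.

```python
def translate(entries, value, locale, fallback_locale="en_US"):
    """Return the label for an enum value in the requested locale.

    Falls back to fallback_locale (an assumed default), then to the raw
    value as a string, when no matching translation exists.
    """
    for entry in entries:
        if entry["enum"] == value:
            labels = entry.get("translations", {})
            return labels.get(locale) or labels.get(fallback_locale) or str(value)
    return str(value)
```

With the data above, `translate(entries, "cat", "fr_CA")` would give "Chat", and an unlisted locale would fall back to the "en_US" label.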


9 participants