Modeling enumeration values semantically #997

fennibay · 2020-11-03T15:06:23Z

Enumerations are commonly used in automation systems; a simple example is a property with the value range "on", "off" or "auto". Building automation in particular uses them frequently via the multistate objects in BACnet.

The TD spec defers to JSON Schema for data modeling. JSON Schema provides the enum keyword for describing enumerations; however, it can only list plain values (in practice numbers or strings) and cannot assign semantics to them.

If we could assign semantics to enumerated values via a well-defined URI coming from an ontology, we could:

Handle cases where different parties use different terminology, e.g. "on" vs. "active"
Attach additional information to enumerated values such as colors, icons, translations
Map the enumerated values to the protocol requirements in protocol bindings. For instance, for multistate objects on BACnet, a BACnet protocol binding could convert semantic enumerations to integers; for a RESTful interface, the semantic enumerations could be mapped to strings that the interface understands.

How could this be solved? Would a simple solution such as interpreting JSON schema's enum keyword via context extension work? For instance: instead of enum: ["on", "off", "auto"] we would use enum: ["myont:on", "myont:off", "myont:auto"] where myont is specified in @context.
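To make the question concrete, here is a minimal Python sketch of the prefix expansion a consumer would perform on such enum values. The namespace URL `https://example.org/myont#` and the helper `expand_compact_iri` are assumptions made up for illustration. A real JSON-LD processor would only treat these strings as IRIs if the enum term were defined with an IRI-typed coercion in the context; this sketch skips that step and shows only the lookup.

```python
# Hypothetical TD fragment: "myont" and its namespace URL are illustrative only.
td_fragment = {
    "@context": {"myont": "https://example.org/myont#"},
    "enum": ["myont:on", "myont:off", "myont:auto"],
}


def expand_compact_iri(value, context):
    """Expand 'prefix:suffix' using the prefixes in context; leave other strings as-is."""
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return value


expanded = [expand_compact_iri(v, td_fragment["@context"]) for v in td_fragment["enum"]]
print(expanded)
```

A consumer without JSON-LD support would simply see the unexpanded strings "myont:on", "myont:off" and "myont:auto".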

P.S. I use "enumerated value" to refer to a member (e.g. "on") of an enumeration, where "enumeration" refers to the whole grouping "on", "off", "auto". I think if we can model enumerated values semantically, different groupings of them in different enumerations can be sufficiently handled in the data model with the existing enum mechanism.

Theoretically yes, if you want to annotate properly, but @id will not be understood by regular JSON Schema parsers or by those that rely on the existing TD vocabulary to generate payloads or UIs. oneOf is already one of the least used features in the WG, so I would not say that my proposal is very widely understood :/

fennibay · 2020-11-09T13:59:02Z

Well, oneOf is a standard feature of JSON Schema, so I think we can assume it will be supported. Who wouldn't understand it?

After further thinking, I understand that my essential need is enumerations of IRIs, instead of enumerations of strings or numbers. Once I have IRIs, I can relate them to an ontology, add specific protocol conversions, add translations...

So going for IRIs, this would be my attempt:

{
  "oneOf": [
    { "const": {"@id": "myont:on" } },
    { "const": {"@id": "myont:off" } },
    { "const": {"@id": "myont:auto" } }
  ]
}

From the JSON Schema point of view, this requires the value to be an object that equals one of the three constants.
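As a rough illustration of that check, the following Python sketch tests a value against a oneOf list of const branches. It is not a JSON Schema validator: it handles only const branches, uses Python equality rather than JSON Schema equality, and the function name `matches_one_of_const` is made up here.

```python
schema = {
    "oneOf": [
        {"const": {"@id": "myont:on"}},
        {"const": {"@id": "myont:off"}},
        {"const": {"@id": "myont:auto"}},
    ]
}


def matches_one_of_const(value, schema):
    """Return True if value equals the 'const' of exactly one 'oneOf' branch."""
    hits = sum(
        1
        for branch in schema["oneOf"]
        if "const" in branch and branch["const"] == value
    )
    return hits == 1


print(matches_one_of_const({"@id": "myont:on"}, schema))  # True
print(matches_one_of_const("on", schema))                 # False: a bare string is not the object
```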

Interpreting with JSON-LD support would further conclude that these are individuals with given IRIs. On this basis, I can add protocol bindings:

{
    "@id": "myont:on",
    "htv:body": "on"
}

This way, simple strings (on, off, auto) would be transmitted at the HTTP level, while a Thing consumer works with the linked IRIs. I can also extend it further with translations and other mechanisms.
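The idea of one semantic value with several wire representations can be sketched in Python as a lookup table per protocol. Everything here is an assumption for illustration: the table layout, the helper names `to_wire` and `from_wire`, and in particular the BACnet state numbers 1 to 3, which are not taken from any real device.

```python
# Per-protocol wire values for each semantic IRI (illustrative values only).
WIRE_VALUES = {
    "http": {"myont:on": "on", "myont:off": "off", "myont:auto": "auto"},
    "bacnet": {"myont:on": 1, "myont:off": 2, "myont:auto": 3},  # assumed state numbers
}


def to_wire(iri, protocol):
    """Translate a semantic IRI into the value sent over the given protocol."""
    return WIRE_VALUES[protocol][iri]


def from_wire(raw, protocol):
    """Translate a received protocol value back into its semantic IRI."""
    for iri, wire in WIRE_VALUES[protocol].items():
        if wire == raw:
            return iri
    raise ValueError(f"no IRI mapped to {raw!r} for protocol {protocol!r}")


print(to_wire("myont:auto", "bacnet"))
print(from_wire("off", "http"))
```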

This is maybe a nicer solution than your "messy" one, but I understand that one would be more robust in that it would also work with non-JSON-LD-aware clients.

egekorkan · 2020-11-09T14:20:37Z

> Well, oneOf is a standard feature of JSON Schema, so I think we can assume it will be supported. Who wouldn't understand it?

I also agree but some are hard coding the meaning/parsing of DataSchema keywords rather than using JSON Schema based approaches.

Regarding the rest, I am not sure I understand everything. As far as I know, @id is a JSON-LD feature and would not be understood by JSON Schema parsers. Also, putting "htv:body": "on" there is bad practice in TD design, since protocol-related information should only be used in forms.

sebastiankb · 2020-12-03T12:55:15Z

I cannot decide whether the issue here is more or less solved or not. If it is not, we should involve members from JSON Schema or JSON-LD here.

fennibay · 2020-12-03T18:46:34Z

@sebastiankb, @egekorkan, sorry I couldn't respond for a while.

In summary, I think we found two alternative solutions:

Alt. 1

Example:

{
  "oneOf": [
    { "const": {"@id": "myont:on" } },
    { "const": {"@id": "myont:off" } },
    { "const": {"@id": "myont:auto" } }
  ]
}

Any kind of value, also when not part of an enumeration, should be modeled as an IRI if the author intends to have it well-defined.
TD, being JSON-LD, represents an IRI as {"@id": "..."}
We can construct an enum from that using the oneOf construct from JSON Schema
A non-enum could also be represented as such.
Further information can also be attached to this IRI, e.g.:
1. Translations
2. Symbols/icons
3. Concrete representation for a protocol // should be under forms
4. Also other IRIs having the same meaning can be mapped using owl:sameAs.
This doesn't cover the case where the author writes TD only as JSON and not JSON-LD. But such an author probably doesn't have requirements regarding linked data, anyway. If the TD is written as JSON-LD, yet the consumer expects only a JSON, they would see a type of object with the structure {"@id": "..."}. This is a general problem of using JSON-LD vs. only supporting JSON, independent from TDs, I'd say.

Alt. 2

{
  "oneOf": [
    { "const":"on","@type":"myont:on" },
    { "const":"off","@type":"myont:off" },
    { "const":"auto","@type":"myont:auto" }
  ]
}

// "type": "string" is not necessary, as the const value is a string literal

Use constants, and add @type to attach semantics.
This has the advantage that a plain JSON consumer would still understand it, and there is no need to map to a protocol value if the two do not differ. So, backwards compatibility is better.
Further information can still be attached via the @type IRI, like alt. 1
I see maybe just a conceptual disadvantage, where we categorize enum values as types instead of individuals. But I don't think this is a major issue, it's just a modelling preference.
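A small Python sketch of how a consumer might use Alt. 2: it reads the plain const values as usual and, if it cares about semantics, builds an index from each value to its @type annotation. The helper name `semantic_index` is hypothetical.

```python
schema = {
    "oneOf": [
        {"const": "on", "@type": "myont:on"},
        {"const": "off", "@type": "myont:off"},
        {"const": "auto", "@type": "myont:auto"},
    ]
}


def semantic_index(schema):
    """Map each const value to its @type annotation (None if not annotated)."""
    return {
        branch["const"]: branch.get("@type")
        for branch in schema["oneOf"]
        if "const" in branch
    }


index = semantic_index(schema)
plain_values = list(index)  # what a JSON-only consumer would work with
print(plain_values)
print(index["auto"])
```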

Conclusion

We found ways to solve this problem, without needing to extend TD spec. IMHO Alt. 2 looks better.

I think we can close this issue. Many thanks for the discussion.

egekorkan · 2020-12-03T21:42:09Z

I also like the second alternative. However, I think that this can be included in the spec on how to provide such information on enum values. I am quite sure that there are others who would be interested. Even putting "description" is a requirement in the profile spec (for the core profile) so that would be a way to have description in enum values.

fennibay · 2020-12-13T19:14:40Z

After examining JSON-LD mechanisms further, I came up with another (IMO better) alternative:

Alt. 3

"@context": {"const": {"@type": "@vocab", "@context": {"on": "myont:On", "off": "myont:Off", "auto": "myont:Auto"}}},
"oneOf": [
    {"const": "on"},
    {"const": "off"},
    {"const": "auto"}
]

This is similar to Alt. 1 in that the enum values are mapped to individuals and not types.
The syntax is simpler than Alt. 1 and 2, except for the @context part ;-)
The IRIs can be completely different than the strings. The mapping is flexible and local. I tried to show this with different casing.
It falls back nicely to JSON, in that we just have simple strings at the end. // Although I don't understand yet in general, how a consumer without JSON-LD support can interpret TDs.
@context could also be somewhere deeper or higher in the hierarchy.

Please also see the example in JSON-LD playground. The expanded form shows that the enum values are expanded to IRIs.

@egekorkan I understand this is all possible via context extension, foreseen in the spec. Do you think an extension to the spec is still necessary? We could provide an example, but I don't see an extension necessary at the moment.

egekorkan · 2020-12-13T19:42:31Z

I sadly don't understand the example but that is my problem :) However, JSON LD playground throws an error.

fennibay · 2020-12-13T20:51:44Z

Ok :-), I'll try to dissect the example:

The first @context makes an expanded term definition for const and says:

The @type of the values is set to @vocab, i.e. they shall be IRIs, and even plain strings are resolved against the currently active vocabulary, i.e. the terms in the context. Without this, the strings would remain just strings and not be converted to IRIs.
The second, inner @context then adds some terms to the vocabulary, mapping the string literals on, off, auto to the desired IRIs.

Then when I use these terms as simple strings, they will be expanded to IRIs as defined by the context.

If a parser doesn't understand JSON-LD, @context will be ignored completely, and we will have string literals.
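The two steps can be emulated with a few lines of Python. This is only a sketch of the lookup described above, not JSON-LD expansion: a real processor would additionally need an IRI mapping for the const term itself and would expand the myont: prefix. The function name `expand_const_values` is made up for illustration.

```python
doc = {
    "@context": {
        "const": {
            "@type": "@vocab",
            "@context": {"on": "myont:On", "off": "myont:Off", "auto": "myont:Auto"},
        }
    },
    "oneOf": [{"const": "on"}, {"const": "off"}, {"const": "auto"}],
}


def expand_const_values(doc):
    """Resolve const strings through the scoped context if const is typed @vocab."""
    term_def = doc.get("@context", {}).get("const", {})
    values = [branch["const"] for branch in doc["oneOf"]]
    if term_def.get("@type") != "@vocab":
        return values  # no coercion: the strings stay plain strings
    scoped_terms = term_def.get("@context", {})
    return [scoped_terms.get(v, v) for v in values]


print(expand_const_values(doc))                      # values resolved to the mapped terms
print(expand_const_values({"oneOf": doc["oneOf"]}))  # without @context: plain strings
```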

I tried the playground again, it worked for me. What do you get as the error message?

mcr · 2020-12-14T15:19:10Z

This issue was pointed to while discussing ietf-wg-asdf/SDF#8.

mjkoster · 2020-12-25T00:57:25Z

Alternative #2 above is closest to the semantic annotation pattern we use for events, actions, properties, and data fields of complex data schemas in other, similar TD examples. It also allows adding labels, descriptions, and localization hints. I'd also recommend using "anyOf" (see the referenced ASDF discussion above, ietf-wg-asdf/SDF#8). Here is an example using OneDM-style example URIs.

{
  "anyOf": [
    {
      "description": "Manually override the automatic control and place the device in the powered state",
      "label": "on",
      "@type": "https://onedm.org/exploratory/#/sdfData/HoaSwitchState/sdfChoice/on",
      "const": "on"
    },
    {
      "description": "Manually override the automatic control and place the device in the un-powered state",
      "label": "off",
      "@type": "https://onedm.org/exploratory/#/sdfData/HoaSwitchState/sdfChoice/off",
      "const": "off"
    },
    {
      "description": "Apply the automatic control to the device state",
      "label": "auto",
      "@type": "https://onedm.org/exploratory/#/sdfData/HoaSwitchState/sdfChoice/auto",
      "const": "auto"
    }
  ]
}

mjkoster · 2020-12-25T01:17:30Z

The file uploaded to OneDM that this references looks like this:

{
  "info": {
    "title": "Example file for H-O-A industrial control switch", 
    "version": "2020-12-24", 
    "copyright": "Copyright 2020 Michael J. Koster. All rights reserved.", 
    "license": "https://github.com/one-data-model/oneDM/blob/master/LICENSE"
  }, 
  "namespace": {
    "ex": "https://onedm.org/exploratory/"
  }, 
  "defaultnamespace": "ex", 

  "sdfObject": {
    "HoaSwitch": {
      "sdfProperty": {
        "SwitchState": {
          "sdfRef": "ex:#/sdfData/HoaSwitchState"
        }
      }, 
      "sdfAction": {
        "on": {}, 
        "off": {},
        "auto": {}
      }
    }
  },
  "sdfData": {
    "HoaSwitchState": {
      "sdfChoice": {
        "on": {
          "description": "Manually override the automatic control and place the device in the powered state",
          "label": "on",
          "default": "on"
        },
        "off": {
          "description": "Manually override the automatic control and place the device in the un-powered state",
          "label": "off",
          "default": "off"
        },
        "auto": {
          "description": "Apply the automatic control to the device state",
          "label": "auto",
          "default": "auto"
        }
      }
    }
  }
}
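How the anyOf data schema from the previous comment relates to this SDF file can be sketched in Python. The sketch assumes that each choice's "default" becomes the "const" value and that the "@type" URI is the base path plus the choice name; both are inferred from the two examples, not stated by any spec. The helper `choices_to_any_of` is hypothetical.

```python
BASE = "https://onedm.org/exploratory/#/sdfData/HoaSwitchState/sdfChoice/"

# The sdfChoice entries from the SDF file above.
sdf_choices = {
    "on": {
        "description": "Manually override the automatic control and place the device in the powered state",
        "label": "on",
        "default": "on",
    },
    "off": {
        "description": "Manually override the automatic control and place the device in the un-powered state",
        "label": "off",
        "default": "off",
    },
    "auto": {
        "description": "Apply the automatic control to the device state",
        "label": "auto",
        "default": "auto",
    },
}


def choices_to_any_of(choices, base):
    """Build a TD-style anyOf schema with one annotated const branch per choice."""
    return {
        "anyOf": [
            {
                "description": choice["description"],
                "label": choice["label"],
                "@type": base + name,        # assumed URI pattern
                "const": choice["default"],  # assumed: default doubles as the wire value
            }
            for name, choice in choices.items()
        ]
    }


td_schema = choices_to_any_of(sdf_choices, BASE)
print([branch["@type"] for branch in td_schema["anyOf"]])
```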

You could also add some more "@type" statements to the TD to annotate the TD itself, and its properties and actions:

"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfProperty/SwitchState"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfAction/on"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfAction/off"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfAction/auto"

egekorkan · 2023-12-21T09:06:50Z

Some new discussion on this:

enum is better for a developer who is working with the data schema. oneOf creates a more complicated structure, that is not necessary.
We can keep enum but add another term that allows mapping the enum values to more semantically enriched terms. E.g. below:

{
//...
"enum":[4,6,123],
"enumMap":{
  // the map uses array indexes. This way, we are not coupled to value types in enum
  "0": {"description":"Lowest speed for fan","@type":"myOnto:lowSpeed"},
  "1": {"description":"Medium speed for fan","@type":"myOnto:midSpeed"},
  "2": {"description":"Maximum speed for fan","@type":"myOnto:maxSpeed"}
  }
}
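A consumer-side lookup for this proposal could look like the following Python sketch. Note that enumMap is only a term proposed in this comment, not part of the TD spec, and the helper `annotation_for` is made up for illustration.

```python
schema = {
    "enum": [4, 6, 123],
    "enumMap": {
        "0": {"description": "Lowest speed for fan", "@type": "myOnto:lowSpeed"},
        "1": {"description": "Medium speed for fan", "@type": "myOnto:midSpeed"},
        "2": {"description": "Maximum speed for fan", "@type": "myOnto:maxSpeed"},
    },
}


def annotation_for(value, schema):
    """Find the enumMap entry for an enum value via its array index.

    Raises ValueError if the value is not in enum; returns None if the
    value is valid but has no enumMap entry.
    """
    index = schema["enum"].index(value)
    return schema.get("enumMap", {}).get(str(index))


print(annotation_for(123, schema)["@type"])
print(annotation_for(4, schema)["description"])
```

A consumer that ignores enumMap still validates against the plain enum, which is the backwards-compatibility argument for this variant.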

@lu-zero This also relates to the data mapping discussion. This is a rather common use case in BACnet devices.

lu-zero · 2023-12-21T12:52:04Z

The overlap between enum and oneOf is fairly annoying; for the use case at hand, using oneOf feels better.
I'd check with upstream JSON Schema, since this ambiguity should come up on a broader scope :/

In the end, enum is a oneOf of consts without the chance to use additional metadata fields from DataSchema.
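That equivalence can be shown with a short Python sketch that rewrites an enum into a oneOf of const branches, which then leaves room for per-value metadata. The function name `enum_to_one_of` is hypothetical, and the sketch assumes the schema has no oneOf of its own (an existing one would be overwritten).

```python
def enum_to_one_of(schema):
    """Return a copy of schema with 'enum' rewritten as a 'oneOf' of const branches."""
    converted = {key: val for key, val in schema.items() if key != "enum"}
    converted["oneOf"] = [{"const": value} for value in schema["enum"]]
    return converted


original = {"type": "string", "enum": ["on", "off", "auto"]}
rewritten = enum_to_one_of(original)

# Each branch is now an object, so metadata can be attached per value.
rewritten["oneOf"][0]["description"] = "Device is powered"
print(rewritten)
```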

sebastiankb added the PR needed label Dec 4, 2020

sebastiankb assigned vcharpenay Apr 26, 2021

egekorkan added the "Needs discussion" label Oct 26, 2021

egekorkan added the Defer to TD 2.0 label Nov 23, 2023

egekorkan unassigned vcharpenay Nov 23, 2023

egekorkan added the "data mapping" label Dec 21, 2023
