Modeling enumeration values semantically #997

Open
fennibay opened this issue Nov 3, 2020 · 16 comments
Labels
data mapping workitem: discussions on data mapping concepts Defer to TD 2.0 Needs discussion more discussion is needed before getting to a solution PR needed

Comments

@fennibay

fennibay commented Nov 3, 2020

Enumerations are common in automation systems. A simple example is a property whose value range is "on", "off" or "auto". Building automation in particular uses them frequently via the multistate objects in BACnet.

The TD spec defers to JSON Schema for data modeling. JSON Schema provides the enum keyword for describing enumerations, but in practice the enum values are plain values such as numbers or strings, and there is no way to assign semantics to the individual values.

If we could assign semantics to enumerated values via a well-defined URI coming from an ontology, we could:

  • Handle cases where different parties use different terminology, e.g. "on" vs. "active"
  • Attach additional information to enumerated values such as colors, icons, translations
  • Map the enumerated values to the protocol requirements in protocol bindings. For instance, for multistate objects in BACnet, a BACnet protocol binding could convert semantic enumerations to integers; for a RESTful interface, the semantic enumerations could be mapped to the strings that the interface understands.
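
To make the last point more concrete, here is a minimal sketch of how a binding layer might translate semantic enumerated values into protocol-level values. Everything in it is an assumption for illustration: the `https://example.org/myont#` IRIs, the BACnet state numbers, the REST strings, and the `to_wire`/`from_wire` helpers are not taken from the TD spec or from any BACnet library.

```python
# Hypothetical IRIs and mappings, for illustration only.
ON = "https://example.org/myont#on"
OFF = "https://example.org/myont#off"
AUTO = "https://example.org/myont#auto"

# One table per protocol binding: semantic IRI -> value sent on the wire.
BINDINGS = {
    "bacnet": {OFF: 1, ON: 2, AUTO: 3},  # assumed multistate state numbers
    "http": {OFF: "off", ON: "on", AUTO: "auto"},  # assumed REST strings
}


def to_wire(binding: str, iri: str):
    """Translate a semantic enumerated value to its protocol-level value."""
    return BINDINGS[binding][iri]


def from_wire(binding: str, value):
    """Translate a protocol-level value back to its semantic IRI."""
    for iri, wire in BINDINGS[binding].items():
        if wire == value:
            return iri
    raise ValueError(f"{value!r} is not a known {binding} state")


print(to_wire("bacnet", AUTO))
print(from_wire("http", "on"))
```

The application code only ever handles the IRIs; which integer or string goes over the wire is decided by the table for the binding in use.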

How could this be solved? Would a simple solution such as interpreting JSON schema's enum keyword via context extension work? For instance: instead of enum: ["on", "off", "auto"] we would use enum: ["myont:on", "myont:off", "myont:auto"] where myont is specified in @context.

p.s. I use enumerated value to refer to a member (e.g. "on") of an enumeration, where enumeration refers to the whole grouping "on", "off", "auto". I think if we can model enumerated values semantically, different groupings of them in different enumerations can be sufficiently handled in the data model with existing enum mechanism.

See also

Google Digital Buildings refers to this need as well: https://github.com/google/digitalbuildings/blob/master/ontology/docs/ontology.md#multi-state-values

@egekorkan
Contributor

I think this is a valid use case. However, the example solution would not work, since a TD consumer would understand it as sending exactly myont:on instead of on. A messy solution would be:

{
//...
  "oneOf": [
    { "type": "string", "const":"on","@type":"myont:on" },
    { "type": "string", "const":"off","@type":"myont:off" },
    { "type": "string", "const":"auto","@type":"myont:auto" }
  ]
}
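
As a rough sketch of how a consumer could use this structure: each `const` is read as the value to send, and each `@type` as its meaning. The `semantic_index` helper is a hypothetical name, and the schema is the one above with the `//...` placeholder dropped so that it parses as plain JSON.

```python
import json

schema = json.loads("""
{
  "oneOf": [
    { "type": "string", "const": "on", "@type": "myont:on" },
    { "type": "string", "const": "off", "@type": "myont:off" },
    { "type": "string", "const": "auto", "@type": "myont:auto" }
  ]
}
""")


def semantic_index(data_schema: dict) -> dict:
    """Map each semantic annotation (@type) to the literal sent on the wire (const)."""
    return {
        option["@type"]: option["const"]
        for option in data_schema.get("oneOf", [])
        if "const" in option and "@type" in option
    }


index = semantic_index(schema)
# The consumer looks up the meaning it wants and sends the matching literal.
print(index["myont:on"])
```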

@fennibay
Author

fennibay commented Nov 6, 2020

Hi @egekorkan, many thanks for the feedback. I wasn't really hoping that that idea would work :-)

For my understanding, would the following also work?

{
  "oneOf": [
    { "@id": "myont:on" },
    { "@id": "myont:off" },
    { "@id": "myont:auto" }
  ]
}

I tried to express here that the consumer should interact with the Thing using IRIs.

@egekorkan
Contributor

Theoretically yes, if you want to annotate properly, but @id will not be understood by regular JSON Schema parsers or by those that rely on the existing TD vocabulary to generate payloads or UIs. oneOf is already one of the least used features in the WG, so I would not say that my proposal is very widely understood :/

@fennibay
Author

fennibay commented Nov 9, 2020

Well, oneOf is a standard feature of JSON Schema, so I think we can assume it will be supported. Who wouldn't understand it?

After further thinking, I understand that my essential need is enumerations of IRIs, instead of enumerations of strings or numbers. Once I have IRIs, I can relate them to an ontology, add specific protocol conversions, add translations...

So going for IRIs, this would be my attempt:

{
  "oneOf": [
    { "const": {"@id": "myont:on" } },
    { "const": {"@id": "myont:off" } },
    { "const": {"@id": "myont:auto" } }
  ]
}

From the JSON Schema point of view, the value would have to be an object equal to one of the three constants.

Interpreting with JSON-LD support would further conclude that these are individuals with given IRIs. On this basis, I can add protocol bindings:

{
    "@id": "myont:on",
    "htv:body": "on"
}

This way on HTTP-level simple strings (on, off, auto) would be transmitted, while a thing consumer works with the linked IRIs. I can also extend it further with translations and other mechanisms.

This is maybe a nicer solution than your "messy" one, but I understand that one would be more robust in that it would also work with non-JSON-LD-aware clients.

@egekorkan
Contributor

> Well, oneOf is a standard feature of JSON Schema, so I think we can assume it will be supported. Who wouldn't understand it?

I also agree but some are hard coding the meaning/parsing of DataSchema keywords rather than using JSON Schema based approaches.

Regarding the rest, I am not sure if I understand everything. As far as I know, @id is a JSON-LD feature and would not be understood by JSON Schema parsers? Also, putting "htv:body": "on", is bad practice in TD design since protocol related information should be only used in forms.

@sebastiankb
Contributor

I cannot tell whether this issue is more or less solved. If it is not, we should involve members from JSON Schema or JSON-LD here.

@fennibay
Author

fennibay commented Dec 3, 2020

@sebastiankb, @egekorkan, sorry I couldn't respond for a while.

In summary, I think we found two alternative solutions:

Alt. 1

Example:

{
  "oneOf": [
    { "const": {"@id": "myont:on" } },
    { "const": {"@id": "myont:off" } },
    { "const": {"@id": "myont:auto" } }
  ]
}
  1. Any kind of value, also when not part of an enumeration, should be modeled as an IRI if the author intends to have it well-defined.
  2. TD, being JSON-LD, represents an IRI as {"@id": "..."}
  3. We can construct an enum from that using the oneOf construct from JSON Schema
  4. A non-enum could also be represented as such.
  5. Further information can also be attached to this IRI, e.g.:
    1. Translations
    2. Symbols/icons
    3. Concrete representation for a protocol // should be under forms
    4. Also other IRIs having the same meaning can be mapped using owl:sameAs.
  6. This doesn't cover the case where the author writes TD only as JSON and not JSON-LD. But such an author probably doesn't have requirements regarding linked data, anyway. If the TD is written as JSON-LD, yet the consumer expects only a JSON, they would see a type of object with the structure {"@id": "..."}. This is a general problem of using JSON-LD vs. only supporting JSON, independent from TDs, I'd say.
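
A minimal sketch of what Alt. 1 asks of a consumer: the value is matched against the `const` objects, and the compact IRI is then expanded with the prefix from `@context`. The `https://example.org/myont#` prefix and both helper names are assumptions, and the prefix expansion is a deliberately simplified stand-in for real JSON-LD expansion.

```python
# Hypothetical context; the issue does not say what `myont` resolves to.
context = {"myont": "https://example.org/myont#"}

schema = {
    "oneOf": [
        {"const": {"@id": "myont:on"}},
        {"const": {"@id": "myont:off"}},
        {"const": {"@id": "myont:auto"}},
    ]
}


def is_allowed(value, data_schema: dict) -> bool:
    """True if the value equals exactly one of the const objects (oneOf semantics)."""
    matches = [opt for opt in data_schema["oneOf"] if opt.get("const") == value]
    return len(matches) == 1


def expand_compact_iri(compact: str, ctx: dict) -> str:
    """Expand 'prefix:suffix' using ctx; return the input unchanged if no prefix matches."""
    prefix, sep, suffix = compact.partition(":")
    if sep and prefix in ctx:
        return ctx[prefix] + suffix
    return compact


value = {"@id": "myont:auto"}
print(is_allowed(value, schema))
print(expand_compact_iri(value["@id"], context))
```

A consumer that only knows JSON would still pass the `is_allowed` check, but it would treat `{"@id": "myont:auto"}` as an opaque object, which is the limitation described in point 6.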

Alt. 2

{
  "oneOf": [
    { "const":"on","@type":"myont:on" },
    { "const":"off","@type":"myont:off" },
    { "const":"auto","@type":"myont:auto" }
  ]
}

// "type": "string" is not necessary, as the const value is a string literal

  1. Use constants, and add @type to attach semantics.
  2. This has the advantage that a plain JSON consumer would still understand it, and no mapping to a protocol value is needed when the two do not differ. So backwards compatibility is better.
  3. Further information can still be attached via the @type IRI, like alt. 1
  4. I see maybe just a conceptual disadvantage, where we categorize enum values as types instead of individuals. But I don't think this is a major issue, it's just a modelling preference.

Conclusion

We found ways to solve this problem, without needing to extend TD spec. IMHO Alt. 2 looks better.

I think we can close this issue. Many thanks for the discussion.

@egekorkan
Contributor

I also like the second alternative. However, I think the spec could include this as guidance on how to provide such information for enum values. I am quite sure that there are others who would be interested. Even putting "description" is a requirement in the profile spec (for the core profile), so that would be a way to have descriptions on enum values.

@fennibay
Author

After examining JSON-LD mechanisms further, I came up with another (IMO better) alternative:

Alt. 3

"@context": {"const": {"@type": "@vocab", "@context": {"on": "myont:On", "off": "myont:Off", "auto": "myont:Auto"}}},
"oneOf": [
    {"const": "on"},
    {"const": "off"},
    {"const": "auto"}
]
  1. This is similar to Alt. 1 in that the enum values are mapped to individuals and not types.
  2. The syntax is simpler than Alt. 1 and 2, except for the @context part ;-)
  3. The IRIs can be completely different than the strings. The mapping is flexible and local. I tried to show this with different casing.
  4. It falls back nicely to JSON, in that we just have simple strings at the end. // Although I don't understand yet in general, how a consumer without JSON-LD support can interpret TDs.
  5. @context could also be somewhere deeper or higher in the hierarchy.

Please also see the example in JSON-LD playground. The expanded form shows that the enum values are expanded to IRIs.

@egekorkan I understand this is all possible via context extension, foreseen in the spec. Do you think an extension to the spec is still necessary? We could provide an example, but I don't see an extension necessary at the moment.

@egekorkan
Contributor

I sadly don't understand the example, but that is my problem :) However, the JSON-LD playground throws an error.

@fennibay
Author

OK :-), I'll try to dissect the example:

The first @context makes an expanded term definition for const and says:

  • The @type of the values is @vocab, i.e. they are IRIs, and plain strings are also resolved against the currently active vocabulary, i.e. the terms in the context. Without this, the strings would remain just strings and would not be converted to IRIs.
  • The second, inner @context then adds some terms to the vocabulary, mapping the string literals on, off, auto to the desired IRIs.

Then when I use these terms as simple strings, they will be expanded to IRIs as defined by the context.

If a parser doesn't understand JSON-LD, @context will be ignored completely, and we will have string literals.
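
The effect described above can be imitated in a few lines. This is only a hand-written imitation of what `"@type": "@vocab"` plus a scoped context achieve, not the JSON-LD expansion algorithm; the `https://example.org/myont#` prefix and the `expand_const` helper are assumptions.

```python
# Term definitions taken from the inner @context of Alt. 3.
scoped_terms = {"on": "myont:On", "off": "myont:Off", "auto": "myont:Auto"}
# Hypothetical prefix mapping; the issue does not define it.
prefixes = {"myont": "https://example.org/myont#"}


def expand_const(value: str, jsonld_aware: bool) -> str:
    """Return what a consumer sees for a `const` string.

    A JSON-LD-aware consumer resolves the string as a term and gets an IRI;
    a plain JSON consumer ignores @context and keeps the literal.
    Simplification: strings that are not defined terms are returned unchanged.
    """
    if not jsonld_aware or value not in scoped_terms:
        return value
    prefix, _, suffix = scoped_terms[value].partition(":")
    return prefixes[prefix] + suffix


for literal in ("on", "off", "auto"):
    print(literal, "->", expand_const(literal, jsonld_aware=True))
```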

I tried the playground again, it worked for me. What do you get as the error message?

@mcr

mcr commented Dec 14, 2020

This issue was pointed to while discussing: ietf-wg-asdf/SDF#8

@mjkoster
Contributor

mjkoster commented Dec 25, 2020

Alternative #2 above is closest to the semantic annotation pattern we use for events, actions, properties, and data fields of complex data schemas in other, similar TD examples. It also allows adding labels, descriptions, and localization hints. I'd also recommend using "anyOf" (see the ASDF discussion referenced above, ietf-wg-asdf/SDF#8). Here is an example using OneDM-style example URIs.

{
  "anyOf": [
    {
      "description": "Manually override the automatic control and place the device in the powered state",
      "label": "on",
      "@type": "https://onedm.org/exploratory/#/sdfData/HoaSwitchState/sdfChoice/on",
      "const": "on"
    },
    {
      "description": "Manually override the automatic control and place the device in the un-powered state",
      "label": "off",
      "@type": "https://onedm.org/exploratory/#/sdfData/HoaSwitchState/sdfChoice/off",
      "const": "off"
    },
    {
      "description": "Apply the automatic control to the device state",
      "label": "auto",
      "@type": "https://onedm.org/exploratory/#/sdfData/HoaSwitchState/sdfChoice/auto",
      "const": "auto"
    }
  ]
}

@mjkoster
Contributor

mjkoster commented Dec 25, 2020

The file uploaded to OneDM that this references looks like this:

{
  "info": {
    "title": "Example file for H-O-A industrial control switch", 
    "version": "2020-12-24", 
    "copyright": "Copyright 2020 Michael J. Koster. All rights reserved.", 
    "license": "https://github.com/one-data-model/oneDM/blob/master/LICENSE"
  }, 
  "namespace": {
    "ex": "https://onedm.org/exploratory/"
  }, 
  "defaultnamespace": "ex", 

  "sdfObject": {
    "HoaSwitch": {
      "sdfProperty": {
        "SwitchState": {
          "sdfRef": "ex:#/sdfData/HoaSwitchState"
        }
      }, 
      "sdfAction": {
        "on": {}, 
        "off": {},
        "auto": {}
      }
    }
  },
  "sdfData": {
    "HoaSwitchState": {
      "sdfChoice": {
        "on": {
          "description": "Manually override the automatic control and place the device in the powered state",
          "label": "on",
          "default": "on"
        },
        "off": {
          "description": "Manually override the automatic control and place the device in the un-powered state",
          "label": "off",
          "default": "off"
        },
        "auto": {
          "description": "Apply the automatic control to the device state",
          "label": "auto",
          "default": "auto"
        }
      }
    }
  }
}
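
The `@type` IRIs in the anyOf example above appear to follow a regular pattern: the namespace IRI, then `#`, then the JSON pointer of the definition inside this file. A minimal sketch of generating them for the `sdfChoice` entries, assuming that pattern holds; `choice_iris` is a hypothetical helper, and the SDF document is trimmed to the parts it uses.

```python
sdf = {
    "namespace": {"ex": "https://onedm.org/exploratory/"},
    "defaultnamespace": "ex",
    "sdfData": {
        "HoaSwitchState": {
            "sdfChoice": {"on": {}, "off": {}, "auto": {}}
        }
    },
}


def choice_iris(doc: dict, data_name: str) -> dict:
    """Build one IRI per sdfChoice entry: <namespace>#<JSON pointer to the choice>."""
    base = doc["namespace"][doc["defaultnamespace"]]
    choices = doc["sdfData"][data_name]["sdfChoice"]
    return {
        name: f"{base}#/sdfData/{data_name}/sdfChoice/{name}"
        for name in choices
    }


for name, iri in choice_iris(sdf, "HoaSwitchState").items():
    print(name, iri)
```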

You could also add some more "@type" statements to the TD to annotate the TD itself, and its properties and actions:

"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfProperty/SwitchState"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfAction/on"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfAction/off"
"@type": "https://onedm.org/exploratory/#/sdfObject/HoaSwitch/sdfAction/auto"

@egekorkan egekorkan added the Needs discussion more discussion is needed before getting to a solution label Oct 26, 2021
@egekorkan
Contributor

Some new discussion on this:

  • enum is better for a developer who is working with the data schema. oneOf creates a more complicated structure than necessary.
  • We can keep enum but add another term that allows mapping the enum values to more semantically enriched terms. E.g. below:
{
//...
"enum": [4, 6, 123],
"enumMap": {
  // the map uses array indexes. This way, we are not coupled to value types in enum
  "0": {"description": "Lowest speed for fan", "@type": "myOnto:lowSpeed"},
  "1": {"description": "Medium speed for fan", "@type": "myOnto:midSpeed"},
  "2": {"description": "Maximum speed for fan", "@type": "myOnto:maxSpeed"}
  }
}
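
For illustration, this is roughly how a consumer could resolve such a map. Note that `enumMap` is only the term proposed in this comment, not an existing TD or JSON Schema keyword, and `annotate_enum` is a hypothetical helper.

```python
schema = {
    "enum": [4, 6, 123],
    "enumMap": {
        "0": {"description": "Lowest speed for fan", "@type": "myOnto:lowSpeed"},
        "1": {"description": "Medium speed for fan", "@type": "myOnto:midSpeed"},
        "2": {"description": "Maximum speed for fan", "@type": "myOnto:maxSpeed"},
    },
}


def annotate_enum(data_schema: dict) -> list:
    """Pair every enum value with the annotations stored under its array index."""
    enum_map = data_schema.get("enumMap", {})
    return [
        (value, enum_map.get(str(index), {}))
        for index, value in enumerate(data_schema["enum"])
    ]


for value, meta in annotate_enum(schema):
    print(value, meta.get("@type"), meta.get("description"))
```

A consumer that does not know `enumMap` simply ignores it and still sees a regular `enum`.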

@lu-zero This also relates to the data mapping discussion. This is a rather common use case in BACnet devices.

@egekorkan egekorkan added the data mapping workitem: discussions on data mapping concepts label Dec 21, 2023
@lu-zero
Contributor

lu-zero commented Dec 21, 2023

The overlap between enum and oneOf is fairly annoying; for the use case at hand, oneOf feels better.
I'd check with upstream JSON Schema, since this ambiguity should come up in a broader scope :/

In the end, enum is a oneOf of consts without the chance to use additional metadata fields from DataSchema.
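
That equivalence can be sketched as a small rewrite from `enum` to `oneOf`. The `enum_to_one_of` function and its `metadata` argument are hypothetical names; the per-value metadata is what `oneOf` allows and `enum` does not.

```python
def enum_to_one_of(data_schema: dict, metadata: dict = None) -> dict:
    """Rewrite {"enum": [...]} as {"oneOf": [{"const": ...}, ...]}.

    `metadata` optionally maps an enum value to extra DataSchema fields
    (for example "@type" or "description") merged into its const entry.
    Only hashable enum values (strings, numbers) can be used as keys.
    """
    metadata = metadata or {}
    rest = {k: v for k, v in data_schema.items() if k != "enum"}
    rest["oneOf"] = [
        {"const": value, **metadata.get(value, {})}
        for value in data_schema["enum"]
    ]
    return rest


converted = enum_to_one_of(
    {"type": "string", "enum": ["on", "off", "auto"]},
    metadata={"on": {"@type": "myont:on"}},
)
print(converted)
```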


7 participants