Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KHR_xmp_json_ld #1893

Merged
merged 25 commits into from
Jun 11, 2021
Merged

KHR_xmp_json_ld #1893

merged 25 commits into from
Jun 11, 2021

Conversation

weegeekps
Copy link

@weegeekps weegeekps commented Oct 28, 2020

Opening up the discussion for the new JSON-LD compatible version of KHR_xmp.

Some context: it's been identified that there are some inconsistencies in the original specification around KHR_xmp and it's relationship to JSON-LD. This new version of the extension is an attempt at clarifying and solidifying the relationship between KHR_xmp and JSON-LD. In regards to XMP, JSON-LD is important as the XMP team are currently nearing completion of a new ISO standard defining JSON-LD serialization of XMP. We'd like to align more closely to that while also providing additional guidance and restrictions within KHR_xmp in order to ensure both JSON and JSON-LD parsers can properly handle the data.

In short, JSON-LD won't be required or recommended, but if you want that extra control and validation within your KHR_xmp packets, you can have it.

This is largely an in progress pull request at the moment. Here are a list of tasks I've identified, and their progress.

  • Move @context within each packet.
  • Add language defining additional restrictions around JSON-LD.
  • Add section explaining how to use language alternatives.
  • Update JSON schema files.
  • Update included example.

This should be ready for discussion and review.

@emackey
Copy link
Member

emackey commented Nov 3, 2020

I hate to say it, but when I first read this in the sans-serif title font, I read it as KHR_xmp_Id. I've worked in a code base that has lots of SomeKindOfObjectId variables around, perhaps I'm being influenced by that memory here.

In sans-serif, KHR_xmp_ld and KHR_xmp_Id look almost identical.

KHR_xmp_ld
KHR_xmp_Id

In monospace, the difference is clearer, as one serif is missing (with the monospace font on my system at least).

KHR_xmp_ld
KHR_xmp_Id

Not sure if there's a good solution to this. Might it be worth breaking convention and using KHR_xmp_LD?

@weegeekps
Copy link
Author

That's a great point. I'm in favor of breaking convention here for clarity. If there are no protests I will make this change later today.

@weegeekps
Copy link
Author

While reading the latest published version of the ISO spec this morning it dawned on me that we may be able to allow the @type keyword without major harm to JSON parsers. We could add guidance that for JSON parsers they should ignore that keyword, and it would allow type validation for JSON-LD parsers, which I imagine most JSON-LD users would want.

@lexaknyazev, what do you think? Interested to hear your take on this.

@lexaknyazev
Copy link
Member

@emackey
"Id" and "ld" written in sans surely cannot be disambiguated without a context.

Imo, the question comes down to whether "a glTF extension name" is a strong enough context that is expected to always use lower-case (with deprecated spec-gloss being the only exception) thus implicitly resolving the issue. We could also make the whole string to use upper-case given that both XMP and LD are abbreviations.

@donmccurdy
Copy link
Contributor

What is the written phrase, "XMP LD", intended to mean? Or more specifically, is it common to use the abbrevation "LD" with that meaning anywhere except as part of the full "JSON-LD" acronym? I would be tempted to expand the name for clarity, rather than breaking convention, with something like KHR_xmp_json_ld, KHR_metadata_xmp, or even KHR_xmp_v2.

@weegeekps
Copy link
Author

The initial idea was to signify the support of the JSON-LD for XMP ISO spec somewhere in the name, but since ISO is trademarked we can't use something like KHR_xmp_iso. I wouldn't be opposed to KHR_xmp_json_ld but it's definitely more verbose and still has the potential readability problems.

That said, on the 3D Commerce call today it was brought up that since our convention is to do all lowercase after the vendor prefix, is this really an issue? While the uninitiated may make this mistake, they'll only make it once. Additionally, I'd expect most use cases are going to be some tool actually writing the metadata to the file. With the additional context of expanding the name out to KHR_xmp_json_ld I think it becomes even less of an issue.

@weegeekps weegeekps changed the title KHR_xmp_ld KHR_xmp_json_ld Nov 23, 2020
@donmccurdy
Copy link
Contributor

Do we have some updated sample models that implementors can test?

@lexaknyazev
Copy link
Member

After going through the requirements of this extension, it seems that some of them could be incompatible with ISO and XMP specs.

Ordered arrays
XMP:
See dc:creator and dc:date.

This spec:

The @list and @set keywords are forbidden.

ISO:

This JSON object shall have a property @list (which is used for ordered arrays in JSON-LD) and its value shall be a JSON array of items.

Language Alternatives
ISO:

It is not syntactically possible in JSON-LD to represent alternative arrays as it does not have serializations for RDF containers. So use rdf:Alt from RDF vocabulary to indicate the type of the property using @type keyword of JSON-LD.

In JSON-LD, serialization of language alternative is the same as alternative array valued XMP forms with qualifiers with only difference being that each item of the array will have @language JSON-LD keyword as its key.

This spec directly stores language alternative JSON objects without using rdf:Alt.

@weegeekps
Copy link
Author

Honestly, with @list and @set, I'm completely on board with us abandoning that restriction. I added it because it was one of the restrictions we came up with during our earlier discussions, but in retrospect there's really no need for such a restriction.

The language alternatives restriction is a potential issue, but the ISO spec seemingly has conflicting examples of what is acceptable. In section 4.5.5.3 it provides the provided example shows an unordered list in the provided XML with 4 list items, and in the JSON-LD these are provided in a @set with 3 of them using the a default language context. That said, in 4.5.5.5, it definitely mentions the use of rdf:Alt and looking back at the XML spec that does seem appropriate.

I'm fairly uneasy about going down this path with Language Alternatives as I'm afraid it's going to make using just JSON with this extension far more difficult. However, I'm not sure what else we could do to rectify the situation in this case. I don't believe it's acceptable at all for us to say to people, "yes, you will need a JSON-LD parser as well," as in that case we've not solved the problem for smaller users who don't have any interest or need for JSON-LD in their tech stack. When we ran our survey it was about a 50/50 split of respondents, so whatever we can do to meet the needs of both is ideal. Perhaps if we document this sufficiently on how to handle Language Alternatives, that will be enough.

@lexaknyazev
Copy link
Member

lexaknyazev commented Dec 7, 2020

Should both ways of storing unordered collections be supported (e.g. value objects with @set and plain JSON arrays)? If we're going to allow @list anyway, we could support only @set option for symmetry.

ISO spec seemingly has conflicting examples of what is acceptable. In section 4.5.5.3 it provides the provided example shows an unordered list in the provided XML with 4 list item

That section describes mapping of xml:lang attribute that could be arbitrarily present almost anywhere, not language alternative array types.

I'm fairly uneasy about going down this path with Language Alternatives as I'm afraid it's going to make using just JSON with this extension far more difficult.

I'm not sure how to quantify "far more difficult" here. The most annoying thing would be to include rdf prefix every time a packet contains any language alternative values. Without enforcing this one way or another, it seems that we couldn't claim ISO compatibility.

@weegeekps
Copy link
Author

I would say yes, lets support both @set and @list. Doesn't seem to be any harm in supporting either.

You're right about 4.5.5.3, thanks for setting me straight.

Regarding the difficulty with the rdf prefix, my concern is that a subset of very common JSON parsers that tend to be less flexible compared to their peers, such as Jackson for Java and Serde for Rust, make the rdf:_N prefix unfortunately difficult to handle in a graceful manner, and more-so since it also includes a "@type" parameter which is effectively a different value than the rest.

However, as you said, we can't claim ISO compatibility at all without supporting the rdf prefix, so I don't see what else we can do here. I'll make the necessary changes to the extension. Worst case scenario we can either write some libraries for reading this data or put out a guide on how to best serialize and deserialize the data for some of the more inflexible JSON parsers out there.

@lexaknyazev
Copy link
Member

I would say yes, lets support both @set and @list. Doesn't seem to be any harm in supporting either.

What do you think about requiring @set for unordered arrays? Supporting direct storage would increase polymorphism there.

@weegeekps
Copy link
Author

I'm on board with that. It's relatively painless to deal with for implementers and we gain quite a bit from it.

@UX3D-hohenester
Copy link

We have adapted an existing xmp asset to the new version.
box-json-id.zip

Is the extension used correctly?

@weegeekps
Copy link
Author

weegeekps commented Jan 25, 2021

I just merged in a PR with the edge cases. You can find them in the examples/edge-cases path in the glTF-Metadata-CLI repository.

@UX3D-hohenester I took a look at that zip and that does look correct to me. I'm hoping to have the migration feature in the CLI tool released sometime this or next week.

@emackey
Copy link
Member

emackey commented Jan 28, 2021

@weegeekps Can the samples be moved to a separate PR on the glTF-Sample-Models repo? We don't want sample models directly in the spec repo anymore. Thanks!

@weegeekps
Copy link
Author

@emackey Yes. I've just done so and it can be found at KhronosGroup/glTF-Sample-Models#291. Please let me know if there are any changes that need to be made to it. I didn't include a screenshot as the important bit is the metadata inside, not the visual appearance.

@lexaknyazev
Copy link
Member

lexaknyazev commented Feb 3, 2021

More nits.

  • Struct-type XMP properties require their own namespaces. We should list them for completeness:

    Preferred Prefix Namespace Usage
    xmpGImg http://ns.adobe.com/xap/1.0/g/img/ xmp:Thumbnails
    stRef http://ns.adobe.com/xap/1.0/sType/ResourceRef# xmpMM:DerivedFrom, xmpMM:Ingredients, xmpMM:ManagedFrom
    stEvt http://ns.adobe.com/xap/1.0/sType/ResourceEvent# xmpMM:History
    stVer http://ns.adobe.com/xap/1.0/sType/Version# xmpMM:Versions
  • I couldn't find any normative language about non-string data types. XMP defines Boolean, Integer, and Real as strings. ISO/DIS 16684-3 doesn't mention that and provides inconsistent examples (search for "xmp:Rating" there). Our spec maps them to JSON types which may be too strict. Needs some investigation.

  • The description of Language Alternatives in the "XMP Core Properties" section of this extension is outdated.

  • The schema file for packet's instantiating is specific to glTF nodes (see the filename and the property description). We should either generalize it or duplicate it for each supported object type.

  • Maybe mention that JSON-LD and XML use different notation for the default language (from ISO/DIS 16684-3):

    In the XML serialization, the default language is selected via x-default, but in JSON-LD the default is i-default.

  • Review Pantry-Ingredient process:

    • Fields within ResourceRef objects (e.g. values of xmpMM:Ingredients array) should probably use stRef namespace.
    • Maybe mention that implementations should be resilient to non-matching prefixes between ingredients since they will eventually share the same @context. For example in a case when asset A calls Dublin Core as dc while asset B calls the same namespace as dc11.
  • Add a normative reference to JSON-LD spec (https://www.w3.org/TR/json-ld11/).

@lexaknyazev
Copy link
Member

Couple more notes:

  • ISO/DIS 16684-3 requires use of compact IRIs (prefix:suffix). As a side-effect, this leads to @base and @vocab keywords being always ignored. We should probably mention that explicitly here as not everyone would have access to the ISO spec.
  • @language could be defined on @context level thus marking all Text values. Also, language could be redefined (e.g. with "@language": "fr") or cleared (with "@language": null) by using value objects or nested contexts.

@weegeekps
Copy link
Author

Good nits, @lexaknyazev.

I couldn't find any normative language about non-string data types. XMP defines Boolean, Integer, and Real as strings. ISO/DIS 16684-3 doesn't mention that and provides inconsistent examples (search for "xmp:Rating" there). Our spec maps them to JSON types which may be too strict. Needs some investigation.

In the original KHR_xmp extension specification we specified Boolean, Integer, and Real should use the relevant JSON types, but I agree that this might be too strict. While ISO/DIS 16684-3 has been approved, the FDIS version hasn't been published yet and I'm wondering if there will be any non-normative changes to in document examples that may provide clarity here.

@MiiBond Do you have contact with Leonard Rosenthal or anyone else on the XMP side at Adobe that might be able to help us answer this question?

@lrosenthol
Copy link

I'm here :). Let me go through the comments...

Let me start by saying that the spec is currently on way to publication, but we are going to see if we can pull it back to resolve some of these excellent findings!!

I couldn't find any normative language about non-string data types. XMP defines Boolean, Integer, and Real as strings. ISO/DIS 16684-3 doesn't mention that and provides inconsistent examples (search for "xmp:Rating" there). Our spec maps them to JSON types which may be too strict. Needs some investigation.

You are absolutely correct - this isn't covered and the examples are inconsistent. Ugh! https://github.com/ISOTC171SC2/XMP-JSON-LD/issues/3

The description of Language Alternatives in the "XMP Core Properties" section of this extension is outdated.

Can you say more about this? Not understanding the issue here...

Review Pantry-Ingredient process:

All good points here. Let me know if you need me to review anything on your side.

@lexaknyazev
Copy link
Member

The description of Language Alternatives in the "XMP Core Properties" section of this extension is outdated.

Can you say more about this? Not understanding the issue here...

That refers to some leftover descriptions in our spec from non-JSON-LD version of the extension.

@weegeekps
Copy link
Author

weegeekps commented Feb 9, 2021

Let me start by saying that the spec is currently on way to publication, but we are going to see if we can pull it back to resolve some of these excellent findings!!

@lrosenthol I'm wondering if we should hold off on ratification ourselves until this is resolved on your end. I'd like for us to get this right. If you manage to pull it back, do you have an idea of how long it will take to get a draft that resolves these issues?

You are absolutely correct - this isn't covered and the examples are inconsistent. Ugh! ISOTC171SC2/XMP-JSON-LD#3

Just an FYI: that link seems to be inaccessible. Perhaps it's a private repository?

@lrosenthol
Copy link

@weegeekps that's up to you folks, but I don't see it as necessary. I will make the updates this weekend and that should mean that we don't lose any time...

@weegeekps
Copy link
Author

@lrosenthol has there been any progress on your updates? We have decided to wait until we can get a glance at the next draft of the ISO spec so we can make sure we are fully compliant.

@weegeekps
Copy link
Author

@lrosenthol, have you made any progress regarding your updates to the ISO spec? Anything you can share us regarding what has changed would be helpful and much appreciated for getting this extension moving again.

@weegeekps
Copy link
Author

@lrosenthol @MiiBond Could we get some guidance on how we should handle Booleans, Integers, and Reals? I know you are updating the ISO spec now to fix this issue, but there very strong chance that we will soon be normatively referenced by others very soon, and we'd like to wrap this up as soon as possible given that. If we can get an answer to this one question, I think we can confidently go forward with ratification on our side. Thank you!

@weegeekps
Copy link
Author

I had an email conversation with @lrosenthol and we have the answers we need. The decision is to use the JSON types for Booleans, Integers, and Reals, which means there are no normative changes left. I am working on an update that will shore up our last nits and typos, and we are hoping to bring this to a ratification vote soon.

@weegeekps
Copy link
Author

I went through and finally pushed my updates based on your last nits and the decision on the Boolean, Integer, and Real issue, @lexaknyazev. I have a few questions bout how to handle a few of these.

* The schema file for packet's instantiating is specific to glTF nodes (see the filename and the property description). We should either generalize it or duplicate it for each supported object type.

Could you expand on this a bit? I'm not quite sure I follow.

Review Pantry-Ingredient process:

* Fields within `ResourceRef` objects (e.g. values of `xmpMM:Ingredients` array) should probably use `stRef` namespace.

* Maybe mention that implementations should be resilient to non-matching prefixes between ingredients since they will eventually share the same `@context`. For example in a case when asset _A_ calls Dublin Core as `dc` while asset _B_ calls the same namespace as `dc11`.

Do we have a suggested approach for dealing with this? I guess just normalize based on the actual namespace URI? Seems obvious but I want to confirm, I imagine implementers will want some direction on this.

* `@language` could be defined on `@context` level thus marking all Text values. Also, language could be redefined (e.g. with `"@language": "fr"`) or cleared (with `"@language": null`) by using value objects or nested contexts.

Are you talking about the root @context level? I know we have language around local @context only being allowed for @language, but I don't believe we have any around root @context and if that's what you're referring to, I will add some language to bring attention to that.

@weegeekps
Copy link
Author

@lexaknyazev, @emackey, now that this is ratified, I'd like to get this merged in by Monday prior to the wider announcement. Do either of you (or anyone else) have any final comments before I merge?

@emackey
Copy link
Member

emackey commented Jun 11, 2021

I'm not an XMP user, but from a glTF perspective this looks fine to me. Actually I think we should have merged this when it entered ratification. I'll merge it now.

@emackey emackey merged commit b34d804 into KhronosGroup:master Jun 11, 2021
@PsycoTodd
Copy link

I have one question. Do we allow multiple reference of xmp_json_ld from one structure (mesh, node, etc.)

For example, as below. Or this case should handled at the packet definition, aka the person who define the extension should merge the json-ld if there is this case, and always keep reference contains one extension? thank you

"materials" : [
{
"pbrMetallicRoughness": {
"baseColorFactor": [
0.8,
0.8,
0.8,
1
],
"metallicFactor": 0,
"roughnessFactor": 0.1
},
"extensions": {
"KHR_xmp_json_ld": {
"packet": 0
},
"KHR_xmp_json_ld": {
"packet": 2
}

},
"emissiveFactor": [
0,
0,
0
],
"alphaMode": "OPAQUE",
"doubleSided": false
},

@bghgary
Copy link
Contributor

bghgary commented Nov 27, 2023

JSON doesn't allow duplicate keys in one object, so it's not supported even if we wanted to. The forum might be a better place to ask these types of questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants