Using "@language": "en-US" in context producing adverse effects #459

Closed
stuartasutton opened this issue Sep 7, 2017 · 49 comments

@stuartasutton
Contributor

stuartasutton commented Sep 7, 2017

Nate, the use of "@language": "en-US" in the json-ld is resulting in every instance of a literal being tagged as en-US even where it shouldn't. For example:

In Turtle:

ceterms:ctid "ce-CB8EB8D0-208D-4DA0-9F50-7003F8A39ED6"@en-US ;

and in RDF/XML:

<ceterms:ctid xml:lang="en-US">ce-CB8EB8D0-208D-4DA0-9F50-7003F8A39ED6</ceterms:ctid>

@siuc-nate
Contributor

That's...interesting. It seems like that would be a problem for any converter/parser, even one that was working in JSON directly, if it didn't know which literals were human-readable and which were just strings.

I wonder how that's been tackled by other users of JSON-LD.

Well, the only other thing that comes immediately to mind would be to change all of the human-readable string properties to arrays of objects with one object each which contains a @language and @value property, and then regenerating the schema validation document to permit it, and then republishing everything in the registry...that would be quite a headache. Is there no better solution?

@stuartasutton
Contributor Author

Nate, apparently, this is a long-standing problem with json-ld. Here is an issue (quite looong) from the json-ld repo (json-ld/json-ld.org#133) back in 2012.

It is something that needs to be resolved. The ideal person (as you can tell from this thread) is Gregg Kellogg (who is currently recovering from very recent heart surgery).

@siuc-nate
Contributor

siuc-nate commented Sep 7, 2017

I was reading that very issue thread earlier today, along with other discussions and the language map section of the JSON-LD documentation (which is lacking in examples and not very useful). I really like the original proposal, as it is the most succinct:

"label": {
  "en": "The Queen",
  "de": "Die Koenigin"
}

Though the use of "@container": "@language" in the context would seem to preclude using any other sort of container elsewhere; to that end I would almost prefer

"label": {
  "@type": "@language",
  "en": "The Queen",
  "de": "Die Koenigin"
}

as an option. But I don't think we'd necessarily need that for our purposes.
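
A rough sketch of what such a language map amounts to once expanded, assuming a simplified model rather than a real JSON-LD processor (the function name `expand_language_map` is made up for illustration): each language key becomes the `@language` of its own value object.

```python
# Simplified model of language-map expansion; NOT a real JSON-LD processor.
# expand_language_map is a hypothetical helper name.

def expand_language_map(language_map):
    """Turn {"en": "The Queen", ...} into a list of language-tagged values."""
    values = []
    for language, text in sorted(language_map.items()):
        # A map entry may hold one string or a list of strings.
        texts = text if isinstance(text, list) else [text]
        for item in texts:
            values.append({"@value": item, "@language": language})
    return values


label = {"en": "The Queen", "de": "Die Koenigin"}
expanded = expand_language_map(label)
for value in expanded:
    print(value)
```

Only properties written as a language map get tags this way, so a plain string such as an identifier would be left alone.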

Anyway yes, there does not seem to be any definitive answer on this - the thread being several years old doesn't help much.

In the meantime, what should we do? The project is moving forward with or without a resolution to this.

@stuartasutton
Contributor Author

But there has been more discussion, with multiple issues and notions intertwining, including the use of named graphs. Anyway, in these threads there are at least four familiar people with the necessary expertise, any one of whom might cast some light on the issue:

Gregg Kellogg (gkellogg) (editor of the JSON-LD spec.)
David Longley (dlongley)
Ruben Verborgh (RubenVerborgh)
Manu Sporny (msporny) (editor of the JSON-LD spec.)

@stuartasutton
Contributor Author

Yes, this is not something that's a show stopper or should slow us down; but, it nevertheless needs to be resolved.

@siuc-nate
Contributor

I could also see the potential usefulness of another approach that seemed to show up in that thread; I think it was called property-mapped? Something like:

"en": {
  "label": "The Queen"
},
"de": {
  "label": "Die Koenigin"
}

Since it would scale pretty well, and have minimal impact in cases where only one language was in use:

"en": {
  "label": "The Queen",
  "description": "English description",
  "another property": "English value"
},
"de": {
  "label": "Die Koenigin",
  "description": "German description",
  "another property": "German value"
}

Instead of

"label": {
  "en": "The Queen",
  "de": "Die Koenigin"
},
"description": {
  "en": "English description",
  "de": "German description"
},
"another property": {
  "en": "English  value",
  "de": "German value"
}

But then the problem becomes translating between that structure and an actual programming-friendly structure. You either have to do a lot of custom serialization/deserialization or give each class in the schema its own unique language container class, or one shared class but with some custom means of conveying rules about which properties are or aren't used in it...and then you also have to figure out how to handle it in the schema itself, if at all. So in the grand scheme of things the original proposal (properties as containers for languages rather than languages as containers for properties) ends up being easier to deal with, I think.
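
To make the trade-off concrete, here is a minimal sketch of converting between the two layouts, assuming plain dictionaries and one string per property per language. The function names `to_language_maps` and `to_property_mapped` are hypothetical and not part of any library.

```python
# Pivot between "languages as containers for properties" and
# "properties as containers for languages". Hypothetical helper names.

def to_language_maps(by_language):
    """Pivot {language: {property: text}} into {property: {language: text}}."""
    by_property = {}
    for language, properties in by_language.items():
        for name, text in properties.items():
            by_property.setdefault(name, {})[language] = text
    return by_property


def to_property_mapped(by_property):
    """Pivot {property: {language: text}} back into {language: {property: text}}."""
    by_language = {}
    for name, translations in by_property.items():
        for language, text in translations.items():
            by_language.setdefault(language, {})[name] = text
    return by_language


property_mapped = {
    "en": {"label": "The Queen", "description": "English description"},
    "de": {"label": "Die Koenigin", "description": "German description"},
}
language_maps = to_language_maps(property_mapped)
print(language_maps["label"])
```

The pivot itself is small; the harder part described above is typing the language containers in each class of a strongly typed codebase.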

@siuc-nate
Contributor

siuc-nate commented Sep 7, 2017

Yes, this is not something that's a show stopper or should slow us down

I want to agree, but the handling of all human-readable string properties across the schema is far from trivial to change once we start building entire systems around it. It'll be especially hard once we start having data in the registry that is coming directly from other partners and isn't easy for us to just wipe out and republish with an updated format (in fact, it'll be impossible to do that legitimately, as it would violate third-party envelope signatures).

In that sense, it kind of is a showstopper, now that I think about it some more.

In the meantime, would it be worth creating a ceterms:LanguageMap class or something, and using that as the range of all relevant rdfs:Literal properties?

@stuartasutton
Contributor Author

stuartasutton commented Sep 7, 2017

@siuc-nate, I'm jumping off to email to discuss options.

@stuartasutton
Contributor Author

stuartasutton commented Sep 8, 2017

The problem is that the use of @language in the @context assumes that all literals, unless datatyped otherwise, are language-tagged strings. In our data, some of the literals falsely identified as language-tagged strings arise because we've not datatyped values in the descriptions where the CTDL says we should. For example, taking the Western Governors BS in Information Technology, we have the following JSON-LD for the properties ceterms:minimumDuration and ceterms:maximumDuration, which are datatyped as xsd:duration in the CTDL.

"ceterms:maximumDuration": "P3Y",
"ceterms:minimumDuration": "P1Y"

...where we should have

"ceterms:maximumDuration": {
      "@value": "P3Y",
      "@type": "http://www.w3.org/2001/XMLSchema#duration"
},

"ceterms:minimumDuration": {
      "@value": "P1Y",
      "@type": "http://www.w3.org/2001/XMLSchema#duration"
}

The result is that when our current data is transformed to turtle, we get the following values falsely language tagged:

ceterms:maximumDuration "P3Y"@en-US ;
ceterms:minimumDuration "P1Y"@en-US

where we should see:

ceterms:maximumDuration "P3Y"^^xsd:duration ;
ceterms:minimumDuration "P1Y"^^xsd:duration

Alternatively, the types for these properties could be handled in the @context through type coercion.
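
A minimal sketch of the type-coercion idea, assuming a deliberately simplified model of how a processor picks between a coerced datatype and the default language. The context entries and the helper `to_turtle_literal` are illustrative assumptions, not the project's actual context, and no string escaping is attempted.

```python
# Simplified model: a term with "@type" in the context is datatyped,
# anything else falls back to the default "@language".
# to_turtle_literal is a hypothetical helper; it does no escaping.

XSD = "http://www.w3.org/2001/XMLSchema#"

context = {
    "@language": "en-US",
    # Assumed coercion entries for the two duration properties.
    "ceterms:maximumDuration": {"@type": XSD + "duration"},
    "ceterms:minimumDuration": {"@type": XSD + "duration"},
}


def to_turtle_literal(term, value, context):
    """Render a string value as a Turtle literal under the given context."""
    definition = context.get(term)
    if isinstance(definition, dict) and "@type" in definition:
        datatype = definition["@type"]
        return f'"{value}"^^<{datatype}>'
    language = context.get("@language")
    if language:
        return f'"{value}"@{language}'
    return f'"{value}"'


print(to_turtle_literal("ceterms:maximumDuration", "P3Y", context))
print(to_turtle_literal("ceterms:currency", "USD", context))
```

Under this model the durations come out datatyped, while any property without a coercion entry still picks up the default language tag.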

This does not handle the other two problematic language taggings in the Western Governors degree CER record, ceterms:currency and ceterms:ctid. Our JSON-LD encoding translated to Turtle looks like the following:

ceterms:ctid "ce-CB8EB8D0-208D-4DA0-9F50-7003F8A39ED6"@en-US ;
ceterms:currency "USD"@en-US ;

If we had created and declared an XSD datatype for CTID, that would solve the ceterms:ctid problem. As for ceterms:currency, I am not sure that the following would work, since ISO 4217 is not formally declared as a "datatype" (schema.org alludes to http://en.wikipedia.org/wiki/ISO_4217):

"ceterms:currency": { 
    "@value": "USD",
    "@type": "https://www.currency-iso.org/"
}

And, for ceterms:ctid with a project defined datatype:

"ceterms:ctid": { 
    "@value": "ce-CB8EB8D0-208D-4DA0-9F50-7003F8A39ED6",
    "@type": "http://[ our defined datatype for CTID ]"
}

We do need to take steps to discover whether any other values in the CER data are being falsely language-tagged and correct the encoding process accordingly.
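
One possible way to approach that review, sketched under assumptions: given a set of properties that are meant to carry language tags, walk a record and list every other property that holds a plain string. The function name `find_suspect_strings`, the set of language properties, and the sample record are all illustrative and not taken from the CER.

```python
# List properties holding plain strings that are NOT meant to be language-tagged.
# find_suspect_strings and the sample data are hypothetical.

def find_suspect_strings(node, language_properties, path=""):
    """Return paths of plain-string properties outside language_properties."""
    suspects = []
    if isinstance(node, list):
        for item in node:
            suspects.extend(find_suspect_strings(item, language_properties, path))
    elif isinstance(node, dict) and "@value" not in node:
        for key, value in node.items():
            if key.startswith("@"):
                continue  # keywords such as @id, @type, @context
            child = f"{path}/{key}" if path else key
            values = value if isinstance(value, list) else [value]
            for item in values:
                if isinstance(item, str):
                    if key not in language_properties and child not in suspects:
                        suspects.append(child)
                else:
                    suspects.extend(
                        find_suspect_strings(item, language_properties, child)
                    )
    return suspects


# Assumed set of properties intended to be human-readable text.
language_properties = {"ceterms:name", "ceterms:description"}

record = {
    "@type": "ceterms:BachelorDegree",
    "ceterms:ctid": "ce-CB8EB8D0-208D-4DA0-9F50-7003F8A39ED6",
    "ceterms:name": "BS in Information Technology",
    "ceterms:estimatedCost": [{"ceterms:currency": "USD"}],
    "ceterms:maximumDuration": {
        "@value": "P3Y",
        "@type": "http://www.w3.org/2001/XMLSchema#duration",
    },
}

print(find_suspect_strings(record, language_properties))
```

Values already written as `@value` objects are skipped, since their type or language is explicit.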

@siuc-nate
Contributor

Thanks, Stuart.

I assume this applies not only to our schema serializations, but also the data we publish from the workit site, correct?

@stuartasutton
Contributor Author

Nate, I was just going to do a review of the schema serialization :-) Until I've done that, I can't say whether the problems are shared across schema and record encoding and thus on to Workit.

As part of that schema review, I'll be looking for other properties that might fall in the basket of falsely [language] accused.

@siuc-nate
Contributor

Sorry, I should have been more clear - I meant that our workit serializations (specifically the literal fields) will also need to be altered to be more conversion-friendly (whatever that may entail). Though before we go doing that, I do wonder if there is a sufficiently valid use case for it.

@stuartasutton
Contributor Author

stuartasutton commented Sep 8, 2017 via email

@siuc-nate
Contributor

siuc-nate commented Sep 8, 2017

The functionality of the Registry is what I was referring to, yes - records are only ever JSON there, and the registry is meant to facilitate interoperability between systems (that would presumably speak JSON).

You are right about cases where the data is in the wild, but it does complicate the serialization/deserialization process significantly. This will likely be a deterrent to adoption - if developers are able to create and get a more streamlined JSON record, they are less likely to have problems figuring out how to deal with its data. The schema documentation will tell them what the datatypes are; the data itself doesn't need to (and if it tries to, then the schema documentation becomes less immediately useful, since the values of many "literals" will actually be objects).

In other words, it's a lot easier to get developer buy-in if they only need to output and import

{
  "ceterms:myProperty": "my value",
  "ceterms:ctid": "ce-[GUID]",
  "ceterms:trueOrFalse": true
}

instead of

{
  "ceterms:myProperty": [
    {
      "@value": "my value",
      "@language": "en-US"
    }
  ],
  "ceterms:ctid": {
    "@value": "ce-[GUID]",
    "@type": "ceterms:ctid"
  },
  "ceterms:trueOrFalse": {
    "@value": true,
    "@type": "xsd:boolean"
  }
}

As a [third-party/partner] developer, I would balk at the latter, and wonder why we need a standalone schema at all if every property is going to tell me its type anyway.

So, I think we need to be very careful in how we solve this. The primary use case we need to serve is the communication of data between systems coded to understand CTDL; I would think that worrying about whether or not the data also converts smoothly to other formats should be an important, but secondary, concern. That's just my opinion, though; I can be swayed.

@stuartasutton
Contributor Author

stuartasutton commented Sep 8, 2017

Nate, this is a BIG point. First, the conversion to other RDF formats is not a "secondary concern"; it is both (1) the test of how well the CTDL encoding conforms to RDF and (2) necessary to full functioning on the open web, where not everyone in the Semantic Web context relies on JSON-LD. So, JSON-LD data that can't translate into flavors of RDF without rendering incorrect data doesn't seem to me to fit the goals of this project.

As for developers, the data standing alone has to carry (one way or another) the information necessary for proper handling, and that includes both datatyping and language-tagging. Downstream systems should not have to reference a schema to figure out how to process the data.

Remember that there are usually developers on both ends of that transaction. The goal is smart, complete data. What's the price of sending incomplete, "dumb" data?

{
  "ceterms:myProperty": "Moje hodnota",
  "ceterms:ctid": "ce-[GUID]",
  "ceterms:trueOrFalse": skutečný
}

@siuc-nate
Contributor

siuc-nate commented Sep 8, 2017

I will accept your point about the data on the open web. However, I still think it is a greater burden to developers to code for a data model that is so much more complex.

I'm not sure I understand your point about "not having to reference a schema". If the data carries all of the typing needed to process it, then what is the function of a schema? Does it only serve to define which properties go with which class? As a developer, I would think that the schema (or more specifically, a profile thereof) should tell me everything I need to know about how to process the data, so that I can write efficient code that handles "dumb" data by being a smarter system.

Consider a simple (pseudocode) class for the example:

class MyClass {
  string myProperty;
  string ctid;
  bool trueOrFalse;
}

Compared to what would be needed to handle the fully described example:

class MyClass {
  Literal myProperty;
  Literal ctid;
  Literal trueOrFalse;
}

class Literal {
  string Value;
  string Type;
}

If my code knows what types of values to expect for which properties in the data, it would know that your example contains faulty information (specifically the non-boolean boolean) and reject it. It is much more cumbersome to write code that must check each property for its type, and then determine what to do with it. If the schema (and therefore the types of the data) are known already, then it is also redundant to have to check. It also means processing a deeper, more complicated data structure for each property - and the data itself is bigger and much harder to read (as a human).

All of that also leads to documentation that is more confusing to write and read (again, discouraging developer buy-in), which in turn leads to a larger number and variety of chances for errors to occur.

Perhaps this is a point of friction between the world of RDF and the world of object-oriented programming. It is definitely possible to write code that handles fully-described data, but it is also unnecessary if the value types are known from the schema.

In the second example above, I now have to ensure against null objects for each property, dive into each object to set or get its value, and write handlers that can render, serialize, deserialize, store, query, load, etc., the additional pieces properly. The additional complication is an order of magnitude greater this way, and for no benefit since I already know what the types are from the schema. At the very least, I have to convert to and from this more complex format and something simpler used in the rest of my code - again without benefit due to already knowing the types. It also introduces the chance for typo-based errors (i.e., trueOrFalse.Type = "booll") that wouldn't be caught by a compiler or IDE, making debugging harder (you could get around this if your language supports enums, but that adds another layer of complexity during serialization/deserialization).

In other words, given a well-documented schema, the price for producing and processing "dumb" data is lower than the price of producing and processing "smart" data.

Having said that, there is one case where having the data type embedded in the data is necessary - when the range contains more than one type of data. If the range for a given property includes two different types of objects, then there is no avoiding writing code that must check the data first and then determine how to process it. We have some properties like this in CTDL.

I'm not sure where this leaves us in terms of solutions. I'm prepared to accept that we have to use the more complex format, in the end, regardless - but we must be prepared to answer developers who want us to explain it.

Perhaps the Registry Assistant API should be a two-way street? That is, not only should it convert from simplified classes to full CTDL, but also from CTDL back to simplified classes (give it a CTID, it fetches the record from the registry, converts it, and gives it to you)? Is that a possible solution?
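
A minimal sketch of what such a two-way conversion could look like, assuming a small lookup table derived from the schema. The table `SCHEMA` and the functions `describe` and `simplify` are hypothetical and say nothing about how the Registry Assistant API actually works.

```python
# Two-way conversion between a "simple" record and a fully described one.
# SCHEMA, describe and simplify are hypothetical names for illustration.

SCHEMA = {
    "ceterms:myProperty": {"language": "en-US"},   # human-readable text
    "ceterms:ctid": {"type": "ceterms:ctid"},      # project-defined datatype
    "ceterms:trueOrFalse": {"type": "xsd:boolean"},
}


def describe(simple, schema):
    """Add language tags and datatypes to a simple record using the schema."""
    full = {}
    for name, value in simple.items():
        rule = schema[name]
        if "language" in rule:
            full[name] = [{"@value": value, "@language": rule["language"]}]
        else:
            full[name] = {"@value": value, "@type": rule["type"]}
    return full


def simplify(full):
    """Strip tags and datatypes, keeping only the first value of each property."""
    simple = {}
    for name, value in full.items():
        first = value[0] if isinstance(value, list) else value
        simple[name] = first["@value"]
    return simple


simple = {
    "ceterms:myProperty": "my value",
    "ceterms:ctid": "ce-[GUID]",
    "ceterms:trueOrFalse": True,
}
full = describe(simple, SCHEMA)
print(full)
print(simplify(full))
```

Note that `simplify` keeps only the first value of a multi-language property, so the round trip is lossless only for single-language records.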

@stuartasutton
Contributor Author

stuartasutton commented Sep 9, 2017

@siuc-nate, I'm sympathetic to the implementation issues of trying to make the solution stack behave like a triplestore

@stuartasutton
Contributor Author

stuartasutton commented Sep 9, 2017

@siuc-nate and @jkitchensSIUC, let me see if I can summarize (and characterize) the points so far.

  1. Functions of the RDF schema.

It seems that your position, Nate, is that the JSON-LD credential instance data generated need not include language tags or datatypes because the necessary information regarding them is embodied in the RDF schema for CTDL and that systems can just look it up. If this is a correct reading, it would look like this:

[Image: 2017-09-08-rdf_instance_data-a]

There are a couple of issues here. First it's a "closed systems" view that assumes that the CER datastore is merely passing a record to some 3rd party system and that the 3rd party system looks up the schema and fills in the blanks in the incomplete record. This is somewhat of a misunderstanding of the role of the RDF schema. When a 3rd party triplestore ingests RDF data in this manner, it could (but doesn't have to) use the RDF schema to infer new data in terms of the defined semantics in the schema such as creating an inferred triple from a declaration of owl:inverseOf on some property. It could not handle a missing language tag because it hasn't a clue about the actual language of a particular literal value. It does not assume that intended data is missing...i.e., it doesn't "fill in the blanks". It looks like this:

[Image: 2017-09-08-rdf_instance_data-c]

But the problem of "closed system" thinking is that closed systems are just one scenario of concern to us. It does not take into account Linked Data and how it works. There are two fundamental linking contexts where: (1) a system uses linked data to reach out through a link to a second system to retrieve some data that it uses in a mashup; and (2) an end user clicks on a link in a browser to retrieve a credential description to display. In the latter scenario, the browser assumes the instance data it gets is "complete". It does not look up anything, it just presents. It looks like this:

[Image: 2017-09-08-rdf_instance_data-d]

So, my take is that the JSON-LD data in the CER that is presented to the world has to be complete and encoded so that it translates accurately into other RDF serializations. Of course, lesser quality data could be produced in JSON-LD without either language tags or datatypes identified if the Registry somehow adds the datatypes and language tags in the exposed data. Just don't assume that some consuming system is going to do the job of completing those records by looking up datatypes and inferring appropriate language tags.

  2. Too much encoding detail (language tags and datatypes) creates push-back.

@siuc-nate If your assumption is that the technical powers that be will balk or not buy in, then the Registry would need to ingest the incomplete data and fill in the blanks before quality data could be exposed, exported or linked to.

I do find it interesting that tech folks like JSON-LD so much if they don't like "excessive" encoding when it is the most verbose language available for encoding linked data. If you take the CER's JSON-LD record for the Western Governors BS in Information Technology encoding and translate it into other serializations including datatypes etc., here are the stats:

Serialization   Line Count
Turtle          161
N-Triples       177
RDF/XML         258
JSON-LD         377

  3. Primary and secondary concerns.

Nate, you state:

"I would think that worrying about whether or not the data also converts smoothly to other formats should be an important, but secondary, concern."

This in essence says that producing valid RDF data through transformation of the JSON-LD serialization is a secondary concern. I don't agree. It basically says that it's OK for data transformations to generate poor or incomplete RDF. I think that both are primary concerns; however, we need to leave a decision here to Jeanne and others.

@stuartasutton
Contributor Author

In the end, Nate, the biggest issue in the instance data is with language tags, since JSON handling of other datatypes is already extremely limited (e.g., no datatypes for forms of dates etc.), so the full expressiveness of CTDL isn't available anyway. In conversion, it handles primitive conversions of number/integer. Which brings us full circle back to the fact that @language does not work for us in @context because it tags every literal in double quotes as @en-US. While we do not see it in the original encoding in the CER, that's what it is doing. This is evident from taking our JSON-LD encoding of a credential and translating it from JSON-LD to JSON-LD. The translator expands the @language keyword by tagging every literal affected, like:

      "ceterms:ctid": {
        "@language": "en-US",
        "@value": "ce-CB8EB8D0-208D-4DA0-9F50-7003F8A39ED6"
      },

So, we are back at square one.
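
The behavior can be imitated with a short sketch, assuming a deliberately simplified model of expansion for a flat record (the function name `expand_node` is hypothetical, and this is not the full JSON-LD expansion algorithm): every plain string gets the default language because nothing in the data marks it as an identifier rather than text.

```python
# Simplified model of how a default "@language" is applied during expansion.
# NOT the real JSON-LD algorithm; expand_node is a hypothetical helper.

def expand_node(node, default_language):
    """Tag every plain string value of a flat record with the default language."""
    expanded = {}
    for key, value in node.items():
        if isinstance(value, str) and not key.startswith("@"):
            # The processor cannot tell identifiers from human-readable text.
            expanded[key] = {"@language": default_language, "@value": value}
        else:
            expanded[key] = value
    return expanded


record = {
    "@type": "ceterms:Credential",
    "ceterms:ctid": "ce-CB8EB8D0-208D-4DA0-9F50-7003F8A39ED6",
    "ceterms:currency": "USD",
}
expanded = expand_node(record, "en-US")
print(expanded["ceterms:ctid"])
```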

@stuartasutton
Contributor Author

stuartasutton commented Sep 11, 2017

Nate, you asked:

"Not to belabor the point, but does this come back to creating a new class to serve as the range for language-dependent properties, and making that a subclass of Literal? That's how we've solved other, similar problems so far, though none of them have felt as uniquely far-reaching as this."

There is something we can do at the declaration level in the tables and the JSON-LD and Turtle of the schema, but I'm not at all sure how that helps us with a JSON-LD encoding. All literals are datatyped. We can declare those that have language tags as being of the datatype rdf:langString. Anything not defined as having the rdf:langString designation should default to xsd:string (see: Literal in RDF 1.1 XML Syntax). That would mean in the CTDL schema we'd have the datatype rdf:langString for all properties we intend to be language tagged. We'd not designate any other literals, under the assumption that they'd default to xsd:string. But I'm still not sure how JSON-LD would handle it with the @language keyword in @context, i.e., whether it would only add the language tag to those designated rdf:langString and leave those with no range datatype alone. The changes in ranges in the schema make sense and would then be clear, even if we are uncertain of JSON-LD @language behavior.

So a faked schema definition example here would look like the following:

Turtle

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://purl.org/ctdl/terms/subject> a rdf:Property ;
    rdfs:label "Subject"^^rdf:langString ;
    rdfs:domain <http://purl.org/ctdl/terms/Credential> ;
    rdfs:isDefinedBy <http://purl.org/ctdl/terms/> ;
    rdfs:range rdf:langString .

RDF/XML

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    >
    <rdf:Description rdf:about="http://purl.org/ctdl/terms/subject">
        <rdfs:range rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"/>
        <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
        <rdfs:domain rdf:resource="http://purl.org/ctdl/terms/Credential"/>
        <rdfs:isDefinedBy rdf:resource="http://purl.org/ctdl/terms/"/>
        <rdfs:label rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#langString">Subject</rdfs:label>
    </rdf:Description>
</rdf:RDF>

JSON-LD

{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "@id": "http://purl.org/ctdl/terms/subject",
  "@type": "rdf:Property",
  "rdfs:domain": {
    "@id": "http://purl.org/ctdl/terms/Credential"
  },
  "rdfs:isDefinedBy": {
    "@id": "http://purl.org/ctdl/terms/"
  },
  "rdfs:label": {
    "@type": "rdf:langString",
    "@value": "Subject"
  },
  "rdfs:range": {
    "@id": "rdf:langString"
  }
}

@stuartasutton
Contributor Author

Nate, in the sentence "Anything not defined as having the rdf:langString designation should default to xsd:string (see: Literal in RDF 1.1 XML Syntax)." I meant that any property with no range definition defaults to xsd:string, so we'd remove the rdfs:Literal everywhere but those with rdf:langString or some other datatype like dates etc. Then perhaps @language might not append en-US to things like ctid because it would have no range declaration (although defaulting to xsd:string).

@siuc-nate
Contributor

siuc-nate commented Sep 11, 2017

So if I'm understanding correctly, you're saying we'd replace all current instances of rdfs:Literal in the schema with the appropriate subclass (xsd:string, rdf:langString, etc.)? It would be interesting to see how that impacts conversion. I wonder if there is anything in the JSON-LD specification that would specify how to treat langString vs Literal when converting/handling @language in the @context.

There is something we can do at the declaration level in the tables and the JSON-LD and Turtle of the schema; but not at all sure how that helps us with a JSON-LD encoding.

It seems like a two-step problem/solution. I think it's more valuable to solve the schema problem first; once we have a concrete indication at the schema level of which properties are language-dependent, we can handle that in any number of ways in the JSON-LD we generate.
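
As one illustration of that second step, here is a sketch that derives context entries from schema ranges, assuming the ranges are available as a simple mapping. The function `build_context`, the `ranges` table, and the choice to leave xsd:string properties out of the context are assumptions for illustration, not a decision made in this thread.

```python
# Derive JSON-LD context entries from schema ranges (hypothetical helper).
# Language-dependent properties become language maps; other datatypes are coerced.

def build_context(ranges):
    """Map each property's range to a context term definition."""
    context = {}
    for term, range_type in sorted(ranges.items()):
        if range_type == "rdf:langString":
            context[term] = {"@container": "@language"}
        elif range_type != "xsd:string":
            context[term] = {"@type": range_type}
        # xsd:string properties need no entry: with no default "@language",
        # plain strings stay plain.
    return context


# Assumed ranges for a handful of properties.
ranges = {
    "ceterms:name": "rdf:langString",
    "ceterms:description": "rdf:langString",
    "ceterms:ctid": "xsd:string",
    "ceterms:maximumDuration": "xsd:duration",
}

context = build_context(ranges)
for term, definition in context.items():
    print(term, definition)
```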

@stuartasutton
Contributor Author

stuartasutton commented Sep 11, 2017

Not exactly, but very close. We'd change the rangeIncludes for those needing to include a language tag to rdf:langString and those that would be xsd:string to no range at all. I did a fast pass and get the following:

Change rangeIncludes to rdf:langString

addressCountry
addressLocality
addressRegion
agentPurposeDescription
alignmentType
assessmentExampleDescription
assessmentOutput
condition
contactType
creditHourType
creditUnitTypeDescription
deliveryTypeDescription
demographicInformation
description
experience
familyName
frameworkName
givenName
keyword
missionAndGoalsStatementDescription
name
paymentPattern
processFrequency
processMethodDescription
processStandardsDescription
revocationCriteriaDescription
scoringMethodDescription
scoringMethodExampleDescription
submissionOf
targetNodeDescription
targetNodeName
verificationMethodDescription

Remove rdfs:Literal (no rangeIncludes designation)

alternateName
codedNotation
contactOption
credentialId
ctid
currency
duns
email
faxNumber
fein
foundingDate
honorificSuffix
identifierType
identifierValueCode
ipedsID
latitude
longitude
naics
opeID
partOfIdentifierValueSet
postalCode
postOfficeBoxNumber
streetAddress
telephone

@siuc-nate
Contributor

Why no range at all? It seems like that would create a lot of problems (particularly for developers - a property that can't hold data isn't a property, and is just confusing).

@stuartasutton
Contributor Author

I see one reference to rdf:langString in the JSON discussions at json-ld/json-ld.org#133 ... search the page for langString.

In answer to your question, any property in RDF that is not declared as a datatype or rdf:langString is treated in RDF as an xsd:string. If there is no declaration of range, surely the @language keyword would not expand "no range" to a language tag.

@siuc-nate
Contributor

siuc-nate commented Sep 11, 2017

I wouldn't alter the schema like that just to handle the @language in the context doing odd things during conversion - there are other ways to solve that. I think the schema should reflect the actual range, because that is incredibly useful for generating the documentation. If the range is string, we should declare it as such. This would also prevent confusion between "real" properties and "superproperty" properties.

@stuartasutton
Contributor Author

stuartasutton commented Sep 11, 2017 via email

@siuc-nate
Contributor

I agree, we should use langString. I strongly feel we should change the range of the other properties to the appropriate value as well (xsd:string in most cases; maybe we do something different for latitude/longitude? Shouldn't those be xsd:float anyway?)

@siuc-nate
Contributor

siuc-nate commented Sep 19, 2017

Per our 9/19/2017 conversation:

We will explore using the language map as defined in https://json-ld.org/spec/latest/json-ld/#string-internationalization example 36 in addition to the assignment of rdf:langString and xsd:string as discussed above.

First, we will create a few tests and try them with http://www.easyrdf.org/converter and http://rdf-translator.appspot.com/

@siuc-nate
Contributor

siuc-nate commented Sep 20, 2017

@stuartasutton Here is an example, based on https://credentialfinder.org/credential/1306/A_A_S__in_Dental_Assisting (with some extra locations removed).

Some experiments with this:

  • The use of both @id and @container (ceterms:addressCountry)
  • The use of only @container (other language properties)
  • The use of @id in the context to indicate URI fields (and thus the use of only the URI in the data itself - if this works, it will simplify a lot of things)
  • Deliberately inconsistent use of short language codes ("en") and language/region codes ("en-US")
  • One instance of data that actually has multiple languages, since that is kind of the point and we should probably test for it (degreeMajor.targetNodeName)
  • The use of @type: ceterms:ConditionProfile for the ceterms:requires property in the context (and not using a @type designation in the data) - this is mostly just to see what happens, rather than being a practical/implementation change.

I'll test this with the converters above. If you have other tests, Stuart, go ahead.
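
For the inconsistent language-code experiment, a small check like the following could flag where short codes and language/region codes are mixed. It is only a sketch: the function names `language_keys` and `mixed_regions` and the trimmed sample record are assumptions for illustration.

```python
# Collect the language keys used in language maps and flag mixed forms
# such as "en" alongside "en-US". Hypothetical helper names.

def language_keys(node, language_terms, found=None):
    """Gather every language key used under the given language-map terms."""
    found = set() if found is None else found
    if isinstance(node, list):
        for item in node:
            language_keys(item, language_terms, found)
    elif isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                continue  # term definitions are not data
            if key in language_terms and isinstance(value, dict):
                found.update(value.keys())
            else:
                language_keys(value, language_terms, found)
    return found


def mixed_regions(keys):
    """Group tags by primary subtag and keep groups with more than one form."""
    groups = {}
    for key in keys:
        primary = key.split("-")[0].lower()
        groups.setdefault(primary, set()).add(key)
    return {p: sorted(tags) for p, tags in groups.items() if len(tags) > 1}


language_terms = {"ceterms:name", "ceterms:description"}
sample = {
    "ceterms:description": {"en-US": "The Dental Assisting program ..."},
    "ceterms:availableAt": [{"ceterms:name": {"en": "Anderson"}}],
}

print(mixed_regions(language_keys(sample, language_terms)))
```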

{
  "@context": {
    "ceterms": "http://purl.org/ctdl/terms/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "obi": "https://w3id.org/openbadges#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "vs": "https://www.w3.org/2003/06/sw-vocab-status/ns",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "lrmi": "http://purl.org/dcx/lrmi-terms/",
    "asn": "http://purl.org/ASN/schema/core/",
    "vann": "http://purl.org/vocab/vann/",
    "actionStat": "http://purl.org/ctdl/vocabs/actionStat/",
    "agentSector": "http://purl.org/ctdl/vocabs/agentSector/",
    "serviceType": "http://purl.org/ctdl/vocabs/serviceType/",
    "assessMethod": "http://purl.org/ctdl/vocabs/assessMethod/",
    "assessUse": "http://purl.org/ctdl/vocabs/assessUse/",
    "audience": "http://purl.org/ctdl/vocabs/audience/",
    "claimType": "http://purl.org/ctdl/vocabs/claimType/",
    "costType": "http://purl.org/ctdl/vocabs/costType/",
    "credentialStat": "http://purl.org/ctdl/vocabs/credentialStat/",
    "creditUnit": "http://purl.org/ctdl/vocabs/creditUnit/",
    "deliveryType": "http://purl.org/ctdl/vocabs/deliveryType/",
    "inputType": "http://purl.org/ctdl/vocabs/inputType/",
    "learnMethod": "http://purl.org/ctdl/vocabs/learnMethod/",
    "orgType": "http://purl.org/ctdl/vocabs/orgType/",
    "residency": "http://purl.org/ctdl/vocabs/residency/",
    "score": "http://purl.org/ctdl/vocabs/score/",
    "ceterms:name": { "@container": "@language" },
    "ceterms:addressCountry": { "@id": "ceterms:addressCountry", "@container": "@language" },
    "ceterms:addressLocality": { "@container": "@language" },
    "ceterms:description": { "@container": "@language" },
    "ceterms:targetNodeName": { "@container": "@language" },
    "ceterms:creditHourType": { "@container": "@language" },
    "ceterms:commonConditions": { "@type": "@id" },
    "ceterms:commonCosts": { "@type": "@id" },
    "ceterms:ownedBy": { "@type": "@id" },
    "ceterms:subjectWebpage": { "@type": "@id" },
    "ceterms:assertedBy": { "@type": "@id" },
    "ceterms:requires": { "@type": "ceterms:ConditionProfile" }
  },
  "@type": "ceterms:AssociateDegree",
  "@id": "http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13",
  "ceterms:complaintProcess": [
    {
      "@type": "ceterms:ProcessProfile",
      "ceterms:description": "add complaint process, and ensure shown in proper section"
    }
  ],
  "ceterms:availableAt": [
    {
      "@type": "ceterms:GeoCoordinates",
      "ceterms:address": [
        {
          "@type": "ceterms:PostalAddress",
          "ceterms:name": { "en": "Anderson" },
          "ceterms:addressCountry": { "en": "United States" },
          "ceterms:addressLocality": { "en": "Anderson" },
          "ceterms:addressRegion": "IN",
          "ceterms:postalCode": "46013",
          "ceterms:streetAddress": "104 W. 53rd Street"
        }
      ],
      "ceterms:name": { "en": "Anderson" }
    }
  ],
  "ceterms:commonConditions": [
    "http://lr-staging.learningtapestry.com/resources/ce-b96bddc6-97eb-4824-a39d-d48468324b22"
  ],
  "ceterms:commonCosts": [
    "http://lr-staging.learningtapestry.com/resources/ce-085214da-6eb8-4d46-9751-d37c02092f41"
  ],
  "ceterms:ctid": "ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13",
  "ceterms:description": { "en-US": "The Dental Assisting program at Ivy Tech is taught by instructors who have all worked as dental assistants in the field. Students will get to study in a state-of-the-art facility learning a variety of skills involving radiography, dental materials, infection control, chairside, and general dentistry as well as dental specialties. For more information about dental assisting, visit the American Dental Assistant's Association website.\n\nThe Dental Assisting program is a selective admission program. When you apply to the college, you would be accepted into the Healthcare Specialist program with a concentration in Dental Assisting while you complete the prerequisite requirements. The Dental Assisting program accepts a limited number of students each year and there is a separate application process." },
  "ceterms:degreeMajor": [
    {
      "@type": "ceterms:CredentialAlignmentObject",
      "ceterms:targetNodeName": { "en-US": "Dental Assisting", "es": "Asistencia dental", "ru-RU": "Стоматологическая помощь" }
    }
  ],
  "ceterms:inLanguage": [
    "en"
  ],
  "ceterms:name": { "en-US": "A.A.S. in Dental Assisting" },
  "ceterms:ownedBy": [
    "http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106"
  ],
  "ceterms:requires": [
    {
      "ceterms:assertedBy": [
        "http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106"
      ],
      "ceterms:creditHourType": { "en": "Semester Hours" },
      "ceterms:creditHourValue": 64.00,
      "ceterms:name": { "en": "A.A.S. in Dental Assisting" }
    }
  ],
  "ceterms:subjectWebpage": [
    "https://www.ivytech.edu/dental-assisting"
  ]
}

@siuc-nate
Contributor

Results:

Conversion to Turtle via http://www.easyrdf.org/converter :

@prefix ns0: <http://purl.org/ctdl/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13>
  a <http://purl.org/ctdl/terms/AssociateDegree> ;
  ns0:availableAt [
    a ns0:GeoCoordinates ;
    ns0:address [
      a ns0:PostalAddress ;
      ns0:addressCountry "United States"@en ;
      ns0:addressLocality "Anderson"@en ;
      ns0:addressRegion "IN"^^xsd:string ;
      ns0:name "Anderson"@en ;
      ns0:postalCode "46013"^^xsd:string ;
      ns0:streetAddress "104 W. 53rd Street"^^xsd:string
    ] ;
    ns0:name "Anderson"@en
  ] ;
  ns0:commonConditions <http://lr-staging.learningtapestry.com/resources/ce-b96bddc6-97eb-4824-a39d-d48468324b22> ;
  ns0:commonCosts <http://lr-staging.learningtapestry.com/resources/ce-085214da-6eb8-4d46-9751-d37c02092f41> ;
  ns0:complaintProcess [
    a ns0:ProcessProfile ;
    ns0:description "add complaint process, and ensure shown in proper section"^^xsd:string
  ] ;
  ns0:ctid "ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13"^^xsd:string ;
  ns0:degreeMajor [
    a ns0:CredentialAlignmentObject ;
    ns0:targetNodeName "Dental Assisting"@en-us, "Asistencia dental"@es, "Стоматологическая помощь"@ru-ru
  ] ;
  ns0:description """The Dental Assisting program at Ivy Tech is taught by instructors who have all worked as dental assistants in the field. Students will get to study in a state-of-the-art facility learning a variety of skills involving radiography, dental materials, infection control, chairside, and general dentistry as well as dental specialties. For more information about dental assisting, visit the American Dental Assistant's Association website.

The Dental Assisting program is a selective admission program. When you apply to the college, you would be accepted into the Healthcare Specialist program with a concentration in Dental Assisting while you complete the prerequisite requirements. The Dental Assisting program accepts a limited number of students each year and there is a separate application process."""@en-us ;
  ns0:inLanguage "en"^^xsd:string ;
  ns0:name "A.A.S. in Dental Assisting"@en-us ;
  ns0:ownedBy <http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106> ;
  ns0:requires [
    ns0:assertedBy <http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106> ;
    ns0:creditHourType "Semester Hours"@en ;
    ns0:creditHourValue 64 ;
    ns0:name "A.A.S. in Dental Assisting"@en
  ] ;
  ns0:subjectWebpage <https://www.ivytech.edu/dental-assisting> .

Everything appears to have succeeded, with the exception of there not being a type value for the ConditionProfile. This is certainly encouraging.

Same tool, but with XML:

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns0="http://purl.org/ctdl/terms/">

  <rdf:Description rdf:about="http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13">
    <rdf:type rdf:resource="http://purl.org/ctdl/terms/AssociateDegree"/>
    <ns0:availableAt>
      <ns0:GeoCoordinates>
        <ns0:address>
          <ns0:PostalAddress>
            <ns0:addressCountry xml:lang="en">United States</ns0:addressCountry>
            <ns0:addressLocality xml:lang="en">Anderson</ns0:addressLocality>
            <ns0:addressRegion rdf:datatype="http://www.w3.org/2001/XMLSchema#string">IN</ns0:addressRegion>
            <ns0:name xml:lang="en">Anderson</ns0:name>
            <ns0:postalCode rdf:datatype="http://www.w3.org/2001/XMLSchema#string">46013</ns0:postalCode>
            <ns0:streetAddress rdf:datatype="http://www.w3.org/2001/XMLSchema#string">104 W. 53rd Street</ns0:streetAddress>
          </ns0:PostalAddress>
        </ns0:address>

        <ns0:name xml:lang="en">Anderson</ns0:name>
      </ns0:GeoCoordinates>
    </ns0:availableAt>

    <ns0:commonConditions rdf:resource="http://lr-staging.learningtapestry.com/resources/ce-b96bddc6-97eb-4824-a39d-d48468324b22"/>
    <ns0:commonCosts rdf:resource="http://lr-staging.learningtapestry.com/resources/ce-085214da-6eb8-4d46-9751-d37c02092f41"/>
    <ns0:complaintProcess>
      <ns0:ProcessProfile>
        <ns0:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">add complaint process, and ensure shown in proper section</ns0:description>
      </ns0:ProcessProfile>
    </ns0:complaintProcess>

    <ns0:ctid rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13</ns0:ctid>
    <ns0:degreeMajor>
      <ns0:CredentialAlignmentObject>
        <ns0:targetNodeName xml:lang="en-us">Dental Assisting</ns0:targetNodeName>
        <ns0:targetNodeName xml:lang="es">Asistencia dental</ns0:targetNodeName>
        <ns0:targetNodeName xml:lang="ru-ru">Стоматологическая помощь</ns0:targetNodeName>
      </ns0:CredentialAlignmentObject>
    </ns0:degreeMajor>

    <ns0:description xml:lang="en-us">The Dental Assisting program at Ivy Tech is taught by instructors who have all worked as dental assistants in the field. Students will get to study in a state-of-the-art facility learning a variety of skills involving radiography, dental materials, infection control, chairside, and general dentistry as well as dental specialties. For more information about dental assisting, visit the American Dental Assistant's Association website.

The Dental Assisting program is a selective admission program. When you apply to the college, you would be accepted into the Healthcare Specialist program with a concentration in Dental Assisting while you complete the prerequisite requirements. The Dental Assisting program accepts a limited number of students each year and there is a separate application process.</ns0:description>
    <ns0:inLanguage rdf:datatype="http://www.w3.org/2001/XMLSchema#string">en</ns0:inLanguage>
    <ns0:name xml:lang="en-us">A.A.S. in Dental Assisting</ns0:name>
    <ns0:ownedBy rdf:resource="http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106"/>
    <ns0:requires>
      <rdf:Description>
        <ns0:assertedBy rdf:resource="http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106"/>
        <ns0:creditHourType xml:lang="en">Semester Hours</ns0:creditHourType>
        <ns0:creditHourValue rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">64</ns0:creditHourValue>
        <ns0:name xml:lang="en">A.A.S. in Dental Assisting</ns0:name>
      </rdf:Description>
    </ns0:requires>

    <ns0:subjectWebpage rdf:resource="https://www.ivytech.edu/dental-assisting"/>
  </rdf:Description>

</rdf:RDF>

Again, it all appears to have worked, except for the ConditionProfile type.

Conversion to N3 via http://rdf-translator.appspot.com/ :

@prefix actionStat: <http://purl.org/ctdl/vocabs/actionStat/> .
@prefix agentSector: <http://purl.org/ctdl/vocabs/agentSector/> .
@prefix asn: <http://purl.org/ASN/schema/core/> .
@prefix assessMethod: <http://purl.org/ctdl/vocabs/assessMethod/> .
@prefix assessUse: <http://purl.org/ctdl/vocabs/assessUse/> .
@prefix audience: <http://purl.org/ctdl/vocabs/audience/> .
@prefix ceterms: <http://purl.org/ctdl/terms/> .
@prefix claimType: <http://purl.org/ctdl/vocabs/claimType/> .
@prefix costType: <http://purl.org/ctdl/vocabs/costType/> .
@prefix credentialStat: <http://purl.org/ctdl/vocabs/credentialStat/> .
@prefix creditUnit: <http://purl.org/ctdl/vocabs/creditUnit/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix deliveryType: <http://purl.org/ctdl/vocabs/deliveryType/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix inputType: <http://purl.org/ctdl/vocabs/inputType/> .
@prefix learnMethod: <http://purl.org/ctdl/vocabs/learnMethod/> .
@prefix lrmi: <http://purl.org/dcx/lrmi-terms/> .
@prefix obi: <https://w3id.org/openbadges#> .
@prefix orgType: <http://purl.org/ctdl/vocabs/orgType/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix residency: <http://purl.org/ctdl/vocabs/residency/> .
@prefix schema: <http://schema.org/> .
@prefix score: <http://purl.org/ctdl/vocabs/score/> .
@prefix serviceType: <http://purl.org/ctdl/vocabs/serviceType/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix vann: <http://purl.org/vocab/vann/> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13> a ceterms:AssociateDegree ;
    ceterms:availableAt [ a ceterms:GeoCoordinates ;
            ceterms:address [ a ceterms:PostalAddress ;
                    ceterms:addressCountry "United States"@en ;
                    ceterms:addressLocality "Anderson"@en ;
                    ceterms:addressRegion "IN" ;
                    ceterms:name "Anderson"@en ;
                    ceterms:postalCode "46013" ;
                    ceterms:streetAddress "104 W. 53rd Street" ] ;
            ceterms:name "Anderson"@en ] ;
    ceterms:commonConditions <http://lr-staging.learningtapestry.com/resources/ce-b96bddc6-97eb-4824-a39d-d48468324b22> ;
    ceterms:commonCosts <http://lr-staging.learningtapestry.com/resources/ce-085214da-6eb8-4d46-9751-d37c02092f41> ;
    ceterms:complaintProcess [ a ceterms:ProcessProfile ;
            ceterms:description "add complaint process, and ensure shown in proper section" ] ;
    ceterms:ctid "ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13" ;
    ceterms:degreeMajor [ a ceterms:CredentialAlignmentObject ;
            ceterms:targetNodeName "Dental Assisting"@en-US,
                "Asistencia dental"@es,
                "Стоматологическая помощь"@ru-RU ] ;
    ceterms:description """The Dental Assisting program at Ivy Tech is taught by instructors who have all worked as dental assistants in the field. Students will get to study in a state-of-the-art facility learning a variety of skills involving radiography, dental materials, infection control, chairside, and general dentistry as well as dental specialties. For more information about dental assisting, visit the American Dental Assistant's Association website.

The Dental Assisting program is a selective admission program. When you apply to the college, you would be accepted into the Healthcare Specialist program with a concentration in Dental Assisting while you complete the prerequisite requirements. The Dental Assisting program accepts a limited number of students each year and there is a separate application process."""@en-US ;
    ceterms:inLanguage "en" ;
    ceterms:name "A.A.S. in Dental Assisting"@en-US ;
    ceterms:ownedBy <http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106> ;
    ceterms:requires [ ceterms:assertedBy <http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106> ;
            ceterms:creditHourType "Semester Hours"@en ;
            ceterms:creditHourValue 6.4e+01 ;
            ceterms:name "A.A.S. in Dental Assisting"@en ] ;
    ceterms:subjectWebpage <https://www.ivytech.edu/dental-assisting> .

I'm not 100% sure I'm reading it correctly, but it appears to have the same result as above.

Conversion to JSON-LD (or rather, expansion I suppose) via the same tool:

{
  "@context": {
    "actionStat": "http://purl.org/ctdl/vocabs/actionStat/",
    "agentSector": "http://purl.org/ctdl/vocabs/agentSector/",
    "asn": "http://purl.org/ASN/schema/core/",
    "assessMethod": "http://purl.org/ctdl/vocabs/assessMethod/",
    "assessUse": "http://purl.org/ctdl/vocabs/assessUse/",
    "audience": "http://purl.org/ctdl/vocabs/audience/",
    "ceterms": "http://purl.org/ctdl/terms/",
    "claimType": "http://purl.org/ctdl/vocabs/claimType/",
    "costType": "http://purl.org/ctdl/vocabs/costType/",
    "credentialStat": "http://purl.org/ctdl/vocabs/credentialStat/",
    "creditUnit": "http://purl.org/ctdl/vocabs/creditUnit/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "dcterms": "http://purl.org/dc/terms/",
    "deliveryType": "http://purl.org/ctdl/vocabs/deliveryType/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "inputType": "http://purl.org/ctdl/vocabs/inputType/",
    "learnMethod": "http://purl.org/ctdl/vocabs/learnMethod/",
    "lrmi": "http://purl.org/dcx/lrmi-terms/",
    "obi": "https://w3id.org/openbadges#",
    "orgType": "http://purl.org/ctdl/vocabs/orgType/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "residency": "http://purl.org/ctdl/vocabs/residency/",
    "schema": "http://schema.org/",
    "score": "http://purl.org/ctdl/vocabs/score/",
    "serviceType": "http://purl.org/ctdl/vocabs/serviceType/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "vann": "http://purl.org/vocab/vann/",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "@graph": [
    {
      "@id": "_:Na300c5c35b57465db1bea9c6e2c65c95",
      "ceterms:assertedBy": {
        "@id": "http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106"
      },
      "ceterms:creditHourType": {
        "@language": "en",
        "@value": "Semester Hours"
      },
      "ceterms:creditHourValue": 64.0,
      "ceterms:name": {
        "@language": "en",
        "@value": "A.A.S. in Dental Assisting"
      }
    },
    {
      "@id": "_:N196b2d94338240e29bcf1bb1c5135dd7",
      "@type": "ceterms:GeoCoordinates",
      "ceterms:address": {
        "@id": "_:N6d8db46e22d245eb9a39c8140092d519"
      },
      "ceterms:name": {
        "@language": "en",
        "@value": "Anderson"
      }
    },
    {
      "@id": "_:Nfa8a951592d5467f9f03fa9233b0119e",
      "@type": "ceterms:CredentialAlignmentObject",
      "ceterms:targetNodeName": [
        {
          "@language": "en-US",
          "@value": "Dental Assisting"
        },
        {
          "@language": "es",
          "@value": "Asistencia dental"
        },
        {
          "@language": "ru-RU",
          "@value": "Стоматологическая помощь"
        }
      ]
    },
    {
      "@id": "_:Nb4fce7d9f081489ca36b900fa2ef4a31",
      "@type": "ceterms:ProcessProfile",
      "ceterms:description": "add complaint process, and ensure shown in proper section"
    },
    {
      "@id": "http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13",
      "@type": "ceterms:AssociateDegree",
      "ceterms:availableAt": {
        "@id": "_:N196b2d94338240e29bcf1bb1c5135dd7"
      },
      "ceterms:commonConditions": {
        "@id": "http://lr-staging.learningtapestry.com/resources/ce-b96bddc6-97eb-4824-a39d-d48468324b22"
      },
      "ceterms:commonCosts": {
        "@id": "http://lr-staging.learningtapestry.com/resources/ce-085214da-6eb8-4d46-9751-d37c02092f41"
      },
      "ceterms:complaintProcess": {
        "@id": "_:Nb4fce7d9f081489ca36b900fa2ef4a31"
      },
      "ceterms:ctid": "ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13",
      "ceterms:degreeMajor": {
        "@id": "_:Nfa8a951592d5467f9f03fa9233b0119e"
      },
      "ceterms:description": {
        "@language": "en-US",
        "@value": "The Dental Assisting program at Ivy Tech is taught by instructors who have all worked as dental assistants in the field. Students will get to study in a state-of-the-art facility learning a variety of skills involving radiography, dental materials, infection control, chairside, and general dentistry as well as dental specialties. For more information about dental assisting, visit the American Dental Assistant's Association website.\n\nThe Dental Assisting program is a selective admission program. When you apply to the college, you would be accepted into the Healthcare Specialist program with a concentration in Dental Assisting while you complete the prerequisite requirements. The Dental Assisting program accepts a limited number of students each year and there is a separate application process."
      },
      "ceterms:inLanguage": "en",
      "ceterms:name": {
        "@language": "en-US",
        "@value": "A.A.S. in Dental Assisting"
      },
      "ceterms:ownedBy": {
        "@id": "http://lr-staging.learningtapestry.com/resources/ce-1ABB6C52-0F8C-4B17-9F89-7E9807673106"
      },
      "ceterms:requires": {
        "@id": "_:Na300c5c35b57465db1bea9c6e2c65c95"
      },
      "ceterms:subjectWebpage": {
        "@id": "https://www.ivytech.edu/dental-assisting"
      }
    },
    {
      "@id": "_:N6d8db46e22d245eb9a39c8140092d519",
      "@type": "ceterms:PostalAddress",
      "ceterms:addressCountry": {
        "@language": "en",
        "@value": "United States"
      },
      "ceterms:addressLocality": {
        "@language": "en",
        "@value": "Anderson"
      },
      "ceterms:addressRegion": "IN",
      "ceterms:name": {
        "@language": "en",
        "@value": "Anderson"
      },
      "ceterms:postalCode": "46013",
      "ceterms:streetAddress": "104 W. 53rd Street"
    }
  ]
}

Again, same result - everything but the @type for ConditionProfile seems to work.

I think we may have a viable approach here. If I add the @container: @language and @type: @id values to the context for the whole schema, it should enable both the language map object and a much-appreciated simplification of URIs. It will also make the @context quite large, which is an argument in favor of a previous discussion we had that involved maintaining a standalone @context document and linking to it instead of embedding it in the data. I will explore adding that.
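For illustration, here is a minimal Python sketch of how the two kinds of term definitions used in the test above change the reading of a property value. It is not a JSON-LD processor: `expand_value` and the trimmed-down `CONTEXT` are hypothetical names, and only language maps and `@type: @id` coercion are handled.

```python
# Simplified sketch (not a full JSON-LD processor) of how the two kinds of
# term definitions change the interpretation of a property value.
CONTEXT = {
    "ceterms:name": {"@container": "@language"},
    "ceterms:ownedBy": {"@type": "@id"},
}


def expand_value(prop, value, context=CONTEXT):
    """Return explicit JSON-LD value objects for one property value."""
    definition = context.get(prop, {})
    if definition.get("@container") == "@language" and isinstance(value, dict):
        # Language map: each key is a language tag, each value a string
        # (or a list of strings) in that language.
        expanded = []
        for lang, texts in value.items():
            for text in texts if isinstance(texts, list) else [texts]:
                expanded.append({"@language": lang, "@value": text})
        return expanded
    items = value if isinstance(value, list) else [value]
    if definition.get("@type") == "@id":
        # Plain strings are treated as identifiers rather than literals.
        return [{"@id": item} for item in items]
    # No coercion: strings stay untagged literals (ctid, postalCode, ...).
    return [{"@value": item} for item in items]


print(expand_value("ceterms:name", {"en-US": "A.A.S. in Dental Assisting"}))
print(expand_value("ceterms:ctid", "ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13"))
```

A plain string under a language-map property falls through to the untagged branch, which matches what the converters did with the untagged complaintProcess description.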

Well, @stuartasutton, what do you think? Does this solve the problem? Are there any other @context optimizations we might make to shrink the data further?

@siuc-nate
Contributor

siuc-nate commented Sep 20, 2017

Just to identify the steps I think we can/should make at this point (and in roughly the order we should make them):

  1. Replace all designations of rdfs:Literal as the range of all relevant properties with either rdf:langString or xsd:string, as appropriate
  2. Update the @context generator to output all properties with a range of rdf:langString as having a @container value of @language
  3. Update the @context generator to output all properties with a range of either xsd:anyURI or any top-level entity (Credential subclasses, Organizations, Assessments, Learning Opportunities, Manifests, etc.) as having a @type value of @id
  4. Update the schema @context link on the credreg site to use the newer @context format instead of the schema.org-based one that includes all properties
  5. Update the Registry validation document to expect a link to @context instead of an embedded @context
  6. Update the Registry validation document to expect language maps and URIs instead of strings and @id objects, respectively
  7. Update the editor/workit prototype to publish and consume a linked @context instead of an embedded @context
  8. Update the editor/workit prototype to publish and consume JSON that uses language maps and URIs instead of strings and @id objects, respectively
  9. Update the documentation across the technical site wherever relevant to reflect these changes to the schema, publishing, and consuming
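Steps 2 and 3 could look roughly like the following Python sketch. It is only an assumption about how the @context generator might work: `build_context`, its inputs, and the sample ranges for subjectWebpage and ownedBy are hypothetical, not taken from the actual generator or schema.

```python
import json

LANG_STRING = "rdf:langString"
URI_RANGE = "xsd:anyURI"


def build_context(property_ranges, top_level_classes):
    """Derive @context term definitions from each property's ranges.

    property_ranges: dict of property CURIE -> list of range CURIEs.
    top_level_classes: set of class CURIEs whose instances are referenced by URI.
    """
    context = {}
    for prop, ranges in sorted(property_ranges.items()):
        if LANG_STRING in ranges:
            # Step 2: langString properties become language maps.
            context[prop] = {"@container": "@language"}
        elif any(r == URI_RANGE or r in top_level_classes for r in ranges):
            # Step 3: URI-valued properties are coerced to identifiers.
            context[prop] = {"@type": "@id"}
        # Other ranges (xsd:string, xsd:float, ...) get no term definition here.
    return context


# Illustrative input only; the ranges below are assumed for the example.
ranges = {
    "ceterms:name": ["rdf:langString"],
    "ceterms:ctid": ["xsd:string"],
    "ceterms:subjectWebpage": ["xsd:anyURI"],
    "ceterms:ownedBy": ["ceterms:CredentialOrganization"],
}
top_level = {"ceterms:CredentialOrganization"}
print(json.dumps(build_context(ranges, top_level), indent=2))
```

Note that a property whose range mixes rdf:langString with URIs would simply get the language-map definition in this sketch, so such properties would need separate handling.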

@cwd-mparsons, @stuartasutton, @jkitchenssiuc if you can think of any other changes, please let me know.

@siuc-nate
Contributor

siuc-nate commented Sep 20, 2017

Here is a tentative list of affected properties (relevant portions based on Stuart's list in a previous post):

Schema serialization rdf:langString

  • rdfs:label
  • rdfs:comment
  • dcterms:description
  • vann:usageNote
  • cs:changeReason
  • rdf:object (when the data is a string rather than a URI)

CTDL rdf:langString

  • ceterms:addressCountry
  • ceterms:addressLocality
  • ceterms:addressRegion
  • ceterms:agentPurposeDescription
  • ceterms:alternateName
  • ceterms:assessmentExampleDescription
  • ceterms:assessmentOutput
  • ceterms:condition
  • ceterms:contactOption
  • ceterms:contactType
  • ceterms:creditHourType
  • ceterms:creditUnitTypeDescription
  • ceterms:deliveryTypeDescription
  • ceterms:demographicInformation
  • ceterms:description
  • ceterms:experience
  • ceterms:familyName
  • ceterms:frameworkName
  • ceterms:givenName
  • ceterms:honorificSuffix
  • ceterms:identifierType
  • ceterms:keyword
  • ceterms:missionAndGoalsStatementDescription
  • ceterms:name
  • ceterms:paymentPattern
  • ceterms:processFrequency
  • ceterms:processMethodDescription
  • ceterms:processStandardsDescription
  • ceterms:revocationCriteriaDescription
  • ceterms:scoringMethodDescription
  • ceterms:scoringMethodExampleDescription
  • ceterms:streetAddress
  • ceterms:submissionOf
  • ceterms:targetNodeDescription
  • ceterms:targetNodeName
  • ceterms:verificationMethodDescription
  • ceterms:partOfIdentifierValueSet (remove literal range)

CTDL xsd:string

  • ceterms:alignmentType
  • ceterms:codedNotation
  • ceterms:credentialId
  • ceterms:ctid
  • ceterms:currency
  • ceterms:duns
  • ceterms:email
  • ceterms:faxNumber
  • ceterms:fein
  • ceterms:foundingDate
  • ceterms:identifierValueCode
  • ceterms:ipedsID
  • ceterms:naics
  • ceterms:opeID
  • ceterms:postalCode
  • ceterms:postOfficeBoxNumber
  • ceterms:telephone

CTDL xsd:float

  • ceterms:latitude
  • ceterms:longitude

For rdf:object above, how would the @context look? I decided to try a simple experiment:

{
	"@context": {
		"ceterms": "http://purl.org/ctdl/terms/",
		"ceterms:name": { "@container": "@language", "@type": "@id" }
	},
	"@type": "ceterms:AssociateDegree",
	"@id": "http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13",
	"ceterms:availableAt": [
		{
			"@type": "ceterms:GeoCoordinates",
			"ceterms:name": [ { "en": "Anderson" }, "ceterms:GeoCoordinates" ]
		}
	],
	"ceterms:name": { "en-US": "A.A.S. in Dental Assisting" },
	"ceterms:requires": [
		{
			"@type": "ceterms:ConditionProfile",
			"ceterms:name": [ "ceterms:degreeMajor" ]
		}
	]
}

When converted, it appears to work if the data is either a literal or an object, but not both:

@prefix ns0: <http://purl.org/ctdl/terms/> .

<http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13>
  a <http://purl.org/ctdl/terms/AssociateDegree> ;
  ns0:availableAt [
    a ns0:GeoCoordinates ;
    ns0:name [ ], ns0:GeoCoordinates
  ] ;
  ns0:name "A.A.S. in Dental Assisting"@en-us ;
  ns0:requires [
    a ns0:ConditionProfile ;
    ns0:name ns0:degreeMajor
  ] .

{
  "@context": {
    "ceterms": "http://purl.org/ctdl/terms/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "@graph": [
    {
      "@id": "http://lr-staging.learningtapestry.com/resources/ce-dd1df89c-5ed7-43ee-be8e-44e799ca6f13",
      "@type": "ceterms:AssociateDegree",
      "ceterms:availableAt": {
        "@id": "_:N7e42aed07ed9475da96b8d01bbe71255"
      },
      "ceterms:name": {
        "@language": "en-US",
        "@value": "A.A.S. in Dental Assisting"
      },
      "ceterms:requires": {
        "@id": "_:N8d333cef713d40a5984ff12cd7679619"
      }
    },
    {
      "@id": "_:N17f3c0889071412c8d26ce1c357abc7f"
    },
    {
      "@id": "_:N8d333cef713d40a5984ff12cd7679619",
      "@type": "ceterms:ConditionProfile",
      "ceterms:name": {
        "@id": "ceterms:degreeMajor"
      }
    },
    {
      "@id": "_:N7e42aed07ed9475da96b8d01bbe71255",
      "@type": "ceterms:GeoCoordinates",
      "ceterms:name": [
        {
          "@id": "_:N17f3c0889071412c8d26ce1c357abc7f"
        },
        {
          "@id": "ceterms:GeoCoordinates"
        }
      ]
    }
  ]
}

Is there an alternate way of encoding history changes such that the object uses rdf:object for URIs and rdf:langString for language-strings?

@stuartasutton
Contributor Author

stuartasutton commented Sep 20, 2017

@siuc-nate, looks good.

In your original, you have the following rdf:langString that has not been language tagged:

"ceterms:description": "add complaint process, and ensure shown in proper section"

I manually tagged it to align with how it is declared in the @context ("ceterms:description": { "@container": "@language" }) and it validated against the context, which is good; i.e.:

"ceterms:description": { "en": "add complaint process, and ensure shown in proper section" }

This means that the structure defined in the context works with language-tagged data, but it does not enforce the addition of such a language tag. So prepping for use of the language tag as you have done in the context does not compel such use. For me, this is a good thing. Whether including a language tag gets enforced as a requirement in an implementation (CER?), i.e., the value of the description property must have a language tag, has to be an outcome of some other validation process on the data.
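As a rough sketch of what such a separate validation pass might look like, the Python below walks a document and reports langString properties whose value is a plain string. `find_untagged` and `LANG_STRING_PROPERTIES` are hypothetical names, and the property set is a small illustrative subset, not the Registry's actual validation rules.

```python
# Sketch of a validation pass, separate from the @context, that requires
# language-tagged values for selected properties.
LANG_STRING_PROPERTIES = {"ceterms:name", "ceterms:description"}


def find_untagged(node, path=""):
    """Yield paths of langString properties whose value is not an object.

    A language map and an explicit @language/@value object both pass.
    """
    if isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_untagged(item, f"{path}[{index}]")
    elif isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                continue
            child = f"{path}/{key}"
            if key in LANG_STRING_PROPERTIES:
                if not isinstance(value, dict):
                    yield child
            else:
                yield from find_untagged(value, child)


doc = {
    "ceterms:name": {"en-US": "A.A.S. in Dental Assisting"},
    "ceterms:complaintProcess": [
        {
            "@type": "ceterms:ProcessProfile",
            "ceterms:description": "add complaint process, and ensure shown in proper section",
        }
    ],
}
print(list(find_untagged(doc)))
```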

@siuc-nate
Contributor

siuc-nate commented Sep 20, 2017

Thanks, Stuart.

I was mid-edit of the post above yours when you replied, and I added the information about the use of rdf:object with langString - can you review that and let me know your thoughts?

One potential solution would be to leave rdf:object out of the @context altogether and use either a URI (with @id notation) or a language string (as an object with @language and @value properties). Since this would be the only property identified so far with this kind of (implicit) range, it might be an acceptable complication, though not preferable if a simpler, @context-based solution can be found.

I would wonder how the JSON-LD folks would handle such a situation.

@stuartasutton
Contributor Author

Not sure what's the issue.

rdf:object (when the data is a string rather than a URI)

When is an object ever a string? I think I am not getting this.

@siuc-nate
Contributor

siuc-nate commented Sep 20, 2017

In the encodings for history tracking, when a definition or whatever gets changed, the rdf:object is a (lang)string. Otherwise it's a URI to something (when a domain or range changes, for example).

Compare:

"cs:addition": {
  "rdf:subject": "ceterms:name",
  "rdf:predicate": "rdfs:label",
  "rdf:object": { "en-US": "Name" }
}

"cs:addition": {
  "rdf:subject": "ceterms:name",
  "rdf:predicate": "rdfs:domainIncludes",
  "rdf:object": { "@id": "ceterms:Credential" }
}

The question is, what, if anything, should the @context hold for rdf:object?

@siuc-nate
Contributor

After discussing this with @stuartasutton, the current approach is to use rdf:object for non-language strings and a new property, meta:objectText, for language strings. This property will be declared to be a subproperty of rdf:object:

{
"@type": "rdf:Statement",
"rdf:subject": "ceterms:contactPoint",
"rdf:predicate": "rdfs:range",
"rdf:object": "schema:ContactPoint"
},
{
"@type": "rdf:Statement",
"rdf:subject": "ceterms:contactPoint",
"rdf:predicate": "rdfs:label",
"meta:objectText": { "en": "Contact Point" }
}
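A publisher following this approach might pick the property from the kind of value it has. The Python sketch below is only an illustration: `make_statement` is a hypothetical helper, and it assumes the @context would declare meta:objectText as a language map and rdf:object as an identifier.

```python
def make_statement(subject, predicate, obj):
    """Build one rdf:Statement entry for the history-tracking data.

    obj is either a URI/compact IRI string or a language map such as
    {"en": "Contact Point"}.
    """
    statement = {
        "@type": "rdf:Statement",
        "rdf:subject": subject,
        "rdf:predicate": predicate,
    }
    if isinstance(obj, dict):
        # Language-tagged text goes in the proposed meta:objectText property.
        statement["meta:objectText"] = obj
    else:
        # Anything else is treated as a reference and stays in rdf:object.
        statement["rdf:object"] = obj
    return statement


print(make_statement("ceterms:contactPoint", "rdfs:range", "schema:ContactPoint"))
print(make_statement("ceterms:contactPoint", "rdfs:label", {"en": "Contact Point"}))
```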

@stuartasutton
Contributor Author

Nate, we need to keep thinking about this since parsers will not recognize the following as a triple, and that's key:

{
"@type": "rdf:Statement",
"rdf:subject": "ceterms:contactPoint",
"rdf:predicate": "rdfs:label",
"meta:objectText": { "en": "Contact Point" }
}

So, both of these are legal:

{
"@type": "rdf:Statement",
"rdf:subject": "ceterms:contactPoint",
"rdf:predicate": "rdfs:range",
"rdf:object": "schema:ContactPoint"
},
{
"@type": "rdf:Statement",
"rdf:subject": "ceterms:contactPoint",
"rdf:predicate": "rdfs:label",
"rdf:object": { "en": "Contact Point" }
}

@siuc-nate
Contributor

siuc-nate commented Sep 22, 2017

@stuartasutton I was afraid of that. That's why I conducted the experiment referenced in #459 (comment) after the property list (and with the name property rather than the object property).

Basically, it doesn't work to have a @context like this:

"@context": {
	"rdf:object": { "@container": "@language", "@type": "@id" }
},

That means we would need to fall back to leaving rdf:object out of the context and spelling out the value's type inline at each use, e.g.:

{
"@type": "rdf:Statement",
"rdf:subject": "ceterms:contactPoint",
"rdf:predicate": "rdfs:range",
"rdf:object": { "@id": "schema:ContactPoint" }
},
{
"@type": "rdf:Statement",
"rdf:subject": "ceterms:contactPoint",
"rdf:predicate": "rdf:label",
"rdf:object": { "@language": "en", "@value": "Contact Point" }
}

Given that it only applies to one property and that property only shows up in history tracking, this may be an acceptable compromise - however, it means we won't have a consistent implementation of langString properties and that could come back to bite us if we add rdf:object anywhere else in the schema (such as the third-party mapping). It takes special handling in code to serialize and deserialize a value that could be more than one type, hence my recommendation of using meta:objectText instead. I had hoped that by declaring meta:objectText to be a subproperty (or perhaps even an equivalent) of rdf:object we would solve the issue of parsers not understanding it - otherwise, I'm uncertain of the purpose of making such a declaration.
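
To make the "special handling" concrete, here is a rough Python sketch of what reading and writing a two-shaped rdf:object might involve. The function names serialize_object and deserialize_object are hypothetical, and the sketch assumes only the two shapes shown above (an @id reference or a language-tagged @value):

```python
def serialize_object(value, language=None):
    """Build an rdf:object value: an IRI reference, or a language-tagged string."""
    if language is None:
        # No language given: treat the value as a pointer to another resource.
        return {"@id": value}
    return {"@language": language, "@value": value}


def deserialize_object(node):
    """Return (kind, value, language) for either shape of rdf:object."""
    if "@id" in node:
        return ("iri", node["@id"], None)
    if "@value" in node:
        return ("text", node["@value"], node.get("@language"))
    raise ValueError("unrecognized rdf:object shape: %r" % (node,))


if __name__ == "__main__":
    print(deserialize_object(serialize_object("schema:ContactPoint")))
    print(deserialize_object(serialize_object("Contact Point", language="en")))
```

Every consumer of rdf:object needs a branch like this, which is the cost being weighed against a separate single-type property such as meta:objectText.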

@stuartasutton
Contributor Author

@siuc-nate , can you provide me with a complete @context for the snippet above?

As written above, the snippet does not validate, so I added a @context and @graph. It then validates using easyrdf, but does not translate meaningfully. I'm thinking I don't have an adequate @context.

I got this far:

{
    "@context": {
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
    },
    "@graph": [
        {
            "@type": "rdf:Statement",
            "rdf:subject": "ceterms:contactPoint",
            "rdf:predicate": "rdfs:range",
            "rdf:object": { "@id": "schema:ContactPoint" }
        },
        {
            "@type": "rdf:Statement",
            "rdf:subject": "ceterms:contactPoint",
            "rdf:predicate": "rdf:label",
            "rdf:object": { "@language": "en", "@value": "Contact Point" }
        }
    ]
}

But the translation to Turtle looks like this:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[]
  a rdf:Statement ;
  rdf:subject "ceterms:contactPoint"^^xsd:string ;
  rdf:predicate "rdfs:range"^^xsd:string ;  
  rdf:object schema:ContactPoint .

[]
  a rdf:Statement ;
  rdf:subject "ceterms:contactPoint"^^xsd:string ;
  rdf:predicate "rdf:label"^^xsd:string ;    
  rdf:object  "Contact Point"@en .

@siuc-nate
Contributor

The context in your example seems to fit, though perhaps the translator doesn't assume the rdf:subject and rdf:predicate properties are actually URI pointers (which I assume a triple store would). Is that what you're referring to? I'm not sure where there is missing information in your post.

@stuartasutton
Contributor Author

Yes, that's what I am referring to.

@siuc-nate
Contributor

So something like this?

{
    "@context": {
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        "rdf:subject": { "@type": "@id" },
        "rdf:predicate": { "@type": "@id" }
    },
    "@graph": [
        {
            "@type": "rdf:Statement",
            "rdf:subject": "ceterms:contactPoint",
            "rdf:predicate": "rdfs:range",
            "rdf:object": { "@id": "schema:ContactPoint" }
        },
        {
            "@type": "rdf:Statement",
            "rdf:subject": "ceterms:contactPoint",
            "rdf:predicate": "rdf:label",
            "rdf:object": { "@language": "en", "@value": "Contact Point" }
        }
    ]
}

@stuartasutton
Contributor Author

Bingo!

RDF/XML

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    
    <rdf:Statement rdf:nodeID="genid1">
        <rdf:object rdf:resource="http://schema.org/ContactPoint"/>
        <rdf:predicate rdf:resource="http://www.w3.org/2000/01/rdf-schema#range"/>
        <rdf:subject rdf:resource="ceterms:contactPoint"/>
    </rdf:Statement>
    
    <rdf:Statement rdf:nodeID="genid2">
        <rdf:object  xml:lang="en">Contact Point</rdf:object >
        <rdf:predicate rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#label"/>
        <rdf:subject rdf:resource="ceterms:contactPoint"/>
    </rdf:Statement>
    
</rdf:RDF>

TURTLE

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

[]
  a rdf:Statement ;
  rdf:subject <ceterms:contactPoint> ;
  rdf:predicate rdfs:range ;  
  rdf:object schema:ContactPoint .

[]
  a rdf:Statement ;
  rdf:subject <ceterms:contactPoint> ;
  rdf:predicate rdf:label ;    
  rdf:object  "Contact Point"@en .
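
The difference from the earlier Turtle output comes down to the "@type": "@id" entries: without them, the string values of rdf:subject and rdf:predicate are read as plain literals; with them, they are read as IRIs and any prefix declared in the @context is expanded. A deliberately simplified Python sketch of that behaviour is below. It is not the JSON-LD expansion algorithm, and expand_iri and coerce_node are hypothetical helper names. It also suggests why the subject stays as <ceterms:contactPoint> above: ceterms is not declared as a prefix in this @context.

```python
CONTEXT = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "rdf:subject": {"@type": "@id"},
    "rdf:predicate": {"@type": "@id"},
}

NODE = {
    "@type": "rdf:Statement",
    "rdf:subject": "ceterms:contactPoint",
    "rdf:predicate": "rdfs:range",
    "rdf:object": {"@id": "schema:ContactPoint"},
}


def expand_iri(value, prefixes):
    """Expand prefix:suffix when the prefix is declared; otherwise return as-is."""
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + suffix
    return value


def coerce_node(node, context):
    """Wrap string values of '@type': '@id' terms as IRI references."""
    prefixes = {k: v for k, v in context.items() if isinstance(v, str)}
    id_terms = {
        k for k, v in context.items()
        if isinstance(v, dict) and v.get("@type") == "@id"
    }
    result = {}
    for key, value in node.items():
        if key in id_terms and isinstance(value, str):
            result[key] = {"@id": expand_iri(value, prefixes)}
        else:
            # Terms without coercion keep their value untouched (a plain string
            # here would later be serialized as a literal).
            result[key] = value
    return result


if __name__ == "__main__":
    print(coerce_node(NODE, CONTEXT))
```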

@siuc-nate
Contributor

These changes have been made in pending CTDL and noted in history tracking.
