Changing file-format-types to be RDFS instead of OWL #27

Merged
merged 1 commit into from Dec 4, 2015

escowles
Contributor

@escowles escowles commented Sep 9, 2015

  • adding support for named individuals to rdfs2html stylesheet

Fixes #26

@escowles
Contributor Author

escowles commented Sep 9, 2015

I wasn't sure what to do with the udfrs:GenreFacetType instances -- these look like OWL named individuals, and I don't think those map directly to RDFS. Is it OK to leave them as is?

@ruebot
Contributor

ruebot commented Sep 10, 2015

@escowles thanks for updating the stylesheet too!

Looks like udfrs:GenreFacetType is an owl class: http://udfr.org/onto/onto.rdf

<owl:Class rdf:ID="GenreFacetType">
  <rdfs:subClassOf rdf:resource="#ControlledVocabulary"/>
  <rdfs:isDefinedBy rdf:resource="#"/>
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Genre facet type</rdfs:label>
  <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
    The genre facet type defines the main classes found in the GDFR classification system. It is intended to indicate broadly the type of content associated with a format.
  </dc:description>
</owl:Class>

...and http://udfr.org/docs/onto/

@escowles
Contributor Author

@ruebot: Right, udfrs:GenreFacetType is a class -- and the terms here are all instances of it. So I think that makes them NamedIndividuals in OWL. This wasn't explicit before, but they could have been:

<owl:NamedIndividual rdf:about="http://pcdm.org/file-format-types#Archive">
  <rdf:type rdf:resource="http://www.udfr.org/onto#GenreFacetType"/>
   ...
</owl:NamedIndividual>

So, should we leave them as individuals? Or should they be classes that subclass udfrs:GenreFacetType? Maybe @azaroth42 or @acoburn have opinions?

@ruebot
Contributor

ruebot commented Sep 10, 2015

@escowles oof. I totally misinterpreted that 😬

@acoburn
Contributor

acoburn commented Sep 10, 2015

As currently defined, these are definitely individuals. For instance, see the OWL guide. @escowles is correct in his example above.

However, when I look at other similar vocabularies (e.g. DCMIType), these sorts of entities are defined as classes. So I'm somewhat inclined to follow that pattern (though there may be another pattern suggesting otherwise).

In terms of using this vocabulary (as it currently stands), am I correct that one might express this:

<my-resource> a pcdm:File, pcdmuse:OriginalFile ;
    dc:type pcdmformat:Dataset .

as opposed to this:

<my-resource> a pcdm:File, pcdmuse:OriginalFile, pcdmformat:Dataset .

My understanding of the Class vs. Individual distinction is that an Individual is one particular thing. For example, lit:JohnMilton or planet:Jupiter, as opposed to a "generic type": lit:Author or planet:Jovian. And so it would follow that e.g. a Dataset is a type of thing (i.e. a rdfs:Class) rather than a particular thing (owl:NamedIndividual). However, one could also argue that a Dataset is a particular GenreFormat (and hence an owl:NamedIndividual rather than an rdfs:Class).
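
To make the two readings concrete, here is a minimal Turtle sketch of each. It is only an illustration: the `pcdmformat:` namespace IRI is assumed to be the one used for `#Archive` earlier in this thread, and the two options are alternatives, not statements meant to be loaded together.

```turtle
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:        <http://www.w3.org/2002/07/owl#> .
@prefix udfrs:      <http://www.udfr.org/onto#> .
# Assumed namespace, taken from the #Archive example above.
@prefix pcdmformat: <http://pcdm.org/file-format-types#> .

# Option A: Dataset is an individual, i.e. one particular genre facet.
pcdmformat:Dataset a owl:NamedIndividual, udfrs:GenreFacetType .

# Option B: Dataset is a class, i.e. a type that resources can be instances of.
pcdmformat:Dataset a rdfs:Class ;
    rdfs:subClassOf udfrs:GenreFacetType .
```

Under option A the term would normally appear as the object of a property such as dc:type; under option B it can also be the object of rdf:type.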

That is to say, I could go either way but am inclined toward defining them as classes because it seems other similar vocabs do that. Do @azaroth42 or @barmintor have an opinion?

@escowles
Contributor Author

I just checked the other vocabs, and DCMIType, MARC Resources, Nepomuk and Pronom define their terms as classes, and AAT and UDFRS define them as individuals.

I agree with Aaron: I could go either way, but these terms do seem more like categories, so maybe converting them to rdfs:Classes makes more sense.

@escowles
Contributor Author

I've updated this PR to make the entities RDFS Classes instead of named individuals.

@ruebot
Contributor

ruebot commented Sep 15, 2015

👍

@acoburn
Contributor

acoburn commented Sep 15, 2015

👍 (non-binding)

@barmintor
Contributor

I'm not sure about this. If you get into the notion/category debate, I think you're getting into an overly expansive ontology of class. I'd ask, for example, whether you expect these Things to be the object of rdf:type, or of (for example) dc:format. EDIT: I see @acoburn is ahead of me!

@barmintor
Contributor

That said, I'm very interested to see what Rob's opinion is.

@escowles
Contributor Author

@barmintor I was definitely thinking these would be used with dc:format. Does that argue against defining them as classes? The DCMIType terms are defined as classes, and they seem like the canonical terms to use with dc:format: http://dublincore.org/2012/06/14/dctype.rdf

@acoburn
Contributor

acoburn commented Sep 15, 2015

FWIW, I was also planning to use these with dc:format.

@barmintor
Contributor

I'm not digging in my heels, I only want to make sure we're not conflating semantic contexts here. If we're going to follow the DC practice here, we should probably remove the <rdfs:subClassOf rdf:resource="http://www.udfr.org/onto#GenreFacetType"/> statements.

@ruebot
Contributor

ruebot commented Sep 15, 2015

pings @azaroth42

@azaroth42
Contributor

👎 to both making them classes and using dc:format.

If the pattern is:

_:x a pcdm-ext:Archive ;

Then I'm okay with a class. But having classes as the object of dc:format
seems very strange. What would the instances of the class be?

@barmintor
Contributor

@azaroth42 I am reading this as "-1 to making them classes while using dc:format", and not "-1 to classes; -1 to dc:format". Is that correct?

@azaroth42
Contributor

Yes...

👎 to ?x dc:format ?y . ?y a rdfs:Class .

But I'm fine with either ?x a ?y . or ?x dc[terms]:format ?y .

Happy to hear arguments as to why it should be a class though?

@escowles
Contributor Author

@azaroth42: I think instances of the classes would be fully-specified file formats (e.g. TIFF 6.0 would be an instance of #RasterImage). I think the vocabs we're referencing here are split between whether their terms are classes or individuals, though DC/DCMIType definitely envisions using dc:format with DCMIType classes.

@azaroth42
Contributor

Some worked examples might help?

@escowles
Contributor Author

@azaroth42: I would expect the typical use to be something like:

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ebucore: <http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@prefix pcdm: <http://pcdm.org/models#> .
@prefix pcdmfmt: <http://pcdm.org/file-format-types#> .
@prefix pcdmuse: <http://pcdm.org/use#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

</object1/files/file1> a pcdm:File, ldp:NonRDFSource, pcdmuse:ServiceFile;
  dc:format pcdmfmt:Video;
  ebucore:fileSize "12345678"^^xsd:long;
  ebucore:filename "movie.mp4" .

</object1/files/file2> a pcdm:File, ldp:NonRDFSource, pcdmuse:ThumbnailImage;
  dc:format pcdmfmt:RasterImage;
  ebucore:fileSize "5678"^^xsd:long;
  ebucore:filename "thumbnail.jpg" .

</object1/files/file3> a pcdm:File, ldp:NonRDFSource, pcdmuse:ExtractedText;
  dc:format pcdmfmt:UnstructuredText;
  ebucore:fileSize "1234"^^xsd:long;
  ebucore:filename "fulltext.txt" .

</object1/files/file4> a pcdm:File, ldp:NonRDFSource, pcdmuse:Transcript;
  dc:format pcdmfmt:HTML;
  ebucore:fileSize "1234"^^xsd:long;
  ebucore:filename "transcript.html" .

But you could also define individuals if you wanted to record a specific format for some reason:

</object1/files/file5> a pcdm:File, ldp:NonRDFSource, pcdmuse:OriginalFile;
  dc:format pcdmfmt:Video, </formats/VideoFormat1>;
  ebucore:fileSize "123456789"^^xsd:long;
  ebucore:filename "movie.vid" .

</formats/VideoFormat1> a pcdmfmt:Video;
  dc:title "Video Format #1" .

@jpstroop

"But having classes as the object of dc:format seems very strange."

Yes.

If I'm understanding the concern correctly, I think it comes down to: "This thing is a this format" vs. "This thing is of this format", which is a pretty subtle distinction.

To me, an instance of a postcard is not an instance of a format. It might be a resource that has characteristics in common with other things that are also of this dc:format, but, from a practical perspective, you can only use that fact to contextualize it among other resources (based on their dc:formats) or maybe trigger certain behaviors in an application. You can't use the object of dct:format to constrain, e.g., the rdfs:range or rdfs:domain of a resource, so what does making it a class get us? If anything, by not making the object of dct:format a class, the distinction between rdf:type and our intentions for dct:format becomes clearer.

@azaroth42
Contributor

Thanks for the example @escowles! I'm still 👎 to using both classes and instances of those classes as the object of dc:format. The video File doesn't have a format which is the class of video formats ... it has a particular format. The video File (OTOH) is a Video. Having a class for use (which is context specific) but a mix of class and instance for format (which is not context specific) doesn't fill me with happiness.

I agree with @jpstroop: The video file is-a Video. It is-in-a/has-a format, which is-a Format.

@escowles
Contributor Author

I think I understand the reasoning for using individuals instead of classes here: dc:format should point to a concrete instance not a class, and using classes will lead to confusion with the File Use Vocab.

I'm happy to revert to using udfrs:GenreFacetType instead of rdfs:Class. But the existing rdfs:subClassOf statements should probably be changed to something else: skos:broader makes the most sense to me, given the skos:exactMatch/skos:closeMatch we're already using.
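
As a rough sketch of what that could look like (not the actual vocabulary contents): the `pcdmfmt:` namespace IRI is assumed from the #Archive example earlier in the thread, and `pcdmfmt:Image` is a hypothetical parent term used only to illustrate the hierarchy.

```turtle
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix udfrs:   <http://www.udfr.org/onto#> .
# Assumed namespace, taken from the #Archive example earlier in the thread.
@prefix pcdmfmt: <http://pcdm.org/file-format-types#> .

# Hypothetical parent term, present only to illustrate the hierarchy.
pcdmfmt:Image a udfrs:GenreFacetType .

# The narrower term stays an individual; skos:broader takes the place
# of the rdfs:subClassOf statement.
pcdmfmt:RasterImage a udfrs:GenreFacetType ;
    skos:broader pcdmfmt:Image .
```

With this shape, dc:format points at an individual, and the hierarchy between terms is carried by SKOS rather than by class subsumption.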

@ruebot
Contributor

ruebot commented Sep 30, 2015

@escowles 👍 -- I like the use of skos:broader

@ruebot
Contributor

ruebot commented Oct 9, 2015

I'm gonna tag some new committers to see if we can get some movement on this:

@daniel-dgi @no-reply @kestlund

@kestlund

@escowles @ruebot
Catching up on this discussion... 👍 to 'skos:broader'; I had been indifferent to 'udfrs:GenreFacetType', but if it resolves the arguments, then it is certainly worth keeping.

Are there any other outstanding issues or just looking for additional consensus?

@DiegoPino

👍 for SKOS semantic relations, but I also think it would be correct to declare every instance as rdf:type skos:Concept, since skos:broader, skos:exactMatch, etc. work on skos:Concept individuals.

Lastly, just a functional idea (wish), it could be useful to add an owl:imports for udfrs. It's a practical need when using pcdm ontologies in applications like protege. No need to import skos because udfrs already does this.
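
A minimal sketch of such a declaration follows. Both IRIs are assumptions: the ontology IRI is guessed from the vocabulary namespace used for #Archive earlier in the thread, and the import target is the UDFR document linked above.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Both IRIs below are assumptions, not confirmed by the vocabulary itself.
<http://pcdm.org/file-format-types> a owl:Ontology ;
    owl:imports <http://udfr.org/onto/onto.rdf> .
```

Tools that honor owl:imports would then load the udfrs class definitions alongside the PCDM terms.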

@escowles
Contributor Author

@kestlund I think we're just making sure we've got consensus here.

@DiegoPino: I agree it would be good to define the terms as skos:Concepts, since we're using the SKOS predicates to link them. I'm not sure about importing UDFRS -- is that just for the udfrs:GenreFacetType definition?

@DiegoPino

@escowles the idea of importing is just functional. We are creating individuals from classes defined in an external ontology, so I thought it might be a good idea to import them. But don't worry, it's just a wish based on one of my personal use cases and maybe out of scope (so no intention to add this topic to this particular conversation):

Personal use case:

I have been trying to understand the strange modal mix of the RDFS and OWL worlds in PCDM (strange to me; I'm sure there is a need, but I'm not clearly aware of it), and doing some local research using Protege to see how well all those different ontologies + LDP + PCDM play together. I have seen some comments in the issues here about OWL being a complicated beast to handle, but I still see some parts of OWL being used (that's the modal part). My own experience is the opposite (like the beautiful idea of having ObjectProperty and DatatypeProperty as different properties), and since I don't fully understand how jumping from RDF to OWL affects this, I usually pass the PCDM ontologies through Protege. That said, without imports, testing becomes very complicated.

…for named individuals to rdfs2html stylesheet
@escowles
Contributor Author

I've added rdf:type statements to the terms to make them skos:Concepts, and rebased to squash and resolve conflicts with the updated rdfs2html stylesheet.

@DiegoPino: I haven't included an owl:imports declaration, since that seems like a separate issue to me. Can you create another ticket for that? It seems like there is a broader discussion of OWL/RDFS, compatibility with tools, etc. that we should have.

@DiegoPino

@escowles: thanks, don't worry about the owl:imports; it's just good practice when creating new individuals from externally defined classes. But I will create a new ticket for that, because I'm having some issues dealing with this strange (strange to me… long discussion) mix of OWL and RDFS when trying to validate and interoperate with PCDM + LDP in Protege.

@ruebot
Contributor

ruebot commented Nov 10, 2015

@duraspace/pdcm-committers shall we review/vote on this again since we have new commits from @escowles?

@ruebot
Contributor

ruebot commented Dec 2, 2015

@duraspace/pdcm-committers bump 😓

@kestlund

kestlund commented Dec 4, 2015

+1

@azaroth42
Contributor

👌 ... This isn't how I would do it, but as I'm not doing it and it's not core, I have no technical objections.

FWIW, the approach that I have seen taken most often is to use classes and rdf:type, such as:

  • dcmitypes
  • schema.org
  • activitystreams

@jpstroop

jpstroop commented Dec 4, 2015 via email

awoods pushed a commit that referenced this pull request Dec 4, 2015
Changing file-format-types to be RDFS instead of OWL
@awoods awoods merged commit 1251d4e into master Dec 4, 2015
@awoods awoods deleted the rdfs-and-html branch December 4, 2015 14:27

9 participants