Usefulness of E55_Type #37
Comments
So... the overall issue is that E55 is used in many places because CRM is a general ontology and needs to be specialized. This means that you might attach many different kinds of category (E55) to an entity, typically using P2 has type.

To solve this problem, LinkedArt proposes the pattern:

E1 -> P2 -> E55 -> P2 -> E55

where the second E55 qualifies the first. This helps with reaching the types programmatically: you look only for the E55 types classifying this entity that are themselves classified as X. E.g.:

E21 -> P2 -> E55 Painter -> P2 -> E55 Profession

If you want to query the E55 for gender, you don't want to return the E55 for profession, and this pattern lets you distinguish the two clearly.

I think what you argue for instead is to specialize the property P2:

E21 -> has profession -> E55 Painter

This is also a valid solution. It means generating a number of new properties. Because of the isA hierarchy, they would all still be reached by P2. That being said, you have to manage all these new properties. The argument in LinkedArt was that the first solution is a more repeatable pattern.

I too am not sure when and how a third level of E55 would help.
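A minimal sketch of the two-level pattern, assuming triples are held as plain Python tuples. Every `ex:` identifier here is hypothetical and stands in for whatever URIs a real dataset would use. It shows how a type on a type lets a query return only the profession types, or only the gender types, of an entity.

```python
# Sketch only: triples are (subject, predicate, object) tuples.
# All ex: identifiers are hypothetical placeholders, not real vocabulary URIs.
P2 = "crm:P2_has_type"

TRIPLES = {
    ("ex:person1", "rdf:type", "crm:E21_Person"),
    ("ex:person1", P2, "ex:painter"),
    ("ex:person1", P2, "ex:male"),
    ("ex:painter", P2, "ex:profession"),  # meta type placed on the type
    ("ex:male", P2, "ex:gender"),         # meta type placed on the type
}


def objects(triples, subject, predicate):
    """Return every o such that (subject, predicate, o) is in triples."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def types_of_kind(triples, entity, meta_type):
    """Return the types on entity that are themselves typed with meta_type."""
    return {
        t
        for t in objects(triples, entity, P2)
        if meta_type in objects(triples, t, P2)
    }


print(types_of_kind(TRIPLES, "ex:person1", "ex:profession"))  # {'ex:painter'}
```

Asking for `ex:gender` instead returns only `ex:male`, which is the distinction the pattern is meant to give.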
To put it another way, CIDOC CRM keeps the ontology small by not covering classification work. A flexible way of extending CIDOC CRM without creating endless new classes, but so that it has the semantic richness users need, is to type classes. The business of classification is well handled in RDF by adopting SKOS. So wherever one sees a reference to E55 Type, it is functionally equivalent to skos:Concept. CIDOC CRM defers to SKOS as the well-known and accepted system for encoding thesauri/vocabularies in RDF.

So let's take the example of occupation/profession (there's an argument there too) from the start. If we want to indicate that a person has an occupation/profession, then we are going to be classifying that person. So the basic CIDOC CRM statement will apply:

E21 Person -> P2 has type -> E55 Type

Following this pattern, we could instantiate:

http://viaf.org/viaf/15873 a crm:E21_Person;

or in 'colloquial' language: Pablo Picasso has type Painter.

Now the problem with the above is that we can have multiple reasons for classifying an instance of person. If we want to add to this and say that Picasso has the state-assigned sex of male, then we would also want to say:

http://viaf.org/viaf/15873 a crm:E21_Person;

Now this is fine and good. But... now we want to support a query that helps us understand who this Picasso fellow is. We want to query all the classifications that have been put on the individual and then display them somewhere, but in an ordered way: the profession information in a profession display area and the gender information in a gender area. But there is nothing in these concepts themselves that says what we are functionally using them for. They are in a SKOS hierarchy, but this does not tell us what we are functionally using them for here. We are missing a meta type to tell us this.
This is the argument for putting a type on these types, generating meta types in the modelling in a standard way, so that developers can easily retrieve just the information they want. So we add:

http://viaf.org/viaf/15873 a crm:E21_Person;

This means we can put it all together and have:

http://viaf.org/viaf/15873 a crm:E21_Person;

Which colloquially reads: Picasso has type Painter.

Functionally, it should allow you to write a SPARQL query that retrieves all types on an entity but sorts them, OR retrieves just one of the kinds of type on an entity rather than all the type information, i.e. retrieves the gender types on this entity only.

The declaration of the type on a type is not an attempt to go around SKOS or to do something different from it. If one knows the particular vocabulary that has been adopted, then there are probably much more interesting searches one can do by exploiting the broader/narrower relations of that vocabulary. So the researcher could do fascinating gender analysis of artists through time (if they had the data) through a thorough understanding of Homosaurus and its distinctions. At the basic level, however, within the CHIN data, the developer/researcher is able to pick out that some type/concept has been used for a particular function within the model (gender, profession and so on), thus facilitating their search and retrieval.
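As a rough illustration of the "retrieve all types but sort them" case, here is a small sketch that groups the types on an entity by their meta type, so a display layer could fill a profession area and a gender area separately. The `ex:` type and meta-type identifiers are hypothetical; the statements in the thread above are not reproduced in full, so this is an assumed reconstruction of their shape rather than the actual data.

```python
from collections import defaultdict

# Sketch only: ex: identifiers are hypothetical, not taken from a real vocabulary.
P2 = "crm:P2_has_type"
PICASSO = "http://viaf.org/viaf/15873"

TRIPLES = [
    (PICASSO, "rdf:type", "crm:E21_Person"),
    (PICASSO, P2, "ex:painter"),
    (PICASSO, P2, "ex:male"),
    ("ex:painter", P2, "ex:profession"),
    ("ex:male", P2, "ex:gender"),
]


def group_types_by_meta_type(triples, entity):
    """Map each meta type to the sorted list of types on entity that carry it."""
    has_type = defaultdict(set)
    for s, p, o in triples:
        if p == P2:
            has_type[s].add(o)

    grouped = defaultdict(list)
    for t in sorted(has_type.get(entity, ())):
        # A type without any meta type is collected under a fallback key.
        for meta in sorted(has_type.get(t, ())) or ["(no meta type)"]:
            grouped[meta].append(t)
    return dict(grouped)


for meta, types in group_types_by_meta_type(TRIPLES, PICASSO).items():
    print(meta, "->", types)
```

Without the meta types, both `ex:painter` and `ex:male` would land in the same undifferentiated list, which is the gap described above.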
Notes on verbal meeting 2020-02-17

Flutifioc: If, for example, we write down all the tags needed for the knowledge base, we have our vocab, and there will be others as well. How are we to link these together?

Habennin: The specification of the Target Model would be similar to linked art: for meta types (anywhere the model points to a type where a discussion would be interesting). A statement of the materials, for example, might be interesting to standardize or normalize without strict enforcement. Having to go out and choose would be meta types. For the meta-type it would be better to use a single vocabulary, to have a constant reference point for the model so we can search for that. On a more theoretical basis, the discussion happens a lot in Parthenos, who tried to integrate a number of datasets; ideally the fields would have been normalized despite messy or bad data or competing standards. FORTH is developing a tool called VisTA. There is also the development of OpenTheso. That could be a long-term strategy, if the data is well curated. Original fields and Enriched fields were done in Ariadne so that everything is reversible.

Illip: Should we develop our own vocabularies or translate others?

Habennin: Aligning with the most likely and reliable in terms of science, scholarship and durability is important.

Habennin: We might need in some cases to rely on our own vocabularies. How to manage those extensions wasn't clear.

Flutifioc: Should there be a mapping of metatypes?

Habennin: The data can be translated to a type, and the data value can be dumped as the value of the E57 node in the mapping and eventual RDF, and the tag would be a rule of CHIN. That E57 has type -- static URI to organize types. It could be CHIN's, the AAT's, etc. Whatever we can have as a specification for programmers.

Flutifioc: Is there a formal relation such as same:as? Or subclass of?

Habennin: Official advice is to treat E55 as a SKOS concept, and some aspects of CRM do not even have to be there because they were there before. It is a rare area where SKOS does not cover the domain correctly.

Stephen: The important thing is to link the metatype to a vocabulary, but the type does not have to be… is that right?

Flutifioc: Like this? (Drawing below)

Habennin: Yes. Even the occupation one can be contentious, as not just anyone can practise a profession. Occupation is a weaker term. Some of this is also discussed in linked.art.

Illip: We need to define the metatype (E55) precisely, but we can be less restrictive for the types. It depends on the level of interoperability vs simplicity.
Quite agree with @Habennin : adding endless classes to CRM is not an option. Two considerations though:
@VladimirAlexiev 1) Yes, for culture and nationality we are reusing the groups patterns. 2) What do you mean by usage-dependent info? Do you see an issue if we use

@Flutifioc Are you still concerned with this issue of E55? Or do you understand the benefits of avoiding the endless classes, even if it might be a little less semantic?
@VladimirAlexiev Why is it not an option? What is the issue with complementing CRM with some classes that are relevant to our use case?

@illip Both. I understand the benefits of avoiding adding classes, even though we are not talking about "endless" classes but a select few that seem very relevant to me, such as Profession or Work of art. But I'm still concerned because, in my humble opinion, there are also benefits to adding such classes, and I'm definitely not as convinced as you are about which option is best.
@Flutifioc Adding a few classes is ok, provided they conform to CRM compatibility principles (which means make subClasses, subProps, or long-paths to complement existing CRM props used as short-cuts).
So there is a bit of danger of adding P2 to a concept because of the way it's used in other data. I don't mean the originator of that concept (e.g. Getty) is going to sue you: I'm just saying be careful that the added P2 makes sense universally.
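For the sub-property route mentioned earlier in the thread (specialising P2 rather than typing the type), a small sketch of why such properties remain reachable through P2. The property names `ex:has_profession` and `ex:has_gender` are hypothetical, and the sub-property table stands in for `rdfs:subPropertyOf` statements that a real store or reasoner would handle.

```python
# Sketch only: the ex: properties and values are hypothetical.
P2 = "crm:P2_has_type"

# child property -> parent property, standing in for rdfs:subPropertyOf
SUBPROPERTY_OF = {
    "ex:has_profession": P2,
    "ex:has_gender": P2,
}

TRIPLES = [
    ("ex:person1", "ex:has_profession", "ex:painter"),
    ("ex:person1", "ex:has_gender", "ex:male"),
    ("ex:person1", P2, "ex:artist"),
]


def is_subproperty(prop, ancestor):
    """True if prop is ancestor or reaches it by following SUBPROPERTY_OF."""
    seen = set()
    while prop is not None and prop not in seen:
        if prop == ancestor:
            return True
        seen.add(prop)
        prop = SUBPROPERTY_OF.get(prop)
    return False


def values(triples, subject, prop):
    """Sorted objects stated on subject with prop or any of its sub-properties."""
    return sorted(
        o for s, p, o in triples if s == subject and is_subproperty(p, prop)
    )


print(values(TRIPLES, "ex:person1", P2))
print(values(TRIPLES, "ex:person1", "ex:has_profession"))
```

Querying by P2 returns every type, while querying by the specialised property returns only the profession. The cost is that each new property has to be declared and maintained, as noted above.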
@VladimirAlexiev Thanks for the advice concerning the conflicts that might arise from this.

I would also like to keep track of some Linked.Art discussions regarding this topic:

Types of Types definition on their website

Going back to the initial topic, CHIN would like to define a policy stating exactly when a situation requires the definition of new classes and properties.
CHIN's current proposal on this issue is:
We will validate this proposal with our Semantic Committee on January 7th. Our proposal regarding the three levels of
All the aforementioned items have been approved by the Semantic Committee on 2021-01-07.
A new section called Prioritization of E55_Type and P2_has_type over new classes and properties has been added to the Target Model.
In lots of places in the target model, we are using an instance of E55_Type. Cf. Issue #29, with the three levels of type:
JPRiopelle - P2 -> Painter - P2 -> Profession - P2 -> Occupation.
I feel that it would be semantically more correct to say that JPRiopelle - profession -> Painter, with Painter an instance of the class Profession. We could even put Profession as a subclass of E55 Type, this way we could still say that:
JPRiopelle - P2 -> Painter - rdf:type -> Profession - rdfs:subClassOf -> E55_Type.
Regardless of the need for a third level (which I feel we don't need, but that is a question for Issue 29), this seems more... RDF-esque? than using two or three levels of E55.
In general, I feel that creating an instance of E55_Type should be done when the line between class and instance is blurred (typically, Painter: we can say that it is an instance of Profession, but also that JP Riopelle is an instance of Painter). In this understanding, Profession has no place being an instance of E55. And in many places in the target model, the instances of E55 could probably be classes. The model would be far easier to understand. Am I wrong in my understanding?
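A minimal sketch of the class-based alternative proposed here, under the assumption that Profession is declared as a subclass of E55_Type and Painter as an instance of it. The `ex:` names are hypothetical. It shows that a type can still be found either through the specific class or through E55_Type itself, by following `rdf:type` and `rdfs:subClassOf`.

```python
# Sketch only: ex: identifiers are hypothetical placeholders.
TRIPLES = {
    ("ex:JPRiopelle", "crm:P2_has_type", "ex:Painter"),
    ("ex:Painter", "rdf:type", "ex:Profession"),
    ("ex:Profession", "rdfs:subClassOf", "crm:E55_Type"),
}


def superclasses(triples, cls):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current in found:
            continue
        found.add(current)
        todo.extend(
            o for s, p, o in triples
            if s == current and p == "rdfs:subClassOf"
        )
    return found


def classes_of(triples, resource):
    """Return all classes of resource, including inherited superclasses."""
    result = set()
    for s, p, o in triples:
        if s == resource and p == "rdf:type":
            result |= superclasses(triples, o)
    return result


def types_in_class(triples, entity, cls):
    """Return the P2 types on entity that are instances of cls."""
    return {
        o for s, p, o in triples
        if s == entity and p == "crm:P2_has_type"
        and cls in classes_of(triples, o)
    }


print(types_in_class(TRIPLES, "ex:JPRiopelle", "ex:Profession"))  # {'ex:Painter'}
```

The same call with `crm:E55_Type` also returns `ex:Painter`, so generic consumers that only know about E55_Type are not cut off by the added class.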