CategoryCode Proposal (formerly EnumerationValue Proposal) #894
Comments
This is a very useful proposal for all of us developing markup of external enumerations of considerable size and working with organizations for the deployment of such enumerations in the context of sharing them on the web. I'm pleased to see a pull request in place. |
In OWL, are these "Object Property Restrictions" [edit slash "Data Property Restrictions"] like https://www.w3.org/TR/owl2-quick-reference/#Class_Expressions ... The class / instance distinction here is less than clear. |
Firstly, as with the rest of Schema.org, this proposal implies no constraints, rules, or inference. That is not to say that a set of values applicable to a specific situation could not be described (and published using Schema) using the types and properties proposed here. It would be up to an individual application to apply its own internal [OWL] rules to indicate that all the EnumerationValues that are 'part of' a particular EnumerationValueSet are valid in a specific circumstance. However, in Schema markup no such inference could be made. |
On 23 January 2016 at 07:40, Wes Turner wrote:
A recommended read that lays out the underlying principles and history of |
@Dataliberate Q: what is the difference between (or relation between) the proposed EnumerationValueSet and http://schema.org/Enumeration ? |
@philbarker good question! The main relation is between EnumerationValue and Enumeration. The relation being very similar to that of the core Schema.org vocabulary, hosted extensions such as auto.schema.org, and external extensions. In theory any value, or identifier for that value, could be defined in the Schema.org vocabulary as an Enumeration-subtype type. As per current examples - OrderInTransit a subtype of OrderStatus, Paperback a subtype of BookFormatType, etc. all themselves subtypes of Enumeration. However, other than for commonly known/used values, it is not practical to burden the vocabulary, and the agreement process for managing it, with the maintenance of all the potential lists of values for such things. Wanting to address the need to be able to mark up, using Schema.org, these terms and values that probably will never get assigned in the vocabulary, is what is behind the proposal for EnumerationValue. EnumerationValue could be considered an 'external Enumeration value'. So in answer to your question, they are closely related at least in how they are/would be used. So much so, that I am considering updating the proposal to make EnumerationValue a subtype of Enumeration. There are already in existence many candidates for terms that could be marked up in Schema using this approach. These examples often have properties in addition to their URI value (name, description, code, etc.) and are often grouped together in sets/dictionaries/terms such as the Library of Congress Subject Headings. That style of need being catered for with the EnumerationValueSet and valueCode property in the proposal. Hope that helps. |
@Dataliberate Q: Did you consider making EnumerationValue a subtype of Enumeration? I think it would be useful. |
Yes I did, and having been asked a couple of times about it, I have concluded that it would be the right thing to do. So my @RichardWallis persona has just done it. |
I'm trying to translate this proposal for those familiar with SKOS. Please correct me if I'm wrong:
I miss counterparts of |
@nichtich your list of approximate relations to SKOS terms is about right for the proposal as it stands. Glad you used '≈'. As described above this is a simple proposal mainly targeted at simple use cases. For example an already existent list of values for some types of things (e.g. the list of ISO639-2 Language Codes). Many of these do not have any of the hierarchy or relationship concepts that would require the extra modelling power of SKOS (related, broader, narrower, exactMatch, etc.). Yes it could be applied to sets of terms already defined in SKOS, but for an initial simple proposal adding much more would a) introduce complexity and b) consequently reduce the potential for broad adoption across the [mostly non-SKOS] web. The several issues/threads dedicated to the re-creation of SKOS in Schema, which have yet to come to a satisfactory conclusion, are I believe symptomatic of the lack of a clear view of where it would be widely implemented. My approach in making this proposal was to take something simple, with obvious simple use cases, that could possibly be used to partially address more complex issues. If we implement it and it gets used, we would have a real foundation with real usage to build on for future extension/enhancement. Meanwhile the much wider discussions around similarity, relatedness, matching, and sameAs can come to a natural conclusion in their own time and in future proposals. So I think we should continue with this in its current state. |
@Dataliberate I like this approach, fits well with what I need for my current project. However I also need to relate an This would be equivalent to |
Further question: is it still recommended to add types for these category values, as defined in the original guidance? Or is this route only intended for cases where that is not useful/advised? I'm having a hard time deciding which option might be best, so was hoping I could use this approach and later add some types & extra semantics if necessary. |
Update: @ldodds Part of the motivation behind this proposal was previous discussions about whether Schema.org should include/support/reference SKOS and, if so, by how much. It was designed as a very lightweight approach that could be built upon based on usage experience. I would suggest that, at least initially, broader/narrower hierarchical relationships between values would be best handled in localised data structures that the [proposed] Schema types would be added to for wider sharing. As per the examples, the final modelling of a set of CategoryCodes depends on what is already in place. If the terms are already defined (in SKOS or something else), it would be a matter of adding further Schema Types. For example a term could be defined as being both a skos:Concept and a schema:CategoryCode. |
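To illustrate the dual-typing idea in the comment above, here is a minimal sketch in Python that builds a JSON-LD node typed as both a skos:Concept and a schema:CategoryCode. The example.org IRIs, the "Yoga" label, and the "YOG" code are hypothetical placeholders, and the helper name `dual_typed_term` is my own invention; `codeValue` and `inCodeSet` are the CategoryCode properties as I understand the pending vocabulary, so treat the exact property choice as an assumption rather than the agreed modelling.

```python
import json


def dual_typed_term(term_id, label, code, code_set_id):
    """Build a JSON-LD node typed as both skos:Concept and schema:CategoryCode.

    All arguments are caller-supplied; nothing here is looked up or validated.
    """
    return {
        "@context": {
            "schema": "https://schema.org/",
            "skos": "http://www.w3.org/2004/02/skos/core#",
        },
        "@id": term_id,
        # One resource, two types: the SKOS typing stays in place and the
        # Schema.org type is simply added alongside it.
        "@type": ["skos:Concept", "schema:CategoryCode"],
        "skos:prefLabel": label,
        "schema:name": label,
        "schema:codeValue": code,
        "schema:inCodeSet": {"@id": code_set_id},
    }


# Hypothetical activity term; these IRIs do not refer to real published data.
term = dual_typed_term(
    "https://example.org/activities/yoga",
    "Yoga",
    "YOG",
    "https://example.org/activities",
)
print(json.dumps(term, indent=2))
```

Any broader/narrower links would remain ordinary SKOS statements on the same node; the Schema.org typing neither adds nor removes them.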
@RichardWallis I'm not sure what you mean by "best handled in localised data structures". Do you mean that we'd need to define a custom set of properties? For the openactive project none of these category code sets exists as SKOS, or even as publicly available data for the most part. So part of my interest here is in helping that data be made open. The broader/narrower relationships are quite important for tying together physical activities. Also for many, many different controlled vocabularies. Their addition here seems like a relatively small change to me? |
@ldodds By localised structures I meant where they were already defined (in SKOS for instance). In such cases what you describe would already be in place if needed. As to starting from scratch I can understand your desire, in this use case, to introduce broader/narrower into Schema. In isolation it does seem like a small addition. However, it would also potentially introduce some assumptions about things marked up as CategoryCode types and the relationships between them that do not exist. SKOS, and hence its terms, assumes an organised structure of terms, such as in some controlled vocabularies. Whereas CategoryCodes could be applied to disconnected things with no such relationships or hierarchy. In previous discussions around possibly including SKOS in Schema, potential issues about introducing too constraining a structure were raised. Also the issue of where you draw the line as to which terms would/would not be a small change to include was the subject of some debate. Although such relationships are important within controlled vocabularies, I wonder if or how they would be used by data consumers. At this stage I am still inclined to keep this proposal as simple as possible, which in itself will be a major step forward in this area, looking to future proposals to possibly extend it based on experience of implementation. |
@danbri I personally prefer to just leave CategoryCode as a subtype. That aligns with the DCMI Abstract Model and in particular their nearly equivalent use case with VocabularyEncodingScheme as a Class http://dublincore.org/documents/dcmi-terms/#section-8 Anyways, can we just get DefinedTerm already ? :) It's Simple English and will foster more growth compared to "EnumerationBlah's" and "VocabularyBlah's" (and it comes from @philbarker who makes good judgement calls and has never failed us yet :) |
I like the idea of having |
+1 for CategoryCode
…On Wed, Nov 22, 2017 at 7:46 AM, Jarno van Driel ***@***.***> wrote:
I like the idea of leaving CategoryCode as a subType as well, especially
since it represents more closely what marketers are looking for in their
day to day work. I fear using just TermDefinition will cause many
marketers (and the developers that implement markup for them) to overlook
it.
|
Liking @danbri's suggestion of keeping I will follow that logic through and map out how that would look with DefinedTermSet and relevant properties. |
CategoryCode as a subtype of DefinedTerm sounds good. I agree with @thadguidry's sentiment: I think these being in pending also inhibits uptake. Any prospect of moving this into the main vocabulary? |
I have now updated the PR (#1776) to reflect the proposal of making CategoryCode a subtype of TermDefinition - looks good to me. This is still all in pending - as expressed by others, it would be good to get this in the core. |
@RichardWallis GraphicNovel link is broken at the bottom of https://schema.org/BookFormatType |
I am rather late to this discussion. First, thinking about this from a practical viewpoint: in my institute we have a lot of vocabularies, classifications, and term lists. Managing them in a specific vocabulary tool, taking them out of Excel files or normal webpages, is my goal these days. Tools like that use SKOS. I might consider adding schema.org equivalents like those proposed here. But there it stops for me. I want to use them as separate applications. I do agree that copying SKOS into schema.org is not a good thing. Still, going on about the case of Datasets and DCAT that is also mentioned, I would think that the case for SKOS support in schema.org is strong. Could definedTermSet in the future be one of the type SKOS? I am also triggered by the fact that Google in the schema.org/Dataset now gives support to DCAT as such. Anyway, these are interesting developments. |
I'm going to jump in here. Does this extension support a range of categories? I.e. My Thing is in categories C-E. Common in patents. |
@dgrahn yes, that would be the valueCode now in https://pending.schema.org/CategoryCode as "codeValue" that @RichardWallis proposed. It can be used to hold any "key" or "code" and where there is an associated meaningful value when that "key" or "code" is looked up. Your "C-E" key/code has some associated meaningful value to the publishers or consumers of patents.
By the way, what does "C-E" mean ? |
@thadguidry It's an example. Could have done foo-bar. What if you don't want to have users parse the value? |
@dgrahn sorry, I don't understand. Can you explain further what you mean ? Give us the scenario or problem you have that you are trying to solve. That will help. |
So categories can sometimes be given as a range. i.e. from C to E. Those category names can actually have "special" characters in them like,
In fact, that's what I'm using right now as an extension of |
@dgrahn what would the parent types be for that scenario ? Can you give an example of the Thing that has a startCode of C and endCode of E ? I'm trying to understand that parent Thing and what is it called in Patent terminology ? If we can understand that better, then maybe we can find an easier path or different way to help. |
I've been thinking about proposing Is that making sense? |
@dgrahn Yep, makes much more sense now. OK, someone else has a common need and opened an issue for Patents: https://github.com/schemaorg/schemaorg/issues/1863 I would suggest beginning to work with the community to create and maintain an extension for Patents (this might involve working with the loose Law proposals also in our issues, just search them). To begin, see the "Extensibility Mechanisms" section and other sections in How We Work. And use our mailing list to begin, or if you want to get formal, you can request a W3C Community, https://www.w3.org/community/schemaorg/ |
If I am getting your intention @dgrahn, I don't think |
I wanted to follow up on some discussion on Twitter that seems related to this. First, I think the general proposal is very useful. My chief concern is that some users may find the approach difficult to implement and may need a light weight alternative. Two hurdles I see to adoption are:
I like how the MedicalCode provides an option to use I wonder if we could extend this option to cover any
Note: only the name of the classification systems would be enumerated, not all the individual codes belonging to that system, which would stay outside of schema.org. The benefits of having these external codes as enumerated options is that it will point users to potentially helpful classifications, and will save them the effort of finding and entering the URL. The list of enumerated options does not need to be fixed. If they need to indicate a less commonly used classification, users still have the ability to indicate using the I know some of these are already available as dedicated properties, a practice I would guess schema.org does not want to proliferate. Currently there are dedicated properties for:
In pending, there is also a One question raised is if users want a quick way to indicate a category value, why can't they use wikidata? While wikidata is useful to resolve entities, it is not the most friendly resource for classifying the kind of thing being described. It has a flat structure and can sometimes be prone to duplication. I think it will be easiest for users to start with a list of values that they can choose from. |
@MichaelAndrews-RM Creating enumeration types for all the potential classification systems across all useful domains I believe would be beyond the remit of Schema.org and not least a significant task to keep up to date. With the Where those may be equivalent to individual terms/categories in external authoritative classification systems or the classification systems themselves the |
@RichardWallis I am happy that the proposal can accommodate so many different kinds of classifications. While it is good at addressing the "long tail" of the distribution of many classifications available, it isn't so good at helping people who need to use one of the 5-10 most-frequently cited classification systems because it is so complex, and I doubt webmasters will find it easy to understand and use. Having these classifications is a big benefit to support the findability and aggregation of data, but for adoption at scale to occur, search engines will need to be able to promote an easy-to-use way for webmasters to add these codes. I know that classification code schemes aren't fixed enumerations like days-of-the-week, but I expect the most popular ones are reasonably stable, so that updating them (e.g. from ISICv4 to ISICv5 in the future) would not be too hard to do. Having said that, I recognize the first priority is getting the |
So the updated proposal would have a mapping to SKOS as follows:
By the way I don't fully get why we need both Anyway I think the proposal does not catch the use case of referencing a concept/term/category/... by its code/notation/... For instance how to express the DDC notation of a book is
|
@nichtich Regarding the need for both As to your question about usage... There are a few ways that this could be described using
Note: due to licensing restrictions, the sharing of DDC classification names is difficult. Other schemes are not so restrictive - I have included a Library of Congress Subject Heading to demonstrate. As the LCSH data is public and online, a potentially more lightweight version could be like this:
or possibly:
How these would be represented in Microdata would depend on individual implementations, but here is one possible way:
|
Is this necessary? Why not just use SKOS? |
@dr-shorthair
Also in response to a similar question/comment:
The previous discussions, over several years, concluded that adopting SKOS [in Schema.org], in whole or part, would not fit well with the [current] approach and use cases for the vocabulary. Currently the terms created are located in the pending section of the vocabulary. In the future, as they become widely adopted, I expect a proposal to move them into the core of the vocabulary. At that time it may be appropriate to also suggest that the terms should be mapped to their SKOS equivalents. |
Implemented in PR #1255 |
Is there still no suitable way to mark up a glossary page that contains many terms? Thanks for everything. I have several pages with complete dictionaries, and they are largely ignored by Google and its indexing. Markup might help make something as useful as a glossary worthwhile. Without it, glossaries are a real waste of working time: it is not worth creating a glossary or dictionary if a book sales page that carries the term in its title is always going to take precedence. There seems to be no markup that gives this type of content (such as glossaries) its due. If you know of one, could you suggest an approach? For example, https://ciberninjas.com/glosario/completo-tecnologias-python/ is a dictionary of terms related to Python. Would the result be 48 "what is" questions on a single page? Would Google penalize me for that, given that there is no specific markup for this type of publication? Sorry for my bad English; I hope you have been able to understand me, more or less. |
@rosepac I think you are looking for DefinedTermSet and DefinedTerm. The example for DefinedTermSet (at the bottom of the page) shows how to use it for a dictionary; a glossary would be the same. Whether Google likes the markup is beyond the scope of schema.org |
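As a rough sketch of what that could look like for a glossary page, the Python below generates JSON-LD for a DefinedTermSet containing DefinedTerm entries. The helper name `glossary_jsonld`, the example.org URL, and the two sample definitions are hypothetical; `hasDefinedTerm` and `inDefinedTermSet` are the linking properties as I understand the vocabulary, so check them against the type pages before relying on this.

```python
import json


def glossary_jsonld(set_id, set_name, entries):
    """Build a DefinedTermSet whose hasDefinedTerm lists one DefinedTerm per entry.

    entries is an iterable of (slug, name, description) tuples.
    """
    terms = [
        {
            "@type": "DefinedTerm",
            # Fragment identifiers let each term be addressed on a single page.
            "@id": f"{set_id}#{slug}",
            "name": name,
            "description": description,
            "inDefinedTermSet": set_id,
        }
        for slug, name, description in entries
    ]
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTermSet",
        "@id": set_id,
        "name": set_name,
        "hasDefinedTerm": terms,
    }


# Hypothetical glossary content, purely for illustration.
glossary = glossary_jsonld(
    "https://example.org/glossary/python",
    "Python glossary",
    [
        ("decorator", "Decorator", "A callable that wraps another callable."),
        ("generator", "Generator", "A function that yields values lazily."),
    ],
)

# The output would typically be embedded in the page inside a
# <script type="application/ld+json"> element.
print(json.dumps(glossary, indent=2))
```

One node per term keeps a 48-entry glossary as a single set rather than 48 separate question-and-answer items; how any search engine treats it is a separate matter.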
I am looking at using DefinedTerm for classifying data files on a language project - so there would be sets of terms for classifying CreativeWorks in various dimensions. Is there a way to leverage the sets so you can say that a particular property, say linguisticDataType, can have a value that comes from one of the sets? |
Background
When marking up with Schema.org there is often a need to associate the Thing being described with a pre-defined value - a type, category, subject, topic, definition, etc.
In certain specific cases the vocabulary handles this using Enumerations and Enumeration subtypes to provide a specific type value. For example BookFormatType, which has subtypes of EBook, Hardcover, Paperback, and in the bib.schema.org extension, http://bib.schema.org/GraphicNovel. This mechanism works well with enumerations containing a small number of enumeration types and of fairly static content.
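As a small illustration of this built-in mechanism, the sketch below (Python emitting JSON-LD) describes a book whose format is given by pointing at an Enumeration value defined in the vocabulary itself. The book title is a hypothetical placeholder, and referencing the value by its full schema.org URL is one common convention rather than the only way to write it.

```python
import json

# A Book whose bookFormat points at a value from the BookFormatType enumeration.
book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "An Example Book",  # hypothetical title
    "bookFormat": "https://schema.org/Paperback",
}

print(json.dumps(book, indent=2))
```

The value is just a URL chosen from a short list maintained by Schema.org, which is why the approach suits small, stable lists.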
Where it is not practical, or desired, for Schema.org to become the authority for many, various, and/or large sets of values, external enumerations are recommended. In a blog post referenced from the Schema.org documentation the mechanism for external enumerations is introduced for referencing lists of values external to the vocabulary.
What could be viewed as a compromise between these approaches is demonstrated in the DayOfWeek Type. It could be argued that the actual days of the week should have been defined in Schema.org, Monday, Tuesday, etc., as subtypes of DayOfWeek. Instead values in the GoodRelations vocabulary, for days of the week, are documented as commonly used. This both encourages the use of external values and expresses an implied preference for a particular external set of values.
Markup for External Enumerations
What Schema.org does not yet address, however, is the markup of external enumeration values in the context of them being shared on the web. The use cases for this include the potential addition of Schema.org markup to existing sets of values and the creation of new sets. Examples include adding Schema markup to values, often referred to as authorities in the library domain, for subjects and persons at national level libraries such as The Library of Congress; the markup of a new authoritative list of sports types, or bank account types, or medical treatment types.
Previous discussions [1] [2] referencing an earlier MiniSKOS proposal provide background and some use cases to view this simple proposal against.
Proposal
This proposal consists of:
Definition RDFa
Examples (Turtle)
1- A Library of Congress resource type
2- An animal classification term and term set
3- Terms in a dictionary of legal terms
4- An occupation term defined by O*Net Online
5- An ISO639-2 Language Code