Profiles are uniquely named sets of data constraints, such as data elements (e.g. classes, properties, value domains) that describe (meta)data. #275
Comments
Could this be reworded as "data" or "(meta)data"? It's perhaps better to make it more general. |
The issue here was the use of "properties" and whether that applies only to RDF. So "or metadata terms" was (awkwardly) added to show that solutions other than RDF are included. Any wording that you can come up with that gets that across would be acceptable to me. |
ok I won't add to the currently ongoing work by suggesting new wording now, but this is worth keeping in mind, thanks. |
Please consider rewording in the more requirement-oriented, imperative terms of RFC 2119, e.g. |
I'd prefer: Profiles should comprise a collection of metadata terms (adding:) This matches the DCMI definition, although as I recall Annette thought that profiles should be able to define new terms. |
Actually I would argue against 'comprise' in @jpullmann's suggestion, because it feels strange to say that a profile comprises a collection with a name, while the original idea is that the name of the collection is the name of the profile itself. Then the fact that the elements of a profile are from published vocabularies is probably relevant, but I would prefer to rely on other existing requirements to carry this instead of extending the scope of this one. E.g. we could piggyback on the one that says that profiles are based on existing specifications. Such a requirement is probably a better fit, as profiling a specification is probably nonsensical if that specification has not been published anywhere. And talking about @agreiner, I guess she may argue against restricting the scope to 'metadata': she has made a comment about this for the intro of the Profile guidance ;-) |
@aisaac I'm open to your suggestions - maybe: Profiles should be made up of a collection of properties or terms, and the collection should have a name. The problem is that it is a named collection but it can't be "made up of a named collection" (because it can't be made up of a single thing, a named collection), although it can be that they "are a named collection". |
"Profiles should be made up of a collection of properties or data terms and have a name"? Sorry I wasn't very constructive: in fact my only issue with the original wording was the specific aspect of data vs metadata. |
I have suggested this wording in the profiles guidance document. It may need to be expanded: "2.1 Profiles are named collections of properties. Profiles SHOULD be made up of a collection of properties or terms, and that collection SHOULD have a name. Properties are selected from existing vocabularies. Because profiles as described here are not limited to those in RDF, collections could be of metadata terms from any type of metadata schema." |
I agree with the general idea of the proposal, but I would nitpick over two words:
|
I very much agree about "properties" and am struggling to find a better word or phrase:
I get the desire to speak of "data" as well as "metadata" but I wouldn't want to confound a profile with the instance data in a dataset, such as statistical data. The profile is not instance data (except of the profile itself, but that seems tautological), but a set of terms and constraints that are used to create instance data that usually has a descriptive role. Maybe we need to define data and metadata? |
From @agreiner in email on 20-12-2018:
|
@kcoyle before I make a proposal about the wording itself, a bit of clarification about metadata vs data is perhaps needed, indeed. Or more precisely about 'data terms (or elements)' vs 'metadata terms (or elements)', because I feel that this is the problem (since I guess we both agree on what metadata and data are, and that metadata can be seen as a kind of data). I call 'data terms (or elements)' the classes and properties that are used to create (instance) data. And 'metadata terms (or elements)' the classes and properties that are used to create metadata. Profiles can be for either level, e.g. there can be DCAT profiles for data catalogues (which is metadata) and profiles for the statistical data that is in a dataset. In any case, in my view writing that profiles are made up of (meta)data terms (or elements) wouldn't confound profiles with instance data. Are we on the same line? |
@aisaac "there can be DCAT profiles for data catalogues (which is metadata) and profiles for the statistical data that is in a dataset." I totally agree on the separation between descriptive data (metadata) and instance data, and that profiles can consist of element sets for either type. Profiles are sets of elements intended to define data. Those elements have names (e.g. dct:title). The named elements may be for units of instance data or units of descriptive metadata. The profile itself should have an identifier by which it can be referred. So: ?? It's very hard to do as a single sentence, and I don't know if we've decided that something is NOT a profile if it doesn't have an identifier. Even saying "must" does not mean that a profile without an identifier is not a profile, only that it doesn't meet our best practices. So having the name in the definition gives us a philosophical problem, IMO, and that should instead be a requirement, not a definition. I'm not sure this helps, but you should give it your best shot and then I think we should consider this done. |
I very much like your proposal @kcoyle. Indeed it's hard and the progress looks minor, but I believe this word- and concept-smithing is precious. I would suggest adapting your suggestion to an even more "requirement-focused" approach (as @jpullmann suggested earlier) by having a should/must for the first part of the requirement too, and keeping classes and properties as examples, in order to make the thing easier to relate to existing approaches: "Profiles must be made up of sets of elements, such as classes and properties, that define instance data or metadata. Profiles must have a unique identifier or name." I'm opting for "must" over "should" because I think this reflects the original requirement. I.e. for the corresponding use case I guess that a profile that doesn't bring a set of elements and that doesn't have a name is useless. The WG may argue about this later on, as you suggest, but I believe this is a fair re-writing of the requirement. We could add a note about it, calling for feedback. But at least we would have solved a first round of discussion about this requirement :-) |
Great, @aisaac. I say let's go with your re-write and we'll see if anything that follows in the document causes us to re-think it. |
I don't feel very comfortable with this rewrite yet. "Profiles must be made up of sets of elements, such as classes and properties, that define instance data or metadata. Profiles must have a unique identifier or name" is confusing data and metadata (of course), but I don't think we want to join the list of failed attempts to distinguish these (failed as in we don't have an obvious consensus yet!). Specifically, I think profiles consist of "statements about" "data elements" (and thus can be seen as metadata about such elements, and indirectly about data within such elements). The profile represents a set (i.e. a named set) of such statements. |
I could go for "statements about data elements" - now we need to fit that into the rest. The trick here is : statements I'm not sure whether "that" refers to the statements or the data elements or "statements about data elements". Does anyone have a better grasp of grammar rules for this? |
I think both the data elements and the statements about the data elements define aspects of the data instances (but probably not a full definition either; for example, if we say an author identifier must be an ORCID id, there is a lot of external context defining what an ORCID id is that would never be replicated within a profile that requires it). So maybe "that" = "where the data element definition and the statements about it defined by a profile combine to provide metadata about data instances" ... |
If it doesn't have to be one sentence, we could write "Profiles are made up of sets of statements about data elements, such as classes and properties. Those statements define instance data or metadata. Profiles must have a unique identifier or name" |
I'm a bit confused about this exchange. To me the point about metadata in the last wording was about saying that the data created according to profiles can happen to be metadata for other data (just as there are profiles for the metadata in data catalogues). I didn't want to stress that statements in a profile are metadata (even though, yes, they are). I.e. I wanted to raise the possibility of profiles-for-metadata, not profiles-as-metadata. @larsgsvensson I don't understand your last comment: where do we say that "if something is a set of statements about data elements and has a unique name, then it's a profile"? |
@aisaac Perhaps it's only a matter of style how we write definitions. One way is to say "A profile MUST have A, B and C". Another way is to say "If a resource has A, B and C, then it's a profile" (kind of indirect typing). I think I'd prefer the second one and then go on "If your resource turns out to have a profile, then it MUST have those attributes, too". And yes, that's a discussion for later since this discussion is about this requirement's title. Perhaps "Profiles are uniquely named sets of statements about data elements, such as classes and properties, that describe how instance data is structured". That would highlight that profiles help to understand the inner structure of a (meta)data collection. |
@larsgsvensson ok it's a matter of style indeed. Let's discuss this later :-) So I'd now propose something like
And if neither of these two is ok, then I suggest wielding the axe and removing all attempts to clarify 'metadata terms' in the original title. That would be |
I really don't care what we do with the title of the use case, but I do have an idea for the definition, when we get around to that: "Profiles are uniquely named sets of constraints, such as prescribed classes or properties, applied to data elements. They can be used to ensure consistency in instance data or metadata." |
@aisaac yes, the use of "metadata" is perfectly fine with me. And looking at this definition for the umpteenth time: if we talk only about "classes and properties", it has a very RDFy touch and it feels as if we rule out the possibility of having profiles for XML documents or other markup languages. Should we extend it to "e.g. classes, properties or markup elements"? |
"data elements"? |
@larsgsvensson thanks for the feedback @larsgsvensson @kcoyle at this stage the expression is "data elements, such as classes and properties" and I think this is the best we can get as we've discussed it many times ;-) |
As per the discussion on #435 (for example this message from @kcoyle ) it seems that this requirement is, in terms of the categorization currently made in the UCR document, both a general 'definition' requirement and a requirement that indicates a function of profiles. This means it would fit both into sections 6.10 and 6.11. @jpullmann would you be ok with such duplication? |
@jpullmann I'd really like to hear your opinion about possible duplication of this in the UCR! |
The discussion has not been very active for a month, and it seems that in the last exchanges people were rather keen on discussing the general definition of profiles, not the title of this requirement, which captures only one specific dimension of it. So I'm renaming it to one of the last proposals - I'm picking the shorter one: |
In the discussion 28-02-2019: "Profiles are uniquely named sets of constraints, such as data elements (e.g. classes and properties) that describe (meta)data." |
Further refinement from discussion 2019-02-18: "Profiles are uniquely named sets of data constraints, such as data elements (e.g. classes, properties, value domains) that describe (meta)data." |
Entered from Google Doc