Profiles in relation to Data on the Web Best Practices #487

Open
kcoyle opened this issue Oct 24, 2018 · 8 comments

kcoyle commented Oct 24, 2018

This is a very rough draft of a reading of DWBP in relation to profiles. I don't know if it fits into the document directly but using DWBP would fit this document into a W3C context. Obviously if we use this it would include links to those BPs.

Using the assumption that a profile on the web has some characteristics in common with data on the web, we can look to the Data on the Web Best Practices [link] as a source of some preferred practices for profiles.

The first obvious practice is BP9 [], to use persistent URIs as identifiers of profiles. This seems to be a non-controversial requirement because it applies to any web resource. There is a related best practice (BP10) which is stated as "Use persistent URIs as identifiers within datasets." This is well suited to the aspect of a profile that consists of the reuse of vocabulary terms that have been previously defined, but also to the definition of new terms within the scope of the profile itself. Each element of a profile's vocabulary should have a URI (IRI?) that identifies the term and information about the term (such as labels, definitions, etc.).
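
As a rough illustration of BP9 and BP10 (a sketch only, not proposed text), a profile and each of its terms could carry their own URI together with a label and a definition. The `example.org` URIs, the `Term` class, and the `shelfMark` term are all made up for this sketch; only the Dublin Core `title` URI is a real identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    uri: str         # persistent identifier of the term (BP10)
    label: str
    definition: str
    reused: bool     # True if the term comes from an existing vocabulary


# Hypothetical profile URI (BP9); example.org is a placeholder domain.
PROFILE_URI = "https://example.org/profiles/library-records"

TERMS = [
    # Reused term from an existing vocabulary (Dublin Core Terms).
    Term("http://purl.org/dc/terms/title", "Title",
         "A name given to the resource.", reused=True),
    # New term minted within the scope of the (hypothetical) profile.
    Term(PROFILE_URI + "/terms/shelfMark", "Shelf mark",
         "Local call number of the item.", reused=False),
]


def terms_without_uri(terms):
    """Return the terms whose identifier is not an absolute HTTP(S) URI."""
    return [t for t in terms if not t.uri.startswith(("http://", "https://"))]


print(terms_without_uri(TERMS))
```

A check like `terms_without_uri` only tests the form of the identifier; persistence itself is a commitment by the publisher and cannot be verified from the string.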

The next relevant group of best practices has to do with describing the profile itself (BP1-3). In the DWBP this is defined as metadata about the dataset; with profiles this can be descriptive metadata within the profile or a separate metadata statement about the profile. (Since profiles themselves are generally forms of metadata they may be able to incorporate description and other administrative information about the profile within themselves, if desired.) In addition there is a common set of administrative data that is recommended for many information resources, such as dates and version designators for each version, and provenance (who or what agency created the digital file).
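
To make the administrative data concrete, here is a minimal sketch of a metadata record about a profile, written as a JSON-LD-style dictionary. The property names come from Dublin Core Terms and OWL; the profile URI, dates, version number, creator, and the `REQUIRED` list are placeholders chosen for illustration, not recommendations.

```python
import json

# Hypothetical description of one version of a profile.
profile_metadata = {
    "@context": {
        "dcterms": "http://purl.org/dc/terms/",
        "owl": "http://www.w3.org/2002/07/owl#",
    },
    "@id": "https://example.org/profiles/library-records/1.1",
    "dcterms:title": "Library records profile",
    "dcterms:description": "Constraints on bibliographic records for a library community.",
    "dcterms:created": "2018-10-24",     # date of this version
    "dcterms:modified": "2018-11-02",
    "owl:versionInfo": "1.1",            # version designator
    "dcterms:creator": {"@id": "https://example.org/agents/metadata-team"},  # provenance
    "dcterms:replaces": {"@id": "https://example.org/profiles/library-records/1.0"},
}

# Assumed minimum set of administrative fields for this sketch.
REQUIRED = ["dcterms:title", "dcterms:created", "owl:versionInfo", "dcterms:creator"]


def missing_fields(record, required=REQUIRED):
    """List the required administrative fields that the record lacks."""
    return [key for key in required if key not in record]


print(json.dumps(profile_metadata, indent=2))
print(missing_fields(profile_metadata))
```

The same record could live inside the profile or in a separate statement about it; the sketch does not take a position on that.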

The DWBP limits its recommendation of metadata covering the topic of the dataset to general keywords, themes, and categories (BP2). It may be desirable to provide more specific topical information to satisfy the DXWG requirement that profiles should be discoverable by search engines. The quality of discoverability will vary based on the depth of description of the topic and/or community area that the profile serves. (We may wish to recommend some particulars.)

BP15 is one of the cornerstones of the profile practice, which is to reuse vocabularies, in particular standardized ones. One would need to define "standardized" in this context, but perhaps a better solution is to define the qualities of preferred vocabularies: have a stable URI; are supported by an organization or community; (more??).

The BPs also recommend that data be provided in multiple formats, and this is a good recommendation for profiles as well. If we take the point of view that profiles, like DCAT, have an abstract essence that can be made manifest in more than one way, we already have a good basis for satisfaction of BP14. (This will bring up the question that has already come up in the context of DCAT and conneg - are all of the forms equivalent? We may not be able to resolve that question.)
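
One way to picture "an abstract essence made manifest in more than one way" is a server that picks a representation of the same profile from the client's `Accept` header. This is a simplified sketch under stated assumptions: the file names are hypothetical, and the parser ignores partial wildcards such as `text/*` and all parameters other than `q`.

```python
# Hypothetical representations of one profile, keyed by media type.
REPRESENTATIONS = {
    "text/turtle": "profile.ttl",
    "application/ld+json": "profile.jsonld",
    "application/xml": "profile.xsd",
    "text/html": "profile.html",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality value when none is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        items.append((media_type, q))
    # sorted() is stable, so ties keep the order the client listed them in.
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose_representation(accept_header, default="text/html"):
    """Pick the best available representation, or None if nothing matches."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in REPRESENTATIONS:
            return media_type, REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return default, REPRESENTATIONS[default]
    return None  # a server would typically answer 406 Not Acceptable here


print(choose_representation("text/turtle;q=0.8, application/ld+json"))
```

The sketch says nothing about whether the four files are equivalent; it only shows that one identifier can lead to several forms.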

The recommendation in BP16 is to choose the right formalization level. This is a useful recommendation for all data and metadata, and would naturally apply to profiles. Profiles should be suited to the tasks they are designed to support, no less and no more. In particular we should caution against overly strict use of constraints, which makes it harder for others either to make direct use of the profile or to create a profile of the profile where their needs vary only by a small amount.

Naturally, profiles should be published in standard data formats (BP12). It should also be stated that profiles should be published in and make use of technologies that are appropriate to the community which is expected to use them. A profile using RDF and OWL will not serve a community well if that community has only an XML/XSD-based skill set. This also relates to the above recommendation about providing data in multiple formats. In many communities the skills and data history can vary, so providing profiles in as many of the commonly used technologies as possible will increase the utility of the profile (as well as the profiled instance data).

Because profiles are intended to convey information both within a community and at times between communities, wide use would be facilitated by BP13, which is that the profile should use locale-neutral data representations where possible. Some data communities have deep and historical practices that use terminology that is specific to the community. The creation of a profile is an opportunity to transform that practice to widely known standards.
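
A small sketch of what "locale-neutral" can mean at the value level: dates in ISO 8601 and numbers without locale-specific separators. The two input conventions below (US-style `MM/DD/YYYY` dates and decimal commas) are assumptions picked for illustration, as are the helper names.

```python
from datetime import date


def us_date_to_iso(text):
    """Convert a US-style MM/DD/YYYY date to ISO 8601 (YYYY-MM-DD)."""
    month, day, year = (int(part) for part in text.split("/"))
    return date(year, month, day).isoformat()


def decimal_comma_to_float(text):
    """Convert a number written as '1.234,5' (decimal comma) to a float."""
    return float(text.replace(".", "").replace(",", "."))


print(us_date_to_iso("11/02/2018"))       # 2018-11-02
print(decimal_comma_to_float("1.234,5"))  # 1234.5
```

Community-specific terminology is a harder problem than value formats and is not addressed by a conversion like this.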

Ideally, the profile would have a management cycle for maintenance and updates. This should involve the community of users, as noted in BP29 (Gather feedback from data consumers) and BP30 (Make feedback available). The strength and value of a profile will depend on the involvement of the community of users.

The aspect of the management cycle that is often ignored is that of the de-commissioning of datasets or of superseded versions. For users it is key that previously used identifiers always point to a useful document or message (BP27 Preserve identifiers).
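
A minimal sketch of one possible policy for BP27, assuming a small registry of every identifier ever issued. The paths, statuses, and messages are hypothetical; the point is only that a superseded or withdrawn identifier still returns something useful rather than an unexplained 404.

```python
# Hypothetical registry of all profile identifiers that were ever published.
REGISTRY = {
    "/profiles/library-records/1.1": {"status": "current"},
    "/profiles/library-records/1.0": {
        "status": "superseded",
        "replaced_by": "/profiles/library-records/1.1",
    },
    "/profiles/serials/0.9": {
        "status": "withdrawn",
        "note": "Withdrawn in 2018; no replacement is planned.",
    },
}


def resolve(path):
    """Return an (HTTP status, message) pair for a requested identifier."""
    entry = REGISTRY.get(path)
    if entry is None:
        return 404, "This identifier was never issued."
    if entry["status"] == "withdrawn":
        # 410 Gone plus an explanation, instead of a silent dead link.
        return 410, entry["note"]
    if entry["status"] == "superseded":
        # The old version stays readable and points to its successor.
        return 200, "Superseded by " + entry["replaced_by"]
    return 200, "Current version."


for p in REGISTRY:
    print(p, resolve(p))
```

Redirecting a superseded version to its successor would be another defensible choice; keeping the old version readable was picked here because existing data may still conform to it.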

@kcoyle kcoyle added this to To do in Profile Guidance via automation Oct 24, 2018

aisaac commented Oct 24, 2018

This is a candidate solution for #450

@aisaac aisaac closed this as completed Oct 24, 2018
Profile Guidance automation moved this from To do to Done Oct 24, 2018

aisaac commented Oct 30, 2018

I'm not sure which manipulation made me close this issue for no valid reason, sorry

@aisaac aisaac reopened this Oct 30, 2018
Profile Guidance automation moved this from Done to In progress Oct 30, 2018

aisaac commented Oct 30, 2018

@kcoyle I think this is a very good starting point. I have not thought it through yet, though, because I'm contemplating a first basic question before diving in: shouldn't we have two levels of applying the BPs wrt profiles?
The first one, to which I believe a lot of what you have written belongs, is applying BPs to the design/publication of profiles. I think this is good as 'design principles' for motivating the decisions we make, and it reinforces the requirements we have identified.
The second level is the one that would motivate profiles as a way to implement some BPs. For example I would consider that the publication of data according to different profiles implements a generalized version of BP14 (Provide data in multiple formats). I.e. BP14 could be read as recommending to publish not only different serializations of the data, but also different 'flavours' of it, which can save data consumers a lot of time and errors by sparing them from having to map data between different profiles. Profiles can also be read as implementing BP15 at a different level: they allow for expressing diverse data, while keeping the common bits expressed using the same vocabularies. I think you've covered some of that too.
The second level could be a key part of the intro, in fact maybe earlier than the first level, which goes into introducing the design of our solutions/recommendations.

But I'll stop here: do you think it's a possible way to look at things, or should I just abstain and focus on what you've written, whatever level it may belong to?

@rob-metalinkage

I think we are "all good" (now that we have the profiles ontology as at least one mechanism to publish descriptions); however, it's useful to reinforce these expectations in the guidance.

Many of these practices relate to deployment, e.g. BP12.

The implementation I will do for the OGC will expose as a lightweight XML (not RDF/XML) and JSON-LD - these are general "Linked Data" concerns IMHO - how much should we spell it out in the guidance document? IMHO we should make some statement about this when we introduce the Profiles ontology as it enables the RDF TTL and JSON-LD serialisations - which is probably a necessary and sufficient start.
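
For readers who want a picture of such a serialisation, here is a minimal sketch of a JSON-LD description of a profile. It assumes the Profiles ontology namespace `http://www.w3.org/ns/dx/prof/` and the terms `prof:Profile`, `prof:isProfileOf`, `prof:hasResource`, `prof:ResourceDescriptor`, `prof:hasRole`, and `prof:hasArtifact`; these should be checked against the current draft of the ontology. All `example.org` URIs are placeholders, and this is not the OGC implementation.

```python
import json

PROF = "http://www.w3.org/ns/dx/prof/"  # assumed namespace of the Profiles ontology

# Hypothetical profile description with two resources in different formats.
description = {
    "@context": {"prof": PROF, "dcterms": "http://purl.org/dc/terms/"},
    "@id": "https://example.org/profiles/library-records",
    "@type": "prof:Profile",
    "prof:isProfileOf": {"@id": "https://example.org/standards/base-vocabulary"},
    "prof:hasResource": [
        {
            "@type": "prof:ResourceDescriptor",
            "prof:hasRole": {"@id": PROF + "role/schema"},
            "prof:hasArtifact": {"@id": "https://example.org/profiles/library-records/schema.xsd"},
            "dcterms:format": "application/xml",
        },
        {
            "@type": "prof:ResourceDescriptor",
            "prof:hasRole": {"@id": PROF + "role/guidance"},
            "prof:hasArtifact": {"@id": "https://example.org/profiles/library-records/guide.html"},
            "dcterms:format": "text/html",
        },
    ],
}


def artifacts_by_format(desc):
    """Map each media type to the artifact URI published in that format."""
    return {
        res["dcterms:format"]: res["prof:hasArtifact"]["@id"]
        for res in desc["prof:hasResource"]
    }


print(json.dumps(description, indent=2))
print(artifacts_by_format(description))
```

A lightweight XML view could be generated from the same dictionary, which is one reason to treat the description, rather than any one serialisation, as the thing being published.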

kcoyle commented Oct 30, 2018

@aisaac I'm about to hop on a plane but with a quick reading of your comment I think it is worth thinking about in this way. My text was just to capture what we can from BP but wasn't organized in any way. The next time my brain is working I'll think about how it might be organized in the document, taking your suggestions. Then maybe we can talk.

aisaac commented Oct 30, 2018

@kcoyle this sounds great. I may be able to give it a thought myself, but you are probably going to be able to work on it before me...

kcoyle commented Nov 2, 2018

Here's my first pass at reordering the best practices using the document outline. Again, this is not the text suggested for the document; this is an analysis of DWBP in relation to our work, without saying how we would use it.

Using the assumption that a profile on the web has some characteristics in common with data on the web, we can look to the Data on the Web Best Practices as a source of some preferred practices for profiles.

  1. Introduction

BP13 Use locale-neutral data representations
Because profiles are intended to convey information both within a community and at times between communities, wide use would be facilitated by BP13, which is that the profile should use locale-neutral data representations where possible. Some data communities have deep and historical practices that use terminology that is specific to the community. The creation of a profile is an opportunity to transform that practice to widely known standards.

  1. What is a profile?

BP15 Reuse vocabularies, preferably standardized ones
BP15 is one of the cornerstones of the profile practice, which is to reuse vocabularies, in particular standardized ones. One would need to define "standardized" in this context, but perhaps a better solution is to define the qualities of preferred vocabularies: have a stable URI; are supported by an organization or community; (more??).

  1. The Functions of Profile Components/Manifestations

BP16 Choose the right formalization level
The recommendation in BP16 is to choose the right formalization level. This is a useful recommendation for all data and metadata, and would naturally apply to profiles. Profiles should be suited to the tasks they are designed to support, no less and no more. In particular we should caution against overly strict use of constraints, which makes it harder for others either to make direct use of the profile or to create a profile of the profile where their needs vary only by a small amount.

  1. Profile publication

BP9 Use persistent URIs as identifiers of datasets
The first obvious practice is BP9, to use persistent URIs as identifiers of profiles. This seems to be a non-controversial requirement because it applies to any web resource. There is a related best practice (BP10) which is stated as "Use persistent URIs as identifiers within datasets." This is well suited to the aspect of a profile that consists of the reuse of vocabulary terms that have been previously defined, but also to the definition of new terms within the scope of the profile itself. Each element of a profile's vocabulary should have a URI (IRI?) that identifies the term and information about the term (such as labels, definitions, etc.).

BP14 Provide data in multiple formats
The BPs also recommend that data be provided in multiple formats, and this is a good recommendation for profiles as well. If we take the point of view that profiles, like DCAT, have an abstract essence that can be made manifest in more than one way, we already have a good basis for satisfaction of BP14. (This will bring up the question that has already come up in the context of DCAT and conneg - are all of the forms equivalent? We may not be able to resolve that question.)

BP12 Use machine-readable standardized data formats
Naturally, profiles should be published in standard data formats (BP12). It should also be stated that profiles should be published in and make use of technologies that are appropriate to the community which is expected to use them. A profile using RDF and OWL will not serve a community well if that community has only an XML/XSD-based skill set. This also relates to the above recommendation about providing data in multiple formats. In many communities the skills and data history can vary, so providing profiles in as many of the commonly used technologies as possible will increase the utility of the profile (as well as the profiled instance data).

  1. Administrative and descriptive metadata

BP 1: Provide metadata
BP 2: Provide descriptive metadata
BP 3: Provide structural metadata
The next relevant group of best practices has to do with describing the profile itself (BP1-3). In the DWBP this is defined as metadata about the dataset; with profiles this can be descriptive metadata within the profile or a separate metadata statement about the profile. (Since profiles themselves are generally forms of metadata they may be able to incorporate description and other administrative information about the profile within themselves, if desired.) In addition there is a common set of administrative data that is recommended for many information resources, such as dates and version designators for each version, and provenance (who or what agency created the digital file).

BP2 Provide descriptive metadata
The DWBP limits its recommendation of metadata covering the topic of the dataset to general keywords, themes, and categories (BP2). It may be desirable to provide more specific topical information to satisfy the DXWG requirement that profiles should be discoverable by search engines. The quality of discoverability will vary based on the depth of description of the topic and/or community area that the profile serves. (We may wish to recommend some particulars.)

BP30 Make feedback available
Ideally, the profile would have a management cycle for maintenance and updates. This should involve the community of users, as noted in BP29 (Gather feedback from data consumers) and BP30 (Make feedback available). The strength and value of a profile will depend on the involvement of the community of users.

BP27 Preserve identifiers
The aspect of the management cycle that is often ignored is that of the de-commissioning of datasets or of superseded versions. For users it is key that previously used identifiers always point to a useful document or message (BP27 Preserve identifiers).

  1. The Profiles ontology

BP 1: Provide metadata
BP 2: Provide descriptive metadata
BP 3: Provide structural metadata
The next relevant group of best practices has to do with describing the profile itself (BP1-3). In the DWBP this is defined as metadata about the dataset; with profiles this can be descriptive metadata within the profile or a separate metadata statement about the profile. (Since profiles themselves are generally forms of metadata they may be able to incorporate description and other administrative information about the profile within themselves, if desired.) In addition there is a common set of administrative data that is recommended for many information resources, such as dates and version designators for each version, and provenance (who or what agency created the digital file).

kcoyle commented Nov 6, 2018

Here is the respec guide on how to reference best practices:

https://github.com/w3c/respec/wiki/ReSpec-Editor's-Guide#best-practice-documents
