Issue 463: Change the value term in the model document to annotation value.
#469
Remove the `property value` term from the model document and refer instead to the appropriate annotation value. Fixes #463.
|
I have an unease, but I am not sure I can put my finger on it very precisely. This set of changes was mostly triggered by the approach that the model document and the conversions should stand on their own, because other specifications may be created in the future that might create an annotated table through some other means than what is described in the metadata document. And if such specs are created, we do not want to redo, say, the metadata spec. I hope I understand this right. Ideally, though, this should mean that the syntax document does not refer to the metadata document in any normative way. However, that would require a much more thorough re-write. Let me take two examples for my unease (referring to the version in the branch-to-be-merged). On the one hand, the metadata document describes
Surely then this should be changed, right? How the Another, similar, issue is around natural language properties (as we call them in the metadata document). The syntax document says
whereas the metadata document describes them in much more operational detail:
But... is it o.k. to actually define the annotation value in the metadata document? Shouldn't the definition be part of the syntax document entirely? The language map structure is, in this sense, irrelevant for the syntax document, isn't it? In general, I believe all references to the metadata document from the syntax document should be considered and, if possible, removed, or more exactly replaced by the definition of the annotation value in that document. Sorry if I am not entirely systematic in my rambling... |
|
It's too late this evening for me to dive into this, but I take your point. This change did not introduce references from the model document to the metadata document, but it has exacerbated them. I'll work on trying to stick to a purely functional definition of what these annotations mean in the model, and, in the metadata document, an operational view of the properties and how they affect the annotation values. I was worried that too many changes this late in the game would be destabilizing, but I'll see what I can do. In the meantime, please suggest any specific changes you'd like to see. Regarding Many other things are simpler, and can be described in terms of how the metadata property is used to create the corresponding annotation value. The |
Yep, that was my worry as well. But, well, we decided yesterday we would do it even if this means shifting the publication date. Thinking about all this a little bit I seriously doubt we can make the publication next week (at least if we go down that line), but we shall see.
Actually, I believe the While this means a simple change in the syntax and model documents, we have to be careful that this also modifies the conversion document. Indeed, in section 3.1 of the csv2rdf document it says:
which probably should be removed; in terms of the conversion, Once the model and syntax documents are changed, the conversion documents will have to go through a thorough rewrite, too! @6a6d74 :-(
Well, the current specification of the title annotation is fine I believe:
It is the metadata document that should change, removing the reference to JSON-LD structures altogether; processing the metadata produces essentially an array ("any number") of strings with an associated language as described in the model (this also means a slight modification of the merging algorithm). I think the most spectacular change that we have to make concerns the datatypes. Indeed, the cell values are defined in the syntax document as
I believe that the syntax document should clearly remove the cell parsing reference, but it should include the allowed datatypes. Essentially, the whole of 4.11 from the model document should be moved into the syntax, because those datatypes are constraints on what datatypes the model may include and they also drive the datatypes used in the conversions. Parsing the cells (in the model document) produces such values (so the parsing algorithm stays as it is and where it is). (B.t.w., the definition of values above should include a reference to the language information, too!) Sigh... yes, it is a lot of work. /Cc: @JeniT |
…y-values # Conflicts: # metadata/index.html
|
@iherman I removed most of the explicit references to metadata, at least as they describe how the annotation values are derived. I still need to make sure the metadata document properly describes how these annotation values are created. Note that I ended up changing the I don't think that the datatypes section from metadata needs to be moved over, as it's used to derive the datatype values. Really, these values are actually RDF Literals, and perhaps should be described as such; we could even get rid of cell-value-URL and just have this be included in the cell-value, and make it an RDF Term (exclusive of BNodes). This is really what they are, even if a serialization may not represent it that way. It's also how my implementation works, and seems the most logical. Alternatively, we could re-invent this, and just say that the values have string, datatype, and language facets, and may be absolute URLs. See what you think. |
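For illustration only, the two options above (an RDF-Term-like value, or a value with string, datatype, and language facets that may instead be an absolute URL) could be sketched in Python roughly as follows. The class names `Literal`, `AbsoluteURL`, and the alias `CellValue` are hypothetical and do not come from any of the documents; the scheme check is an assumed, simplistic test for "absolute".

```python
from dataclasses import dataclass
from typing import Optional, Union
from urllib.parse import urlparse


@dataclass(frozen=True)
class Literal:
    """A cell value with string, datatype, and language facets (hypothetical)."""
    string: str
    datatype: str               # e.g. "string", "date"
    lang: Optional[str] = None  # language code, if any


@dataclass(frozen=True)
class AbsoluteURL:
    """A cell value that is an absolute URL (hypothetical)."""
    url: str

    def __post_init__(self):
        # Assumption: a URL counts as absolute if it carries a scheme.
        if not urlparse(self.url).scheme:
            raise ValueError(f"not an absolute URL: {self.url!r}")


# A merged cell value is either a literal or an absolute URL, never a blank node.
CellValue = Union[Literal, AbsoluteURL]

greeting: CellValue = Literal("hello", "string", "en")
link: CellValue = AbsoluteURL("http://example.org/x")
```

Whether a single merged annotation like this is preferable to separate value and value-URL annotations is exactly the open question in this thread; the sketch only shows that the two shapes can share one type.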
… Table and Column. #463.
|
After working on the metadata document, I do believe that much of the Datatype and Parsing Cells needs to be moved to the model document. The syntactic requirements for Datatypes must remain so that they can be normalized to form the datatype annotation on the column. Annotations on Rows and Cells should probably be moved to the model document and normatively describe creating Row and Cell annotations. The Parsing Tabular Data section needs to have a non-normative subsection containing the current algorithm, but needs to reference Parsing Cells, including Datatype parsing and the creation of the value of the cell, along with other cell annotations. We should consider merging the value-url and value annotations on a cell. |
… enough of Datatypes in metadata to describe how the annotations are created. Fixes #463.
|
I think this last set of massive edits accomplishes the separation we need. There is still a reference to URI Template processing in the metadata document, but it seems reasonable to leave this there. |
Syntax document:
* I have added some words in the abstract to make it clear that other applications may come with other means of creating annotations, although the standard metadata format is the one we have defined.
* In 3.1 I have changed "resources" to "tables", to be consistent with the changes we introduced elsewhere.
* I changed the reference to common properties a bit. Other mechanisms may generate those additional annotations through different means, and the current text read as if those would come only from the common properties.
* I also removed the dependency of notes on the metadata document. Referring to it does not really bring much.
* I have added a reference to BCP47 for the lang annotation for columns. The document already uses that for the titles, and this restriction is needed for conversions.

Metadata document:
* I have added (well, copied from the text) a paragraph from the abstract. It defines the role of this document better...
|
First of all, a deep bow towards San Francisco :-) I have done some editing on the text; I will commit that (with comments) separately, so that you can accept or change the edits. I also have some comments below that may require some discussion, so I did not want to change the text for those.
/Cc @JeniT |
|
Just for the good order, here are the changes I made in d567e9c: Syntax document:
Metadata document:
|
|
All - the dependencies from csv2* docs back to the model doc (e.g. the description of I'm reticent to begin another round of edits but will do so if there's consensus. (need to avoid creating merge conflicts though!) cc/ @JeniT |
Very honestly I do not know. And the problem is that Jeni seems to be on the road (at least that is what she said...) Ivan
Ivan Herman, W3C |
|
The point of having If you like, I can take a pass at updating the csv2rdf document language as part of this PR. |
This was the result of issue #446, which we didn't ever discuss. But, it is consistent with recording the value of inherited properties as annotations on the columns. |
|
@gkellogg - if you can take a pass thru the csv2* docs that would be great. On Friday, 10 April 2015, Gregg Kellogg notifications@github.com wrote:
|
Ultimately, the form needed in the model annotation should be whatever is most convenient for the conversion documents, IMO. This also relates to a previous statement I made that it might be better if the Grammatically, I think the sentence is correct, but it certainly is multi-layered. However, I can revert this to make it more vague, if that doesn't just make life harder for the conversion documents. |
The |
I think it's useful to have the diagram in both places. As a metadata author, it prevents having to bounce back and forth between the metadata and model documents. |
Conflicts: csv2json/index.html csv2rdf/index.html
…rties. Depends on `cell lang` which needs to be defined, `cell value` better specifying _list_ and _datatype_. #463.
…y-values # Conflicts: # csv2rdf/index.html
the aliases can be used in the metadata, but they should map to the standard datatypes in the model
…uage` and use in csv2rdf.
… column count must be the same as the column count of each row.
|
I believe I've made the necessary changes to both the csv2rdf and csv2json documents. @6a6d74, if you'd please check them out. |
|
Also, it might be worth considering taking the common parts of examples (annotation descriptions, mostly) and putting them in common included files rather than repeating them inline, which makes fixing issues consistently more difficult. |
Yep, I can see the point. (B.t.w., the schema annotation is not yet in the document.) Maybe it is worth emphasizing that these annotations are there for the historical record, i.e., that other applications generating annotations may ignore them. Ivan
Ivan Herman, W3C |
Ok Ivan
Ivan Herman, W3C |
Ok
Ivan Herman, W3C |
I am sorry... :-) Ivan
Ivan Herman, W3C |
Well, for RDF, a simple array of language tagged literals is certainly easier; the JSON-LD form has to be unpacked, so to say. But the difference is not big. What about: "any number of human-readable titles for the column; titles are grouped by language codes as defined by [[!BCP47]], each group consisting of any number of titles in that language." which is a bit less implementation specific and (maybe) clearer? Ivan
Ivan Herman, W3C |
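To make the "unpacking" mentioned above concrete, here is a minimal Python sketch. It assumes the titles arrive as a JSON-LD-style language map (an object whose keys are language codes and whose values are arrays of strings) and flattens it into a simple list of language-tagged titles. The function name `flatten_language_map` and the sample titles are hypothetical.

```python
from typing import Dict, List, Tuple


def flatten_language_map(titles: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Unpack a language map into a flat list of (title, language) pairs."""
    flat: List[Tuple[str, str]] = []
    for lang, values in titles.items():
        for title in values:
            flat.append((title, lang))
    return flat


# Hypothetical column titles, grouped by language code.
titles = {"en": ["Country", "Nation"], "fr": ["Pays"]}
pairs = flatten_language_map(titles)
```

Either form carries the same information; the grouped form is closer to the metadata syntax, while the flat form is closer to what an RDF conversion would emit.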
|
Just to say, I did a bunch of edits on the plane on this branch on the conversion documents, but there are lots of merge conflicts on them now. I'm in an all-day meeting today, but either later today or during tomorrow I will resolve those and merge this in. All times are US times for me... |
|
(Still in the process of merging.) I have noticed one mismatch in expectations as I go so flagging it in case it requires discussion: in a cell value that is a list, each value can have its own datatype and language. There were lots of examples that made this clear but I'm not sure where they are now. |
|
If cell sub-values can have different datatypes or languages, we probably need a term to capture this. It is consistent with how we set the datatype and language on these values in the cell parsing steps. We should note, however, that this algorithm cannot currently create annotations where these differ. This might go with the LCCR issue on multiple datatypes per column. |
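As a rough illustration of the point about list-valued cells, the following Python sketch models each sub-value with its own datatype and language, and checks whether a list happens to be uniform. The names `SubValue` and `is_uniform` and the sample values are hypothetical, not terms from the model document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SubValue:
    """One item of a list-valued cell (hypothetical structure)."""
    string: str
    datatype: str               # e.g. "string", "date"
    lang: Optional[str] = None  # language code, if any


def is_uniform(values: List[SubValue]) -> bool:
    """True when all sub-values share one datatype and one language."""
    return len({(v.datatype, v.lang) for v in values}) <= 1


uniform = [SubValue("a", "string", "en"), SubValue("b", "string", "en")]
mixed = [SubValue("2015-04-10", "date"), SubValue("hello", "string", "en")]
```

Under this reading, the current parsing steps would only ever produce lists for which `is_uniform` holds, while the model itself would also admit lists like `mixed`.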
…issue-463-property-values Conflicts: csv2json/index.html csv2rdf/index.html
|
I'm merging this now because there are lots of large changes and it will be a pain to apply other changes without these in place. |
Issue 463: Change the `value` term in the model document to `annotation value`.
… (comment): > The title property in the syntax is defined as "any number of human-readable titles for the column, each of which has an associated language represented as an object whose properties MUST be language codes as defined by [[!BCP47]] and whose values are arrays of strings related to that language." Is this correct, grammatically? But, more importantly... I personally would have preferred to keep it as it was, i.e., something like "any number of human readable titles for the column, each of which with an associated language as defined in [[!BCP47]]". However, if we do that change, the metadata document has to change, to (for natural language properties) so I did not do any change.
Remove the `property value` term from the model document and refer instead to the appropriate annotation value. Fixes #463.