Blank node shorthand #116

iherman · 2014-12-17T10:25:01Z

The document contains the following (hitherto unnumbered) issue:

In the example output above we see Turtle's shorthand syntax for dealing with blank nodes. Should the recommended output form explicitly identify the blank nodes using the row number? e.g.

csvw:row _:1 , _:2 , _:3 , _:4 .

_:1 :country "AD" ; :name "Andorra" .

etc.

However, I think this can only be a MAY or possibly (but reluctantly) a SHOULD. Indeed, if an implementation relies on an external RDF package and leaves the serialization to that package, then there may be no control over the identifiers used for blank nodes. I.e., this requirement cannot be enforced in such a setup.
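For illustration, here is a minimal sketch of what the proposed output form would require from a serializer that does control its own blank node labels. It is not taken from the draft: the function name `rows_to_turtle` and the `<#table>` subject are hypothetical, prefix declarations are omitted, and the values are assumed to need no escaping.

```python
def rows_to_turtle(rows, table_iri="<#table>"):
    """Emit Turtle lines that label each row's blank node with its row number.

    Assumes a non-empty list of rows and plain string values without
    characters that would need escaping in a Turtle literal.
    """
    labels = [f"_:{number}" for number in range(1, len(rows) + 1)]
    # One statement linking the table to every explicitly labelled row node.
    lines = [f"{table_iri} csvw:row {' , '.join(labels)} ."]
    for label, row in zip(labels, rows):
        pairs = " ; ".join(f':{key} "{value}"' for key, value in row.items())
        lines.append(f"{label} {pairs} .")
    return "\n".join(lines)


rows = [
    {"country": "AD", "name": "Andorra"},
    {"country": "AF", "name": "Afghanistan"},
]
turtle = rows_to_turtle(rows)
print(turtle)
```

A converter that hands its graph to a third-party serializer has no hook like this, which is the situation the MAY/SHOULD argument above is about.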

See ISSUE w3c#116

gkellogg · 2014-12-17T15:48:48Z

JSON-LD specifies blank node naming, and it has not caused implementation issues.

iherman · 2014-12-17T15:50:24Z

Well, had I chosen to implement it, I would have had such issues :-) AFAIK, I cannot influence how RDFLib generates blank node identifiers, for example.

Ivan


Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704

jaw111 · 2015-01-12T15:02:46Z

If the order of rows in a table is significant and needs to be retained (even after load to a graph store), would it be possible to use the container membership properties rdf:_1, rdf:_2, rdf:_3 ... to relate the table to its rows instead of (or as well as) csvw:row?

iherman · 2015-01-12T16:58:29Z

Formally, this is of course possible.

I do feel a little bit uncomfortable, though. There is very little semantics attached to the rdf:_1, etc, terms, but the RDF Semantics document has an informal Appendix for "intended use" which binds these terms to the RDF container vocabulary. So let me ask you this: what do you think we would gain by doing that?

Ivan


jaw111 · 2015-01-12T17:38:37Z

One possible gain would be the ability to round-trip CSV -> RDF -> CSV in a generic way and still retain the order of rows (should that be relevant). That way an RDF storage backend could potentially be used where one could HTTP PUT a CSV table and then HTTP GET 'the same' table without the row order potentially being changed.

Also I would say that stating csvw:row to be an rdfs:subPropertyOf rdfs:member already, for better or worse, makes an implicit binding to the RDF container vocabulary. Let me turn the tables (ouch!) on your question: What is gained by introducing the terms csvw:Table and csvw:row above and beyond defining a standardized method of using the RDF container vocabulary to describe CSV tables?

iherman · 2015-01-13T06:35:54Z

On 12 Jan 2015, at 17:38, jaw111 wrote:

> One possible gain would be the ability to round-trip CSV -> RDF -> CSV in a generic way and still retain the order of rows (should that be relevant). That way an RDF storage backend could potentially be used where one could HTTP PUT a CSV table and then HTTP GET 'the same' table without the row order potentially being changed.

Indeed, that is true. Although this could also be achieved by, e.g., adding an extra triple with the row number for each row. This was actually part of earlier drafts, if I remember correctly.
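A rough sketch of the row-number alternative, with triples modelled as plain Python tuples rather than through any RDF library. The predicate name `ex:rowNumber` is a placeholder and not a term defined in this thread; the blank node labels are arbitrary and are not used for ordering.

```python
import random

ROW_NUMBER = "ex:rowNumber"  # hypothetical predicate, chosen only for this sketch


def rows_to_triples(rows):
    """Turn CSV-like rows into triples, adding one row-number triple per row."""
    triples = []
    for number, row in enumerate(rows, start=1):
        node = f"_:b{number}"  # arbitrary label; the order lives in the triple below
        triples.append((node, ROW_NUMBER, number))
        for column, value in row.items():
            triples.append((node, column, value))
    return triples


def triples_to_rows(triples):
    """Rebuild the rows in their original order from an unordered set of triples."""
    by_node = {}
    for subject, predicate, obj in triples:
        by_node.setdefault(subject, {})[predicate] = obj
    ordered = sorted(by_node.values(), key=lambda cells: cells[ROW_NUMBER])
    return [
        {p: o for p, o in cells.items() if p != ROW_NUMBER} for cells in ordered
    ]


rows = [
    {"country": "AD", "name": "Andorra"},
    {"country": "AF", "name": "Afghanistan"},
    {"country": "AI", "name": "Anguilla"},
]
triples = rows_to_triples(rows)
random.shuffle(triples)  # a graph store gives no ordering guarantee
restored = triples_to_rows(triples)
```

The cost is one extra triple per row, and the order is carried by a literal value rather than by the graph structure.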

> Also I would say that stating csvw:row to be an rdfs:subPropertyOf rdfs:member already, for better or worse, makes an implicit binding to the RDF container vocabulary.

True, and I forgot about that. But using rdf:_1 and friends would be a significant step further...

> Let me turn the tables (ouch!)

:-)

> on your question: What is gained by introducing the terms csvw:Table and csvw:row above and beyond defining a standardized method of using the RDF container vocabulary to describe CSV tables?

It is not good practice to attach explicit semantics to a predicate URI through some sort of microsyntax; this is clearly the case for rdf:_1 & Co. RDF processing should be based on the semantics of a predicate as expressed, e.g., by an RDFS definition, and not on analysing the string that makes up the URI. As such, relying on them for round-tripping would be bad practice imho. (This is one of the reasons why they have been frowned upon by the community ever since they were introduced; there were serious discussions in the latest RDF WG about removing them from RDF 1.1 altogether.)
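To make the microsyntax concern concrete, here is a small sketch of what a consumer would have to do to recover row order from container membership properties. Triples are again plain tuples, and the `#table` subject and the row labels are made up for the example.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Membership properties are rdf:_n with n a positive integer, no leading zeros.
MEMBERSHIP = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)$")


def member_index(predicate):
    """Return n for rdf:_n, or None if the predicate is not a membership property."""
    match = MEMBERSHIP.match(predicate)
    return int(match.group(1)) if match else None


triples = [
    ("#table", RDF + "_2", "_:b"),
    ("#table", RDF + "_10", "_:c"),
    ("#table", RDF + "_1", "_:a"),
]

# The order is only recoverable by parsing the predicate IRI as a string.
ordered = sorted(triples, key=lambda t: member_index(t[1]))
rows_in_order = [obj for _, _, obj in ordered]

# Sorting the IRIs as opaque strings gets it wrong: "_10" sorts before "_2".
naive_order = [obj for _, _, obj in sorted(triples, key=lambda t: t[1])]
```

Here the ordering logic depends entirely on the regular expression over the IRI, not on anything an RDFS definition could express.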

Also, repeating myself: rdf:_1 and friends are almost exclusively used (this is the "accepted practice", so to say) in conjunction with an RDF sequence (and bag and alt, but that is not relevant here). Using them in isolation, as you propose, though formally possible, goes against this accepted practice.

I know these are not strong arguments based on some mathematical reasoning, but we cannot completely ignore the (best) practice out there either.

Cheers

Ivan


jaw111 · 2015-01-13T08:58:40Z

One might also argue that, abstractly, a CSV file is a table container that contains a sequence of rows, so csvw:Table could be thought of as a sub-class of rdf:Container, or more specifically as a sub-class of rdf:Seq (or rdf:Bag) to make explicit that the order is (not) relevant.

However I totally agree on your point about putting semantics into the URIs for rdf:_1, rdf:_2, rdf:_3 .... The URI should be treated as an opaque identifier and one should not have to parse it.

In general I'm personally not a fan of indicating the order of things by adding literal values. Using an RDF Collection is another option, especially as both Turtle and JSON-LD have sufficient syntactic sugar to make them palatable. Here's a small example based on Example 6 from the draft spec using csvw:rows to link from the table to the list of rows.

Turtle:

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix : <http://example.org/country-codes-and-names.csv#> .

<http://example.org/country-codes-and-names.csv#table>
  a <http://www.w3.org/ns/csvw#Table> ;
  csvw:rows (
    [ :country "AD" ; :name "Andorra" ]
    [ :country "AF" ; :name "Afghanistan" ]
    [ :country "AI" ; :name "Anguilla" ]
    [ :country "AL" ; :name "Albania" ]
  ) .

JSON-LD:

{
    "@context": {
        "@vocab": "http://example.org/country-codes-and-names.csv#",
        "csvw": "http://www.w3.org/ns/csvw#",
        "csvw:rows": {
            "@container": "@list"
        }
    },
    "@id": "http://example.org/country-codes-and-names.csv#table",
    "@type": "csvw:Table",
    "csvw:rows": [
        {
            "country": "AD",
            "name": "Andorra"
        },
        {
            "country": "AF",
            "name": "Afghanistan"
        },
        {
            "country": "AI",
            "name": "Anguilla"
        },
        {
            "country": "AL",
            "name": "Albania"
        }
    ]
}

iherman · 2015-01-13T10:48:24Z

On 13 Jan 2015, at 08:58, jaw111 wrote:

> One might also argue that abstractly a CSV file is a table container that contains a sequence of rows, so csvw:Table could be thought of as a sub-class of rdf:Container, maybe more specifically as sub-class of rdf:Seq (or rdf:Bag) to make explicit the order is (not) relevant.

> However I totally agree on your point about putting semantics into the URIs for rdf:_1, rdf:_2, rdf:_3 .... The URI should be treated as an opaque identifier and one should not have to parse it.

I am happy we agree on that! :-)

> In general I'm personally not a fan of indicating the order of things by adding literal values. Using an RDF Collection is another option, especially as both Turtle and JSON-LD have sufficient syntactic sugar to make them palatable.

Agreed. If we decide that maintaining the order of rows is important, then collections may make a lot of sense. RDF/XML users may not like it, but we may want to ignore that, I do not know.

There is an efficiency price, of course. Syntactic sugar is one thing, but the fact is that for each entry we get 2-3 more triples. For smaller CSV files we do not care. Do we care if the CSV file is at the GB level (or more)?
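As a rough way to see where the extra triples come from, the sketch below expands a list of row nodes into the rdf:first / rdf:rest triples that the collection syntax stands for, and compares the count with one plain membership triple per row. Triples are plain tuples; the `_:cellN` and `_:rN` labels and the `#table` subject are placeholders, and only the triples that link the table to its rows are counted.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def collection_triples(table, predicate, members):
    """Expand an ordered list of nodes into rdf:first / rdf:rest triples."""
    if not members:
        return [(table, predicate, RDF + "nil")]
    cells = [f"_:cell{i}" for i in range(1, len(members) + 1)]  # placeholder labels
    triples = [(table, predicate, cells[0])]
    for i, member in enumerate(members):
        rest = cells[i + 1] if i + 1 < len(cells) else RDF + "nil"
        triples.append((cells[i], RDF + "first", member))
        triples.append((cells[i], RDF + "rest", rest))
    return triples


def membership_triples(table, predicate, members):
    """Link the table to each row with one unordered membership triple."""
    return [(table, predicate, member) for member in members]


rows = ["_:r1", "_:r2", "_:r3", "_:r4"]
as_list = collection_triples("#table", "csvw:rows", rows)
as_set = membership_triples("#table", "csvw:row", rows)
print(len(as_list), len(as_set))  # 9 4
```

Under this accounting a list of n rows needs 2n + 1 linking triples, against n for plain csvw:row statements, so the overhead grows linearly with the size of the file.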

Ivan


6a6d74 · 2015-03-04T01:03:53Z

csv2rdf document now re-written to reflect decisions of the f2f meeting in Feb 2015, London.

iherman added the CSV to RDF mapping label Dec 17, 2014

6a6d74 added a commit to 6a6d74/csvw that referenced this issue Dec 17, 2014

added issue number

3516993

See ISSUE w3c#116

iherman assigned 6a6d74 Dec 17, 2014

iherman added the in progress label Dec 17, 2014

6a6d74 mentioned this issue Dec 17, 2014

added issue number #126

Merged

JeniT added the Editorial label Feb 4, 2015

iherman removed the in progress label Feb 15, 2015

6a6d74 closed this as completed Mar 4, 2015
