Blank node shorthand #116

Closed
iherman opened this issue Dec 17, 2014 · 9 comments

@iherman
Member

iherman commented Dec 17, 2014

The document contains the following (hitherto unnumbered) issue:

In the example output above we see Turtle's shorthand syntax for dealing with blank nodes. Should the recommended output form explicitly identify the blank nodes using the row number? e.g.

csvw:row _:1 , _:2 , _:3 , _:4 .

_:1 :country "AD" ; :name "Andorra" .

etc.

However, I think this can only be a MAY or possibly (but reluctantly) a SHOULD. Indeed, if an implementation relies on an external RDF package and leaves the serialization to that package, then it may have no control over the identifiers used for blank nodes. I.e., this requirement cannot be enforced in such a setup.
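
To make the trade-off concrete, here is a minimal sketch of what the "explicit row number" form requires: the implementation has to write the Turtle itself so that it controls the blank node labels. The function name `rows_to_turtle` and the sample data are illustrative assumptions, not part of the draft; prefix declarations and literal escaping are left out for brevity.

```python
import csv
import io


def rows_to_turtle(csv_text, table_iri):
    """Emit Turtle in which each row's blank node is labelled by its row number."""
    reader = csv.DictReader(io.StringIO(csv_text))
    labels = []
    blocks = []
    for number, row in enumerate(reader, start=1):
        label = f"_:{number}"  # label derived from the row number
        labels.append(label)
        pairs = " ; ".join(f':{column} "{value}"' for column, value in row.items())
        blocks.append(f"{label} {pairs} .")
    header = f"<{table_iri}> csvw:row {' , '.join(labels)} ."
    return "\n\n".join([header] + blocks)


# Hypothetical two-row sample in the spirit of the country-codes example.
SAMPLE = "country,name\nAD,Andorra\nAF,Afghanistan\n"
turtle = rows_to_turtle(SAMPLE, "http://example.org/country-codes-and-names.csv#table")
print(turtle)
```

An implementation that hands a graph to a generic serializer typically cannot do this, because the serializer picks its own labels; that is the situation the MAY/SHOULD concern is about.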

6a6d74 added a commit to 6a6d74/csvw that referenced this issue Dec 17, 2014
@gkellogg
Member

JSON-LD specifies blank node naming, and it has not caused implementation issues.

@iherman
Member Author

iherman commented Dec 17, 2014

Well, had I chosen to implement it, I would have had it:-) AFAIK, I cannot influence how RDFLib generates blank nodes, for example.

Ivan


Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704

@jaw111

jaw111 commented Jan 12, 2015

If the order of rows in a table is significant and needs to be retained (even after load to a graph store), would it be possible to use the container membership properties rdf:_1, rdf:_2, rdf:_3 ... to relate the table to its rows instead of (or as well as) csvw:row?

@iherman
Member Author

iherman commented Jan 12, 2015

Formally, this is of course possible.

I do feel a little bit uncomfortable, though. There is very little semantics attached to the rdf:_1, etc, terms, but the RDF Semantics document has an informal Appendix for "intended use" which binds these terms to the RDF container vocabulary. So let me ask you this: what do you think we would gain by doing that?

Ivan


@jaw111

jaw111 commented Jan 12, 2015

One possible gain would be the ability to round-trip CSV -> RDF -> CSV in a generic way and still retain the order of rows (should that be relevant). That way an RDF storage backend could potentially be used where one could HTTP PUT a CSV table and be able to HTTP GET 'the same' table without the row order being potentially changed.

Also I would say that stating csvw:row to be an rdfs:subPropertyOf rdfs:member already, for better or worse, makes an implicit binding to the RDF container vocabulary. Let me turn the tables (ouch!) on your question: What is gained by introducing the terms csvw:Table and csvw:row above and beyond defining a standardized method of using the RDF container vocabulary to describe CSV tables?

@iherman
Member Author

iherman commented Jan 13, 2015

On 12 Jan 2015, at 17:38 , jaw111 notifications@github.com wrote:

One possible gain would be the ability to round-trip CSV -> RDF -> CSV in a generic way and still retain the order of rows (should that be relevant). That way an RDF storage backend could potentially be used where one could HTTP PUT a CSV table and be able to HTTP GET 'the same' table without the row order being potentially changed.

Indeed, that is true. Although this could also be achieved, e.g., by adding, for each row, an extra triple giving the row number. This was actually part of earlier drafts, if I remember correctly.
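
As a rough sketch of the row-number alternative: if every row carries one extra triple with its number, the original order can be recovered by sorting, whatever order the store returns the triples in. The predicate IRI `ROWNUM` and the blank node labels below are hypothetical, chosen only for illustration.

```python
# Hypothetical predicate for the per-row number; an assumption of this sketch.
ROWNUM = "http://example.org/vocab#rownum"

# A set has no defined order, mimicking a graph store that does not
# preserve the order in which triples were loaded.
triples = {
    ("_:b7", ROWNUM, 2), ("_:b7", ":name", "Afghanistan"),
    ("_:b3", ROWNUM, 1), ("_:b3", ":name", "Andorra"),
    ("_:b9", ROWNUM, 3), ("_:b9", ":name", "Anguilla"),
}


def rows_in_order(triples):
    """Return row subjects sorted by the value of their row-number triple."""
    row_numbers = {s: o for s, p, o in triples if p == ROWNUM}
    return sorted(row_numbers, key=row_numbers.get)


def names_in_order(triples):
    """Return the :name value of each row, in original row order."""
    names = {s: o for s, p, o in triples if p == ":name"}
    return [names[row] for row in rows_in_order(triples)]


print(names_in_order(triples))
```

The cost is one extra triple per row, and the order is carried by a literal value rather than by the graph structure.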

Also I would say that stating csvw:row to be an rdfs:subPropertyOf rdfs:member already, for better or worse, makes an implicit binding to the RDF container vocabulary.

True, and I forgot about that. But using rdf:_1 and friends would make a significant step further...

Let me turn the tables (ouch!)

:-)

on your question: What is gained by introducing the terms csvw:Table and csvw:row above and beyond defining a standardized method of using the RDF container vocabulary to describe CSV tables?

It is not good practice to add explicit semantics, through some sort of microsyntax, to a predicate URI; this is clearly the case with rdf:_1 & Co. RDF processing should be based on the semantics of a predicate as expressed, e.g., by an RDFS definition, and not on analysing the string making up the URI. As such, relying on them for round-tripping would be bad practice imho. (This is one of the reasons why these have been frowned upon by the community ever since they were introduced; there were serious discussions in the latest RDF WG to remove them from RDF 1.1 altogether.)

Also, repeating myself: rdf:_1 and friends are almost exclusively used (this is the "accepted practice", so to say) in conjunction with an RDF sequence (and bag and alt, but that is not relevant here). Using them in isolation, as you propose, though formally possible, goes against this accepted practice.

I know these are not strong arguments based on some mathematical reasoning, but we cannot completely ignore the (best) practice out there either.
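
To illustrate the microsyntax point: recovering row order from rdf:_1, rdf:_2, ... means the consumer has to take the predicate IRI apart as a string, roughly as in the sketch below. The helper name `membership_index` and the sample triples are assumptions made for this example.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# rdf:_n with n a positive decimal integer without leading zeros.
_MEMBERSHIP = re.compile(re.escape(RDF_NS) + r"_([1-9][0-9]*)")


def membership_index(predicate_iri):
    """Return n for rdf:_n, or None for any other predicate."""
    match = _MEMBERSHIP.fullmatch(predicate_iri)
    return int(match.group(1)) if match else None


# Hypothetical table-to-row links using container membership properties.
links = [
    ("#table", RDF_NS + "_2", "_:rowB"),
    ("#table", RDF_NS + "_1", "_:rowA"),
    ("#table", "http://www.w3.org/ns/csvw#row", "_:rowC"),
]

# Keep only membership links, then order them by the number parsed from the IRI.
ordered_rows = [
    row
    for _, predicate, row in sorted(
        (t for t in links if membership_index(t[1]) is not None),
        key=lambda t: membership_index(t[1]),
    )
]
print(ordered_rows)
```

The ordering logic lives entirely in the regular expression, not in anything an RDFS definition could express.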

Cheers

Ivan



@jaw111

jaw111 commented Jan 13, 2015

One might also argue that abstractly a CSV file is a table container that contains a sequence of rows, so csvw:Table could be thought of as a sub-class of rdf:Container, maybe more specifically as sub-class of rdf:Seq (or rdf:Bag) to make explicit the order is (not) relevant.

However I totally agree on your point about putting semantics into the URIs for rdf:_1, rdf:_2, rdf:_3 .... The URI should be treated as an opaque identifier and one should not have to parse it.

In general I'm personally not a fan of indicating the order of things by adding literal values. Using an RDF Collection is another option, especially as both Turtle and JSON-LD have sufficient syntactic sugar to make them palatable. Here's a small example based on Example 6 from the draft spec using csvw:rows to link from the table to the list of rows.

Turtle:

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix : <http://example.org/country-codes-and-names.csv#> .

<http://example.org/country-codes-and-names.csv#table>
  a <http://www.w3.org/ns/csvw#Table> ;
  csvw:rows (
    [ :country "AD" ; :name "Andorra" ]
    [ :country "AF" ; :name "Afghanistan" ]
    [ :country "AI" ; :name "Anguilla" ]
    [ :country "AL" ; :name "Albania" ]
  ) .

JSON-LD:

{
    "@context": {
        "@vocab": "http://example.org/country-codes-and-names.csv#",
        "csvw": "http://www.w3.org/ns/csvw#",
        "csvw:rows": {
            "@container": "@list"
        }
    },
    "@id": "http://example.org/country-codes-and-names.csv#table",
    "@type": "csvw:Table",
    "csvw:rows": [
        {
            "country": "AD",
            "name": "Andorra"
        },
        {
            "country": "AF",
            "name": "Afghanistan"
        },
        {
            "country": "AI",
            "name": "Anguilla"
        },
        {
            "country": "AL",
            "name": "Albania"
        }
    ]
}
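
A minimal sketch of the round trip this enables, under a strong simplifying assumption: the document is read as plain JSON in exactly the compacted shape shown above, without running a JSON-LD processor. The function name `jsonld_rows_to_csv` and the shortened sample document are illustrative only.

```python
import csv
import io
import json


def jsonld_rows_to_csv(document_text, columns):
    """Write the ordered csvw:rows array back out as CSV text."""
    document = json.loads(document_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    # A JSON array keeps its order, so rows come back as they were written.
    for row in document["csvw:rows"]:
        writer.writerow(row)
    return buffer.getvalue()


# Shortened, hypothetical version of the example document.
DOC = """
{
    "@id": "http://example.org/country-codes-and-names.csv#table",
    "@type": "csvw:Table",
    "csvw:rows": [
        {"country": "AD", "name": "Andorra"},
        {"country": "AF", "name": "Afghanistan"}
    ]
}
"""

csv_text = jsonld_rows_to_csv(DOC, ["country", "name"])
print(csv_text)
```

Once the data has passed through a graph store, the order survives only because "@container": "@list" maps the array to an RDF Collection; a plain JSON array without that declaration would become an unordered set of values.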

@iherman
Member Author

iherman commented Jan 13, 2015

On 13 Jan 2015, at 08:58 , jaw111 notifications@github.com wrote:

One might also argue that abstractly a CSV file is a table container that contains a sequence of rows, so csvw:Table could be thought of as a sub-class of rdf:Container, maybe more specifically as sub-class of rdf:Seq (or rdf:Bag) to make explicit the order is (not) relevant.

However I totally agree on your point about putting semantics into the URIs for rdf:_1, rdf:_2, rdf:_3 .... The URI should be treated as an opaque identifier and one should not have to parse it.

I am happy we agree on that! :-)

In general I'm personally not a fan of indicating the order of things by adding literal values. Using an RDF Collection is another option, especially as both Turtle and JSON-LD have sufficient syntactic sugar to make them palatable.

Agreed. If we decide that maintaining the order of rows is important, then collections may make a lot of sense. RDF/XML users may not like it, but we may want to ignore that, I do not know.

There is an efficiency price, of course. Syntactic sugar is one thing, but the fact is that for each entry we get 2-3 more triples. For smaller CSV files we do not care. Do we care if the CSV file is GB level (or more)?
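
The overhead can be estimated with a small sketch that expands a Collection into its rdf:first / rdf:rest triples and compares it with flat csvw:row links. It counts only the triples linking the table to its rows; the list cell labels and helper names are assumptions of this example, and csvw:rows is the property name proposed earlier in this thread.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CSVW = "http://www.w3.org/ns/csvw#"


def flat_link_triples(table, rows):
    """One csvw:row triple per row; order is not recorded."""
    return [(table, CSVW + "row", row) for row in rows]


def collection_link_triples(table, rows):
    """Expand an RDF Collection into explicit rdf:first / rdf:rest triples."""
    if not rows:
        return [(table, CSVW + "rows", RDF + "nil")]
    cells = [f"_:cell{i}" for i in range(1, len(rows) + 1)]
    triples = [(table, CSVW + "rows", cells[0])]
    for i, row in enumerate(rows):
        rest = cells[i + 1] if i + 1 < len(cells) else RDF + "nil"
        triples.append((cells[i], RDF + "first", row))
        triples.append((cells[i], RDF + "rest", rest))
    return triples


rows = [f"_:row{i}" for i in range(1, 5)]
flat = flat_link_triples("#table", rows)
listed = collection_link_triples("#table", rows)
print(len(flat), len(listed))  # 4 9
```

Under these assumptions, n rows cost n linking triples in the flat form and 2n + 1 in the Collection form, so the difference grows linearly with the size of the CSV file.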

Ivan


@6a6d74
Contributor

6a6d74 commented Mar 4, 2015

The csv2rdf document has now been rewritten to reflect the decisions of the f2f meeting in London, Feb 2015.

@6a6d74 6a6d74 closed this as completed Mar 4, 2015