CSV on the Web #55

Closed
torgo opened this issue Apr 27, 2015 · 10 comments
@torgo
Member

torgo commented Apr 27, 2015

TAG alumna @JeniT has requested TAG feedback on the CSV on the Web working drafts.

@torgo
Member Author

torgo commented May 20, 2015

We will discuss on telecon June 03 - @JeniT to join.

@hadleybeeman
Member

To discuss with the TAG:

  1. We have a facility to enable transformations over tabular data using templates or scripts [1], to provide for transformations beyond those we’ve defined for JSON and RDF. In doing this we need to be able to indicate the format of both the result of the transformation and the format of the template or script that is being used.

We think that the “correct” way of doing this would be to use media types. However, it’s quite rare for templating syntaxes (such as Mustache) to have a registered media type, so instead we have opted to use URLs to name those formats and encourage users to use URLs in the form http://www.iana.org/assignments/media-types/{mediatype} when there is a registered media type. Is this the right approach to take or should we be more insistent on the use of a media type?
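As a minimal sketch of the naming convention described above, a format URL for a registered media type could be built as follows. The function name `format_url` is hypothetical, and the sketch assumes the caller already knows the media type is registered; it does not check the IANA registry.

```python
# Base URL taken from the pattern quoted in the comment above.
IANA_MEDIA_TYPES_BASE = "http://www.iana.org/assignments/media-types/"


def format_url(media_type: str) -> str:
    """Return the IANA-based URL naming a registered media type.

    Hypothetical helper: it only concatenates strings and does not
    verify that the media type is actually registered.
    """
    return IANA_MEDIA_TYPES_BASE + media_type.strip().lower()


print(format_url("application/json"))
# prints http://www.iana.org/assignments/media-types/application/json
```

A format without a registered media type would instead be named by whatever URL the publisher chooses, which is the flexibility the question is about.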

I think this is most appropriate, to future-proof the spec. The last thing we want to do is to tie people to the list of formats the working group currently use (which may or may not be representative of everything out there) or to the list of everything currently in use (with no scope for future expansion). Further, it makes sense to not restrict this to the list of media types, which has its own priorities and delays.

  2. In the conversion to RDF, we want to use the ‘describes’ link relation defined in [2] to say that a particular row in the tabular data describes a particular thing (such as a person or event). Because this is RDF, the relationship has to have a URL.

However, as has been discussed elsewhere [3], IANA registered link relations do not have individual URLs and http://www.iana.org/assignments/link-relations/describes doesn’t resolve. Similarly, the link relation wiki doesn’t have individual URLs for link relations. We decided to create a URL for this relationship in our own namespace, with a reference to the proper definition (see discussion at [4]), but hope that this case might prompt the TAG to try to get some movement on this issue.

I'm in favour, but I don't know enough about our relationship with IANA to know how to take this forward. Suggestions?

  3. The model of access that we’re assuming for CSV and other tabular data files is that someone will link directly to the CSV file (as currently) and that processors will need to retrieve a metadata file about that CSV based on the location of the CSV file. Note that metadata files are file-specific; we wouldn’t expect a single metadata file that includes information about every CSV file on a particular site.

We think that the “correct” way of getting this pointer to a metadata file (given that there is no scope for embedding information within the CSV file itself) is to use a Link header that points to the metadata file, and we have specified that here [5].

However, we recognise that there are many publishing environments in which it is impossible for users to set HTTP headers, particularly on an individual file basis. We have therefore specified two other mechanisms to retrieve metadata files, used only if the URL of the original CSV file doesn’t include a query string:

  • appending ‘-metadata.json’ to the end of the URL to get file-specific metadata [6]
  • resolving the URL ‘../metadata.json’ against the URL to get directory-level metadata [7]

Neither of these feels great: they require users who can’t use Link headers to structure their URL space in particular ways, and they use string concatenation on URLs which is horrible. However, we can’t see any better alternative to meet our requirement for what is in effect a file-specific well known URI.
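The two fallback mechanisms can be sketched as below. This is an illustration under assumptions, not the working group's algorithm: the function name `candidate_metadata_urls` is hypothetical, the order of the results simply follows the order of the bullets above, the Link header lookup is not shown, and URL fragments are not handled.

```python
from urllib.parse import urljoin, urlsplit


def candidate_metadata_urls(csv_url: str) -> list[str]:
    """List the fallback metadata URLs for a CSV file URL.

    Hypothetical helper covering only the two fallbacks described above.
    """
    if urlsplit(csv_url).query:
        # The fallbacks apply only when the CSV URL has no query string.
        return []
    return [
        # File-specific metadata: plain string concatenation on the URL.
        csv_url + "-metadata.json",
        # Directory-level metadata: resolve the relative reference
        # '../metadata.json' against the CSV URL.
        urljoin(csv_url, "../metadata.json"),
    ]


for url in candidate_metadata_urls("http://example.org/data/table.csv"):
    print(url)
# prints http://example.org/data/table.csv-metadata.json
# prints http://example.org/metadata.json
```

The first candidate shows the string concatenation the comment objects to, and the second shows why publishers would have to structure their URL space around a fixed file name.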

I agree that appending a set string to a URL isn't ideal (and doesn't really fit with the kind of context-specific freedom we'd like people to have in setting their URL schemes), so I understand the concern. On the other hand, appending .json to a URL is common enough, and this is definitely human-readable, which helps. I don't see a better way to solve the problem, so I think this is the lesser of the evils. It might be worth an explicit review when CSVW and DWBP wind up and the next charters are being set up?

@torgo
Member Author

torgo commented Jun 2, 2015

@JeniT
Contributor

JeniT commented Jun 2, 2015

Could we also please discuss w3c/csvw#562 briefly.

And it's probably useful to have the link to the TAG thread: https://lists.w3.org/Archives/Public/www-tag/2015May/0014.html

@torgo
Member Author

torgo commented Jun 2, 2015

Sounds good – can you make a PR to the agenda file?

Dan

On 2 Jun 2015, at 11:00, Jeni Tennison notifications@github.com wrote:

Could we also please discuss w3c/csvw#562 briefly.

And it's probably useful to have the link to the TAG thread: https://lists.w3.org/Archives/Public/www-tag/2015May/0014.html


@mnot
Member

mnot commented Jun 8, 2015

WRT media types - it's extremely easy to register them now, see http://tools.ietf.org/html/rfc6838#section-3.2.

@hadleybeeman
Member

Where are we on this, procedurally? Jeni, when can you next join us? I'm really sorry to have been ill last week.

@torgo
Member Author

torgo commented Jun 8, 2015

No worries Hadley. We agreed on last week's call to rerun the topic this week - this Wednesday. Jeni has agreed to join us. I haven't published our minutes yet but I will do so by tomorrow. Dan

@torgo
Member Author

torgo commented Jun 9, 2015

See https://github.com/w3ctag/meetings/blob/gh-pages/2015/telcons/06-03-csv-minutes.md for minutes from last week's call.

@torgo
Member Author

torgo commented Jul 16, 2015

Discussed and closed at the Berlin f2f. According to @JeniT, "We’re planning to go to CR this week and I think the issues I’d brought to the TAG were resolved through the telcons."
