First pass at integrating the metadata proposal #64

Merged
merged 2 commits into from Sep 19, 2017

@baldurbjarnason
Contributor

commented Sep 11, 2017

The changes are in two places. First, the infoset section is extended with a subsection on additional metadata, all of it optional, and on the role of external metadata files.

Secondly, we needed to make sure the manifest serialisation has a generic linking mechanism as that's going to be used by a lot of the web publication's moving parts.

First pass, so the usual caveats of the placement, structure, and wording being preliminary and quite possibly suboptimal apply. I also wasn't quite sure about what writing style to use. I'm also guessing at the appropriate HTML structure from the context.

The contributors ('we') here are:
+@BorisAnthony
+@hughmcguire
+@laudrain

(I hope I got everybody's GitHub user ids right. Let me know if I didn't.)


@iherman

Member

commented Sep 11, 2017

(Admin: do not worry for now about the 'all checks failed' mention, taking care of this...)

@iherman

Member

commented Sep 11, 2017

@baldurbjarnason,

first of all, thanks.

Three, overall minor, remarks.

  1. I am not sure I understand the following:

This mechanism is used in to express many parts of the web publication's infoset, including but not limited to:

  • Canonical identifier
  • External Table of Contents
  • External metadata

The "external metadata" is clear. "External Table of Contents" is less clear; maybe it is worth giving an example of what you mean, and how it differs from the ToC that is part of the infoset. What is not clear is "Canonical identifier". Do you mean something like "Additional Identifiers" beyond the one that is part of the infoset? I can imagine, say, a scholarly publication having both a DOI and an ISBN, in which case the author might use the DOI as the "real" identifier in the infoset and add the ISBN in an external file, or simply reuse the same identifier by attaching it to the dc namespace. But I am not sure that is what you meant and, if so, it would be worth stating it more clearly.

  2. An extra remark that was in your original document and would, imho, be worth adding as, say, an editorial Note, is that external metadata may be expressed in well-known vocabularies like schema.org, Dublin Core, or ONIX, and authors should be encouraged to re-use existing vocabularies wherever they can. This is an informal note, i.e., not a normative statement, but it would be good to "place" things into the right context.

  3. A final, editorial remark: it is probably better, in view of the overall style, to put Web Mentions, Atom, or OPDS into the reference list, rather than (or in addition to) linking to them in the document.

Thanks!

@HadrienGardeur

Contributor

commented Sep 11, 2017

I really like this pull request but have one question for @baldurbjarnason: have you considered support for URI templates in the general linking mechanism?

This would help if we want to link to services (search, dictionary and others) instead of static resources.
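
To make the question concrete, here is a minimal sketch of what a templated link could look like and how a user agent might expand it. It is an illustration only, not a structure proposed in this PR: the `rel`, `href`, and `templated` names, the `example.org` URL, and the helper functions are all hypothetical, and the expansion covers only simple `{name}` expressions (RFC 6570 Level 1).

```python
import re
from urllib.parse import quote


def expand_template(template, variables):
    """Expand simple `{name}` expressions (RFC 6570 Level 1 only)."""

    def substitute(match):
        name = match.group(1)
        # Undefined variables expand to the empty string.
        value = variables.get(name, "")
        # Level 1 expansion percent-encodes everything outside the unreserved set.
        return quote(value, safe="")

    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


# Hypothetical link entry; the key names are assumptions, not part of the draft.
search_link = {
    "rel": "search",
    "href": "https://example.org/search?q={query}",
    "templated": True,
}


def resolve(link, variables):
    """Return a usable URL for a link, expanding it only if it is templated."""
    if link.get("templated"):
        return expand_template(link["href"], variables)
    return link["href"]


print(resolve(search_link, {"query": "web publications"}))
```

A static resource would simply omit the `templated` flag, so the same link shape could serve both static resources and services.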

@mattgarrish

Member

commented Sep 11, 2017

A few additional editorial comments to Ivan's:

  • I'm a bit confused by having "should" requirements in additional metadata. It puts into question how we tie in a pointer from the requirements. The title and language are also only should requirements, so what separates these? If they get that priority, I'd break them out and list them as subsections and leave what remains perhaps as "optional metadata".

  • At least as I understand the infoset, it's a generic set of properties that get exposed. At some point, we have to map what a "title" is from the manifest, for example (exact same name, must be a dc:title, something else). Which I only note because I wonder how we can make classes of properties, like accessibility metadata, members of the infoset without naming them. It also raises questions for me about extensibility. Are we dealing with vocab prefixes in the compiled infoset, for example? As I write I see this becoming a separate issue, but I wonder if the infoset is, in fact, nothing more than the processing instructions for the manifest. If we resolved on JSON, as I recall we did a meeting or two ago, we're no longer dealing with consolidating multiple serializations.

  • Similarly, the "linking from" section is probably more abstract than it needs to be in a single-serialization world. I'm expecting details like this will be stated explicitly (i.e., where in the json object they occur, how they're named, etc.)

Those nitpicks aside, though, I'm generally fine with the direction of the instructions in this PR.

@baldurbjarnason

Contributor Author

commented Sep 18, 2017

I've made a few edits to try to address @iherman's and @HadrienGardeur's comments.

I'm not quite sure what's the best approach for me to integrate @mattgarrish's feedback, though.

I'm a bit hesitant to jump the gun on proposing concrete JSON structures (and if I had to at this point, I'd just crib off what Readium has been doing).

I bundled the pieces in additional metadata together there because they are much less ambiguous than titles, reading order, or language, or whose ambiguity will be largely resolved by the serialisation format (e.g. creator + role) or other groups (e.g. accessibility metadata).

Now, I don't disagree that they might be better off as individual entries (perhaps merging the date parts into one entry) but I'm also a bit limited in the time I can contribute to W3C things (hence this work happening over the weekend). I might not have time to write up the 3-4 additional subsections required until next weekend at the earliest.

@iherman

Member

commented Sep 18, 2017

@baldurbjarnason,

I've made a few edits to try to address @iherman's and @HadrienGardeur's comments.

Thanks

I'm a bit hesitant to jump the gun on proposing concrete JSON structures (and if I had to at this point, I'd just crib off what Readium has been doing).

I think we should hold off on that. The issue of serialization is still ahead of us; we do not know, for example, whether we would use JSON-LD or not, or whether we would build on top of WAM or not, and these decisions would affect the precise serialization.

@mattgarrish,

I'm a bit confused by having "should" requirements in additional metadata. It puts into question how we tie in a pointer from the requirements. The title and language are also only should requirements, so what separates these? If they get that priority, I'd break them out and list them as subsections and leave what remains perhaps as "optional metadata".

Which may mean that the items in the new list by @baldurbjarnason are MAYs and not SHOULDs. More exactly: if some are really to be held as SHOULDs, then they should move together with the older list. Looking at the "SHOULD" list of 3.10, the publication and modification dates have more of a MAY flair to me, although I am not 100% sure; the list of creators and a11y look more like real SHOULDs, and maybe they could be moved into 3.2 as SHOULDs.

As I write I see this becoming a separate issue, but I wonder if the infoset is, in fact, nothing more than the processing instructions for the manifest. If we resolved on JSON, as I recall we did a meeting or two ago, we're no longer dealing with consolidating multiple serializations.

I am not sure I agree with this (yet). We do have a number of fallback procedures described in the various sections, meaning that not all of these information items would necessarily appear in a concrete manifest for a specific publication. It is not a matter of multiple serialization per se, but conceptually things are still different.

But we can come back to these later.

@iherman

Member

commented Sep 18, 2017

@baldurbjarnason, just nitpicking: I wonder whether the last paragraph on using URI patterns should be listed in the document as an ISSUE (either there, or by adding a separate GitHub issue and making a reference to it); it is a more controversial question. (We already had a discussion on this, I believe.)

But that can be done as part of a general cleanup

@mattgarrish

Member

commented Sep 18, 2017

We do have a number of fallback procedures described in the various sections, meaning that not all of these information items would necessarily appear in a concrete manifest for a specific publication.

Yes, but these are steps in processing the manifest to obtain the information, right? If you don't find certain bits of information in the one serialization, go get it from X.
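
As a rough sketch of that reading, a fallback is just a step in one processing path. Everything here is an assumption for illustration: the draft does not define a `title` key, and falling back to the entry page's HTML title is a hypothetical rule, not one taken from the spec.

```python
def resolve_title(manifest, html_title=None):
    """Return (title, source), falling back when the manifest omits the title."""
    title = manifest.get("title")
    if title:
        # The information item is present in the serialization itself.
        return title, "manifest"
    if html_title:
        # Hypothetical fallback: take the value from somewhere else ("X").
        return html_title, "fallback"
    # Nothing found: the infoset item stays empty.
    return None, "missing"


print(resolve_title({"title": "Moby-Dick"}))
print(resolve_title({}, html_title="Moby-Dick; or, The Whale"))
```

Either way, the caller ends up with a single resolved value, which is why the result looks the same as the output of ordinary manifest processing.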

Given that we have an unfilled section for processing the manifest, and that people find the infoset confusing, I just wonder what value it's really providing. If there is only one path for processing the infoset (regardless of fallback processing), then there doesn't seem to be anything special about it.

But, if we're planning to expose the infoset, my view maybe changes. (But in that case, it needs better definition.)

I think we should hold off with that.

That's fine with me. I just think longer term that the section on linking from belongs under serialization, since it eventually has to become a concrete part of how the manifest is constructed. I wasn't expecting a rewrite to be specific at this time, only musing that maybe it should be a subsection of serialization.

@iherman

Member

commented Sep 19, 2017

Per WG Decision this will be merged and issues made explicit in the issues' list.

@iherman iherman merged commit b6da65f into w3c:master Sep 19, 2017

1 check passed

ipr PR deemed acceptable.