Changes to infoset and extensibility #83

Merged
merged 3 commits into from
Oct 11, 2017

Conversation

mattgarrish
Member

@mattgarrish mattgarrish commented Oct 9, 2017

Possible mods to address issue #74:

  • Adds sections for "should" metadata and merges list into requirements section.
  • Removes the additional metadata section and replaces with short statement about extensibility.

Preview | Diff

@laudrain

laudrain commented Oct 9, 2017

It's ok for me too.

@iherman
Member

iherman commented Oct 9, 2017

I approved the merge, but I do have one question on the Accessibility metadata. AFAIK, and thanks to the work of @avneeshsingh and @marisademeglio, we now have the accessibility metadata as part of schema.org. On the other hand, at least for my taste, the information items are, essentially, those that belong (in the abstract) to "our" namespace, whereas other metadata should be linked. Or do we want to include schema.org terms as part of the items listed in the manifest?

@mattgarrish
Member Author

Right, I don't know yet exactly how info is compiled from the manifest, since we're lacking a concrete serialization. I was questioning how we represent properties in the original PR for this, but sort of veered off into whether the infoset is necessary.

I assumed this was a placeholder for schema.org metadata, which could be prefixed or have a mapping through an initial context, but that's why I kept the description light for now.

@mattgarrish
Member Author

Maybe it would help to have an ednote at the start of the infoset section that says the details of how the information is compiled are still TBD, as they depend on the manifest?

@eshellman

I have to say, the language around "publication date" reflects an antiquated, pre-web notion of the publication "event". What function does this "publication date" have? If it is meant to serve copyright-related decisions, then the language (and the machinery) are poorly crafted.

@dauwhe
Contributor

dauwhe commented Oct 10, 2017

@eshellman I think the idea of a publication date transcends formats. It locates a publication in time, which is critical for the reader's understanding of the content. And it's intrinsic to the very act of publishing, where something that was not public is made public. Publication date is also a fundamental way of sorting groups of publications.

Would you propose different language? Make a publication date optional? Do you envision publications where the idea of a publication date does not make sense?

@laudrain

@eshellman some concepts of publishing are not merely pre-web but centuries old, among them authorship and the reliability of information over time. That is where "publication date" takes on meaning.
Also, IMHO, in the Web era, anonymous and undated information is quite likely to be unreliable. A publication date may be useful for any Web Publication.

@mattgarrish
Member Author

Putting my neutral editor's hat on, let's not argue the merit of the properties in this PR. As Ivan noted in #74, these were discussed by the working group and agreed for inclusion in the FPWD. Which survive and at what priority will be taken up again, and more detail will no doubt need to be provided on their use.

Please feel free to open new issues against any of them if you don't agree they belong. We can always add issue pointers.

If anyone would like to propose alternative wording, I'm always happy to make changes.

@eshellman

@dauwhe I always find that real-world examples are useful. What is the publication date for this web publication: https://www.gutenberg.org/ebooks/1 ? How about this one: https://press.rebus.community/introwgss/ ? Your suggestion that the publication date should be used for sorting is an interesting one. It suggests that we should have an 18th-century web publication date for the first one, or maybe 1971? When should a new publication date be assigned?

@dauwhe and @laudrain I am aware of the traditional meaning of "publication date", but "wpub" is ostensibly something new. It is less than clear how the working group intends it to be implemented in the wpub context, or how it ought to be used. Having participated in standards work myself, I observe that ambiguity about intended use leads to diverse implementations and general uselessness.

There is some useful guidance for "modification date", appropriate for this stage in the process.

Happy to move discussion to a separate issue. My observation in #74 was that the document is muddy about "why", which seems to be a prerequisite for the "what" decisions that are elements of this PR. I can't suggest language myself without mind-reading the working group.

@baldurbjarnason
Contributor

baldurbjarnason commented Oct 10, 2017

Like @eshellman I'm very sceptical about the concept of a publication date but decided not to debate it so far since it felt like I was in a minority of one. 🙂

The problem with publication date is that it is a very ambiguous term in general, even with physical books, where its meaning has only achieved its current clarity through centuries of context-specific refinement. Digital publications haven't had the same time. 'Publishing' on the web is an ongoing process in most cases. The term is made more ambiguous still because web 'publishing' is generally about updates to access control and not about the publication itself, per se.

I know that most of us on the web side of things will just use it as a synonym for creation date—which is a more meaningful piece of data in our context but not what 'publishing date' actually means.

Switching the spec to use creation date instead of publishing date would solve a lot of these issues, as that's unambiguous in the web context (the date this particular web resource was created, no matter what access controls are in place or its legal status) while still being able to serve the publishing industry for indicating publication date.

@baldurbjarnason
Contributor

Also, creation date maps much better to the metadata in other web formats (Activity Streams, Atom/RSS, etc.) and so would be better for interoperability.

@baldurbjarnason
Contributor

Actually, never mind what I said above. Both Activity Streams and Atom use 'published'.

(Should do your research before you blather on, Baldur!)

Atom:

The "atom:published" element is a Date construct indicating an instant in time associated with an event early in the life cycle of the entry.

Typically, atom:published will be associated with the initial creation or first availability of the resource.

As long as we use language similar to Atom's and let people use it for whichever purpose makes the most sense in their context, I'm fine with 'published'.

@eshellman

@baldurbjarnason's suggestion works for me, except that where Atom talks about component resources, wpub is more about a collection of resources, each of which may have its own metadata.

@mattgarrish
Member Author

What about changing as follows:

The publication date is the date on which the Web Publication was originally created/published. The exact moment of publication is intentionally left open to interpretation: it could be when the Web Publication is first made available online, or could be a point in time before publication when the Web Publication is considered final.

Do not change the publication date when a Web Publication is updated as it represents a static event in the lifecycle of a Web Publication and allows subsequent revisions to be identified and compared.

I took the "must not" out of the second para, too, as there's no way to enforce no changes to the date unless you compare two manifests. Probably best not to start adding aspirational requirements.

@baldurbjarnason
Contributor

Works for me.

@tcole3
Contributor

tcole3 commented Oct 10, 2017

In my experience, dates of 'creation' are even harder to pin down than publication dates. For a journal article, are you talking about dateSubmitted, dateAccepted, or dateCopyrighted, all of which tend to happen prior to 'publication' (and all of which happen to be date properties in Dublin Core)? Preprints (arxiv.org) are also more problematic if you say 'created'. It would be better to leave out 'created'.

The other problem you may have (but is at least in part unavoidable) is with derivative and reformatted works, aka (in the print world) editions, as distinct from versions or printings, especially given your second paragraph. In print, each edition has its own publication date, but each printing tends not to. Large format is sometimes a separate edition, sometimes not. People seem to have an even harder time with this idea in the digital world, especially when publishing an item that was previously published in print or in another digital format. Preprints are one case, as mentioned, but there is even the question of whether a PDF of an article originally published in TeX gets a distinct publication date. Mostly this issue will have to be left to be sorted out by publishers, and we should anticipate variations. But I'm a little concerned that the second paragraph will muddy the waters more than help.

@mattgarrish
Member Author

Yes, this is the general trouble with dates. No short explanation is going to cover all cases and issues.

What if we stay silent on requirements entirely and just use the description of the purpose instead:

The publication date is the date on which the Web Publication was originally published. It represents a static event in the lifecycle of a Web Publication and allows subsequent revisions to be identified and compared.

The exact moment of publication is intentionally left open to interpretation: it could be when the Web Publication is first made available online, or could be a point in time before publication when the Web Publication is considered final.

@RachelComerford

I like that solution - set the standard and then we can separately document implementation suggestions for appropriate audiences.

@laudrain

@mattgarrish I also support your proposal

@mattgarrish
Member Author

I'll argue that we have at least a rough consensus now, so I'm going to go ahead and merge these changes. Please open new issues to address any additional specific concerns you might have.

@mattgarrish mattgarrish merged commit e060e2a into master Oct 11, 2017
@mattgarrish mattgarrish deleted the infoset-ext branch October 11, 2017 12:20
@eshellman

In a few days, if no one else has already done so, I will summarize the "publication date" discussion and write up a proper issue with background and context.

@rdeltour rdeltour removed their request for review October 30, 2018 13:28

8 participants