Proposal for modern handling of errata-corrige or greater changes in ePub3 publications #24
Comments
In my reading, this is out of scope for this Working Group, for two reasons
|
There is a specification for interoperable distribution/interchange of annotations in EPUBs, but it died with the IDPF and no one has shown any interest in reviving it: http://idpf.org/epub/oa/ The change from the CG's "Open Annotation" to the WG's "Web Annotation" didn't materially affect the EPUB spec., as far as I recall, but it did get CFIs into the list of selectors in the W3C version. At any rate, instead of closing shouldn't this just be transferred to the CG's repository to let them decide how to pursue? |
I saw that, but it seems to be about annotations at publication level; correct me if I am wrong. My proposal is mainly about users' annotations in regard to that. When it is not possible to put the annotation in the exact place, the RS could present the user a comparison between the two versions of the HTML element or structure. This is also related to epubcfi values because they can point even to an x,y position on images, for example, and some RSs can annotate up to that level of precision. But epubcfi values are very prone to being disrupted by changes in HTML code, even with the hypothetical improvement of using IDs instead of XML indexes. Regards |
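To make the "comparison between the two versions" idea concrete, here is a minimal sketch of a word-level diff that a reading system could show when an annotation can no longer be placed exactly. It uses Python's standard `difflib`; the function name `compare_versions` and the sample sentences are hypothetical and not part of any EPUB specification or existing reading system.

```python
import difflib


def compare_versions(old_text: str, new_text: str) -> list[str]:
    """Return a word-level diff between two versions of an element's text.

    Lines start with "  " (unchanged), "- " (only in the old version)
    or "+ " (only in the new version).
    """
    old_words = old_text.split()
    new_words = new_text.split()
    # ndiff also emits "? " hint lines for similar items; drop them here.
    return [
        line
        for line in difflib.ndiff(old_words, new_words)
        if not line.startswith("?")
    ]


# Hypothetical old and new text of one corrected paragraph.
old = "The hypotenuse is the shortest side of a right triangle."
new = "The hypotenuse is the longest side of a right triangle."

for line in compare_versions(old, new):
    print(line)
```

A real reading system would probably diff the markup as well as the text, but even a text-only comparison tells the user whether the annotated passage was touched by the correction.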
If this is what you are talking about, then this means introducing a mechanism within (X)HTML and not specifically to EPUB. The approach of this WG is to (try to) keep away from adding any mechanism into HTML and rely on, instead, the HTML standard as is. (Older versions of EPUB have gone down the route of EPUB-specific features, like
of course, but only if @P5music agrees. |
I do not mean adding new features to (X)HTML. This can be handled at the ePub level, I think. Maybe some methods could be devised and discussed informally here before submitting a proposal to another group. For example, what about: Thanks |
The open annotation specification isn't restricted to publisher-authored annotations, if that's what you're getting at. It allows anyone to author and distribute annotations and also allows for annotations to be interchanged between reading systems regardless of who they were authored by. It doesn't handle changes to the document content, of course (i.e., as in providing a revision history).
That doesn't sound possible to me. Even if CFIs worked on the DOM (which they don't), you can't insert elements and then expect a CFI that is based on element position to still work. Whether you hide the element or use a comment, it's in the markup/DOM, so you're shifting the position of every sibling element that follows that new markup. CFIs already have methods for correction, though. Beyond IDs, you can also add text locators to help reposition the annotation in case of changes. Web Annotations has similar selectors that can adapt to minor changes in the markup. Flexibility generally has to be found at the text level, as selectors solely based on element structure are always going to be brittle. Packing more elements into a document to retain its history sounds like it will lead to greater and greater brittleness with every change made to a publication.
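The sibling-shift effect can be shown with a small sketch. It is a simplification: it computes only the even-numbered element steps that CFIs assign to child elements (the n-th child element gets step 2n) and ignores text nodes, the package and spine part of a real CFI, and ID assertions. The helper `element_steps` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def element_steps(root, target):
    """Return CFI-style even-numbered child steps from root to target, e.g. '/4'."""

    def walk(node, path):
        if node is target:
            return path
        # Iterating an Element yields its child elements only.
        for position, child in enumerate(node, start=1):
            found = walk(child, path + [2 * position])
            if found is not None:
                return found
        return None

    steps = walk(root, [])
    return None if steps is None else "".join(f"/{s}" for s in steps)


doc = ET.fromstring(
    "<body><p id='a'>One</p><p id='b'>Two</p><p id='c'>Three</p></body>"
)
target = doc.find("p[@id='b']")
before = element_steps(doc, target)

# A hidden "old version" element placed before the target shifts its step.
old = ET.Element("p", {"id": "old_b", "hidden": "hidden"})
doc.insert(1, old)
after_insert = element_steps(doc, target)

# The same element placed after the last sibling leaves earlier steps alone.
doc.remove(old)
doc.append(old)
after_append = element_steps(doc, target)

print(before, after_insert, after_append)
```

In this simplified model only elements that come before a position affect its steps, which is why hidden markup in the middle of a document breaks the paths of everything that follows it.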
Maybe, but this sounds complicated to implement and is certainly beyond the scope of this group. As I understand it, you're effectively asking publishers to make their documents a record of every change that has ever occurred to them. There's also a missing component of how all these current and past fragments are linked together so a machine can understand what it's processing. That's really going beyond EPUB into devising a new model for HTML that allows you to view a document's change history over time. If that's the primary objective of your proposal, the proper route for proposing new HTML features is to go through WICG as it wouldn't be something this group could implement. |
I am interested in the open annotation document, I will read.
The most sensible method among the proposed ones above could be appending further elements beyond the end of the official document. No change required for old-style readers or publishers. In that region of the document, there can be also older versions of the same element (that is, older than the older version); indeed every element can be the older version of another element, provided that the newer version has the special attributes that are needed, like oldVersionID='chap01_par7_ver2' versionTimestamp='2014-03-20T09:32:30Z'. I see that annotations have timestamps, as expected, so RSs have the possibility to manage that.
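A minimal sketch of how a processor might follow such a chain of older versions. The attribute names `oldVersionID` and `versionTimestamp` come from the proposal; everything else is an assumption made for illustration: the helper `version_history`, the `_ver1` ID, the 2013 timestamp, the use of `hidden`, and the reading of `versionTimestamp` as the moment an element replaced the one it points to.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the current paragraph first, older versions appended
# at the end and marked hidden, each pointing to the version before it.
XHTML = """<body>
  <p id="chap01_par7" oldVersionID="chap01_par7_ver2"
     versionTimestamp="2014-03-20T09:32:30Z">Third wording.</p>
  <p id="chap01_par7_ver2" hidden="hidden" oldVersionID="chap01_par7_ver1"
     versionTimestamp="2013-11-02T10:00:00Z">Second wording.</p>
  <p id="chap01_par7_ver1" hidden="hidden">First wording.</p>
</body>"""


def version_history(root, element_id):
    """Return (id, versionTimestamp, text) tuples, newest version first."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    history, seen = [], set()
    current = by_id.get(element_id)
    # The 'seen' set guards against accidental cycles in the ID references.
    while current is not None and current.get("id") not in seen:
        seen.add(current.get("id"))
        history.append(
            (
                current.get("id"),
                current.get("versionTimestamp"),
                "".join(current.itertext()),
            )
        )
        current = by_id.get(current.get("oldVersionID"))
    return history


history = version_history(ET.fromstring(XHTML), "chap01_par7")
for entry in history:
    print(entry)
```

A reading system could compare an annotation's own timestamp with the `versionTimestamp` values in this list to decide which version the annotation was made against.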
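It is true, but in common cases it is very lightweight and it is just a matter of keeping elements in some tidy fashion at the end of the documents. Usually minor changes are involved.
I do not think it is about HTML pages at all, but just ePub3. It can be optional, explaining how to comply, that is,
for publishers
-put old elements as hidden elements beyond the end of the "official document"
-put two attributes like oldVersionID='chap01_par7_ver2' versionTimestamp='2014-03-20T09:32:30Z'
for readers or epubcfi processors
-when traversing the DOM (or just as a matter of algorithm), know that when those attributes are encountered epubcfis are not valid anymore after that point (= in that DOM tree branch); special features are needed to read the old structure if the epubcfis or annotation positions are there.
I think that it would not be impossible for an epubcfi library. I just know that from the library of one of the readium siblings. It's good for calculating epubcfis from elements but not vice-versa; however, it could be my fault for not understanding how it works. I know this is huge, it is just a proposal. Regards |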
Nice...:)
…On Tue, 14 Dec 2021, 15:24 P5music, ***@***.***> wrote:
@mattgarrish <https://github.com/mattgarrish>
@iherman <https://github.com/iherman>
I am interested in the open annotation document, I will read.
Even if CFIs worked on the DOM (which they don't), you can't insert
elements and then expect a CFI that is based on element position to
still work. Whether you hide the element or use a comment, it's in the
markup/DOM, so you're shifting the position of every sibling element that
follows that new markup.
There's also a missing component of how all these current and past
fragments are linked together so a machine can understand what it's
processing.
The most sensible method among the proposed ones above could be appending
further elements beyond the end of the official document. No change
required for old-style readers or publishers.
The appended elements are referred to by an ID.
The publisher knows that "beyond the end of the document" means that those
elements are hidden and they are not calculated when XML indexes are
considered, because they are beyond the last official element
(corresponding to the last possible official index).
In that region of the document, there can be also older versions of the
same element (that is, older than the older version), indeed every element
can be the older version of another element, provided that the newer
version has the special attributes that are needed like
oldVersionID='chap01_par7_ver2' versionTimestamp='2014-03-20T09:32:30Z'
All the older elements (and older than the older ones) are at the end of
the document. They are in a sequence, it's not particularly error-prone.
I see that annotations have timestamps, as expected, so RSs have the
possibility to manage that.
Packing more elements into a document to retain its history sounds like it
will lead to greater and greater brittleness with every change made to a
publication.
It is true, but in common cases it is very lightweight and it is just
a matter of keeping elements in some tidy fashion at the end of the documents.
Rare cases are also acceptable because it is a method whose usefulness is
that it does not break annotations if willfully implemented: sometimes it
could be very important and felt.
The rare cases could include rewriting an entire section. That can be huge,
but it is very good that the annotation positions are not lost, especially
when they are an important part of the publication but are a separate
publication themselves, or they are simply the user's content.
Even if it adds a footprint to the publication, it is up to the publisher
not to put many mistakes or huge ones. This should only be a spec for being
able to implement it if necessary as an emergency device.
Usually minor changes are involved.
If an entire section is changed, especially its structure, epubcfis are
lost to a great extent, but as I said, in that case the most powerful or
institutional RSs can inform the user with a complete comparison, and
retrieval.
That's really going beyond EPUB into devising a new model for HTML
I do not think it is about HTML pages at all, but just ePub3, it can be
optional, explaining how to comply, that is,
for publishers
-put old elements as hidden elements beyond the end of the "official
document"
-put two attributes like oldVersionID='chap01_par7_ver2'
versionTimestamp='2014-03-20T09:32:30Z'
for readers or epubcfi processors
-when traversing the DOM (or just as a matter of algorithm) please know
that when those attributes are encountered epubcfis are not valid anymore
after that point (= in that DOM tree branch), you have to use special
features to read the old structure if your epubcfis or annotating positions
are there.
I think that it would not be impossible for an epubcfi library. I just know
that from the library of one of the readium siblings. It's good for
calculating epubcfis from elements but not vice-versa, however it could be
my fault not to understand how it works.
An official JS library for handling epubcfis should be created, maybe
encompassing the new improvements.
I know this is huge, it is just a proposal.
Regards
|
... and, therefore, out of scope for this WG at this moment (I think I have already said that before). Could you, please, move this to the Community Group? That is where incubation ought to happen! |
Right, we may want to note more prominently on the repository main page that this WG is not the correct place for incubating new ideas. @P5music this Working Group was created to standardize the core EPUB 3 specifications in W3C. We're using the Community Group to develop new ideas and find traction for IDPF specifications without much adoption before bringing them to this group to standardize. I'm going to transfer this issue across to the CG's repository. |
Proposal for modern handling of errata-corrige or greater changes in ePub3 publications
ePubs can have subsequent editions, but here I mean a different kind of change that can be made to ePubs:
I know that when you create an ePub publication with some of the major ePub self-publishing firms, it is possible to upload a corrected version of the same ePub, which customers can download to update their copy if they want.
This means that the HTML code in the XHTML documents could change with no change in the ISBN or other publication ID metadata.
The above-mentioned firms also have annotation systems (I do not know what their approach to this problem is). And many RSs also have annotation systems.
This proposal is to overcome the epubcfi limitations when something changes.
Or it is general and does not deal with the epubcfi system at all, given its limitations.
This would encourage publishers to make errata-corrige changes respecting reasonable constraints, or providing the old version of the modified elements.
I would like to know, and hear from you, whether it could be possible to devise a method of handling errata-corriges in ePubs that
(this is the proposal)
encompasses including the old version of an HTML element or structure alongside the new one, in a way that does not break the epubcfi values (or it does) but that still keeps the old element there, hidden, still in the DOM but not for display,
and even subsequent versions are possible as well (like versioning but inside the XHTML document, not as if it were a website with file versioning).
I think this could be tricky, but maybe HTML5 has something for that.
It could be something "included" in the new element, like a special attribute string with HTML inside (tricky for escaping with subsequent versions maybe?).
Or an attribute that, if present, refers to another element (it means "this element has a previous version, please see ID='old_chap01_par27' at the end of the DOM").
Or a special enclosing recursive tag, I do not know.
The main goal should be that the old epubcfi (or other kind of positioning value) that the annotation system holds is still useful to retrieve the new element, but the RS knows that it has changed, and it could even understand what changed and whether the old position is within that range or not.
It would be useful also when a mistake in the HTML layout appearance was made when releasing the publication.
That should be a method that also would be applied to an HTML structure, if necessary (this is vague but I mean something that is "bigger" than just an HTML element).
This proposal is mainly for the RSs to be allowed to retrieve the position of some annotation system even if something changes.
Epubcfi values are very prone to disruption when those changes happen. I made a previous proposal about forcing publishers to put an id on every HTML element so that it is redundant to put XML indexes. The proposal was abandoned. But in fact it was not even enough.
This new proposal is maybe better because
-it is not so expensive when single HTML elements change (that is the common case)
-it makes a sort of versioning system available.
It would be an improvement.
This is my idea, maybe it is so simple, but other ideas are welcome because I think this would be an important addition to ePub3 (or ePub4) specs.
Regards