Locator Fragment(s), processing model #98
Comments
If anything, this group has shown that there's a total lack of consistency in which fragment will be used by various apps. If we had common agreement on this issue, I would be open to that argument, but I don't think that's the case at all. I'll also quote a comment I've made in the other issue:
Also to answer another of your comment:
I don't think that's true:
|
If the fragment syntax introduces string tokens with special significance (e.g. |
Could you please explain what you mean by this? I am not sure I understand. |
For the helpers, and taking JavaScript as an example, it could be to add functions/properties to the I don't have a strong preference for any of the two proposed models, both would work pretty well at least on mobile. However, after pondering your arguments, I don't think that DOM range should be a "first class citizen". IMHO we should either have all the locations as objects, or everything as a string fragment with property helpers to parse it behind the scenes. |
+1 to that, consistency is always better. "Everything as object" though would be harder to achieve IMO and would require:
These are all quite similar to how metadata are handled in RWPM. The main difference compared to RWPM is that these locators will need to be stored in a DB most of the time, which means that each app will most likely have to serialize/deserialize JSON as a string. I don't expect the "everything as object" route to require less work behind the scenes, the outcome will be mostly similar with some additional complexity. |
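As a rough illustration of the storage point above, here is a minimal sketch of the string round trip an app would perform. The `href` value and the overall Locator shape are assumptions made for illustration; only `locations.progression` and `locations.fragments` appear in this thread.

```javascript
// Hypothetical Locator; "href" is an assumed field and path, for illustration only.
const locator = {
  href: "chapter1.html",
  locations: {
    progression: 0.8,
    fragments: ["#paragraph-id"],
  },
};

// Serialize to a string before writing it to a database column.
const stored = JSON.stringify(locator);

// Deserialize after reading the row back.
const restored = JSON.parse(stored);

console.log(typeof stored, restored.locations.progression);
```

The same two calls apply whether the locations are plain strings or nested objects, which is why the storage cost alone does not favour either model.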
Consistency is good. So, Full example: Evidently, based on the overall state of the discussion, mainly between Mickael and Hadrien, and myself, my opinion falls into the minority bucket, so naturally I will follow the group consensus :) In recent discussions we have been dealing with several tangential issues such as: DOMRange specificities (structured object), the overall unclear processing model for when several kinds of fragment resolutions are expressed in a single Locator payload (e.g. CSS selectors / fragment-ID vs. progression vs. CFI character-level position, etc.), and the notion of I do very much appreciate the need to strike a reasonable balance in the Locator model in terms of generalization vs. specialization. However, as an implementor I will be disappointed to follow a design pattern that seems to run counter to the otherwise pragmatic architectural principles that underpin most of the Readium2 models / JSON serialization. If This approach not only introduces some degree of runtime overhead (i.e. in terms of performance and memory allocations), but there is also increased code complexity and testing costs due to the additional parsing / escaping rules. These downsides would be absolutely fine if the API consumers actually made use of the URI-fragment-friendly syntax, but none of the known implementations do. To borrow from the idiom "premature optimization": to me this feels like "premature encoding". If there is indeed a real, identified use-case for URI-friendliness (like there is in the TOC For example, in the desktop implementation Locator objects are central to several critical features, and although the volume of JSON traffic during event messaging is not huge, the processing cycles are not trivial (i.e. it "all adds up"). The reading location is emitted at a low rate from the "navigator" component (i.e. at every detected user interaction with the rendered document), but handling bookmarks, highlights/annotations, search results, etc. represents a more significant execution trace to debug into. Right now, the |
Progression as a concept works across all the different types of publication that we've considered so far (EPUB, PDF, audiobooks, comics) and isn't defined in any specification (it would have been a good candidate for Media Fragments URI 1.0 IMO). This is why having You can hardly make the same assessment for DOM ranges or CSS selectors, which are extremely specialized.
IMO this has nothing to do with the discussion about fragments, since you could have locations that contradict one another. For many reasons that we've listed before (we need context along with multiple fallback options), we can't live in the naïve world of "CFI only" anymore, therefore there is always a risk of having a location that contradicts either
I don't think that's true. The current situation with
As I've said before, I think that the "everything is an object" approach would be every bit as costly in terms of code complexity and parsing/escaping rules but it would also make the overall model more complex and would put the burden on maintaining a registry of locations entirely on us (vs the fragment approach where each media type is responsible for registering fragments).
In the context of Readium Web, there is: this could be a way to reference specific portions of a publication or jump to a location that doesn't have an id. It won't be as powerful as what we can do when passing a Locator to a navigator, but at least we'll be able to do something. A Publication Viewer will be able to use the exact same API that we use to jump to a locator. |
Let's take a concrete example - {
"locations": {
"cssSelector": "body.rootClass div:nth-child(2) p#para-id"
}
} Example Javascript consumer code running inside the web browser runtime that displays an HTML document (publication document): const paraElement = document.querySelector(locator.locations.cssSelector); Now with the {
"locations": {
"fragments": [
"cssSelector(body.rootClass div:nth-child\\(2\\) p#para-id)"
]
}
} Example crappy Javascript code (sanity checks removed, for brevity): const cssSelector = helperExtractCssSelector(locator);
const paraElement = document.querySelector(cssSelector) function helperExtractCssSelector(locator) {
for (const fragment of locator.locations.fragments) {
// the "fragmentIsCssSelector" piece of code returns true
// if the fragment string matches the `cssSelector(xxx)` syntax
if (fragmentIsCssSelector(fragment)) {
// the "removeCssSelectorScheme" piece of code
// removes the `cssSelector(xxx)` scheme in the string,
// returns the `xxx` value which may contain escaped characters
const escapedCssSelector = removeCssSelectorScheme(fragment);
// the "unescapeCssSelector" piece of code ensures escaped characters
// are un-escaped (e.g. double-backslashed parentheses for `:nth-child(yyy)` pseudo-class)
// and returns the ready-to-use CSS Selector value
return unescapeCssSelector(escapedCssSelector);
}
}
} Same logic for CFI expressions. |
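For discussion purposes, here is one possible sketch of the three helpers that the snippet above leaves undefined. The function names come from that snippet, but their bodies are hypothetical, and the escaping rule (a backslash before `(`, `)` or `\`) is an assumption, since the thread notes that no precise escaping rules have been defined.

```javascript
// Hypothetical helper bodies; the escaping rule below is an assumption, not a spec.
const CSS_SELECTOR_SCHEME = /^cssSelector\(([\s\S]*)\)$/;

// True if the string matches the `cssSelector(xxx)` syntax.
function fragmentIsCssSelector(fragment) {
  return typeof fragment === "string" && CSS_SELECTOR_SCHEME.test(fragment);
}

// Strips the `cssSelector(...)` wrapper and returns the still-escaped payload.
function removeCssSelectorScheme(fragment) {
  return CSS_SELECTOR_SCHEME.exec(fragment)[1];
}

// Turns `\(`, `\)` and `\\` back into `(`, `)` and `\`.
function unescapeCssSelector(escaped) {
  return escaped.replace(/\\([()\\])/g, "$1");
}

// Same payload as the JSON example above (String.raw keeps the JSON backslashes intact).
const locator = JSON.parse(
  String.raw`{"locations":{"fragments":["cssSelector(body.rootClass div:nth-child\\(2\\) p#para-id)"]}}`
);

const selectors = locator.locations.fragments
  .filter(fragmentIsCssSelector)
  .map(removeCssSelectorScheme)
  .map(unescapeCssSelector);

console.log(selectors[0]);
```

Even in this reduced form, the consumer needs a pattern match, a capture and an un-escaping pass before it can call `document.querySelector()`, compared with a single property read in the structured variant.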
I've already replied to that example @danielweck, it's only relevant if we have a known quantity of locations, which is definitely not the case. Readium Desktop seems to primarily use CSS Selectors and DOM Ranges (neither of which has a standardized representation as a fragment), but other implementations may use completely different locations (they can even be media specific, as we've seen with PDF, which is extremely useful). In practice, we won't be able to use My recommendation in such cases would be to simply write something specific to your own implementation (Readium Desktop), which is a bit of a shame but would be the only way to deal with the issue that you're facing. I don't believe that changing the model will solve the problem that you're facing. |
It is absolutely fine to have "helper" code that inspects objects for available fields, i.e. to test the presence of specific discoverable properties. Conversely, "helper" code that parses arbitrary strings in order to match supported tokens and to parse out meaningful values, is neither trivial nor efficient. |
If the iOS/Android implementations also have to produce/consume the flat linearized |
We are discussing this subject right now on the call, are you available to join @danielweck ? |
Regarding the escaping, maybe we don't need to do it at all? This unofficial spec doesn't do it:
|
If I understand some of your points correctly, @danielweck You don't want to have to build Classifiers/Declassifiers for each "raw" value you'll get. |
"Unofficial Draft 02 March 2012" |
Quite a bad omen... 😄 |
I fail to understand the counter-proposal though. Let me try to figure where this might go:
|
Why not :) The information about reading-locations/bookmarks/search-results/highlights-annotations is transported in Locator objects (for example, wrapped in marshalled event payloads that transit across the boundaries of webviews and platform-specific application layers, e.g. via postMessage / IPC). |
Regarding extensibility and discovery of the various "types" of locations / fragments, the elephant in the room is https://www.w3.org/TR/annotation-model/#selectors so let's lay down a concrete example, for illustration / discussion purposes. Here is a sample Locator object's JSON serialization (ouch, verbose and ugly, but follows the existing W3C spec.): {
"locations": {
"progression": 0.8,
"fragments": [
{
"type": "CssSelector",
"value": "body.rootClass div:nth-child(2) p#paragraph-id"
},
{
"type": "FragmentSelector",
"conformsTo": "http://tools.ietf.org/rfc/rfc3236",
"value": "paragraph-id"
},
{
"type": "FragmentSelector",
"conformsTo": "http://www.idpf.org/epub/linking/cfi/epub-cfi.html",
"value": "/4/2/8/6[paragraph-id]"
},
{
"type": "FragmentSelector",
"conformsTo": "http://www.w3.org/TR/media-frags/",
"value": "t=15"
}
]
}
} |
Instead, we could cherry-pick the {
"locations": {
"progression": 0.8,
"fragments": [
{
"type": "CssSelector",
"value": "body.rootClass div:nth-child(2) p#paragraph-id"
},
"#paragraph-id",
{
"type": "PartialCfi",
"value": "/4/2/8/6[paragraph-id]"
},
"t=15"
]
}
} |
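To make the consumer side of this mixed array concrete, here is a minimal sketch that separates structured entries from plain URI-fragment strings. The helper names `findTyped` and `uriFragments` are hypothetical; the data is copied from the example above.

```javascript
// Sample data copied from the mixed-array example above.
const locations = {
  progression: 0.8,
  fragments: [
    { type: "CssSelector", value: "body.rootClass div:nth-child(2) p#paragraph-id" },
    "#paragraph-id",
    { type: "PartialCfi", value: "/4/2/8/6[paragraph-id]" },
    "t=15",
  ],
};

// Hypothetical helper: value of the first structured entry with the given type, if any.
function findTyped(fragments, type) {
  const entry = fragments.find(
    (f) => typeof f === "object" && f !== null && f.type === type
  );
  return entry ? entry.value : undefined;
}

// Hypothetical helper: plain strings are treated as ready-made URI fragments.
function uriFragments(fragments) {
  return fragments.filter((f) => typeof f === "string");
}

console.log(findTyped(locations.fragments, "CssSelector"));
console.log(uriFragments(locations.fragments));
```

Discovery here relies on `typeof` and a `type` property check, with no string parsing or un-escaping for the structured entries.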
But, as mentioned on the conference call as a thought exercise (not an actual proposal), why encode Media Fragment URI timestamps with
...this potentially opens a whole can of worms because the |
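On the timestamp question, a consumer that receives the string form still has to recover numbers before calling a playback API. The sketch below is a deliberately reduced parser, assuming only the plain-seconds forms (`t=15`, `t=10,20`, `t=,20`, optionally prefixed with `npt:`); clock formats such as `hh:mm:ss` are not handled, and `parseTimeFragment` is a hypothetical name.

```javascript
// Minimal sketch: plain-seconds temporal fragments only; clock formats are not handled.
function parseTimeFragment(fragment) {
  const match = /^#?t=(?:npt:)?(\d+(?:\.\d+)?)?(?:,(\d+(?:\.\d+)?))?$/.exec(fragment);
  // Reject non-matching strings and the empty "t=" form.
  if (!match || (match[1] === undefined && match[2] === undefined)) {
    return null;
  }
  return {
    // A missing start is treated as 0 (assumption for this sketch).
    start: match[1] === undefined ? 0 : Number(match[1]),
    // A missing end is reported as null, i.e. "until the end of the media".
    end: match[2] === undefined ? null : Number(match[2]),
  };
}

console.log(parseTimeFragment("t=15"));
console.log(parseTimeFragment("t=10,20"));
console.log(parseTimeFragment("#paragraph-id"));
```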
I don't have one yet :) Like I said, I am happy to go with the group consensus to use |
This makes logical sense, but I would like to hear from other implementors too. Encoding information in a However, I am under the impression that most R2 apps do in fact transmit information in structured form, so linearizing parts of it seems counter-productive. Therefore I like the idea of enabling a well-recognized set of constructs (like CSS Selectors and Partial CFIs ... perhaps even timestamp in seconds?), elevating them as first-class citizen. |
Having first class linearization plus having |
Purely from an abstract data model point of view, the generalized Also, as suggested by Hadrien, there could be an extension point in However, the data type could also be object, for example to encode the 3 distinct fields of Personally, I would even push this reasoning as far as So, in practice, I don't see a valid reason for an implementation to produce the equivalent of Thoughts? |
I'll rephrase things a bit differently:
This means that the core model will remain the same and those three will be covered as part of our extensibility. |
Closing. See: #99 |
I have expressed reservations (see #95) about the suboptimal processing model proposed for the locations.fragments array of string values (see https://github.com/readium/architecture/blob/master/locators/README.md#fragments).
For example, the current proposal requires producer/consumer code to generate/parse ad-hoc fragment schemes for CSS Selectors and (partial)CFIs which lack precise definitions of escaping rules (see for example the (full)CFI border cases https://www.idpf.org/epub/linking/cfi/epub-cfi.html#sec-epubcfi-escaping ).
Consumer code must parse each string value in the locations.fragments array in order to extract meaningful information (i.e. to differentiate CFI from CSS Selectors, and even time Media Fragments, etc.). This introspection step seems unnecessary and costly, in the context of R2 implementations which ultimately do not rely on native browser engine support for the intended URI fragments. Even time Media Fragments will be deconstructed in order to extract timestamps and feed those values to underlying playback APIs.
Such "fragments" in their linearized / flattened form are certainly useful at the point at which the information gets encoded into actual URI fragments (which usually involves syntactic escaping that can be quite error-prone, see for example encodeURI vs. encodeURIComponent), for instance in src or href properties with JSON Schema type=string and format=uri-reference. However, I fail to see the advantages of prematurely serializing high-level constructs such as CSS Selectors and (partial)CFI into a format compatible with URI fragments, when consumers in the R2 architecture are likely to make use of them directly (e.g. CSS Selector document.querySelector() API).
Finally, in light of the DOMRange discussion (see #95) which suggests a "first class citizen" JSON property in order to express the DOMRange construct in a structured manner (i.e. non-flattened/linearized for the purpose of URI encoding), I am wondering where the line in the sand lies between "locator/fragment" constructs that belong in the generalized model (i.e. array of generic strings that need to be parsed in order to extract meaning, or otherwise collated together into a URI fragment), versus other constructs that warrant a directly-identifiable / discoverable data type (e.g. locations.cssSelector="body div#id.class p#paragraph" + locations.partialCfi="/4/6/2[paragraph]" + locations.fragments=["#paragraph"] instead of locations.fragments=["#paragraph", "partialcfi(/4/6/2[paragraph])", "cssselector(body div#id.class p#paragraph)"]).
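The encodeURI vs. encodeURIComponent pitfall mentioned above can be shown with a CSS Selector taken from this thread. This is only an illustration of the two standard functions, not a proposed encoding scheme.

```javascript
const cssSelector = "body.rootClass div:nth-child(2) p#para-id";

// encodeURI leaves reserved characters such as ":" and "#" untouched.
const viaEncodeURI = encodeURI(cssSelector);
// → "body.rootClass%20div:nth-child(2)%20p#para-id"

// encodeURIComponent percent-encodes ":" and "#" as well.
const viaEncodeURIComponent = encodeURIComponent(cssSelector);
// → "body.rootClass%20div%3Anth-child(2)%20p%23para-id"

console.log(viaEncodeURI);
console.log(viaEncodeURIComponent);

// Only the second form round-trips safely once placed after a "#" delimiter.
console.log(decodeURIComponent(viaEncodeURIComponent) === cssSelector);
```

With encodeURI, the selector's own "#" survives unescaped, so it is indistinguishable from a fragment delimiter once embedded in a URI. Neither function escapes parentheses, so a `cssselector(...)` wrapper would still need its own escaping rule.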