HTML Serialization Use Cases #147
Use case: Dynamically extending a Web page with annotation(s)

An interactive annotation system incorporates the annotation into the annotated HTML page. This is done by extending the DOM of the HTML page at load (or interaction) time (e.g., after retrieving the annotations from a server), using the (DOM version of the) HTML serialization of each annotation. By doing so, it leaves the display of the annotation to the browser's (or the reading system's) display engine. The user/reader of the HTML page can add CSS statements to style the annotations themselves. Because the annotations use a standard, the CSS can refer to the standard set of elements and attributes; the effect will be the same regardless of which annotation system is used. In effect, by using a standard HTML serialization, the content and the style become strictly separated.

Characterizations
|
## 3 Annotation Use Cases – 3 Serialization Options

The JSON-LD 1.0 Specification in section 6.20 [1] says, "HTML script tags can be used to embed blocks of data in documents. This way, JSON-LD content can be easily embedded in HTML by placing it in a script element with the type attribute set to application/ld+json." (This section is non-normative.)

Elsewhere, in appendices [2] [3] that are also non-normative, the specification illustrates how JSON-LD would transform to RDFa or Microdata. Importantly for us, the JSON-LD specification says in the Microdata appendix that "the JSON-LD representation of the Microdata information stays true to the desires of the Microdata community to avoid contexts and instead refer to items by their full IRI." (The same is effectively true for RDFa serializations.) Because our @context shortens some vocabulary items and hides the namespaces from which we borrow properties and classes, developers choosing RDFa or Microdata to serialize annotations in HTML would need to use full property names and namespaces, e.g., oa:hasBody, oa:hasTarget, dcterms:creator, etc.

For this reason, if we write a WG note regarding the serialization of annotations in HTML documents, we may want to recommend or highlight the approach of embedding JSON serializations of annotations in HTML as JSON-LD-in-script elements, rather than the use of RDFa or Microdata. (Of course there could be a 4th, 5th, etc. option for serializing annotations in HTML that we might prefer to JSON-LD-in-script, e.g., extending HTML directly with new elements and/or attributes. Please suggest.)

I illustrate the approach of using JSON-LD-in-script elements in HTML to embed annotations [4] for the 3 HTML annotation use cases described below. I also illustrate the Microdata [5] and RDFa [6] options, but only for the first use case, since I think it likely we will prefer JSON-LD-in-script or something else to RDFa or Microdata. Finally, though this gets into issues of interface (not our bailiwick), to help illustrate the JSON-LD-in-script approach I include in [4] some JavaScript that dynamically modifies the HTML based on the annotations that have been added to the HTML in script elements (e.g., adding anchors, tooltips, footnotes as appropriate). This code was written with Janina Sarol and is not meant to be generic; it is provided simply as a proof of concept.

Use cases illustrated:

A. Viewing an HTML blog entry about the birthday of Francis Scott Key [7], Tim wants to annotate the mention of the HMS Tonnant with a link to its Wikipedia page. The target is the mention of the Tonnant in the HTML page and the body is the Wikipedia page (Resource). The annotation is embedded in the Web page in a script element with id='Anno1'. An HTML fragment (i.e., #Anno1) is then appended to the page URL to create the Annotation URI. (Is this acceptable practice? What are the identity requirements for annotations serialized in HTML?)

B. Viewing this same HTML blog entry, Tim wants to annotate the portrait of Key embedded in the HTML page with a Textual Body (noting that the portrait was painted many years after the death of its subject). The target is a SpecificResource having the image as its source and the HTML page URI as its scope. The script element id is 'Anno2' and the Annotation URI is created in the same way.

C. Viewing this same HTML blog entry, Tim wants to annotate the mention of the first publication of what became the US National Anthem with its full citation, essentially adding a footnote. The TextualBody is the text of the citation, and the target is text in the HTML page. A second body, the link to the digitized article, is included. The script element containing the third annotation has attribute id='Anno3'.

[1] https://www.w3.org/TR/json-ld/#embedding-json-ld-in-html-documents |
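A minimal sketch of use case A in Python may help make the JSON-LD-in-script approach concrete. It is not the proof-of-concept code referenced in [4]. The page URL, the Wikipedia URL, the use of a TextQuoteSelector for the target, and the function names are all assumptions made for illustration; the `id`, `type`, `body` and `target` keys assume the `http://www.w3.org/ns/anno.jsonld` context. The sketch builds the annotation, wraps it in a script element with id `Anno1`, derives the Annotation URI by appending `#Anno1` to the page URL, and then reads the annotation back out of the HTML.

```python
import json
from html.parser import HTMLParser

# Hypothetical URLs, used only for illustration.
PAGE_URL = "http://example.org/blog/key-birthday.html"
WIKIPEDIA_URL = "https://en.wikipedia.org/wiki/HMS_Tonnant"


def build_annotation(anno_id, body, target):
    """Return an annotation whose URI is the page URL plus '#<script id>'."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": f"{PAGE_URL}#{anno_id}",
        "type": "Annotation",
        "body": body,
        "target": target,
    }


def to_script_element(anno_id, annotation):
    """Serialize the annotation as a JSON-LD-in-script element."""
    # Escape '</' so the payload cannot close the script element early;
    # '<\/' is still valid JSON and decodes back to '</'.
    payload = json.dumps(annotation, indent=2).replace("</", "<\\/")
    return (
        f'<script id="{anno_id}" type="application/ld+json">\n'
        f"{payload}\n</script>"
    )


class AnnotationExtractor(HTMLParser):
    """Collect every application/ld+json script block, keyed by its id."""

    def __init__(self):
        super().__init__()
        self.annotations = {}
        self._current_id = None
        self._buffer = []
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._current_id = attributes.get("id")
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.annotations[self._current_id] = json.loads("".join(self._buffer))
            self._in_jsonld = False


# Use case A: the target is the mention of the Tonnant in the page
# (selected here with an assumed TextQuoteSelector); the body is the
# Wikipedia page.
anno1 = build_annotation(
    "Anno1",
    body=WIKIPEDIA_URL,
    target={
        "source": PAGE_URL,
        "selector": {"type": "TextQuoteSelector", "exact": "HMS Tonnant"},
    },
)
page = (
    "<html><body><p>... aboard the HMS Tonnant ...</p>"
    + to_script_element("Anno1", anno1)
    + "</body></html>"
)

extractor = AnnotationExtractor()
extractor.feed(page)
extractor.close()
```

A consuming script could then walk `extractor.annotations` and decide how to render each entry (anchor, tooltip, footnote), which is the interface question left open above.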
dokieli is a decentralised authoring, annotation, and social notifications tool. By default it stores all articles and Web Annotations natively in HTML+RDFa in personal data stores. https://www.youtube.com/watch?v=tH_wMWSEzlE is a 1-minute screencast demonstrating an annotation interaction:
There can be variations to the process and mechanism above, but it works the same for replies, footnotes, references, bookmarking, and other social interactions. For more details on dokieli: http://csarven.ca/dokieli . |
|
@iherman Correct, the HTML+RDFa is produced by the system. The user is only involved in highlighting the text they want to annotate, writing their content in natural language, selecting and assigning the license (from the dropdown), and submitting. The dokieli JavaScript assembles the information (in HTML+RDFa) and sends it to the user's access-controlled personal storage. The "complexity" of the source of the information is never made visible to the user who is 1) publishing the annotation, or 2) viewing the article with the annotation. |
Looking at the comment put in by @tcole3:

First of all, what I like in all these approaches is that they work out of the box today, without any need for an extension of HTML. That is a major plus.

However, I believe that, for practical purposes, we could cross Microdata off the list. Microdata, as far as I know, is used only by schema.org (which is of course important!); I do not know of any other environments, tools, etc., that would process Microdata.

One of the main complications (maybe the major complication) of the RDFa encoding is that, being a true RDF serialization, it relies on a number of namespaces (duly set in a

If we expect the RDFa encoding ever to be done by human users and not only by machines behind the scenes, we may have to address this. There is an approach to do that, but I have to ask my RDF friends to hold their nose:-): we can define a single namespace vocabulary that consists of nothing else than a series of
etc. It is an ugly hack from an RDF point of view, although perfectly "legal". But it works, and may become then a fairly acceptable way of encoding an annotation in RDFa. Take a deep breath before you answer:-) |
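To make the single-namespace idea concrete, here is a hedged Python sketch that generates owl:sameAs statements (the construct named later in this thread) in Turtle. The choice of `http://www.w3.org/ns/oa#` as the single namespace and the three borrowed terms are assumptions for illustration only, not the mapping list from the original comment.

```python
# Assumed single namespace in which every term would be re-declared.
SINGLE_NS = "http://www.w3.org/ns/oa#"

# Illustrative borrowed terms: local name -> original IRI (assumed subset).
BORROWED_TERMS = {
    "creator": "http://purl.org/dc/terms/creator",
    "created": "http://purl.org/dc/terms/created",
    "name": "http://xmlns.com/foaf/0.1/name",
}


def same_as_turtle(single_ns, borrowed_terms):
    """Emit one owl:sameAs statement per borrowed term, sorted by local name."""
    lines = ["@prefix owl: <http://www.w3.org/2002/07/owl#> .", ""]
    for local_name, original_iri in sorted(borrowed_terms.items()):
        lines.append(f"<{single_ns}{local_name}> owl:sameAs <{original_iri}> .")
    return "\n".join(lines)


turtle_document = same_as_turtle(SINGLE_NS, BORROWED_TERMS)
```

With such a vocabulary published, an RDFa author would only need to declare one namespace instead of one per borrowed vocabulary.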
Speaking as an old RDF person... it's fine. Not even that ugly. I have been considering it for ages. Isn't it what schema.org uses? |
|
Huh - interesting. I thought the datamodel mapped back to the original ontologies when that was appropriate. Guess not. |
Let's be concrete. Our JSON-LD context document currently maps 10 classes, 23 properties and 1 attribute to a total of 8 namespaces (based on a quick count). I may have missed a few enumerations and/or values we draw from these and other namespaces (there are 12 namespaces in addition to our own referenced in the current draft of our JSON-LD context document), but that may not matter since arguably you might want to keep these. It's the borrowed properties that are the main issue.

as: "http://www.w3.org/ns/activitystreams#"
dc: "http://purl.org/dc/elements/1.1/"
dcterms: "http://purl.org/dc/terms/"
dctypes: "http://purl.org/dc/dcmitype/"
foaf: "http://xmlns.com/foaf/0.1/"
rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
rdfs: "http://www.w3.org/2000/01/rdf-schema#"
schema: "http://schema.org/"

That's a lot of owl:sameAs assertions, but assuming no name collisions (I don't think we have any), I personally have no strong opinions one way or another. Namespaces are convenient in XML and some other serializations, not really so much in JSON. There are advantages in not being seen to re-invent the wheel, and in not having to maintain vocabulary terms in parallel, but as long as we acknowledge inspiration, I could live with this kind of change given a strong enough rationale. By the way, http://schema.org/docs/schema_org_rdfa.html does acknowledge source, it just doesn't use owl:sameAs. Personally I'd rather be more explicit and link term-by-term to the original namespace using owl:sameAs, as was suggested.

However, because we also shorten some names in our JSON-LD context document, I'm not sure that addressing the namespaced classes and properties issue alone is sufficient to fully facilitate the mapping between RDFa and JSON-LD in HTML, if that's really what we want to do. In our own namespace we have about a dozen of these shortened aliases:

body, hasBody

We already argued a bit about these. Not sure we want to re-open this discussion at this late date. I personally think it desirable to maintain backward compatibility. But in keeping with the idea of eliminating keys in foreign namespaces, if a compelling enough case could be made, we could maintain 'superseded' terms as schema.org does (e.g., schema:review supersedes schema:reviews, schema:provider supersedes schema:carrier), or in some other way maintain the longer term while preferring the shorter term in RDFa as well as in JSON-LD. For those going back and forth between JSON-LD and other RDF serializations, it would make life a tiny bit easier.

All in all, this seems like a lot of work preceded by extensive discussion (and potentially heated argument). I do not want to get derailed. But if there is consensus that we want the RDFa to look more like our JSON-LD (which is what schema.org clearly wanted) in order to facilitate serialization in HTML, these changes would go a long way in that direction. Ultimately it may depend on the strength of disagreements within the Group and the balance we settle on for the HTML Serialization note between JSON-LD, RDFa, and extensions to HTML (the latter would presumably require its own distinct mapping).

What do others think? Go ahead. Be honest (but keep it clean and no personal attacks). |
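As a rough illustration of what an RDFa or Microdata author has to do by hand, the Python sketch below expands prefixed names to full IRIs and compacts them again using the eight prefixes listed in this comment. The `oa` entry (`http://www.w3.org/ns/oa#`) and the function names `expand` and `compact` are additions assumed for the example.

```python
# Prefixes as listed in this comment; "oa" is an assumed addition for the
# Working Group's own namespace.
PREFIXES = {
    "as": "http://www.w3.org/ns/activitystreams#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dctypes": "http://purl.org/dc/dcmitype/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "oa": "http://www.w3.org/ns/oa#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Turn 'dcterms:creator' into its full IRI; raise on unknown prefixes."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"unknown or missing prefix in {prefixed_name!r}")
    return prefixes[prefix] + local_name


def compact(iri, prefixes=PREFIXES):
    """Turn a full IRI back into 'prefix:local', preferring the longest match."""
    matches = [(ns, p) for p, ns in prefixes.items() if iri.startswith(ns)]
    if not matches:
        return iri  # no known namespace: leave the IRI untouched
    namespace, prefix = max(matches, key=lambda pair: len(pair[0]))
    return f"{prefix}:{iri[len(namespace):]}"
```

The shortened aliases in the JSON-LD context add one more lookup on top of this: `body` has to be resolved to `oa:hasBody` before `expand` can apply, which is the second half of the mapping problem described above.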
I do not want to get into this argument either. As I see it, we have two alternative strategies here, and we should strictly limit ourselves to these two.
I do not think we should reopen any terminology issue on these, it is bikeshedding at this point. There are pros and cons for both. I am personally tempted to go for (2), because if users may want to mix the possibilities to include JSON-LD in the script and also use RDFa (which is a perfectly viable option) then (2) allows a mess. On the other hand, hard core RDF people would want to rely on the formal vocabulary terms (but, then again, hard core RDF people would have no problem using namespaces, ie, this aliasing exercise may be of no interest for them in the first place.) |
👍 to the "use the terms of JSON-LD" option. Despite the nose holding from our RDF friends. 😉 |
Discussed at F2F, 18.05.16: agreed to move on with a note documenting the existing possibilities (i.e., JSON-LD in script, and RDFa with one giant namespace document). |
The use cases were needed to help write the HTML Serialization Note. Completion and ratification of this Note by the WG (final ed. draft: http://w3c.github.io/web-annotation/serialization-html-note/) closes this issue. |
The issue of how to serialize Web annotations in HTML has come up a number of times on the WG list, in WG calls, and in regard to at least one open issue (#87). However, as became clear during the WG call on 27 Jan 2016, what we collectively mean by HTML Serialization needs more definition. In particular we need well-defined use cases and better definition of scope before we can provide guidance to implementers and discuss HTML serialization in WG Recommendations or our other documents. We also need to determine which facets of the HTML Serialization issue need to be dealt with before model and vocabulary go to CR, and which facets should be deferred to next Charter.
Please add HTML serialization use cases to this issue. This will allow us to cluster and raise new issues dealing with specific aspects of HTML serialization as required and to better identify the technologies and approaches that can be used to best address the full range of HTML Serialization use cases.
To help get you started, for this discussion all of the following should be considered potentially in scope (pending review of use cases submitted), and likely you will think of additional categories of HTML Serialization use cases:
• HTML as another serialization format for Web Annotations – for example, expressing an annotation using RDFa, RDFa Lite, microdata.
• Expressing a Web Annotation by mapping our vocabulary directly to HTML
• Web Annotations embedded in HTML documents which also contain the annotation target(s) and/or body(ies).
• Use cases that require dynamically updating the HTML DOM as footnotes, comments and other forms of annotation of the HTML are added.
Technologies implementing some of these use cases exist and may be referenced in the use cases submitted, but at this point in the process our focus should be on defining and describing the use cases you think the WG should address. The best approach(es) can be hashed out in more specific issue threads once we have a better sense of scope and have done some sorting and prioritizing.