Encoding RDF relationships in TEI (TEI+RDFa and alternatives) #1860

Open
chiarcos opened this issue Mar 3, 2019 · 18 comments

Comments

@chiarcos

chiarcos commented Mar 3, 2019

The motivation is to achieve a representation of RDF relations in the TEI that is unambiguous in vocabulary and semantics. Note that this does not pertain to cases where native TEI vocabulary elements could be interpreted as triples, but to cases that are not covered by TEI semantics, e.g., the linking between a passage in an edition and a terminology repository or a CTS URN. A similar restriction can be found in the definition of <link>.

At the moment, there are at least three different possibilities to express RDF triples inline in TEI:
<relation> (#311)
<fs>
<link>

Each of these is problematic because it conflates pre-RDF and RDF semantics and because it is analogy-driven ("tag abuse") rather than explicitly defined. The currently preferred solution with <relation> is restricted to named entities; example 4 in the guidelines thus breaks the TEI schema (see my comment on #311).

Several alternatives are possible (see email thread in http://tei-l.970651.n3.nabble.com/Best-practice-for-W3C-Web-Annotations-generated-based-on-TEI-names-and-dates-module-tags-td4031445.html). One possibility, RDFa, has great appeal due to being an established W3C standard that comes with off-the-shelf tooling (e.g., https://www.w3.org/2012/pyRdfa/ and http://www.sparql.org/sparql.html which can directly run against TEI documents or derived XML formats that maintain [rather than generate] RDFa information).

In the past, RDFa was ruled out, partly because of fears that it would evolve and that this would have a negative impact on the TEI (http://tei-l.970651.n3.nabble.com/TEI-and-RDFa-was-Re-SAWS-and-LOD-was-Re-Cross-references-among-segs-in-TEI-td4025195.html). Since its W3C standardization (2015, https://www.w3.org/TR/rdfa-core/), this risk no longer exists.

In 2018, successful applications of TEI+RDFa were reported by two independent projects (http://lrec-conf.org/workshops/lrec2018/W23/pdf/10_W23.pdf, http://e-spacio.uned.es/fez/eserv/bibliuned:363-Pruiz3/Ruiz_Fabo_Pablo_DISCO.pdf), which motivates project-independent specifications, ideally as part of the TEI. I suggest following the modeling of https://github.com/postdataproject/disco/#rdfa-attributes.

Note1: This is a follow-up to #311, but a different approach.

Note2: One possible alternative is to redefine <link>, <fs> or (not and) <relation> to provide unambiguous RDF semantics and to couple this with GRDDL/XSLT scripts to generate RDFa attributes (cf. http://www.ancientwisdoms.ac.uk/media/ontology/tei_to_rdf.xsl).

Note3: A third possibility is to sandbox RDFa attributes by restricting them to <ab> and <seg> (i.e., the same contexts as for <relation> in the SAWS proposal: http://www.ancientwisdoms.ac.uk/media/documents/Markup_Guidelines_for_Gnomologia.html#TEI.relation).
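
To make the sandboxed variant in Note3 concrete, here is a minimal sketch of what RDFa-style attributes on <seg> might look like and how a script could read them back as triples. Everything in it is an assumption for illustration: the sample fragment, the choice of dcterms:references as the property, the example CTS URN, the base URI, and the helper name extract_triples. It handles only the about/property/resource attributes and is not a conformant RDFa processor.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical TEI fragment: RDFa attributes appear only on <seg>.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <ab>
      <seg xml:id="s1" about="#s1"
           property="http://purl.org/dc/terms/references"
           resource="urn:cts:greekLit:tlg0012.tlg001:1.1">menin aeide thea</seg>
      <seg xml:id="s2">no annotation here</seg>
    </ab>
  </body></text>
</TEI>
"""


def extract_triples(tei_xml, base="http://example.org/doc.xml"):
    """Collect (subject, property, object) from <seg> elements.

    Simplified reading: the subject is @about (or the element's xml:id),
    the predicate is @property, the object is @resource.
    """
    root = ET.fromstring(tei_xml)
    triples = []
    for seg in root.iter(f"{{{TEI_NS}}}seg"):
        prop = seg.get("property")
        obj = seg.get("resource")
        if prop is None or obj is None:
            continue  # this <seg> carries no RDFa-style statement
        subj = seg.get("about") or "#" + seg.get(XML_ID, "")
        if subj.startswith("#"):
            subj = base + subj  # resolve fragment against the assumed base URI
        triples.append((subj, prop, obj))
    return triples


triples = extract_triples(SAMPLE)
for t in triples:
    print(t)
```

In practice an off-the-shelf RDFa processor would replace the hand-written loop; the sketch only shows that the information needed for a triple is fully present on the element itself.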

@lb42
Member

lb42 commented Mar 8, 2019

Just for completeness, I ask again: what about <graph>? (especially, since I understand graph-theoretic ontologies are replacing RDF in some ecosystems)

@chiarcos
Author

chiarcos commented Mar 8, 2019

RDF and graphs are closely related, indeed. On a theoretical level, RDF formalizes labelled directed multi-graphs. A technical difference is that RDF is based on URIs and W3C standards whereas graph databases are usually not.
But <graph> in TEI is not meant to provide graphs as a data structure, but only visualizations of such data structures. At least this is what the examples under https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-graph.html and https://www.tei-c.org/release/doc/tei-p5-doc/en/html/GD.html look like. It's more like GraphViz/Dot than like RDF, and of course both could be used to draw RDF graphs as illustrations.

@lb42
Member

lb42 commented Mar 8, 2019

Why do you think tei:graph is not intended to provide a way of encoding a graph data structure? The beginning of chapter 19 would seem to indicate that it is: "Among the types of qualitative relations often represented by graphs are organizational hierarchies, flow charts, genealogies, semantic networks, transition networks, grammatical relations, tournament schedules, seating plans, and directions to people's houses. In developing recommendations for the encoding of graphs of various types, we have relied on their formal mathematical definitions and on the most common conventions for representing them visually. However, it must be emphasized that these recommendations do not provide for the full range of possible graphical representations, and deal only partially with questions of design, layout, and placement."
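
As a rough illustration of the "data structure" reading, the following sketch loads a small <graph> with <node> and <arc> children into an adjacency list. The sample graph, the node labels, and the helper name read_graph are invented for this example; only the element names and the from/to attributes on <arc> come from the TEI definition.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: a tiny directed graph encoded with tei:graph.
SAMPLE = """
<graph xmlns="http://www.tei-c.org/ns/1.0" type="directed">
  <node xml:id="n1"><label>parent</label></node>
  <node xml:id="n2"><label>child A</label></node>
  <node xml:id="n3"><label>child B</label></node>
  <arc from="#n1" to="#n2"/>
  <arc from="#n1" to="#n3"/>
</graph>
"""


def read_graph(xml_text):
    """Return (labels, adjacency) for a tei:graph made of <node> and <arc>."""
    root = ET.fromstring(xml_text)
    labels = {}
    for node in root.iter(TEI + "node"):
        labels[node.get(XML_ID)] = node.findtext(TEI + "label", default="")
    adjacency = defaultdict(list)
    for arc in root.iter(TEI + "arc"):
        # @from and @to are pointers of the form "#id"; drop the "#".
        source = arc.get("from").lstrip("#")
        target = arc.get("to").lstrip("#")
        adjacency[source].append(target)
    return labels, dict(adjacency)


labels, adjacency = read_graph(SAMPLE)
print(labels)
print(adjacency)
```

Note that nothing here involves layout or rendering, and nothing ties nodes or arcs to URIs, which is the gap relative to RDF raised earlier in the thread.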

@PietroLiuzzo
Contributor

In Beta Masaheft we also transform, like SAWS, the TEI into RDF triples of different flavours. However, I now think that these semantic mappings could perhaps be defined in a project ODD rather than in the transformation, with something like models and @behaviour, and that an XSLT or any other script transforming TEI to RDF should be able to rely on that information in the ODD.
Because in most cases people will make different decisions about which classes and properties to use in their RDF, even where their TEI is the same, it would be even nicer if the TEI modules already carried suggested associations for each element, which could be customised in the ways all other parts of TEI are customisable: including them or not, adding to them, or changing them. I could then define the precise semantics in my ODD, opt for seg, relation, link, etc., and have it clearly defined in my custom ODD and in relation to the standard set in the original modules.
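
A minimal sketch of the idea that the mapping lives in a declaration rather than in the transform: the same generic function produces different triples depending on which mapping table a project supplies. The dictionary below merely stands in for what an ODD might declare; its shape, the chosen properties, the sample paragraph, and the function name transform are all assumptions, not an existing ODD mechanism.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical project-level mapping: TEI element name -> RDF property,
# plus the attribute whose value becomes the object of the triple.
MAPPING = {
    "persName": {"property": "http://purl.org/dc/terms/references",
                 "object_attr": "ref"},
    "placeName": {"property": "http://purl.org/dc/terms/spatial",
                  "object_attr": "ref"},
}

SAMPLE = """
<p xmlns="http://www.tei-c.org/ns/1.0">
  <persName xml:id="p1" ref="http://example.org/person/1">A person</persName>
  was active in
  <placeName xml:id="pl1" ref="http://example.org/place/7">a place</placeName>.
</p>
"""


def transform(tei_xml, mapping, base):
    """Generic transform: all project-specific semantics come from `mapping`."""
    root = ET.fromstring(tei_xml)
    triples = []
    for el in root.iter():
        local_name = el.tag.split("}")[-1]  # strip the namespace part
        rule = mapping.get(local_name)
        if rule is None:
            continue
        obj = el.get(rule["object_attr"])
        ident = el.get(XML_ID)
        if obj and ident:
            triples.append((f"{base}#{ident}", rule["property"], obj))
    return triples


triples = transform(SAMPLE, MAPPING, "http://example.org/doc.xml")
for t in triples:
    print(t)
```

A second project could pass a different table for identical TEI input and obtain different RDF, without touching the transform itself.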

@chiarcos
Author

chiarcos commented Mar 8, 2019

Wrt <graph>: In fact, my interpretation of "it must be emphasized that these recommendations do not provide for the full range of possible graphical representations, and deal only partially with questions of design, layout, and placement" would indeed be that <graph> deals with graphical representations of graphs, with the "partially" clause referring to the fact that the rendering itself is beyond TEI (as it is beyond the dot language).
We should probably elicit feedback on actual uses of <graph>, but it should definitely not be used for both purposes, because of their different functions: A conceptual graph is normally not to be rendered, whereas graph visualizations have to be.

@chiarcos
Author

chiarcos commented Sep 11, 2019

> We should probably elicit feedback on actual uses of <graph>, but it should definitely not be used for both purposes, because of their different functions: A conceptual graph is normally not to be rendered whereas graph visualizations have to.

Public responses are under http://tei-l.970651.n3.nabble.com/Current-and-historical-uses-of-lt-graph-gt-td4031618.html. Neither there nor in the private responses has any actual, current use of <graph> been confirmed, only its historical use for drawing network graphs and its potential use for representing graph data structures. If the use of <graph> as a data structure (rather than a graphical representation) were indeed endorsed by the TEI, I would strongly suggest rephrasing its definition accordingly and providing alternative vocabulary for the representation use (e.g., by recommending/enabling the embedding of SVG [or GraphML], following the spirit of the suggestion in https://wiki.tei-c.org/index.php/TEI_to_SVG#Using_SVG_with_TEI).

For pragmatic reasons, I would prefer an RDFa-compliant solution (even if possibly sandboxed by restricting it to container elements such as <seg> and <ab>) because it comes with off-the-shelf tooling whereas anything based on <graph> would have to be rebuilt by every data provider individually (and as a new XML-based solution, it is highly unlikely to find any support outside the DH community). More important than this (personal) preference is, however, to have clear instructions for expressing RDF triples (or at least, RDF properties and objects) in TEI and to have that in the guidelines, and with respect to this, I'd be happy with any clear guidance.

@martindholmes
Contributor

@chiarcos For a very straightforward solution, have you considered just putting RDFa inside a <xenoData> element and pointing to/from the TEI? That would leave your RDFa clean, straightforward and easily processable, while tightly linking it to the TEI content.

@chiarcos
Author

chiarcos commented Sep 16, 2019 via email

@martindholmes
Contributor

@chiarcos Thanks for the clarification.

@chiarcos
Author

chiarcos commented Jan 18, 2020

As an afterthought: Where it is not possible/necessary to provide RDF statements in inline XML, the standard solution (i.e., the only solution that is both TEI-compliant and W3C- [or otherwise] standardized) would be to use a standoff annotation with Web Annotation (JSON-LD) over a TEI/XML document. This works nicely as long as the underlying TEI/XML doesn't change anymore (such that URIs, resp. XPaths or offsets -- whatever selector is used for Web Annotation -- still point to the right element), but it is not feasible for content under production.

Permitting RDFa in TEI is actually conceptually compatible with the recommendation to use Web Annotation for standoff annotation, as an RDFa serialization of Web Annotation has been developed, too: https://www.w3.org/community/openannotation/wiki/RDFa, resp. https://www.w3.org/TR/annotation-html/#annotations-embedded-as-rdfa
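
For comparison, the standoff variant could look roughly like the sketch below: a Web Annotation serialized as JSON-LD whose target selects an element of the TEI/XML document by XPath. The annotation IRI, document URL, XPath expression, body IRI, and the helper name make_annotation are placeholders invented for this example; the keys and the XPathSelector type follow the Web Annotation Data Model.

```python
import json


def make_annotation(anno_id, source, xpath, body_iri):
    """Build a minimal Web Annotation pointing into an XML document."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "identifying",
        "body": body_iri,
        "target": {
            "source": source,
            # The selector is what breaks if the TEI file is edited later.
            "selector": {"type": "XPathSelector", "value": xpath},
        },
    }


# All values below are hypothetical placeholders.
annotation = make_annotation(
    anno_id="http://example.org/anno/1",
    source="http://example.org/edition.xml",
    xpath="/TEI/text/body/ab[1]/seg[3]",
    body_iri="urn:cts:greekLit:tlg0012.tlg001:1.1",
)

print(json.dumps(annotation, indent=2))
```

The positional XPath shows why this is fragile for content under production: inserting one <seg> earlier in the <ab> silently redirects the annotation to a different element.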

@peterstadler
Member

We discussed that issue briefly during our virtual f2f this weekend. If I understand correctly, the current issue is about expressing "RDF triples inline in TEI" where the straightforward solution would be to add RDFa attributes to (nearly?) all TEI elements. While this might not be a proper solution to be incorporated into the TEI standard, would it still be helpful to have that as an example customization at https://tei-c.org/guidelines/customization/ (in analogy to TEI + SVG or TEI + Math)?

@chiarcos
Author

chiarcos commented May 5, 2020 via email

@martinascholger
Member

@peterstadler and I discussed the issue in a meeting on July 1. Based on the discussion, Peter started a first draft of an example customization.

@chiarcos
Author

chiarcos commented Jul 7, 2020 via email

@peterstadler
Member

Just for the record: The current draft of the customisation ODD is added in the branch issue-1860 at 151136c.
It simply adds all RDFa attributes to a new class att.global.analytic.rdfa and hooks this class into att.global.analytic.

@RobertoRDT

Dear all, any new developments on this? Has anyone tested the new customisation? Would you suggest that the RDFa attributes are a good solution? I would like to do some experimental work with ontologies and RDF-like triples, and I hope that the "clear guidance" mentioned by Christian arrives at some point.

Thank you for your work,

R

@chiarcos
Author

chiarcos commented Jul 5, 2021 via email

peterstadler removed their assignment Sep 12, 2022
@JanelleJenstad
Contributor

JanelleJenstad commented May 8, 2023

Revisited at Guelph 2023 F2F. Peter Stadler has rotated off Council. @HelenaSabel (whose work is mentioned in this ticket) will review the draft ODD and get things moving again.

10 participants