a major mode for GNU Emacs for annotations in a stand-off manner
Emacs Lisp XSLT Makefile
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
arb
docs
test
.gitattributes
.gitignore
LICENSE
Makefile
README.md
dir
fdl.texi
mistake-owl4config.xsl
owl2config.xsl
owl2manual.xsl
standoff-api.el
standoff-de.texi
standoff-dummy.el
standoff-en.texi
standoff-json-file.el
standoff-json.el
standoff-log.el
standoff-mark.el
standoff-mode-pkg.el
standoff-mode.el
standoff-relations.el
standoff-util.el
standoff-xml.el
standoff.owl
version.texi

README.md

License GPLv3

MELPA

standoff-mode

standoff-mode turns GNU Emacs into a tagger and lets you create (semantic) annotations on texts in a stand-off manner. It is written for use in the field of digital humanities and the manual annotation of training data for named-entity recognition.

There are several tools for creating stand-off markup. Most of them need to be deployed on a server in a network environment, which may be a barrier. In contrast standoff-mode does not need a networking environment. It wants to enable one to get hands on annotating texts right away. It produces a local file with external markup but connecting a relational database is on the roadmap of development.

Your annotation data don't get locked in with standoff-mode. There is a commandline program that internalizes your external markup into the source document again: standoff-tools.

standoff-mode doesn't want to integrate everything under one cover. It's just a tagger, a tool for the manual annotation of texts. Statistical analysis etc. must be done by another tool.

Since it was written for the field of digital humanities, literature studies in particular, standoff-mode works not only with plain text input (source) files, but also with XML. So semantic stand-off markup produced with it may reference structural markup coded in TEI/P5, which may be of advantage for further processing.

Stand-off Markup

Stand-off markup is also known as external markup and means:

  • Stand-off markup refers to a source document by some kind of pointers. standoff-mode uses character offsets.

  • It is contained in an external document (or a database).

  • The source document is left unchanged and may be read-only.

  • The source document may contain markup too, called internal markup. Stand-Off Mode facilitates reading of XML source documents by hiding tags and showing glyphs for character references.

Cf. the TEI/P5 guidelines on stand-off markup and the OpenAnnotation ontology.

Features of standoff-mode

  • allows discontinuous markup

  • allows relations between markup elements (RDF-like directed graphs)

  • allows attributes on markup elements

  • allows text comments anchored on one or several markup elements

  • generate config for your annotation schema from OWL by XSLT

  • allows to customize the restrictiveness of the annotations, either to the annotation schema plugged in via config (apriori), or the schema already used (a posteriori), or free

  • offers completion of user input of markup types, relation predicates and attribute names

  • hide the fdq-names (IRIs) of markup types, predicates and attributes behind labels (from OWL or RDFS), customizable

  • customization of highlighting faces

  • everything can be done with the keyboard an key-codes, no need to use the mouse

  • several pluggable back-ends: json file back-end for storing annotations in a JSON file on local disk; dummy back-end for storing annotations in s-expressions (dumped emacs-lisp data); json rest back-end is under development and will store annotations in a database exposed through a restful webservice.

  • manual based on GNU Texinfo, English (still lacks some sections) and German

  • makes XML-files readable for humans by optionally hiding XML-tags and by substituting character references with glyphs. (Note: Switching between hide/show is a computationally expensive task for Emacs and may consume a minute of your time if you're working on a file with some hundred kBytes.)

Roadmap

standoff-mode is under active development. Here's the roadmap:

  • back-end for storing annotations in a database

  • text comments

  • improve UI

Requirements

Only GNU Emacs is required. After the installation of the editor the standoff-mode package has to be installed. It was tested on Windows, Linux and Mac, with versions >= 24.3.

Installation

standoff-mode is on the Milkypostman's Emacs Lisp Package Archive (MELPA), what makes it very easy to install.

If MELPA is not already in your list of package repositories add the following line to your init file:

(add-to-list 'package-archives
             '("melpa" . "https://melpa.org/packages/"))

On windows you may run into problems using the https location. Use http instead.

Then make the packages from melpa available using package-refresh-contents and call package-install:

M-x package-refresh-contents RET
M-x package-install RET standoff-mode RET

Alternatively you can use the menu for installing packages:

M-x package-list-packages RET

Then search standoff-mode, use i to mark it for installation and x to install the marked package(s).

To activate standoff-mode on a source document and start annotating, open that that document and type

M-x standoff-mode RET

Before working with stand-off mode, the back-end has to be choosen. Using it for the first time, you will be redirected to a customization dialog, when trying to create an annotation. For starters, the json file back-end is recommended. Write "standoff-json-file" (without quotes) into the form of the customization dialog, then apply and save the changes. You can also configure stand-off mode to use this back-end by putting the following line in your Emacs init file:

(setq standoff-backend 'standoff-json-file)

Documentation

There is an english manual and a german manual. And there are german installation instructions for both, GNU Emacs and standoff-mode, addressing absolute beginners.

License

GPL 3