more documentation fixes #43
proycon committed Mar 12, 2019
1 parent b10265f commit a0564d6
Showing 4 changed files with 8 additions and 11 deletions.
2 changes: 1 addition & 1 deletion docs/source/features.rst
@@ -25,7 +25,7 @@ with it.
available. For example; don't use a part-of-speech subset in a lemma set, because there is already :ref:`pos_annotation` for that.

A more thorough example for part-of-speech tags with features will be explained
-in Section~\ref{sec:posfeat}.
+in the section on :ref:`pos_annotation`.

Some annotation types take *predefined subsets* because some features are very commonly used. These subsets have clearly
defined semantics. However, it still depends on the set on whether these can be used, and which classes these take.
8 changes: 3 additions & 5 deletions docs/source/fql.rst
@@ -3,8 +3,6 @@
FoLiA Query Language (FQL)
=============================

-.. todo:: DOCUMENT PROVENANCE
-
Whereas XPath is a very generic query language, the FoLiA Query Language (FQL)
is a very specific language, designed purely for FoLiA. It allows advanced querying and
document editing.
@@ -171,8 +169,8 @@ Here is an *EDIT* query that changes all nouns in the document to verbs (assumin

* ``EDIT pos WHERE class = "n" WITH class "v" AND annotator = "johndoe"``

-The query is fairly crude as it still lacks a *target expression*: A \emph{target
-expression} determines what elements the focus is applied to, rather than to
+The query is fairly crude as it still lacks a *target expression*: A target
+expression determines what elements the focus is applied to, rather than to
the document as a whole, it starts with the keyword *FOR* and is followed by
either an annotation type (i.e. a FoLiA XML element tag) *or* the ID of an
element. The target expression also determines what elements will be returned.
@@ -242,7 +240,7 @@ are mandatory for a *HAS* statement::
Target expressions can be former with either *FOR* or with *IN*, the
difference is that *IN* is much stricter, the element has to be a direct
child of the element in the *IN* statement, whereas *FOR* may skip
-intermediate elements. In analogy with XPath, *FOR* corresponds to \texttt{//} and
+intermediate elements. In analogy with XPath, *FOR* corresponds to ``//`` and
*IN* corresponds to ``/``. *FOR* and *IN* may be nested and mixed at
will. The following query would most likely not yield any results because there are
likely to be paragraphs and/or sentences between the wod and event structures::
6 changes: 3 additions & 3 deletions docs/source/reference_annotation.rst
@@ -17,7 +17,7 @@ Specification
:Version History: Since v0.11, external references since v1.2
:**Element**: ``<ref>``
:API Class: ``Reference``
-:Required Attributes:
+:Required Attributes:
:Optional Attributes: * ``xml:id`` -- The ID of the element; this has to be a unique in the entire document or collection of documents (corpus). All identifiers in FoLiA are of the `XML NCName <https://www.w3.org/TR/1999/WD-xmlschema-2-19990924/#NCName>`_ datatype, which roughly means it is a unique string that has to start with a letter (not a number or symbol), may contain numers, but may never contain colons or spaces. FoLiA does not define any naming convention for IDs.
* ``set`` -- The set of the element, ideally a URI linking to a set definition (see :ref:`set_definitions`) or otherwise a uniquely identifying string. The ``set`` must be referred to also in the :ref:`annotation_declarations` for this annotation type.
* ``class`` -- The class of the annotation, i.e. the annotation tag in the vocabulary defined by ``set``.
@@ -27,7 +27,7 @@ Specification
* ``confidence`` -- A floating point value between zero and one; expresses the confidence the annotator places in his annotation.
* ``datetime`` -- The date and time when this annotation was recorded, the format is ``YYYY-MM-DDThh:mm:ss`` (note the literal T in the middle to separate date from time), as per the XSD Datetime data type.
* ``n`` -- A number in a sequence, corresponding to a number in the original document, for example chapter numbers, section numbers, list item numbers. This this not have to be an actual number but other sequence identifiers are also possible (think alphanumeric characters or roman numerals).
-* ``space`` -- This attribute indicates whether spacing should be inserted after this element (it's default value is always ``yes``, so it does not need to be specified in that case), but if tokens or other structural elements are glued together then the value should be set to ``no``. This allows for reconstruction of the detokenised original text.
+* ``space`` -- This attribute indicates whether spacing should be inserted after this element (it's default value is always ``yes``, so it does not need to be specified in that case), but if tokens or other structural elements are glued together then the value should be set to ``no``. This allows for reconstruction of the detokenised original text.
* ``src`` -- Points to a file or full URL of a sound or video file. This attribute is inheritable.
* ``begintime`` -- A timestamp in ``HH:MM:SS.MMM`` format, indicating the begin time of the speech. If a sound clip is specified (``src``); the timestamp refers to a location in the soundclip.
* ``endtime`` -- A timestamp in ``HH:MM:SS.MMM`` format, indicating the end time of the speech. If a sound clip is specified (``src``); the timestamp refers to a location in the soundclip.
@@ -54,7 +54,7 @@ explicitly present in the text. The ``<ref>`` element, however, carries an extra
</s>
<ref id="mynote" />
-Another example in tokenised data, and now we add the \emph{optional} \texttt{type}
+Another example in tokenised data, and now we add the *optional* ``type``
attribute, which holds the type of the FoLiA element that is referred to:

.. code-block:: xml
3 changes: 1 addition & 2 deletions docs/source/set_definitions.rst
@@ -43,7 +43,6 @@ format can be indicated on the declarations in the document metadata using the
* ``text/turtle`` -- `Turtle <https://www.w3.org/TeamSubmission/turtle/>`_ (for RDF) (assumed for ``ttl`` extensions)
* ``text/n3`` -- Notation 3 (for RDF) (assumed for ``n3`` extensions)
* ``application/foliaset+xml`` - Legacy FoLiA Set Definition format (XML) (assumed for ``xml`` extensions and in most other cases)
-\end{itemize}

FoLiA applications should attempt to autodetect the format based on the extension.
Not all applications may be able to deal with all formats/serialisations, however.
@@ -114,7 +113,7 @@ After this preamble, we can define a set as follows:
fsd:open false .
The first two lines state that ``http://your/namespace/#your-set-i`` is
-\emph{a} [#ftype]_ SKOS Collection, which is what we use for FoLiA Sets. The ``skos:notation``
+*a* [#ftype]_ SKOS Collection, which is what we use for FoLiA Sets. The ``skos:notation``
property corresponds to the ID of the Set, only one is allowed [#fnotation]_ .

A set can be either open or closed (default), an open set allows any classes,