PX - parsed text including clause connections
The following definitions are an extension to the syntax
given in ps4.p(5).
eov                = <eleven spaces> `*' star_line <newline>.
star_line          = relation_sequence instructions number
                     clause_block clause_constituent sentence
                     text_type paragraph type.
relation_sequence  = `0' opt_relation_list `0' `0'.
instructions       = string integer.
number             = `LineNr' integer.
clause_block       = `ClauseNr' integer `:' integer `:'
                     integer `:' integer.
clause_constituent = `:' integer integer.
sentence           = `SentenceNr' integer.
text_type          = `TxtType' `:' string.
paragraph          = `Pargr' `:' dotnum.
type               = `ClType' `:' string.
opt_relation_list  = <empty> | relation_list.
relation_list      = relation | relation_list relation.
dotnum             = integer | dotnum `.' integer.
relation           = integer integer.
In the above, a string is a sequence of one or more
isgraph(3c) characters; strings that occur after `LineNr'
are restricted to alphanumeric characters and `?'.
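The grammar above can be exercised with a small reader. The following Python sketch is not part of the PX tooling. The names STAR_LINE and parse_star_line, the lenient treatment of white space around `:', and the sample line (constructed by hand, not copied from a real PX file) are all assumptions made for illustration.

```python
import re

# Lenient pattern for one star line. The tolerance for white space
# around ':' and before '*' is an assumption, not part of the grammar.
STAR_LINE = re.compile(
    r"^\s*\*\s*(?P<head>.*?)"
    r"\s+LineNr\s+(?P<line>\d+)"
    r"\s+ClauseNr\s+(?P<clause>\d+)\s*:\s*(?P<first>\d+)"
    r"\s*:\s*(?P<last>\d+)\s*:\s*(?P<cltype>\d+)"
    r"\s*:\s*(?P<rel>-?\d+)\s+(?P<dist>-?\d+)"
    r"\s+SentenceNr\s+(?P<sentence>\d+)"
    r"\s+TxtType\s*:\s*(?P<txt>[A-Za-z0-9?]+)"
    r"\s+Pargr\s*:\s*(?P<pargr>\d+(?:\.\d+)*)"
    r"\s+ClType\s*:\s*(?P<atomtype>[A-Za-z0-9?]+)\s*$"
)


def parse_star_line(text):
    """Split one star line into a dictionary of its fields."""
    m = STAR_LINE.match(text)
    if m is None:
        raise ValueError("not a star line")
    # Everything before `LineNr': relation_sequence, then instructions.
    head = m.group("head").split()
    if len(head) < 5:
        raise ValueError("star line head too short")
    *rel_seq, instruction, tab = head
    rel_ints = [int(tok) for tok in rel_seq]
    # relation_sequence = `0' opt_relation_list `0' `0'
    if rel_ints[0] != 0 or rel_ints[-2:] != [0, 0] or len(rel_ints) % 2 == 0:
        raise ValueError("malformed relation sequence")
    body = rel_ints[1:-2]
    # Each relation is a pair (distance, relation code).
    relations = list(zip(body[0::2], body[1::2]))
    return {
        "relations": relations,
        "instruction": instruction,
        "tab": int(tab),
        "line": int(m.group("line")),
        "clause": int(m.group("clause")),
        "first_phrase": int(m.group("first")),
        "last_phrase": int(m.group("last")),
        "clause_type": int(m.group("cltype")),
        "relation_code": int(m.group("rel")),
        "distance_code": int(m.group("dist")),
        "sentence": int(m.group("sentence")),
        "text_type": m.group("txt"),
        "paragraph": [int(n) for n in m.group("pargr").split(".")],
        "atom_type": m.group("atomtype"),
    }


# Hand-made sample for illustration only; the field values are invented.
SAMPLE = (
    "           *  0  0  0 N.  0 LineNr    1 ClauseNr   1:  1:  3:122"
    " :  0  0 SentenceNr   1 TxtType: N Pargr: 1 ClType: XQtl"
)

if __name__ == "__main__":
    for key, value in parse_star_line(SAMPLE).items():
        print(key, value)
```

The head of the line is split on white space and read from the right, because the instruction string may contain any isgraph(3c) character and is therefore easier to locate by position than by pattern.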
In comparison with ps4.p(5) files, PX(5) files have extra
information after the asterisk that concludes the clause
atom. This so-called `star line' holds two types of
information. In the first place, it holds information about
clause atom relations, which survey the textual hierarchy in
terms of distances and descriptive labels. In the second place, the
Werkgroep Informatica Last change: 14/02/27 1
star line provides information about the functional units
the clause atom is part of. The information of all clause
atoms combined gives a complete picture of the functional
hierarchy of the text.
A text is conceived logically as a sequence of segments,
which are composed of paragraphs, which are composed of sen-
tences, and so on. Text segments do not exhibit embedding,
which means that they coincide with their segment atom. In
terms of clause atoms, a text segment consists of a single
clause atom hierarchy with its root marked `N' at tab 0.
Within a clause atom hierarchy, an instruction `N' (new
domain, no syntactic relation) at tab > 0 implies a virtual
clause atom relation that connects the new domain to the
context to which it relates semantically.
The functional units (paragraph, sentence, clause, and
phrase) are numbered within the encompassing unit; atomic
units are implicitly numbered consecutively throughout the
whole text. The only atomic units that are numbered
explicitly in PX(5) files are clause atoms. All units are
numbered from 1 upwards.
PX-files are produced by the interactive programs
syn04types(1) and syn05Syr(1).
The clause atom instructions are directions to
syn04types(1) and consist of a string of usually two
characters representing the status of the clause atom
in the text, and a number indicating the indentation
from the main text.
The four numbers in the clause block are the clause
number, the least and greatest phrase number, and the
clause type. For the least and greatest phrase number,
only phrases that start in this clause atom are taken
into account, unless there are none, in which case all
phrases having word atoms in this clause atom are
considered. The values for clause type are listed in the
table in the paragraph on clause atom type. In clauses
that consist of more than one clause atom, the value for
clause type is repeated with every subsequent clause
atom of the clause.
Clause constituent relation is a clause feature. With
clauses that consist of more than one clause atom, the
value for clause constituent relation is recorded with
the first clause atom of the clause only. The two
numbers are the relation code and the distance code.
The relation codes are summarised in the table below.
-13  Attr  Attributive clause
 -6  Coor  Coordinated clause
 -5  Spec  Specification clause
 -2  RgRc  Regens/rectum connection
502  Subj  Subject clause
503  Objc  Object clause
504  Cmpl  Complement clause
505  Adju  Adjunctive clause
521  PreC  Predicate complement clause
525  PrAd  Predicative adjunct clause
562  ReVo  Referral to vocative
572  Resu  Resumptive clause
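For programs that read the relation code, the table above can be kept as a lookup. The following Python sketch copies the codes, abbreviations, and descriptions from the table; the names RELATION_CODES and describe_relation are hypothetical, as is the choice to report an unlisted code rather than fail.

```python
# Relation codes copied from the table above:
# code -> (abbreviation, description).
RELATION_CODES = {
    -13: ("Attr", "Attributive clause"),
    -6: ("Coor", "Coordinated clause"),
    -5: ("Spec", "Specification clause"),
    -2: ("RgRc", "Regens/rectum connection"),
    502: ("Subj", "Subject clause"),
    503: ("Objc", "Object clause"),
    504: ("Cmpl", "Complement clause"),
    505: ("Adju", "Adjunctive clause"),
    521: ("PreC", "Predicate complement clause"),
    525: ("PrAd", "Predicative adjunct clause"),
    562: ("ReVo", "Referral to vocative"),
    572: ("Resu", "Resumptive clause"),
}


def describe_relation(code):
    """Return a one-line description of a clause constituent relation code."""
    # Codes missing from the table are reported, not rejected (an assumption).
    abbr, text = RELATION_CODES.get(code, ("?", "unlisted code"))
    return f"{code} {abbr}: {text}"


if __name__ == "__main__":
    print(describe_relation(503))
```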
The distance code reflects both the distance and the
unit in which the distance is measured. The unit is
either word, phrase atom, clause atom, or sentence
atom. The following piece of Pascal code illustrates
how distance and unit can be extracted from the code.
(The function Sign returns 1 for positive and -1 for
negative numbers.)
     if c div 1000 <> 0 then begin
        d := c - Sign(c) * 1000;   { distance in words }
        u := 'W'
     end else if c div 100 <> 0 then begin
        d := c - Sign(c) * 100;    { distance in phrase atoms }
        u := 'P'
     end else if c div 10 <> 0 then begin
        d := c - Sign(c) * 10;     { distance in clause atoms }
        u := 'C'
     end else if c div 1 <> 0 then begin
        d := c - Sign(c) * 1;      { distance in sentence atoms }
        u := 'S'
     end else begin
        d := 0;                    { code 0: no distance, no unit }
        u := '.'
     end
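The Pascal fragment above can be transcribed into Python as a minimal sketch. The function name decode_distance is a hypothetical choice; the thresholds and unit letters are taken from the fragment. Pascal's div truncates toward zero, so the test `c div 1000 <> 0' is written here as a comparison on the absolute value.

```python
def decode_distance(code):
    """Split a distance code into (distance, unit letter).

    Unit letters follow the Pascal fragment: 'W' word, 'P' phrase atom,
    'C' clause atom, 'S' sentence atom, '.' for a zero code.
    """
    sign = (code > 0) - (code < 0)  # 1, -1, or 0, like the Sign function
    for base, unit in ((1000, "W"), (100, "P"), (10, "C"), (1, "S")):
        # abs(code) >= base is equivalent to "code div base <> 0"
        # when div truncates toward zero.
        if abs(code) >= base:
            return code - sign * base, unit
    return 0, "."


if __name__ == "__main__":
    for c in (1003, 105, -12, 1, 0):
        print(c, decode_distance(c))
```

The sign of the code is carried over to the distance, as in the original fragment; what a negative distance means is not stated here and is left to the caller.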
Text type is a feature of a clause. In clauses that
consist of more than one clause atom, the value for
text type is repeated with every subsequent clause atom
of the clause. The values are constructed cumulatively:
each level of embedding appends one character. The
clause L> T>KLW M-KL <Y H-GN in Genesis 3:1, for instance,
is a quotation within a quotation in a narrative passage,
and is therefore marked NQQ. Text type is a concatenation
of any of the following characters.
? Unknown
D Discursive
N Narrative
Q Quotation
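A text type value such as NQQ can be expanded character by character. The following Python sketch uses the four characters listed above; the names TEXT_TYPE_CHARS and expand_text_type are hypothetical, and rejecting any other character is an assumption.

```python
# Characters copied from the list above.
TEXT_TYPE_CHARS = {
    "?": "Unknown",
    "D": "Discursive",
    "N": "Narrative",
    "Q": "Quotation",
}


def expand_text_type(value):
    """Return the labels for each character of a text type, left to right."""
    labels = []
    for ch in value:
        if ch not in TEXT_TYPE_CHARS:
            # Rejecting unlisted characters is an assumption of this sketch.
            raise ValueError(f"unknown text type character: {ch!r}")
        labels.append(TEXT_TYPE_CHARS[ch])
    return labels


if __name__ == "__main__":
    print(expand_text_type("NQQ"))
```

For the Genesis 3:1 example this yields Narrative, Quotation, Quotation, which matches the description of a quotation within a quotation in a narrative passage.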
Paragraphs are numbered within a text segment. Unlike
the fields with the phrase, clause, and sentence number,
the paragraph field holds a string of digits that shows
the paragraph number followed by the numbers of the
nested subparagraphs. Traditionally these digits were
not separated, which gave rise to ambiguity. The current
format allows the digits to be separated by a full stop.
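A paragraph field in the dotted form can be split into its levels as follows. This Python sketch assumes the separated format; the function name parse_paragraph is hypothetical. A field without full stops is returned as a single number, since the older unseparated form cannot be split reliably.

```python
def parse_paragraph(field):
    """Split a dotted paragraph field into a list of integers.

    The first number is the paragraph, the rest are nested subparagraphs.
    """
    parts = field.split(".")
    # Every part must be a non-empty run of ASCII digits (dotnum).
    if not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a dotnum: {field!r}")
    return [int(p) for p in parts]


if __name__ == "__main__":
    print(parse_paragraph("1.2.3"))
```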
type
Unlike clause type, clause atom type is marked by a
string. The table below lists the numerical and string
values for clause type and clause atom type. A capital
`X' refers to a stretch of constituents containing an
explicit subject, whereas a lower case `x' refers to a
stretch of constituents without a subject. There is no
clause type `Defc'.
  0  Defc  Defective clause atom
 99  Unkn  Unknown
101  ZYq0  Zero-yiqtol-null clause
102  ZQt0  Zero-qatal-null clause
103  ZIm0  Zero-imperative-null clause
104  InfC  Infinitive construct clause
105  InfA  Infinitive absolute clause
106  Ptcp  Participle clause
111  ZYqX  Zero-yiqtol-X clause
112  ZQtX  Zero-qatal-X clause
113  ZImX  Zero-imperative-X clause
121  XYqt  X-yiqtol clause
122  XQtl  X-qatal clause
123  XImp  X-imperative clause
131  xYq0  x-yiqtol-null clause
132  xQt0  x-qatal-null clause
133  xIm0  x-imperative-null clause
141  xYqX  x-yiqtol-X clause
142  xQtX  x-qatal-X clause
143  xImX  x-imperative-X clause
151  WYq0  We-yiqtol-null clause
152  WQt0  We-qatal-null clause
153  WIm0  We-imperative-null clause
157  Way0  Wayyiqtol-null clause
161  WYqX  We-yiqtol-X clause
162  WQtX  We-qatal-X clause
163  WImX  We-imperative-X clause
167  WayX  Wayyiqtol-X clause
171  WXYq  We-X-yiqtol clause
172  WXQt  We-X-qatal clause
173  WXIm  We-X-imperative clause
181  WxY0  We-x-yiqtol-null clause
182  WxQ0  We-x-qatal-null clause
183  WxI0  We-x-imperative-null clause
191  WxYX  We-x-yiqtol-X clause
192  WxQX  We-x-qatal-X clause
193  WxIX  We-x-imperative-X clause
200  NmCl  Nominal clause
213  AjCl  Adjective clause
301  Voct  Vocative clause
302  CPen  Casus pendens
303  Ellp  Ellipsis
304  MSyn  Macrosyntactic sign
305  Reop  Reopening
306  XPos  Extraposition
The verbal clause types used to be numbered differently.
The scheme has been changed to achieve a higher degree
of consistency; the change is documented in ``A Note on
Clause Types''.
A clause atom relation is denoted by two numbers, which
indicate the distance and the relation code. The clause
atom relation codes are explained on a separate manual
page, CARC(5). The distance to the clause atom with
which the relation is maintained is expressed in units
of clause atoms.
CARC(5), isgraph(3c), ps4.p(5), syn04types(1), syn05Syr(1).
``A Note on Clause Types''