/
phonology.twolc
56 lines (45 loc) · 1.23 KB
/
phonology.twolc
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
Alphabet
!! This file documents the [phonology.twolc file](http://github.com/giellalt/lang-fin/blob/main/src/fst/phonology.twolc)
! Basic finnish alphabet
a b c d e f g h i j k l m n o p q r s š t u v w x y z ž å ä ö
A B C D E F G H I J K L M N O P Q R S Š T U V W X Y Z Ž Å Ä Ö
‐ – %-
’ ' %" ” »
! Others...
%#
! Boundaries to kill
%_:0
%#:%-
%#:%#
%
! Literal quotes and angles must be escaped (cf morpheme boundaries further down), will be converted to proper symbols later:
»7 ! »
«7 ! «
#7 ! #
%_7 ! _
%[%>%] ! >
%[%<%] ! <
! Morpheme boundaries to be killed later:
« ! Derivational prefix
» ! Derivational suffix
%< ! Inflectional prefx
%> ! Inflectional suffix
# ! Word boundary for both lexicalised and dynamic compounds
%^ ! (exceptional) soft hyphenation point
! these may leak from omorfi conversion...
%{DB%}:0
%{hyph%?%}:0 %{hyph%}:%-
%{MB%}:0
%{WB%}:0
%{STUB%}:0
%{wB%}:0
%{XB%}:0
;
Sets
Vowels = a e i o u y å ä ö á à â é è ê î ô û í ï ó ò ú ù A E I O U Y Ä Ö È ;
MeaningfulPunct = %# %_ %> » ;
Rules
"Force hyphen between vowels" !!= * @CODE@
!%#:%- <= VV _ VV ; where VV in Vowels ;
%#:%- <=> VV _ VV ; where VV in Vowels ;
! vim:set ft=xfst-twol: