-
Notifications
You must be signed in to change notification settings - Fork 1
/
hugo.lang
262 lines (253 loc) · 13.4 KB
/
hugo.lang
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
-- ********************************
-- * Hugo Interactive Fiction * v1.1.0 || by Tristano Ajmone:
-- * www.generalcoffee.com/hugo * 2019/11/14 || https://github.com/tajmone
-- ******************************** public domain || http://unlicense.org
--------------------------------------------------------------------------------
Description = "Hugo" Categories = {"source", "interactive fiction"}
--------------------------------------------------------------------------------
-- file extensions:
-- .hug -- adventure source
-- .h -- library source
-- .g -- grammar source
--------------------------------------------------------------------------------
-- Syntax definition for Hugo language v3.1.03 (2006).
-- http://www.generalcoffee.com/hugo/gethugo.html
-- The Hugo Interactive Fiction Development System (1995-2006) is a language and
-- a set of cross-platform tools for creating text-adventures with sound and
-- graphics, developed by Kent Tessman for The General Coffee Company Film
-- Productions, released under BSD-2-Clause License.
--------------------------------------------------------------------------------
-- Syntax elements:
-- * Comments -- single line (!) and block (!/ .. /!).
-- * Strings -- double quotes (") escapable with (\")
-- * Escape -- various + ASCII escapes.
-- * Interpolation -- special non-ASCII characters.
-- * PreProcessor
-- * Operators
-- * Digits -- decimal integers only.
-- * Keywords 1 -- Hugo reserved keywords.
-- * Keywords 2 -- Predefined Hugo stored values:
-- * Built-in Global Variables.
-- * Built-in Properties.
-- * Built-in Engine Variables.
-- * Keywords 3 -- Various elements that didn't fit elsewhere:
-- * Chars constants.
-- * System Words.
-- * Properties Qualifiers.
-- * Boolean Constants (true/false).
-- * Keywords 4 -- Limit Settings.
--------------------------------------------------------------------------------
IgnoreCase = true EnableIndentation = false
Identifiers = [[ [\$\~]?[a-zA-Z_]\w*\$? ]]
Comments = {{
Block = true,
Nested = false,
Delimiter = { [[ ^(?:\s*)!\\ ]],
[[ \\!$ ]]}},{
Block = false,
Delimiter = { [[ (?<!\\)!(?!\\) ]]}}}
Strings = {
Delimiter = [=[ (?<!\\)" ]=],
AssertEqualLength = true,
--[[----------------------------------------------------------------------------
ESCAPE SEQUENCES
--------------------------------------------------------------------------------
Hugo allows various escape sequences inside strings, some of them have been
defined as Interpolation for visual improvement.
Basic escapes:
\" quotation marks
\\ a literal backslash character
\_ a forced space, overriding left-justification for the rest of the string
\n a newline
Formatting sequences for styles:
\B boldface on
\b boldface off
\I italics on
\i italics off
\P proportional printing on
\p proportional printing off
\U underlining on
\u underlining off
ASCII Escapes:
\#xxx any ASCII or Latin-1 character where xxx represents the three-digit
ASCII number (or Latin-1 code). --]]
Escape = [=[ (\\(?:["\\_nBbIiPpUu]|#\d{3})) ]=],
--[[----------------------------------------------------------------------------
INTERPOLATION
--------------------------------------------------------------------------------
We define the special char sequences as Interpolation to allow visual separation
between them and the other escape sequences, which will make the code easier to
read since in real-code the prose strings might contain many of both in a same
string, side by side.
Special characters formatting sequences (ISO-8859-1):
\` accent grave followed by a letter (e.g. "\`a" -> "à")
\’ accent acute followed by a letter (e.g. "\’E" -> "É")
\~ tilde followed by a letter (e.g. "\~n" -> "ñ")
\^ circumflex followed by a letter (e.g. "\^i" -> "î")
\: umlaut followed by a letter (e.g. "\:u" -> "ü")
\, cedilla followed by c or C (e.g. "\,c" -> "ç")
\< or \> Spanish quotation marks (« »)
\! upside-down exclamation point (¡)
\? upside-down question mark (¿)
\ae ae ligature (æ)
\AE AE ligature (Æ)
\c cents symbol (¢)
\L British pound (£)
\Y Japanese Yen (¥)
NOTE: The RegEx below defines twice the acute accent (´) char because depending
on whether the source is in ASCII/ISO-8859-1 or UTF-8 its encoding will
differ (the former is the expected encoding for Hugo sourceS, but the
latter might be encountered in documentation projects). --]]
Interpolation = [=[ (?x)(\\(?:
\xC2\xB4[a-zA-Z] | # Acute accent (´) in UTF-8 docs will be $c2 $b4.
[`´~\^:][a-zA-Z] | # Note: acute accent in ASCII format also found here.
,[cC] | # Cedilla.
[<>!?] | # Square brackets and upside-down ¡ ¿ marks.
ae|AE | # Æ ligatures.
[cLY] # Currencies: ¢ £ ¥.
)) ]=] }
PreProcessor = {
Prefix = [=[ \A(?!x)x ]=], -- never matching RegEx!
Continuation = "\\" }
Operators = [[ \&|\#|<|>|\||\=|\/|\*|\+|\-|~ ]]
Digits = [[ \d+ ]]
Keywords = {{
------------------------------------------------------------------------------
Id = 1, List = { -- Hugo keywords # 1
------------------------------------------------------------------------------
"addcontext", "alias", "and", "anything", "array", "attribute", "break",
"call", "capital", "case", "child", "children", "class", "cls", "color",
"colour", "compound", "constant", "dict", "do", "elder", "eldest", "else",
"elseif", "enumerate", "event", "for", "global", "graphics", "held", "hex",
"if", "in", "input", "is", "jump", "local", "locate", "move", "multi",
"multiheld", "multinotheld", "music", "nearby", "newline", "not", "notheld",
"number", "or", "parent", "pause", "picture", "playback", "print",
"printchar", "property", "punctuation", "quit", "random", "readfile",
"readval", "recordoff", "recordon", "removal", "remove", "repeat",
"replace", "resource", "restart", "restore", "return", "routine", "run",
"runevents", "save", "scriptoff", "scripton", "select", "sibling", "sound",
"start", "step", "string", "synonym", "system", "text", "to", "undo",
"verb", "video", "while", "window", "word", "writefile", "writeval",
"xverb", "younger", "youngest",
}},{
------------------------------------------------------------------------------
Id = 2, List = { -- Predefined Engine Stored Values # 2
------------------------------------------------------------------------------
--| Built-in Global Variables |---------------------------------------------
----------------------------------------------------------------------------
"actor", "endflag", "location", "object", "objects", "player", "prompt",
"self", "system_status", "verbroutine", "words", "xobject",
----------------------------------------------------------------------------
--| Built-in Properties |---------------------------------------------------
----------------------------------------------------------------------------
-- NOTE: "adjectives" and "nouns" are aliases defined by Hugo library, and
-- not tokens defined in the Hugo engine and compiler.
----------------------------------------------------------------------------
"adjective", "after", "article", "before", "name", "noun",
----------------------------------------------------------------------------
--| Built-in Engine Variables |---------------------------------------------
----------------------------------------------------------------------------
"parse$", "serial$",
}},{
------------------------------------------------------------------------------
Id = 3, -- Chars Constants + System Words + Properties Qualifiers # 3
------------------------------------------------------------------------------
--| ASCII Chars constants |-------------------------------------------------
----------------------------------------------------------------------------
Regex = [=[ '[\x00-\x7F]' ]=] , },{
----------------------------------------------------------------------------
--| System Words |----------------------------------------------------------
----------------------------------------------------------------------------
Id = 3, List = {
"~all", "~and", "~any", "~except", "~oops",
----------------------------------------------------------------------------
--| Properties Qualifiers |-------------------------------------------------
----------------------------------------------------------------------------
"$additive", "$complex",
----------------------------------------------------------------------------
--| Boolean Constants |-----------------------------------------------------
----------------------------------------------------------------------------
"true", "false",
}},{
------------------------------------------------------------------------------
Id = 4, -- Limit Settings # 4
------------------------------------------------------------------------------
Regex = [=[ (?x-i)
(\$MAX (?: ALIASES | ARRAYS | ATTRIBUTES | CONSTANTS | DICTEXTEND | DICT |
DIRECTORIES | EVENTS | FLAGS | GLOBALS | LABELS | LOCALS | OBJECTS |
PROPERTIES | ROUTINES | SPECIALWORDS )) ]=],
Group = 0
},{
------------------------------------------------------------------------------
Id = 5, -- PreProcessor
------------------------------------------------------------------------------
-- These tokens are captured as keywords but then thrown back as PreProcessor
-- via OnStateChange(). This is needed because setting '#' as PreProcessor
-- delimiter would prevent capturing the '#' token for counting properties.
Regex = [=[ (?x-i)
(\#(?:
if(clear | set | defined | undefined) | if | elseif | else | endif |
clear | set | include | link | message | switches | version
)\b) ]=],
Group = 0
}}
function OnStateChange(oldState, newState, token, kwgroup)
--============================================================================
-- #01 -- Ignore Escape Sequences Outside Strings
--============================================================================
if newState == HL_ESC_SEQ and -- An escape seq. must follow either:
oldState ~= HL_STRING and -- * a string
oldState ~= HL_ESC_SEQ and -- * an escape sequence
oldState ~= HL_INTERPOLATION then -- * an interpolation
return HL_REJECT -- otherwise, reject it.
--============================================================================
-- #02 -- Ignore Interpolations Inside Preprocessor Strings
--============================================================================
elseif
newState == HL_INTERPOLATION and
oldState == HL_PREPROC_STRING then
return HL_REJECT
--============================================================================
-- #03 -- Throw Back Keywords from Group 5 as PreProcessor
--============================================================================
elseif
newState == HL_KEYWORD and
kwgroup == 5 then
return HL_PREPROC
end
return newState
end
--[[============================================================================
KNOWN ISSUES
================================================================================
FilePath Strings:
Escape sequences and interpolations shouldn't show up inside file path strings
following keywords like `resource`, `picture` and other similar keywords which
expect a file string after them. The syntax should track strings immediately
following these keywords and discard escapes/interpolations accordingly. The
`resource` keyword is going to be trickier because it allows multiple strings
inside a `{..}` block (and strings might be followed by comments).
================================================================================
CHANGELOG
================================================================================
v1.1.0 (2019/11/14) | Highlight v3.54
- Polish source.
- List keywords one kwd per line to simplify WIP and tracking changes.
- NEW:
- Added missing keyword tokens.
- Removed keyword "adjectives" and "nouns" (library defined aliases).
- Kwd Group 2 now hosts predefined Engine Values:
- Built-in Global Variables. (moved here from Group 1)
- Built-in Properties. (moved here from Group 1)
- Kwd Groups 3 and 4 are shifted and become 4 and 5.
- Kwd Group 3: (was 2) added also, along with Char Constants:
- System Words.
- Properties Qualifiers.
- Boolean Constants (true/false: moved from Group 1).
- FIXES:
- Identifiers patterns tweaked to include tokens starting with tilde.
- Ignore interpolations inside preprocessor strings.
- PreProcessor: prevent matching the '#' for proprieties count as the
beginning of a preprocessor directive.
v1.0.0 (2019/05/24) | Highlight v3.51
- First release. --]]