Summary
GocciaScript currently has no regular expression support. This is documented in docs/language-restrictions.md under "Deferred Built-ins". String methods like replace, replaceAll, and split work with string patterns only.
What's needed
Lexer (Goccia.Lexer.pas)
- Add gttRegex token type to Goccia.Token.pas
- Context-sensitive scanning to distinguish /pattern/flags from the division operator /
  - After keywords, (, [, {, ,, ;, operators → regex
  - After identifiers, ), ], numbers, strings → division
- Parse flags: i, g, m, s, u, y
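The previous-token rule above can be sketched as a small decision function. This is only an illustration in TypeScript, not the Pascal lexer itself: the `PrevToken` shape and the kind names are hypothetical, and the treatment of `}` (not covered by the rule above) is an assumption.

```typescript
// Hypothetical token summary; the real token types live in Goccia.Token.pas
// and will look different.
type PrevKind =
  | "start"
  | "keyword"
  | "punctuator"
  | "operator"
  | "identifier"
  | "number"
  | "string";

interface PrevToken {
  kind: PrevKind;
  text: string;
}

// Returns true when a "/" that follows `prev` should begin a regex literal,
// false when it should be lexed as the division operator.
function slashStartsRegex(prev: PrevToken): boolean {
  switch (prev.kind) {
    case "identifier":
    case "number":
    case "string":
      // a / b, 1 / 2, "s" / 2: the previous token ends an operand
      return false;
    case "punctuator":
      // ) and ] close an operand, so "/" is division.
      // ( [ { , ; expect an expression, so "/" starts a regex.
      // Assumption: "}" is also treated as regex context here; the rule
      // above does not say, and it is ambiguous in general.
      return prev.text !== ")" && prev.text !== "]";
    default:
      // start of input, keywords, operators
      return true;
  }
}
```

One known weak spot of a purely token-based rule: keywords that are themselves values (for example `this`) are followed by division, not a regex, so the keyword case may need a short exception list.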
Parser (Goccia.Parser.pas)
- Add TGocciaRegexLiteralExpression to Goccia.AST.Expressions.pas
- Emit regex literal node when gttRegex token is encountered
Runtime
- New TGocciaRegExpValue in Goccia.Values.RegExpValue.pas:
  - Properties: source, flags, lastIndex, global, ignoreCase, multiline, dotAll, sticky
  - Methods: test(string), exec(string), toString()
- New Goccia.Builtins.GlobalRegExp.pas for RegExp(pattern, flags?) constructor
- Bytecode support in Souffle VM for regex literal opcode
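As a reference point for the runtime work, the snippet below shows how the listed properties and methods behave in standard ECMAScript, run in an ordinary JS/TS engine. It assumes TGocciaRegExpValue is meant to reproduce these results; that intent is not stated above, so treat it as a candidate set of acceptance checks rather than a spec.

```typescript
// Standard ECMAScript behaviour, shown with the constructor form.
const re = new RegExp("(\\d+)-(\\d+)", "g");
const subject = "10-20 30-40";

// With the g flag, exec() resumes from lastIndex and advances it.
const first = re.exec(subject);      // match "10-20" at index 0, groups "10" and "20"
const afterFirst = re.lastIndex;     // 5

const second = re.exec(subject);     // match "30-40" at index 6
const afterSecond = re.lastIndex;    // 11

// When no further match exists, exec() returns null and lastIndex resets to 0.
const third = re.exec(subject);
const afterThird = re.lastIndex;     // 0

// source is the pattern text without delimiters; toString() adds them back
// together with the flags.
const asSource = re.source;          // (\d+)-(\d+)
const asString = re.toString();      // /(\d+)-(\d+)/g

// test() returns a boolean; the i flag makes matching case-insensitive.
const caseInsensitive = new RegExp("abc", "i").test("xABCx"); // true
```

The stateful lastIndex handling is the part most likely to need explicit code in the wrapper, since it is defined by ECMAScript rather than by the underlying regex library.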
String method updates (Goccia.Values.StringObjectValue.pas)
- replace(regex|string, replacement|fn): already supports callbacks; add regex dispatch
- replaceAll(regex|string, replacement|fn): same as replace
- split(regex|string): add regex dispatch
- New: match(regex), returns a match array
- New: matchAll(regex), returns an iterator of matches
- New: search(regex), returns the index of the first match
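For orientation, here is how some of the regex-aware string methods behave in standard ECMAScript (matchAll and replaceAll are left out). The sample strings are made up for illustration, and the assumption, not confirmed above, is that GocciaScript will follow the same semantics.

```typescript
const sampleText = "2024-01-15, 2025-02-20";
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// replace with a non-global regex rewrites only the first match;
// $1..$3 in the replacement refer to capture groups.
const swapped = sampleText.replace(datePattern, "$3/$2/$1");
// "15/01/2024, 2025-02-20"

// replace with a global regex and a callback rewrites every match.
const bumped = sampleText.replace(/\d{4}/g, (year) => String(Number(year) + 1));
// "2025-01-15, 2026-02-20"

// split accepts a regex separator.
const parts = "a, b;c".split(/[,;]\s*/);
// ["a", "b", "c"]

// match without g returns the first match plus its groups;
// match with g returns every full match and no groups.
const firstDate = sampleText.match(datePattern);
const allYears = sampleText.match(/\d{4}/g);
// firstDate[1] is "2024"; allYears is ["2024", "2025"]

// search returns the index of the first match, or -1 when there is none.
const position = sampleText.search(/2025/);   // 12
const missing = sampleText.search(/1999/);    // -1
```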
Implementation proposal
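Use FreePascal's RegExpr unit (bundled with FPC, no external dependency). It provides Perl-style regex (PCRE-like, though not a full PCRE implementation). Named groups, lookahead, and Unicode support depend on the TRegExpr version shipped with the target FPC release, so these should be verified before committing to it.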
- Phase 1: Core runtime. TGocciaRegExpValue, RegExp constructor, test(), exec(). No literal syntax yet; construct via new RegExp("pattern", "flags").
- Phase 2: Lexer integration. Context-sensitive /pattern/flags literal scanning. This is the trickiest part due to the division ambiguity.
- Phase 3: String integration. Update replace, replaceAll, split to accept regex. Add match, matchAll, search.
- Phase 4: Testing matcher. Add toMatch(regex) to the test API (ties into the Vitest-compatible testing API issue).
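Once Phase 2 has decided that a "/" starts a literal, the body still has to be scanned to its closing slash. The sketch below illustrates one way to do that in TypeScript. The function name, the `RegexLiteral` result shape, and the choice to reject unknown or duplicate flags at scan time are all assumptions made for illustration. The character-class and escape rules follow ECMAScript, where a "/" inside [...] or after a backslash does not end the literal.

```typescript
// Flags accepted by the lexer per the list in the Lexer section.
const KNOWN_FLAGS = "igmsuy";

// Hypothetical result shape for a scanned literal.
interface RegexLiteral {
  pattern: string; // text between the slashes, escapes left untouched
  flags: string;   // flag letters after the closing slash
  end: number;     // index just past the literal
}

// Scans a regex literal whose opening "/" is at source[start].
// Returns null for an unterminated literal or invalid flags.
function scanRegexLiteral(source: string, start: number): RegexLiteral | null {
  let i = start + 1;
  let inClass = false;
  while (i < source.length) {
    const ch = source.charAt(i);
    if (ch === "\n") return null; // literals cannot span lines
    if (ch === "\\") {
      // An escape consumes the next character, whatever it is.
      if (i + 1 >= source.length || source.charAt(i + 1) === "\n") return null;
      i += 2;
      continue;
    }
    if (ch === "[") inClass = true;
    else if (ch === "]") inClass = false;
    else if (ch === "/" && !inClass) break; // closing slash
    i++;
  }
  if (i >= source.length) return null; // ran off the end without a closing slash

  const pattern = source.slice(start + 1, i);
  if (pattern.length === 0) return null; // "//" is a comment, not an empty regex

  let j = i + 1;
  let flags = "";
  while (j < source.length && /[A-Za-z]/.test(source.charAt(j))) {
    const flag = source.charAt(j);
    // Reject unknown flags and repeated flags.
    if (KNOWN_FLAGS.indexOf(flag) < 0 || flags.indexOf(flag) >= 0) return null;
    flags += flag;
    j++;
  }
  return { pattern, flags, end: j };
}
```

For example, scanning `x = /[/]+/` from index 4 yields the pattern `[/]+` with no flags, because the slash inside the character class is not treated as the terminator.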