[doc/ref] Skeleton for eggex topics
i.e. the re-* topics

Could be egg-literal, egg-primitive, etc.
Andy C committed Dec 18, 2023
1 parent a2bfe38 commit 379a8ce
Showing 4 changed files with 63 additions and 18 deletions.
38 changes: 33 additions & 5 deletions doc/ref/chap-expr-lang.md
@@ -317,23 +317,51 @@ You can specify flags passed to libc regcomp():

var pat = / d+ ; reg_icase reg_newline /

You can specify a translation preference:
You can specify a translation preference after a second semi-colon:

var pat = / d+ ; reg_icase reg_newline ; ERE /
var pat = / d+ ; ; ERE /

Right now the translation preference does nothing. It could be used to
translate eggex to PCRE or Python syntax.

### re-compound

### re-primitive

### named-class
%zero 'sq'
Subpattern @subpattern

### class-literal

[c a-z 'abc' @str_var \\ \xFF \u0100]

Negated:

![a-z]

### named-class

dot
digit space word
d s w

Negated:

!digit !space !word

### re-compound

pat|alt pat seq (group)

### re-capture

<capture d+ as name: int>

### re-flags

Valid ERE flags, which are passed to libc's `regcomp()`:

- `reg_icase` aka `i` (ignore case)
- `reg_newline` (4 changes regarding newlines)

### re-multiline

Not implemented.
17 changes: 8 additions & 9 deletions doc/ref/toc-ysh.md
@@ -78,17 +78,16 @@ Siblings: [OSH Topics](toc-osh.html), [Data Topics](toc-data.html)
ysh-attr mydict.key
ysh-slice a[1:-1] s[1:-1]
func-call f(x, y)
thin-arrow s->pop()
fat-arrow s => startswith('prefix')
thin-arrow mylist->pop()
fat-arrow mystr => startswith('prefix')
match-ops ~ !~ ~~ !~~
[Eggex] re-literal / d+ ; i ; ERE /
re-compound pat|alt pat seq (group)
<capture> <capture :name>
re-primitive %zero Subpattern @subpattern 'sq'
char-class ! char-class
named-class dot digit space word d s w
[Eggex] re-literal / d+ ; re-flags ; ERE /
re-primitive %zero 'sq' Subpattern @subpattern
class-literal [c a-z 'abc' @str_var \\ \xFF \u0100]
X re-flags ignorecase etc.
named-class dot digit space word d s w
re-compound pat|alt pat seq (group)
re-capture <capture d+ as name: int>
re-flags reg_icase reg_newline
X re-multiline ///
```

22 changes: 19 additions & 3 deletions doc/ysh-regex-api.md
@@ -8,8 +8,12 @@ YSH Regex API - Convenient and Powerful
YSH has [Egg Expressions](eggex.html), a composable and readable syntax for
regular expressions. You can use *Eggex* with both:

- Convenient Perl-like operators: `'mystr' ~ / [a-z]+/ `
- A powerful Python-like API: `'mystr' => search(/ [a-z]+ /`
- A convenient Perl-like operator: `'mystr' ~ / [a-z]+/ `
- access submatches with global `_group()` &nbsp; `_start()` &nbsp; `_end()`

- A powerful Python-like API: `'mystr' => search(/ [a-z]+ /)` and `leftMatch()`
- access submatches with `Match` object methods `m => group()` &nbsp; `m =>
start()` &nbsp; `m => end()`

You can also use plain POSIX regular expressions ([ERE]($xref)) instead of
Eggex.
@@ -195,7 +199,8 @@ Eggexes can be composed by *splicing*. Splicing works on expressions, not
strings.

Replacement will use shell's string literal syntax, rather than a new
`printf`-like mini-language.

printf`-like mini-language.

## Appendix: Python-like wrappers around the API

@@ -226,3 +231,14 @@ similar to the lexer example above:
### Split by Pattern

Python's `re.split()` can also be emulated by using `search()` in a loop.
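
As an illustration of the loop this describes, here is a minimal sketch in Python itself rather than YSH. It is not part of this commit or of the YSH API. The helper name `split_by_pattern` is hypothetical, and the sketch assumes the pattern has no capture groups and never matches the empty string.

```python
import re


def split_by_pattern(pattern, text):
    """Emulate re.split() by calling search() in a loop.

    Hypothetical helper for illustration. Assumes the pattern has no
    capture groups and never matches the empty string.
    """
    regex = re.compile(pattern)
    pieces = []
    pos = 0  # start of the piece currently being collected
    while True:
        m = regex.search(text, pos)
        if m is None:
            break
        if m.start() == m.end():
            # An empty match would not advance pos, so refuse it.
            raise ValueError("pattern must not match the empty string")
        pieces.append(text[pos:m.start()])  # text before the separator
        pos = m.end()  # resume searching after the separator
    pieces.append(text[pos:])  # the tail after the last separator
    return pieces
```

Under those assumptions the result should agree with `re.split(pattern, text)`. A YSH version would presumably follow the same shape, using `search()` with a start position and the `Match` object's `start()` and `end()` methods.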

## Eggex Help Topics

- [re-literal](ref/chap-expr-lang.html#re-literal)
- [re-primitive](ref/chap-expr-lang.html#re-primitive)
- [class-literal](ref/chap-expr-lang.html#class-literal)
- [named-class](ref/chap-expr-lang.html#named-class)
- [re-compound](ref/chap-expr-lang.html#re-compound)
- [re-capture](ref/chap-expr-lang.html#re-capture)
- [re-flags](ref/chap-expr-lang.html#re-flags)

4 changes: 3 additions & 1 deletion ysh/expr_eval.py
@@ -1350,9 +1350,11 @@ def _EvalRegex(self, node, parent_flags):
def EvalEggex(self, node):
# type: (Eggex) -> value.Eggex

# Splice and check flags consistency
spliced = self._EvalRegex(node.regex, node.canonical_flags)

# as_ere and name_types filled in during translation
# as_ere and capture_names filled in during translation
# TODO: func_names should be done above
return value.Eggex(spliced, node.canonical_flags, None, [], [])

