`regex`

regex is a JavaScript library for parsing regular expressions, generating them, and building their ASTs, following the JavaScript flavour's definition.

NOTE: the library depends on the parsers.js package to build its parsers.

Installation

npm install @hgargg-0710/regex

Documentation

The package has the following exports:

parse (function)
generate (function)
parser (submodule)
generator (submodule)
tree (submodule)
tokens (submodule)

`parse`

function parse(regex: string): Flags

Takes a string containing a regular expression and returns its AST.

`generate`

function generate(AST: Flags): string

Takes the given AST node (not necessarily `Flags`; the full type is too long to express here) and returns the string it represents.

NOTE: partial nodes will give only partial results. For example, passing a PatternEnd will give "$".

`parser`

APIs of the individual parsing layers.

export	description
`ExpressionParser`	Function. Parses an `Expression`, initially tokenizing it
`boundry`	Submodule. Handles parsing of boundaries
`chars`	Submodule. Handles tokenization
`classes`	Submodule. Handles parsing of character classes
`deflag`	Submodule. Handles removal of flags
`disjunction`	Submodule. Handles parsing of disjunction expressions
`escaped`	Submodule. Handles parsing of escape-sequences
`group`	Submodule. Handles recursion within a regular expression
`nogreedy`	Submodule. Handles the "no-greedy" quantifiers
`quantifier`	Submodule. Handles the quantifiers

The submodule exports are part of the parse function's final definition.

Within the parse function, the layers are applied in the following order:

deflag
chars
classes
escaped
boundry
group (recursive, looped)
quantifier
nogreedy
disjunction
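
The layered design above can be pictured as plain function composition, where each layer consumes the previous layer's output. The sketch below is not the library's code: `pipeline`, `stripFlags` and `splitChars` are hypothetical names, and the `/pattern/flags` input form is an assumption made only for illustration.

```javascript
// Hypothetical helper: applies each layer to the output of the previous one.
const pipeline = (...layers) => (input) =>
  layers.reduce((value, layer) => layer(value), input);

// Toy stand-in for a "deflag"-like layer (assumes a "/pattern/flags" string).
const stripFlags = (source) => {
  const lastSlash = source.lastIndexOf("/");
  return {
    expression: source.slice(1, lastSlash),
    flags: source.slice(lastSlash + 1),
  };
};

// Toy stand-in for a "chars"-like layer: splits the expression into characters.
const splitChars = (flagged) => ({
  ...flagged,
  expression: [...flagged.expression],
});

const toyParse = pipeline(stripFlags, splitChars);
const result = toyParse("/ab+/gi");
console.log(result.flags, result.expression);
```

Keeping each layer a function of one argument is what makes the order of the list above meaningful: swapping two layers changes what the later one receives.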

`deflag`

export	description
`DeFlag`	Function for de-flagging a `string` containing a regular expression. Returns a `Flags` object, with the `.expression` field containing the expression's string
`flagTable`	Table for identification of flags with appropriate `TokenInstance`s
`flagInstance`	Function based off `flagTable`. Returns the `TokenType` of a given flag `string`
`identifyFlags`	Maps `flagInstance` to an array of `string`s
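
As a rough illustration of flag identification, the sketch below maps flag letters to the `type` strings listed for the flag tokens in the `tokens` section. It is a minimal sketch under assumptions, not the library's implementation: `flagTypes` and `toyIdentifyFlags` are hypothetical names, the function takes a single string of flag letters, and the returned `{ type }` objects only imitate token instances.

```javascript
// Flag letters and their `type` strings, as listed in the flag token table.
const flagTypes = new Map([
  ["d", "indicies"],
  ["g", "global"],
  ["i", "case-insensitive"],
  ["m", "multiline"],
  ["s", "dot-all"],
  ["u", "unicode"],
  ["v", "unicode-sets"],
  ["y", "sticky"],
]);

// Hypothetical stand-in: maps each flag letter to a token-like object.
function toyIdentifyFlags(flags) {
  return [...flags].map((flag) => {
    if (!flagTypes.has(flag)) throw new Error(`Unknown flag: ${flag}`);
    return { type: flagTypes.get(flag) };
  });
}

const identified = toyIdentifyFlags("gi");
console.log(identified);
```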

`chars`

export	description
`ExpressionTokenizer`	A `PatternTokenizer` for tokenizing the given `Pattern` with a regular expression in it
`tokenizerMap`	The `RegExpMap`, on which `ExpressionTokenizer` is based

`classes`

export	description
`CharacterClassParser`	Main parser for character classes
`classLimit`	Limits the given stream up to the next `RectOp` from the current element
`classMap`	`TypeMap`, on which `CharacterClassParser` is based
`HandleClass`	The handler for the `RectOp` token inside the `classMap`
`ClassHandler`	A multistep function, serving as the main component of `HandleClass`
`EscapeInner`	A parser function, first component of the `ClassHandler`. Escapes inside characters
`HandleEscaped`	Handler for the escaped characters, main part of the `EscapeInner`
`IdentifyRanges`	Second parsing function of `ClassHandler`. Identifies and parses ranges
`HandleRange`	The main component of `IdentifyRanges`, parses encountered ranges
`InClassEscapedHandler`	A slightly modified version of the `escapedMap` from `escaped` module for escaping

`escaped`

export	description
`EscapedParser`	Main parser of the escaped characters
`escapePreface`	The `TypeMap`, on which `EscapedParser` is based
`escapeMap`	The `ValueMap` that defines the global-scope escaping
`escapedHandler`	Creates a function for handling escaped characters based off given map
`parseBackreference`	Returns a `Backreference` based on given arguments of `curr, input`
`parseMultControl`	Returns a `ControlCharacter` of lengths 4-5 based on `curr, input`
`parseDoubleControl`	Returns a `ControlCharacter` of length 2 based on `curr, input`
`parseSingleControl`	Returns a `ControlCharacter` of length 1 based on `curr, input`
`readUnicodeClassProperty`	Parses a `UnicodeClassProperty` based on `curr, input`
`readBraced`	Reads the given `Stream`, until a `ClBrace` is encountered
`readNamedBackreference`	Reads a `NamedBackreference` based on `readIdentifier`
`readUBrace`	Reads a sequence of `{hhhh}` or `{hhhhh}` where `isHex(h) === true`
`readu`	Reads a sequence of `hhhh`, where `isHex(h) === true`
`readx`	Reads a sequence of `hh`, where `isHex(h) === true`
`isHex`	Returns whether the given character is a hexadecimal digit
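
The fixed-length hexadecimal reads described above (`hh` after `\x`, `hhhh` after `\u`) can be sketched on a plain string. This is only an illustration: `isHexDigit` and `readHex` are hypothetical names, and the library's own functions work on `curr, input` rather than on a string and an index.

```javascript
// Illustrative predicate: true for a single hexadecimal digit (0-9, a-f, A-F).
const isHexDigit = (char) => /^[0-9a-fA-F]$/.test(char);

// Reads exactly `count` hex digits starting at `start`.
// Returns the digits and the index after them, or null if they are not there.
function readHex(input, start, count) {
  const digits = input.slice(start, start + count);
  if (digits.length !== count || ![...digits].every(isHexDigit)) return null;
  return { digits, next: start + count };
}

// The source text \u0041\x4a, written as a JavaScript string literal.
const source = "\\u0041\\x4a";
const unicode = readHex(source, 2, 4); // the four digits after "\u"
const hex = readHex(source, 8, 2); // the two digits after "\x"
console.log(unicode, hex);
```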

`boundry`

export	description
`BoundryParser`	Main parser of the submodule. Separates boundaries into `TokenInstance`s
`boundryMap`	The `TypeMap`, on which the `BoundryParser` is based
`HandleEscaped`	Handles the `NonWordBoundry` `TokenInstance`s

`group`

export	description
`EndParser`	The main parser of the submodule. The `ExpressionParser` ends with it
`GroupParser`	The first parsing layer of the `EndParser`. Recursive. Handles recursion, groups/captures, look-aheads/-behinds
`groupMap`	The `TypeMap`, on which the `GroupParser` is based
`GroupHandler`	The main component of the `groupMap`
`nestedBrack`	Function for limiting the current-level nested bracket-expression
`CollectionHandler`	Function for handling current collection
`HandleQMark`	Function for handling "collections" starting with `?` (`(?<!...)`, `(?<...>...)`, ...)
`HandleCollectionBase`	Function for recursively handling a capture group
`QMarkHandler`	Underlying `TableParser` of `HandleQMark`
`HandleQMarkExclMark`	Handles a negative look-ahead
`HandleQMarkEq`	Handles a look-ahead
`HandleLeftAngular`	Handles all "collections" starting with `<` (`(?<...>...)`, `(?<=...)`, ...)
`HandleColon`	Handles a no-capture group
`LeftAngularHandler`	Underlying `TableParser` for `HandleLeftAngular`
`HandleLeftAngularBase`	Handles a named capture
`HandleLeftAngularExclMark`	Handles a negative look-behind
`HandleLeftAngularEq`	Handles a look-behind
`readIdentifier`	Reads an identifier (for the named capture/backreference)

`quantifier`

export	description
`QuantifierParser`	Main parser of the submodule. Parses quantifiers
`QuantifierHandler`	A `TableParser`, main component of the `QuantifierParser`
`HandlePlus`	Handles a `Plus` token encountered
`HandleStar`	Handles a `Star` token encountered
`HandleQMark`	Handles a `QMark` token encountered
`BraceHandler`	Handles a `OpBrace` token encountered
`HandleBraced`	Returns a handling function for either one of `NtoM`, `NPlus`, or `NOnly`
`readNumber`	Reads a number from the given `Stream` (note: up to the first `isNaN` token)
`limitBraced`	Limits the given `Stream` up to the point of the first encountered `ClBrace`
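
To illustrate how the three braced forms differ, the sketch below classifies the text found between `{` and `}` using the `type` strings of the quantifier tokens. It is not the library's code: `toyParseBraced` and the `from`/`to` field names are hypothetical, and the input is assumed to be the already-extracted inner text.

```javascript
// Classifies the text between `{` and `}` as "n-only", "n-plus" or "n-to-m".
// Returns null when the text is not a valid braced quantifier body.
function toyParseBraced(inner) {
  const match = /^(\d+)(,(\d*))?$/.exec(inner);
  if (!match) return null;
  const from = Number(match[1]);
  // No comma at all: exactly `from` repetitions.
  if (match[2] === undefined) return { type: "n-only", from };
  // Comma with nothing after it: `from` or more repetitions.
  if (match[3] === "") return { type: "n-plus", from };
  // Comma followed by an upper bound.
  return { type: "n-to-m", from, to: Number(match[3]) };
}

console.log(toyParseBraced("3"), toyParseBraced("3,"), toyParseBraced("3,5"));
```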

`nogreedy`

export	description
`ParseNoGreedy`	Main parser of the submodule. Parses `NoGreedy` tokens
`noGreedyMap`	The `TypeMap`, on which `ParseNoGreedy` is based
`HandleQuantifier`	Handler for quantifiers
`QuantifierHandler`	The underlying `TableParser`-function of `HandleQuantifier`
`HandleQMark`	Handles `QMark` following a quantifier (no-greedy quantifiers)

`disjunction`

export	description
`DisjunctionParser`	The main export of the submodule. Parses disjunctions
`EmptyFixer`	First parsing layer of `DisjunctionParser`. Fixes empty expressions `||`
`DisjunctionTokenizer`	Second parsing layer of `DisjunctionParser`. Puts non-`Pipe` bits of current `Stream` into `DisjunctionArgument`s
`DisjunctionDelimiter`	Third and final parsing layer of `DisjunctionParser`. Delimits the `Stream` based off `Pipe` tokens
`hasDisjunctions`	Checks whether a given `Stream` has disjunctions to parse from given point on
`limitPipe`	Limits the given `Stream` until the moment the next `Pipe` is encountered
`skipTilPipes`	Skips `Stream` until a `Pipe` is discovered
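
The idea of delimiting on `Pipe` tokens and marking empty alternatives can be sketched on a plain string. This is a simplified illustration, not the library's implementation: `toySplitDisjunction` is a hypothetical name, the `{ type, value }` shape only imitates tokens, and the sketch handles a single flat level (it ignores groups and character classes, which the real layers deal with earlier).

```javascript
// Splits a flat expression on unescaped `|`, marking empty alternatives.
function toySplitDisjunction(source) {
  const parts = [];
  let current = "";
  for (let i = 0; i < source.length; i++) {
    const char = source[i];
    if (char === "\\") {
      // Keep an escaped character (including an escaped pipe) in the current part.
      current += char + (source[i + 1] ?? "");
      i++;
    } else if (char === "|") {
      parts.push(current);
      current = "";
    } else {
      current += char;
    }
  }
  parts.push(current);
  return parts.map((value) =>
    value === "" ? { type: "empty" } : { type: "disjunction-arg", value }
  );
}

// The source text a||b\|c, written as a JavaScript string literal.
const alternatives = toySplitDisjunction("a||b\\|c");
console.log(alternatives);
```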

`generator`

Provides regex-generation related exports based off the package's AST

export	description
`RegexGenerator`	The `SourceGenerator` for the package's AST (`generate` is based on it)
`generatorMap`	The `TypeMap`, on which `RegexGenerator` is based
`GenerateBackspaceClass`	Generates a regex for `BackspaceClass`
`GenerateWordBoundry`	Generates a regex for `WordBoundry`
`GenerateNonWordBoundry`	Generates a regex for `NonWordBoundry`
`GenerateNewline`	Generates a regex for `Newline`
`GenerateCarriageReturn`	Generates a regex for `CarriageReturn`
`GenerateWordClass`	Generates a regex for `WordClass`
`GenerateNonWordClass`	Generates a regex for `NonWordClass`
`GenerateFormFeed`	Generates a regex for `FormFeed`
`GenerateDigitClass`	Generates a regex for `DigitClass`
`GenerateNonDigitClass`	Generates a regex for `NonDigitClass`
`GenerateNULClass`	Generates a regex for `NULClass`
`GenerateVerticalTab`	Generates a regex for `VerticalTab`
`GenerateHorizontalTab`	Generates a regex for `HorizontalTab`
`GenerateNonWhitespaceClass`	Generates a regex for `NonWhitespaceClass`
`GenerateWhitespaceClass`	Generates a regex for `WhitespaceClass`
`GenerateEmptyExpression`	Generates a regex for `EmptyExpression`
`GenerateMatchIndicies`	Generates a regex for `MatchIndicies` flag
`GenerateGlobalSearch`	Generates a regex for `GlobalSearch` flag
`GenerateCaseInsensitive`	Generates a regex for `CaseInsensitive` flag
`GenerateMultline`	Generates a regex for `Multiline` flag
`GenerateDotAll`	Generates a regex for `DotAll` flag
`GenerateUnicode`	Generates a regex for `Unicode` flag
`GenerateUnicodeSets`	Generates a regex for `UnicodeSets` flag
`GenerateSticky`	Generates a regex for `Sticky` flag
`GeneratePatterStart`	Generates a regex for `PatternStart`
`GeneratePatternEnd`	Generates a regex for `PatternEnd`
`GenerateFlags`	Generates a regex for `Flags`
`GenerateExpression`	Generates a regex for `Expression`
`GenerateNOnly`	Generates a regex for `NOnly`
`GenerateNtoM`	Generates a regex for `NtoM`
`GenerateNPlus`	Generates a regex for `NPlus`
`GenerateEscaped`	Generates a regex for `Escaped`
`GenerateBackreference`	Generates a regex for `Backreference`
`GenerateUnicodeClassProperty`	Generates a regex for `UnicodeClassProperty`
`GenerateControlCharacter`	Generates a regex for `ControlCharacter`
`GenerateNamedBackreference`	Generates a regex for `NamedBackreference`
`GenerateClassRange`	Generates a regex for `ClassRange`
`GenerateNoGreedy`	Generates a regex for `NoGreedy`
`GenerateOptional`	Generates a regex for `Optional`
`GenerateZeroPlus`	Generates a regex for `ZeroPlus`
`GenerateOnePlus`	Generates a regex for `OnePlus`
`GenerateClass`	Generates a regex for `CharacterClass`
`GenerateNegClass`	Generates a regex for `NegCharacterClass`
`GenerateDisjunction`	Generates a regex for `Disjunction`
`GenerateDisjunctionArgument`	Generates a regex for `DisjunctionArgument`
`GenerateNonCaptureGroup`	Generates a regex for `NonCaptureGroup`
`GenerateCaptureGroup`	Generates a regex for `CaptureGroup`
`GenerateLookAhead`	Generates a regex for `LookAhead`
`GenerateLookBehind`	Generates a regex for `LookBehind`
`GenerateNegLookAhead`	Generates a regex for `NegLookAhead`
`GenerateNegLookBehind`	Generates a regex for `NegLookBehind`
`GenerateNamedCapture`	Generates a regex for `NamedCapture`
`GenerateWildcard`	Generates a regex for `Wildcard`
`GeneratePipe`	Generates a regex for `Pipe`
`GenerateComma`	Generates a regex for `Comma`
`GenerateTrivial`	Generates a regex for anything else not in the table already (with a `typeof .value === 'string'`)
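
A generator built on a map from node type to generating function can be sketched as follows. This is a minimal sketch under assumptions, not the library's `RegexGenerator`: `toyGeneratorMap` and `toyGenerate` are hypothetical names, the `{ type, value }` node shape is assumed for illustration, and only a handful of the `type` strings from the `tokens` section are covered.

```javascript
// Each entry turns a node of the given type into regex source text.
// Handlers for nodes with children receive the generator to recurse with.
const toyGeneratorMap = {
  "start": () => "^",
  "end": () => "$",
  "wildcard": () => ".",
  "digit-class": () => "\\d",
  "zero-plus": (node, generate) => generate(node.value) + "*",
  "capture": (node, generate) => "(" + node.value.map(generate).join("") + ")",
  "symbol": (node) => node.value,
};

function toyGenerate(node) {
  const handler = toyGeneratorMap[node.type];
  if (!handler) throw new Error(`No generator for type: ${node.type}`);
  return handler(node, toyGenerate);
}

// A capture group containing the symbol "a" followed by a starred digit class.
const generated = toyGenerate({
  type: "capture",
  value: [
    { type: "symbol", value: "a" },
    { type: "zero-plus", value: { type: "digit-class" } },
  ],
});
console.log(generated);
```

Because every node type has its own entry, a partial node can be generated on its own, which matches the note on `generate` that a `PatternEnd` alone yields "$".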

`tree`

export	description
`RegexStream`	A `TreeStream` for the library's AST (note: accepts THE AST ITSELF)
`RegexTree`	A `Tree` interface implementation for the library's AST
`treeMap`	The `TypeMap`, on which `RegexTree` is based
`NamedCaptureTree`	The function for conversion of a `NamedCapture` to a `Tree`
`ExpressionTree`	The function for conversion of an `Expression` to a `Tree`
`FlagTree`	The function for conversion of a `Flags` to a `Tree`
`SeveralTree`	The function for conversion of `NOnly`, `NtoM` and `NPlus` to a `Tree`
`SingleTree`	The function for conversion of `ZeroPlus`, `OnePlus`, `Optional`, `LookAhead`, `LookBehind`, `NegLookAhead`, `NegLookBehind`, `NamedBackreference` to a `Tree`
`ValueTree`	The function for conversion of `ClassRange`, `DisjunctionArgument`, `CharacterClass`, `NegCharacterClass` and `Disjunction` to a `Tree`
`ChildlessTree`	The function for conversion of the rest of the tokens to a `Tree`

`tokens`

The tokens module has the same submodule structure as the parser module.

submodule	description
`boundry`	Various boundary tokens
`chars`	Various basic (first-order) tokens
`classes`	Tokens for representation of character classes
`deflag`	Flags and expressions representation tokens
`disjunction`	Disjunction-related tokens
`escaped`	Escape-sequence-related tokens
`group`	Tokens for groups and other recursive structures
`nogreedy`	Tokens for non-greedy quantifiers
`quantifier`	Tokens for quantifiers

`deflag`

`TokenType`/`TokenInstance`	represents	`type`
`MatchIndicies`	The `d` flag	`"indicies"`
`GlobalSearch`	The `g` flag	`"global"`
`CaseInsensitive`	The `i` flag	`"case-insensitive"`
`Multiline`	The `m` flag	`"multiline"`
`DotAll`	The `s` flag	`"dot-all"`
`Unicode`	The `u` flag	`"unicode"`
`UnicodeSets`	The `v` flag	`"unicode-sets"`
`Sticky`	The `y` flag	`"sticky"`
`Flags`	The complete regular expression with flags	`"flags"`
`Expression`	A partial expression, without flags (can have other `Expression`s inside)	`"expression"`

`chars`

`TokenType`	represents	`type`
`Escape`	`\\`	`"escape"`
`RectOp`	`[`	`"rop"`
`RectCl`	`]`	`"rcl"`
`Hyphen`	`-`	`"hyphen"`
`Pipe`	`|`	`"pipe"`
`OpBrack`	`(`	`"opbrack"`
`ClBrack`	`)`	`"clbrack"`
`QMark`	`?`	`"qmark"`
`ExclMark`	`!`	`"emark"`
`Eq`	`=`	`"eq"`
`Wildcard`	`.`	`"wildcard"`
`Star`	`*`	`"star"`
`Plus`	`+`	`"plus"`
`OpBrace`	`{`	`"opbrc"`
`ClBrace`	`}`	`"clbrc"`
`Colon`	`:`	`"colon"`
`Comma`	`,`	`"comma"`
`LeftAngular`	`<`	`"lang"`
`RightAngular`	`>`	`"rang"`
`Dollar`	`$`	`"dollar"`
`Xor`	`^`	`"xor"`
`RegexSymbol`	everything else	`"symbol"`
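
The first-order tokenization described by this table amounts to a lookup from character to `type`, with everything else falling back to `"symbol"`. The sketch below illustrates that idea only: `charTypes` and `toyTokenize` are hypothetical names, and the `{ type, value }` objects merely imitate the library's tokens.

```javascript
// Character-to-type lookup, following the table above.
const charTypes = new Map([
  ["\\", "escape"], ["[", "rop"], ["]", "rcl"], ["-", "hyphen"],
  ["|", "pipe"], ["(", "opbrack"], [")", "clbrack"], ["?", "qmark"],
  ["!", "emark"], ["=", "eq"], [".", "wildcard"], ["*", "star"],
  ["+", "plus"], ["{", "opbrc"], ["}", "clbrc"], [":", "colon"],
  [",", "comma"], ["<", "lang"], [">", "rang"], ["$", "dollar"],
  ["^", "xor"],
]);

// Turns each character into a token-like object; unknown characters are symbols.
function toyTokenize(source) {
  return [...source].map((value) => ({
    type: charTypes.get(value) ?? "symbol",
    value,
  }));
}

const tokenTypes = toyTokenize("(a|b)+").map((token) => token.type);
console.log(tokenTypes);
```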

`classes`

`TokenType`	represents	`type`
`CharacterClass`	A character class `[...]`	`"charclass"`
`NegCharacterClass`	A negative character class `[^...]`	`"neg-charclass"`
`ClassRange`	A character class range `X-Y`	`"class-range"`

`escaped`

`TokenType`/`TokenInstance`	represents	`type`
`ControlCharacter`	`\cX`, `\xhh`, `\uhhhh`, `\u{hhhh}` or `\u{hhhhh}`	`"control-char"`
`Backreference`	`\N` - numeric backreference	`"backref"`
`NamedBackreference`	`\k<name>` - named backreference	`"named-backref"`
`UnicodeClassProperty`	`\p{...}` - unicode class property	`"uniprop"`
`RegexIdentifier`	`name` - identifier in named captures/backreferences	`"identifier"`
`CarriageReturn`	`\r` - carriage return	`"cr"`
`NonWordBoundry`	`\B` - non-word boundary (outside classes)	`"non-word-boundry"`
`WordBoundry`	`\b` - word boundary	`"word-boundry"`
`NULClass`	`\0` - NUL class	`"nul-class"`
`FormFeed`	`\f` - form feed	`"form-feed"`
`DigitClass`	`\d` - digit class	`"digit-class"`
`NonDigitClass`	`\D` - non-digit class	`"non-digit-class"`
`WordClass`	`\w` - word-class	`"word-class"`
`NonWordClass`	`\W` - non-word class	`"non-word-class"`
`WhitespaceClass`	`\s` - whitespace class	`"whitespace-class"`
`NonWhitespaceClass`	`\S` - non-whitespace class	`"non-whitespace-class"`
`HorizontalTab`	`\t` - horizontal tab	`"tab"`
`VerticalTab`	`\v` - vertical tab	`"vtab"`
`BackspaceClass`	`\b` - backspace (inside classes)	`"backspace"`
`Newline`	`\n` - newline	`"newline"`
`Escaped`	Any other escaped character	`"escaped"`

`boundry`

`TokenInstance`	represents	`type`
`PatternStart`	`^`	`"start"`
`PatternEnd`	`$`	`"end"`

`group`

`TokenType`	represents	`type`
`CaptureGroup`	`(...)`	`"capture"`
`NoCaptureGroup`	`(?:...)`	`"non-capture"`
`NamedCapture`	`(?<name>...)`	`"named-capture"`
`LookAhead`	`(?=...)`	`"lookahead"`
`LookBehind`	`(?<=...)`	`"lookbehind"`
`NegLookAhead`	`(?!...)`	`"neg-lookahead"`
`NegLookBehind`	`(?<!...)`	`"neg-lookbehind"`

`quantifier`

`TokenType`	represents	`type`
`ZeroPlus`	`...*`	`"zero-plus"`
`OnePlus`	`...+`	`"one-plus"`
`Optional`	`...?`	`"optional"`
`NOnly`	`...{...}`	`"n-only"`
`NPlus`	`...{...,}`	`"n-plus"`
`NtoM`	`...{...,...}`	`"n-to-m"`

`nogreedy`

export	description	`type`
`NoGreedy`	A `TokenType` representing no-greedy operators	`"nogreedy"`
`isQuantifier`	A predicate returning `true` only for tokens with types from the `quantifier` module

`disjunction`

`TokenType`/`TokenInstance`	represents	`type`
`Disjunction`	`...|...|...`	`"disjunction"`
`DisjunctionArgument`	An element of a `Disjunction`	`"disjunction-arg"`
`EmptyExpression`	An empty element of a `Disjunction` (`||`)	`"empty"`
