Regular Expressions Cheatsheet
A concise cheatsheet for using Regular Expressions in JavaScript
Follows my mental model.
Intentionally non-comprehensive. Only includes syntax and parts of the API that I actually use.
Certain concepts are imprecisely defined. (For example, some definitions do not account for characters that are very rarely encountered in input strings.)
regexp.test(string)
There should be no capturing groups and no g flag in regexp.
/y/.test('xx') //=> false
(Pitfall) Match, with g flag
regexp.lastIndex may change with each call to regexp.test(string).
const regexp = /x/g
regexp.lastIndex //=> 0
regexp.test('xx') //=> true
regexp.lastIndex //=> 1
regexp.test('xx') //=> true
regexp.lastIndex //=> 2
regexp.test('xx') //=> false
regexp.lastIndex //=> 0
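One way to sidestep this pitfall when a regexp with the g flag has to be reused is to reset lastIndex before each call. The sketch below assumes a helper named testFromStart, which is hypothetical and not part of the RegExp API.

```javascript
// Hypothetical helper (not a built-in): reset lastIndex before testing.
function testFromStart(regexp, string) {
  regexp.lastIndex = 0 // always start searching from the beginning of the string
  return regexp.test(string)
}

const globalRegexp = /x/g
const outcomes = [
  testFromStart(globalRegexp, 'xx'),
  testFromStart(globalRegexp, 'xx'),
  testFromStart(globalRegexp, 'xx')
]
outcomes //=> [true, true, true]
```

Dropping the g flag altogether is usually simpler when only a yes/no answer is needed.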
string.match(regexp)
There should be no capturing groups if there’s also a g flag in regexp.
'xx'.match(/y/) //=> null
'xx'.match(/y/g) //=> null

const matches = 'xx'.match(/x/)
matches[0] //=> 'x'
matches.index //=> 0
matches.length //=> 1

const matches = 'xx'.match(/x/g)
matches[0] //=> 'x'
matches[1] //=> 'x'
matches.index //=> undefined
matches.length //=> 2
Match, with capturing group, without g flag
const matches = 'xyxy'.match(/x(y)/)
matches[0] //=> 'xy'
matches[1] //=> 'y'
matches.index //=> 0
matches.length //=> 2
(Pitfall) Match, with capturing group, with g flag
Capturing groups in regexp are ignored; returns the matches only.
const matches = 'xyxy'.match(/x(y)/g)
matches[0] //=> 'xy'
matches[1] //=> 'xy'
matches.index //=> undefined
matches.length //=> 2
string.matchAll(regexp)
There should always be a g flag in regexp; without it, matchAll throws a TypeError.
const iterator = 'xx'.matchAll(/y/g)
const result = []
for (const match of iterator) {
  result.push(match[0])
}
result //=> []

const iterator = 'xx'.matchAll(/x/g)
const result = []
for (const match of iterator) {
  result.push(match[0])
}
result //=> ['x', 'x']
Match, with capturing group
const iterator = 'xyxy'.matchAll(/x(y)/g)
const result = []
for (const match of iterator) {
  result.push([match[0], match[1]])
}
result //=> [['xy', 'y'], ['xy', 'y']]
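The loops above can be shortened with Array.from, which accepts the iterator and a mapping function. This is only a sketch; the variable names pairs and matchAllError are illustrative. It also shows what happens when the g flag is left out.

```javascript
// Collect the full match, the first capturing group and the match index.
const pairs = Array.from('xyxy'.matchAll(/x(y)/g), function (match) {
  return [match[0], match[1], match.index]
})
pairs //=> [['xy', 'y', 0], ['xy', 'y', 2]]

// Without the g flag, matchAll throws instead of returning an iterator.
let matchAllError = null
try {
  'xx'.matchAll(/x/)
} catch (error) {
  matchAllError = error
}
matchAllError instanceof TypeError //=> true
```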
string.replace(regexp, newSubString)
There should be no capturing groups in regexp.
'xx'.replace(/y/, 'z') //=> 'xx'
'xx'.replace(/y/g, 'z') //=> 'xx'
'xx'.replace(/x/, 'z') //=> 'zx'
'xx'.replace(/x/g, 'z') //=> 'zz'
string.replace(regexp, callback)
function callback(match) {
  return match.toUpperCase()
}
'xx'.replace(/y/, callback) //=> 'xx'
'xx'.replace(/y/g, callback) //=> 'xx'

function callback(match) {
  return match.toUpperCase()
}
'xx'.replace(/x/, callback) //=> 'Xx'
'xx'.replace(/x/g, callback) //=> 'XX'
Match, with capturing group
function callback(_, y) {
  return y.toUpperCase()
}
'xyxy'.replace(/x(y)/, callback) //=> 'Yxy'
'xyxy'.replace(/x(y)/g, callback) //=> 'YY'
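A slightly more practical use of a callback with a capturing group is converting kebab-case to camelCase. The function name toCamelCase is an assumption made for this sketch.

```javascript
// Hypothetical helper: uppercase the letter that follows each hyphen.
function toCamelCase(string) {
  return string.replace(/-([a-z])/g, function (_, letter) {
    // `_` is the whole match (e.g. '-b'); `letter` is the captured group
    return letter.toUpperCase()
  })
}

toCamelCase('foo-bar-baz') //=> 'fooBarBaz'
```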
Expression            Description
. or [^\n\r]          any character excluding a newline or carriage return
[A-Za-z]              letter
[a-z]                 lowercase letter
[A-Z]                 uppercase letter
\d or [0-9]           digit
\D or [^0-9]          non-digit
_                     underscore
\w or [A-Za-z0-9_]    letter, digit or underscore
\W or [^A-Za-z0-9_]   inverse of \w
\S                    inverse of \s
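A few quick checks of these character classes, as a sketch. The anchors ^ and $ and the + quantifier are used only to force the whole string to match, and the variable names are illustrative.

```javascript
const isDigitsOnly = /^\d+$/.test('2024') //=> true
const hasNonDigit = /\D/.test('20x4') //=> true
const isWordOnly = /^\w+$/.test('snake_case_1') //=> true
const hyphenIsWord = /^\w+$/.test('kebab-case') //=> false
const digitsKept = 'a1-b2'.replace(/\D/g, '') //=> '12'
```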
Expression          Description
(a literal space)   space
\t                  tab
\n                  newline
\r                  carriage return
\s                  space, tab, newline or carriage return
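As an illustration, \s combined with the + quantifier can split a string on any run of whitespace. This is a sketch; the variable name words is arbitrary.

```javascript
// Trim first so that leading or trailing whitespace does not produce empty items.
const words = ' foo\tbar\r\nbaz '.trim().split(/\s+/)
words //=> ['foo', 'bar', 'baz']
```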
Expression   Description
[xyz]        either x, y or z
[^xyz]       neither x, y nor z
[1-3]        either 1, 2 or 3
[^1-3]       neither 1, 2 nor 3
Think of a character set as an OR operation on the single characters that are enclosed between the square brackets.
Use ^ after the opening [ to “negate” the character set.
Within a character set, . means a literal period.
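The three notes above can be checked with a short sketch (variable names are illustrative):

```javascript
// OR over single characters: split on either a period or a comma.
const parts = 'a.b,c'.split(/[.,]/) //=> ['a', 'b', 'c']

// Negated set: remove everything that is not 1, 2 or 3.
const kept = 'x1y2z3'.replace(/[^1-3]/g, '') //=> '123'

// Inside a set, . is a literal period, so it does not match letters.
const periodIsLiteral = /[.]/.test('abc') //=> false
```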
Characters that require escaping
Expression   Description
\.           period
\^           caret
\$           dollar sign
\|           pipe
\\           backslash
\/           forward slash
\(           opening bracket
\)           closing bracket
\[           opening square bracket
\]           closing square bracket
\{           opening curly bracket
\}           closing curly bracket
Characters that require escaping within a character set
Expression   Description
\\           backslash
\]           closing square bracket
A ^ must be escaped only if it occurs immediately after the opening [ of the character set.
A - must be escaped only if it occurs between two letters or two digits, where it would otherwise be read as a range (as in [a-z] or [0-9]).
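When a pattern is built from arbitrary text, escaping by hand is error-prone. A common approach is a small helper that escapes every special character before passing the text to the RegExp constructor. The name escapeRegExp is an assumption for this sketch, not a built-in, and the set of characters it escapes also includes the quantifiers *, + and ?.

```javascript
// Hypothetical helper: prefix each special character with a backslash.
// '$&' in the replacement string stands for the matched character.
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

const text = '1+1=2 (really?)'
const escapedMatches = new RegExp(escapeRegExp(text)).test(text) //=> true

// Unescaped, + is a quantifier, so the pattern looks for '11=2' instead.
const unescapedMatches = /1+1=2/.test('1+1=2') //=> false
```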
Expression   Description
{2}          exactly 2
{2,}         at least 2
{2,7}        at least 2 but no more than 7
*            0 or more
+            1 or more
?            exactly 0 or 1
The quantifier goes after the expression to be quantified.
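A sketch of each quantifier placed after the expression it applies to. The anchors ^ and $ force the whole string to match, and the variable names are illustrative.

```javascript
const exactlyFour = /^\d{4}$/.test('2024') //=> true
const atLeastTwo = /^\d{2,}$/.test('7') //=> false
const twoToSeven = /^\d{2,7}$/.test('12345678') //=> false
const zeroOrMore = /^ab*$/.test('a') //=> true
const oneOrMore = /^ab+$/.test('a') //=> false
const optional = /^colou?r$/.test('color') //=> true
```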
Expression   Description
^            start of string
$            end of string
\b           word boundary
How word boundary matching works. \b matches at the following positions:
At the beginning of the string if the first character is \w.
Between two adjacent characters within the string, if one of them is \w and the other is \W (in either order).
At the end of the string if the last character is \w.
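A small sketch of these rules in action (variable names are illustrative): \b keeps a replacement from touching a word that merely contains the search term.

```javascript
// 'cat' inside 'concat' is preceded by 'n' (a \w character), so there is no boundary there.
const replaced = 'cat concat cat'.replace(/\bcat\b/g, 'dog') //=> 'dog concat dog'

// A boundary also exists where a \W character is followed by a \w character.
const afterHyphen = /\bfoo/.test('-foo') //=> true
```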
Expression    Description
foo|bar       match either foo or bar
foo(?=bar)    match foo only if it’s immediately followed by bar
foo(?!bar)    match foo only if it’s not immediately followed by bar
(?<=bar)foo   match foo only if it’s immediately preceded by bar
(?<!bar)foo   match foo only if it’s not immediately preceded by bar
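The sketch below replaces only the foo part in each case, which shows that the bar inside a lookahead or lookbehind is checked but not consumed. It assumes an engine with lookbehind support (ES2018 or later); the variable names are illustrative.

```javascript
const either = 'foo bar'.replace(/foo|bar/g, 'X') //=> 'X X'
const lookahead = 'foobar foobaz'.replace(/foo(?=bar)/g, 'X') //=> 'Xbar foobaz'
const negativeLookahead = 'foobar foobaz'.replace(/foo(?!bar)/g, 'X') //=> 'foobar Xbaz'
const lookbehind = 'barfoo bazfoo'.replace(/(?<=bar)foo/g, 'X') //=> 'barX bazfoo'
const negativeLookbehind = 'barfoo bazfoo'.replace(/(?<!bar)foo/g, 'X') //=> 'barfoo bazX'
```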
Expression   Description
(foo)        capturing group; match and capture foo
(?:foo)      non-capturing group; match foo but without capturing foo
(foo)bar\1   \1 is a backreference to the 1st capturing group; match foobarfoo
Capturing groups are only relevant in the following methods:
string.match(regexp)
string.matchAll(regexp)
string.replace(regexp, callback)
\N is a backreference to the Nth capturing group. Capturing groups are numbered starting from 1.
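A sketch of how capturing groups, non-capturing groups and backreferences behave together. The date string and variable names are made up for illustration.

```javascript
const repeated = /(foo)bar\1/.test('foobarfoo') //=> true
const notRepeated = /(foo)bar\1/.test('foobarbaz') //=> false

// The month is matched by a non-capturing group, so it is not numbered.
const dateMatch = '2024-01-31'.match(/(\d{4})-(?:\d{2})-(\d{2})/)
dateMatch[0] //=> '2024-01-31'
dateMatch[1] //=> '2024'
dateMatch[2] //=> '31'
dateMatch.length //=> 3
```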
Flag   Description
g      global search
i      case-insensitive search
m      multi-line search
If the g flag is used, regexp.lastIndex may change with each call to regexp.test(string).
If the m flag is used, ^ and $ will match the start and end of each line.
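A sketch of the m and i flags (variable names are illustrative): with m, ^ and $ apply to each line, so a pattern that fails against the whole string can match line by line.

```javascript
// Without m, ^ and $ only match at the start and end of the whole string.
const withoutM = 'a\nb'.match(/^\w$/g) //=> null

// With m, each line is checked on its own.
const withM = 'a\nb'.match(/^\w$/gm) //=> ['a', 'b']

const ignoreCase = /foo/i.test('FOO') //=> true
```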
MIT