---
title: Assertions
slug: Web/JavaScript/Guide/Regular_expressions/Assertions
page-type: guide
---
{{jsSidebar("JavaScript Guide")}}
Assertions include boundaries, which indicate the beginnings and endings of lines and words, and other patterns indicating in some way that a match is possible (including lookahead and lookbehind).
{{EmbedInteractiveExample("pages/js/regexp-assertions.html", "taller")}}
## Types
### Boundary-type assertions
<table class="standard-table">
<thead>
<tr>
<th scope="col">Characters</th>
<th scope="col">Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>^</code></td>
<td>
<p>
Matches the beginning of input. If the multiline flag is set to true,
also matches immediately after a line break character. For example,
<code>/^A/</code> does not match the "A" in "an A", but does match the
first "A" in "An A".
</p>
<div class="notecard note">
<p>
<strong>Note:</strong> This character has a different meaning when
it appears at the start of a
<a
href="/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes"
>character class</a
>.
</p>
</div>
</td>
</tr>
<tr>
<td><code>$</code></td>
<td>
<p>
Matches the end of input. If the multiline flag is set to true, also
matches immediately before a line break character. For example,
<code>/t$/</code> does not match the "t" in "eater", but does match it
in "eat".
</p>
</td>
</tr>
<tr>
<td><code>\b</code></td>
<td>
<p>
Matches a word boundary. This is a position where a word character
is not followed or not preceded by another word character, such as
between a letter and a space. A matched word boundary is not
included in the match. In other words, the length of a matched word
boundary is zero.
</p>
<p>Examples:</p>
<ul>
<li><code>/\bm/</code> matches the "m" in "moon".</li>
<li>
<code>/oo\b/</code> does not match the "oo" in "moon", because "oo"
is followed by "n" which is a word character.
</li>
<li>
<code>/oon\b/</code> matches the "oon" in "moon", because "oon" is
the end of the string, thus not followed by a word character.
</li>
<li>
<code>/\w\b\w/</code> will never match anything, because a word
character can never be followed by both a non-word and a word
character.
</li>
</ul>
<p>
To match a backspace character (<code>[\b]</code>), see
<a
href="/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes"
>Character Classes</a
>.
</p>
</td>
</tr>
<tr>
<td><code>\B</code></td>
<td>
<p>
Matches a non-word boundary. This is a position where the previous and
next characters are of the same type: either both must be word
characters, or both must be non-word characters, for example between
two letters or between two spaces. The beginning and end of a string
are treated as non-word characters. Like a matched word boundary, a
matched non-word boundary is not included in the match. For example,
<code>/\Bon/</code> matches "on" in "at noon", and
<code>/ye\B/</code> matches "ye" in "possibly yesterday".
</p>
</td>
</tr>
</tbody>
</table>
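As a minimal sketch of how the multiline flag changes `^` and `$`, the snippet below runs similar patterns with and without the `m` flag. The sample string and the variable names are made up for illustration.

```js
// Hypothetical sample text: three lines separated by "\n".
const text = "apple pie\nbanana split\napple tart";

// Without the m flag, ^ only matches at the very start of the input.
const startOfInput = text.match(/^\w+/g);
console.log(startOfInput); // [ 'apple' ]

// With the m flag, ^ also matches immediately after each line break.
const startOfEachLine = text.match(/^\w+/gm);
console.log(startOfEachLine); // [ 'apple', 'banana', 'apple' ]

// Likewise, $ with the m flag also matches immediately before each line break.
const endOfEachLine = text.match(/\w+$/gm);
console.log(endOfEachLine); // [ 'pie', 'split', 'tart' ]
```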
### Other assertions
> **Note:** The `?` character may also be used as a quantifier.
<table class="standard-table">
<thead>
<tr>
<th scope="col">Characters</th>
<th scope="col">Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>x(?=y)</code></td>
<td>
<p>
<strong>Lookahead assertion: </strong>Matches "x" only if "x" is
followed by "y". For example, <code>/Jack(?=Sprat)/</code> matches
"Jack" only if it is followed by "Sprat".<br /><code
>/Jack(?=Sprat|Frost)/</code
>
matches "Jack" only if it is followed by "Sprat" or "Frost". However,
neither "Sprat" nor "Frost" is part of the match results.
</p>
</td>
</tr>
<tr>
<td><code>x(?!y)</code></td>
<td>
<p>
<strong>Negative lookahead assertion: </strong>Matches "x" only if "x"
is not followed by "y". For example, <code>/\d+(?!\.)/</code> matches
a number only if it is not followed by a decimal point. <code
>/\d+(?!\.)/.exec('3.141')</code
>
matches "141" but not "3".
</p>
</td>
</tr>
<tr>
<td><code>(?&lt;=y)x</code></td>
<td>
<p>
<strong>Lookbehind assertion: </strong>Matches "x" only if "x" is
preceded by "y". For example,
<code>/(?&lt;=Jack)Sprat/</code> matches "Sprat" only if it is
preceded by "Jack". <code>/(?&lt;=Jack|Tom)Sprat/</code> matches
"Sprat" only if it is preceded by "Jack" or "Tom". However, neither
"Jack" nor "Tom" is part of the match results.
</p>
</td>
</tr>
<tr>
<td><code>(?&lt;!y)x</code></td>
<td>
<p>
<strong>Negative lookbehind assertion: </strong>Matches "x" only if
"x" is not preceded by "y". For example,
<code>/(?&lt;!-)\d+/</code> matches a number only if it is not
preceded by a minus sign. <code>/(?&lt;!-)\d+/.exec('3')</code>
matches "3". <code>/(?&lt;!-)\d+/.exec('-3')</code> finds no match
because the number is preceded by a minus sign.
</p>
</td>
</tr>
</tbody>
</table>
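The following sketch illustrates that lookahead and lookbehind assertions are zero-width: the asserted text must be present, but it never becomes part of the match. The sample string (including the made-up name "JackHorner") and the variable names are assumptions for illustration only.

```js
// Hypothetical sample string built from the names used in the table above.
const rhyme = "JackSprat and JackFrost met JackHorner";

// Lookahead: "Jack" matches only where "Sprat" or "Frost" follows,
// and neither name is included in the result.
const jacks = rhyme.match(/Jack(?=Sprat|Frost)/g);
console.log(jacks); // [ 'Jack', 'Jack' ]

// The match is exactly four characters long, starting at index 0.
const firstJack = /Jack(?=Sprat)/.exec(rhyme);
console.log(firstJack[0], firstJack.index); // Jack 0

// Lookbehind: word characters match only where "Jack" precedes them,
// and "Jack" itself is not included in the result.
const surnames = rhyme.match(/(?<=Jack)\w+/g);
console.log(surnames); // [ 'Sprat', 'Frost', 'Horner' ]
```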
## Examples
### General boundary-type overview example
```js
// Use regex boundaries to fix a buggy string.
let buggyMultiline = `tey, ihe light-greon apple
tangs on ihe greon traa`;
// 1) Use ^ to fix the matching at the beginning of the string, and right after newline.
buggyMultiline = buggyMultiline.replace(/^t/gim, "h");
console.log(1, buggyMultiline); // fix 'tey' => 'hey' and 'tangs' => 'hangs' but do not touch 'traa'.
// 2) Use $ to fix matching at the end of the text.
buggyMultiline = buggyMultiline.replace(/aa$/gim, "ee.");
console.log(2, buggyMultiline); // fix 'traa' => 'tree.'.
// 3) Use \b to match characters right on border between a word and a space.
buggyMultiline = buggyMultiline.replace(/\bi/gim, "t");
console.log(3, buggyMultiline); // fix 'ihe' => 'the' but do not touch 'light'.
// 4) Use \B to match characters inside borders of an entity.
const fixedMultiline = buggyMultiline.replace(/\Bo/gim, "e");
console.log(4, fixedMultiline); // fix 'greon' => 'green' but do not touch 'on'.
```
### Matching the beginning of input using a ^ control character
Use `^` to match at the beginning of input. In this example, the `/^A/` regex selects the fruits that start with 'A'. To pick the matching fruits, we use the [filter](/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/filter) method with an [arrow](/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions) function.
```js
const fruits = ["Apple", "Watermelon", "Orange", "Avocado", "Strawberry"];
// Select fruits that start with 'A' using the /^A/ regex.
// Here the '^' control symbol is used in only one role: matching the beginning of the input.
const fruitsStartsWithA = fruits.filter((fruit) => /^A/.test(fruit));
console.log(fruitsStartsWithA); // [ 'Apple', 'Avocado' ]
```
In the second example, `^` is used both for matching at the beginning of input and for creating a negated or complemented character class when used within [character classes](/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes).
```js
const fruits = ["Apple", "Watermelon", "Orange", "Avocado", "Strawberry"];
// Select fruits that do not start with 'A' using the /^[^A]/ regex.
// In this example, the '^' control symbol has two meanings:
// 1) Matching the beginning of the input
// 2) A negated or complemented character class: [^A]
// That is, it matches any character that is not listed inside the square brackets.
const fruitsStartsWithNotA = fruits.filter((fruit) => /^[^A]/.test(fruit));
console.log(fruitsStartsWithNotA); // [ 'Watermelon', 'Orange', 'Strawberry' ]
```
### Matching a word boundary
```js
const fruitsWithDescription = ["Red apple", "Orange orange", "Green Avocado"];
// Select descriptions that contain a word ending in 'en' or 'ed':
const enEdSelection = fruitsWithDescription.filter((descr) =>
  /(en|ed)\b/.test(descr),
);
console.log(enEdSelection); // [ 'Red apple', 'Green Avocado' ]
```
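To contrast `\b` with `\B`, here is a small sketch that records where each pattern matches in a made-up sentence. The sentence and the variable names are illustrative assumptions.

```js
// Hypothetical sample sentence.
const sentence = "at noon on Monday";

// \bon\b matches "on" only as a whole word (index 8).
const wholeWord = [...sentence.matchAll(/\bon\b/g)].map((m) => m.index);
console.log(wholeWord); // [ 8 ]

// \Bon matches "on" only when it does not start a word:
// inside "noon" (index 5) and inside "Monday" (index 12).
const insideWord = [...sentence.matchAll(/\Bon/g)].map((m) => m.index);
console.log(insideWord); // [ 5, 12 ]
```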
### Lookahead assertion
```js
// JS Lookahead assertion x(?=y)
const regex = /First(?= test)/g;
console.log("First test".match(regex)); // [ 'First' ]
console.log("First peach".match(regex)); // null
console.log("This is a First test in a year.".match(regex)); // [ 'First' ]
console.log("This is a First peach in a month.".match(regex)); // null
```
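A lookahead is also handy for extracting a value that must be followed by a specific suffix without capturing the suffix itself. The snippet below is a sketch under the assumption that we want only the numbers given in `px` from a hypothetical CSS-like string.

```js
// Hypothetical input; only values followed by "px" should be selected.
const styles = "width: 100px; margin: 20em; height: 50px";

// \d+ must be followed by "px", but "px" is not part of the match.
const pixelValues = styles.match(/\d+(?=px)/g);
console.log(pixelValues); // [ '100', '50' ]
```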
### Basic negative lookahead assertion
`/\d+(?!\.)/` matches a number only if it is not followed by a decimal point. `/\d+(?!\.)/.exec('3.141')` matches "141" but not "3".
```js
console.log(/\d+(?!\.)/g.exec("3.141")); // [ '141', index: 2, input: '3.141' ]
```
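One caveat: because `\d+` can backtrack to a shorter run of digits, the pattern above can still match part of a number that is followed by a decimal point. The sketch below shows this with a made-up input and one possible adjustment (also excluding a following digit). The `stricter` pattern is an illustrative assumption, not the only way to handle this.

```js
const loose = /\d+(?!\.)/;

// "31" is followed by ".", so the engine backtracks to "3",
// which is followed by "1" rather than "." and therefore matches.
console.log(loose.exec("31.41")[0]); // 3

// Also rejecting a following digit stops the match from ending mid-number.
const stricter = /\d+(?![\d.])/;
console.log(stricter.exec("31.41")[0]); // 41
console.log(stricter.exec("3.141")[0]); // 141
```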
### Different meaning of '?!' combination usage in assertions and character classes
The `?!` combination has different meanings in assertions like `/x(?!y)/` and [character classes](/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes) like `[^?!]`.
```js
const orangeNotLemon =
  "Do you want to have an orange? Yes, I do not want to have a lemon!";
// Different meaning of the '?!' combination in assertions /x(?!y)/ and character classes /[^?!]/
const selectNotLemonRegex = /[^?!]+have(?! a lemon)[^?!]+[?!]/gi;
console.log(orangeNotLemon.match(selectNotLemonRegex)); // [ 'Do you want to have an orange?' ]
const selectNotOrangeRegex = /[^?!]+have(?! an orange)[^?!]+[?!]/gi;
console.log(orangeNotLemon.match(selectNotOrangeRegex)); // [ ' Yes, I do not want to have a lemon!' ]
```
### Lookbehind assertion
```js
const oranges = ["ripe orange A", "green orange B", "ripe orange C"];
const ripeOranges = oranges.filter((fruit) => /(?<=ripe )orange/.test(fruit));
console.log(ripeOranges); // [ 'ripe orange A', 'ripe orange C' ]
```
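Negative lookbehind works the same way in reverse. The sketch below reuses the `/(?<!-)\d+/` pattern from the table above and, as an assumption for illustration, adds a stricter variant for multi-digit numbers, since the simple pattern can match the trailing digits of a negative number.

```js
const simple = /(?<!-)\d+/;
console.log(simple.exec("3")[0]); // 3
console.log(simple.exec("-3")); // null

// With several digits, the "4" is rejected (preceded by "-"),
// but the "2" is preceded by "4", so it still matches.
console.log(simple.exec("-42")[0]); // 2

// Hypothetical stricter variant: also reject a preceding digit.
const strict = /(?<![-\d])\d+/;
console.log(strict.exec("-42")); // null
console.log(strict.exec("17")[0]); // 17
```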
## See also
- [Regular expressions](/en-US/docs/Web/JavaScript/Guide/Regular_expressions) guide
- [Character classes](/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes) guide
- [Quantifiers](/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Quantifiers) guide
- [Groups and backreferences](/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Groups_and_backreferences) guide
- [`RegExp`](/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp)