[XRegExp](http://xregexp.com/)
==============================

XRegExp provides augmented, extensible JavaScript regular expressions. You get new syntax, flags, and methods beyond what browsers support natively. XRegExp is also a regular expression utility belt with tools to make your client-side grepping simpler and more powerful, while freeing you from worrying about pesky cross-browser inconsistencies and the dubious `lastIndex` property.


## Usage examples

Note that these examples take advantage of new features in XRegExp v2.0.0-beta ([details](https://github.com/slevithan/XRegExp/wiki/Roadmap)).

~~~ js
// Using named capture and flag x (free-spacing and line comments)
var date = XRegExp('(?<year>  [0-9]{4}) -?  # year  \n\
                    (?<month> [0-9]{2}) -?  # month \n\
                    (?<day>   [0-9]{2})     # day   ', 'x');

// XRegExp.exec gives you named backreferences on the match result
var match = XRegExp.exec('2012-02-22', date);
match.day; // -> '22'

// It also includes optional pos and sticky arguments
var pos = 3, result = []; // index 3 is where '<2>' begins
while (match = XRegExp.exec('<1><2><3><4>5<6>', /<(\d+)>/, pos, 'sticky')) {
    result.push(match[1]);
    pos = match.index + match[0].length;
} // result -> ['2', '3', '4']

// XRegExp.replace allows named backreferences in replacements
XRegExp.replace('2012-02-22', date, '${month}/${day}/${year}'); // -> '02/22/2012'
XRegExp.replace('2012-02-22', date, function (match) {
    return match.month + '/' + match.day + '/' + match.year;
}); // -> '02/22/2012'

// In fact, all XRegExps are RegExps and work perfectly with native methods
date.test('2012-02-22'); // -> true

// The *only* caveat is that named captures must be referred to using numbered backreferences
'2012-02-22'.replace(date, '$2/$3/$1'); // -> '02/22/2012'

// If you want, you can extend native methods so you don't have to worry about this
// Doing so also fixes numerous browser bugs in the native methods
XRegExp.install('natives');
'2012-02-22'.replace(date, '${month}/${day}/${year}'); // -> '02/22/2012'
'2012-02-22'.replace(date, function (match) {
    return match.month + '/' + match.day + '/' + match.year;
}); // -> '02/22/2012'
date.exec('2012-02-22').day; // -> '22'

// Extract every other digit from a string using XRegExp.forEach
XRegExp.forEach("1a2345", /\d/, function (match, i) {
    if (i % 2) this.push(+match[0]);
}, []); // -> [2, 4]

// Get numbers within <b> tags using XRegExp.matchChain
XRegExp.matchChain('1 <b>2</b> 3 <b>4 a 56</b>', [
    XRegExp('(?is)<b>.*?<\\/b>'),
    /\d+/
]); // -> ['2', '4', '56']

// You can also pass forward and return specific backreferences
var html = '<a href="http://xregexp.com/">XRegExp</a>\
            <a href="http://www.google.com/">Google</a>';
XRegExp.matchChain(html, [
    {regex: /<a href="([^"]+)">/i, backref: 1},
    {regex: XRegExp('(?i)^https?://(?<domain>[^/?#]+)'), backref: 'domain'}
]); // -> ['xregexp.com', 'www.google.com']

// XRegExp regexes get call and apply methods
// To demonstrate, let's first create the function we'll be using...
function filter (array, fn) {
    var res = [];
    array.forEach(function (el) {if (fn.call(null, el)) res.push(el);});
    return res;
}
// Now we can filter arrays using functions and regexes
filter(['a', 'ba', 'ab', 'b'], XRegExp('^a')); // -> ['a', 'ab']
~~~
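
To make the `pos` and sticky arguments more concrete, here is a rough plain-JavaScript sketch of the same loop. It assumes an engine with native support for the sticky flag (`y`) and does not use XRegExp at all. The helper name `collectSticky` is hypothetical and exists only for this illustration.

```javascript
// Plain JavaScript, no XRegExp: assumes native support for the sticky flag (y).
// collectSticky is a hypothetical helper written for this illustration.
function collectSticky(str, source, pos) {
  var re = new RegExp(source, 'y');
  var result = [];
  var match;
  re.lastIndex = pos; // a sticky regex only matches exactly at lastIndex
  while ((match = re.exec(str)) !== null) {
    result.push(match[1]);
    // Guard against zero-length matches, which would otherwise loop forever
    if (match[0] === '') re.lastIndex += 1;
  }
  return result;
}

var stickyResult = collectSticky('<1><2><3><4>5<6>', '<(\\d+)>', 3);
// stickyResult -> ['2', '3', '4']
```

Starting at index 2 instead returns an empty array, because sticky matching fails as soon as the character at the start position does not begin a match.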

These examples should give you an idea of what's possible, but XRegExp has a lot more tricks that aren't shown here. You can even augment XRegExp's regular expression syntax with addons (see below) or write your own. For the full scoop, see [API](http://xregexp.com/api/), [syntax](http://xregexp.com/syntax/), [flags](http://xregexp.com/flags/), and [browser fixes](http://xregexp.com/cross_browser/).


## XRegExp Unicode Base

First include the Unicode Base script:

~~~ html
<script src="xregexp.js"></script>
<script src="addons/unicode/xregexp-unicode-base.js"></script>
~~~

Then you can do this:

~~~ js
var unicodeWord = XRegExp('^\\p{L}+$');
unicodeWord.test('Русский'); // -> true
unicodeWord.test('日本語'); // -> true
unicodeWord.test('العربية'); // -> true
~~~

The base script adds `\p{L}` (and its alias, `\p{Letter}`), but other Unicode categories, scripts, and blocks require addon packages. Try these next examples after additionally including `xregexp-unicode-scripts.js`:

~~~ js
XRegExp('^\\p{Hiragana}+$').test('ひらがな'); // -> true
XRegExp('^[\\p{Latin}\\p{Common}]+$').test('Über Café.'); // -> true
~~~
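
For comparison only, here is a sketch of similar checks in plain JavaScript, assuming an engine that supports ES2018 Unicode property escapes (the `u` flag). This is not part of XRegExp or its addons, the variable names are illustrative, and a native engine uses whatever Unicode version it ships with, so results for recently added characters may differ.

```javascript
// Plain JavaScript, no XRegExp: requires ES2018 Unicode property escapes.
// Scripts are written as \p{Script=...} in native syntax.
var nativeWord = /^\p{L}+$/u;
var nativeHiragana = /^\p{Script=Hiragana}+$/u;
var nativeLatinOrCommon = /^[\p{Script=Latin}\p{Script=Common}]+$/u;

var wordChecks = [
  nativeWord.test('Русский'),
  nativeWord.test('日本語'),
  nativeWord.test('العربية')
];
var hiraganaCheck = nativeHiragana.test('ひらがな');
// Escapes keep the accented letters precomposed regardless of file encoding
var latinCheck = nativeLatinOrCommon.test('\u00DCber Caf\u00E9.');
```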

XRegExp uses the Unicode 6.1 character database (released January 2012).

More details: [Addons: Unicode](http://xregexp.com/plugins/#unicode).


## XRegExp Match Recursive

First include the Match Recursive script:

~~~ html
<script src="xregexp.js"></script>
<script src="addons/xregexp-matchrecursive.js"></script>
~~~

You can then match recursive constructs using XRegExp patterns as left and right delimiters:

~~~ js
var str = '(t((e))s)t()(ing)';
XRegExp.matchRecursive(str, '\\(', '\\)', 'g');
// -> ['t((e))s', '', 'ing']

// Extended information mode with valueNames
str = 'Here is <div>a <div>nested</div> tag</div> example.';
XRegExp.matchRecursive(str, '<div\\s*>', '</div>', 'gi', {
    valueNames: ['between', 'left', 'match', 'right']
});
// -> [['between', 'Here is ',                0,  8],
//     ['left',    '<div>',                   8,  13],
//     ['match',   'a <div>nested</div> tag', 13, 36],
//     ['right',   '</div>',                  36, 42],
//     ['between', ' example.',               42, 51]]

// Omitting unneeded parts with null valueNames, and using escapeChar
str = '...{1}\\{{function(x,y){return y+x;}}';
XRegExp.matchRecursive(str, '{', '}', 'g', {
    valueNames: ['literal', null, 'value', null],
    escapeChar: '\\'
});
// -> [['literal', '...',                        0, 3],
//     ['value',   '1',                          4, 5],
//     ['literal', '\\{',                        6, 8],
//     ['value',   'function(x,y){return y+x;}', 9, 35]]

// Sticky mode via flag y
str = '<1><<<2>>><3>4<5>';
XRegExp.matchRecursive(str, '<', '>', 'gy');
// -> ['1', '<<2>>', '3']
~~~
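
To illustrate the underlying idea of matching balanced delimiters, here is a minimal plain-JavaScript sketch that tracks nesting depth with a counter. It is not how the addon is implemented. It handles only single-character literal delimiters, and the function name `matchBalanced` and its error behavior are assumptions made for this example.

```javascript
// Plain JavaScript, no XRegExp: depth-counter sketch for single-character
// delimiters. matchBalanced is a hypothetical helper, not an XRegExp API.
function matchBalanced(str, open, close) {
  var results = [];
  var depth = 0;
  var start = -1;
  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    if (ch === open) {
      if (depth === 0) start = i + 1; // content begins after the outer opener
      depth++;
    } else if (ch === close) {
      if (depth === 0) throw new Error('Unbalanced delimiter at index ' + i);
      depth--;
      if (depth === 0) results.push(str.slice(start, i)); // outermost pair closed
    }
  }
  if (depth !== 0) throw new Error('Unbalanced delimiter: missing ' + close);
  return results;
}

var balanced = matchBalanced('(t((e))s)t()(ing)', '(', ')');
// balanced -> ['t((e))s', '', 'ing']
```

Only the outermost pairs are reported, and nested delimiters stay inside the returned content, which mirrors the first `matchRecursive` example above.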

More details: [Addons: Match Recursive](http://xregexp.com/plugins/#matchRecursive).


## Changelog

* Historical changes: [Version history](http://xregexp.com/history/).
* Planned changes: [Roadmap](https://github.com/slevithan/XRegExp/wiki/Roadmap).


## About

XRegExp and addons copyright 2007-2012 by [Steven Levithan](http://stevenlevithan.com/).

Tools: Unicode range generators by [Mathias Bynens](http://mathiasbynens.be/). Source file concatenator by [Bjarke Walling](http://twitter.com/walling).

All code released under the [MIT License](http://opensource.org/licenses/mit-license.php).

Fork me to show support, fix, and extend.