Skip to content

Regular expressions matching different classes of unicode characters and some related utility functions

License

Notifications You must be signed in to change notification settings

One-com/unicoderegexp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

unicoderegexp

Various regular expressions for unicode character classes (letter, punctuation, number, etc.) and helper functions for composing them.

Used by the purify library.

The module exports a bunch of useful RegExps each with a single character class in them:

  • letter
  • mark
  • number
  • punctuation
  • symbol
  • separator
  • other
  • visible
  • printable
unicodeRegExp.visible.test("a"); // true
unicodeRegExp.visible.test(" "); // false
unicodeRegExp.visible.test("\u00a0"); // false -- a non-breaking space is not visible

To validate an entire string you need to build a new RegExp:

var visibleStringRegExp = new RegExp('^' + unicodeRegExp.visible.source + '*$');
visibleStringRegExp.test("foobar"); // true
visibleStringRegExp.test("foo bar"); // false because of the space

unicodeRegExp.removeCharacterFromCharacterClassRegExp(/[æøå]/, 'æ'); // /[\u00f8\u00e5]/
unicodeRegExp.spliceCharacterClassRegExps(/[a-b]/, /[c-d]/); // /[a-bc-d]/

The info about which characters belong to which classes was taken from the XRegExp library and its Unicode plugin.

License

Unicoderegexp is licensed under a standard 3-clause BSD license -- see the LICENSE file for details.

About

Regular expressions matching different classes of unicode characters and some related utility functions

Resources

License

Stars

Watchers

Forks

Packages

No packages published