Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Browse files

Renamed readme to readme.markdown. Added regex and test page with tests

  • Loading branch information...
commit c50a56faa99701187d25b8b7ac922aaee176e1af 1 parent 8951bc9
@chadmyers authored
Showing with 218 additions and 0 deletions.
  1. +31 −0 README.markdown
  2. +187 −0 start.html
View
31 README.markdown
@@ -0,0 +1,31 @@
+UrlRegex
+========
+
+_UrlRegex_ is a regex pattern designed originally by John Gruber for detecting
+URLs that occur among a blog of wild and crazy human-entered text.
+
+It's not perfect, but the intent is to handle most common and uncommon scenarios.
+
+Getting Started
+---------------
+Just clone the source and open the start.html in your favorite browser to see the Regex in action.
+
+Credits
+-------
+[John Gruber](http://daringfireball.net) is the originator of this regex.
+You can read his
+[blog post](http://daringfireball.net/2010/07/improved_regex_for_matching_urls)
+for more background on how he developed it.
+
+According to his blog post, he has released his original regex as public domain.
+
+This project is also released as public domain, as-is, no warranties implied or
+expressed, etc. None of the commiters of this project are responsible for any
+harm (real or imagined) that may occur from the use of this pattern, code, or
+anything in this respository.
+
+You may use this pattern and all code in this repository for all purposes except
+committing evil (crimes, harm, etc). You do not have to attribute your usage
+to anyone, though we think that you should probably give John Gruber a credit
+for his work in this area. At the least, send him an email or do a blog post
+and give him some link love.
View
187 start.html
@@ -0,0 +1,187 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html lang="en">
+ <head>
+ <meta charset="UTF-8" />
+ <title>UrlRegex</title>
+ <style type="text/css">
+ html,body{
+ color: black;
+ }
+
+ body{
+ background-color: white;
+ font: 14px helvetica,arial,freesans,clean,sans-serif;
+ *font-size: small;
+ }
+
+ .test-set {
+ border: 1px solid black;
+ margin: 5px;
+ }
+
+ .test-set div{
+ padding: 5px;
+ }
+
+ .test-wrapper {
+ background-color: #DDD;
+ font-weight: bold;
+ }
+
+ span {
+ font-weight: normal;
+ font-style: italics;
+ }
+
+ .failure {
+ background-color: red;
+ }
+
+ #successes{
+ color: green;
+ font-weight: bold;
+ }
+
+ #failures{
+ color: red;
+ font-weight: bold;
+ }
+
+ textarea {
+ display: block;
+ width: 100%;
+ }
+ </style>
+ <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js"></script>
+ <script type="text/javascript">
+
+ // This is the money right here. This is the line you want to copy.
+ var urlReg = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/ig;
+
+ // Tests -- each test case is an array part of [example, expected]
+ var urlTests = [
+ ['http://www.RegExr.com',
+ 'http://www.RegExr.com'],
+ ['http://RegExr.com',
+ 'http://RegExr.com'],
+ ['http://www.RegExr.com?2rjl6',
+ 'http://www.RegExr.com?2rjl6'],
+ ['http://www.RegExr.com?2rjl6&x=99&498',
+ 'http://www.RegExr.com?2rjl6&x=99&498'],
+ ['http://www.RegExr.com?2rjl6(v85)&x=99&498',
+ 'http://www.RegExr.com?2rjl6(v85)&x=99&498'],
+ ['http://RegExr.com#foo',
+ 'http://RegExr.com#foo'],
+ ['http://RegExr.com#foo(bar)',
+ 'http://RegExr.com#foo(bar)'],
+ ['This is my website (http://www.microsoft.com)',
+ 'http://www.microsoft.com'],
+ ['This is my website [http://www.microsoft.com]',
+ 'http://www.microsoft.com'],
+ ['http://foo.com/blah_blah',
+ 'http://foo.com/blah_blah'],
+ ['http://foo.com/blah_blah/',
+ 'http://foo.com/blah_blah/'],
+ ['(Something like http://foo.com/blah_blah)',
+ 'http://foo.com/blah_blah'],
+ ['http://foo.com/blah_blah_(wikipedia)',
+ 'http://foo.com/blah_blah_(wikipedia)'],
+ ['http://foo.com/more_(than)_one_(parens)',
+ 'http://foo.com/more_(than)_one_(parens)'],
+ ['(Something like http://foo.com/blah_blah_(wikipedia))',
+ 'http://foo.com/blah_blah_(wikipedia)'],
+ ['http://foo.com/blah_(wikipedia)#cite-1',
+ 'http://foo.com/blah_(wikipedia)#cite-1'],
+ ['http://foo.com/blah_(wikipedia)_blah#cite-1',
+ 'http://foo.com/blah_(wikipedia)_blah#cite-1'],
+ ['http://foo.com/unicode_(✪)_in_parens',
+ 'http://foo.com/unicode_(✪)_in_parens'],
+ ['http://foo.com/(something)?after=parens',
+ 'http://foo.com/(something)?after=parens'],
+ ['http://foo.com/blah_blah.',
+ 'http://foo.com/blah_blah'],
+ ['http://foo.com/blah_blah/.',
+ 'http://foo.com/blah_blah/'],
+ ['<http://foo.com/blah_blah>',
+ 'http://foo.com/blah_blah'],
+ ['<http://foo.com/blah_blah/>',
+ 'http://foo.com/blah_blah/'],
+ ['http://foo.com/blah_blah,',
+ 'http://foo.com/blah_blah'],
+ ['http://www.extinguishedscholar.com/wpglob/?p=364.',
+ 'http://www.extinguishedscholar.com/wpglob/?p=364'],
+ ['http://✪df.ws/1234',
+ 'http://✪df.ws/1234'],
+ ['rdar://1234',
+ 'rdar://1234'],
+ ['rdar:/1234',
+ 'rdar:/1234'],
+ ['x-yojimbo-item://6303E4C1-6A6E-45A6-AB9D-3A908F59AE0E',
+ 'x-yojimbo-item://6303E4C1-6A6E-45A6-AB9D-3A908F59AE0E'],
+ ['message://%3c330e7f840905021726r6a4ba78dkf1fd71420c1bf6ff@mail.gmail.com%3e',
+ 'message://%3c330e7f840905021726r6a4ba78dkf1fd71420c1bf6ff@mail.gmail.com%3e'],
+ ['http://➡.ws/䨹',
+ 'http://➡.ws/䨹'],
+ ['www.c.ws/䨹',
+ 'www.c.ws/䨹'],
+ ['<tag>http://example.com</tag>',
+ 'http://example.com'],
+ ['Just a www.example.com link.',
+ 'www.example.com'],
+ ['http://example.com/something?with,commas,in,url, but not at end',
+ 'http://example.com/something?with,commas,in,url'],
+ ['What about <mailto:gruber@daringfireball.net?subject=TEST> (including brokets).',
+ 'mailto:gruber@daringfireball.net?subject=TEST'],
+ ['mailto:name@example.com',
+ 'mailto:name@example.com'],
+ ['bit.ly/foo',
+ 'bit.ly/foo'],
+ ['“is.gd/foo/”',
+ 'is.gd/foo/'],
+ ['WWW.EXAMPLE.COM',
+ 'WWW.EXAMPLE.COM'],
+ ['http://www.asianewsphoto.com/(S(neugxif4twuizg551ywh3f55))/Web_ENG/View_DetailPhoto.aspx?PicId=752',
+ 'http://www.asianewsphoto.com/(S(neugxif4twuizg551ywh3f55))/Web_ENG/View_DetailPhoto.aspx?PicId=752'],
+ ['http://www.asianewsphoto.com/(S(neugxif4twuizg551ywh3f55))',
+ 'http://www.asianewsphoto.com/(S(neugxif4twuizg551ywh3f55))'],
+ ['http://lcweb2.loc.gov/cgi-bin/query/h?pp/horyd:@field(NUMBER+@band(thc+5a46634))',
+ 'http://lcweb2.loc.gov/cgi-bin/query/h?pp/horyd:@field(NUMBER+@band(thc+5a46634))']
+ ];
+
+ $(document).ready(function(){
+ $.each(urlTests, function () { testUrl(this[0], this[1]); });
+ $("#regextext").val(urlReg.source);
+ $("#successes").text( $(".match:not(.failure)").length );
+ $("#failures").text( $(".match.failure").length );
+ });
+
+ function testUrl(testText, expected) {
+ urlReg.lastIndex = 0;
+ var regexMatch = urlReg.exec(testText);
+
+ if (regexMatch != null) {
+ regexMatch = regexMatch[0];
+ }
+
+ var div = $('<div class="test-set"><div class="test-wrapper">Test: <span class="test"></span></div><div class="match-wrapper">Match: <span class="match"></span></div></div>');
+
+ $(".test", div).text(testText);
+ $(".match", div).text(regexMatch);
+
+ if( regexMatch != expected ){
+ $(".match", div).addClass("failure");
+ }
+
+ div.appendTo($("body"));
+
+ }
+ </script>
+ </head>
+ <body>
+ <h1>Test Cases</h1>
+ The regex:
+ <textarea id="regextext" readonly="true"></textarea>
+ <h2>Successes: <span id="successes">0</span> Failures: <span id="failures">0</span></h2>
+ </body>
+</html>
Please sign in to comment.
Something went wrong with that request. Please try again.