I wanted to open a ticket, but I think issues are disabled on your branch, so I made a pull request instead. :-)
I'm using the Regex class (having made much of its internals public) to parse tweets and figure out whether they're going to be longer than 140 characters after Twitter's URL munging. There are a couple of differences between this class and (I'm assuming) Twitter's own classes. Specifically, I've found differences in how top-level domains are parsed. Domains like github.io will not be auto-linked by twitter.com, but will be by this class. Second-level domains like chmac.github.io will be linked both by twitter.com and by this class. But nonexistent domains like chmac.github.pp will not be linked by twitter.com, but will be by this class.
I'm guessing Twitter's approach is to embed a list of all the valid TLDs in the code. That's probably a lot of work, and possibly not worth the effort. However, mirroring Twitter's behaviour around first- and second-level CC domains would be useful, for my use case at least. If I ever get my head around the regexes, I'll try to submit a pull request that does that, or maybe adds it as an option.
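To make the idea concrete, here's a rough sketch in Python of what a list-based check might look like. Everything in it is an assumption on my part: the function name `should_autolink` is made up, the TLD sets are a tiny illustrative subset, and the rule that a bare name on a country-code TLD is skipped is only my guess at what produces the three results described above. It is not Twitter's actual implementation, and it is not how this class works today.

```python
# Illustrative subsets only (assumption); a real list would need every valid TLD.
GENERIC_TLDS = {"com", "org", "net"}
COUNTRY_CODE_TLDS = {"io", "de", "uk"}


def should_autolink(host: str) -> bool:
    """Guess whether a protocol-less host would be auto-linked (hypothetical rule)."""
    labels = host.lower().rstrip(".").split(".")
    # Need at least "name.tld", and no empty labels such as in "a..b".
    if len(labels) < 2 or not all(labels):
        return False
    tld = labels[-1]
    if tld in GENERIC_TLDS:
        return True
    if tld in COUNTRY_CODE_TLDS:
        # Assumed rule: a bare "name.cc" host is skipped; a subdomain is required.
        return len(labels) >= 3
    # Unknown TLDs (for example "pp") are never linked.
    return False


for candidate in ("github.io", "chmac.github.io", "chmac.github.pp"):
    print(candidate, should_autolink(candidate))
```

With these assumptions, github.io and chmac.github.pp are rejected and chmac.github.io is accepted, which matches what I saw on twitter.com. A check like this could run on each host the regex matches, so the regexes themselves wouldn't need to change.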
Thanks for saving me a lot of time writing regexes yesterday. :-)
Added notes about CC domains to readme.