Added notes about CC domains to readme. #6

merged 1 commit into from Oct 10, 2013

chmac commented Sep 27, 2013

I wanted to open a ticket, but I think issues are disabled on your branch, so I made a pull request instead. :-)

I'm using the Regex class (having made much of its internals public) to parse tweets and figure out whether they will exceed 140 characters after Twitter's URL munging. There are a couple of differences between this class and (I'm assuming) Twitter's own classes. Specifically, I've found differences in the parsing of top-level domains. Domains like will not be auto linked by, but will be by this class. Second level domains like will be linked by and by this class. But nonexistent domains like chmac.github.pp will not be linked by, but will be by the class.
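
To illustrate the length check described above, here is a minimal Python sketch. It is not the Regex class from this repository: the URL pattern is deliberately naive, and `SHORT_URL_LENGTH`, `estimated_length`, and `fits` are hypothetical names. The value 23 is an assumption about how long a link is after shortening and should be replaced with whatever Twitter currently reports.

```python
import re

# Assumption: every link occupies a fixed number of characters after shortening.
# 23 is a placeholder value, not taken from this pull request.
SHORT_URL_LENGTH = 23

# Deliberately naive URL pattern, for illustration only.
URL_PATTERN = re.compile(r"https?://\S+")


def estimated_length(text, short_url_length=SHORT_URL_LENGTH):
    """Estimate the tweet length once each URL is replaced by a short link."""
    urls = URL_PATTERN.findall(text)
    # Remove the original URL lengths, then add the fixed short-link length per URL.
    return len(text) - sum(len(u) for u in urls) + short_url_length * len(urls)


def fits(text, limit=140):
    """Return True if the estimated length is within the limit."""
    return estimated_length(text) <= limit
```

The accuracy of such an estimate depends entirely on the URL pattern matching the same spans that Twitter links, which is why the TLD differences described above matter.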

I'm guessing Twitter's approach is to embed a list of all the valid TLDs into the code. Probably a lot of work, and possibly not worth the effort. However, mirroring Twitter's behaviour around first and second level CC domains would be useful for my use case at least. If I ever get my head around the regexes, I'll try to submit a pull request that does that, or maybe adds it as an option.
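
A rough Python sketch of the embedded-list idea, under stated assumptions: `VALID_TLDS` is a tiny illustrative subset rather than a real TLD list, and `DOMAIN_PATTERN` and `linkable_domains` are hypothetical names. This is a guess at the approach, not Twitter's actual implementation.

```python
import re

# Illustrative subset only; a real embedded list would be far longer.
VALID_TLDS = {"com", "org", "net", "uk", "de"}

# One or more dot-terminated labels followed by a final alphabetic label (the TLD).
DOMAIN_PATTERN = re.compile(r"\b((?:[a-z0-9-]+\.)+)([a-z]{2,})\b", re.IGNORECASE)


def linkable_domains(text, valid_tlds=VALID_TLDS):
    """Return bare domains in text whose final label is in the allowed TLD set."""
    return [
        m.group(0)
        for m in DOMAIN_PATTERN.finditer(text)
        if m.group(2).lower() in valid_tlds
    ]
```

With this filter, `linkable_domains("try chmac.github.pp or example.com")` keeps `example.com` and drops `chmac.github.pp`, because `pp` is not in the set.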

Thanks for saving me a lot of time writing regexes yesterday. :-)

@mzsanford mzsanford merged commit d7b7aae into mzsanford:master Oct 10, 2013