Add test case for 2 letter top level domains #59
Good sleuthing! The signal-to-noise ratio in tweets often leans towards noise, so we only autolink if we're mostly certain it's a URL; otherwise the text could be an emoticon or internet meme, or have meaning in another language. Some domains, like .com, are treated as especially strong signals. For now, you're right, we should have clear tests. Later we can revisit which domains should link and how changes would affect a sample set of tweets.
chmac commented Sep 27, 2013
A little more sleuthing later, it looks like on both the API and the javascript frontend on twitter.com, the following happens:
There's presumably a list of valid TLDs in the JS and other codebases. It's probably possible to derive tests from that. I'd suggest that domains like
chmac commented Sep 26, 2013
I've been testing a little by posting to Twitter, and domains like neustar.us and cal.io and github.io are not auto linked. However, domains like chmac.com and army.mil are. These differences are not covered in the test cases.

I tried to dig into the JavaScript implementation to see what the actual behaviour is, but I spent enough time on PHP regexes today, maybe another day!

I just did another test and a domain foo.dd.uk is auto linked, even though it's a nonexistent second level domain. Meanwhile gov.uk, which is a valid domain, is not linked, but www.gov.uk is.

I'd hazard a guess and say that any 2 letter top level domain (a.xx) is not linked, while domains like a.xxx.xx or a.xx.xx are linked. Ironically, t.co is not linked!
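
To make that guess concrete, here is a rough Python sketch that writes the hypothesis as a function and checks it against the domains tried in this issue. The function name `guess_autolinked` and the `OBSERVED` table are made up for illustration. This is not the actual twitter-text logic, just the rule guessed at above: a bare `name.xx` domain with a 2 letter TLD is not linked, anything else is.

```python
# Observations reported in this issue: domain -> was it auto linked?
OBSERVED = {
    "neustar.us": False,
    "cal.io": False,
    "github.io": False,
    "chmac.com": True,
    "army.mil": True,
    "foo.dd.uk": True,
    "gov.uk": False,
    "www.gov.uk": True,
    "t.co": False,
}


def guess_autolinked(domain: str) -> bool:
    """Hypothetical rule guessed in this issue, NOT the real implementation."""
    labels = domain.lower().split(".")
    tld = labels[-1]
    # Guess: exactly two labels plus a 2 letter TLD means "not linked".
    is_bare_two_letter = len(labels) == 2 and len(tld) == 2
    return not is_bare_two_letter


def mismatches() -> dict:
    """Return observed domains where the guess disagrees with what was seen."""
    return {
        domain: linked
        for domain, linked in OBSERVED.items()
        if guess_autolinked(domain) != linked
    }


if __name__ == "__main__":
    # An empty dict means the guess is consistent with every observation.
    print(mismatches())
```

If the guess holds, each entry in the table could become one test case, with the expected result taken from what twitter.com actually did rather than from this function.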