Japanese edge cases, meet your match.

commit fd5d855b922a58c18b546fcfc293fd29d23e3468 1 parent d672eb6
@mzsanford authored
Showing with 33 additions and 2 deletions.
  1. +2 −0  README.md
  2. +4 −1 autolink.yml
  3. +27 −1 extract.yml
2  README.md
@@ -45,6 +45,8 @@ If you are creating a new twitter-text library in a different programming langua
* [FIX] Japanese autolink including long vowel mark (chouon)
* [FIX] Japanese autolink after a full-width exclamation point
* [FIX] Japanese autolink including ideographic iteration mark
+ * [FIX] Add hashtag extraction with indices test for new language hashtags
+ * [FIX] Add hashtag extraction with indices test for multiple latin hashtags
* v1.4.2 - 2011-07-08 [ Git tag v1.4.2 ]
* [FIX] Additional Japanese hashtag autolinking tests
5 autolink.yml
@@ -347,11 +347,14 @@ tests:
text: "できましたよー!#日本語ハッシュタグ。"
expected: "できましたよー!<a href=\"http://twitter.com/search?q=%23日本語ハッシュタグ\" title=\"#日本語ハッシュタグ\" class=\"tweet-url hashtag\">#日本語ハッシュタグ</a>。"
-
- description: "Autolink a hashtag containing ideographic iteration mark"
text: "#云々"
expected: "<a href=\"http://twitter.com/search?q=%23云々\" title=\"#云々\" class=\"tweet-url hashtag\">#云々</a>"
+ - description: "Autolink multiple hashtags in multiple languages"
+ text: "Hashtags in #中文, #日本語, #한국말, and #русский! Try it out!"
+ expected: "Hashtags in <a href=\"http://twitter.com/search?q=%23中文\" title=\"#中文\" class=\"tweet-url hashtag\">#中文</a>, <a href=\"http://twitter.com/search?q=%23日本語\" title=\"#日本語\" class=\"tweet-url hashtag\">#日本語</a>, <a href=\"http://twitter.com/search?q=%23한국말\" title=\"#한국말\" class=\"tweet-url hashtag\">#한국말</a>, and <a href=\"http://twitter.com/search?q=%23русский\" title=\"#русский\" class=\"tweet-url hashtag\">#русский</a>! Try it out!"
+
urls:
- description: "Autolink URL with pipe character"
text: "text http://example.com/pipe|character?yes|pipe|character"
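
The multi-language autolink test added in this hunk can be reproduced with a minimal sketch like the one below. It is not twitter-text's implementation: the `HASHTAG` pattern and the `autolink_hashtags` helper are hypothetical names, and the pattern is a deliberate simplification that treats any run of Unicode word characters after `#` as a hashtag. The anchor markup copies the `expected` strings in the fixture, where only the `#` is written as `%23` and the tag text itself is left unescaped.

```python
import re

# Assumption: a simplified pattern. In Python 3, \w on str matches Unicode
# letters, digits, and underscore, which covers CJK, Hangul, and Cyrillic.
# The real twitter-text rules are stricter (for example about what may
# precede the '#').
HASHTAG = re.compile(r"#(\w+)")

# Anchor template copied from the fixture's expected output.
TEMPLATE = (
    '<a href="http://twitter.com/search?q=%23{0}" title="#{0}" '
    'class="tweet-url hashtag">#{0}</a>'
)


def autolink_hashtags(text: str) -> str:
    """Wrap every hashtag in text in the anchor markup used by the fixture."""
    return HASHTAG.sub(lambda m: TEMPLATE.format(m.group(1)), text)


if __name__ == "__main__":
    print(autolink_hashtags("Hashtags in #中文, #日本語, #한국말, and #русский! Try it out!"))
```

Because the sketch does no HTML escaping and no context checks, it is only suitable for illustrating the fixture format, not for rendering untrusted text.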
28 extract.yml
@@ -445,9 +445,35 @@ tests:
- hashtag: "hashtag"
indices: [7, 15]
- - description: "Extract a hastag in a string of multi-byte characters"
+ - description: "Extract a hashtag in a string of multi-byte characters"
text: "会議中 #hashtag 会議中"
expected:
- hashtag: "hashtag"
indices: [4, 12]
+ - description: "Extract multiple valid hashtags"
+ text: "One #two three #four"
+ expected:
+ - hashtag: "two"
+ indices: [4, 8]
+ - hashtag: "four"
+ indices: [15, 20]
+
+ - description: "Extract a non-latin hashtag"
+ text: "Hashtags in #русский!"
+ expected:
+ - hashtag: "русский"
+ indices: [12, 20]
+
+ - description: "Extract multiple non-latin hashtags"
+ text: "Hashtags in #中文, #日本語, #한국말, and #русский! Try it out!"
+ expected:
+ - hashtag: "中文"
+ indices: [12, 15]
+ - hashtag: "日本語"
+ indices: [17, 21]
+ - hashtag: "한국말"
+ indices: [23, 27]
+ - hashtag: "русский"
+ indices: [33, 41]
+
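
The `indices` values in these fixtures count characters, not bytes, with an inclusive start and an exclusive end that covers the leading `#`. The sketch below shows one way to produce the same pairs. It assumes a simplified `#\w+` pattern rather than twitter-text's actual hashtag rules, and `extract_hashtags_with_indices` is a hypothetical helper name. Python `str` offsets are code-point offsets, which is what the expected values here correspond to.

```python
import re

# Assumption: simplified hashtag pattern, not the library's real regex.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags_with_indices(text: str) -> list:
    """Return each hashtag with [start, end) character offsets.

    The start offset points at the '#', and the end offset is one past the
    last character of the tag, matching the fixture format.
    """
    return [
        {"hashtag": m.group(1), "indices": [m.start(), m.end()]}
        for m in HASHTAG.finditer(text)
    ]


if __name__ == "__main__":
    for sample in (
        "One #two three #four",
        "Hashtags in #русский!",
        "会議中 #hashtag 会議中",
    ):
        print(extract_hashtags_with_indices(sample))
```

For `"会議中 #hashtag 会議中"` this yields `[4, 12]`, as in the fixture. Measured in UTF-8 bytes the same tag would start at offset 10, because each of the three leading characters occupies three bytes, so an implementation that reports byte offsets would fail these tests.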