Merge pull request #22 from twitter/hashtag_more_char

Add test cases with hashtags containing Japanese Ditto mark and Turkish 'i'
keitaf committed Mar 27, 2012
2 parents e3c3e45 + 6b7a34a, commit eba86473df34b7645a2797a5b3f614e96957d06c
Showing with 6 additions and 2 deletions.
  1. +6 −2 extract.yml
8 extract.yml
@@ -682,13 +682,17 @@ tests:
       expected: ["日本語ハッシュタグ"]
     - description: "Hashtag with ideographic iteration mark"
-      text: "#云々 #学問のすゝめ #いすゞ #各〻"
-      expected: ["云々", "学問のすゝめ", "いすゞ", "各〻"]
+      text: "#云々 #学問のすゝめ #いすゞ #各〻 #〃"
+      expected: ["云々", "学問のすゝめ", "いすゞ", "各〻", "〃"]
     - description: "Hashtags with ş (U+015F)"
       text: "Here’s a test tweet for you: #Ateş #qrşt #ştu #ş"
       expected: ["Ateş", "qrşt", "ştu", "ş"]
+    - description: "Hashtags with İ (U+0130) and ı (U+0131)"
+      text: "Here’s a test tweet for you: #İn #ın"
+      expected: ["İn", "ın"]
+
     - description: "Hashtag before punctuations"
       text: "#hashtag: #hashtag; #hashtag, #hashtag. #hashtag! #hashtag?"
       expected: ["hashtag", "hashtag", "hashtag", "hashtag", "hashtag", "hashtag"]
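
The cases in this hunk can be exercised with a small script. What follows is only a rough sketch in Python: `HASHTAG_RE`, `extract_hashtags`, and `CASES` are hypothetical names, and the pattern is a deliberately simplified assumption, not the regex twitter-text uses. It shows why the ditto mark needs separate handling: the iteration marks 々 ゝ ゞ 〻 are Unicode modifier letters and fall under `\w`, while 〃 (U+3003) is punctuation and has to be listed explicitly.

```python
import re

# Assumption: a simplified pattern for illustration only, not the regex
# that twitter-text itself uses.
# In Python 3, \w matches Unicode word characters, which covers
# İ (U+0130), ı (U+0131), ş (U+015F) and the iteration marks 々 ゝ ゞ 〻.
# The ditto mark 〃 (U+3003) is punctuation, so it is added explicitly.
HASHTAG_RE = re.compile(r"(?<!\w)#([\w\u3003]+)")


def extract_hashtags(text):
    """Return hashtag bodies (without the leading '#') in order of appearance."""
    return HASHTAG_RE.findall(text)


# Hypothetical mini-runner over the cases this commit adds or changes.
CASES = [
    ("#云々 #学問のすゝめ #いすゞ #各〻 #〃",
     ["云々", "学問のすゝめ", "いすゞ", "各〻", "〃"]),
    ("Here’s a test tweet for you: #İn #ın", ["İn", "ın"]),
]

for text, expected in CASES:
    assert extract_hashtags(text) == expected
```

Dropping `\u3003` from the character class makes the first case fail on its last hashtag, which is the gap the extended test is meant to catch.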
