Merge pull request #22 from twitter/hashtag_more_char

Add test cases with hashtags containing Japanese Ditto mark and Turkish 'i'
commit eba86473df34b7645a2797a5b3f614e96957d06c 2 parents e3c3e45 + 6b7a34a
Keita Fujii keitaf authored

Showing 1 changed file with 6 additions and 2 deletions.

extract.yml (+6 -2)
@@ -682,13 +682,17 @@ tests:
       expected: ["日本語ハッシュタグ"]
 
     - description: "Hashtag with ideographic iteration mark"
-      text: "#云々 #学問のすゝめ #いすゞ #各〻"
-      expected: ["云々", "学問のすゝめ", "いすゞ", "各〻"]
+      text: "#云々 #学問のすゝめ #いすゞ #各〻 #〃"
+      expected: ["云々", "学問のすゝめ", "いすゞ", "各〻", "〃"]
 
     - description: "Hashtags with ş (U+015F)"
       text: "Here’s a test tweet for you: #Ateş #qrşt #ştu #ş"
       expected: ["Ateş", "qrşt", "ştu", "ş"]
 
+    - description: "Hashtags with İ (U+0130) and ı (U+0131)"
+      text: "Here’s a test tweet for you: #İn #ın"
+      expected: ["İn", "ın"]
+
     - description: "Hashtag before punctuations"
       text: "#hashtag: #hashtag; #hashtag, #hashtag. #hashtag! #hashtag?"
       expected: ["hashtag", "hashtag", "hashtag", "hashtag", "hashtag", "hashtag"]
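
The cases touched by this diff can be exercised with a small stand-alone check. The sketch below is only an illustration, not the regular expression any twitter-text implementation actually uses: `HASHTAG_SKETCH` and `extract_hashtags` are hypothetical names, and the pattern ignores the library's other rules (for example, what may precede the `#`). It shows why the ditto mark likely needs its own case: Python's `\w` covers the iteration marks 々, ゝ, ゞ and 〻 (Unicode category Lm) and the Turkish İ and ı, but not 〃 (U+3003), which Unicode classes as punctuation.

```python
import re

# Hypothetical pattern for illustration only, not twitter-text's real one:
# "#" followed by word characters, plus U+3003 (DITTO MARK), which \w
# does not match because it is punctuation (category Po), not a letter.
HASHTAG_SKETCH = re.compile(r"#([\w\u3003]+)")


def extract_hashtags(text: str) -> list:
    """Return hashtag bodies (without the leading '#') found in text."""
    return HASHTAG_SKETCH.findall(text)


# (text, expected) pairs copied from the test cases shown in the diff.
CASES = [
    ("#云々 #学問のすゝめ #いすゞ #各〻 #〃",
     ["云々", "学問のすゝめ", "いすゞ", "各〻", "〃"]),
    ("Here’s a test tweet for you: #İn #ın",
     ["İn", "ın"]),
    ("#hashtag: #hashtag; #hashtag, #hashtag. #hashtag! #hashtag?",
     ["hashtag"] * 6),
]

for text, expected in CASES:
    assert extract_hashtags(text) == expected, text

# Without the explicit U+3003, a plain \w pattern finds nothing in "#〃".
assert re.findall(r"#(\w+)", "#〃") == []
```

A pattern built only on `\w` would pass the original iteration-mark case and fail the extended one, which is the gap the added `#〃` input is meant to catch.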
