Now possible to search by pinyin (without tones) and by Traditional Chinese

1 parent 07b5ce7 commit 96db7e83fbcd1404d5411ac343e8b6ae3ecee08a @bastien committed Apr 10, 2010
Showing 3 changed files with 16 additions and 1 deletion.
  1. +4 −0 README.mkd
  2. +2 −1 lib/zidian.rb
  3. +10 −0 test/test_zidian.rb
README.mkd
@@ -19,6 +19,10 @@ Examples of use
Zidian.find("文") # returns all the words that contain "文"
Zidian.find([653,34]) # returns the 2 words corresponding to the given ids
+
+ Zidian.find("wei2 cheng2") # returns the words corresponding to the given pinyin with tones
+
+ Zidian.find("wei cheng") # returns the words corresponding to the given pinyin without tones
Author: Bastien Vaucher
Version: 0.1.1
lib/zidian.rb
@@ -17,8 +17,9 @@ def self.find(expression)
protected
def self.find_word(word) #:nodoc:
+ words = word.split.map{|w| "#{w}[1-4]?"}.join(" ")
# adding the -i option allows to search independently from the case, but it makes it very slow
- `less #{File.dirname(__FILE__)}/cedict_ts.u8 | grep -n '[/\s]#{word.gsub(/\s/,"\s")}[/\s]'`
+ `less #{File.dirname(__FILE__)}/cedict_ts.u8 | grep -n -E '(^|[^a-zA-Z])#{words}($|[^a-zA-Z])'`
end
def self.get_line(line_number) #:nodoc:
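
Note on the hunk above: the new `find_word` appends an optional tone digit (`[1-4]?`) to each whitespace-separated syllable and requires a non-letter (or line boundary) on both sides of the match. A minimal sketch of that pattern construction, runnable without the dictionary file, might look like the following. The helper name `pinyin_pattern`, the constant `SAMPLE_LINE`, and the sample entry itself are hypothetical and only illustrate the format; the sketch also uses Ruby's `Regexp` rather than `grep -E`, which is assumed to behave the same for this simple pattern.

```ruby
# Hypothetical helper mirroring the pattern built in find_word:
# each syllable gets an optional tone digit 1-4, and the whole phrase
# must be bounded by non-letters or by the start/end of the line.
def pinyin_pattern(query)
  syllables = query.split.map { |w| "#{w}[1-4]?" }.join(" ")
  "(^|[^a-zA-Z])#{syllables}($|[^a-zA-Z])"
end

# Illustrative entry in the "Traditional Simplified [pinyin] /gloss/" layout;
# assumed for this sketch, not copied from cedict_ts.u8.
SAMPLE_LINE = "圍城 围城 [wei2 cheng2] /siege/besieged city/"

# Returns true when the query's pattern matches the sample line.
def matches_sample?(query)
  Regexp.new(pinyin_pattern(query)).match?(SAMPLE_LINE)
end

puts matches_sample?("wei cheng")   # pinyin without tones
puts matches_sample?("wei2 cheng2") # pinyin with tones
puts matches_sample?("wei4 cheng2") # wrong tone on the first syllable
puts matches_sample?("ei cheng")    # query starting mid-syllable
```

The first two queries match the sample entry and the last two do not: an explicit tone in the query must agree with the entry, and the leading `[^a-zA-Z]` guard rejects a query that begins in the middle of a syllable.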
test/test_zidian.rb
@@ -29,6 +29,16 @@ def test_find_word_from_string
assert_equal("guai3", words.last.pinyin)
end
+ def test_find_word_from_pinyin
+ words = Zidian.find("wei cheng")
+ assert_equal("围城", words.first.simplified)
+ end
+
+ def test_find_word_from_pinyin_marked
+ words = Zidian.find("wei2 cheng2")
+ assert_equal("siege", words.first.english.first)
+ end
+
def test_raise_when_invalid_input_type
assert_raise(Zidian::InvalFindInputException) do
Zidian.find(:shanghai)
