since then my English improved

yuex · Aug 8, 2015 · c56a09a
1 parent 7f28a48
Showing 2 changed files with 35 additions and 25 deletions.
diff --git a/README.md b/README.md
@@ -1,16 +1,32 @@
 # CJK Auto Spacing
 
-A pelican plugin to insert spaces between Chinese/Japanese/Korean characters and English words.
+A pelican plugin to insert spaces between Chinese/Japanese/Korean characters
+and English words.
 
 # Why
 
-For Chinese readers, it's reading for torture rather than pleasure if Chinese characters and English words are put together without spaces. (See [Effects](#effects), there's a comparison)
+For Chinese readers, reading is torture rather than pleasure when Chinese
+characters and English words are mingled together without spaces. (See
+[Effects](#effects) for a comparison.)
 
-Moreover, research shows that those who love putting Chinese characters and English words together without space have more troubles in love (see [why space?][] in Chinese). Up to 70% marry one they don't love at 34 years old. And the other 30%, even worse, have nobody to inherit their legacies except cats.
+Moreover, research shows that those who love putting Chinese characters and
+English words together without space have more troubles in love ([why space?][]
+in Chinese). Up to 70% marry one they don't love at 34 years old. And the other
+30%, even worse, have nobody to inherit their legacies except cats.
 
-So I think it's not very hard to conclude the necessarity of space for Chinese users of pelican.
+So, to Chinese users, it is not hard to see the necessity of using spaces.
 
-**Note:** I don't speak or write Japanese or Korean, but I feel we can get the same conclusion. They have lots in common. And perhaps that's why they are called CJK Unified Ideographs, together as a whole, in the unicode standards.
+**Note**: I know nothing about Japanese and Korean, but I feel confident the
+same conclusion holds. They have a lot in common, and perhaps that's why
+they are called CJK Unified Ideographs together as a whole in the Unicode
+standard.
+
+# Options
+
+By default it will only process the content. To process the title as well,
+add the following line to your `pelicanconf.py`:
+
+    CJK_AUTO_SPACING_TITLE = True
 
 # Effects
 
@@ -28,12 +44,7 @@ With CJK Auto Spacing
 
 ![with spacing](./screenshot1.png)
 
-If you feel that the first image is fine, then read it again, again and again, until you feel it's not okay.
-
-# Options
-
-By default it will only process the content.
-
-You can set the ``CJK_AUTO_SPACING_TITLE`` parameter to True if you need the title to be processed as well. Default is False.
+If you see nothing wrong with the first picture, re-read it until you
+do.
 
 [why space?]: https://github.com/vinta/paranoid-auto-spacing
diff --git a/cjk_auto_spacing.py b/cjk_auto_spacing.py
@@ -18,9 +18,13 @@
 ]
 
 
-def _chinese_auto_spacing(str):
+def _chinese_auto_spacing(text):
 
     def _with_range(char, check_range):
+        # XXX: this kind of search would improve from O(n) to O(1) with a
+        # patricia trie instead of a list, but for a blog plugin that runs
+        # offline I don't think it matters much. Those who write a lot
+        # may want this improvement.
         for start, end in check_range:
             if char >= start and char <= end:
                 return True
@@ -35,7 +39,7 @@ def is_punc(char):
     ret = u''
     prev = None
 
-    for char in str:
+    for char in text:
         sp = u''
         curr_is_cjk = is_cjk(char)
         curr_is_punc = is_punc(char)
@@ -44,7 +48,7 @@ def is_punc(char):
             prev_is_cjk, prev_is_punc = prev
 
             if curr_is_punc or prev_is_punc:
-                # do not add space to punctuation
+                # do not add space around punctuation
                 sp = u''
 
             elif prev_is_cjk != curr_is_cjk:
@@ -53,27 +57,22 @@ def is_punc(char):
         ret = ret + sp + char
         prev = (curr_is_cjk, curr_is_punc)
 
-    if ret:
-        return ret
-    else:
-        return str
+    return ret
 
 
 def process_content(content):
-    if content._content == None:
+    if content._content is None:
         return
 
     content._content = _chinese_auto_spacing(content._content)
 
 
 def process_title(generator, metadata):
-    if 'CJK_AUTO_SPACING_TITLE' not in generator.settings:
+    if ('CJK_AUTO_SPACING_TITLE' not in generator.settings
+            or not generator.settings['CJK_AUTO_SPACING_TITLE']):
         return
 
-    if not generator.settings['CJK_AUTO_SPACING_TITLE']:
-        return
-
-    if metadata.has_key(u'title'):
+    if u'title' in metadata:
         metadata[u'title'] = _chinese_auto_spacing(metadata[u'title'])
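
A note on the XXX comment in `_with_range`: a lighter alternative to a patricia trie would be a binary search over sorted ranges with the standard library's `bisect` module. That gives O(log n) in the number of ranges rather than O(1), so it is a different technique from the one the comment names. The sketch below is only an illustration under stated assumptions: `CJK_RANGES`, `build_index` and `in_ranges` are hypothetical names, the range values are examples rather than the plugin's actual tables (which this diff does not show), and the ranges are assumed not to overlap.

```python
from bisect import bisect_right

# Illustrative, non-overlapping ranges (assumption: not the plugin's real tables).
CJK_RANGES = [
    (u'\u3040', u'\u30ff'),  # Hiragana and Katakana
    (u'\u4e00', u'\u9fff'),  # CJK Unified Ideographs
    (u'\uac00', u'\ud7a3'),  # Hangul Syllables
]


def build_index(ranges):
    """Sort the ranges once and split them into parallel start/end lists."""
    ordered = sorted(ranges)
    starts = [start for start, _ in ordered]
    ends = [end for _, end in ordered]
    return starts, ends


def in_ranges(char, index):
    """Return True if char falls inside any indexed range."""
    starts, ends = index
    # Find the last range whose start is <= char, then check its end.
    i = bisect_right(starts, char) - 1
    return i >= 0 and char <= ends[i]


CJK_INDEX = build_index(CJK_RANGES)
```

With only a handful of ranges, as the comment itself points out, the linear scan is probably fine; the index would pay off only if the range list grew large.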