
Add support for Java's native Normalizer.

On JRuby the Unicode gem is built as a C extension, which is bad
for performance and won't run in some environments (Travis-CI,
for example).  Detect when we're running under JRuby and use
Java's native Normalizer there instead.  Java's normalizer does
not normalize a floating orphan diacritic tilde into a proper
tilde (nor, I think, should it), so support \textasciitilde
instead of \~{}.
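
The NFC behaviour described above can be illustrated with a short sketch. It does not use the Unicode gem or java.text.Normalizer from this commit; it assumes Ruby 2.2 or later, where String#unicode_normalize is built in, and it assumes the "orphan" tilde is U+0303 COMBINING TILDE with no base character. It is meant only to show what composition does and does not do.

```ruby
# Sketch only: relies on Ruby's built-in String#unicode_normalize
# (Ruby 2.2+), not on the normalizers touched by this commit.

# "o" followed by U+0300 COMBINING GRAVE ACCENT (two code points)
decomposed = "o\u0300"
composed   = decomposed.unicode_normalize(:nfc)

puts composed.length        # number of code points after composition
puts composed == "\u00F2"   # compares against U+00F2 (o with grave)

# Assumption: the orphan tilde is a bare U+0303 COMBINING TILDE.
# NFC has no base character to compose it with, so it is left alone
# and does not become an ASCII "~".
orphan = "\u0303"
puts orphan.unicode_normalize(:nfc) == "~"
```

Under these assumptions the composed string equals U+00F2, while the bare combining tilde is not converted to "~", which is consistent with mapping \textasciitilde to "~" explicitly.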
1 parent cdc143d, commit 01fdc67f837d206e186a0d874303245ba830c247, cpence committed Jan 30, 2012
@@ -8,7 +8,6 @@ Feature: Decode LaTeX diacritics
Scenarios: Diacritics
| latex | unicode | description |
- | \\~{} | ~ | |
| \\\`{o} | ò | grave accent |
| \\\'{o} | ó | acute accent |
| \\^{o} | ô | circumflex |
@@ -15,6 +15,6 @@ Feature: Decode LaTeX special characters
| \\{ | { |
| \\} | } |
| \\_ | _ |
- | \\~{} | ~ |
+ | \\textasciitilde{} | ~ |
| \\textbackslash{} | \\ |
| \\textasciicircum{} | ^ |
@@ -17,8 +17,6 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#++
-require 'unicode'
-
require 'latex/decode/version'
require 'latex/decode/compatibility'
require 'latex/decode/base'
@@ -45,7 +43,7 @@ def decode (string)
Decode::Base.strip_braces(string)
- Unicode::normalize_C(string)
+ LaTeX.normalize_C(string)
end
end
end
@@ -18,4 +18,24 @@ def self.to_unicode (string); string; end
def ruby_18; false; end
def ruby_19; yield; end
+end
+
+if RUBY_PLATFORM == 'java'
+ require 'java'
+
+ module LaTeX
+ def self.normalize_C(string)
+ java.text.Normalizer.normalize(string, java.text.Normalizer::Form::NFC).to_s
+ end
+ end
+
+else
+ require 'unicode'
+
+ module LaTeX
+ def self.normalize_C(string)
+ Unicode::normalize_C(string)
+ end
+ end
+
end
@@ -29,6 +29,7 @@ class Punctuation < Decoder
rangle ⟩
textasciicircum ^
textbackslash \\
+ textasciitilde ~
}].freeze
@symbols = Hash[*%w[
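
The platform switch in the compatibility hunk above can be tried in isolation with a sketch like the following. The module name NormalizerSketch and the method name normalize_c are hypothetical; the JRuby branch repeats the call shown in the diff, while the non-JRuby branch substitutes Ruby's built-in String#unicode_normalize (Ruby 2.2+) for the Unicode gem so the example has no gem dependency. That substitution is an assumption made for illustration, not what the commit does.

```ruby
# Hypothetical module; the commit itself defines LaTeX.normalize_C.
module NormalizerSketch
  if RUBY_PLATFORM == 'java'
    # JRuby: delegate to the JVM's normalizer, as in the diff above.
    require 'java'

    def self.normalize_c(string)
      java.text.Normalizer.normalize(string, java.text.Normalizer::Form::NFC).to_s
    end
  else
    # Assumption: String#unicode_normalize stands in for the Unicode gem.
    def self.normalize_c(string)
      string.unicode_normalize(:nfc)
    end
  end
end

# "e" + U+0301 COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
puts NormalizerSketch.normalize_c("e\u0301") == "\u00E9"
```

Choosing the implementation once at load time keeps the call site (a single normalize call in decode) free of platform checks, which matches the shape of the change in this commit.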
