improve text extraction of built-in fonts

* if the font includes a differences table (allowing it to use non-ASCII
  chars) then we need to do some extra work to extract the glyph width
  from the relevant AFM file
commit f9c4635a5d2edc3f93841007f6cccc86d7bdf576 1 parent a661d3d
James Healy authored November 26, 2012
17  lib/pdf/reader/encoding.rb
@@ -111,6 +111,23 @@ def int_to_utf8_string(glyph_code)
       @string_cache[glyph_code] ||= internal_int_to_utf8_string(glyph_code)
     end
 
+    # convert an integer glyph code into an Adobe glyph name.
+    #
+    #     int_to_name(65)
+    #     => :A
+    #
+    # TODO: this needs to be expanded to return the appropriate name for standard
+    #       glyph codes in the encoding. 65 to :A, etc. At the moment it only
+    #       handles glyphs in the difference table
+    #
+    def int_to_name(glyph_code)
+      if @enc_name == "Identity-H" || @enc_name == "Identity-V"
+        nil
+      else
+        @differences[glyph_code]
+      end
+    end
+
     private
 
     def internal_int_to_utf8_string(glyph_code)
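
For readers unfamiliar with the differences table that `int_to_name` consults, the following is a minimal standalone sketch of how a PDF `/Differences` array can be expanded into a code-to-name hash. It is not taken from pdf-reader: `expand_differences` and the free-standing `int_to_name` below are hypothetical helpers written for illustration, and the sample array is made up.

```ruby
# Hypothetical helper (not part of pdf-reader): expand a PDF /Differences
# array into a { glyph_code => glyph_name } hash. In the array, an integer
# sets the next glyph code and each following name takes consecutive codes.
def expand_differences(diff)
  table = {}
  code = 0
  diff.each do |entry|
    if entry.is_a?(Integer)
      code = entry                # an integer resets the running code
    else
      table[code] = entry.to_sym  # a name is assigned to the current code
      code += 1
    end
  end
  table
end

# Standalone stand-in for the method added in this commit: composite
# (Identity-H / Identity-V) encodings have no glyph names to return.
def int_to_name(glyph_code, enc_name, differences)
  return nil if enc_name == "Identity-H" || enc_name == "Identity-V"
  differences[glyph_code]
end

# Made-up sample data for illustration only.
differences = expand_differences([128, :Adieresis, :Aring, 200, :Euro])
name = int_to_name(129, "WinAnsiEncoding", differences)
```

As the TODO in the commit notes, a code that is absent from the table (such as 65) yields `nil` in this sketch as well.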
14  lib/pdf/reader/width_calculator/built_in.rb
@@ -3,6 +3,16 @@
 require 'afm'
 require 'pdf/reader/synchronized_cache'
 
+module AFM
+  # this is a monkey patch for the AFM gem. hopefully my patch will be accepted
+  # upstream and I can drop this
+  class Font
+    def metrics_for_name(name)
+      @char_metrics[name.to_s]
+    end
+  end
+end
+
 class PDF::Reader
   module WidthCalculator
 
@@ -28,6 +38,10 @@ def glyph_width(code_point)
         return 0 if code_point.nil? || code_point < 0
 
         m = @metrics.metrics_for(code_point)
+        if m.nil?
+          name = @font.encoding.int_to_name(code_point)
+          m = @metrics.metrics_for_name(name)
+        end
         m[:wx]
       end
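
The fallback in `glyph_width` can be exercised in isolation with the sketch below. It makes several assumptions: `FakeMetrics` is a hypothetical stand-in for the AFM font object, the widths are placeholder numbers rather than values from a real AFM file, and the differences table is passed in as a plain hash. It also adds a nil guard that the commit does not have; as committed, `m[:wx]` would raise `NoMethodError` if both lookups return `nil`, and returning 0 in that case is an illustrative choice, not the library's behaviour.

```ruby
# Hypothetical stand-in for AFM::Font, backed by plain hashes.
class FakeMetrics
  def initialize(by_code, by_name)
    @by_code = by_code
    @by_name = by_name
  end

  # Look up metrics by integer glyph code.
  def metrics_for(code)
    @by_code[code]
  end

  # Look up metrics by Adobe glyph name, mirroring the monkey patch above.
  def metrics_for_name(name)
    @by_name[name.to_s]
  end
end

# Sketch of the two-step lookup: by code first, then by glyph name taken
# from the differences table. Falls back to 0 when nothing is found
# (an assumption made for this example).
def glyph_width(code_point, metrics, differences)
  return 0 if code_point.nil? || code_point < 0

  m = metrics.metrics_for(code_point)
  if m.nil?
    name = differences[code_point]
    m = metrics.metrics_for_name(name) unless name.nil?
  end
  m.nil? ? 0 : m[:wx]
end

# Placeholder widths for illustration only.
metrics = FakeMetrics.new({ 65 => { wx: 600 } }, { "Adieresis" => { wx: 700 } })
differences = { 128 => :Adieresis }

direct   = glyph_width(65, metrics, differences)   # found by code
fallback = glyph_width(128, metrics, differences)  # found by glyph name
unknown  = glyph_width(999, metrics, differences)  # neither lookup succeeds
```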
 
