Improve text extraction for built-in fonts

* If the font includes a differences table (allowing it to use non-ASCII
  chars), then we need to do some extra work to extract the glyph width
  from the relevant AFM file
commit f9c4635a5d2edc3f93841007f6cccc86d7bdf576 1 parent a661d3d
@yob authored
17 lib/pdf/reader/encoding.rb
@@ -111,6 +111,23 @@ def int_to_utf8_string(glyph_code)
       @string_cache[glyph_code] ||= internal_int_to_utf8_string(glyph_code)
     end
+    # convert an integer glyph code into an Adobe glyph name.
+    #
+    # int_to_name(65)
+    # => :A
+    #
+    # TODO: this needs to be expanded to return the appropriate name for standard
+    # glyph codes in the encoding. 65 to :A, etc. At the moment it only
+    # handles glyphs in the difference table
+    #
+    def int_to_name(glyph_code)
+      if @enc_name == "Identity-H" || @enc_name == "Identity-V"
+        nil
+      else
+        @differences[glyph_code]
+      end
+    end
+
     private
     def internal_int_to_utf8_string(glyph_code)
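
To make the new lookup concrete, here is a minimal, self-contained sketch of the same idea. `MiniEncoding` and its constructor are hypothetical stand-ins for illustration, not the real `PDF::Reader::Encoding` API, and the sample differences table is made up.

```ruby
# Hypothetical stand-in for PDF::Reader::Encoding, reduced to the lookup above.
class MiniEncoding
  IDENTITY_ENCODINGS = ["Identity-H", "Identity-V"].freeze

  # differences: Hash of integer glyph code => glyph name (Symbol)
  def initialize(enc_name, differences = {})
    @enc_name = enc_name
    @differences = differences
  end

  # Returns the glyph name from the differences table, or nil when the
  # code is not listed or the encoding is one of the Identity encodings.
  def int_to_name(glyph_code)
    return nil if IDENTITY_ENCODINGS.include?(@enc_name)

    @differences[glyph_code]
  end
end

enc = MiniEncoding.new("WinAnsiEncoding", { 128 => :Euro, 200 => :Egrave })
enc.int_to_name(128) # => :Euro
enc.int_to_name(65)  # => nil (65 is not in the differences table)
```

As the TODO in the diff notes, a code such as 65 only resolves to a name if it appears in the differences table, which is why the second call returns `nil` here.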
14 lib/pdf/reader/width_calculator/built_in.rb
@@ -3,6 +3,16 @@
 require 'afm'
 require 'pdf/reader/synchronized_cache'
+module AFM
+  # this is a monkey patch for the AFM gem. hopefully my patch will be accepted
+  # upstream and I can drop this
+  class Font
+    def metrics_for_name(name)
+      @char_metrics[name.to_s]
+    end
+  end
+end
+
 class PDF::Reader
   module WidthCalculator
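
Since the monkey patch is meant to be dropped once upstream provides the method, one possible variation is to define it only when it is missing. The sketch below is not part of this commit. `FakeAFM::Font` is a hypothetical stand-in so the example runs without the gem, and the width value is illustrative.

```ruby
# Stand-in for the gem's font class, used only so this sketch runs on its own.
module FakeAFM
  class Font
    # char_metrics: Hash of glyph name (String) => metrics Hash
    def initialize(char_metrics)
      @char_metrics = char_metrics
    end
  end
end

# Add the lookup only when the class does not already provide one, so an
# upstream implementation would take precedence over this local patch.
unless FakeAFM::Font.method_defined?(:metrics_for_name)
  FakeAFM::Font.class_eval do
    def metrics_for_name(name)
      @char_metrics[name.to_s]
    end
  end
end

font = FakeAFM::Font.new({ "Euro" => { wx: 556 } })
font.metrics_for_name(:Euro)[:wx] # => 556
```

The `name.to_s` call mirrors the patch in the diff: glyph names arrive as symbols from the encoding, while this stand-in keys its metrics by string.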
@@ -28,6 +38,10 @@ def glyph_width(code_point)
         return 0 if code_point.nil? || code_point < 0
         m = @metrics.metrics_for(code_point)
+        if m.nil?
+          name = @font.encoding.int_to_name(code_point)
+          m = @metrics.metrics_for_name(name)
+        end
         m[:wx]
       end
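
The two-step lookup in `glyph_width` can be tried in isolation with the sketch below. `StubMetrics` and `StubEncoding` are hypothetical stand-ins for the AFM metrics object and the font encoding, and the widths are illustrative. The sketch also returns 0 when neither lookup finds metrics. That fallback is an assumption on my part, not behaviour from the commit: as written in the diff, `m[:wx]` would raise if `m` were still `nil`.

```ruby
# Hypothetical stand-in mimicking the two lookups used in the diff.
class StubMetrics
  def initialize(by_code, by_name)
    @by_code = by_code # Hash of code point (Integer) => metrics Hash
    @by_name = by_name # Hash of glyph name (String) => metrics Hash
  end

  def metrics_for(code_point)
    @by_code[code_point]
  end

  def metrics_for_name(name)
    @by_name[name.to_s]
  end
end

# Hypothetical stand-in for an encoding with a differences table.
StubEncoding = Struct.new(:differences) do
  def int_to_name(glyph_code)
    differences[glyph_code]
  end
end

# Look up by code point first, then fall back to the glyph name from the
# differences table. Returns 0 if both lookups fail (assumed fallback).
def glyph_width(metrics, encoding, code_point)
  return 0 if code_point.nil? || code_point < 0

  m = metrics.metrics_for(code_point)
  if m.nil?
    name = encoding.int_to_name(code_point)
    m = metrics.metrics_for_name(name) unless name.nil?
  end
  m.nil? ? 0 : m[:wx]
end

metrics  = StubMetrics.new({ 65 => { wx: 667 } }, { "Euro" => { wx: 556 } })
encoding = StubEncoding.new({ 128 => :Euro })

glyph_width(metrics, encoding, 65)  # => 667 (found by code point)
glyph_width(metrics, encoding, 128) # => 556 (found via glyph name)
glyph_width(metrics, encoding, 999) # => 0   (no metrics either way)
```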