Fix some of rdoc's worst HTML typesetting issues:

* The ' character will be converted into an apostrophe, opening single quote,
  or closing single quote correctly in most cases.
* The " character will be converted into an opening double quote or a closing
  double quote in most cases.

Note, however, that tons of issues with HTML typesetting remain.  Fixing these
properly will require a rewrite of the markup engine, which I will look at next.
This is a short term fix intended to ameliorate the worst of the issues.
designingpatts committed Aug 12, 2008
1 parent 05fb19b commit 272eee1
Showing 6 changed files with 75 additions and 46 deletions.
5 changes: 5 additions & 0 deletions History.txt
@@ -12,6 +12,11 @@
 * Fixed main page for frameless template. Patch by Marcin Raczkowski.
 * Fixed missing stylesheet in generated chm. Patch by Gordon Thiesfeld.
 * Fixed the parsing of module names starting with '::'.
+* Fixed some (but not all!) of the issues with RDoc's HTML typesetting:
+** RDoc now correctly converts ' characters to apostrophes, opening single
+quotes, and closing single quotes in most cases (smart single quotes).
+** RDoc now correctly converts " characters to opening double quotes and
+closing double quotes in most cases (smart double quotes).

 === 2.1.0 / 2008-07-20

4 changes: 2 additions & 2 deletions lib/rdoc/markup/to_html.rb
@@ -318,10 +318,10 @@ def convert_string_fancy(item)
 gsub(/'/, '‘').

 # convert double closing quote
-gsub(%r{([^ \t\r\n\[\{\(])\'(?=\W)}, '\1”'). # }
+gsub(%r{([^ \t\r\n\[\{\(])\"(?=\W)}, '\1”'). # }

 # convert double opening quote
-gsub(/'/, '“').
+gsub(/"/, '“').

 # convert copyright
 gsub(/\(c\)/, '©').
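
The ordering in the hunk above (closing quotes first, then every remaining quote treated as an opening quote) can be tried in isolation. The following is a minimal sketch, not RDoc's code: `smart_quotes` is a hypothetical helper, the single-quote rule is an assumption modeled on the double-quote rule in the hunk, and numeric entities are used as replacements because the tests in this commit expect them in the output.

```ruby
# Hypothetical stand-alone helper (not part of RDoc). It converts straight
# quotes to HTML entities: closing quotes are handled first, and any quote
# character still left afterwards is treated as an opening quote.
def smart_quotes(text)
  result = text.dup

  # Assumed rule: a ' directly after a character that is not whitespace or
  # an opening bracket is a closing single quote or an apostrophe.
  result = result.gsub(/([^ \t\r\n\[\{\(])'/, '\1&#8217;')

  # Any ' still present opens a quotation.
  result = result.gsub(/'/, '&#8216;')

  # As in the hunk: a " after a non-space, non-bracket character and before
  # a non-word character closes a quotation.
  result = result.gsub(/([^ \t\r\n\[\{\(])"(?=\W)/, '\1&#8221;')

  # Any " still present opens a quotation.
  result.gsub(/"/, '&#8220;')
end

puts smart_quotes("\"cats\".")
puts smart_quotes("'cats'.")
puts smart_quotes("cat's-")
```

Because the closing double-quote rule needs a non-word character after the quote, a `"` at the very end of a string falls through to the opening rule in this sketch. That is one concrete instance of the "in most cases" caveat in the commit message.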
64 changes: 38 additions & 26 deletions lib/rdoc/markup/to_html_crossref.rb
@@ -25,6 +25,43 @@ class RDoc::Markup::ToHtmlCrossref < RDoc::Markup::ToHtml
 CLASS_REGEXP_STR = '\\\\?((?:\:{2})?[A-Za-z]\w*(?:\:\:\w+)*)'
 METHOD_REGEXP_STR = '(\w+[!?=]?)(?:\([\.\w+\*\/\+\-\=\<\>]*\))?'

+# Regular expressions matching text that should potentially have
+# cross-reference links generated are passed to add_special.
+# Note that these expressions are meant to pick up text for which
+# cross-references have been suppressed, since the suppression
+# characters are removed by the code that is triggered.
+CROSSREF_REGEXP = /(
+# A::B::C.meth
+#{CLASS_REGEXP_STR}[\.\#]#{METHOD_REGEXP_STR}
+# Stand-alone method (preceded by a #)
+| \\?\##{METHOD_REGEXP_STR}
+# A::B::C
+# The stuff after CLASS_REGEXP_STR is a
+# nasty hack. CLASS_REGEXP_STR unfortunately matches
+# words like dog and cat (these are legal "class"
+# names in Fortran 95). When a word is flagged as a
+# potential cross-reference, limitations in the markup
+# engine suppress other processing, such as typesetting.
+# This is particularly noticeable for contractions.
+# In order that words like "can't" not
+# be flagged as potential cross-references, only
+# flag potential class cross-references if the character
+# after the cross-reference is a space or sentence
+# punctuation.
+| #{CLASS_REGEXP_STR}(?=[\s\)\.\?\!\,\;]|\z)
+# Things that look like filenames
+# The key thing is that there must be at least
+# one special character (period, slash, or
+# underscore).
+| \w+[_\/\.][\w\/\.]+
+# Things that have markup suppressed
+| \\[^\s]
+)/x
+
 ##
 # We need to record the html path of our caller so we can generate
 # correct relative paths for any hyperlinks that we find
@@ -33,32 +70,7 @@ def initialize(from_path, context, show_hash)
 raise ArgumentError, 'from_path cannot be nil' if from_path.nil?
 super()

-# Regular expressions matching text that should potentially have
-# cross-reference links generated are passed to add_special.
-# Note that these expressions are meant to pick up text for which
-# cross-references have been suppressed, since the suppression
-# characters are removed by the code that is triggered.
-
-@markup.add_special(/(
-# A::B::C.meth
-#{CLASS_REGEXP_STR}[\.\#]#{METHOD_REGEXP_STR}
-# Stand-alone method (preceded by a #)
-| \\?\##{METHOD_REGEXP_STR}
-# A::B::C
-| #{CLASS_REGEXP_STR}
-# Things that look like filenames
-# The key thing is that there must be at least
-# one special character (period, slash, or
-# underscore).
-| \w*[_\/\.][\w\/\.]*
-# Things that have markup suppressed
-| \\[^\s]
-)/x,
-:CROSSREF)
+@markup.add_special(CROSSREF_REGEXP, :CROSSREF)

 @from_path = from_path
 @context = context
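
The effect of the two pattern changes in this file (the lookahead after the class-name pattern, and `+` instead of `*` in the filename pattern) can be checked outside RDoc. This is a rough sketch under stated assumptions: `CLASS_STR` is a simplified stand-in for `CLASS_REGEXP_STR` without the optional leading backslash and capture group, and `flagged`, `OLD_CLASS`, `NEW_CLASS`, `OLD_FILE`, and `NEW_FILE` are hypothetical names used only for illustration.

```ruby
# Simplified stand-in for CLASS_REGEXP_STR (assumption: no leading-backslash
# handling and no capture group, so String#scan returns plain strings).
CLASS_STR = '(?:\:{2})?[A-Za-z]\w*(?:\:\:\w+)*'

# Class-name alternative before and after the commit.
OLD_CLASS = /#{CLASS_STR}/
NEW_CLASS = /#{CLASS_STR}(?=[\s\)\.\?\!\,\;]|\z)/

# Filename alternative before and after the commit.
OLD_FILE = /\w*[_\/\.][\w\/\.]*/
NEW_FILE = /\w+[_\/\.][\w\/\.]+/

# Hypothetical helper: returns every substring the pattern would flag.
def flagged(text, pattern)
  text.scan(pattern)
end

p flagged("cats'", OLD_CLASS)      # the old pattern flags the word before the apostrophe
p flagged("cats'", NEW_CLASS)      # the lookahead rejects a word followed by '
p flagged("can't.", NEW_CLASS)     # only the fragment after the apostrophe still matches
p flagged("Foo::Bar.", NEW_CLASS)  # a class name before sentence punctuation is still flagged
p flagged("cats.", OLD_FILE)       # the old filename pattern swallows a sentence-final period
p flagged("cats.", NEW_FILE)       # the new one needs characters on both sides of the separator
p flagged("to_html.rb", NEW_FILE)  # an ordinary filename is still flagged
```

In this sketch the word before an apostrophe is no longer flagged, so the apostrophe stays in ordinary text where the typesetting substitutions can reach it. The fragment after the apostrophe can still match when it is followed by punctuation or the end of the string, which may be why the test strings in this commit end in characters such as `'` or `-`.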
28 changes: 14 additions & 14 deletions test/test_rdoc_markup_attribute_manager.rb
@@ -1,5 +1,6 @@
 require "test/unit"
 require "rdoc/markup/inline"
+require "rdoc/markup/to_html_crossref"

 class TestRDocMarkupAttributeManager < Test::Unit::TestCase

@@ -201,24 +202,23 @@ def test_protect
 end

 def test_special
-# class names, variable names, file names, or instance variables
-@am.add_special(/(
-\b([A-Z]\w+(::\w+)*)
-| \#\w+[!?=]?
-| \b\w+([_\/\.]+\w+)+[!?=]?
-)/x,
-:CROSSREF)
+@am.add_special(RDoc::Markup::ToHtmlCrossref::CROSSREF_REGEXP, :CROSSREF)

-assert_equal(["cat"], @am.flow("cat"))
+#
+# The apostrophes in "cats'" and "dogs'" suppress the flagging of these
+# words as potential cross-references, which is necessary for the unit
+# tests. Unfortunately, the markup engine right now does not actually
+# check whether a cross-reference is valid before flagging it.
+#
+assert_equal(["cats'"], @am.flow("cats'"))

-assert_equal(["cat ", crossref("#fred"), " dog"].flatten,
-@am.flow("cat #fred dog"))
+assert_equal(["cats' ", crossref("#fred"), " dogs'"].flatten,
+@am.flow("cats' #fred dogs'"))

-assert_equal([crossref("#fred"), " dog"].flatten,
-@am.flow("#fred dog"))
+assert_equal([crossref("#fred"), " dogs'"].flatten,
+@am.flow("#fred dogs'"))

-assert_equal(["cat ", crossref("#fred")].flatten, @am.flow("cat #fred"))
+assert_equal(["cats' ", crossref("#fred")].flatten, @am.flow("cats' #fred"))
 end

 end

16 changes: 14 additions & 2 deletions test/test_rdoc_markup_to_html.rb
@@ -10,11 +10,23 @@ def setup
 end

 def test_tt_formatting
-assert_equal "<p>\n<tt>--</tt> &#8212; <tt>(c)</tt> &#169;\n</p>\n",
-util_format("<tt>--</tt> -- <tt>(c)</tt> (c)")
+assert_equal "<p>\n<tt>--</tt> &#8212; <tt>cats'</tt> cats&#8217;\n</p>\n",
+util_format("<tt>--</tt> -- <tt>cats'</tt> cats'")
 assert_equal "<p>\n<b>&#8212;</b>\n</p>\n", util_format("<b>--</b>")
 end

+def test_convert_string_fancy
+#
+# The HTML typesetting is broken in a number of ways, but I have fixed
+# the most glaring issues for single and double quotes. Note that
+# "strange" symbols (periods or dashes) need to be at the end of the
+# test case strings in order to suppress cross-references.
+#
+assert_equal "<p>\n&#8220;cats&#8221;.\n</p>\n", util_format("\"cats\".")
+assert_equal "<p>\n&#8216;cats&#8217;.\n</p>\n", util_format("\'cats\'.")
+assert_equal "<p>\ncat&#8217;s-\n</p>\n", util_format("cat\'s-")
+end
+
 def util_fragment(text)
 RDoc::Markup::Fragment.new 0, nil, nil, text
 end
4 changes: 2 additions & 2 deletions test/test_rdoc_markup_to_html_crossref.rb
@@ -160,8 +160,8 @@ def verify_invariant_crossrefs(xref)
 # The hyphen character is not a valid class/method separator character, so
 # rdoc just generates a class cross-reference (perhaps it should not
 # generate anything?).
-result = "<a href=\"../classes/Ref_Class2/Ref_Class3.html\">Ref_Class2::Ref_Class3</a>-method(*)"
-verify_convert xref, "Ref_Class2::Ref_Class3-method(*)", result
+result = "<a href=\"../classes/Ref_Class2/Ref_Class3.html\">Ref_Class2::Ref_Class3</a>;method(*)"
+verify_convert xref, "Ref_Class2::Ref_Class3;method(*)", result

 # There is one Ref_Class3 nested in Ref_Class2 and one defined in the
 # top-level namespace; regardless, ::Ref_Class3 (Ref_Class3 relative
