Problems with special characters #654

@natgoodman

Greetings

After encountering problems with a few special characters, I ran a comprehensive test to see what works and what doesn’t. The test generated @param tags whose descriptions contain special characters in four contexts: normal text, quoted text, normal code, and quoted code. For each context, I tried three ways to get the character to render: naked, escaped, and double-escaped. For example, the test lines for the special character ‘$’ (with apologies for the messed-up formatting of the 'code' cases) are

  • #' @param param0003 text unescaped normal: $
  • #' @param param0021 text unescaped quoted: "$"
  • #' @param param0039 text escaped normal: \$
  • #' @param param0057 text escaped quoted: "\$"
  • #' @param param0075 text double normal: \\$
  • #' @param param0093 text double quoted: "\\$"
  • #' @param param0111 code unescaped normal: `$`
  • #' @param param0129 code unescaped quoted: `"$"`
  • #' @param param0147 code escaped normal: `\$`
  • #' @param param0165 code escaped quoted: `"\$"`
  • #' @param param0183 code double normal: `\\$`
  • #' @param param0201 code double quoted: `"\\$"`
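
The twelve lines above follow a regular pattern (context × escape level × quoting), so they can be generated mechanically. The following is a minimal base-R sketch of such a generator, not the Perl script actually used; the function name `make_test_lines` and its `start`/`step` arguments are hypothetical, chosen only so that the parameter numbering matches the example.

```r
# Hypothetical generator for the twelve roxygen test lines of one character.
# Contexts: text/code; escape levels: none, one backslash, two backslashes;
# quoting: normal or wrapped in double quotes.
make_test_lines <- function(ch, start = 1L, step = 1L) {
  escapes <- c(unescaped = "", escaped = "\\", double = "\\\\")
  # expand.grid varies its first argument fastest, which matches the
  # ordering of the example: quoting, then escape level, then context.
  grid <- expand.grid(
    quoting = c("normal", "quoted"),
    escape  = names(escapes),
    context = c("text", "code"),
    stringsAsFactors = FALSE
  )
  body <- paste0(escapes[grid$escape], ch)
  body <- ifelse(grid$quoting == "quoted", paste0('"', body, '"'), body)
  body <- ifelse(grid$context == "code", paste0("`", body, "`"), body)
  ids <- sprintf("param%04d", start + step * (seq_len(nrow(grid)) - 1L))
  paste("#' @param", ids, grid$context, grid$escape,
        paste0(grid$quoting, ":"), body)
}

# Assumed numbering: '$' starts at 3 and advances by 18 per line.
lines <- make_test_lines("$", start = 3L, step = 18L)
writeLines(lines)
```

Writing the result of `make_test_lines()` for every tested character into one .R file would give input comparable to the attached test file.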

I placed the test lines in a .R file (attached), converted the roxygen to Rd using devtools::document, and converted the Rd to HTML using tools::Rd2HTML. Every so often I also produced a PDF using R CMD Rd2pdf to be safe; I never saw a case where the HTML conversion worked but the PDF conversion had problems.

The special characters I tested were & % $ # _ { } ~ ^ \ @ [ ] ( ) {} [] (). I included the balanced pairs {}, [], and () since balanced and unbalanced delimiters behave differently in some contexts. The set consists of the 10 LaTeX special characters, a few that I saw mentioned as special in roxygen or Rd, and parentheses for good measure.

The table below shows what needs to be typed to get each special character rendered correctly, or 'NONE' if none of my attempts worked.

char  text-normal            text-quoted  code-normal  code-quoted
#     # (but see note 2)     "#"          `#`          `"#"`
$     $                      "$"          `$`          `"$"`
%     \%                     "\%"         NONE         NONE
&     &                      "&"          `&`          `"&"`
(     (                      "("          `(`          `"("`
()    ()                     "()"         `()`         `"()"`
)     )                      ")"          `)`          `")"`
@     @                      "@"          `@`          `"@"`
[     [                      "["          `[`          `"["`
[]    []                     "[]"         `[]`         `"[]"`
\     NONE (but see note 3)  "\"          `\`          NONE (but see note 3)
]     ]                      "]"          `]`          `"]"`
^     ^                      "^"          `^`          `"^"`
_     _                      "_"          `_`          `"_"`
{     NONE                   NONE         `\{`         `"\{"`
{}    \\{\\}                 "\\{\\}"     `{}`         `"{}"`
}     NONE                   NONE         `\}`         NONE
~     ~                      "~"          `~`          `"~"`

Summary

  1. The following work without drama: $ & ( () ) @ [ [] ] ^ _ ~
  2. # works without drama except at start-of-text where (in a separate test) it triggers an error (“attempt to apply non-function”).
  3. The \ failures only occur when it is at the end of the text or code. It works without drama elsewhere.
  4. % works when escaped in text but, as reported by others, it doesn’t work in code.
  5. As reported by others, unbalanced braces don’t work in text. Open brace works in code when escaped. Close brace works in normal code when escaped but not in quoted code.
  6. Balanced braces have to be double escaped in text, but work naked in code.
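
The trouble spots in the summary can be turned into a rough pre-flight check for description text. The sketch below is only an illustration of notes 2 to 5 for the text contexts; `check_param_text` is a hypothetical helper, not part of roxygen2, and its brace check is crude (it compares counts and ignores ordering and escaping).

```r
# Hypothetical helper: list the patterns in a @param description that the
# summary above reports as troublesome in text contexts.
check_param_text <- function(desc) {
  problems <- character(0)
  # Note 2: '#' at start of text triggered an error.
  if (startsWith(desc, "#")) {
    problems <- c(problems, "'#' at start of text")
  }
  # Note 3: backslash failures occurred at the end of the text.
  if (endsWith(desc, "\\")) {
    problems <- c(problems, "backslash at end of text")
  }
  # Note 4: '%' needs an escape in text; look for '%' not preceded by '\'.
  if (grepl("(?<!\\\\)%", desc, perl = TRUE)) {
    problems <- c(problems, "unescaped '%'")
  }
  # Note 5: unbalanced braces do not work in text (count-based check only).
  chars <- strsplit(desc, "", fixed = TRUE)[[1]]
  if (sum(chars == "{") != sum(chars == "}")) {
    problems <- c(problems, "unbalanced braces")
  }
  problems
}

check_param_text("50% of cases")    # flags the unescaped percent sign
check_param_text("50\\% of cases")  # nothing flagged
```

An empty result only means that none of these four patterns was found, not that the text is guaranteed to render correctly.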

Reasoning that the problems may be caused by Rd limitations, I redid the test generating Rd directly. The results were far better: everything worked except \ in quoted code. I’m happy to provide the Rd results should that be of interest.
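
For reference, the direct-Rd route can be sketched as follows: write a small Rd file by hand and convert it with tools::Rd2HTML, bypassing roxygen entirely. The Rd contents here are illustrative only and are not the actual test file; the topic name `demo` and the argument `x` are made up for the example.

```r
# Minimal hand-written Rd, bypassing roxygen (contents are illustrative).
rd_lines <- c(
  "\\name{demo}",
  "\\alias{demo}",
  "\\title{Special character demo}",
  "\\description{Hand-written Rd for one parameter.}",
  "\\arguments{",
  "\\item{x}{percent: \\% dollar: $ braces: \\{\\}}",
  "}"
)

rd_file   <- tempfile(fileext = ".Rd")
html_file <- tempfile(fileext = ".html")
writeLines(rd_lines, rd_file)

# Rd2HTML accepts a file name (or a parsed Rd object) as input.
tools::Rd2HTML(rd_file, out = html_file)

# Inspect the line(s) of the generated HTML that contain the description.
html <- readLines(html_file)
grep("percent", html, value = TRUE)
```

Comparing this output with the HTML produced from the equivalent roxygen comment is one way to tell whether a failure originates in roxygen or in Rd itself.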

It goes without saying I’m also happy to provide the Perl script I used to generate the test cases or modify the script to run additional test cases as you wish.

roxescape.R.zip
