Do not write out invalid characters to XML file

For the XML special characters <, >, ', &, and ", write them out to the
file as predefined entity references:

Character     Written to    Appears in Jenkins
to encode     XML file      test result viewer
=========     ==========    ==================
  <            &lt;             <
  >            &gt;             >
  '            &apos;           '
  &            &amp;            &
  "            &quot;           "

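Characters listed as RestrictedChar in
http://www.w3.org/TR/xml11/#charsets cannot appear literally in an XML
file, and an XML 1.0 parser rejects the control characters below 0x20
among them even when they are written as numeric character references.
Write the reference with its ampersand escaped instead: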

Character     Written to    Appears in Jenkins
to encode     XML file      test result viewer
=========     ==========    ==================
  0x08         &amp;#8;          &#8;
  0x1F         &amp;#31;         &#31;

This at least generates a valid XML file, while still showing where the
restricted characters appeared in the output.
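
The two tables above can be reproduced with a small sketch. It is only a
simplified model of what an XML parser does with the five predefined
entities, not Jenkins's actual code, and the names Entity and
decode_predefined_entities are hypothetical. Because each entity is decoded
exactly once, the escaped text &amp;#8; comes out as the visible text &#8;
rather than as a raw control character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical table entry: an entity reference and the character it means.
struct Entity {
    const char* text;
    char value;
};

// Hypothetical helper: decodes the five predefined XML entities in a single
// left-to-right pass, so already-decoded output is never decoded again.
std::string decode_predefined_entities(const std::string& in)
{
    static const Entity entities[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&apos;", '\''},
        {"&amp;", '&'}, {"&quot;", '"'},
    };
    const std::size_t count = sizeof(entities) / sizeof(entities[0]);

    std::string out;
    std::string::size_type pos = 0;
    while (pos < in.length()) {
        bool matched = false;
        for (std::size_t i = 0; i < count && !matched; ++i) {
            const std::string text(entities[i].text);
            if (in.compare(pos, text.length(), text) == 0) {
                out += entities[i].value;
                pos += text.length();
                matched = true;
            }
        }
        if (!matched)
            out += in[pos++];  // Not an entity: copy the character as is.
    }
    return out;
}
```

For example, decode_predefined_entities("&amp;#31;") yields the five
characters &#31; which is what the second table lists in its last column.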

Fixes #136
rodrigc authored and jmmv committed Nov 19, 2015
1 parent f5b7a97 commit 0bb14ef
Showing 3 changed files with 36 additions and 17 deletions.
4 changes: 4 additions & 0 deletions NEWS
@@ -57,6 +57,10 @@ STILL UNDER DEVELOPMENT; NOT RELEASED YET.
 GDB 6.1.1 (circa 2004) does not have the -ex flag so we need to
 generate a temporary GDB script and feed it to GDB with -x instead.

+* Issue 136: Fixed the XML escaping in the JUnit output so that
+non-printable characters are properly handled when they appear in the
+process's stdout or stderr.
+
 * Issue 141: Improved reporting of errors triggered by sqlite3. In
 particular, all error messages are now tagged with their corresponding
 database filename and, if they are API-level errors, the name of the
46 changes: 29 additions & 17 deletions utils/text/operations.cpp
@@ -38,6 +38,9 @@ namespace text = utils::text;

 /// Replaces XML special characters from an input string.
 ///
+/// The list of XML special characters is specified here:
+/// http://www.w3.org/TR/xml11/#charsets
+///
 /// \param in The input to quote.
 ///
 /// \return A quoted string without any XML special characters.
@@ -46,25 +49,34 @@ text::escape_xml(const std::string& in)
 {
     std::ostringstream quoted;

-    const char* delims = "\"&<>'";  // Keep in sync with 'switch' below.
-    std::string::size_type start_pos = 0;
-    std::string::size_type last_pos = in.find_first_of(delims);
-    while (last_pos != std::string::npos) {
-        quoted << in.substr(start_pos, last_pos - start_pos);
-        switch (in[last_pos]) {
-        case '"':  quoted << "&quot;"; break;
-        case '&':  quoted << "&amp;"; break;
-        case '<':  quoted << "&lt;"; break;
-        case '>':  quoted << "&gt;"; break;
-        case '\'': quoted << "&apos;"; break;
-        default:   UNREACHABLE;
+    for (std::string::const_iterator it = in.begin();
+         it != in.end(); ++it) {
+        unsigned char c = (unsigned char)*it;
+        if (c == '"') {
+            quoted << "&quot;";
+        } else if (c == '&') {
+            quoted << "&amp;";
+        } else if (c == '<') {
+            quoted << "&lt;";
+        } else if (c == '>') {
+            quoted << "&gt;";
+        } else if (c == '\'') {
+            quoted << "&apos;";
+        } else if ((c >= 0x01 && c <= 0x08) ||
+                   (c >= 0x0B && c <= 0x0C) ||
+                   (c >= 0x0E && c <= 0x1F) ||
+                   (c >= 0x7F && c <= 0x84) ||
+                   (c >= 0x86 && c <= 0x9F)) {
+            // For RestrictedChar characters, write
+            // '&amp;#[decimal value];' so that the escaped character
+            // is visible in the XML file.  Format 'c', not '*it':
+            // a signed char would print bytes >= 0x80 incorrectly.
+            quoted << "&amp;#" << static_cast< unsigned int >(c)
+                   << ";";
+        } else {
+            quoted << *it;
         }
-        start_pos = last_pos + 1;
-        last_pos = in.find_first_of(delims, start_pos);
     }
-    if (start_pos < in.length())
-        quoted << in.substr(start_pos);

     return quoted.str();
 }
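
The per-character approach in this hunk can also be read as a standalone
sketch. This is an illustration under stated assumptions, not the code in
utils/text/operations.cpp: the names is_restricted_char and
escape_xml_sketch are hypothetical, and the numeric value is formatted from
an unsigned char so that bytes at or above 0x80 print as 128 to 159 whether
or not plain char is signed.

```cpp
#include <sstream>
#include <string>

// Hypothetical predicate: true for byte values that fall in the
// RestrictedChar ranges of http://www.w3.org/TR/xml11/#charsets .
static bool is_restricted_char(const unsigned char c)
{
    return (c >= 0x01 && c <= 0x08) ||
           (c >= 0x0B && c <= 0x0C) ||
           (c >= 0x0E && c <= 0x1F) ||
           (c >= 0x7F && c <= 0x84) ||
           (c >= 0x86 && c <= 0x9F);
}

// Hypothetical free function mirroring the logic of the new loop.
std::string escape_xml_sketch(const std::string& in)
{
    std::ostringstream quoted;

    for (std::string::const_iterator it = in.begin();
         it != in.end(); ++it) {
        // Work on the unsigned value so range checks and formatting agree.
        const unsigned char c = static_cast< unsigned char >(*it);
        switch (c) {
        case '"':  quoted << "&quot;"; break;
        case '&':  quoted << "&amp;"; break;
        case '<':  quoted << "&lt;"; break;
        case '>':  quoted << "&gt;"; break;
        case '\'': quoted << "&apos;"; break;
        default:
            if (is_restricted_char(c)) {
                // Escape the ampersand so the reference shows up as text.
                quoted << "&amp;#" << static_cast< unsigned int >(c) << ";";
            } else {
                quoted << *it;
            }
        }
    }

    return quoted.str();
}
```

Like the diff, the sketch works byte by byte, so a byte between 0x80 and
0x9F that belongs to a multi-byte UTF-8 sequence is escaped as well.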

3 changes: 3 additions & 0 deletions utils/text/operations_test.cpp
@@ -77,6 +77,7 @@ ATF_TEST_CASE_BODY(escape_xml__no_escaping)
 {
     ATF_REQUIRE_EQ("a", text::escape_xml("a"));
     ATF_REQUIRE_EQ("Some text!", text::escape_xml("Some text!"));
+    ATF_REQUIRE_EQ("\n\t\r", text::escape_xml("\n\t\r"));
 }


@@ -90,6 +91,8 @@ ATF_TEST_CASE_BODY(escape_xml__some_escaping)

     ATF_REQUIRE_EQ("&quot;&amp;&lt;&gt;&apos;", text::escape_xml("\"&<>'"));
     ATF_REQUIRE_EQ("&amp;&amp;&amp;", text::escape_xml("&&&"));
+    ATF_REQUIRE_EQ("&amp;#8;&amp;#11;", text::escape_xml("\b\v"));
+    ATF_REQUIRE_EQ("\t&amp;#127;BAR&amp;", text::escape_xml("\t\x7f""BAR&"));
 }


