Fix handling of HTML entities invalid in XML #347

Closed
wants to merge 1 commit into from

2 participants

@jasonmp85

The export module relies on innerHTML to retrieve the SVG representation of the current chart. Unfortunately, HTML is not XML. Specifically, if a node contains U+00A0 (NO-BREAK SPACE) or U+00AD (SOFT HYPHEN), some browsers will return an HTML entity (&nbsp; and &shy;, respectively) rather than the actual character. Since these entities are not defined in XML, the SVG document is invalid and processing it into an image fails.

This change simply replaces all occurrences of &nbsp; and &shy; with the correct characters.

I created two fiddles to demonstrate this behavior:

  1. HTML entities in SVG — A reproduction of the bug forked from this Highcharts demo. The demo has been modified to add a non-breaking space to the labels, which breaks the export functionality.

  2. Non-Preserved Special Characters — This script iterates over all HTML characters which have entities defined and prints out a list of those which innerHTML modifies. I used this to build my list of affected characters.
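The idea behind the second fiddle can be sketched roughly as follows. The fiddle itself is not reproduced here, so the function name `findModifiedCharacters` and its `serialize` parameter are hypothetical. In a browser, `serialize` would round-trip a character through `innerHTML`; a stand-in serializer is used below so the sketch runs without a DOM.

```javascript
// Hypothetical sketch: report which characters a serializer rewrites.
// `serialize` is an assumption standing in for a round trip through innerHTML.
function findModifiedCharacters(codePoints, serialize) {
  const modified = [];
  for (const cp of codePoints) {
    const ch = String.fromCodePoint(cp);
    const out = serialize(ch);
    if (out !== ch) {
      modified.push({ codePoint: cp, serialized: out });
    }
  }
  return modified;
}

// Stand-in serializer that mimics the behavior described in this pull request.
// It is NOT a real browser; it only encodes the five characters mentioned here.
function fakeInnerHTML(text) {
  const table = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '\u00A0': '&nbsp;',
    '\u00AD': '&shy;'
  };
  return text.replace(/[&<>\u00A0\u00AD]/g, (c) => table[c]);
}

// In a browser, the serializer could look like this (untested sketch):
// function viaInnerHTML(text) {
//   const el = document.createElement('div');
//   el.appendChild(document.createTextNode(text));
//   return el.innerHTML;
// }

// Check '&', 'A', NO-BREAK SPACE, SOFT HYPHEN, and 'ø'.
const report = findModifiedCharacters([0x26, 0x41, 0xA0, 0xAD, 0xF8], fakeInnerHTML);
console.log(report);
```

With the stand-in serializer, only `&`, U+00A0, and U+00AD are reported; real browsers may differ, which is why the fiddles were run in several of them.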

I looked for somewhere to add a unit test, but none existed yet for the SVG exporter, so I left that alone. This change does, however, fix my repro case.

The fiddles were tested in Firefox 5.0.1, Chrome 13, IE 9, and the IE 8 mode of IE 9. While &amp;, &lt;, and &gt; are also transformed by innerHTML, they do not need special handling as they are also valid entities in XML.
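The distinction above can be checked mechanically. XML predefines exactly five named entities (`amp`, `lt`, `gt`, `quot`, `apos`); any other named reference is undefined unless a DTD declares it. The helper below is a minimal sketch with a hypothetical name, `findNonXmlEntities`, and it assumes the markup has no DTD and no CDATA sections.

```javascript
// The five named entities predefined by XML 1.0.
const XML_PREDEFINED = new Set(['amp', 'lt', 'gt', 'quot', 'apos']);

// Hypothetical helper: list the distinct named entity references in `markup`
// that XML does not predefine. Numeric references such as &#160; are ignored
// because they are valid in XML.
function findNonXmlEntities(markup) {
  const found = new Set();
  const pattern = /&([A-Za-z][A-Za-z0-9]*);/g;
  let match;
  while ((match = pattern.exec(markup)) !== null) {
    if (!XML_PREDEFINED.has(match[1])) {
      found.add(match[1]);
    }
  }
  return Array.from(found);
}

const offending = findNonXmlEntities(
  '<text>A&nbsp;&amp;&nbsp;B&shy;C &lt;ok&gt; &#160;</text>'
);
console.log(offending);
```

For this sample input, only `nbsp` and `shy` are reported, matching the two entities the patch replaces.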

@jasonmp85 jasonmp85 Fix handling of HTML entities invalid in XML
7da2665
@highslide-software

Shouldn't &nbsp; and &shy; be avoided in the Highcharts config in the first place?

@jasonmp85

While my example uses them in the formatter, this bug would also be exercised if they were in data returned by an AJAX call, or in input entered by a user (one can type a NO-BREAK SPACE pretty easily on OS X: it's option-space).

And again, this isn't related to the strings &nbsp; or &shy;, it's about the content of the SVG containing the literal UTF character U+00A0 or U+00AD and the browser returning &nbsp; or &shy; when Highcharts attempts to get them as part of the SVG string.

So while the entities &nbsp; and &shy; should be avoided in the Highcharts config, I see no reason to assume that the aforementioned literal characters should be disallowed in config, content, or input as long as the page is being served with the correct character encoding.

@jasonmp85

Any verdict on whether this can be included? If not, are there any alternative approaches?

@highslide-software
@jasonmp85

The entities do not appear in the config. The config contains valid UTF-8 characters which are silently converted to the entities by the browser when innerHTML is called by Highcharts.

Since these entities are not valid in SVG, the export module is broken for all inputs containing these characters. My patch fixes this.

Again to be clear, the input does not contain any entity references, it contains the UTF characters directly. Highcharts is broken for input containing these two characters.

@jasonmp85

Another question: What would I replace them with? They are already in the pure form, simple UTF-8 characters. Highcharts is calling innerHTML on the nodes containing these characters and the browsers are using entity references to encode the characters. Which is fine, because innerHTML returns HTML, and the references are valid in HTML. But then Highcharts slaps those entities into SVG, which is not OK.

Think of this in another way: a user wants a pie chart with number of commits per person and there's a team member named Bjørn. If innerHTML encoded that name as Bj&oslash;rn, Highcharts would not be able to export a graph containing it because while &oslash; is valid in HTML, it's not in SVG. Nothing the user could do short of replacing the ø with an o would work.
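To illustrate the hypothetical Bjørn case, here is a table-driven sketch of the same replacement idea. It is not what this patch or Highcharts does: the name `decodeHtmlOnlyEntities` and the three-entry table are assumptions for illustration, and a real table would need every HTML entity a browser might emit.

```javascript
// Illustrative table only: HTML entity names mapped to their characters.
const HTML_ONLY_ENTITIES = {
  nbsp: '\u00A0',   // NO-BREAK SPACE
  shy: '\u00AD',    // SOFT HYPHEN
  oslash: '\u00F8'  // LATIN SMALL LETTER O WITH STROKE
};

// Hypothetical helper: replace named references found in `table` with the
// literal character and leave everything else (e.g. &amp;) untouched.
function decodeHtmlOnlyEntities(svg, table = HTML_ONLY_ENTITIES) {
  return svg.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (whole, name) =>
    Object.prototype.hasOwnProperty.call(table, name) ? table[name] : whole
  );
}

const exported = decodeHtmlOnlyEntities('<text>Bj&oslash;rn&nbsp;&amp;&nbsp;co</text>');
console.log(exported);
```

The XML-predefined `&amp;` passes through unchanged, so the result stays well-formed while the HTML-only references become literal characters.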

@highslide-software

OK, I missed that the entities are derived from true UTF characters rather than coming from the config. Fixing this now.

@highslide-software highslide-software closed this pull request from a commit
highslide-software Replace HTML entities before exporting. Closes #347. 0c9cd61
@jasonmp85

Thanks for your help!

Commits on Aug 6, 2011
  1. @jasonmp85

    Fix handling of HTML entities invalid in XML

    jasonmp85 authored
Showing with 4 additions and 0 deletions.
  1. +4 −0 js/modules/exporting.src.js
4 js/modules/exporting.src.js
@@ -318,6 +318,10 @@ extend(Chart.prototype, {
return s.toLowerCase();
});
+ // replace HTML entities not valid in XML with actual characters
+ svg = svg.replace(/&nbsp;/g, '\u00A0'); // NO-BREAK SPACE
+ svg = svg.replace(/&shy;/g, '\u00AD'); // SOFT HYPHEN
+
// IE9 beta bugs with innerHTML. Test again with final IE9.
svg = svg.replace(/(url\(#highcharts-[0-9]+)&quot;/g, '$1')
.replace(/&quot;/g, "'");