Handle non-ASCII data properly #17

GoogleCodeExporter · 2015-06-30T04:19:15Z

The tokenizer mishandles non-ASCII strings. The xdot format tells us how many bytes long a string will be, but I hand that count to the substr function, which counts in characters, not bytes. I also wrongly named our variable "chars" instead of "bytes". (The underlying mistake was in the Graphviz documentation, which said the xdot format counted characters; I submitted a patch to fix the documentation.)
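
To illustrate the mismatch, here is a minimal sketch rather than Canviz code. It assumes the strings are UTF-8 encoded and uses `TextEncoder`, which is available in current browsers and Node.js. The `stream` value is made up for illustration and is not real xdot output.

```javascript
// "\u00e4" is "ä", which takes 2 bytes in UTF-8 but is 1 JavaScript character.
const text = "\u00e4\u00e4";
const charCount = text.length;                            // 2
const byteCount = new TextEncoder().encode(text).length;  // 4

// Illustrative input only: the string followed by further data.
const stream = text + " REST";

// Passing the byte count to substr treats it as a character count,
// so two characters too many are consumed.
const wrong = stream.substr(0, byteCount);

console.log(charCount, byteCount, JSON.stringify(wrong)); // 2 4 "ää R"
```

The extra characters swallowed here belong to whatever follows the string, which is why the tokenizer then trips over the remaining input.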

None of the sample graphs exhibit the problem. You only see it when a single label results in more than one text draw command, such as a multiline label, a record, or an HTML-like table. Here's an example:

digraph utf8 {
    a [label="ää\nb"]
}

Result in Canviz:

unknown token 14.000000

This was originally reported to me by email by Jan Wielemaker in November 2007, and he provided a patch in his repository:

http://gollem.science.uva.nl/git/ClioPatria.git?a=commitdiff;h=1669b252b25b6e75ced28be39b0449e9d13a62d3

I can't find any JavaScript string functions that work on bytes instead of characters, so the method proposed in this patch seems to be the way to go.
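
One way to work in bytes without a built-in function is to walk the string and add up the UTF-8 length of each code point until the byte count is reached. The sketch below is an assumption about how that could look; it is not the code from the patch or from Canviz, and `takeUtf8Bytes` is a hypothetical name.

```javascript
// Hypothetical helper: returns the leading part of `str` whose UTF-8
// encoding occupies `byteCount` bytes. Assumes `byteCount` ends on a
// character boundary; otherwise the last character is included whole.
function takeUtf8Bytes(str, byteCount) {
  let bytes = 0;
  let end = 0; // index into the string, in UTF-16 code units
  while (end < str.length && bytes < byteCount) {
    const cp = str.codePointAt(end);
    // UTF-8 length of this code point.
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    // Code points above U+FFFF occupy two UTF-16 code units.
    end += cp > 0xffff ? 2 : 1;
  }
  return str.slice(0, end);
}

// Illustrative input only, not real xdot output.
console.log(JSON.stringify(takeUtf8Bytes("\u00e4\u00e4 REST", 4))); // "ää"
```

For pure ASCII input the byte count and character count are equal, so this returns the same result as `substr(0, byteCount)`.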

Original issue reported on code.google.com by ryandesi...@gmail.com on 13 Oct 2008 at 4:31


GoogleCodeExporter · 2015-06-30T04:19:16Z

Fixed in r115. I rewrote the patch to match my style and use easier-to-understand variable names.

Original comment by ryandesi...@gmail.com on 13 Oct 2008 at 6:43

Changed state: Fixed

GoogleCodeExporter added Priority-Medium auto-migrated Type-Defect labels Jun 30, 2015

GoogleCodeExporter closed this as completed Jun 30, 2015
