New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

svg double hyphen in plot title -- #1967

Merged
merged 1 commit into from May 3, 2013

Conversation

Projects
None yet
3 participants
@mdboom
Member

mdboom commented May 3, 2013

a double hyphen in plot title breaks the svg output because the double hyphen -- gets included in a xml comment, breaking it

minimal example:

import matplotlib.pyplot as plt
plt.plot([0,1])
plt.title('--')
plt.savefig( 'a.svg', format='svg' )
plt.savefig( 'a.png', format='png' )

now try and open svg output it with firefox and it will point at the error line, 479 in version 1.2.1. which contains:

<!-- -- -->

but xml comments cannot contain -- http://stackoverflow.com/questions/10842131/xml-comments-and

I could not find in the stdlib a standard way to escape/unescape double hyphens, but if anyone does, we should use it.

if an standard way does not exist, my proposed solution is: replace -- with something else in the svg comments. I suggest

  • replace all occurrences of -- with -.-, where . is any literal character
  • replace all occurrences of -.- with -..-
  • -..- with -...-

and that this be done in an overlapping manner:

  • --- becomes -.-.-

This can be achieved with the following class:

import re

class XmlCommentEscaper:
    """Escapes and unescapes ``--`` in xml comments so that they are not broken."""

    def __init__(self):
        self.escape_xml_comment = re.compile( r'-(\.*)(?=-)')

    def escape(self,comment):
        """
        >>> XmlCommentEscaper().escape('--')
        '-.-'
        >>> XmlCommentEscaper().escape('-- --')
        '-.- -.-'

        multiple:

        >>> XmlCommentEscaper().escape('-.-')
        '-..-'
        >>> XmlCommentEscaper().escape('-..-')
        '-...-'
        >>> XmlCommentEscaper().escape('-...-')
        '-....-'

        overlap:

        >>> XmlCommentEscaper().escape('---')
        '-.-.-'
        >>> XmlCommentEscaper().escape('----')
        '-.-.-.-'
        >>> XmlCommentEscaper().escape('-.--')
        '-..-.-'

        non matches:

        >>> XmlCommentEscaper().escape('-.')
        '-.'
        >>> XmlCommentEscaper().escape('.-')
        '.-'
        """
        return self.escape_xml_comment.sub( lambda m: '-' + m.group(1) + '.', comment )

    def unescape(self,comment):
        """
        >>> XmlCommentEscaper().unescape('-.-')
        '--'
        >>> XmlCommentEscaper().unescape('-.- -.-')
        '-- --'

        multiple:

        >>> XmlCommentEscaper().unescape('-..-')
        '-.-'
        >>> XmlCommentEscaper().unescape('-...-')
        '-..-'
        >>> XmlCommentEscaper().unescape('-....-')
        '-...-'

        overlap:

        >>> XmlCommentEscaper().unescape('-.-.-')
        '---'
        >>> XmlCommentEscaper().unescape('-.-.-.-')
        '----'
        >>> XmlCommentEscaper().unescape('-..-.-')
        '-.--'

        non matches:

        >>> XmlCommentEscaper().unescape('--')
        '--'
        >>> XmlCommentEscaper().unescape('-.')
        '-.'
        >>> XmlCommentEscaper().unescape('.-')
        '.-'
        """
        return self.escape_xml_comment.sub( lambda m: '-' + m.group(1)[1:], comment )

if __name__ == "__main__":
    import doctest
    doctest.testmod()

I just don't know exactly where to plug this, but it should take 5 mins for someone who knows the svg savefig architecture.

@mdboom

This comment has been minimized.

Member

mdboom commented May 3, 2013

I think as long as we find a way to escape it, we don't need to worry about unescaping (since matplotlib doesn't read XML/SVG files at all). I'll work up a PR here shortly.

@ghost ghost assigned mdboom May 3, 2013

@mdboom

This comment has been minimized.

Member

mdboom commented May 3, 2013

A solution is attached. @cirosantilli, could you please confirm it works for you?

FWIW, I took the approach from docbook, which is to convert -- to - -.

@cirosantilli

This comment has been minimized.

cirosantilli commented May 3, 2013

I'm sorry but I haven't yet been able to install matplotlib from source to test it out, I'll try to do that

for the time being, I see that you approach is based on s = s.replace(u"--", u"- -"), but I think this may fail because of overlapping:

>>> "---".replace("--","- -")
'- --'

so the comment would still broken (please check). I think the way to go is still with re + sub:

escape_xml_comment = re.compile( r'-(?=-)')
return escape_xml_comment.sub( '- ', s )

which works because of the lookahead: "---" --> "- - -"

as for unscapping, I agree that it is not directly necessary for matplotlib, I just feel it is a good practice to give unscape to escapes, since one day some user will want to use the comment content grammatically and will have to write this himself. Of course, not sure it is worth the extra code maintenance...

@mdboom

This comment has been minimized.

Member

mdboom commented May 3, 2013

Ah, thanks. Good catch. I will update this to use the regex. I'm not too concerned about the unescaping at this point -- comments shouldn't ever really by used grammatically anyway...

@cirosantilli

This comment has been minimized.

cirosantilli commented May 3, 2013

haha sorry I meant programatically, not gramatically =)

@@ -71,6 +71,11 @@ def escape_cdata(s):
s = s.replace(u">", u"&gt;")
return s
def escape_comment(s):
escape_xml_comment = re.compile( r'-(?=-)')

This comment has been minimized.

@pelson

pelson May 3, 2013

Member

Perhaps compile this re outside of the function?

def escape_comment(s):
escape_xml_comment = re.compile( r'-(?=-)')
s = escape_cdata(s)
return escape_xml_comment.sub( '- ', s )

This comment has been minimized.

@pelson

pelson May 3, 2013

Member

A couple of extra whitespaces could do with being removed from this line and the ones above.

@pelson

This comment has been minimized.

Member

pelson commented May 3, 2013

👍 - LGTM

mdboom added a commit that referenced this pull request May 3, 2013

Merge pull request #1967 from mdboom/svg-comment-escape
svg double hyphen in plot title --

@mdboom mdboom merged commit aca3fea into matplotlib:v1.2.x May 3, 2013

@mdboom mdboom deleted the mdboom:svg-comment-escape branch Aug 7, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment