Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

svg double hyphen in plot title -- #1967

Merged
merged 1 commit into from May 3, 2013

Conversation

mdboom
Copy link
Member

@mdboom mdboom commented May 3, 2013

a double hyphen in plot title breaks the svg output because the double hyphen -- gets included in a xml comment, breaking it

minimal example:

import matplotlib.pyplot as plt
plt.plot([0,1])
plt.title('--')
plt.savefig( 'a.svg', format='svg' )
plt.savefig( 'a.png', format='png' )

now try and open svg output it with firefox and it will point at the error line, 479 in version 1.2.1. which contains:

<!-- -- -->

but xml comments cannot contain -- http://stackoverflow.com/questions/10842131/xml-comments-and

I could not find in the stdlib a standard way to escape/unescape double hyphens, but if anyone does, we should use it.

if an standard way does not exist, my proposed solution is: replace -- with something else in the svg comments. I suggest

  • replace all occurrences of -- with -.-, where . is any literal character
  • replace all occurrences of -.- with -..-
  • -..- with -...-

and that this be done in an overlapping manner:

  • --- becomes -.-.-

This can be achieved with the following class:

import re

class XmlCommentEscaper:
    """Escapes and unescapes ``--`` in xml comments so that they are not broken."""

    def __init__(self):
        self.escape_xml_comment = re.compile( r'-(\.*)(?=-)')

    def escape(self,comment):
        """
        >>> XmlCommentEscaper().escape('--')
        '-.-'
        >>> XmlCommentEscaper().escape('-- --')
        '-.- -.-'

        multiple:

        >>> XmlCommentEscaper().escape('-.-')
        '-..-'
        >>> XmlCommentEscaper().escape('-..-')
        '-...-'
        >>> XmlCommentEscaper().escape('-...-')
        '-....-'

        overlap:

        >>> XmlCommentEscaper().escape('---')
        '-.-.-'
        >>> XmlCommentEscaper().escape('----')
        '-.-.-.-'
        >>> XmlCommentEscaper().escape('-.--')
        '-..-.-'

        non matches:

        >>> XmlCommentEscaper().escape('-.')
        '-.'
        >>> XmlCommentEscaper().escape('.-')
        '.-'
        """
        return self.escape_xml_comment.sub( lambda m: '-' + m.group(1) + '.', comment )

    def unescape(self,comment):
        """
        >>> XmlCommentEscaper().unescape('-.-')
        '--'
        >>> XmlCommentEscaper().unescape('-.- -.-')
        '-- --'

        multiple:

        >>> XmlCommentEscaper().unescape('-..-')
        '-.-'
        >>> XmlCommentEscaper().unescape('-...-')
        '-..-'
        >>> XmlCommentEscaper().unescape('-....-')
        '-...-'

        overlap:

        >>> XmlCommentEscaper().unescape('-.-.-')
        '---'
        >>> XmlCommentEscaper().unescape('-.-.-.-')
        '----'
        >>> XmlCommentEscaper().unescape('-..-.-')
        '-.--'

        non matches:

        >>> XmlCommentEscaper().unescape('--')
        '--'
        >>> XmlCommentEscaper().unescape('-.')
        '-.'
        >>> XmlCommentEscaper().unescape('.-')
        '.-'
        """
        return self.escape_xml_comment.sub( lambda m: '-' + m.group(1)[1:], comment )

if __name__ == "__main__":
    import doctest
    doctest.testmod()

I just don't know exactly where to plug this, but it should take 5 mins for someone who knows the svg savefig architecture.

@mdboom
Copy link
Member

mdboom commented May 3, 2013

I think as long as we find a way to escape it, we don't need to worry about unescaping (since matplotlib doesn't read XML/SVG files at all). I'll work up a PR here shortly.

@ghost ghost assigned mdboom May 3, 2013
@mdboom
Copy link
Member

mdboom commented May 3, 2013

A solution is attached. @cirosantilli, could you please confirm it works for you?

FWIW, I took the approach from docbook, which is to convert -- to - -.

@cirosantilli
Copy link
Author

I'm sorry but I haven't yet been able to install matplotlib from source to test it out, I'll try to do that

for the time being, I see that you approach is based on s = s.replace(u"--", u"- -"), but I think this may fail because of overlapping:

>>> "---".replace("--","- -")
'- --'

so the comment would still broken (please check). I think the way to go is still with re + sub:

escape_xml_comment = re.compile( r'-(?=-)')
return escape_xml_comment.sub( '- ', s )

which works because of the lookahead: "---" --> "- - -"

as for unscapping, I agree that it is not directly necessary for matplotlib, I just feel it is a good practice to give unscape to escapes, since one day some user will want to use the comment content grammatically and will have to write this himself. Of course, not sure it is worth the extra code maintenance...

@mdboom
Copy link
Member

mdboom commented May 3, 2013

Ah, thanks. Good catch. I will update this to use the regex. I'm not too concerned about the unescaping at this point -- comments shouldn't ever really by used grammatically anyway...

@cirosantilli
Copy link
Author

haha sorry I meant programatically, not gramatically =)

@@ -71,6 +71,11 @@ def escape_cdata(s):
s = s.replace(u">", u"&gt;")
return s

def escape_comment(s):
escape_xml_comment = re.compile( r'-(?=-)')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps compile this re outside of the function?

@pelson
Copy link
Member

pelson commented May 3, 2013

👍 - LGTM

mdboom added a commit that referenced this pull request May 3, 2013
svg double hyphen in plot title --
@mdboom mdboom merged commit aca3fea into matplotlib:v1.2.x May 3, 2013
@mdboom mdboom deleted the svg-comment-escape branch August 7, 2014 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants