
XML has utf-16 fails to parse #41

Open
nportelli opened this issue Jul 2, 2013 · 3 comments

@nportelli

I'm a bit unfamiliar with utf, but is there a reason why it won't parse if it is utf-16?

@TheChrisPratt

Are you sure the source file is encoded as UTF-16? It definitely won't
parse a single-byte encoded file using a double-byte decoder.
(Chris)
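
To make the mismatch concrete, here is a minimal sketch (not taken from the plugin) that feeds the same document to minidom twice: once as UTF-8 bytes that contradict a utf-16 declaration, and once as real UTF-16 bytes. `DOC` and `try_parse` are made-up names for illustration.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# Sample document (hypothetical) whose declaration claims utf-16.
DOC = '<?xml version="1.0" encoding="utf-16"?><root><item>x</item></root>'

def try_parse(data):
    """Return the root tag name, or None if expat rejects the input."""
    try:
        return parseString(data).documentElement.tagName
    except ExpatError:
        return None

# Declaration says utf-16, but the bytes are UTF-8: the parser refuses.
mismatched = try_parse(DOC.encode("utf-8"))

# Declaration says utf-16 and the bytes really are UTF-16 (with a BOM): it parses.
matched = try_parse(DOC.encode("utf-16"))
```

So the declaration itself is not the problem; the failure comes from the bytes handed to the parser disagreeing with it.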


@nportelli
Author

I'm not sure. I think it is whatever the default .NET serializer we are using produces. All I need to do is change 16 to 8 and your plugin works great, so in all reality it's not the plugin's issue. I should figure out how to make the thing save in UTF-8. Go ahead and close this.
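
For anyone landing here with the same setup, "changing 16 to 8" can be scripted. This is a minimal sketch, assuming the change refers to the encoding named in the XML declaration and that the file on disk really is stored as UTF-16; `declare_utf8` and `convert_file` are hypothetical helper names, not part of the plugin.

```python
import re

# Matches the encoding value inside the first XML declaration (assumed to be utf-16).
_DECL_UTF16 = re.compile(r'(<\?xml[^>]*?encoding=["\'])utf-16(["\'])', re.IGNORECASE)

def declare_utf8(text):
    """Rewrite encoding="utf-16" to encoding="utf-8" in the XML declaration."""
    return _DECL_UTF16.sub(r"\1utf-8\2", text, count=1)

def convert_file(src, dst):
    """Read a UTF-16 file and write it back as UTF-8 with a matching declaration."""
    with open(src, encoding="utf-16") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8") as f:
        f.write(declare_utf8(text))
```

Text without a utf-16 declaration passes through `declare_utf8` unchanged.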

@domduke12

I had the same issue. As a temporary workaround, I swap the XML declaration for one without utf-16 before parsing, then put the original declaration back before returning the formatted string. I'm not sure this counts as a proper fix. Tested on Sublime Text 3. Look for "fix:" in the following code...

# Imports this method relies on; BaseIndentCommand is defined elsewhere in the plugin.
import re
import sublime
from xml.dom.minidom import parseString

class IndentXmlCommand(BaseIndentCommand):
    def indent(self, s):
        # convert to UTF-8 bytes
        s = s.encode("utf-8")
        # non-greedy, so only the first <?...?> at the start is captured
        xmlheader = re.compile(rb"<\?.*?\?>").match(s)
        # fix: replace the header so a declared utf-16 cannot contradict the UTF-8 bytes
        # (the replacement must be bytes, because s is bytes at this point)
        if xmlheader:
            s = s.replace(xmlheader.group(), b'<?xml version="1.0"?>', 1)
        # convert to plain string without indents and spaces
        s = re.compile(rb'>\s+([^\s])', re.DOTALL).sub(rb'>\g<1>', s)
        # replace tags to convince minidom to process cdata as text
        s = s.replace(b'<![CDATA[', b'%CDATAESTART%').replace(b']]>', b'%CDATAEEND%')
        try:
            # toprettyxml() returns str, so everything below works on str
            s = parseString(s).toprettyxml()
        except Exception:
            sublime.active_window().run_command("show_panel", {"panel": "console", "toggle": True})
            raise
        # remove line breaks
        s = re.compile(r'>\n\s+([^<>\s].*?)\n\s+</', re.DOTALL).sub(r'>\g<1></', s)
        # restore cdata
        s = s.replace('%CDATAESTART%', '<![CDATA[').replace('%CDATAEEND%', ']]>')
        # remove the xml header added by toprettyxml()
        s = s.replace("<?xml version=\"1.0\" ?>", "").strip()
        # fix: put the original header back
        if xmlheader:
            s = xmlheader.group().decode("utf-8") + "\n" + s
        return s
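
A different approach, sketched here as an assumption rather than as the plugin's behavior, is to leave the declaration alone and encode the text with whatever encoding it names, so that the bytes and the declaration agree. `parse_honoring_declaration` and `pretty` are hypothetical names, and an encoding that Python does not know would raise `LookupError`.

```python
import re
from xml.dom.minidom import parseString

# Captures the encoding name from an XML declaration at the very start of the text.
_DECL_ENCODING = re.compile(r'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']')

def parse_honoring_declaration(text):
    """Encode text with its declared encoding (default UTF-8) and parse it."""
    m = _DECL_ENCODING.match(text)
    encoding = m.group(1) if m else "utf-8"
    return parseString(text.encode(encoding))

def pretty(text):
    """Pretty-print the XML in text; minidom emits its own header in the result."""
    return parse_honoring_declaration(text).toprettyxml()
```

The header in the pretty-printed output would still have to be swapped back for the original one, as the snippet above does.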
