process XML data before pretty-printing to trim whitespace #172

Merged
merged 2 commits into from Mar 18, 2014

Conversation

Projects
None yet
4 participants
@unsignedint
Contributor

unsignedint commented Oct 15, 2013

When receiving XML data that is already 'pretty' the current pretty printing process seems to add extra new line characters. I've modified the code to actually strip whitespace from the source input which seems to resolve this.

Input:

>>> raw = '<a>\n  <b>blah</b>\n  <c>123</c>\n</a>\n'
>>> print raw
<a>
  <b>blah</b>
  <c>123</c>
</a>

Current:

>>> print parseString(raw).toprettyxml(indent=' '*4)
<?xml version="1.0" ?>
<a>


    <b>
        blah
    </b>


    <c>
        123
    </c>


</a>

With patch:

>>> processed = ''.join((x.strip() for x in raw.split('\n')))
>>> print parseString(processed).toprettyxml(indent=' '*4)
<?xml version="1.0" ?>
<a>
    <b>
        blah
    </b>
    <c>
        123
    </c>
</a>
@unsignedint

This comment has been minimized.

Show comment Hide comment
@unsignedint

unsignedint Oct 16, 2013

Contributor

OK, my github skills are lacking, but commit bee10e5 is an alternative solution that uses ElementTree (and the indent function from http://effbot.org/zone/element-lib.htm#prettyprint).

The difference is that the output XML formatting will then put fields on a single line rather than with the added line-breaks. I.e. compare the version above with:

<a>
    <b>123</b>
    <c>456</c>
</a>
Contributor

unsignedint commented Oct 16, 2013

OK, my github skills are lacking, but commit bee10e5 is an alternative solution that uses ElementTree (and the indent function from http://effbot.org/zone/element-lib.htm#prettyprint).

The difference is that the output XML formatting will then put fields on a single line rather than with the added line-breaks. I.e. compare the version above with:

<a>
    <b>123</b>
    <c>456</c>
</a>
@jakubroztocil

This comment has been minimized.

Show comment Hide comment
@jakubroztocil

jakubroztocil Nov 23, 2013

Owner

@unsignedint Thanks for the pull request! Would you mind adding a couple of test cases to showcase and verify the new behaviour?

Owner

jakubroztocil commented Nov 23, 2013

@unsignedint Thanks for the pull request! Would you mind adding a couple of test cases to showcase and verify the new behaviour?

@maedox

This comment has been minimized.

Show comment Hide comment
@maedox

maedox Feb 27, 2014

Am I right in assuming this did not get merged yet?
I am mostly working with XML input and output and the extra two blank lines per line of XML is driving me nuts. sad-🐼

maedox commented Feb 27, 2014

Am I right in assuming this did not get merged yet?
I am mostly working with XML input and output and the extra two blank lines per line of XML is driving me nuts. sad-🐼

@skurfer

This comment has been minimized.

Show comment Hide comment
@skurfer

skurfer Mar 16, 2014

Am I right in assuming this did not get merged yet?

Looks like it’s waiting on tests, but you can try it now. (I am. Huge improvement.)

git clone https://github.com/jkbr/httpie.git
cd httpie
git checkout -b xml master
git pull https://github.com/unsignedint/httpie master
python setup.py install

That last step can be done in a virtualenv, etc. Since the version number in master matches the version on PyPI (for now), you should still see future updates.

@unsignedint, since your first commit is largely replaced, you can squash it with the second to make it look like you did it right the first time. 😃 (Try something like git rebase -i HEAD~2.) But then you’d have to force-push to get the changes here, which may or may not be frowned upon, depending on the project’s policy.

skurfer commented Mar 16, 2014

Am I right in assuming this did not get merged yet?

Looks like it’s waiting on tests, but you can try it now. (I am. Huge improvement.)

git clone https://github.com/jkbr/httpie.git
cd httpie
git checkout -b xml master
git pull https://github.com/unsignedint/httpie master
python setup.py install

That last step can be done in a virtualenv, etc. Since the version number in master matches the version on PyPI (for now), you should still see future updates.

@unsignedint, since your first commit is largely replaced, you can squash it with the second to make it look like you did it right the first time. 😃 (Try something like git rebase -i HEAD~2.) But then you’d have to force-push to get the changes here, which may or may not be frowned upon, depending on the project’s policy.

- content = doc.toprettyxml(indent=' ' * DEFAULT_INDENT)
- except xml.parsers.expat.ExpatError:
+ root = ElementTree.fromstring(content.encode(encoding))
+ self.indent(root)

This comment has been minimized.

Show comment Hide comment
@jakubroztocil

jakubroztocil Mar 18, 2014

Owner

Isn't XMLProcessor.indent() static?

@jakubroztocil

jakubroztocil Mar 18, 2014

Owner

Isn't XMLProcessor.indent() static?

jakubroztocil added a commit that referenced this pull request Mar 18, 2014

Merge pull request #172 from unsignedint/master
process XML data before pretty-printing to trim whitespace

@jakubroztocil jakubroztocil merged commit 733771f into jakubroztocil:master Mar 18, 2014

1 check passed

default The Travis CI build passed
Details
@jakubroztocil

This comment has been minimized.

Show comment Hide comment
@jakubroztocil

jakubroztocil Mar 18, 2014

Owner

Merged. Tests are still welcome though 😉

Owner

jakubroztocil commented Mar 18, 2014

Merged. Tests are still welcome though 😉

@unsignedint

This comment has been minimized.

Show comment Hide comment
@unsignedint

unsignedint Mar 19, 2014

Contributor

Thanks for merging... TBH I started looking at tests but got lost somewhere with mocks... will try to revisit it in the near future :)

Oh you are right about the static thing, I probably didn't notice it. I could change that but guess it doesn't matter.

Contributor

unsignedint commented Mar 19, 2014

Thanks for merging... TBH I started looking at tests but got lost somewhere with mocks... will try to revisit it in the near future :)

Oh you are right about the static thing, I probably didn't notice it. I could change that but guess it doesn't matter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment