When using HTML, no extra whitespace (indentation) is added. For example:
# => "<pre><code>moo</code></pre>\n"
However, with XHTML (and XML), the original line is split into three lines (pre, code, and /pre) and the middle line is erroneously indented, like this:
# => "<pre>\n <code>moo</code>\n</pre>\n"
I first ran into a similar issue with sanitize, but its author forwarded it to nokogiri:
>> Nokogiri::HTML.fragment('<b><a href="http://foo.com/">foo</a></b><img src="http://foo.com/bar.jpg" />').to_xhtml
=> "<b>\n <a href=\"http://foo.com/\">foo</a>\n</b><img src=\"http://foo.com/bar.jpg\" />"
Here's the original issue discussion: rgrove/sanitize#47 (comment)
This happens with nokogiri 1.5.0 on ree 1.8.7 (and probably on ruby 1.8.7 as well).
For XML, you can turn off formatting by calling #to_xml as follows:
doc.to_xml(:save_with => 0)
There's some strangeness in the serializers. It's worth discussing within the core team. We'll probably revamp this for the next major release.