Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mojo::DOM renders valueless attributes in XML mode #846

Closed
jast opened this issue Sep 24, 2015 · 4 comments
Closed

Mojo::DOM renders valueless attributes in XML mode #846

jast opened this issue Sep 24, 2015 · 4 comments

Comments

@jast
Copy link

jast commented Sep 24, 2015

If you parse something like <a b> as XML, rendering gives <a b /> which is not actually valid XML due to the missing attribute value. I'm not sure what the best way to "fix" the output would be... probably <a b="b" /> or <a b="" />.

To reproduce: print Mojo::DOM->new->xml(1)->parse('<a b>')->to_string

I guess fixing broken XML (or "converting" HTML to XHTML, which is what I'm doing) isn't really the aim of Mojo::DOM but this would be a fairly straightforward change... uncertainty about which fix is the right one aside.

@Grinnz
Copy link
Contributor

Grinnz commented Sep 24, 2015

XML mode as you mentioned is designed exactly for this purpose. If XML mode is not set, valid XML input/output cannot be expected.

Edit: Whoops, read it wrong. Disregard

@zoffixznet
Copy link
Contributor

I think this is just a side-effect of the ability to parse broken XHTML. Adding logic to guestimate fixes for broken XHTML on output would probably be beyond the scope of this module.

This trick might be helpful for your purposes:

perl -MMojo::DOM -wle 'print Mojo::DOM->new->xml(1)->parse("<a b>")->at(q{a[b=""]})->attr("b", "")->to_string';
<a b="" />

@zoffixznet
Copy link
Contributor

Here's a more generic approach for you that fixes all attributes on all elements:

perl -MMojo::DOM -wle 'print Mojo::DOM->new->xml(1)->parse(q{<a b c d/><b x y z c="42"/>})->find("*")->each(sub{for (values %$_){$_="" unless defined;}})->join'
<a b="" c="" d="" /><b c="42" x="" y="" z="" />

@kraih
Copy link
Member

kraih commented Sep 24, 2015

I believe <a b="b" /> would be the correct result, if we decide to repair the XML here. But i'm not sure if we should. Maybe if there's a more elegant solution than what i've come up with so far.

diff --git a/lib/Mojo/DOM/HTML.pm b/lib/Mojo/DOM/HTML.pm
index 016989a..12a75e4 100644
--- a/lib/Mojo/DOM/HTML.pm
+++ b/lib/Mojo/DOM/HTML.pm
@@ -228,8 +228,9 @@ sub _render {

   # Attributes
   for my $key (sort keys %{$tree->[2]}) {
-    $result .= " $key" and next unless defined(my $value = $tree->[2]{$key});
-    $result .= " $key" . '="' . xml_escape($value) . '"';
+    my $value = $tree->[2]{$key};
+    $result .= " $key" and next if !$xml && !defined $value;
+    $result .= " $key" . '="' . xml_escape($value // $key) . '"';
   }

   # No children
diff --git a/t/mojo/dom.t b/t/mojo/dom.t
index 14867c9..52d7e2f 100644
--- a/t/mojo/dom.t
+++ b/t/mojo/dom.t
@@ -1918,7 +1918,8 @@ is $element->parent->tag, 'XMLTest', 'right parent';
 ok $element->root->xml, 'XML mode active';
 $dom->replace('<XMLTest2 /><XMLTest3 just="works" />');
 ok $dom->xml, 'XML mode active';
-is $dom, '<XMLTest2 /><XMLTest3 just="works" />', 'right result';
+$dom->at('XMLTest2')->{foo} = undef;
+is $dom, '<XMLTest2 foo="foo" /><XMLTest3 just="works" />', 'right result';

 # Ensure HTML semantics
 ok !Mojo::DOM->new->xml(undef)->parse('<?xml version="1.0"?>')->xml,

@kraih kraih closed this as completed in 56af831 Sep 24, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants