Skip to content

[bug] Calling to_html on Nokogiri::XML::Attr raises Unsupported document node (2) exception #3125

@marcoroth

Description

@marcoroth

Please describe the bug

Calling to_html on a Nokogiri::XML::Attr instance, parsed via HTML5::DocumentFragment is raising an Unsupported docu ment node (2); this is a bug in Nokogiri (RuntimeError) exception here.

Help us reproduce what you're seeing

#!/usr/bin/env ruby

require 'nokogiri'
require 'minitest/autorun'

class Test < Minitest::Spec
  describe "Nokogiri::XML::Attr#to_html" do
    it "should convert boolean attribute to HTML" do
      html = %(<input required>)
      fragment = Nokogiri::HTML5::DocumentFragment.parse(html)
       
      input = fragment.at_css("input")
      attribute = input.attribute("required")

      assert_equal " required", attribute.to_html
    end

    it "should convert string attribute to HTML" do
      html = %(<input name="name">)
      fragment = Nokogiri::HTML5::DocumentFragment.parse(html)
       
      input = fragment.at_css("input")
      attribute = input.attribute("name")

      assert_equal %( name="name"), attribute.to_html
    end
  end
end

Expected behavior

I expect the attribute to be converted to HTML and not raise the following error:

/Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/lib/nokogiri/html5/node.rb:64:in `html_standard_serialize': Unsupported document node (2); this is a bug in Nokogiri (RuntimeError)
        from /Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/lib/nokogiri/html5/node.rb:64:in `write_to'
        from /Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/lib/nokogiri/xml/node.rb:1302:in `serialize'
        from /Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/lib/nokogiri/xml/node.rb:1552:in `to_format'
        from /Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/lib/nokogiri/xml/node.rb:1314:in `to_html'

It seems like the same call is working fine in HTML4:

irb(main)> Nokogiri::HTML4::DocumentFragment.parse(%(<input required>)).at_css("input").attribute("required").to_html
=> " required"
irb(main)> Nokogiri::HTML4::DocumentFragment.parse(%(<input name="name">)).at_css("input").attribute("name").to_html
=> " name=\"name\""

Environment

bundle exec nokogiri -v

# Nokogiri (1.16.2)
    ---
    warnings: []
    nokogiri:
      version: 1.16.2
      cppflags:
      - "-I/Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/ext/nokogiri"
      - "-I/Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/ext/nokogiri/include"
      - "-I/Users/marcoroth/.anyenv/envs/rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/nokogiri-1.16.2-x86_64-darwin/ext/nokogiri/include/libxml2"
      ldflags: []
    ruby:
      version: 3.3.0
      platform: x86_64-darwin23
      gem_platform: x86_64-darwin-23
      description: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin23]
      engine: ruby
    libxml:
      source: packaged
      precompiled: true
      patches:
      - 0001-Remove-script-macro-support.patch
      - 0002-Update-entities-to-remove-handling-of-ssi.patch
      - 0003-libxml2.la-is-in-top_builddir.patch
      - '0009-allow-wildcard-namespaces.patch'
      - 0010-update-config.guess-and-config.sub-for-libxml2.patch
      - 0011-rip-out-libxml2-s-libc_single_threaded-support.patch
      memory_management: ruby
      iconv_enabled: true
      compiled: 2.12.5
      loaded: 2.12.5
    libxslt:
      source: packaged
      precompiled: true
      patches:
      - 0001-update-config.guess-and-config.sub-for-libxslt.patch
      datetime_enabled: true
      compiled: 1.1.39
      loaded: 1.1.39
    other_libraries:
      zlib: '1.3'
      libiconv: '1.17'
      libgumbo: 1.0.0-nokogiri

Additional context

I found this while trying to update my parse call form HTML4 to HTML5 here with these tests:

https://github.com/marcoroth/phlexing/blob/28f7d6970b69b22c425d2ccea064ecffd8d189ba/gem/test/phlexing/converter/tags_test.rb#L219-L237

I got the failing tests in this PR: marcoroth/phlexing#288

Metadata

Metadata

Assignees

No one assigned

    Labels

    state/needs-triageInbox for non-installation-related bug reports or help requests

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions