Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

HTML Scanner parser regression #2550

Closed
titanous opened this Issue · 11 comments

7 participants

@titanous

@tenderlove:

586a944 causes at least two different types of false-positive parse errors that did not exist in 3.0.9:

def test_xml_tag
  assert_nothing_raised { HTML::Document.new('<?xml version="1.0" encoding="UTF-8"?>', true, true) }
end

def test_js
  assert_nothing_raised { HTML::Document.new('<script>1<=2</script>', true, true) }
end
Loaded suite test/template/html-scanner/document_test
Started
......F........F
Finished in 0.016336 seconds.

  1) Failure:
test_js(DocumentTest)
    [test/template/html-scanner/document_test.rb:153:in `test_js']:
Exception raised:
Class: <RuntimeError>
Message: <"expected > (got \"\" for <=2, {})">
---Backtrace---
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/node.rb:194:in `parse'
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/document.rb:20:in `initialize'
test/template/html-scanner/document_test.rb:153:in `new'
test/template/html-scanner/document_test.rb:153:in `test_js'
test/template/html-scanner/document_test.rb:153:in `test_js'
---------------

  2) Failure:
test_xml_tag(DocumentTest)
    [test/template/html-scanner/document_test.rb:149:in `test_xml_tag']:
Exception raised:
Class: <RuntimeError>
Message: <"expected > (got \"?>\" for <?xml version=\"1.0\" encoding=\"UTF-8\"?>, {\"encoding\"=>\"UTF-8\", \"version\"=>\"1.0\"})">
---Backtrace---
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/node.rb:194:in `parse'
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/document.rb:20:in `initialize'
test/template/html-scanner/document_test.rb:149:in `new'
test/template/html-scanner/document_test.rb:149:in `test_xml_tag'
test/template/html-scanner/document_test.rb:149:in `test_xml_tag'
---------------

16 tests, 30 assertions, 2 failures, 0 errors
@tenderlove
Owner

I highly recommend you use a real HTML parser like loofah.

I will be replacing the html scanner with loofah in Rails 3.2.

@titanous

That's a good idea, but assert_select uses HTML Scanner, which causes a bunch of false-positive warnings to spew while running our test suite (previous to 3.0.10 we had strict mode on, breaking our tests if things didn't parse properly).

@tenderlove
Owner

Yes. Your choices are:

  • Fix your code so that assert_select is happy
  • Provide a patch that you can guarantee against XSS attacks
  • Use loofah and write a different assert_select
  • wait until Rails 3.2.

Writing an HTML parser is a fools errand. I advise against it. Bottom line is that I won't fix this until 3.2, but I will apply patches.

@NZKoz
Owner

I'd veto #2 in that list too, it's too risky for us to make those changes given the severity of the vulnerability that was fixed.

A patch to fix assert_select itself would be worth it, but the html scanner is a big risk to change.

@tenderlove tenderlove was assigned
@steveklabnik
Collaborator

@tenderlove Did you actually put loofah into Rails 3.2? Can this be closed?

Doesn't look like it: https://github.com/rails/rails/tree/master/actionpack/lib/action_controller/vendor

@tenderlove
Owner

Not yet. Maybe we can get some help with that? /cc @flavorjones

@flavorjones

Serious work needs to be done to either a) deprecate parts of Rails's sanitization API which are both undocumented and untested (yes, srsly) or b) enhance Loofah to support Rails's sanitzation API.

I've got a side branch where some of b) is done, but if we can have an conversation about deprecating some of the API, that might be both easier and better.

@rafaelfranca

I can help to get this done.

@flavorjones

@rafaelfranca - Let me ramp back up on what's partially done ... give me a day or two to find the code and reread my notes.

@rafaelfranca rafaelfranca was assigned
@guilleiguaran

Guys I'll close this since looks like this won't be fixed in 4.0.x and we have a patch to integrate Loofah in 4.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.