Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML Scanner parser regression #2550

Closed
titanous opened this issue Aug 16, 2011 · 11 comments
Closed

HTML Scanner parser regression #2550

titanous opened this issue Aug 16, 2011 · 11 comments
Assignees

Comments

@titanous
Copy link

@tenderlove:

586a944 causes at least two different types of false-positive parse errors that did not exist in 3.0.9:

def test_xml_tag
  assert_nothing_raised { HTML::Document.new('<?xml version="1.0" encoding="UTF-8"?>', true, true) }
end

def test_js
  assert_nothing_raised { HTML::Document.new('<script>1<=2</script>', true, true) }
end
Loaded suite test/template/html-scanner/document_test
Started
......F........F
Finished in 0.016336 seconds.

  1) Failure:
test_js(DocumentTest)
    [test/template/html-scanner/document_test.rb:153:in `test_js']:
Exception raised:
Class: <RuntimeError>
Message: <"expected > (got \"\" for <=2, {})">
---Backtrace---
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/node.rb:194:in `parse'
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/document.rb:20:in `initialize'
test/template/html-scanner/document_test.rb:153:in `new'
test/template/html-scanner/document_test.rb:153:in `test_js'
test/template/html-scanner/document_test.rb:153:in `test_js'
---------------

  2) Failure:
test_xml_tag(DocumentTest)
    [test/template/html-scanner/document_test.rb:149:in `test_xml_tag']:
Exception raised:
Class: <RuntimeError>
Message: <"expected > (got \"?>\" for <?xml version=\"1.0\" encoding=\"UTF-8\"?>, {\"encoding\"=>\"UTF-8\", \"version\"=>\"1.0\"})">
---Backtrace---
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/node.rb:194:in `parse'
/Users/titanous/src/rails/actionpack/lib/action_controller/vendor/html-scanner/html/document.rb:20:in `initialize'
test/template/html-scanner/document_test.rb:149:in `new'
test/template/html-scanner/document_test.rb:149:in `test_xml_tag'
test/template/html-scanner/document_test.rb:149:in `test_xml_tag'
---------------

16 tests, 30 assertions, 2 failures, 0 errors
@tenderlove
Copy link
Member

I highly recommend you use a real HTML parser like loofah.

I will be replacing the html scanner with loofah in Rails 3.2.

@titanous
Copy link
Author

That's a good idea, but assert_select uses HTML Scanner, which causes a bunch of false-positive warnings to spew while running our test suite (previous to 3.0.10 we had strict mode on, breaking our tests if things didn't parse properly).

@tenderlove
Copy link
Member

Yes. Your choices are:

  • Fix your code so that assert_select is happy
  • Provide a patch that you can guarantee against XSS attacks
  • Use loofah and write a different assert_select
  • wait until Rails 3.2.

Writing an HTML parser is a fools errand. I advise against it. Bottom line is that I won't fix this until 3.2, but I will apply patches.

@NZKoz
Copy link
Member

NZKoz commented Aug 17, 2011

I'd veto #2 in that list too, it's too risky for us to make those changes given the severity of the vulnerability that was fixed.

A patch to fix assert_select itself would be worth it, but the html scanner is a big risk to change.

@ghost ghost assigned tenderlove Dec 20, 2011
@steveklabnik
Copy link
Member

@tenderlove Did you actually put loofah into Rails 3.2? Can this be closed?

Doesn't look like it: https://github.com/rails/rails/tree/master/actionpack/lib/action_controller/vendor

@tenderlove
Copy link
Member

Not yet. Maybe we can get some help with that? /cc @flavorjones

@flavorjones
Copy link
Member

Serious work needs to be done to either a) deprecate parts of Rails's sanitization API which are both undocumented and untested (yes, srsly) or b) enhance Loofah to support Rails's sanitzation API.

I've got a side branch where some of b) is done, but if we can have an conversation about deprecating some of the API, that might be both easier and better.

@rafaelfranca
Copy link
Member

I can help to get this done.

@flavorjones
Copy link
Member

@rafaelfranca - Let me ramp back up on what's partially done ... give me a day or two to find the code and reread my notes.

@rafaelfranca
Copy link
Member

@flavorjones sure!

@guilleiguaran
Copy link
Member

Guys I'll close this since looks like this won't be fixed in 4.0.x and we have a patch to integrate Loofah in 4.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants