Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loofah-integration #11218

Closed
wants to merge 192 commits into from
Closed

Loofah-integration #11218

wants to merge 192 commits into from

Conversation

kaspth
Copy link
Contributor

@kaspth kaspth commented Jul 1, 2013

Today Rails uses the HTML-scanner gem to do its sanitization. We will switch the gem that is used, while keeping the old API for backwards compatibility. Instead of the scanner gem, we will use Loofah. Loofah is built on top of Nokogiri, meaning we rid the implementation's reliance on regular expressions and we get speed. On large documents and fragments Loofah is around 60 to 100% faster than the current implementation.

Notes

The sanitizers used in ActionView::SanitizeHelper have been extracted to the rails-html-sanitizer gem.
https://github.com/rafaelfranca/rails-html-sanitizer

The DomAssertions and SelectorAssertions have been extracted to the rails-dom-testing gem.
https://github.com/kaspth/rails-dom-testing

The substitution values syntax in assert_select has changed.

assert_select "div#?", /\d+/
assert_select "div:match('id', ?)", /\d+/

The attribute to match should be enclosed in quotes to avoid issues with Nokogiri's css selector syntax parsing. It is not necessary to do so with the question mark.

Todos

  • Extract failing tests to mark them as pending.
  • Find areas where backwards compatibility is broken
Pending Test Fixes
Sanitizers
  • Implement Action View FullSanitizer, LinkSanitizer and WhiteListSanitizer in sanitizers.rb
  • Deprecate protocol_separator and bad_tags for WhiteListSanitizer
  • Make sanitize accept custom :tags and :attributes options
  • Make sanitize accept a Loofah::Scrubber via :scrubber option
  • Make PermitScrubber
  • Make TargetScrubber
Sanitizers Testing
  • Add new tests in sanitizers_test.rb
  • Complete test coverage for PermitScrubber
  • Complete test coverage for TargetScrubber
  • Test behavior of changing sanitization
  • Move testing of PermitScrubber's peculiarities from sanitizers_test.rb
Dom and Selector Assertions
  • Reimplement assert_dom_equal with Loofah
  • Move Dom and Selector assertions to Action View
  • Fix test_nested_css_select failing on line 245
  • Fix test_feed_item_encoded finding two p elements when it shouldn't
  • Marked as pending the colliding xml namespaces issue discussed here: https://groups.google.com/forum/#!topic/nokogiri-talk/Nv8kX4p_r7I

Related issues

flavorjones/loofah#44
flavorjones/loofah#45
flavorjones/loofah#46
flavorjones/loofah#47
flavorjones/loofah#51
flavorjones/loofah#52
flavorjones/loofah#54

//@rafaelfranca

@ghost ghost assigned rafaelfranca Jul 1, 2013
end

def protocol_separator
ActiveSupport::Deprecation.warn('protocol_separator has been deprecated and has no effect.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can call self.class.protocol_separator here and in all the deprecated instance methods to avoid duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, of course!

Den 01/07/2013 kl. 18.25 skrev Rafael Mendonça França notifications@github.com:

In actionview/lib/action_view/helpers/sanitize_helper/sanitizers.rb:

  • def sanitize(html, options = {})
  •  return nil unless html
    
  •  validate_options(options)
    
  •  loofah_fragment = Loofah.fragment(html)
    
  •  loofah_fragment.scrub!(:strip)
    
  •  loofah_fragment.xpath("./form").each { |form| form.remove }
    
  •  loofah_fragment.to_s
    
  • end
  • def sanitize_css(style_string)
  •  Loofah::HTML5::Scrub.scrub_css style_string
    
  • end
  • def protocol_separator
  •  ActiveSupport::Deprecation.warn('protocol_separator has been deprecated and has no effect.')
    
    We can call self.class.protocol_separator here and in all the deprecated instance methods to avoid duplication.


Reply to this email directly or view it on GitHub.

@rafaelfranca
Copy link
Member

cc @carlosantoniodasilva @josevalim @jeremy @NZKoz @tenderlove mind to review this one? I think we are done

@@ -65,5 +65,9 @@
* Fix removing trailing slash for mounted apps #3215

*Piotr Sarnacki*

* Loofah replaces html-scanner in dom assertions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New entries must be on the top.

@guilleiguaran
Copy link
Member

I think we should move actionpack/lib/action_dispatch/testing/assertions/dom.rb to actionview and make to loofah only a dependency for actionview

/cc @drogus @strzalek

@kaspth
Copy link
Contributor Author

kaspth commented Jul 15, 2013

@guilleiguaran I'll second that.
Though html-scanner is still required in selector.rb and tag.rb in assertions/, which I still haven't converted to using Loofah.

@guilleiguaran
Copy link
Member

Let's move selector.rb and tag.rb also to actionview 😄

@kaspth
Copy link
Contributor Author

kaspth commented Jul 15, 2013

Fine by me.

end
var
end
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these no newlines generally allowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, there is a newline at the end of files. This is the only one missing a newline. Mostly a nitpick...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I wasn't aware of that. I'll look through my files and add newlines where needed.

@kaspth
Copy link
Contributor Author

kaspth commented Jul 16, 2013

The guides say that assert_tag is deprecated: http://guides.rubyonrails.org/testing.html#testing-views
So I'm guessing we shouldn't bother moving that over.

Should I remove the file assertions/tag.rb, @rafaelfranca?

@@ -2,6 +2,9 @@ source 'https://rubygems.org'

gemspec

# temporary gem while working on loofah integration
gem 'loofah', '~> 1.2.1', github: 'kaspth/loofah'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it's pointing to your fork of loofah? Did you made any changes? Are they already merged to official master branch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are changes essential for this to work if I am not wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. It would be good to merge those changes to official loofah repository before merging this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of changes that are yet to be merged, which have not received response yet from @flavorjones .
After he does that , it should be fine to change this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these PR are pending, but the idea is remove this git dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@strzalek Yes, I have made lots of changes. You can check out the PRs in the description if you're interested 😉
@vipulnsward is right, we're still waiting for Mike to pull in the changes.

Once Loofah includes them, this line will be removed.
(that's the reason for the 'temporary' comment, I see now that that wasn't clear why the gem was temporary.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaspth you can now probably change this. Thanks to @flavorjones !

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can. Much love to @flavorjones! Thank you kindly, Mike.

@rafaelfranca
Copy link
Member

@kaspth yes, lets remove it

@kaspth
Copy link
Contributor Author

kaspth commented Jul 17, 2013

@rafaelfranca Yippee!

@kaspth
Copy link
Contributor Author

kaspth commented Jul 17, 2013

@strzalek is there anything I can do to make it easier for you to move dom.rb and selector.rb in action_dispatch/testing/assertions/ into actionview while I'm changing the files, anyway?

@lukaszx0
Copy link
Member

I'm actually not touching those files at all so feel free to do whatever you want with them. They unlikely be conflicted with my changes. No worries.

@kaspth
Copy link
Contributor Author

kaspth commented Jul 18, 2013

Then it's all good then.

@pftg
Copy link
Contributor

pftg commented Jul 27, 2013

Please take in account this:

index 3b52b20..6ff7b68 100644
--- a/actionview/test/template/sanitize_helper_test.rb
+++ b/actionview/test/template/sanitize_helper_test.rb
@@ -38,6 +38,9 @@ class SanitizeHelperTest < ActionView::TestCase
     assert_equal("<<<bad html", strip_tags("<<<bad html"))
     assert_equal("<<", strip_tags("<<<bad html>"))

+    assert_equal "This is <-- not\n a comment here.",
+                 strip_tags("This is <-- not\n a comment here.")
+
     assert_equal("Weirdos", strip_tags("Wei<<a>a onclick='alert(document.cookie);'</a>/>rdos"))

from #11629

@kaspth
Copy link
Contributor Author

kaspth commented Jul 31, 2013

@pftg Sorry for the late response. The test case has been added.

@pftg
Copy link
Contributor

pftg commented Jul 31, 2013

🆒 thanks

@schneems
Copy link
Member

schneems commented Aug 1, 2013

Monster PR is monster. Where do we stand here? Are we waiting on approval of this PR to cut a release of loofah in order to update the dependency on the PR?

@rafaelfranca
Copy link
Member

There are more things to do @kaspth we need to update the TODO list

@kaspth
Copy link
Contributor Author

kaspth commented Aug 1, 2013

Some small number of tests, regarding the new assert_select, still aren't passing.
Yes, we are waiting for a new release of Loofah too.

@rafaelfranca I'll update it now.

rafaelfranca added a commit that referenced this pull request Jul 10, 2014
Loofah-integration

Conflicts:
	actionpack/CHANGELOG.md
	actionview/CHANGELOG.md
@rafaelfranca
Copy link
Member

I merged this branch on the rails/rails loofah branch. I'll test this branch with some applications I have and will test the deprecated gems too. If you have something to change on this branch please open PRs against this new branch.

@kaspth awesome work! ❤️

@schneems
Copy link
Member

epic!

@kaspth
Copy link
Contributor Author

kaspth commented Jul 10, 2014

Hell yeah!
Thanks for all the love, Rafael ❤️

Kasper

Den 10/07/2014 kl. 21.54 skrev Rafael Mendonça França notifications@github.com:

I merged this branch on the rails/rails loofah branch. I'll test this branch with some applications I have and will test the deprecated gems too. If you have something to change on this branch please open PRs against this new branch.

@kaspth awesome work!


Reply to this email directly or view it on GitHub.

@chancancode
Copy link
Member

🤘

@mdespuits
Copy link

❤️ 💚 💛 💜 ❤️

@vipulnsward
Copy link
Member

Awesome! Great work @kaspth !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet