add an HTML5 sanitizer vendor for Rails to integrate with #162

Merged on May 17, 2023 (2 commits)
46 changes: 41 additions & 5 deletions CHANGELOG.md
@@ -1,16 +1,43 @@
## next / unreleased

* `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
* Sanitizers that use an HTML5 parser are now available on platforms supported by
Nokogiri::HTML5. These are available as:

- `Rails::HTML5::FullSanitizer`
- `Rails::HTML5::LinkSanitizer`
- `Rails::HTML5::SafeListSanitizer`

And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
of Rails.

Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
to the vendor class methods on `Rails::HTML::Sanitizer`.

*Mike Dalessio*

* `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the existing sanitizers,
and should have been a private constant all along anyway.
* Module namespaces have changed, but backwards compatibility is provided by aliases.

The library defines three additional modules:

- `Rails::HTML` for general functionality (replacing `Rails::Html`)
- `Rails::HTML4` containing sanitizers that parse content as HTML4
- `Rails::HTML5` containing sanitizers that parse content as HTML5

The following aliases are maintained for backwards compatibility:

- `Rails::Html` points to `Rails::HTML`
- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`

*Mike Dalessio*

* `Rails::Html` has been renamed to `Rails::HTML`, but this module is aliased to `Rails::Html` for
backwards compatibility.
* `SafeListSanitizer` allows `time` tag and `lang` attribute by default.

*Mike Dalessio*

* `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the existing sanitizers,
and should have been a private constant all along anyway.

*Mike Dalessio*

@@ -24,6 +51,7 @@

*seyerian*


## 1.4.4 / 2022-12-13

* Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
@@ -69,6 +97,7 @@

*Mike Dalessio*


## 1.4.2 / 2021-08-23

* Slightly improve performance.
@@ -77,6 +106,7 @@

*Mike Dalessio*


## 1.4.1 / 2021-08-18

* Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -89,6 +119,7 @@

*Mike Dalessio*


## 1.4.0 / 2021-08-18

* Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -101,12 +132,14 @@

*Mike Dalessio*


## 1.3.0

* Address deprecations in Loofah 2.3.0.

*Josh Goodall*


## 1.2.0

* Remove needless `white_list_sanitizer` deprecation.
@@ -121,6 +154,7 @@

*Kasper Timm Hansen*


## 1.1.0

* Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -138,10 +172,12 @@

*Kasper Timm Hansen*


## 1.0.1

* Added support for Rails 4.2.0.beta2 and above


## 1.0.0

* First release.
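
As a rough illustration of the namespace entry in the changelog above: a backwards-compatible rename like this is usually done in Ruby with plain constant assignment. The sketch below uses a hypothetical `Demo` namespace as a stand-in and does not load the gem, so every name under `Demo` is an assumption that only mirrors the shape the changelog describes.

```ruby
# Stand-in namespace: `Demo` and everything under it is hypothetical and only
# mirrors the module layout described in the changelog. It is not the gem's code.
module Demo
  module HTML4
    class FullSanitizer; end
  end

  module HTML
    # Constant alias: both paths now refer to the same class object.
    FullSanitizer = HTML4::FullSanitizer
  end

  # The old mixed-case module name is kept as an alias of the new one.
  Html = HTML
end

# Old and new constant paths resolve to the same object.
puts Demo::Html::FullSanitizer.equal?(Demo::HTML4::FullSanitizer)

# An aliased class keeps the name it was first defined under.
puts Demo::HTML::FullSanitizer.name
```

This is why the tests in this PR can compare `Rails::HTML::SafeListSanitizer.name` against the string `"Rails::HTML4::SafeListSanitizer"`: an alias adds a second path to the class without renaming it.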
59 changes: 43 additions & 16 deletions lib/rails/html/sanitizer.rb
@@ -4,22 +4,6 @@ module Rails
   module HTML
     class Sanitizer
       class << self
-        def full_sanitizer
-          Rails::HTML4::FullSanitizer
-        end
-
-        def link_sanitizer
-          Rails::HTML4::LinkSanitizer
-        end
-
-        def safe_list_sanitizer
-          Rails::HTML4::SafeListSanitizer
-        end
-
-        def white_list_sanitizer # :nodoc:
-          safe_list_sanitizer
-        end
-
         def html5_support?
           return @html5_support if defined?(@html5_support)

Expand Down Expand Up @@ -209,6 +193,28 @@ def serialize(fragment)
end

module HTML4
module Sanitizer
module VendorMethods
def full_sanitizer
Rails::HTML4::FullSanitizer
end

def link_sanitizer
Rails::HTML4::LinkSanitizer
end

def safe_list_sanitizer
Rails::HTML4::SafeListSanitizer
end

def white_list_sanitizer # :nodoc:
safe_list_sanitizer
end
end

extend VendorMethods
end

# == Rails::HTML4::FullSanitizer
#
# Removes all tags from HTML4 but strips out scripts, forms and comments.
@@ -299,6 +305,26 @@ class SafeListSanitizer < Rails::HTML::Sanitizer
   end

   module HTML5
+    class Sanitizer
+      class << self
+        def full_sanitizer
+          Rails::HTML5::FullSanitizer
+        end
+
+        def link_sanitizer
+          Rails::HTML5::LinkSanitizer
+        end
+
+        def safe_list_sanitizer
+          Rails::HTML5::SafeListSanitizer
+        end
+
+        def white_list_sanitizer # :nodoc:
+          safe_list_sanitizer
+        end
+      end
+    end
+
     # == Rails::HTML5::FullSanitizer
     #
     # Removes all tags from HTML5 but strips out scripts, forms and comments.
@@ -389,6 +415,7 @@ class SafeListSanitizer < Rails::HTML::Sanitizer
   end if Rails::HTML::Sanitizer.html5_support?

   module HTML
+    Sanitizer.extend(HTML4::Sanitizer::VendorMethods) # :nodoc:
     FullSanitizer = HTML4::FullSanitizer # :nodoc:
     LinkSanitizer = HTML4::LinkSanitizer # :nodoc:
     SafeListSanitizer = HTML4::SafeListSanitizer # :nodoc:
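
The diff above moves the HTML4 vendor methods into a `VendorMethods` module, which is then extended onto both `Rails::HTML4::Sanitizer` and the existing `Rails::HTML::Sanitizer`. A minimal, self-contained sketch of that sharing pattern follows. It uses a hypothetical `Demo` namespace with empty stand-in classes instead of the gem, so it shows only the mechanism, not the library's implementation.

```ruby
# Hypothetical stand-ins: nothing here is loaded from rails-html-sanitizer.
module Demo
  module HTML
    class Sanitizer; end
  end

  module HTML4
    class FullSanitizer < Demo::HTML::Sanitizer; end
    class SafeListSanitizer < Demo::HTML::Sanitizer; end

    module Sanitizer
      # Vendor methods live in one module so they can be shared.
      module VendorMethods
        def full_sanitizer
          Demo::HTML4::FullSanitizer
        end

        def safe_list_sanitizer
          Demo::HTML4::SafeListSanitizer
        end

        # Deprecated name, kept as a thin wrapper around the new one.
        def white_list_sanitizer
          safe_list_sanitizer
        end
      end

      # Makes the methods callable as Demo::HTML4::Sanitizer.full_sanitizer.
      extend VendorMethods
    end
  end

  module HTML
    # The legacy entry point gets the same singleton methods.
    Sanitizer.extend(HTML4::Sanitizer::VendorMethods)
  end
end

puts Demo::HTML4::Sanitizer.full_sanitizer
puts Demo::HTML::Sanitizer.white_list_sanitizer
```

Because both objects extend the same module, the two entry points cannot drift apart. That matches the changelog's note that `Rails::HTML4::Sanitizer` behaves identically to the vendor class methods on `Rails::HTML::Sanitizer`.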
32 changes: 28 additions & 4 deletions test/rails_api_test.rb
@@ -32,19 +32,43 @@ def test_html4_sanitizer_alias_safe_list
     assert_equal("Rails::HTML4::SafeListSanitizer", Rails::HTML::SafeListSanitizer.name)
   end

-  def test_full_sanitizer_returns_a_full_sanitizer
+  def test_html4_full_sanitizer
     assert_equal(Rails::HTML4::FullSanitizer, Rails::HTML::Sanitizer.full_sanitizer)
+    assert_equal(Rails::HTML4::FullSanitizer, Rails::HTML4::Sanitizer.full_sanitizer)
   end

-  def test_link_sanitizer_returns_a_link_sanitizer
+  def test_html4_link_sanitizer
     assert_equal(Rails::HTML4::LinkSanitizer, Rails::HTML::Sanitizer.link_sanitizer)
+    assert_equal(Rails::HTML4::LinkSanitizer, Rails::HTML4::Sanitizer.link_sanitizer)
   end

-  def test_safe_list_sanitizer_returns_a_safe_list_sanitizer
+  def test_html4_safe_list_sanitizer
     assert_equal(Rails::HTML4::SafeListSanitizer, Rails::HTML::Sanitizer.safe_list_sanitizer)
+    assert_equal(Rails::HTML4::SafeListSanitizer, Rails::HTML4::Sanitizer.safe_list_sanitizer)
   end

-  def test_white_list_sanitizer_returns_a_safe_list_sanitizer
+  def test_html4_white_list_sanitizer
     assert_equal(Rails::HTML4::SafeListSanitizer, Rails::HTML::Sanitizer.white_list_sanitizer)
+    assert_equal(Rails::HTML4::SafeListSanitizer, Rails::HTML4::Sanitizer.white_list_sanitizer)
   end
+
+  def test_html5_full_sanitizer
+    skip("no HTML5 support on this platform") unless Rails::HTML::Sanitizer.html5_support?
+    assert_equal(Rails::HTML5::FullSanitizer, Rails::HTML5::Sanitizer.full_sanitizer)
+  end
+
+  def test_html5_link_sanitizer
+    skip("no HTML5 support on this platform") unless Rails::HTML::Sanitizer.html5_support?
+    assert_equal(Rails::HTML5::LinkSanitizer, Rails::HTML5::Sanitizer.link_sanitizer)
+  end
+
+  def test_html5_safe_list_sanitizer
+    skip("no HTML5 support on this platform") unless Rails::HTML::Sanitizer.html5_support?
+    assert_equal(Rails::HTML5::SafeListSanitizer, Rails::HTML5::Sanitizer.safe_list_sanitizer)
+  end
+
+  def test_html5_white_list_sanitizer
+    skip("no HTML5 support on this platform") unless Rails::HTML::Sanitizer.html5_support?
+    assert_equal(Rails::HTML5::SafeListSanitizer, Rails::HTML5::Sanitizer.white_list_sanitizer)
+  end
 end
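
The HTML5 tests above are skipped unless `Rails::HTML::Sanitizer.html5_support?` is true, and calling code would likely need the same guard, since `Rails::HTML5` is only defined on supported platforms. One way a caller might choose a vendor is sketched below. `pick_vendor` and the `FakeRails` namespace are hypothetical and exist only so the snippet runs without the gem. The `html5_supported` toggle is a demo device, not how the library detects support.

```ruby
# Hypothetical helper: returns the HTML5 vendor when the platform reports
# HTML5 support, otherwise the HTML4 vendor. `rails_ns` is assumed to be
# shaped like the `Rails` module after this PR.
def pick_vendor(rails_ns)
  if rails_ns::HTML::Sanitizer.html5_support?
    rails_ns::HTML5::Sanitizer
  else
    rails_ns::HTML4::Sanitizer
  end
end

# Stand-in namespace so the sketch is runnable on its own.
module FakeRails
  module HTML
    class Sanitizer
      class << self
        attr_accessor :html5_supported # demo-only toggle

        def html5_support?
          !!html5_supported
        end
      end
    end
  end

  module HTML4
    module Sanitizer; end
  end

  module HTML5
    class Sanitizer; end
  end
end

FakeRails::HTML::Sanitizer.html5_supported = false
puts pick_vendor(FakeRails)

FakeRails::HTML::Sanitizer.html5_supported = true
puts pick_vendor(FakeRails)
```

With the real gem the call would presumably be `pick_vendor(Rails)`. The `HTML5` constant is only referenced inside the supported branch, so the helper never touches it on platforms where it is not defined.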