Sanitizing HTML Content

Brandon Robins edited this page May 4, 2015 · 5 revisions

When a post is saved, Storytime calls Storytime.post_sanitizer to sanitize the post's draft content. The hook is intended to strip out unwanted tags and attributes and return a clean version of the content to save. It accepts either a lambda or a Proc, which lets you override the default sanitization.

Overriding Storytime.post_sanitizer

To override Storytime.post_sanitizer, edit the config.post_sanitizer assignment in config/initializers/storytime.rb.
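
As a rough illustration of the hook's contract (draft content in, string to save out), the sketch below assigns a lambda instead of a Proc. `ExampleConfig` is a hypothetical stand-in for Storytime's config object so the snippet runs outside Rails, and `ESCAPE_ALL_MARKUP` is an illustrative name. Escaping every tag with `CGI.escapeHTML` is just one possible strategy, not Storytime's default.

```ruby
require "cgi"

# Hypothetical stand-in for Storytime's config object (illustration only).
ExampleConfig = Struct.new(:post_sanitizer)

# A lambda that allows no markup at all: every tag is escaped.
# Unlike a Proc, a lambda enforces its arity, so it must take exactly
# the one argument the hook is assumed to be called with.
ESCAPE_ALL_MARKUP = ->(draft_content) { CGI.escapeHTML(draft_content.to_s) }

config = ExampleConfig.new
config.post_sanitizer = ESCAPE_ALL_MARKUP

# Lambdas and Procs are both invoked with #call.
puts config.post_sanitizer.call("<script>alert(1)</script>Hello")
```

In a real initializer, only the `config.post_sanitizer = ...` assignment would be needed.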

Adding Tags and/or Attributes to Sanitizer Whitelist

The following example adds the iframe tag and the frameborder and allowfullscreen attributes to the sanitizer whitelist.

# Host app: config/initializers/storytime.rb

config.post_sanitizer = Proc.new do |draft_content|
  # This check assumes Rails 4.x: 4.0/4.1 use html-scanner, 4.2+ uses rails-html-sanitizer.
  if Rails::VERSION::MINOR <= 1
    white_list_sanitizer = HTML::WhiteListSanitizer.new
    # Copy the default Sets so the additions below do not modify the shared defaults.
    tags = white_list_sanitizer.allowed_tags.dup
    attributes = white_list_sanitizer.allowed_attributes.dup
  else
    white_list_sanitizer = Rails::Html::WhiteListSanitizer.new
    tags = Loofah::HTML5::WhiteList::ALLOWED_ELEMENTS_WITH_LIBXML2.dup
    attributes = Loofah::HTML5::WhiteList::ALLOWED_ATTRIBUTES.dup
  end

  # Add any additional tags or attributes to the tags/attributes Sets here.
  # Set#add and Set#merge both modify the receiver in place.
  tags.add("iframe")
  attributes.merge(["frameborder", "allowfullscreen"])

  white_list_sanitizer.sanitize(draft_content, tags: tags, attributes: attributes)
end
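
One detail worth knowing when extending the whitelist: `Set#add` and `Set#merge` modify the receiver in place rather than returning a new Set. Because the tag and attribute Sets in the example can be shared objects (for instance the Loofah constants), working on a copy keeps the additions local to this Proc. The following is a minimal sketch using only Ruby's standard library. `DEFAULT_TAGS` and `with_extra` are hypothetical names, and the values are placeholders rather than the real defaults.

```ruby
require "set"

# Hypothetical stand-in for a shared default whitelist.
DEFAULT_TAGS = Set.new(%w[p a strong])

# Returns a widened copy; the Set passed in is left untouched.
def with_extra(defaults, extras)
  defaults.dup.merge(extras)
end

tags = with_extra(DEFAULT_TAGS, ["iframe"])
puts tags.include?("iframe")         # the copy contains the new tag
puts DEFAULT_TAGS.include?("iframe") # the shared Set does not

# By contrast, calling merge directly on a Set changes that Set itself.
shared = Set.new(%w[href src])
shared.merge(["frameborder", "allowfullscreen"])
puts shared.size
```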

No Sanitization of Content

The following Proc overrides Storytime.post_sanitizer and passes the post's draft content through untouched, so all tags and attributes are allowed.

# Host app: config/initializers/storytime.rb

config.post_sanitizer = Proc.new do |draft_content|
  # No need to sanitize content, just passing it through...
  draft_content
end

Default Content Sanitization

By default, the following Proc is used to sanitize draft content:

# Storytime Engine: lib/storytime.rb

@@post_sanitizer = Proc.new do |draft_content|
  if Rails::VERSION::MINOR <= 1
    white_list_sanitizer = HTML::WhiteListSanitizer.new
    tags = white_list_sanitizer.allowed_tags
    attributes = white_list_sanitizer.allowed_attributes
  else
    white_list_sanitizer = Rails::Html::WhiteListSanitizer.new
    tags = Loofah::HTML5::WhiteList::ALLOWED_ELEMENTS_WITH_LIBXML2
    attributes = Loofah::HTML5::WhiteList::ALLOWED_ATTRIBUTES
  end

  white_list_sanitizer.sanitize(draft_content, tags: tags, attributes: attributes)
end

Note that Storytime's post_sanitizer Proc branches on your version of Rails 4, because the implementation of the sanitize helper differs between versions (see The New HTML Sanitizer in Rails 4.2 for more info). Rails 4.2+ uses the rails-html-sanitizer gem to sanitize content, while Rails < 4.2 uses html-scanner.

The set of whitelisted tags and attributes depends on which implementation is in use. The rails-html-sanitizer gem whitelists 174 tags and 287 attributes (through Loofah), while html-scanner whitelists 41 tags and 12 attributes. If you are using Rails < 4.2 and plan to upgrade to 4.2+, it may be a good idea to declare your own set of tags and attributes so that the whitelist does not change with the upgrade.
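
One way to declare your own set is sketched below, under some assumptions: the tag and attribute lists are hypothetical examples rather than recommendations, and `build_post_sanitizer` is an illustrative helper that is not part of Storytime. It wraps any object that responds to `sanitize(html, tags:, attributes:)`, which is how both Rails sanitizers are called in the Procs above.

```ruby
require "set"

# Hypothetical explicit whitelists; adjust to what your posts actually need.
POST_ALLOWED_TAGS = Set.new(
  %w[p br strong em a ul ol li blockquote h2 h3 img iframe]
).freeze
POST_ALLOWED_ATTRIBUTES = Set.new(
  %w[href src alt title frameborder allowfullscreen]
).freeze

# Illustrative helper (not part of Storytime): builds a sanitizer Proc that
# always passes the same explicit lists, whichever sanitizer is supplied.
def build_post_sanitizer(sanitizer)
  Proc.new do |draft_content|
    sanitizer.sanitize(
      draft_content,
      tags: POST_ALLOWED_TAGS,
      attributes: POST_ALLOWED_ATTRIBUTES
    )
  end
end

# Possible use in config/initializers/storytime.rb (assumes Rails 4.x):
#   sanitizer = if Rails::VERSION::MINOR <= 1
#     HTML::WhiteListSanitizer.new
#   else
#     Rails::Html::WhiteListSanitizer.new
#   end
#   config.post_sanitizer = build_post_sanitizer(sanitizer)
```

Because the lists are fixed in your own code, the allowed markup should stay the same across the html-scanner and rails-html-sanitizer implementations.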