Loofah is an HTML sanitizer. It will always fix broken markup, but can also sanitize unsafe tags in a few different ways, and transform the markup for storage or display.
It's built on top of Nokogiri and libxml2, so it's fast. And it uses html5lib's whitelist, so it most likely won't make your codes less secure.
(These statements have not been evaluated by Internet Experts.)
This library was formerly known as Dryopteris.
Strip unsafe tags, leaving behind only the inner text.
Prune unsafe tags and their subtrees, removing all traces that they ever existed.
Escape unsafe tags and their subtrees, leaving behind lots of &lt; and &gt; entities.
Whitewash the markup, removing all attributes and namespaced nodes.
Format the markup as plain text.
99 44/100 % Tenderlove-free!
For a full explanation, see the documentation for Loofah.
  require 'loofah'

  unsafe_html = "ohai! <div>a div is safe</div> <script>but script is not</script>"
  Loofah.scrub_fragment(unsafe_html, :prune).to_s # => "ohai! <div>a div is safe</div> "
  doc = Loofah.fragment(unsafe_html) # returns a Nokogiri document ...
  doc.scrub!(:prune)                 # ... with one extra method
  doc.to_s                           # => "ohai! <div>a div is safe</div> "
  doc.text                           # => "ohai! a div is safe "
  # config/environment.rb
  require 'loofah/active_record'

  # db/schema.rb
  create_table "posts" do |t|
    t.string "title"
    t.string "body"
  end

  # app/models/post.rb
  class Post < ActiveRecord::Base
    html_fragment :body, :scrub => :prune # scrubs 'body' in a before_save
  end
Ruby 1.8 or 1.9
Nokogiri >= 1.3.3
gem install loofah
The bug tracker is available here:
You can also try the Nokogiri mailing list:
And the IRC channel is #nokogiri on freenode.
Featuring code contributed by:
The MIT License
Copyright © 2009 Mike Dalessio, Bryan Helmkamp
See MIT-LICENSE.txt in this directory.