HTML Truncator

Want to truncate an HTML string properly? This gem is for you. It's powered by Nokogiri!

How to use it

It's very simple. Install it with rubygems:

gem install html_truncator

Or, if you use bundler, add it to your Gemfile:

gem "html_truncator", "~>0.1"

Then you can use it in your code:

require "html_truncator"
HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3)
# => "<p>Lorem ipsum dolor...</p>"

The HTML_Truncator class has only one method, truncate, with 3 arguments:

  • the HTML-formatted string to truncate
  • the number of words to keep (real words only: tags and attributes aren't counted)
  • the ellipsis (optional, '...' by default).

And an attribute, ellipsable_tags, which lists the tags that can contain the ellipsis (by default: p ol ul li div header article nav section footer aside dd dt dl).
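To make "real words" concrete, here is a rough sketch of what counting words while ignoring tags and attributes means. It is an illustration only, not the gem's code: the gem relies on Nokogiri, whereas this uses a simplistic regexp split, and `count_real_words` is a hypothetical helper name.

```ruby
# Illustration only: a simplified word counter, not the gem's implementation.
# `count_real_words` is a hypothetical helper name.
def count_real_words(html)
  # Split the input into tags (<...>) and the text chunks between them.
  chunks = html.scan(/<[^>]*>|[^<]+/)
  # Drop the tags (and the attributes inside them), keep the text chunks.
  text_chunks = chunks.reject { |chunk| chunk.start_with?("<") }
  # Count whitespace-separated words in the remaining text.
  text_chunks.sum { |text| text.split.size }
end

count_real_words('<p class="intro long">Lorem <i>ipsum</i> dolor sit amet.</p>')
# => 5
```

The class attribute and the inline tags add nothing to the count, so a limit of 3 on this string would keep "Lorem ipsum dolor".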

Examples

A simple example:

HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3)
# => "<p>Lorem ipsum dolor...</p>"

If the text is too short to be truncated, it won't be modified:

HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 5)
# => "<p>Lorem ipsum dolor sit amet.</p>"

You can customize the ellipsis:

HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3, " (truncated)")
# => "<p>Lorem ipsum dolor (truncated)</p>"

And even have HTML in the ellipsis:

HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3, '<a href="/more-to-read">...</a>')
# => "<p>Lorem ipsum dolor<a href="/more-to-read">...</a></p>"

The ellipsis is put at the right place, inside <p>, but not <i>:

HTML_Truncator.truncate("<p><i>Lorem ipsum dolor sit amet.</i></p>", 3)
# => "<p><i>Lorem ipsum dolor</i>...</p>"

You can indicate that a tag can contain the ellipsis by adding it to ellipsable_tags:

HTML_Truncator.ellipsable_tags << "blockquote"
HTML_Truncator.truncate("<blockquote>Lorem ipsum dolor sit amet.</blockquote>", 3)
# => "<blockquote>Lorem ipsum dolor...</blockquote>"

Alternatives

Rails has a truncate helper, but as the doc says:

Care should be taken if text contains HTML tags or entities, because truncation may produce invalid HTML (such as unbalanced or incomplete tags).
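To see the problem that warning describes, here is a small plain-Ruby sketch of a naive, character-based cut. It uses neither Rails nor this gem, and the sample string is made up for the illustration.

```ruby
# A naive character-based truncation, shown for contrast (not part of this gem).
html = "<p>Lorem <b>ipsum dolor</b> sit amet.</p>"

# Keep the first 20 characters, markup included, and append an ellipsis.
naive = html[0, 20] + "..."

puts naive
# Prints: <p>Lorem <b>ipsum do...
```

The result cuts "dolor" in the middle of the word and never closes the <b> and <p> tags.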

I know there is other Ruby code to truncate HTML.

But I'm not pleased with these solutions: they either rely on regexps to parse the content (too fragile), don't put the ellipsis where expected, cut words, or sometimes leave empty DOM nodes. So I made my own gem ;-)

Issues or Suggestions

Found an issue or have a suggestion? Please report it on GitHub's issue tracker.

If you want to make a pull request, please run the specs first:

rspec spec

Credits

Thanks to François de Metz for his awesome help!

Copyright (c) 2011 Bruno Michel bmichel@menfin.info, released under the MIT license
