Commit

Merge branch 'master' into edge
Conflicts:
	Gemfile.default
norman committed Nov 4, 2010
2 parents e650c03 + 357b51b commit d482d3e
Showing 6 changed files with 77 additions and 24 deletions.
1 change: 0 additions & 1 deletion Gemfile.default
@@ -1,3 +1,2 @@
source :rubygems
gem "rbench"
gemspec
74 changes: 56 additions & 18 deletions README.md
@@ -1,8 +1,13 @@
# Babosa

Babosa is a library for creating slugs. It is an extraction and improvement of
the string code from [FriendlyId](http://github.com/norman/friendly_id),
intended to help developers create similar libraries and plugins.
Babosa is a library for creating human-friendly identifiers. Its primary
purpose is creating URL slugs, but it can also be useful for normalizing and
sanitizing data.

It is an extraction and improvement of the string code from
[FriendlyId](http://github.com/norman/friendly_id). I have released this as a
separate library to help developers who want to create libraries similar to
FriendlyId.

## Features / Usage

@@ -15,8 +20,8 @@ intended to help developers create similar libraries and plugins.
"Jürgen Müller".to_slug.approximate_ascii.to_s #=> "Jurgen Muller"
"Jürgen Müller".to_slug.approximate_ascii(:german).to_s #=> "Juergen Mueller"

Currently, only German, Spanish and Serbian are supported. I'll gladly accept
contributions and support more languages.
Supported languages currently include Danish, German, Serbian and Spanish. I'll
gladly accept contributions and support more languages.

### Non-ASCII removal

@@ -41,28 +46,59 @@ whose length is limited by bytes rather than UTF-8 characters.

"Gölcük, Turkey".to_slug.normalize.to_s #=> "golcuk-turkey"

### Other stuff

Babosa can also generate strings for Ruby method names. (Yes, Ruby 1.9 can use UTF-8 chars
in method names, but you may not want to):


"this is a method".to_slug.to_ruby_method! #=> this_is_a_method
"über cool stuff!".to_slug.to_ruby_method! #=> uber_cool_stuff!

# You can also disallow trailing punctuation chars
"über cool stuff!".to_slug.to_ruby_method(false) #=> uber_cool_stuff
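
The idea behind this conversion can be sketched without Babosa at all. The helper below is hypothetical (`ruby_method_name` is not part of the library) and assumes Ruby 2.2 or later for `String#unicode_normalize`; it is a rough approximation, and it simply drops non-Latin characters rather than using Babosa's transliteration tables:

```ruby
# Hypothetical helper, not Babosa's implementation: builds a Ruby-style
# method name from free text using only the standard library.
def ruby_method_name(text, allow_bangs = true)
  stripped = text.strip
  return "" if stripped.empty?

  # Only the final character may be one of the special suffixes !, ? or =.
  leader  = stripped[0...-1]
  trailer = stripped[-1]
  disallowed = allow_bangs ? /[^a-z0-9!?=]/ : /[^a-z0-9]/
  trailer = trailer.downcase.gsub(disallowed, "")

  # Decompose accented letters (ü becomes u plus a combining mark),
  # then discard everything outside the ASCII range.
  ascii = leader.unicode_normalize(:nfd).gsub(/[^\x00-\x7F]/, "")

  # Collapse runs of non-word characters into single underscores.
  body = ascii.downcase.gsub(/[^a-z0-9]+/, "_").gsub(/\A_+|_+\z/, "")
  body + trailer
end

puts ruby_method_name("über cool stuff!")
puts ruby_method_name("über cool stuff!", false)
```

Treating the last character separately is what lets a trailing `!` or `?` survive while the same characters elsewhere in the string are turned into separators.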


You can add not only transliterations, but expansions for some characters if you want:

Babosa::Characters.add_approximations(:user, {
"0" => "oh",
"1" => "one",
"2" => "two",
"3" => "three",
"." => " dot "
})
"Web 2.0".to_slug.normalize!(:transliterations => :user) #=> "web-two-dot-oh"
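
As a rough illustration of what an expansion table does, the following standalone sketch applies the same mapping with plain Ruby. The names `EXPANSIONS` and `expand_and_slug` are made up for this example, and the slug step is a simplified stand-in for Babosa's `normalize!`:

```ruby
# Hypothetical stand-in for a user-defined approximation table.
EXPANSIONS = {
  "0" => "oh",
  "1" => "one",
  "2" => "two",
  "3" => "three",
  "." => " dot "
}.freeze

# Replace each character that has an expansion, then build a simple slug.
def expand_and_slug(text, table = EXPANSIONS)
  expanded = text.gsub(/./m) { |char| table.fetch(char, char) }
  expanded.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

puts expand_and_slug("Web 2.0") # prints "web-two-dot-oh"
```

Because a replacement may be longer than one character and may contain spaces, the expansion has to run before separators are inserted.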

### UTF-8 support

Babosa has no hard dependencies, but if you have either the Unicode or
ActiveSupport gems installed and required prior to requiring "babosa", these
will be used to perform upcasing and downcasing on UTF-8 strings. On JRuby 1.5
and above, Java's native Unicode support will be used.
and above, Java's native Unicode support will be used instead. Unless you're on
JRuby, which already has excellent support for Unicode via Java's Standard
Library, I recommend using the Unicode gem because it's the fastest Ruby
Unicode library available.

If none of these libraries are available, Babosa falls back to a simple module
which only supports Latin characters. I recommend using the Unicode gem where
possible since it's a C extension and is very fast.
which only supports Latin characters.

This default module is fast and can do very naive Unicode composition to ensure
that, for example, "é" will always be composed to a single codepoint rather
than an "e" and a "´" - making it safe to use as a hash key. But seriously -
save yourself the headache and install a real Unicode library.
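
To see why composition matters for hash keys, here is a small sketch that uses `String#unicode_normalize` from the standard library of Ruby 2.2 and later. That method did not exist in the Rubies this README targets, so it stands in here for whichever Unicode library is in use:

```ruby
composed   = "\u00E9"   # "é" as a single codepoint
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# The two strings render identically but are different codepoint sequences,
# so they are different hash keys.
lookup = { composed => "found" }
puts lookup[decomposed].inspect                          # prints "nil"

# Composing to NFC first makes the lookup succeed.
puts lookup[decomposed.unicode_normalize(:nfc)].inspect  # prints "\"found\""
```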


### Rails 3

Most of Babosa's functionality is already present in Active Support/Rails 3.
Babosa exists primarily to support non-Rails applications, and Rails apps prior
to 3.0. Most of the code here was originally written for FriendlyId. Several
things, like tidy_bytes and ASCII transliteration, were later added to Rails and I18N.
things, like `tidy_bytes` and ASCII transliteration, were later added to Rails
and I18N.

Babosa differs from ActiveSupport primarily in that it supports non-Latin
strings by default, and has per-locale transliterations already baked-in. If
strings by default, and has per-locale ASCII transliterations already baked-in. If
you are considering using Babosa with Rails 3, you should first take a look at
Active Support's
[transliterate](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000565)
@@ -82,8 +118,8 @@ Babosa can be installed via Rubygems:

You can get the source code from its [Github repository](http://github.com/norman/babosa).

Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4-1.5,
Rubinius 1.0, and is probably compatible with other Rubies as well.
Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4-1.5, and
Rubinius 1.0.x. It's probably compatible with other Rubies as well.

## Reporting bugs

@@ -100,24 +136,26 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue

## Contributors

* [Molte Emil Strange Andersen](http://github.com/molte) - Danish support
* [Milan Dobrota](http://github.com/milandobrota) - Serbian support


## Changelog

* 0.2.0 - Added support for Danish. Added method to generate Ruby identifiers. Improved performance.
* 0.1.1 - Added support for Serbian.
* 0.1.0 - Initial extraction from FriendlyId.

## Copyright

Copyright (c) 2010 Norman Clarke

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
1 change: 1 addition & 0 deletions Rakefile
@@ -1,3 +1,4 @@
require "rubygems"
require "rake/testtask"
require "rake/clean"
require "rake/gempackagetask"
17 changes: 15 additions & 2 deletions lib/babosa/identifier.rb
@@ -137,8 +137,21 @@ def normalize!(options = nil)
end

# Normalize a string so that it can safely be used as a Ruby method name.
def to_ruby_method!
normalize!(:to_ascii => true, :separator => "_")
def to_ruby_method!(allow_bangs = true)
leader, trailer = @wrapped_string.strip.scan(/\A(.+)(.)\z/).flatten
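# String#downcase returns a copy, so gsub! on it would leave trailer
# unchanged; assign the cleaned result back instead.
if allow_bangs
trailer = trailer.downcase.gsub(/[^a-z0-9!=\\\\?]/, '')
else
trailer = trailer.downcase.gsub(/[^a-z0-9]/, '')
end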
id = leader.to_identifier
id.transliterate!
id.to_ascii!
id.clean!
id.word_chars!
id.clean!
@wrapped_string = id.to_s + trailer
with_separators!("_")
end

# Delete any non-ascii characters.
2 changes: 1 addition & 1 deletion lib/babosa/utf8/java_proxy.rb
@@ -6,7 +6,7 @@ module UTF8
module JavaProxy
extend UTF8Proxy
extend self
import java.text.Normalizer
java_import java.text.Normalizer

def downcase(string)
string.to_java.to_lower_case.to_s
6 changes: 4 additions & 2 deletions test/babosa_test.rb
@@ -181,7 +181,9 @@ class BabosaTest < Test::Unit::TestCase
end

test "should get a string suitable for use as a ruby method" do
ss = "カタカナ: katakana is über cool".to_identifier
assert_equal "katakana_is_uber_cool", ss.to_ruby_method!
assert_equal "hello_world?", "¿¿¿hello... world???".to_slug.to_ruby_method!
assert_equal "katakana_is_uber_cool", "カタカナ: katakana is über cool".to_slug.to_ruby_method!
assert_equal "katakana_is_uber_cool!", "カタカナ: katakana is über cool!".to_slug.to_ruby_method!
assert_equal "katakana_is_uber_cool", "カタカナ: katakana is über cool".to_slug.to_ruby_method!(false)
end
end
