Pluralization #14

Merged 8 commits on Apr 25, 2012
96 changes: 95 additions & 1 deletion NOTICE
@@ -1,6 +1,7 @@
twitter-cldr-rb is a Ruby implementation of the Common Locale Data Repository
Copyright (C) 2012 Twitter, Inc.


Portions of this gem were borrowed from Sven Fuchs' ruby-cldr gem. Here is
the license that accompanied Mr. Fuchs' code:

@@ -24,3 +25,96 @@ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Backporting of String interpolation from Ruby 1.9 to Ruby 1.8 implemented in
this gem (see file interpolation.rb) was partially copied and heavily based
on the implementation of i18n and gettext gems. Below are the license agreements
that accompanied these gems.

License agreement of the i18n gem (https://github.com/svenfuchs/i18n):

Copyright (c) 2008 The Ruby I18n team

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

License of the gettext gem (https://github.com/mutoh/gettext):

Copyright (C) 2005-2009 Masao Mutoh

You may redistribute it [the code] and/or modify it under the same license
terms as Ruby or LGPL [http://www.gnu.org/licenses/lgpl-3.0.txt].

Ruby is copyrighted free software by Yukihiro Matsumoto <matz@netlab.jp>.
You can redistribute it and/or modify it under either the terms of the GPL
version 2 (see the file GPL), or the conditions below:

1. You may make and give away verbatim copies of the source form of the
software without restriction, provided that you duplicate all of the
original copyright notices and associated disclaimers.

2. You may modify your copy of the software in any way, provided that
you do at least ONE of the following:

a) place your modifications in the Public Domain or otherwise
make them Freely Available, such as by posting said
modifications to Usenet or an equivalent medium, or by allowing
the author to include your modifications in the software.

b) use the modified software only within your corporation or
organization.

c) give non-standard binaries non-standard names, with
instructions on where to get the original software distribution.

d) make other distribution arrangements with the author.

3. You may distribute the software in object code or binary form,
provided that you do at least ONE of the following:

a) distribute the binaries and library files of the software,
together with instructions (in the manual page or equivalent)
on where to get the original distribution.

b) accompany the distribution with the machine-readable source of
the software.

c) give non-standard binaries non-standard names, with
instructions on where to get the original software distribution.

d) make other distribution arrangements with the author.

4. You may modify and include the part of the software into any other
software (possibly commercial). But some files in the distribution
are not written by the author, so that they are not under these terms.

For the list of those files and their copying conditions, see the
file LEGAL.

5. The scripts and library files supplied as input to or produced as
output from the software do not automatically fall under the
copyright of the software, but belong to whomever generated them,
and may be sold commercially, and may be aggregated with this
software.

6. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.
28 changes: 28 additions & 0 deletions lib/ext/strings/string.rb
@@ -0,0 +1,28 @@
# encoding: UTF-8

class String
def localize(locale = TwitterCldr.get_locale)
TwitterCldr::LocalizedString.new(self, locale)
end
end

module TwitterCldr
class LocalizedString < LocalizedObject

# Uses wrapped string object as a format specification and returns the result of applying it to +args+ (see
# +TwitterCldr.interpolate+ method for interpolation syntax).
#
# If +args+ is a Hash, then pluralization is performed before interpolation (see +PluralFormatter+ class for
# pluralization specification).
#
def %(args)
pluralized = args.is_a?(Hash) ? @formatter.format(@base_obj, args) : @base_obj
TwitterCldr.interpolate(pluralized, args)
end

def formatter_const
TwitterCldr::Formatters::PluralFormatter
end

end
end
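
The two-stage behaviour of `LocalizedString#%` above (pluralize first when the argument is a Hash, then interpolate) can be sketched in isolation. This is only an illustration under stated assumptions: `SketchLocalizedString` and the `toy` pluralizer are hypothetical names that do not appear in this diff, and Ruby 1.9+ `String#%` stands in for `TwitterCldr.interpolate`.

```ruby
# Hypothetical stand-in for the wrapper in this diff; not the gem's class.
class SketchLocalizedString
  def initialize(base, pluralizer)
    @base = base
    @pluralizer = pluralizer # any object responding to call(string, hash)
  end

  # Hash arguments go through the pluralizer first; anything else skips it.
  def %(args)
    pluralized = args.is_a?(Hash) ? @pluralizer.call(@base, args) : @base
    pluralized % args # assumption: native String#% replaces TwitterCldr.interpolate
  end
end

# Toy pluralizer: handles only the '%{count:horses}' token with an English-like rule.
toy = lambda do |string, args|
  string.gsub('%{count:horses}') do
    args[:count] == 1 ? 'one horse' : "#{args[:count]} horses"
  end
end

with_hash = SketchLocalizedString.new('%{name} owns %{count:horses}', toy)
hash_args = { :name => 'Anna', :count => 2 }
puts with_hash % hash_args # => "Anna owns 2 horses"

without_hash = SketchLocalizedString.new('%d horses', toy)
puts without_hash % 3 # => "3 horses"
```

Pluralizing before interpolating matters because the expanded pattern may itself contain ordinary `%{...}` placeholders that the second stage still has to fill.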
81 changes: 81 additions & 0 deletions lib/formatters/plurals/plural_formatter.rb
@@ -0,0 +1,81 @@
# encoding: UTF-8

module TwitterCldr
module Formatters
class PluralFormatter < Base

PLURAL_INTERPOLATION_RE = /%\{(\w+?):(\w+?)\}/

attr_accessor :locale

def initialize(options = {})
self.locale = extract_locale(options)
end

# Replaces every pluralization token in the +string+ with a phrase formed using a number and a pluralization
# pattern from the +replacements+ hash.
#
# The format of a pluralization token is '%{number:objects}'. When such a token is encountered in
# the +string+, the +replacements+ hash is expected to contain a number and pluralization patterns at the keys
# +:number+ and +:objects+ respectively (note that keys of the +replacements+ hash should be symbols).
#
# Pluralization patterns are specified as a hash containing a pattern for every plural category of the language.
# Keys of this hash should be symbols. If necessary, a pluralization pattern can contain a placeholder for the
# number. The syntax for the placeholder is similar to the hash-based string interpolation: '%{number}'.
#
# Examples:
#
# f.format('%{count:horses}', :count => 1, :horses => { :one => 'one horse', :other => '%{count} horses' })
# # => "one horse"
#
# f.format('%{count:horses}', :count => 2, :horses => { :one => 'one horse', :other => '%{count} horses' })
# # => "2 horses"
#
# Multiple pluralization groups can be present in the same string.
#
# Examples:
#
# f.format(
# '%{ponies_count:ponies} and %{unicorns_count:unicorns}',
# :ponies_count => 2, :ponies => { :one => 'one pony', :other => '%{ponies_count} ponies' },
# :unicorns_count => 1, :unicorns => { :one => 'one unicorn', :other => '%{unicorns_count} unicorns' }
# )
# # => "2 ponies and one unicorn"
#
# If a number or a required pluralization pattern is missing in the +replacements+ hash, the corresponding
# pluralization token is left in the string unchanged.
#
# Examples:
#
# f.format('%{count:horses}', :horses => { :one => 'one horse', :other => '%{count} horses' })
# # => "%{count:horses}"
#
# f.format('%{count:horses}', :count => 10, :horses => { :one => 'one horse' })
# # => "%{count:horses}"
#
# f.format('%{count:horses}', {})
# # => "%{count:horses}"
#
def format(string, replacements)
string.gsub(PLURAL_INTERPOLATION_RE) do |match|
number = replacements[$1.to_sym]
patterns = replacements[$2.to_sym]
pattern = number && patterns && patterns[pluralization_rule(number)]

pattern && interpolate_pattern(pattern, $1, number) || match
end
end

private

def pluralization_rule(number)
TwitterCldr::Formatters::Plurals::Rules.rule_for(number, locale)
end

def interpolate_pattern(pattern, placeholder, number)
pattern.gsub("%{#{placeholder}}", number.to_s)
end

end
end
end
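
The token replacement in `PluralFormatter#format` can be tried without the rest of the gem. The following is a minimal sketch, assuming a hard-coded English-like rule in place of the locale lookup done by `Plurals::Rules.rule_for`; the names `PLURAL_TOKEN_RE`, `ENGLISH_LIKE_RULE` and `pluralize_tokens` are hypothetical and not part of this diff.

```ruby
PLURAL_TOKEN_RE = /%\{(\w+?):(\w+?)\}/

# Hypothetical English-like rule; the gem looks rules up per locale instead.
ENGLISH_LIKE_RULE = lambda { |number| number == 1 ? :one : :other }

def pluralize_tokens(string, replacements, rule = ENGLISH_LIKE_RULE)
  string.gsub(PLURAL_TOKEN_RE) do |token|
    number_key, patterns_key = $1, $2
    number   = replacements[number_key.to_sym]
    patterns = replacements[patterns_key.to_sym]
    pattern  = number && patterns && patterns[rule.call(number)]

    # Leave the token untouched when the number, the patterns hash,
    # or the pattern for the selected plural category is missing.
    pattern ? pattern.gsub("%{#{number_key}}", number.to_s) : token
  end
end

horses = { :one => 'one horse', :other => '%{count} horses' }

puts pluralize_tokens('%{count:horses}', { :count => 1, :horses => horses }) # => "one horse"
puts pluralize_tokens('%{count:horses}', { :count => 2, :horses => horses }) # => "2 horses"
puts pluralize_tokens('%{count:horses}', {})                                 # => "%{count:horses}"
```

Copying `$1` and `$2` into locals at the top of the block is a defensive choice: it keeps the keys stable even if a later call inside the block runs another regexp match.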
100 changes: 100 additions & 0 deletions lib/interpolation.rb
@@ -0,0 +1,100 @@
# encoding: UTF-8

# The implementation of the TwitterCldr.interpolate method that backports String interpolation capabilities
# (originally implemented in String#% method) from Ruby 1.9 to Ruby 1.8 is heavily influenced by the
# implementation of the same feature in i18n (https://github.com/svenfuchs/i18n/blob/89ea337f48562370988421e50caa7c2fe89452c7/lib/i18n/core_ext/string/interpolate.rb)
# and gettext (https://github.com/mutoh/gettext/blob/11b8c1525ba9f00afb1942f7ebf34bec12f7558b/lib/gettext/core_ext/string.rb) gems.
#
# See NOTICE file for corresponding license agreements.


# KeyError is raised during interpolation when a placeholder has no corresponding key in the
# interpolation hash. KeyError is defined in Ruby 1.9. We define it for earlier versions of Ruby to get the same behavior.
#
class KeyError < IndexError
def initialize(message = nil)
super(message || 'key not found')
end
end unless defined?(KeyError)


module TwitterCldr

HASH_INTERPOLATION_REGEXP = Regexp.union(
/%\{(\w+)\}/,
/%<(\w+)>(.*?\d*\.?\d*[bBdiouxXeEfgGcps])/
)

HASH_INTERPOLATION_WITH_ESCAPE_REGEXP = Regexp.union(
/%%/,
HASH_INTERPOLATION_REGEXP
)

class << self

# Uses +string+ as a format specification and returns the result of applying it to +args+.
#
# There are three ways to use it:
#
# * Using a single argument or Array of arguments.
#
# This is the default behaviour of the String#% method. See Kernel#sprintf for more details about the format
# specification.
#
# Example:
#
# TwitterCldr.interpolate('%d %s', [1, 'message'])
# # => "1 message"
#
# * Using a Hash as an argument and unformatted, named placeholders (Ruby 1.9 syntax).
#
# When you pass a Hash as an argument and specify placeholders with %{foo} it will interpret the hash values as
# named arguments.
#
# Example:
#
# TwitterCldr.interpolate('%{firstname}, %{lastname}', :firstname => 'Masao', :lastname => 'Mutoh')
# # => "Masao, Mutoh"
#
# * Using a Hash as an argument and formatted, named placeholders (Ruby 1.9 syntax).
#
# When you pass a Hash as an argument and specify placeholders with %<foo>d it will interpret the hash values
# as named arguments and format the value according to the formatting instruction appended to the closing >.
#
# Example:
#
# TwitterCldr.interpolate('%<integer>d, %<float>.1f', :integer => 10, :float => 43.4)
# # => "10, 43.4"
#
# An exception can be thrown in two cases when Ruby 1.9 interpolation syntax is used:
#
# * ArgumentError is thrown if Ruby 1.9 interpolation syntax is used in +string+, but +args+ is not a Hash;
# * KeyError is thrown if the value for one of the placeholders in +string+ is missing in +args+ hash.
#
def interpolate(string, args)
string =~ HASH_INTERPOLATION_REGEXP ? interpolate_hash(string, args) : interpolate_value_or_array(string, args)
end

private

def interpolate_hash(string, args)
raise ArgumentError unless args.is_a?(Hash)

string.gsub(HASH_INTERPOLATION_WITH_ESCAPE_REGEXP) do |match|
if match == '%%'
'%'
else
key = ($1 || $2).to_sym
raise KeyError unless args.has_key?(key)
$3 ? sprintf("%#{$3}", args[key]) : args[key]
end
end
end

def interpolate_value_or_array(string, args)
string.gsub(/%([{<])/, '%%\1') % args
end

end

end
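
The capture-group numbering that `interpolate_hash` relies on is easy to miss: `Regexp.union` keeps the groups of its operands in order, so `$1` is the `%{key}` name, `$2` the `%<key>` name, and `$3` the format that follows the closing `>`. Below is a self-contained sketch of the same idea, assuming Ruby 1.9+ (where `KeyError` is built in); `NAMED_RE`, `NAMED_WITH_ESCAPE_RE` and `interpolate_named` are hypothetical names chosen for the illustration.

```ruby
NAMED_RE = Regexp.union(
  /%\{(\w+)\}/,                                # $1: key of an unformatted placeholder
  /%<(\w+)>(.*?\d*\.?\d*[bBdiouxXeEfgGcps])/   # $2: key, $3: sprintf-style format
)
NAMED_WITH_ESCAPE_RE = Regexp.union(/%%/, NAMED_RE)

def interpolate_named(string, args)
  raise ArgumentError, 'expected a Hash' unless args.is_a?(Hash)

  string.gsub(NAMED_WITH_ESCAPE_RE) do |match|
    next '%' if match == '%%' # an escaped percent sign collapses to a single one

    key = ($1 || $2).to_sym
    raise KeyError, "key not found: #{key.inspect}" unless args.has_key?(key)

    $3 ? sprintf("%#{$3}", args[key]) : args[key].to_s
  end
end

puts interpolate_named('%{firstname}, %{lastname}', { :firstname => 'Masao', :lastname => 'Mutoh' })
# => "Masao, Mutoh"
puts interpolate_named('%<integer>d, %<float>.1f', { :integer => 10, :float => 43.4 })
# => "10, 43.4"
puts interpolate_named('100%% of %{what}', { :what => 'tests' })
# => "100% of tests"
```

Matching `%%` in the same pass as the placeholders is what prevents an escaped percent sign from being mistaken for the start of a placeholder.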