Upgrade to CLDR v35.1 (#226)
camertron committed Oct 15, 2019
1 parent 125819f commit 42d3476
Showing 2,152 changed files with 940,921 additions and 131,379 deletions.
1 change: 0 additions & 1 deletion .travis.yml
@@ -2,7 +2,6 @@ language: ruby
sudo: false
cache: bundler
rvm:
- 1.9.3
- 2.0.0
- 2.1.10
- 2.2.10
9 changes: 8 additions & 1 deletion CHANGELOG.md
@@ -1,5 +1,12 @@
# TwitterCldr Changelog

### 5.0.0 (October 15, 2019)
* Upgrade to Unicode v12.0.0, CLDR v35.1, and ICU 64.2.
* Fixes several transliteration bugs causing incorrect transform rules to be applied.
* BREAKING: `LocalizedNumber#to_short_decimal` and `LocalizedNumber#to_long_decimal` have been replaced with `LocalizedNumber#to_decimal#to_s(format: :short)` and `LocalizedNumber#to_decimal#to_s(format: :long)` respectively.
* BREAKING: Telephone code support has been removed since the data are no longer published in the CLDR data set.
* BREAKING: Dropped support for Ruby 1.9.

### 4.4.5 (August 11, 2019)
* Fix infinite recursion bug affecting certain Russian RBNF rule sets (and
possibly other locales).
Expand Down Expand Up @@ -232,7 +239,7 @@

### 1.6.2

* Collation tries now loaded from marshal dumps, collation running time improved by ~80%.
* Collation tries now loaded from marshal dumps, collation running time improved by \~80%.

### 1.6.1

13 changes: 7 additions & 6 deletions Gemfile
@@ -6,16 +6,17 @@ group :development, :test do
gem 'rake'
gem 'pry-nav'
gem 'ruby-prof' unless RUBY_PLATFORM == 'java'
gem 'regexp_parser', '~> 0.1'
gem 'regexp_parser', '~> 0.5'
gem 'benchmark-ips'
gem 'rubyzip', '~> 1.0'
end

group :development do
gem 'nokogiri', "~> 1.5.9"
gem 'nokogiri', "~> 1.0"
gem 'parallel'

gem 'ruby-cldr', github: 'svenfuchs/ruby-cldr'
gem 'i18n', '~> 0.6.11'
gem 'i18n'
gem 'cldr-plurals', '~> 1.0'

gem 'rest-client', '~> 1.8'
@@ -24,11 +25,11 @@ end
group :test do
gem 'rspec', '~> 3.0'

gem 'term-ansicolor', '~> 1.3.0' # 1.4 breaks ruby 1.9 support
gem 'term-ansicolor', '~> 1.3'
gem 'coveralls', require: false
gem 'tins', '~> 1.6.0', require: false # 1.7 breaks ruby 1.9 support
gem 'tins', '~> 1.6', require: false

gem 'simplecov'
gem 'launchy'
gem 'addressable', '~> 2.4.0' # 2.5 breaks ruby 1.9 support
gem 'addressable', '~> 2.4'
end
69 changes: 28 additions & 41 deletions README.md
@@ -18,7 +18,7 @@ require 'twitter_cldr'
Get a list of all currently supported locales (these are all supported on twitter.com):

```ruby
TwitterCldr.supported_locales # [:af, :ar, :be, :bg, :bn, :bo, ... ]
TwitterCldr.supported_locales # [:af, :ar, :az, :be, :bg, :bn, ... ]
```

Determine if a locale is supported by TwitterCLDR:
@@ -76,14 +76,14 @@ TwitterCldr::Shared::Currencies.for_code("CAD") # {:currency=>:CAD, :

#### Short / Long Decimals

In addition to formatting regular decimals, TwitterCLDR supports short and long decimals. Short decimals abbreviate the notation for the appropriate power of ten, for example "1M" for 1,000,000 or "2K" for 2,000. Long decimals include the full notation, for example "1 million" or "2 thousand". Long and short decimals can be generated using the appropriate `to_` method:
In addition to formatting regular decimals, TwitterCLDR supports short and long decimals. Short decimals abbreviate the notation for the appropriate power of ten, for example "1M" for 1,000,000 or "2K" for 2,000. Long decimals include the full notation, for example "1 million" or "2 thousand". Long and short decimals can be generated using the appropriate `format` option:

```ruby
2337.localize.to_short_decimal.to_s # "2K"
1337123.localize.to_short_decimal.to_s # "1M"
2337.localize.to_decimal.to_s(format: :short) # "2K"
1337123.localize.to_decimal.to_s(format: :short) # "1M"

2337.localize.to_long_decimal.to_s # "2 thousand"
1337123.localize.to_long_decimal.to_s # "1 million"
2337.localize.to_decimal.to_s(format: :long) # "2 thousand"
1337123.localize.to_decimal.to_s(format: :long) # "1 million"
```

### Units
@@ -207,7 +207,7 @@ dt.to_short_s # ...etc
Besides the default date formats, CLDR supports a number of additional ones. The list of available formats varies for each locale. To get a full list, use the `additional_formats` method:

```ruby
# ["E", "EEEEd", "EHm", "EHms", "Ed", "Ehm", "Ehms", "Gy", "GyMMM", "GyMMMEEEEd", "GyMMMEd", "GyMMMd", ... ]
# ["Bh", "Bhm", "Bhms", "E", "EBhm", "EBhms", "EEEEd", "EHm", "EHms", "Ed", "Ehm", "Ehms", ... ]
DateTime.now.localize(:ja).additional_formats
```

@@ -224,7 +224,12 @@ It's important to know that, even though any given format may not be available a

| Format | Output |
|:-----------|------------------------|
| Bh | 12 PM |
| Bhm | 12:20 PM |
| Bhms | 12:20:05 PM |
| E | Fri |
| EBhm | Fri 12:20 PM |
| EBhms | Fri 12:20:05 PM |
| EHm | Fri 12:20 |
| EHms | Fri 12:20:05 |
| Ed | 14 Fri |
@@ -237,21 +242,22 @@ It's important to know that, even though any given format may not be available a
| H | 12 |
| Hm | 12:20 |
| Hms | 12:20:05 |
| Hmsv | 12:20:05 v |
| Hmv | 12:20 v |
| Hmsv | 12:20:05 UTC |
| Hmv | 12:20 UTC |
| M | 2 |
| MEd | Fri, 2/14 |
| MMM | Feb |
| MMMEd | Fri, Feb 14 |
| MMMMW | week 3 of February |
| MMMMd | February 14 |
| MMMd | Feb 14 |
| Md | 2/14 |
| d | 14 |
| h | 12 PM |
| hm | 12:20 PM |
| hms | 12:20:05 PM |
| hmsv | 12:20:05 PM v |
| hmv | 12:20 PM v |
| hmsv | 12:20:05 PM UTC |
| hmv | 12:20 PM UTC |
| ms | 20:05 |
| y | 2014 |
| yM | 2/2014 |
@@ -263,6 +269,7 @@ It's important to know that, even though any given format may not be available a
| yMd | 2/14/2014 |
| yQQQ | Q1 2014 |
| yQQQQ | 1st quarter 2014 |
| yw | week 7 of 2014 |



@@ -372,7 +379,7 @@ TwitterCldr::Formatters::Plurals::Rules.all # [:one, :other]

# get all rules for a specific locale
TwitterCldr::Formatters::Plurals::Rules.all_for(:es) # [:one, :other]
TwitterCldr::Formatters::Plurals::Rules.all_for(:ru) # [:few, :many, :one, :other]
TwitterCldr::Formatters::Plurals::Rules.all_for(:ru) # [:one, :few, :many, :other]

# get the rule for a number in a specific locale
TwitterCldr::Formatters::Plurals::Rules.rule_for(1, :ru) # :one
@@ -476,7 +483,7 @@ You can use the localize convenience method on territory code symbols to get the

```ruby
:gb.localize(:pt).as_territory # "Reino Unido"
:cz.localize(:pt).as_territory # "República Tcheca"
:cz.localize(:pt).as_territory # "Tchéquia"
```

Behind the scenes, these convenience methods are creating instances of `LocalizedSymbol`. You can do the same thing if you're feeling adventurous:
@@ -490,20 +497,20 @@ In addition to translating territory codes, TwitterCLDR provides access to the f

```ruby
# get all territories for the default locale
TwitterCldr::Shared::Territories.all # { ... :tl => "East Timor", :tm => "Turkmenistan" ... }
TwitterCldr::Shared::Territories.all # { ... :tl => "Timor-Leste", :tm => "Turkmenistan" ... }

# get all territories for a specific locale
TwitterCldr::Shared::Territories.all_for(:pt) # { ... :tl => "República Democrática de Timor-Leste", :tm => "Turcomenistão" ... }
TwitterCldr::Shared::Territories.all_for(:pt) # { ... :tl => "Timor-Leste", :tm => "Turcomenistão" ... }

# get a territory by its code for the default locale
TwitterCldr::Shared::Territories.from_territory_code(:'gb') # "UK"
TwitterCldr::Shared::Territories.from_territory_code(:'gb') # "United Kingdom"

# get a territory from its code for a specific locale
TwitterCldr::Shared::Territories.from_territory_code_for_locale(:gb, :pt) # "Reino Unido"

# translate a territory from one locale to another
# signature: translate_territory(territory_name, source_locale, destination_locale)
TwitterCldr::Shared::Territories.translate_territory("Reino Unido", :pt, :en) # "UK"
TwitterCldr::Shared::Territories.translate_territory("Reino Unido", :pt, :en) # "United Kingdom"
TwitterCldr::Shared::Territories.translate_territory("U.K.", :en, :pt) # "Reino Unido"
```

@@ -554,32 +561,12 @@ postal_code.regexp # /(\d{5})(?:[ \-](\d{4}))?/
Get a sample of valid postal codes with the `#sample` method:

```ruby
postal_code.sample(5) # ["10781", "69079-7159", "79836-3996", "79771", "61093"]
postal_code.sample(5) # ["66877", "52179", "39565", "39335", "83881"]
```

### Phone Codes

Look up phone codes by territory:

```ruby
# United States
TwitterCldr::Shared::PhoneCodes.code_for_territory(:us) # "1"

# Perú
TwitterCldr::Shared::PhoneCodes.code_for_territory(:pe) # "51"

# Egypt
TwitterCldr::Shared::PhoneCodes.code_for_territory(:eg) # "20"

# Denmark
TwitterCldr::Shared::PhoneCodes.code_for_territory(:dk) # "45"
```

Get a list of supported territories by using the `#territories` method:

```ruby
TwitterCldr::Shared::PhoneCodes.territories # [:ac, :ad, :ae, :af, :ag, ... ]
```
Telephone codes were deprecated and have now been removed from the CLDR data set. They have been removed from TwitterCLDR as of v5.0.0.

### Language Codes

@@ -1055,7 +1042,7 @@ No external requirements.

`bundle exec rake` will run our basic test suite suitable for development. To run the full test suite, use `bundle exec rake spec:full`. The full test suite takes considerably longer to run because it runs against the complete normalization and collation test files from the Unicode Consortium. The basic test suite only runs normalization and collation tests against a small subset of the complete test file.

Tests are written in RSpec using RR as the mocking framework.
Tests are written in RSpec.

## Test Coverage

@@ -1078,6 +1065,6 @@ TwitterCLDR currently supports localization of certain textual objects in JavaSc

## License

Copyright 2017 Twitter, Inc.
Copyright 2019 Twitter, Inc.

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
48 changes: 14 additions & 34 deletions README.md.erb
@@ -18,7 +18,7 @@ require 'twitter_cldr'
Get a list of all currently supported locales (these are all supported on twitter.com):

```ruby
TwitterCldr.supported_locales # <%= ellipsize(assert(TwitterCldr.supported_locales.sort[0..5], [:af, :ar, :be, :bg, :bn, :bo])) %>
TwitterCldr.supported_locales # <%= ellipsize(assert(TwitterCldr.supported_locales.sort[0..5], [:af, :ar, :az, :be, :bg, :bn, :bo])) %>
```

Determine if a locale is supported by TwitterCLDR:
@@ -76,14 +76,14 @@ TwitterCldr::Shared::Currencies.for_code("CAD") # <%= assert(TwitterC

#### Short / Long Decimals

In addition to formatting regular decimals, TwitterCLDR supports short and long decimals. Short decimals abbreviate the notation for the appropriate power of ten, for example "1M" for 1,000,000 or "2K" for 2,000. Long decimals include the full notation, for example "1 million" or "2 thousand". Long and short decimals can be generated using the appropriate `to_` method:
In addition to formatting regular decimals, TwitterCLDR supports short and long decimals. Short decimals abbreviate the notation for the appropriate power of ten, for example "1M" for 1,000,000 or "2K" for 2,000. Long decimals include the full notation, for example "1 million" or "2 thousand". Long and short decimals can be generated using the appropriate `format` option:

```ruby
2337.localize.to_short_decimal.to_s # <%= assert(2337.localize.to_short_decimal.to_s, "2K").inspect %>
1337123.localize.to_short_decimal.to_s # <%= assert(1337123.localize.to_short_decimal.to_s, "1M").inspect %>
2337.localize.to_decimal.to_s(format: :short) # <%= assert(2337.localize.to_decimal.to_s(format: :short), "2K").inspect %>
1337123.localize.to_decimal.to_s(format: :short) # <%= assert(1337123.localize.to_decimal.to_s(format: :short), "1M").inspect %>

2337.localize.to_long_decimal.to_s # <%= assert(2337.localize.to_long_decimal.to_s, "2 thousand").inspect %>
1337123.localize.to_long_decimal.to_s # <%= assert(1337123.localize.to_long_decimal.to_s, "1 million").inspect %>
2337.localize.to_decimal.to_s(format: :long) # <%= assert(2337.localize.to_decimal.to_s(format: :long), "2 thousand").inspect %>
1337123.localize.to_decimal.to_s(format: :long) # <%= assert(1337123.localize.to_decimal.to_s(format: :long), "1 million").inspect %>
```

### Units
@@ -207,7 +207,7 @@ dt.to_short_s # ...etc
Besides the default date formats, CLDR supports a number of additional ones. The list of available formats varies for each locale. To get a full list, use the `additional_formats` method:

```ruby
# <%= ellipsize(assert(datetime.localize(:ja).additional_formats.sort[0..11], ["E", "EEEEd", "EHm", "EHms", "Ed", "Ehm", "Ehms", "Gy", "GyMMM", "GyMMMEEEEd", "GyMMMEd", "GyMMMd"])) %>
# <%= ellipsize(assert(datetime.localize(:ja).additional_formats.sort[0..11], ["Bh", "Bhm", "Bhms", "E", "EBhm", "EBhms", "EEEEd", "EHm", "EHms", "Ed", "Ehm", "Ehms"])) %>
DateTime.now.localize(:ja).additional_formats
```

@@ -437,7 +437,7 @@ You can use the localize convenience method on territory code symbols to get the

```ruby
:gb.localize(:pt).as_territory # <%= assert(:gb.localize(:pt).as_territory, "Reino Unido").inspect %>
:cz.localize(:pt).as_territory # <%= assert(:cz.localize(:pt).as_territory, "República Tcheca").inspect %>
:cz.localize(:pt).as_territory # <%= assert(:cz.localize(:pt).as_territory, "Tchéquia").inspect %>
```

Behind the scenes, these convenience methods are creating instances of `LocalizedSymbol`. You can do the same thing if you're feeling adventurous:
@@ -451,21 +451,21 @@ In addition to translating territory codes, TwitterCLDR provides access to the f

```ruby
# get all territories for the default locale
TwitterCldr::Shared::Territories.all # <%= ellipsize(assert(slice_hash(TwitterCldr::Shared::Territories.all, [:tl, :tm]), { :tl=>"East Timor", :tm=>"Turkmenistan" })) %>
TwitterCldr::Shared::Territories.all # <%= ellipsize(assert(slice_hash(TwitterCldr::Shared::Territories.all, [:tl, :tm]), { :tl=>"Timor-Leste", :tm=>"Turkmenistan" })) %>

# get all territories for a specific locale
TwitterCldr::Shared::Territories.all_for(:pt) # <%= ellipsize(assert(slice_hash(TwitterCldr::Shared::Territories.all_for(:pt), [:tl, :tm]), { :tl=>"República Democrática de Timor-Leste", :tm=>"Turcomenistão" })) %>
TwitterCldr::Shared::Territories.all_for(:pt) # <%= ellipsize(assert(slice_hash(TwitterCldr::Shared::Territories.all_for(:pt), [:tl, :tm]), { :tl=>"Timor-Leste", :tm=>"Turcomenistão" })) %>

# get a territory by its code for the default locale
TwitterCldr::Shared::Territories.from_territory_code(:'gb') # <%= assert(TwitterCldr::Shared::Territories.from_territory_code(:'gb'), "UK").inspect %>
TwitterCldr::Shared::Territories.from_territory_code(:'gb') # <%= assert(TwitterCldr::Shared::Territories.from_territory_code(:'gb'), "United Kingdom").inspect %>

# get a territory from its code for a specific locale
TwitterCldr::Shared::Territories.from_territory_code_for_locale(:gb, :pt) # <%= assert(TwitterCldr::Shared::Territories.from_territory_code_for_locale(:gb, :pt), "Reino Unido").inspect %>

# translate a territory from one locale to another
# signature: translate_territory(territory_name, source_locale, destination_locale)
TwitterCldr::Shared::Territories.translate_territory("Reino Unido", :pt, :en) # <%= assert(TwitterCldr::Shared::Territories.translate_territory("Reino Unido", :pt, :en), "UK").inspect %>
TwitterCldr::Shared::Territories.translate_territory("U.K.", :en, :pt) # <%= assert(TwitterCldr::Shared::Territories.translate_territory("UK", :en, :pt), "Reino Unido").inspect %>
TwitterCldr::Shared::Territories.translate_territory("Reino Unido", :pt, :en) # <%= assert(TwitterCldr::Shared::Territories.translate_territory("Reino Unido", :pt, :en), "United Kingdom").inspect %>
TwitterCldr::Shared::Territories.translate_territory("U.K.", :en, :pt) # <%= assert(TwitterCldr::Shared::Territories.translate_territory("United Kingdom", :en, :pt), "Reino Unido").inspect %>
```

### Postal Codes
@@ -520,27 +520,7 @@ postal_code.sample(5) # <%= postal_code.sample(5).to_s %>

### Phone Codes

Look up phone codes by territory:

```ruby
# United States
TwitterCldr::Shared::PhoneCodes.code_for_territory(:us) # <%= assert(TwitterCldr::Shared::PhoneCodes.code_for_territory(:us), "1").inspect %>

# Perú
TwitterCldr::Shared::PhoneCodes.code_for_territory(:pe) # <%= assert(TwitterCldr::Shared::PhoneCodes.code_for_territory(:pe), "51").inspect %>

# Egypt
TwitterCldr::Shared::PhoneCodes.code_for_territory(:eg) # <%= assert(TwitterCldr::Shared::PhoneCodes.code_for_territory(:eg), "20").inspect %>

# Denmark
TwitterCldr::Shared::PhoneCodes.code_for_territory(:dk) # <%= assert(TwitterCldr::Shared::PhoneCodes.code_for_territory(:dk), "45").inspect %>
```

Get a list of supported territories by using the `#territories` method:

```ruby
TwitterCldr::Shared::PhoneCodes.territories # <%= ellipsize(assert(TwitterCldr::Shared::PhoneCodes.territories.sort[0..4], [:ac, :ad, :ae, :af, :ag])) %>
```
Telephone codes were deprecated and have now been removed from the CLDR data set. They have been removed from TwitterCLDR as of v5.0.0.

### Language Codes

