Adds basic collation support
Collation (locale-sensitive sorting) is one of a few
high-value features that lean on *a lot* of the ICU
functionality and CLDR-supplied data in a way that
cannot easily be replicated by other solutions.

This commit adds a basic collation API, which covers only
the main use cases of collation.

Fixes: 85
filmil committed May 12, 2020
1 parent c7d42e7 commit 445653d
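
To illustrate why the commit message calls collation hard to replicate, here is a minimal, self-contained sketch that contrasts plain code-point ordering with a toy sort key. The functions `toy_sort_key`, `sort_by_code_point`, and `sort_by_toy_key` are hypothetical names made up for this illustration. They do not use ICU or CLDR data and are not part of `rust_icu_ucol`; real collation handles far more than the two cases shown.

```rust
/// Toy sort key (an assumption for illustration, not ICU): lowercases the
/// input and maps a single accented letter to its base letter.
fn toy_sort_key(s: &str) -> String {
    s.chars()
        .flat_map(|c| c.to_lowercase())
        .map(|c| match c {
            '\u{e9}' => 'e', // U+00E9, LATIN SMALL LETTER E WITH ACUTE
            other => other,
        })
        .collect()
}

/// Sorts by raw string comparison, which is what `sort()` does on `String`.
fn sort_by_code_point(words: &[&str]) -> Vec<String> {
    let mut v: Vec<String> = words.iter().map(|s| s.to_string()).collect();
    v.sort();
    v
}

/// Sorts by the toy key above, which ignores case and the one accent.
fn sort_by_toy_key(words: &[&str]) -> Vec<String> {
    let mut v: Vec<String> = words.iter().map(|s| s.to_string()).collect();
    v.sort_by_key(|s| toy_sort_key(s));
    v
}

fn main() {
    let words = ["zebra", "Apple", "\u{e9}clair", "banana"];
    // Code-point order places the accented word after "zebra".
    println!("{:?}", sort_by_code_point(&words));
    // The toy key places it between "banana" and "zebra".
    println!("{:?}", sort_by_toy_key(&words));
}
```

The toy key only covers one accent and simple case folding. Locale-specific rules, such as digraphs that sort as single letters, are the part that needs ICU and CLDR data.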
Showing 14 changed files with 425 additions and 16 deletions.
5 changes: 3 additions & 2 deletions Cargo.toml
@@ -2,15 +2,16 @@
members = [
"rust_icu",
"rust_icu_common",
"rust_icu_intl",
"rust_icu_sys",
"rust_icu_ucal",
"rust_icu_ucol",
"rust_icu_udat",
"rust_icu_udata",
"rust_icu_uenum",
"rust_icu_uloc",
"rust_icu_umsg",
"rust_icu_ustring",
"rust_icu_utext",
"rust_icu_umsg",
"rust_icu_intl",
]

2 changes: 2 additions & 0 deletions Makefile
@@ -96,6 +96,7 @@ publish:
$(call publish,rust_icu_ucal)
$(call publish,rust_icu_udat)
$(call publish,rust_icu_udata)
$(call publish,rust_icu_ucol)
$(call publish,rust_icu_umsg)
$(call publish,rust_icu)

@@ -124,6 +125,7 @@ uprev:
$(call uprev,rust_icu_udata)
$(call uprev,rust_icu_umsg)
$(call uprev,rust_icu_intl)
$(call uprev,rust_icu_ucol)
$(call uprev,rust_icu)

cov:
17 changes: 9 additions & 8 deletions README.md
@@ -49,14 +49,15 @@ coverage in the headers.
| [rust_icu_common](https://crates.io/crates/rust_icu_common)| Commonly used low-level wrappings of the bindings. |
| [rust_icu_intl](https://crates.io/crates/rust_icu_intl)| Implements ECMA 402 recommendation APIs. |
| [rust_icu_sys](https://crates.io/crates/rust_icu_sys)| Low-level bindings code |
| [rust_icu_ucal](https://crates.io/crates/rust_icu_ucal)| Implements [`ucal.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ucal_8h.html) C API header from the ICU library. |
| [rust_icu_udat](https://crates.io/crates/rust_icu_udat)| Implements [`udat.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/udat_8h.html) C API header from the ICU library. |
| [rust_icu_udata](https://crates.io/crates/rust_icu_udata)| Implements [`udata.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/udata_8h.html) C API header from the ICU library. |
| [rust_icu_uenum](https://crates.io/crates/rust_icu_uenum)| Implements [`uenum.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/uenum_8h.html) C API header from the ICU library. Mainly `UEnumeration` and friends. |
| [rust_icu_uloc](https://crates.io/crates/rust_icu_uloc)| Implements [`uloc.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/uloc_8h.html) C API header from the ICU library. |
| [rust_icu_ustring](https://crates.io/crates/rust_icu_ustring)| Implements [`ustring.h`]() C API header from the ICU library. |
| [rust_icu_utext](https://crates.io/crates/rust_icu_utext)| Implements [`utext.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/utext_8h.html) C API header from the ICU library. |
| [rust_icu_umsg](https://crates.io/crates/rust_icu_umsg)| Implements [`umsg.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/umsg_8h.html) C API header from the ICU library. |
| [rust_icu_ucal](https://crates.io/crates/rust_icu_ucal)| ICU Calendar. Implements [`ucal.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ucal_8h.html) C API header from the ICU library. |
| [rust_icu_ucol](https://crates.io/crates/rust_icu_ucol)| Collation support. Implements [`ucol.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ucol_8h.html) C API header from the ICU library. |
| [rust_icu_udat](https://crates.io/crates/rust_icu_udat)| ICU date and time. Implements [`udat.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/udat_8h.html) C API header from the ICU library. |
| [rust_icu_udata](https://crates.io/crates/rust_icu_udata)| ICU binary data. Implements [`udata.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/udata_8h.html) C API header from the ICU library. |
| [rust_icu_uenum](https://crates.io/crates/rust_icu_uenum)| ICU enumerations. Implements [`uenum.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/uenum_8h.html) C API header from the ICU library. Mainly `UEnumeration` and friends. |
| [rust_icu_uloc](https://crates.io/crates/rust_icu_uloc)| Locale support. Implements [`uloc.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/uloc_8h.html) C API header from the ICU library. |
| [rust_icu_umsg](https://crates.io/crates/rust_icu_umsg)| MessageFormat support. Implements [`umsg.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/umsg_8h.html) C API header from the ICU library. |
| [rust_icu_ustring](https://crates.io/crates/rust_icu_ustring)| ICU strings. Implements [`ustring.h`]() C API header from the ICU library. |
| [rust_icu_utext](https://crates.io/crates/rust_icu_utext)| Text operations. Implements [`utext.h`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/utext_8h.html) C API header from the ICU library. |

# Limitations

1 change: 1 addition & 0 deletions build/showprogress.sh
@@ -9,6 +9,7 @@ cd $TOP_DIR

C_API_HEADER_NAMES=(
"ucal"
"ucol"
"udat"
"udata"
"uenum"
59 changes: 57 additions & 2 deletions coverage/report.md
@@ -2,7 +2,8 @@

| Header | Implemented |
| ------ | ----------- |
| `ucal.h` | 15 / 46 |
| `ucal.h` | 15 / 45 |
| `ucol.h` | 2 / 50 |
| `udat.h` | 6 / 38 |
| `udata.h` | 2 / 8 |
| `uenum.h` | 8 / 8 |
@@ -46,7 +47,6 @@
| `ucal_getDSTSavings` | |
| `ucal_getFieldDifference` | |
| `ucal_getGregorianChange` | |
| `ucal_getHostTimeZone` | |
| `ucal_getKeywordValuesForLocale` | |
| `ucal_getLimit` | |
| `ucal_getLocaleByType` | |
@@ -65,6 +65,61 @@
| `ucal_setGregorianChange` | |
| `ucal_setTimeZone` | |

# Header: `ucol.h`

| Unimplemented | Implemented |
| ------------- | ----------- |
| | `ucol_close` |
| | `ucol_strcoll` |
| `ucol_cloneBinary` | |
| `ucol_countAvailable` | |
| `ucol_equal` | |
| `ucol_getAttribute` | |
| `ucol_getAvailable` | |
| `ucol_getBound` | |
| `ucol_getContractions` | |
| `ucol_getContractionsAndExpansions` | |
| `ucol_getDisplayName` | |
| `ucol_getEquivalentReorderCodes` | |
| `ucol_getFunctionalEquivalent` | |
| `ucol_getKeywords` | |
| `ucol_getKeywordValues` | |
| `ucol_getKeywordValuesForLocale` | |
| `ucol_getLocale` | |
| `ucol_getLocaleByType` | |
| `ucol_getMaxVariable` | |
| `ucol_getReorderCodes` | |
| `ucol_getRules` | |
| `ucol_getRulesEx` | |
| `ucol_getShortDefinitionString` | |
| `ucol_getSortKey` | |
| `ucol_getStrength` | |
| `ucol_getTailoredSet` | |
| `ucol_getUCAVersion` | |
| `ucol_getUnsafeSet` | |
| `ucol_getVariableTop` | |
| `ucol_getVersion` | |
| `ucol_greater` | |
| `ucol_greaterOrEqual` | |
| `ucol_mergeSortkeys` | |
| `ucol_nextSortKeyPart` | |
| `ucol_normalizeShortDefinitionString` | |
| `ucol_open` | |
| `ucol_openAvailableLocales` | |
| `ucol_openBinary` | |
| `ucol_openFromShortString` | |
| `ucol_openRules` | |
| `ucol_prepareShortStringOpen` | |
| `ucol_restoreVariableTop` | |
| `ucol_safeClone` | |
| `ucol_setAttribute` | |
| `ucol_setMaxVariable` | |
| `ucol_setReorderCodes` | |
| `ucol_setStrength` | |
| `ucol_setVariableTop` | |
| `ucol_strcollIter` | |
| `ucol_strcollUTF8` | |

# Header: `udat.h`

| Unimplemented | Implemented |
1 change: 0 additions & 1 deletion coverage/ucal_all.txt
@@ -14,7 +14,6 @@ ucal_getDefaultTimeZone
ucal_getDSTSavings
ucal_getFieldDifference
ucal_getGregorianChange
ucal_getHostTimeZone
ucal_getKeywordValuesForLocale
ucal_getLimit
ucal_getLocaleByType
50 changes: 50 additions & 0 deletions coverage/ucol_all.txt
@@ -0,0 +1,50 @@
ucol_cloneBinary
ucol_close
ucol_countAvailable
ucol_equal
ucol_getAttribute
ucol_getAvailable
ucol_getBound
ucol_getContractions
ucol_getContractionsAndExpansions
ucol_getDisplayName
ucol_getEquivalentReorderCodes
ucol_getFunctionalEquivalent
ucol_getKeywords
ucol_getKeywordValues
ucol_getKeywordValuesForLocale
ucol_getLocale
ucol_getLocaleByType
ucol_getMaxVariable
ucol_getReorderCodes
ucol_getRules
ucol_getRulesEx
ucol_getShortDefinitionString
ucol_getSortKey
ucol_getStrength
ucol_getTailoredSet
ucol_getUCAVersion
ucol_getUnsafeSet
ucol_getVariableTop
ucol_getVersion
ucol_greater
ucol_greaterOrEqual
ucol_mergeSortkeys
ucol_nextSortKeyPart
ucol_normalizeShortDefinitionString
ucol_open
ucol_openAvailableLocales
ucol_openBinary
ucol_openFromShortString
ucol_openRules
ucol_prepareShortStringOpen
ucol_restoreVariableTop
ucol_safeClone
ucol_setAttribute
ucol_setMaxVariable
ucol_setReorderCodes
ucol_setStrength
ucol_setVariableTop
ucol_strcoll
ucol_strcollIter
ucol_strcollUTF8
2 changes: 2 additions & 0 deletions coverage/ucol_implemented.txt
@@ -0,0 +1,2 @@
ucol_close
ucol_strcoll
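
The `2 / 50` figure in `coverage/report.md` is consistent with counting the entries of `coverage/ucol_implemented.txt` against those of `coverage/ucol_all.txt`. The sketch below shows that computation on a shortened sample. The function `coverage` is a hypothetical helper written for this illustration; the repository's own report is produced by `build/showprogress.sh`, whose details are not shown in this commit view.

```rust
use std::collections::BTreeSet;

/// Returns (implemented count, total count, unimplemented names), given the
/// contents of an "all" list and an "implemented" list, one name per line.
/// Hypothetical helper; not taken from the repository.
fn coverage<'a>(all: &'a str, implemented: &str) -> (usize, usize, Vec<&'a str>) {
    // Names that already have a Rust wrapper.
    let done: BTreeSet<&str> = implemented
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty())
        .collect();
    // Every function name declared by the header.
    let all_names: Vec<&str> = all
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty())
        .collect();
    // Names from the header that are not in the implemented set.
    let missing: Vec<&str> = all_names
        .iter()
        .copied()
        .filter(|n| !done.contains(*n))
        .collect();
    (all_names.len() - missing.len(), all_names.len(), missing)
}

fn main() {
    // Shortened sample of the two lists; the real files hold 50 and 2 names.
    let all = "ucol_close\nucol_open\nucol_strcoll\n";
    let implemented = "ucol_close\nucol_strcoll\n";
    let (done, total, missing) = coverage(all, implemented);
    println!("| `ucol.h` | {} / {} |", done, total);
    for name in missing {
        println!("| `{}` | |", name);
    }
}
```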
5 changes: 5 additions & 0 deletions rust_icu/Cargo.toml
@@ -25,6 +25,7 @@ rust_icu_udat = { path = "../rust_icu_udat", version = "0.1.4", default-features
rust_icu_udata = { path = "../rust_icu_udata", version = "0.1.4", default-features = false }
rust_icu_uenum = { path = "../rust_icu_uenum", version = "0.1.4", default-features = false }
rust_icu_uloc = { path = "../rust_icu_uloc", version = "0.1.4", default-features = false }
rust_icu_ucol = { path = "../rust_icu_ucol", version = "0.1.4", default-features = false }
rust_icu_umsg = { path = "../rust_icu_umsg", version = "0.1.4", default-features = false }
rust_icu_ustring = { path = "../rust_icu_ustring", version = "0.1.4", default-features = false }
rust_icu_utext = { path = "../rust_icu_utext", version = "0.1.4", default-features = false }
@@ -38,6 +39,7 @@ use-bindgen = [
"rust_icu_common/use-bindgen",
"rust_icu_sys/use-bindgen",
"rust_icu_ucal/use-bindgen",
"rust_icu_ucol/use-bindgen",
"rust_icu_udat/use-bindgen",
"rust_icu_udata/use-bindgen",
"rust_icu_uenum/use-bindgen",
@@ -50,6 +52,7 @@ renaming = [
"rust_icu_common/renaming",
"rust_icu_sys/renaming",
"rust_icu_ucal/renaming",
"rust_icu_ucol/renaming",
"rust_icu_udat/renaming",
"rust_icu_udata/renaming",
"rust_icu_uenum/renaming",
@@ -62,6 +65,7 @@ icu_config = [
"rust_icu_common/icu_config",
"rust_icu_sys/icu_config",
"rust_icu_ucal/icu_config",
"rust_icu_ucol/icu_config",
"rust_icu_udat/icu_config",
"rust_icu_udata/icu_config",
"rust_icu_uenum/icu_config",
@@ -74,6 +78,7 @@ icu_version_in_env = [
"rust_icu_common/icu_version_in_env",
"rust_icu_sys/icu_version_in_env",
"rust_icu_ucal/icu_version_in_env",
"rust_icu_ucol/icu_version_in_env",
"rust_icu_udat/icu_version_in_env",
"rust_icu_udata/icu_version_in_env",
"rust_icu_uenum/icu_version_in_env",
8 changes: 6 additions & 2 deletions rust_icu/src/lib.rs
@@ -27,22 +27,26 @@
//!
//! | Original | Remapped |
//! | -------- | -------- |
//! | rust_icu_sys | icu::sys |
//! | rust_icu_common | icu::common |
//! | rust_icu_sys | icu::sys |
//! | rust_icu_ucal | icu::cal |
//! | rust_icu_ucol | icu::col |
//! | rust_icu_udat | icu::dat |
//! | rust_icu_udata | icu::data |
//! | rust_icu_uenum | icu::enums |
//! | rust_icu_uloc | icu::loc |
//! | rust_icu_umsg | icu::msg |
//! | rust_icu_ustring | icu::string |
//! | rust_icu_utext | icu::text |

pub use rust_icu_sys as sys;
pub use rust_icu_common as common;
pub use rust_icu_sys as sys;
pub use rust_icu_ucal as cal;
pub use rust_icu_ucol as col;
pub use rust_icu_udat as dat;
pub use rust_icu_udata as data;
pub use rust_icu_uenum as enums;
pub use rust_icu_uloc as loc;
pub use rust_icu_umsg as msg;
pub use rust_icu_ustring as string;
pub use rust_icu_utext as text;
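
The remapping table in the doc comment relies on `pub use ... as ...` re-exports. A minimal sketch of that mechanism follows, using an inline module as a stand-in for the external crate. The module body and the `describe` function are assumptions for illustration and say nothing about what `rust_icu_ucol` actually exports.

```rust
// Stand-in for an external crate such as `rust_icu_ucol`.
// Its contents here are hypothetical.
pub mod rust_icu_ucol {
    pub fn describe() -> &'static str {
        "collation"
    }
}

// Stand-in for the umbrella crate: it re-exports the module under a
// shorter name, the same way `pub use rust_icu_ucol as col;` does.
pub mod icu {
    pub use super::rust_icu_ucol as col;
}

fn main() {
    // Callers reach the same item through the remapped path.
    println!("{}", icu::col::describe());
}
```

With this pattern a downstream user depends on one crate and writes `icu::col::...` instead of naming each `rust_icu_*` crate separately.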
6 changes: 5 additions & 1 deletion rust_icu_sys/build.rs
@@ -36,7 +36,7 @@ lazy_static! {
// headers. Any of these will fail if the required binaries are not present in $PATH.
static ref BINDGEN_SOURCE_MODULES: Vec<&'static str> = vec![
"ucal", "udat", "udata", "uenum", "ustring", "utext", "uclean", "umsg",
"ucol",
"ucol", "uset",
];

// C functions that will be made available to rust code. Add more to this list if you want to
@@ -50,6 +50,7 @@ lazy_static! {
"uloc_.*",
"utext_.*",
"umsg_.*",
"ucol_.*",
];

// C types that will be made available to rust code. Add more to this list if you want to
@@ -67,6 +68,9 @@ lazy_static! {
"UMessageFormat",
"UParseError",
"UText",
"UCollator",
"USet",
"UCol.*",
];
}

69 changes: 69 additions & 0 deletions rust_icu_ucol/Cargo.toml
@@ -0,0 +1,69 @@
[package]
authors = ["Google Inc."]
edition = "2018"
license = "Apache-2.0"
name = "rust_icu_ucol"
readme = "README.md"
repository = "https://github.com/google/rust_icu"
version = "0.1.4"
default-features = false
keywords = ["icu", "unicode", "i18n", "l10n"]

description = """
Native bindings to the ICU4C library from Unicode.
- ucol.h: Collation support
"""

[dependencies]
log = "0.4.6"
paste = "0.1.5"
rust_icu_common = { path = "../rust_icu_common", version = "0.1.4", default-features = false }
rust_icu_sys = { path = "../rust_icu_sys", version = "0.1.4", default-features = false }
rust_icu_uenum = { path = "../rust_icu_uenum", version = "0.1.4", default-features = false }
rust_icu_ustring = { path = "../rust_icu_ustring", version = "0.1.4", default-features = false }
anyhow = "1.0.25"

[dev-dependencies]
anyhow = "1.0.25"

# See the feature description in ../rust_icu_sys/Cargo.toml for details.
[features]
default = ["use-bindgen", "renaming", "icu_config"]

use-bindgen = [
"rust_icu_common/use-bindgen",
"rust_icu_sys/use-bindgen",
"rust_icu_uenum/use-bindgen",
"rust_icu_ustring/use-bindgen",
]
renaming = [
"rust_icu_common/renaming",
"rust_icu_sys/renaming",
"rust_icu_uenum/renaming",
"rust_icu_ustring/renaming",
]
icu_config = [
"rust_icu_common/icu_config",
"rust_icu_sys/icu_config",
"rust_icu_uenum/icu_config",
"rust_icu_ustring/icu_config",
]
icu_version_in_env = [
"rust_icu_common/icu_version_in_env",
"rust_icu_sys/icu_version_in_env",
"rust_icu_uenum/icu_version_in_env",
"rust_icu_ustring/icu_version_in_env",
]
icu_version_64_plus = []
icu_version_67_plus = []

[build-dependencies]
anyhow = "1.0"
bindgen = "0.53.2"

[badges]
maintenance = { status = "actively-developed" }
is-it-maintained-issue-resolution = { repository = "google/rust_icu" }
is-it-maintained-open-issues = { repository = "google/rust_icu" }
travis-ci = { repository = "google/rust_icu", branch = "master" }
1 change: 1 addition & 0 deletions rust_icu_ucol/README.md
