Join GitHub today
GitHub is home to over 20 million developers working together to host and review code, manage projects, and build software together.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
README for pg_collkey version 0.3 Please read the LICENSE file, which is shipping with this software. *** DESCRIPTION *** This is a wrapper to use the collation functions of the ICU library with a PostgreSQL database server. Using this wrapper, you can specify the desired locale for sorting UTF-8 strings directly in the SQL query, rather than setting it during database installation. Default Unicode collation (DUCET) is supported. You can select whether punctuation should be a primary collation attribute or not. The level of comparison can be limited (in order to ignore accents for example). Numeric sequences of strings can be recognized, so that 'test2' is placed before 'test10'. *** REQUIREMENTS *** - PostgreSQL version 8.1.4, http://www.postgresql.org/ (might run with older versions too, but this is not tested) - ICU library version 3.4, http://icu.sourceforge.net/ (might run with older versions too, but this is not tested) *** INSTALLATION *** 1. Ensure icu-config and pg_config binaries are available to resolve compilation flags for ICU and PostgreSQL. 2. Call 'make'. 3. Call 'make install' to install into your PostgreSQL library directory. You will probably need to prefix this with "sudo" if you are a non-root user. Now you can use the 'collkey_icu.sql' file to enable the 'collkey' function for a database. *** CAVEATS *** The function will only work if the database encoding is set to UTF-8. This is a limitation which can be solved in future releases. Using different locales within one query can slow down perfomance. *** USAGE *** collkey (text, text, bool, int4, bool) RETURNS bytea The first parameter is the UTF-8 string for which you want a collation key. The second parameter is defining the locale (i.e. 'root' or 'de_DE', see http://icu.sourceforge.net/ for more information). Use 'root', if you want a default unicode collation (DUCET). If the third parameter is true, certain characters (puncuation, etc.) are processed at the 4th level (after processing of the case), instead of being processed at the 1st level. The fourth parameter is selecting the level of collation: 0: default level (usually 3rd level) 1: 1st level, only base characters are compared, accents, upper/lowercases (and punctuation, if the third parameter is true) are ignored 2: 2nd level, accents and modifications of characters are taken into account 3: 3rd level, upper/lowercase is evaluated 4: 4th level, other differences are evaluated 5: 5th level, strings are only equal if they contain the same codepoints (after normalization) The fifth parameter enables numeric sorting, if it is set to true. This means collkey('test 2') < collkey('test 10'). There are shortcuts with fewer parameters, see 'collkey_icu.sql'. Example: SELECT name FROM tbl ORDER BY collkey(name, 'fr_FR'); -- french ordering *** TODO *** - enable support for non UTF-8 databases