Provides a java class 'LanguageCode' with instances for all available codes in ISO-639-3

mihxil/i18n-iso-639

JAVA ISO-639 support

Codes for languages (and language groups) of the world are covered by the ISO-639 standard.

These standards provide letter codes for each language. E.g. ISO-639-3 provides a three-letter code for all living languages.

There are too many such codes to be contained in a Java enum (e.g. https://github.com/TakahikoKawasaki/nv-i18n/blob/master/src/main/java/com/neovisionaries/i18n/LanguageAlpha3Code.java is not complete).

This package contains the tab-separated files provided by https://iso639-3.sil.org/, and Java classes that read them and provide all language codes as Java objects, with getters.
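To give an idea of what reading such a file involves, here is a minimal sketch that parses a single tab-separated row. It is not the library's own reader: the class `Iso639Row` is hypothetical, and the column order (Id, Part2b, Part2t, Part1, Scope, Language_Type, Ref_Name, Comment) is an assumption about the layout of the code table published by sil.org.

```java
import java.util.Optional;

/** Hypothetical holder for one row of the code table; not a class of this library. */
final class Iso639Row {
    final String id;               // three-letter ISO-639-3 code
    final Optional<String> part1;  // two-letter ISO-639-1 code, if the language has one
    final String refName;          // English reference name

    private Iso639Row(String id, Optional<String> part1, String refName) {
        this.id = id;
        this.part1 = part1;
        this.refName = refName;
    }

    /**
     * Parses one tab-separated line.
     * Assumed columns: Id, Part2b, Part2t, Part1, Scope, Language_Type, Ref_Name, Comment.
     */
    static Iso639Row parse(String line) {
        // A negative limit keeps trailing empty columns (e.g. an empty Comment).
        String[] cols = line.split("\t", -1);
        if (cols.length < 7) {
            throw new IllegalArgumentException("Expected at least 7 columns: " + line);
        }
        Optional<String> part1 = cols[3].isEmpty() ? Optional.empty() : Optional.of(cols[3]);
        return new Iso639Row(cols[0], part1, cols[6]);
    }

    public static void main(String[] args) {
        Iso639Row row = parse("nld\tdut\tnld\tnl\tI\tL\tDutch\t");
        System.out.println(row.id + " " + row.part1.orElse("-") + " " + row.refName);
    }
}
```

Languages without a two-letter code simply have an empty Part1 column, which the sketch maps to an empty `Optional`.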

Usage

import java.util.Locale;
import java.util.Optional;
import org.meeuw.i18n.languages.*;

// get a language by its code
Optional<LanguageCode> optional = ISO_639.getByPart3("nld");
LanguageCode languageCode = LanguageCode.languageCode("nl");

// show its 'inverted' name
System.out.println(languageCode.nameRecord(Locale.US).inverted());

// get a language family
Optional<LanguageFamilyCode> family = ISO_639.getByPart5("ger");

// get by any code
Optional<ISO_639_Code> byCode = ISO_639.get("nl");

// stream by names; a language may have several names (Dutch, Flemish) and then appears multiple times
ISO_639_Code.streamByNames().forEach(e -> {
    System.out.println(e.getKey() + " " + e.getValue());
});

See also the test cases in src/test/java/org/meeuw/i18n/languages/test/LanguageCodeTest.java

Retired codes

LanguageCode#getByCode will also support retired codes if possible. This means that the code of the returned object may be different:

// the 'krim' dialect (Sierra Leone) was officially merged into 'bmf' (Bom-Kim) in 2017
assertThat(LanguageCode.getByCode("krm").get().getCode()).isEqualTo("bmf");

Fallbacks

Sometimes we have to deal with systems that use their own variants of the standards. In these cases it is possible to register 'fallbacks'.

E.g.

 // Our partner uses the pseudo ISO-639-1 code 'XX' for 'no language';
 // fall back to a proper Part 3 code.
 try {
   LanguageCode.registerFallback("XX", LanguageCode.languageCode("zxx"));
   assertThat(ISO_639.iso639("XX").code()).isEqualTo("zxx");
 } finally {
   LanguageCode.resetFallBacks();
 }

Support

JAXB

The language code is annotated with a JAXB annotation. It will serialize and deserialize to and from the code. The dependency on the annotation is optional.

JSON

The relevant classes are also annotated with Jackson annotations, so they can be serialized to and deserialized from JSON.

Serializable

LanguageCode is serializable too, and ensures that deserialization returns the same object for every language. Only the code is non-transient.
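A common way to get this behaviour in Java is the `readResolve` idiom. The following is a minimal sketch of that idiom, not the library's actual implementation: the class `CanonicalCode`, its registry and the `roundTrip` helper are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a language code class; it only illustrates readResolve. */
final class CanonicalCode implements Serializable {
    private static final long serialVersionUID = 1L;

    // One canonical instance per code.
    private static final Map<String, CanonicalCode> REGISTRY = new ConcurrentHashMap<>();

    private final String code;                  // the only state written to the stream
    private final transient String description; // derived data, never serialized

    private CanonicalCode(String code) {
        this.code = code;
        this.description = "language " + code;
    }

    static CanonicalCode of(String code) {
        return REGISTRY.computeIfAbsent(code, CanonicalCode::new);
    }

    String code() {
        return code;
    }

    String description() {
        return description;
    }

    /** Called by Java serialization; swaps the freshly read object for the canonical one. */
    private Object readResolve() {
        return of(code);
    }

    /** Serializes to a byte array and reads the object back. */
    static CanonicalCode roundTrip(CanonicalCode in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream input = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (CanonicalCode) input.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CanonicalCode dutch = CanonicalCode.of("nld");
        System.out.println(roundTrip(dutch) == dutch); // prints "true"
    }
}
```

Because the deserialized object is replaced by the registered instance, identity comparison with `==` keeps working and the transient fields never have to be restored.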

Sortable

The default sort order of a LanguageCode used to be on 'Inverted Name'. There may be more than one (inverted) name though (e.g. Dutch and Flemish). Since 3.0 LanguageCode is no longer sortable. LanguageCode#stream() is sorted by ISO-639-3 code.

Internationalization of language names

The files from sil.org only provide names of languages and language families in English. Java’s java.util.Locale can provide the names of many languages (I presume those with an ISO-639-1 code) in many other languages.

E.g.

 new Locale("en").getDisplayName(new Locale("nl"));

will give the name of the language 'English' in Dutch ('Engels').

To make this available in a generic way, the base interface ISO_639 declares

String getDisplayName(Locale locale)

For ISO_639_1 this will first try the approach above with java.util.Locale. For other codes, and as a fallback for ISO_639_1, it will use the resource bundle org.meeuw.i18n.languages.DisplayNames, where the default and English names are provided by the sil.org files.

Notably, this is the way to have proper names available for sign languages, for example:

assertThat(LanguageCode.languageCode("dse").getDisplayName(new Locale("nl"))).isEqualTo("Nederlandse Gebarentaal");
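The lookup order described in this section can be sketched with plain JDK classes. This is an illustration under assumptions, not the library's code: `DisplayNameSketch` and its `FALLBACK_NAMES` map are hypothetical, and the map merely stands in for the resource bundle. The sketch relies on `Locale#getDisplayLanguage` returning the code itself when the JDK has no name for it.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of "try java.util.Locale first, then fall back to own names". */
final class DisplayNameSketch {

    // Stand-in for a resource bundle: display language -> (code -> name). Illustrative data only.
    private static final Map<String, Map<String, String>> FALLBACK_NAMES =
        Map.of("nl", Map.of("dse", "Nederlandse Gebarentaal"));

    static String displayName(String code, Locale inLocale) {
        String fromJdk = Locale.forLanguageTag(code).getDisplayLanguage(inLocale);
        // The JDK echoes the code when it knows no name for it.
        if (!fromJdk.isEmpty() && !fromJdk.equalsIgnoreCase(code)) {
            return fromJdk;
        }
        Map<String, String> names = FALLBACK_NAMES.getOrDefault(inLocale.getLanguage(), Map.of());
        // Last resort: the code itself.
        return names.getOrDefault(code, code);
    }

    public static void main(String[] args) {
        Locale dutch = Locale.forLanguageTag("nl");
        System.out.println(displayName("en", dutch));
        System.out.println(displayName("dse", dutch));
    }
}
```

What the first call prints depends on the locale data shipped with the JDK in use; the second call is covered either by the JDK or by the fallback map.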

Versions

<1 (2023): developing/testing

1.x (1.0: 2023-11-30): compatible with Java 8, javax.xml, module-info Java 11

2.x (2024-01-28): Java 11, jakarta.xml. jakarta mostly applies to the optional JAXB support (and to some, also optional, validation annotations)

2.1 (2024-02-11): support for retired codes

2.2 (2024-?): migrated support for language code validation from i18n-regions

3.0 (2024-3): Refactoring. Added enum for ISO-639-1 codes. Made syntax forward compatible with records, so getters like getPart1() are dropped in favor of part1(). LanguageCode itself is now an interface. This may be backported to 1.2 for javax compatibility.

3.1 (2024-3): Refactoring. Support for ISO-639-5. Dropped the -3 from the artifact id.

3.6 (2024-8): Support for #getDisplayName
