[GR-52096] Determine the locale of Native Image executables at run-time #8295

Open
loicottet opened this issue Feb 1, 2024 · 0 comments

loicottet commented Feb 1, 2024

TL;DR

Localization support in Native Image is currently counter-intuitive to use, because the program's default locale is chosen at build-time and does not depend on the machine executing the compiled program.
We are proposing the following:

  • The default locale of Native Image executables will be determined at run-time instead of build-time;
  • Users will be able to manage which locales are included in executables using the new --include-locales flag:
    • --include-locales=en-GB,de includes the selected locales (Locale.ROOT and Locale.US are always included);
    • --include-locales includes all JDK-supported locales;
  • Resource bundle registrations will be locale-independent (registration makes a bundle available for all included locales at run-time).

Goals

  • Enable Native Image executables to determine their locale based on the machine they run on;
  • Streamline the way locales are handled in Native Image, and make their behaviour match the JDK as much as possible;
  • Ensure those changes have minimal impact on image size and startup time for users who do not use localization features.

Non-goals

  • Provide a JDK-equivalent localization coverage for all users. The localization data required to include every JDK-supported locale is too big to be included in every Native Image executable;
  • Change the resource bundle JSON registration format. The current format is suitable for the proposed changes and previously generated resource configuration files should keep working with the new behavior.

Detailed description

Default locale

Native Image currently determines the default locale of the programs it compiles at build-time.
The default value for this locale is the default locale of the machine the Native Image builder is running on.
Users can specify a different locale using system properties or the (deprecated) -H:DefaultLocale option.

Given that the purpose of localization is to adapt a single program to the various preferences and needs of its users, imposing a single locale on all users of a program runs contrary to the intended use of that feature.
We are therefore proposing to switch the detection of the default locale from build-time to run-time.
This change will pose the following challenges:

Run-time locale detection

On HotSpot, the default locale is detected by platform-specific C code. On Linux, for example, it inspects the values of the LC_MESSAGES and LC_CTYPE environment variables.
This code then fills the locale-specific system properties (user.language, user.country, etc.).
We plan to port this code to the Native Image static libraries and use it to load the default locale on demand at run-time.
That way, there will be no startup time impact on programs that do not require localization support.
This method is already used to set up certain machine-specific system properties, such as user.home and os.version.
We expect the image size impact of this change to be minimal as well, as the ported code consists only of calls to the standard library and string handling operations, plus a 6 KB data header.
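
As a rough illustration of the kind of string handling involved, the sketch below converts a POSIX-style locale name (as found in LC_MESSAGES) into a java.util.Locale and caches the result on first use. It is not the HotSpot or Native Image code: the class name PosixLocaleSketch, the holder class, and the choice to map "C", "POSIX" and malformed values to Locale.ROOT are all assumptions made for this example.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

/** Illustrative sketch only; names and fallback choices are hypothetical. */
class PosixLocaleSketch {

    /** Converts a POSIX locale name such as "fr_CA.UTF-8" into a Locale. */
    static Locale parse(String posixName) {
        if (posixName == null || posixName.isEmpty()
                || posixName.equals("C") || posixName.equals("POSIX")) {
            return Locale.ROOT; // assumption: the real code may choose another default
        }
        String name = posixName;
        int cut = name.indexOf('.');   // drop an encoding suffix such as ".UTF-8"
        if (cut >= 0) {
            name = name.substring(0, cut);
        }
        cut = name.indexOf('@');       // drop a modifier such as "@euro"
        if (cut >= 0) {
            name = name.substring(0, cut);
        }
        String[] parts = name.split("_", 2);
        String language = parts[0];
        String country = parts.length > 1 ? parts[1] : "";
        try {
            return new Locale.Builder().setLanguage(language).setRegion(country).build();
        } catch (IllformedLocaleException e) {
            return Locale.ROOT; // assumption: ignore values that are not well-formed
        }
    }

    /** Holder idiom: the environment is only read when the locale is first requested. */
    private static final class Holder {
        static final Locale DEFAULT = parse(System.getenv("LC_MESSAGES"));
    }

    static Locale defaultLocale() {
        return Holder.DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(parse("fr_CA.UTF-8").toLanguageTag());
        System.out.println(defaultLocale().toLanguageTag());
    }
}
```

The holder class defers the environment lookup until defaultLocale() is first called, which mirrors the on-demand loading described above: a program that never asks for its locale never pays for the detection.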

Supported locales

Native Image already supports including a specific set of locales in the generated executable, using the -H:IncludeLocales and -H:+IncludeAllLocales options.
We plan to repurpose these options to specify the set of locales that are supported by the compiled program.
They will be turned into the --include-locales API option, which can be used either with a comma-separated list of locales to include, or as-is to include all JDK-supported locales.

Including all locales adds around 25 MB of compressed localization data to the image.
We therefore do not plan to include all locales in the image by default.
We propose to include Locale.ROOT and Locale.US by default, as Locale.getAvailableLocales() is required to return at least those two locales.
Locale.ENGLISH will be included as well as a result, since it is a fallback locale of Locale.US.
It is still an open question whether the default locale of the Native Image builder should be included as well.
We could include it if it facilitates the development process and/or makes backwards compatibility easier.

The following examples show how the --include-locales flag will impact the locales supported by the program:

# Includes Locale.ROOT, Locale.ENGLISH and Locale.US (all default)
native-image Main

# Includes all JDK-available locales
native-image Main --include-locales

# Includes Locale.ROOT, Locale.ENGLISH, Locale.US (default),
#          Locale.UK (en-GB), Locale.GERMAN (de),
#          Locale.CHINESE (zh, fallback for zh-CN) and Locale.SIMPLIFIED_CHINESE (zh-CN)
native-image Main --include-locales=en-GB,de,zh-CN
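
The way the comma-separated form expands into a locale set can be modelled in a few lines of Java. This is only a sketch of the behavior described above, not the builder's implementation: the class IncludedLocalesSketch and its methods are hypothetical, the bare --include-locales form (all JDK locales) is not modelled, and the only fallback considered is the language-only parent of a language/country locale.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Illustrative model of --include-locales=<list>; all names are hypothetical. */
class IncludedLocalesSketch {

    /** Returns the default locales plus the listed ones and their language-only parents. */
    static Set<Locale> includedLocales(String optionValue) {
        Set<Locale> result = new LinkedHashSet<>();
        result.add(Locale.ROOT);           // always included
        addWithParent(result, Locale.US);  // always included, pulls in Locale.ENGLISH
        if (optionValue != null && !optionValue.isBlank()) {
            for (String tag : optionValue.split(",")) {
                addWithParent(result, Locale.forLanguageTag(tag.trim()));
            }
        }
        return result;
    }

    private static void addWithParent(Set<Locale> set, Locale locale) {
        if (!locale.getCountry().isEmpty()) {
            // e.g. zh-CN also brings in zh as its fallback
            set.add(Locale.forLanguageTag(locale.getLanguage()));
        }
        set.add(locale);
    }

    public static void main(String[] args) {
        for (Locale locale : includedLocales("en-GB,de,zh-CN")) {
            System.out.println(locale.toLanguageTag());
        }
    }
}
```

With the value en-GB,de,zh-CN the set contains the same seven locales as the third command above; with no value it contains only the three defaults.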

Locale fallback

It is likely that a user's exact locale will not be present as-is in the compiled program.
This will be especially true if a program only includes the three default locales listed above.
In that case, we plan to rely on the same mechanism that HotSpot uses to fall back to a known locale, where the locale is progressively stripped of its more specific elements until a matching locale is found.
For example, in an image containing only Locale.ROOT, Locale.ENGLISH and Locale.US, the locale en-DE would be reduced to en, which corresponds to Locale.ENGLISH.
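
The stripping described above can be sketched with the JDK's own candidate-locale list, which ResourceBundle.Control exposes through getCandidateLocales. The class LocaleFallbackSketch and the idea of passing the image's locales as a plain Set are assumptions made for this example; how Native Image will store and query the included locales is not specified here.

```java
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;
import java.util.Set;

/** Illustrative sketch of locale fallback against a fixed set of included locales. */
class LocaleFallbackSketch {

    /** Returns the most specific included locale that the requested locale falls back to. */
    static Locale resolve(Locale requested, Set<Locale> included) {
        // The JDK produces the list from most to least specific, ending with Locale.ROOT.
        List<Locale> candidates = ResourceBundle.Control
                .getControl(ResourceBundle.Control.FORMAT_DEFAULT)
                .getCandidateLocales("", requested);
        for (Locale candidate : candidates) {
            if (included.contains(candidate)) {
                return candidate;
            }
        }
        return Locale.ROOT; // assumption: ROOT is always part of the image
    }

    public static void main(String[] args) {
        Set<Locale> defaults = Set.of(Locale.ROOT, Locale.ENGLISH, Locale.US);
        System.out.println(resolve(Locale.forLanguageTag("en-DE"), defaults).toLanguageTag()); // en
    }
}
```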

Resource bundle handling

As part of the proposed changes, we plan for run-time lookups of user resource bundles to use the same locale fallback mechanism as the JDK localization support.
We are proposing that a MissingResourceRegistrationError (see #5171) will now only be thrown if a bundle's base name was not registered, regardless of the queried locale.
As a result, the Native Image agent would only register bundle base names, ignoring locales.
Existing configuration files containing resource bundle registrations with a "locales" field would be registered for all locales, regardless of the specified ones.

In order to ensure that a specific bundle/locale combination is present in the image, two conditions would have to be met:

  1. The bundle name should be registered for runtime resource access, either through JSON or a feature, and
  2. The locale should be included in the image, either by default or through --include-locales.

This behavior is described in the following examples, where the Main program simply performs ResourceBundle.getBundle("com.example.Bundle", Locale.getDefault()) and prints the name and locale of the returned bundle.

Only default locales included
# Existing bundles: com.example.Bundle, com.example.Bundle_en_US, com.example.Bundle_fr, com.example.Bundle_fr_CA

# resource-config.json
"bundles": [{
  "name": "com.example.Bundle"
}]

# Only default locales included (ROOT, ENGLISH, US)
> native-image Main

# Running on a computer with default locale en-US
> ./main
com.example.Bundle_en_US

# Running on a computer with default locale fr-CA (fallback to ROOT since FRENCH and FRENCH_CANADA are not included in the image)
> ./main
com.example.Bundle

Additional locales included
# Existing bundles: com.example.Bundle, com.example.Bundle_en_US, com.example.Bundle_fr, com.example.Bundle_fr_CA

# resource-config.json
"bundles": [{
  "name": "com.example.Bundle"
}]

# Default locales (ROOT, ENGLISH, US) and FRENCH included
> native-image Main --include-locales=fr

# Running on a computer with default locale en-US
> ./main
com.example.Bundle_en_US

# Running on a computer with default locale fr-CA (fallback to FRENCH since FRENCH_CANADA is not included in the image)
> ./main
com.example.Bundle_fr

All locales included
# Existing bundles: com.example.Bundle, com.example.Bundle_en_US, com.example.Bundle_fr, com.example.Bundle_fr_CA

# resource-config.json
"bundles": [{
  "name": "com.example.Bundle"
}]

# All locales included
> native-image Main --include-locales

# Running on a computer with default locale en-US
> ./main
com.example.Bundle_en_US

# Running on a computer with default locale fr-CA
> ./main
com.example.Bundle_fr_CA

"locales" field in resource-config.json
# Existing bundles: com.example.Bundle, com.example.Bundle_en_US, com.example.Bundle_fr, com.example.Bundle_fr_CA

# resource-config.json ("locales" field ignored)
"bundles": [{
  "name": "com.example.Bundle",
  "locales": ["", "en_US", "fr"]
}]

# Only default locales included (ROOT, ENGLISH, US)
> native-image Main

# Running on a computer with default locale en-US
> ./main
com.example.Bundle_en_US

# Running on a computer with default locale fr-CA (fallback to ROOT since FRENCH and FRENCH_CANADA are not included in the image)
> ./main
com.example.Bundle
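
The four scenarios above follow from combining the two conditions with the locale fallback. The sketch below models that combination in plain Java so the expected results can be checked without building an image. Everything in it is hypothetical: the class BundleLookupSketch, the three sets passed as parameters, and the use of IllegalStateException as a stand-in for MissingResourceRegistrationError. Only ResourceBundle.Control.getCandidateLocales and toBundleName are real JDK calls.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;
import java.util.Set;

/** Illustrative model of the proposed bundle lookup; not the Native Image implementation. */
class BundleLookupSketch {

    static String lookup(String baseName, Locale defaultLocale,
                         Set<String> registeredBaseNames,
                         Set<Locale> includedLocales,
                         Set<String> existingBundles) {
        // Condition 1: the base name must be registered, whatever the locale.
        if (!registeredBaseNames.contains(baseName)) {
            // Stand-in for MissingResourceRegistrationError.
            throw new IllegalStateException("Bundle not registered: " + baseName);
        }
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
        for (Locale candidate : control.getCandidateLocales(baseName, defaultLocale)) {
            String bundleName = control.toBundleName(baseName, candidate);
            // Condition 2: the locale must be included in the image.
            if (includedLocales.contains(candidate) && existingBundles.contains(bundleName)) {
                return bundleName;
            }
        }
        throw new MissingResourceException("No bundle found", baseName, "");
    }

    public static void main(String[] args) {
        String base = "com.example.Bundle";
        Set<String> existing = Set.of(base, base + "_en_US", base + "_fr", base + "_fr_CA");
        Set<Locale> withFrench = Set.of(Locale.ROOT, Locale.ENGLISH, Locale.US, Locale.FRENCH);
        // Second scenario, fr-CA machine: prints com.example.Bundle_fr
        System.out.println(lookup(base, Locale.CANADA_FRENCH, Set.of(base), withFrench, existing));
    }
}
```

Because the "locales" field of an existing registration is ignored, the fourth scenario is the same call as the first one in this model: only the set of included locales changes the outcome.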

An alternative to this behavior would be to still track bundle/locale combinations in the agent, and include the queried locales in the image without requiring them to be explicitly added. In that case, the last example would include com.example.Bundle_fr and return it when running on a computer with the fr-CA default locale. com.example.Bundle_fr_CA would in any case not be accessible from the image.
This is not our preferred option, as it would enable libraries to force users to include certain locales, but it would be possible to implement if the proposed behavior affects backwards compatibility too much.

@loicottet loicottet self-assigned this Feb 1, 2024
@spavlusieva spavlusieva changed the title Determine the locale of Native Image executables at run-time [GR-52096] Determine the locale of Native Image executables at run-time Apr 29, 2024
Projects
Status: In Progress