Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added I18N support #1769

Merged
merged 13 commits into from May 10, 2014
Merged

Added I18N support #1769

merged 13 commits into from May 10, 2014

Conversation

davebaol
Copy link
Member

Why
As said here this PR is heavily inspired to ResourceBundle + Control class from java.util.
The main advantage, compared to java ResourceBundle, is that Android 2.2 is fully supported (Control class has been introduced in Android 2.3).
Also, under certain circumstances looks like Android API uses an inconsistent encoding for property files loaded through a ResourceBundle. This issue is fixed.

What's changed
The default behavior has changed. Bundles are loaded from file system (actually from FileType.Internal) instead of classpath.
This change implies that the bundle name is now expressed as a relative path rather than as a fully qualified class name.
Also, due to the use of relative paths, bundle names are no more unique. For example, the bundle name "foo/bar/Message" on FileType.Interal and the same bundle name on FileType.Local represent different bundles (it might depend on the specific backend though).

What's less
The only missing features are ListResourceBundle and multiple class loaders support.
However both features are pretty much useless for our purposes, so it's not really a big loss.

What's more

  • Property files with an encoding other than the usual ISO-8859-1 are supported. This feature might be helpful for property files compatible with the 2nd approach below of GWT implementation (GWT I18N expects UTF8 files).
  • Properties in the form of a XML document are supported. To load xml properties files from the FileType.Internal file system just pass one of the following Control instances to getBundle method:
    • PropertyFileControl.INTERNAL_XML_ONLY to access xml files only
    • PropertyFileControl.INTERNAL_XML_AND_PROPERTIES to access xml and key/value pairs files (xml has higher priority)
    • PropertyFileControl.INTERNAL_PROPERTIES_AND_XML to access xml and key/value pairs files (xml has lower priority)

To load xml files from any file system other than FileType.Internal you have to extend PropertyFileControl and override its getFormats method accordingly.

And GWT?
The GWT implementation is not part of this PR. In fact, it's not yet clear what's the best way to implement this feature.
From a first analysis there seem to be two viable approaches:

The second approach seems to be more promising.

I'll be on vacation the next days, also I'm not a GWT expert so don't expect a GWT version soon, at least not by me.
Anyone who wants to contribute with ideas or code is welcome, of course.

@NathanSweet
Copy link
Member

I don't know the java.util classes. Maybe some examples in the gdx-tests would help me understand why so much code is needed for i18n?

For i18n it seems like you want to decide on a language somehow, then have a map containing the text name as a key and the localized text as a value. Seems like this could be done in a single class rather simply and would be GWT compatible.

An additional feature might be handling multiple amounts (1 vs >1 (plural) for English, 1, 2-4, >5 for Croatian, Russian and others, etc).

@davebaol
Copy link
Member Author

Yeah sorry I've forgot to commit the test class. I'm gonna do it later when I'll come back home.

@davebaol
Copy link
Member Author

Just committed the test class. It uses 2 bundles, namely "data/i18n/message1" and "data/i18n/message2". Both have 3 locales: default, English and Italian.
Note that I've deliberately chosen to call getBundle every time in order to debug the interleaved search of different bundles and the access to the internal cache.

I wanted to port the ResourceBundle API because is certainly something most Java developers already know and it's well specified too.

Also, this PR has a lot of code because ResourceBundle API is extremely versatile and customizable. You can customize almost everything, e.g. cache policy, search strategy, file format and file location.
Actually, if I had not wanted to support Android 2.2 the class PropertyFileControl alone would have sufficed, nothing more.
Nevertheless, the default behavior is very simple to use, as you can see from the test class.

About plurals, you can still implement this feature over the ResourceBundle API. In fact the API does not even force you to use a specific way to parametrize your localized strings. You're free to use java.text.MessageFormat, String.format() or whatever you want.

EDIT:
I've just noticed that the file tests/gdx-tests-android/assets/data/i18n/message1_it.properties defines the key msg multiple times. It doesn't make sense of course, but it's not a problem since the last definition, which is the correct one, overwrites the previous ones.

@BlueRiverInteractive
Copy link
Contributor

I really would love to see internationalization support in libgdx. But I'm not sure if it's the correct approach to just adapt the java.util classes. For me it would seem more natural if libgdx had it's own simplified way to handle multiple languages and actually one class should be sufficient to handle all this as NathanSweet already suggests.
I'm also using ResourceBundles in my own game, but I always found them rather bloated and actually they were missing some important features that I had to implement on my own.

The following features seem reasonable for most games:

  • Automatically select the correct language file (first look for region (en_US), then for language (en), or fall back to base file or base language, if no base file available)
  • Parse the property list or xml to a key-value map and get a localized string via a simple method like getString(String key)
  • Strings can be parametrized, like "This will cost you {0} coins. You currently have {1} coins!". By using an overloaded getString(String key, Object...params) method, those parameters can be filled.
  • Strings can reference other strings, like "Buy for {0} {currency_symbol}?" During the parsing process those references will be automatically filled in.

In the end it should be as simple as:

Localization gameStrings = new Localization(/* base name: */"game_strings");
gameStrings.setLocale(Locale.ENGLISH); // Optional to override the default locale.
gameStrings.loadPropertiesList(/* internal path: */ "data");
String myString = gameString.getString("my_string");

@davebaol
Copy link
Member Author

IMO the default behavior of java ResourceBundle is even simpler than your example.

  • Load a bundle for the default locale with ResourceBundle.getBundle("baseName")
  • Use Locale.getDefault() to get the default locale of the JVM (its initial value is the system locale).
  • Use Locale.setDefault(new Locale(lang, country, variant)) to change the JVM locale.
  • If you don't want to change the default locale you can load a bundle for a given locale with ResourceBundle.getBundle("baseName", myLocale)
  • Given the ResourceBundle rb you can get a localized string with rb.getString(key) which returns the 1st value found for the key in the parent chain of the bundle up to the base bundle.
  • You can use java.text.MessageFormat to parametrize your strings. It allows you to localize parameters too. And you can give them a type. For example the float argument 3.14 will be rendered as 3,14 for the locale it_IT (yes, in Italy we use the comma instead of the decimal point).

Finally on GWT side there are some intrinsic issues that are properly managed by its native localization system. For example the default locale is determined by the client, not by the JVM which is server side of course.

@davebaol
Copy link
Member Author

Just to keep you informed.
From the GWT I18N documentation:

The format of the messages in the properties files follows the specification in Java MessageFormat.

Here is an example of parametric properties using MessageFormat syntax:

lastUpdateTime=Last update: {0,date,medium} {0,time,medium}
fileSize=File is {0,number,#.#}MB in size.

Also, on a side note, GWT I18N already support plural forms

@badlogic
Copy link
Member

@BlueRiverInteractive i'm actually in big favor of your proposal which
sounds lightweight and sane. I'm also not sure i want to copy the
bloated java.utils stuff to libGDX. First PR wins :)

On 27.04.14 02:11, BlueRiverInteractive wrote:

I really would love to see internationalization support in libgdx. But
I'm not sure if it's the correct approach to just adapt the java.util
classes. For me it would seem more natural if libgdx had it's own
simplified way to handle multiple languages and actually one class
should be sufficient to handle all this as NathanSweet already suggests.
I'm also using ResourceBundles in my own game, but I always found them
rather bloated and actually they were missing some important features
that I had to implement on my own.

The following features seem reasonable for most games:

  • Automatically select the correct language file (first look for
    region (en_US), then for language (en), or fall back to base file
    or base language, if no base file available)
  • Parse the property list or xml to a key-value map and get a
    localized string via a simple method like getString(String key)
  • Strings can be parametrized, like "This will cost you {0} coins.
    You currently have {1} coins!". By using an overloaded
    getString(String key, Object...params) method, those parameters
    can be filled.
  • Strings can reference other strings, like "Buy for {0}
    {currency_symbol}?" During the parsing process those references
    will be automatically filled in.

In the end it should be as simple as:
Localization gameStrings = new Localization("game_strings");
gameStrings.loadPropertiesList("data");
String myString = gameString.getString("my_string");


Reply to this email directly or view it on GitHub
#1769 (comment).

@davebaol
Copy link
Member Author

Well, if the approach is "first PR wins" then I give up immediately.
It doesn't make sense that more people work simultaneously at the same stuff in a competitive way, especially for an open source project. It sounds ridiculous.
However, if you want to address the issue in a cooperative way I'll be glad to contribute.

@davebaol davebaol closed this Apr 27, 2014
@davebaol
Copy link
Member Author

Also, I'd like to point out that before doing this PR I tried to discuss together a possible implementation.
Unfortunately, it seems that the focal point was whether to make it a core package rather than an extension, which initially is probably less important than some technical aspects, such as:

  • Should it be thread-safe?
  • Should you wrap the native I18N system of GWT (which already solves a number of problems) or not?

@NathanSweet
Copy link
Member

I don't think thread safety is an issue, very little of libgdx is thread safe.

The test helps a little. So I get a ResourceBundle, with or without specifying a locale, and I get strings from that.

It's hard for me to judge the java.util stuff as I've not localized a Java app (though I have localized other software). It looks like a lot of code with a number of classes and I think the complexity turns people off.

I'd like to see something like:

  1. The text (name to localized text entries) is in files named after the locale (en, de, etc). Format should be simple, easy to parse (eg disallow = in the name then name=anything to eol).
  2. Something knows how to load the text from those files into an ObjectMap (which excels at readonly data) in two ways: a) by specifying a language, or b) by whatever language the user has by default.
  3. I can look up a string in the map. A static method would probably suffice, Text.get("name"). If we had Text._("name") then a static import would allow _("name") which is less noise for something that gets littered throughout your code (ala PHP's gettext).
  4. I can do some sort of formatting for the text, eg "I have $x apples". Don't care how it's done as long as it is simple and works everywhere it needs to. Handling plurals, currency or whatever extra would be a bonus as long as it doesn't make things complicated.

It seems like this could be done in a single class without getting too complex. Probably a good portion of what you have could be used, just simplified.

Most apps only need a single text file, but maybe an RPG or something wants multiple files. We could allow multiple files to be specified and stuffed into a single map. Or we could load them into separate maps where one is the default and the others can optionally be specified when getting a string or the default is changed (whatever keeps API usage simple).

@BlueRiverInteractive
Copy link
Contributor

@badlogic @NathanSweet : Here is something quick that supports all of the features you noted in one simple class: http://pastebin.com/CLdX0dsb. It's just a start, it could be enhanced with cross string linking, automatic escaping of single apostrophes, allowing comments. But if you like it, I can create a new pull request.
Usage is quite simple:

// First create a new localization of one basename. Supply just any internal file path or filehandle of a language file.
// the language file is split into: basename_locale.extension
Localization loc = new Localization("internal/path/to/any/language/file/text_en.anything"); 
// Read in the language file and fill the string map.
loc.load();
// Get a string and supply additional parameters. They will be automatically formatted via the common MessageFormat pattern: http://docs.oracle.com/javase/7/docs/api/java/text/MessageFormat.html
System.out.println( loc.get("test", 123456789) );

Example language file:

test=Format my currency: {0,number,currency}
test2=Some twisted parameters: {1} {2} {0}
test3=Single apostrophe needs to be escaped ''apostrophe'' atm

@erodozer
Copy link

I feel there's nothing wrong with using GNU's gettext po files with java's MessageText + ResourceBundle aside from having to convert them to java classes. Just trying to understand here, but is this issue just saying LibGDX should have its own built in parser/api for PO files to improve ease of use and compatibility with Android?

@davebaol
Copy link
Member Author

I think the real problem is the GWT backend, not Android.
You don't have a lot of classes on GWT, e.g. MasageFormat and Locale are missing and a porting would mean hundreds lines of code once again.
Basically, from previous discussions, it's clear to me that you can approach the issue in two opposing ways:

  • Writing a simple API using new ad-hoc classes that are able to work as-is for all backends
  • Writing a simple API reusing standard well-established classes like Properties, Locale, MessageFormat for all non-GWT backends and then implementing the same API for the GWT backend just wrapping its native I18N system which is compatible with ".properties" files and MessageFormat syntax.

TBH I think I'm in favor of the 2nd approach.
However, regardless of the approach, I think the following features should be supported:

  • Different languages have to be in different files. This is how translators usually work. They start from a copy of the english file (or any other language) and overwrite each string with their translation. Also, this allows you to add the support for a new language avoiding to compromise the other ones.
  • In parametrized strings, such as coveredPath=You covered {0,number}% of the path and lastUpdate=Last update occurred on {0,date}, the actual arguments you pass have to be automatically localized, meaning that main types like numbers and dates are properly transformed to a string based on the locale. For similar stuff the class MessageFormat really comes in handy. Also, passing arguments by position (paramenters are identified by an ordinal number) allows you to put them in different order for different translations.
  • A parent chain based search strategy over the locales, meaning that a bundle is a list of hash tables. For example, if the user locale is fr_CA, i.e. language French and country Canada, the requested string is looked up in the following order:
    • bundleName_fr_CA
    • bundleName_fr
    • bundleName

I'm pretty sure that just removing from this PR stuff like thread-safety, cache management, customizable control and customizable search strategy the remaining code size will be about 10% of the current one.
Also, the 1st and the 3rd features mentioned above are already implemented. And the 2nd feature is really trivial to implement if you use MessageFormat, actually 5 lines of code or so.

Sample strings

newMission={0}, you have a new mission. Reach level {1}.
coveredPath=You covered {0,number}% of the path
lastUpdate=Last update occurred on {0,date}

Common API
Basically the common API should be something like

I18NBundle bundle = I18N.createBundle("relative/path/to/MyBundle");
String mission = bundle.getString("newMission", player.getName()), nextLevel.getName());
String coveredPath = bundle.getString("coveredPath", path.getPerc());
String lastUpdate = bundle.getString("lastUpdate", change.getDate());

GWT backend
Let's assume we want to use the 2nd approach, so leveraging the GWT I18N native system.
From what I've learned in the last few days reading the online documentation, a compile step is needed.
The i18nCreator script scans all the property files (UTF8 encoding required) and generates the corresponding java classes and interfaces.
Then you use it like that:

MyBundle bundle = (MyBundle) GWT.create(MyBundle.class);
String mission = bundle.newMission(player.getName()), nextLevel.getName());
String coveredPath = bundle.coveredPath(path.getPerc());
String lastUpdate = bundle.lastUpdate(change.getDate());

This example clearly shows that you have to use reflection to implement the common API for the GWT backend.
Is it still acceptable? I think so since you usually don't call I18N functions on each frame.
Also this approach should keep the code small and simple for both implementations of the common API.
What do you think?

@davebaol
Copy link
Member Author

Ok last night I've quickly removed all the bloated stuff, and the common API is working as described above on non GWT backends.
Then I started experimenting with GWT using reflection to implement the common API. Unfortunately, I get the compile error Only class literals may be used as arguments to GWT.create()
due to the fact that I have to use something like GWT.create(type.getClassOfType()) instead of GWT.create(MyBundle.class) since I don't know the class at compile-time but only the class name at run-time. So the error is pretty clear, it's a limitation imposed by GWT itself.
I've googled a bit searching for a solution and I only found complex tricks about generators, deferred binding,.... Damn, I'm confused, it sounds too advanced for my current GWT skills. 😕
Hope someone of you, GWT gurus, can help me to solve this problem. 😊

@NathanSweet
Copy link
Member

@BlueRiverInteractive your pastebin is private.

@nhydock Using PO/MO has the benefit of using the various tools for editing/merging. It's not a terribly simple format though, I wonder if it is worth using?

@davebaol I meant one file per language. Not sure it's essential we chain string lookups, at least it can be worked around if we don't have this. If it is easy enough it is a nice feature.

For GWT's i18n stuff, having to compile property files and then use reflection doesn't seem great. Why can't GWT just load and parse data from a file? Ok, there's no MessageFormat and presumably providing that is too much effort, so we can write something that replaces tokens and GWT just doesn't get the advanced features. If you develop for GWT you have a number of things to keep in mind for your app to work, this is just one more item in the list.

@BlueRiverInteractive
Copy link
Contributor

http://pastebin.com/CLdX0dsb is now public. Sorry

@BlueRiverInteractive
Copy link
Contributor

@NathanSweet For GWT the only thing to adapt would be MessageFormat.format(). Simple arguments (like {0}) should be easily to support with something like:

for (int i = 0; i < params.length; i++) {
    message = message.replaceFirst("\\{[" + i + "]+\\}", String.valueOf(params[i]));
}

[EDIT] Locale probably could be replaced with GwtLocale?

@davebaol
Copy link
Member Author

@NathanSweet

I meant one file per language.

Sorry, I didn't understand.

Not sure it's essential we chain string lookups, at least it can be worked around if we don't have this. If it is easy enough it is a nice feature.

Parent chain is already implemented and working. And GWT I18N uses the same search strategy. Well, actually if I got it right GWT I18N has no parent chain at runtime since it resolves missing properties at compile time.

For GWT's i18n stuff, having to compile property files and then use reflection doesn't seem great.

Yeah, I agree. it's a bit redundant but I think there's no other way to implement the common API if you use GWT I18N.

Why can't GWT just load and parse data from a file? Ok, there's no MessageFormat and presumably providing that is too much effort, so we can write something that replaces tokens and GWT just doesn't get the advanced features.

AFAIK If you use properties GWT I18N only works through a compile step (by launching i18nCreator script) regardless of the use of parametric messages. Actually you must tell to i18nCreator if property files are constant or parametric and it will create classes that implement Constants or Messages interfaces respectively. All methods, corresponding to a property, in Constants subclasses have no arguments of course.
Unfortunately, my issue occurs on GWT.create() which should create the bundle. Since you have to create the bundle before accessing its properties, using Constants in place of Messages makes no difference.

Finally, if we can't make GWT I18N work via reflection then I'm afraid that the only viable path is the first approach mentioned a few posts above, which likely means more code and less features. 😞

@NathanSweet
Copy link
Member

@BlueRiverInteractive your class is simple, but doesn't have all the basic features (eg auto choosing a language file).

@davebaol only GWT needs to suffer from less features.

@BlueRiverInteractive
Copy link
Contributor

@NathanSweet I'm not sure what you mean by auto choosing a language file.
If locale.load() is called, the language file of the current system language gets loaded. First language+country, then language, then base. This means auto-fallback for non-localized strings of the specified/current language to the base file. It shouldn't be too difficult to let the developer specify a custom language path, like de_DE -> de -> en_US -> en -> base.
Or do you mean that the developer should be able to load multiple language files at the same time, like: texts_en.language, urls_en.language, img_en.language ?

@MobiDevelop
Copy link
Member

After looking over all of this, I'd favor a simple interface that employs a simple parse and replace strategy on GWT. The added complexity of using GWT's built-in i18n implementation isn't justified by the benefits, in my opinion. In any case, the common api should be flexible enough that the underlying mechanism doesn't really matter.

@badlogic
Copy link
Member

just came home, didn't look at any source, but why is reflection required
at all?

@Davebol i understand that this is frustrating, especially if you put in
the time to actually implement things. however, big additional features add
more surface area and more code to maintain once the original author is no
longer available. also, we should iterate the api before it gets public. we
have to consider inclusions of such big features carefully. I wished we had
time to carefully review every PR at the depth level it deserves, that may
sometimes not be possible so, sorry for that.
On Apr 29, 2014 7:09 PM, "Justin Shapcott" notifications@github.com wrote:

After looking over all of this, I'd favor a simple interface that employs
a simple parse and replace strategy on GWT. The added complexity of using
GWT's built-in i18n implementation isn't justified by the benefits, in my
opinion. In any case, the common api should be flexible enough that the
underlying mechanism doesn't really matter.


Reply to this email directly or view it on GitHubhttps://github.com//pull/1769#issuecomment-41703771
.

@davebaol
Copy link
Member Author

Ok after discussing with Mobidevelop on irc I think we've finally reached a good balance between functionality and simplicity.

  • A simple common API, like the one defined a few post above, able to read porperty files and supporting replacement of arguments given in positional order.
  • A simplified replacement sintax, like {0}, {1}, {2} and so on, that will be supported by all backends, GWT included.
  • A typed and localizable replacement syntax via MessaFormat, i.e. {0}, {1,number,#.##}, {2,date} and so on, that will be supported by all non GWT backends.

This policy falls in line with other APIs that aren't fully supported by GWT.

@badlogic Don't worry, code size has dramatically decreased after removing thread-safety, cache management, customizable control and customizable search strategy. Now that the default behavior of java ResourceBundle is implemented directly without all that bloated stuff the source code is rather readable. The PR has not been updated yet.
Also, I was just investigating a possible implementation of the GWT backend. Reflection was just a consequence of the initial idea (apparently good) to use GWT native I18N which was (is?) an unknown creature for most people here. Once realized all the implications behind its use we abandoned the idea.

@badlogic
Copy link
Member

Cool, thanks for sticking with us even if it's a bit comlicated at times.
Since, @MobiDevelop took the lead on IRC, i'll let him pull the trigger
once the PR is in place (and because i'm lazy and on vaccation this week)
On Apr 29, 2014 10:57 PM, "davebaol" notifications@github.com wrote:

Ok after discussing with Mobidevelop on irc I think we've finally reached
a good balance between functionality and simplicity.

  • A simple common API, like the one defined a few post above, able to
    read porperty files and supporting replacement of arguments given in
    positional order.
  • A simplified replacement sintax, like {0}, {1}, {2} and so on, that
    will be supported by all backends, GWT included.
  • A typed and localizable replacement syntax via MessaFormat, i.e.
    {0}, {1,number,#.##}, {2,date} and so on, that will be supported by all non
    GWT backends.

This policy falls in line with other APIs that aren't fully supported by
GWT.

@badlogic https://github.com/badlogic Don't worry, code size has
dramatically decreased after removing thread-safety, cache management,
customizable control and customizable search strategy. Now that the default
behavior of java ResourceBundle is implemented directly without all that
bloated stuff the source code is rather readable. The PR has not been
updated yet.
Also, I was just investigating a possible implementation of the GWT
backend. Reflection was just a consequence of the initial idea (apparently
good) to use GWT native I18N which was (is?) an unknown creature for most
people here. Once realized all the implications behind its use we abandoned
the idea.


Reply to this email directly or view it on GitHubhttps://github.com//pull/1769#issuecomment-41731184
.

@davebaol
Copy link
Member Author

Me too. I'll be on vacation the next few days and my beloved PC will stay at home. Every now and then you have to take a break. :)

@davebaol
Copy link
Member Author

davebaol commented May 1, 2014

Bad weather messed up our vacation, so we stay home. 😞
But it means that I can work on this PR. 😄
I'm going to update it soon.

@davebaol
Copy link
Member Author

davebaol commented May 8, 2014

On desktop en != it

en -> { string=LibGDX - PI=3.142 - today=5/8/14 9:55 PM - time=5/8/14 9:55 PM }
it -> { string=LibGDX - PI=3,142 - today=08/05/14 21.55 - time=08/05/14 21.55 }

on GWT en == it

en -> { string=LibGDX - PI=3.142 - today=5/8/14 9:55 PM - time=5/8/14 9:55 PM }
it -> { string=LibGDX - PI=3.142 - today=5/8/14 9:55 PM - time=5/8/14 9:55 PM }

@NathanSweet
Copy link
Member

Thanks. We could do the simple token replace ourselves, before passing to MessageFormat. That or a setting to turn off MessageFormat.

@davebaol
Copy link
Member Author

davebaol commented May 8, 2014

lol I committed while you were commenting. I hope the commit is fine.

@davebaol
Copy link
Member Author

davebaol commented May 9, 2014

Sorry, I forgot to update TextFormatter GWT emu.
Now it's done.

@NathanSweet
Copy link
Member

I think it is looking good. Talked with Mario and he signed off on it. Can you add a note to CHANGES with a link to the wiki page? Then we'll merge!

Probably better not to support Output/InputStream in PropertiesUtils. It's character based so should take a reader and make it the responsibility of the user to choose the encoding.

One last thought, would it be better to do simple token replace on {x} when no formatting type is specified? So we'd do the simple token replace ourselves, before passing to MessageFormat. It's pretty disturbing that MessageFormat always munges token replaces. Also this would get rid of the need for setSimpleFormat. I'm fine with your decision either way, just a thought.

@davebaol
Copy link
Member Author

davebaol commented May 9, 2014

I think it is looking good. Talked with Mario and he signed off on it. Can you add a note to CHANGES with a link to the wiki page? Then we'll merge!

Ok, do I have to sign the CLA?

Probably better not to support Output/InputStream in PropertiesUtils. It's character based so should take a reader and make it the responsibility of the user to choose the encoding.

Ok, I'll remove IO streams support. Actually both methods are never used since I18NBundle already calls the method that takes a reader while the method that takes a writer is there just for completeness' sake, which makes sense for an utility class.

One last thought, would it be better to do simple token replace on {x} when no formatting type is specified? So we'd do the simple token replace ourselves, before passing to MessageFormat. It's pretty disturbing that MessageFormat always munges token replaces. Also this would get rid of the need for setSimpleFormat. I'm fine with your decision either way, just a thought.

You already said it before and I didn't reply because... well... either I'm misinterpreting your words or you're missing something. Not sure which one of the two. 😄
Anyways, I18N always uses TextFormatter whose constructor takes a locale and a flag indicating whether to use MessageFormat.format() or the private method simpleFormat() that does simple token replacement on {x}. See javadoc and code at https://github.com/davebaol/libgdx/blob/49b7ada38cf7398d6a7320620a47017ec8fa0be9/gdx/src/com/badlogic/gdx/utils/TextFormatter.java#L35
So I can't see how we can get rid of the need for setSimpleFormat without expecting that type of information as an argument of the factory mehod createBundle.
That's why in an older commit I18NBundle was abstract and the developer had to specify the concrete bundle class to be used, so having to use internally both generics and reflection.
Of course createBundle might take a boolean useSimpleFormat which defaults to false.
In any case the API can't make an autonomous decision.

@davebaol
Copy link
Member Author

davebaol commented May 9, 2014

Updated PropertiesUtils.
Do you mind if I update CHANGES in a new PR later in the weekend after publishing the wiki page?

@NathanSweet
Copy link
Member

Yes, pretty please send a CLA:
https://github.com/libgdx/libgdx/wiki/Contributing

Yeah it's fine to do the CHANGES later.

You already said it before and I didn't reply because... well... either I'm misinterpreting your words or you're missing something. Not sure which one of the two.

We apply MessageFormat to the string. If the string has {0} and is passed a float (eg PI in your example), MessageFormat formats the float using whatever Locale stuff it does. This may be annoying, because what if I don't want MessageFormat to mess with my value? It seems unexpected since I didn't give it a format style.

If we do simple token replace on the string before passing it to MessageFormat, then {0} would get the value unmolested. If you want the MessageFormat stuff you'd use something like {0,short}. With this change, now there is no difference for simple tokens like {0} between GWT and non-GWT, so we don't need setSimpleFormat at all.

@davebaol
Copy link
Member Author

davebaol commented May 9, 2014

Oh ok, now I see what you mean and of course I was misinterpreting, sorry. 😊
Your suggestion makes sense for our issue but generally speaking what you call an unexpected behavior is a feature IMO.
I mean, the idea behind this behavior is that if you use the short notation {0} the class of the actual argument determines the type and the style of its format, where format = {index[,type[,style]]}.
It's like all "formattable" java classes have a default type and style.
if you want to keep your value unmolested you have to pass it as string since strings are never formatted or localized.

If it's ok for you I'd prefer to keep setSimpleFormat so not to lose localization on short notation.

@MobiDevelop
Copy link
Member

I'd think unexpected behavior should be avoided.

@davebaol
Copy link
Member Author

If it was unexpected, but it's not.
The goal of MessageFormat is to support patterns with argument localization. So it would be unexpected if localizable arguments were not localized.

@MobiDevelop
Copy link
Member

But the goal of a common api is to have it work "the same" on the different
back ends.

@davebaol
Copy link
Member Author

Then we should completely remove MessageFormat since GWT only supports simple format. :P
Also I don't see what's the problem since the documentation already has to treat GWT as a special case. We might simply write something like: "GWT only supports simple format {x}. It's recommended to call setSimpleFormat(true) to force consistent behavior on all backends".
Anyways don't want to be rude but after 2 weeks, 80 posts and a dozen of commits I hope you guys can accept this PR like it is right now and let the users test it before modifying once again the behavior of a standard Java API. :)

@NathanSweet
Copy link
Member

It's been quite a long discussion. I know you've been patient and put in a lot of work, but I think the discussion has been a good one. Solely because it has been so long isn't a good reason to stop. If we wanted to change how replacement is done, doing so after users are using it isn't great, though I'm not afraid of doing that.

Only supporting the lowest common denominator isn't great because not everyone uses GWT. I do think there is some value in having the lowest common denominator be the same, so non-GWT works the same as GWT but has more features. We established earlier that formatting is an optional part of the API, people can get the strings and apply their own formatting if they don't like our (nonstandard) way of doing it.

If you feel strongly about GWT and non-GWT being different, I'm ok with it. We'll merge now, feel free to make changes in a new PR if you like. I promise to try not to drag it out into weeks of discussion! :D

FWIW and a bit OT, copying a standard Java API is almost never ideal IMO (acronym combo!). Those APIs are not always well designed and they evolve without breaking backward compatibility, often resulting in a mess. It's better to review how it works there and cherry pick the things that are good. People familiar with the Java API can easily learn the libgdx API. Where the Java API is good the libgdx API will be the same, where it isn't the libgdx API will be better. I don't think there is an expectation that the APIs match.

For example, Array doesn't match ArrayList in a number of ways. I prefer insert(int, Object) over ArrayList's add(int, Object), removeIndex(int) over ArrayList's remove(int) (which is ambiguous with remove(Object) for ArrayList<Integer>), etc.

NathanSweet added a commit that referenced this pull request May 10, 2014
@NathanSweet NathanSweet merged commit b205f4a into libgdx:master May 10, 2014
@davebaol
Copy link
Member Author

Thanks Nate.
For me there are at least 2 good reasons:

  • A psychological reason. I was getting a bit stressed. I needed to be sure that my work would have been merged before investing more effort into it.
  • A practical reason. According to the release plan recently adopted by libgdx we have about 2 weeks to tweak the API before the next release. In the meantime users can easily test it using the nightly builds.

Now I'm going to update the documentation.

@badlogic
Copy link
Member

Quick question, is the wiki page up already? I tried finding it but failed.

@badlogic
Copy link
Member

Sorry, i'm dumb, you are still working on that. Take your weekend off, no need to rush it!

Thanks for the this very valuable contribution and sticking with us silly nitpickers :)

@nooone
Copy link
Contributor

nooone commented May 10, 2014

May I suggest adding an AssetLoader<I18NBundle>?

@davebaol
Copy link
Member Author

Yeah it's already in my todo list.
I want to publish the wiki page first, so interested users can start testing.

@davebaol
Copy link
Member Author

I18N wiki page

@davebaol
Copy link
Member Author

@NathanSweet

We apply MessageFormat to the string. If the string has {0} and is passed a float (eg PI in your example), MessageFormat formats the float using whatever Locale stuff it does. This may be annoying, because what if I don't want MessageFormat to mess with my value? It seems unexpected since I didn't give it a format style.

If we do simple token replace on the string before passing it to MessageFormat, then {0} would get the value unmolested. If you want the MessageFormat stuff you'd use something like {0,short}. With this change, now there is no difference for simple tokens like {0} between GWT and non-GWT, so we don't need setSimpleFormat at all.

Ok I'm investigating your idea and I think there are some non-trivial implications.
For example when replacing {n} the actual argument might contain single quotes or, even worse, some other {m}. Those {m} will be replaced by MessageFormat.format(). We don't want this 2-level-recursive behavior because it deviates from the original MessageFormat behavior and might also cause an exception if m is out of bounds.
I think we must escape the actual arguments to be replaced just before replacing them so MessageFormat.format() will consider them as a constant part of the pattern.
Not sure if this is enough though. Any thoughts?

@NathanSweet
Copy link
Member

True, if the simple token {0} is replaced with text containing single quote or other curly bracketed tokens, it would be a problem. I guess we'd have to replace the simple tokens to be surrounded by single quotes '{0}', apply MessageFormat, then replace the single quotes and simple tokens. Fun! :D

@davebaol
Copy link
Member Author

Eh it's more complicated than this if we want it really robust :|

I guess we'd have to replace the simple tokens to be surrounded by single quotes '{0}',

this is true only for simple tokens not escaped with {, because the original string is assumed to use our escaping rule

apply MessageFormat,

it means we have to convert our escaping rule to MessageFormat escaping rules while replacing non-escaped simple tokens with '{n}'

then replace the single quotes and simple tokens.

This is tricky because some occurrences of '{n}' might have been generated by MessageFormat, so we shouldn't replace them.

TBH I think that having to use setSimpleFormat(true) is annoying but far safer.

@NathanSweet
Copy link
Member

It is safer, though I don't think it's common for the MessageFormat replace to output {n}. I'm fine with leaving it as is.

@davebaol
Copy link
Member Author

Yeah far from being likely but still possible.
I leave it as is for now since it's ok for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants