Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use locale grouping to format numbers #1781

Open
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

nkottary
Copy link

@nkottary nkottary commented Apr 6, 2020

Currently numbers are grouped in groups of 3 regardless of locale. This PR implements grouping by looking at locale. I took some screenshots by using wxIntegerValidator on wxTextCtrl:

Default grouping or grouping with "en_US", "en_UK" etc
grouping_us

Grouping with "en_IN"
grouping_in

Copy link
Contributor

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! We definitely should support locale-specific grouping, however this PR can't be merged as is because:

  1. It breaks compatibility by modifying an existing function: this should be simple to fix by just adding a new function and keeping the old one as a wrapped for it.
  2. It changes Mac code almost certainly incorrectly, which probably explains the crashes in the Travis CI builds under this platform. This should be easy to fix too, just by creating the string properly.
  3. It results in test failures under MSW in AppVeyor builds. I'm not sure why is this so, but it definitely needs to be fixed.

Speaking of tests, it would be great to have tests checking that formatting/grouping in Indian locale works as expected (the test could be skipped if Indian locale is not available). Ideal would be to factor out wxNumberFormatter::AddThousandsSeparators() into a testable public function and add tests to it to the test suite.

Could you please try to fix these problems? If you have any questions, please post to wx-dev to discuss them.

TIA!

include/wx/numformatter.h Show resolved Hide resolved
interface/wx/intl.h Show resolved Hide resolved
src/common/intl.cpp Outdated Show resolved Hide resolved
src/common/intl.cpp Outdated Show resolved Hide resolved
@vadz vadz added the work needed Too useful to close, but can't be applied in current state label Apr 6, 2020
@nkottary
Copy link
Author

nkottary commented Apr 6, 2020

Thank you for the detailed comments @vadz !

Copy link
Contributor

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes, but there seems to be a serious problem here: the code just returns whatever localeconv() or GetLocaleInfo(SGROUPING) returns directly, but they use different formats. Notably MSW doesn't use CHAR_MAX as delimiter, see documentation which probably explains the CI failures. And, FWIW, macOS uses a different, but much simpler convention, with just (primary) groupingSize and secondaryGroupingSize.

Anyhow, we need to standardize on one convention, maybe MSW one as it seems simpler, document it and return the grouping information in the same format under all platforms to make this work.

interface/wx/numformatter.h Show resolved Hide resolved
@nkottary
Copy link
Author

Thanks! I will look into the MacOS thing once I get Windows tests to pass.

@nkottary nkottary closed this Apr 11, 2020
@nkottary nkottary reopened this Apr 11, 2020
@nkottary nkottary closed this Apr 12, 2020
@nkottary nkottary reopened this Apr 12, 2020
@vadz
Copy link
Contributor

vadz commented Apr 19, 2020

Thanks for the updates! Other than the tests failures, that should be fixed, of course, I wonder why do the newly added tests don't just call FormatNumber() directly, with the different values of grouping? This was one of the ideas behind making this function public in the first place...

Also, it's a bit too late for this advice as you've already written the code using this style (and you do not need to change it), but if you write more tests, please feel free to use CATCH framework directly. This is much simpler than using legacy CppUnit macros, as you can just write functions (instead of classes, with methods, special set up and tear down methods and what not) and use a single CHECK() macro instead of many different ones. See e.g. tests/rowheightcache/rowheightcachetest.cpp for a simple example of a test written in this way or tests/validators/valtext.cpp for a closer (but slightly more complicated, as using section) example. Also note that you can combine both tests in the same file, i.e. you can just add TEST_CASE()s to it as in e.g. tests/filename/filenametest.cpp.

@nkottary
Copy link
Author

Looking at the test failures it seems that the grouping string returned from GetLocaleInfo on Windows does not end with '0' when it should. On my laptop I have a Visual Studio 2019 build of this PR with nmake and test.exe runs fine on that. So I'm not sure why it is different on appveyor.

@vadz
Copy link
Contributor

vadz commented Apr 22, 2020

Strange, the behaviour of the system function such as ::GetLocaleInfo() shouldn't depend on the compiler of the build method. It might depend on the OS version but MSDN doesn't say anything about it changing. It could depend on locale, of course. Perhaps you could add some debugging statements to check locale and the value of SGROUPING on AppVeyor?

@nkottary
Copy link
Author

Indeed the grouping string is different on appveyor for en_IN locale. It is 3;2. Whereas on my Linux and Windows 10 systems it is 3;2;0.

@vadz
Copy link
Contributor

vadz commented Apr 26, 2020

This is indeed annoying, apparently locale info depends on MSW version (AppVeyour is probably using something different from normal desktop one). Ideal would be to let the tests work in both cases, but you could also just skip them if the grouping is not the expected one.

Perhaps you could combine it with my suggestion to test FormatNumber() directly to retarget the existing tests to it?

@nkottary
Copy link
Author

Okay, I will add tests for FormatNumber()

@nkottary
Copy link
Author

Regarding the CI failure, I have seen the socketStream test fail before. It passes if I re-run the test by closing and re-opening the PR. In any case, it seems to be unrelated to this PR.

@nkottary nkottary closed this Nov 5, 2020
@nkottary nkottary reopened this Nov 5, 2020
@nkottary
Copy link
Author

nkottary commented Nov 5, 2020

Hi, can this be merged?

Copy link
Contributor

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've redone reviewing this and it should indeed be merged, even though I'd like to make a few more changes first, so it's going to take some time.

The remarks below are mostly reminders for myself, but if you have any comments about them, please don't hesitate to leave your comments.

interface/wx/intl.h Outdated Show resolved Hide resolved
interface/wx/numformatter.h Outdated Show resolved Hide resolved

// Format a number s with the specified thousands separator, decimal separator
// and grouping format for the thousands separator
static void FormatNumber(wxString &s, wxChar thousandsSep,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't remember any more, but was there a reason to pass s as output parameter here instead of just returning it? Is it done for efficiency or is there something else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just followed the declaration of AddThousandsSeparators(wxString& s).

interface/wx/numformatter.h Outdated Show resolved Hide resolved
@@ -69,6 +69,7 @@ wxLANGUAGE_ENGLISH_PHILIPPINES en_PH LANG_ENGLISH SUBLANG_ENGLISH_P
wxLANGUAGE_ENGLISH_SOUTH_AFRICA en_ZA LANG_ENGLISH SUBLANG_ENGLISH_SOUTH_AFRICA LTR "English (South Africa)"
wxLANGUAGE_ENGLISH_TRINIDAD en_TT LANG_ENGLISH SUBLANG_ENGLISH_TRINIDAD LTR "English (Trinidad)"
wxLANGUAGE_ENGLISH_ZIMBABWE en_ZW LANG_ENGLISH SUBLANG_ENGLISH_ZIMBABWE LTR "English (Zimbabwe)"
wxLANGUAGE_ENGLISH_INDIA en_IN LANG_ENGLISH SUBLANG_ENGLISH_INDIA LTR "English (India)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be moved after wxLANGUAGE_ENGLISH_EIRE to keep things in alphabetical order.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a2ef888

src/common/intl.cpp Outdated Show resolved Hide resolved
include/wx/intl.h Outdated Show resolved Hide resolved
src/common/intl.cpp Outdated Show resolved Hide resolved
s_thousandsSeparator = s[0];
const wxString
g = wxLocale::GetInfo(wxLOCALE_GROUPING, wxLOCALE_CAT_NUMBER);
if ( g[0] != '\0')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this test... First of all, I think we should always overwrite s_grouping, as it can have the value corresponding to the old locale. Second, AFAICS g will never have a NUL byte inside it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it is unnecessary. I will remove it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 9ed8ad9

{
const wxString
s = wxLocale::GetInfo(wxLOCALE_THOUSANDS_SEP, wxLOCALE_CAT_NUMBER);
if ( s.length() == 1 )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: need to check whether this works correctly with non-ASCII separators. as in fr_FR.utf8 locale, the thousands separator is 'NARROW NO-BREAK SPACE' (U+202F).

src/common/intl.cpp Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
work needed Too useful to close, but can't be applied in current state
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants