Not supporting en_001, en_150 or en_US_POSIX locale #58

Closed
mattjgalloway opened this issue Jul 17, 2020 · 3 comments · Fixed by #148

Comments

@mattjgalloway
Contributor

I'm starting to use Boost.Locale in a project and have run into a problem when the system locale is en_001. I believe en_150 and en_US_POSIX are affected as well.
The problem is best illustrated by the following test case:

#include <boost/locale.hpp>
#include <iostream>

int main() {
  // An empty locale id makes the generator use the system default locale.
  boost::locale::generator gen;
  std::locale loc(gen(""));
  std::locale::global(loc);
  std::cout.imbue(loc);

  // The info facet reports what Boost.Locale parsed out of the locale name.
  const auto& info = std::use_facet<boost::locale::info>(loc);
  std::cout << "LOCALE NAME: " << info.name() << std::endl;
  std::cout << "LOCALE LANG: " << info.language() << std::endl;
  std::cout << "LOCALE COUNTRY: " << info.country() << std::endl;
  std::cout << "LOCALE ENCODING: " << info.encoding() << std::endl;
  std::cout << "LOCALE UTF8: " << info.utf8() << std::endl;

  return 0;
}

If the system locale is en_001, which Windows lists as the "English (World)" region, the output is:

LOCALE NAME: en_001.UTF-8
LOCALE LANG: en
LOCALE COUNTRY:
LOCALE ENCODING: us-ascii
LOCALE UTF8: 0

I would expect the output to be:

LOCALE NAME: en_001.UTF-8
LOCALE LANG: en
LOCALE COUNTRY: 001
LOCALE ENCODING: utf-8
LOCALE UTF8: 1

The cause is that boost::locale::util::locale_data::parse_from_country assumes the country contains only the characters 'a' to 'z' or 'A' to 'Z'. But en_001 and en_150 are valid locales whose region part is numeric. Rejecting the country there would also explain why the encoding stays at us-ascii even though the name ends in .UTF-8. en_US_POSIX should probably be handled separately, as it's a special case.
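
To make the failure mode concrete, here is a minimal sketch of a parser with the same shape. It is not the actual Boost.Locale code: `ParsedLocale`, `is_valid_region` and `parse_locale_name` are hypothetical names, the "C" / "us-ascii" defaults are taken from the observed output above, and the early return on an invalid region is my assumption about why the encoding is never read.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not a Boost.Locale type.
struct ParsedLocale {
    std::string language = "C";
    std::string country;
    std::string encoding = "us-ascii";
    bool utf8 = false;
};

// With allow_digits == false this is the letters-only rule from the issue;
// with allow_digits == true, numeric regions such as "001" and "150" pass too.
inline bool is_valid_region(const std::string& s, bool allow_digits) {
    if (s.empty()) return false;
    for (const char c : s) {
        const bool is_letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        const bool is_digit = (c >= '0' && c <= '9');
        if (!is_letter && !(allow_digits && is_digit)) return false;
    }
    return true;
}

// Parses "lang[_REGION][.encoding][@variant]". Parsing stops at the first
// part that fails validation, so everything after it keeps its default.
inline ParsedLocale parse_locale_name(const std::string& name, bool allow_digits) {
    ParsedLocale out;
    const std::size_t npos = std::string::npos;

    std::size_t pos = name.find_first_of("_-.@");
    const std::string language = name.substr(0, pos);
    if (language.empty()) return out;
    out.language = language;
    if (pos == npos) return out;

    if (name[pos] == '_' || name[pos] == '-') {
        const std::size_t end = name.find_first_of(".@", pos + 1);
        const std::string region =
            name.substr(pos + 1, end == npos ? npos : end - pos - 1);
        if (!is_valid_region(region, allow_digits)) return out;  // encoding is never reached
        out.country = region;
        pos = end;
        if (pos == npos) return out;
    }

    if (name[pos] == '.') {
        const std::size_t end = name.find('@', pos + 1);
        std::string encoding =
            name.substr(pos + 1, end == npos ? npos : end - pos - 1);
        for (char& c : encoding)
            if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
        if (!encoding.empty()) {
            out.encoding = encoding;
            out.utf8 = (encoding == "utf-8" || encoding == "utf8");
        }
    }
    return out;
}
```

Calling `parse_locale_name("en_001.UTF-8", false)` leaves the country empty and the encoding at us-ascii, which matches the output I'm seeing. Passing `true` instead gives country 001 and encoding utf-8, which is the output I'd expect.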

mattjgalloway added a commit to mattjgalloway/locale that referenced this issue Jul 17, 2020
This addresses boostorg#58, by allowing 0-9 in the range of characters for the country of a locale.
@Flamefire
Collaborator

Hi @mattjgalloway. I'm currently working on getting your PR that fixes this into the next release (no worries about the conflicts, I'll resolve those).

I was wondering whether you had any expectations for how en_US_POSIX should be handled?
As far as I understand, this is basically the "C" locale in C++, aka "POSIX". So I think it makes sense to treat it as an alias, in which case the output would be:

LOCALE NAME: C
LOCALE LANG:
LOCALE COUNTRY:
LOCALE ENCODING:
LOCALE UTF8: 0

You could run into this on Linux, or when explicitly passing "en_US_POSIX" to a boost::locale::generator, as the WinAPI will not return that as the "system locale".
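
Roughly, the aliasing I have in mind could look like the sketch below. It only illustrates the idea and is not what Boost.Locale does: `normalize_posix_alias` is a hypothetical helper, and both the handling of a bare "POSIX" and the dropping of any ".encoding" or "@variant" suffix are assumptions on my part.

```cpp
#include <string>

// Hypothetical helper for this sketch; not part of Boost.Locale.
// Maps "en_US_POSIX" (and, by assumption, "POSIX") onto the classic "C"
// locale name. Any ".encoding" or "@variant" suffix is ignored when matching
// and dropped from the result, which is a simplification.
inline std::string normalize_posix_alias(const std::string& name) {
    const std::string base = name.substr(0, name.find_first_of(".@"));
    if (base == "en_US_POSIX" || base == "POSIX")
        return "C";
    return name;  // every other locale name is passed through unchanged
}
```

The result would then be handed to the generator in place of the original name, so en_US_POSIX ends up with the same facets as "C".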

@mattjgalloway
Contributor Author

@Flamefire Thanks for coming back on this!

Yes, I think that would be right for en_US_POSIX.

Flamefire pushed a commit to Flamefire/locale that referenced this issue Feb 26, 2023
This addresses boostorg#58, by allowing 0-9 in the range of characters for the country of a locale.
Flamefire pushed a commit to Flamefire/locale that referenced this issue Feb 28, 2023
This addresses boostorg#58, by allowing 0-9 in the range of characters for the country of a locale.
@mattjgalloway
Contributor Author

Thanks @Flamefire for getting this merged!
