Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-28180: Fix the implementation of PEP 538 on Android #4334

Merged
merged 5 commits into from Nov 12, 2017

Conversation

@xdegaye
Copy link
Contributor

commented Nov 8, 2017

@xdegaye xdegaye requested a review from ncoghlan Nov 8, 2017

@xdegaye xdegaye changed the title Fix the implementation of PEP 538 on Android bpo-28180: Fix the implementation of PEP 538 on Android Nov 8, 2017

@xdegaye

This comment has been minimized.

Copy link
Contributor Author

commented Nov 8, 2017

Reminder:
Prefix the commit msg with bpo-28180 before merging.

@vstinner

This comment has been minimized.

Copy link
Member

commented Nov 8, 2017

Is the "C.UTF-8" locale available on Android?

@xdegaye

This comment has been minimized.

Copy link
Contributor Author

commented Nov 8, 2017

Android has the "C.UTF-8" and "C" locales, see setlocale.

@ncoghlan
Copy link
Contributor

left a comment

If I'm reading the code correctly, I think the outcome you're aiming for can be achieved more simply with one or two unconditional setenv calls.

@@ -21,6 +20,8 @@
# AIX uses iso8859-1 in the C locale, other *nix platforms use ASCII

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 8, 2017

Contributor

Rather than either letting it get out of date, or having it repeat the conditional logic as it does now, this comment can be simplified to say something generic like "While most *nix platforms default to ASCII in the C locale, some use a different encoding"

This comment has been minimized.

Copy link
@xdegaye

xdegaye Nov 9, 2017

Author Contributor

Ok

@@ -143,12 +144,15 @@ def get_child_details(cls, env_vars):


# Details of the shared library warning emitted at runtime
LEGACY_LOCALE_WARNING = (
if test.support.is_android:

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 8, 2017

Contributor

Let's make the check here:

     if C_LOCALE_STREAM_ENCODING == "utf-8":

That way, if any other platforms start using UTF-8 by default in the C locale, we'll just need to change the test logic in one place.

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 8, 2017

Contributor

Oops, had a typo in my suggested check (I wrote 'ascii' when I meant 'utf-8'). Fixed now :)

This comment has been minimized.

Copy link
@xdegaye

xdegaye Nov 9, 2017

Author Contributor

Ok

if (env_loc == NULL || strncmp(env_loc, "C.UTF-8", 8) != 0) {
_Py_CoerceLegacyLocale();
}
}

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 8, 2017

Contributor

I was attempting to figure out how this differs from the existing logic in _Py_LegacyLocaleDetected, and it seems to me that this is requiring not only that the locale category LC_ALL be set to C.UTF-8 (which this code still forces unconditionally below), but also that either LANG or LC_CTYPE (or both) be set to C.UTF-8 in the environment.

If I've understood that correctly, then the only real effect of the call to _Py_CoerceLegacyLocale here that isn't already covered by the setlocale call below is to do setenv("LC_CTYPE", "C.UTF-8", 1), just in a roundabout way that doesn't make it clear that that's all it is going to do.

If that's right, then I'd suggest just putting an unconditional setenv("LC_ALL", "C.UTF-8", 1) here to align with the setlocale call, and amende the comment to say something like "While setlocale doesn't check the environment variables, some other ported code may (e.g. in extension modules), so we make sure that the env vars match the forced locale configuration"

This comment has been minimized.

Copy link
@xdegaye

xdegaye Nov 9, 2017

Author Contributor

If I've understood that correctly, then the only real effect of the call to _Py_CoerceLegacyLocale here that isn't already covered by the setlocale call below is to do setenv("LC_CTYPE", "C.UTF-8", 1)

Actually another effect of _Py_CoerceLegacyLocale() is to print the _C_LOCALE_COERCION_WARNING (test_c_locale_coercion uses this warning to detect success or failure, hence the conditionals on the LANG and LC_CTYPE envt vars before calling _Py_CoerceLegacyLocale()).

The meaning of "locale coercion" may be understood as setting both the locale with setlocale() and the LC_CTYPE environment variable. My understanding is that this warning may be printed when warnings are enabled and Python is changing the LC_CTYPE environment variable.

With your proposed change, test_c_locale_coercion must be changed since _C_LOCALE_COERCION_WARNING is never printed on Andoid in that case. I do not have a strong opinion about this.

This comment has been minimized.

Copy link
@xdegaye

xdegaye Nov 9, 2017

Author Contributor

I do not have a strong opinion about this.

I mean that I do not have a strong opinion whether _C_LOCALE_COERCION_WARNING must be printed on Android or not.

This comment has been minimized.

Copy link
@xdegaye

xdegaye Nov 9, 2017

Author Contributor

If I read the code correctly at locale.cpp on bionic master branch (aka Android libc) setlocale(LC_CTYPE, "") returns "C.UTF-8". But on Android API 24 it does return "C" as expected by the implementation of PEP 538.

Another point is that call_readline() in Modules/readline.c does call setlocale(LC_CTYPE, "") (and the reason I gave in msg305848 to explain why test_nonasci succeeds seems to be wrong).

For these reasons, I am starting to think we could use a new _Py_setlocale_from_env() function to replace all calls to setlocale(category, "") with a custom implementation for Android and returning setlocale(category, "") for all other platforms.

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 10, 2017

Contributor

Yeah, I think so - that way we can isolate the idiosyncrasies of different libc implementations a bit better.

As far as the locale coercion warning goes, my understanding is that Android really is UTF-8 everywhere, so we're not at significant risk of corrupting improperly encoded data and hence shouldn't print the warning. Instead, we're just adjusting the environment so that extension modules and CPython code that are assuming glibc-style locale handling see what they expect to see.

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 10, 2017

Contributor

However, making sure I understand the suggestion correctly:

  • on most platforms, _Py_setlocale_from_env(category) would just call setlocale(category, "")
  • on Android, we'd special case it such that if none of LC_ALL, LC_CATEGORY, or LANG were set, we'd pass "C.UTF-8" explicitly (otherwise we'd pass whatever the highest priority environment variable was set to).
@@ -503,7 +503,8 @@ _Py_CoerceLegacyLocale(void)
const char *new_locale = setlocale(LC_CTYPE,
target->locale_name);
if (new_locale != NULL) {
#if !defined(__APPLE__) && defined(HAVE_LANGINFO_H) && defined(CODESET)
#if !defined(__APPLE__) && !defined(__ANDROID__) && \
defined(HAVE_LANGINFO_H) && defined(CODESET)

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 8, 2017

Contributor

If we go back to not calling the coercion function on Android at all, this adjustment won't be needed.

@bedevere-bot

This comment has been minimized.

Copy link

commented Nov 8, 2017

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@xdegaye

This comment has been minimized.

Copy link
Contributor Author

commented Nov 11, 2017

With the implementation of _Py_SetLocaleFromEnv(), the Android specific code path in the implementation of PEP 538 is now limited to _Py_SetLocaleFromEnv() itself.

When one of the locale envt variable is set to 'C', 'POSIX' or an invalid locale, then locale coercion is done with the corresponding warnings printed as for the other *nix platforms, and this can be overriden with the envt variable PYTHONCOERCECLOCALE.

@xdegaye

This comment has been minimized.

Copy link
Contributor Author

commented Nov 11, 2017

I have made the requested changes; please review again.

@bedevere-bot

This comment has been minimized.

Copy link

commented Nov 11, 2017

Thanks for making the requested changes!

@ncoghlan: please review the changes made to this pull request.

@ncoghlan
Copy link
Contributor

left a comment

Nice, I like this approach to consolidating the platform dependent logic.

@ncoghlan

This comment has been minimized.

Copy link
Contributor

commented Nov 12, 2017

@xdegaye I'm wondering if we should actually add a short NEWS entry for this, though - that way there will be a clearer record as to when we consolidated the platform dependent handling of setting the locale from the environment, which should be useful for folks embedding CPython in larger applications.

@xdegaye xdegaye removed the skip news label Nov 12, 2017

@@ -0,0 +1,3 @@
The platform dependent handling of setting the locale from the environment is

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Nov 12, 2017

Contributor

It's probably clearer to mention the new internal API by name. Something like

A new internal _Py_SetLocaleFromEnv(category) helper function has been added in order
to improve the consistency of behaviour across different libc implementations (e.g. Android
doesn't support setting the locale from the environment by default)

This comment has been minimized.

Copy link
@xdegaye

xdegaye Nov 12, 2017

Author Contributor

Thanks Nick :-)

@xdegaye xdegaye closed this Nov 12, 2017

@xdegaye xdegaye reopened this Nov 12, 2017

@xdegaye

This comment has been minimized.

Copy link
Contributor Author

commented Nov 12, 2017

Closing and re-opening the PR because travis was hanged without starting.
This seems to happen when a new commit is being pushed while travis is still running on the previous commit.

@xdegaye xdegaye merged commit 1588be6 into python:master Nov 12, 2017

3 checks passed

bedevere/issue-number Issue number 28180 found
Details
bedevere/news News entry found in Misc/NEWS.d
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@xdegaye xdegaye deleted the xdegaye:bpo-28180 branch Nov 12, 2017

yan12125 added a commit to yan12125/python3-android that referenced this pull request Nov 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.