Skip to content

Commit

Permalink
XXX finish msg,thread_init Revamp sync_locale, switch_to_global_locale
Browse files Browse the repository at this point in the history
f
  • Loading branch information
khwilliamson committed May 5, 2021
1 parent 10fb4af commit ae962b4
Show file tree
Hide file tree
Showing 7 changed files with 170 additions and 117 deletions.
4 changes: 2 additions & 2 deletions embed.fnc
Expand Up @@ -1623,8 +1623,8 @@ XpT |void |_warn_problematic_locale
Xp |void |set_numeric_underlying
Xp |void |set_numeric_standard
Xp |bool |_is_in_locale_category|const bool compiling|const int category
ApdT |void |switch_to_global_locale
ApdT |bool |sync_locale
Apd |void |switch_to_global_locale
Apd |bool |sync_locale
Apx |void |thread_locale_init
ApxT |void |thread_locale_term
ApdO |void |require_pv |NN const char* pv
Expand Down
4 changes: 2 additions & 2 deletions embed.h
Expand Up @@ -716,8 +716,8 @@
#define sv_vsetpvf(a,b,c) Perl_sv_vsetpvf(aTHX_ a,b,c)
#define sv_vsetpvf_mg(a,b,c) Perl_sv_vsetpvf_mg(aTHX_ a,b,c)
#define sv_vsetpvfn(a,b,c,d,e,f,g) Perl_sv_vsetpvfn(aTHX_ a,b,c,d,e,f,g)
#define switch_to_global_locale Perl_switch_to_global_locale
#define sync_locale Perl_sync_locale
#define switch_to_global_locale() Perl_switch_to_global_locale(aTHX)
#define sync_locale() Perl_sync_locale(aTHX)
#define taint_env() Perl_taint_env(aTHX)
#define taint_proper(a,b) Perl_taint_proper(aTHX_ a,b)
#define thread_locale_init() Perl_thread_locale_init(aTHX)
Expand Down
1 change: 1 addition & 0 deletions embedvar.h
Expand Up @@ -250,6 +250,7 @@
#define PL_parser (vTHX->Iparser)
#define PL_patchlevel (vTHX->Ipatchlevel)
#define PL_peepp (vTHX->Ipeepp)
#define PL_perl_controls_locale (vTHX->Iperl_controls_locale)
#define PL_perl_destruct_level (vTHX->Iperl_destruct_level)
#define PL_perldb (vTHX->Iperldb)
#define PL_perlio (vTHX->Iperlio)
Expand Down
3 changes: 3 additions & 0 deletions intrpvar.h
Expand Up @@ -723,6 +723,9 @@ PERLVAR(I, padix_floor, PADOFFSET) /* how low may inner block reset padix */
* platform */
PERLVARA(I, curlocales, 12, const char *)

#endif
#ifndef USE_THREAD_SAFE_LOCALE_EMULATION
PERLVARI(I, perl_controls_locale, bool, 1)
#endif
#ifdef USE_LOCALE_COLLATE

Expand Down
266 changes: 155 additions & 111 deletions locale.c
Expand Up @@ -6501,21 +6501,30 @@ Perl_my_strerror(pTHX_ const int errnum, int * utf8ness)
=for apidoc switch_to_global_locale
On systems without locale support, or on typical single-threaded builds, or on
platforms that do not support per-thread locale operations, this function does
nothing. On such systems that do have locale support, only a locale global to
the whole program is available.
On multi-threaded builds on systems that do have per-thread locale operations,
this function converts the thread it is running in to use the global locale.
This is for code that has not yet or cannot be updated to handle multi-threaded
locale operation. As long as only a single thread is so-converted, everything
works fine, as all the other threads continue to ignore the global one, so only
this thread looks at it.
However, on Windows systems this isn't quite true prior to Visual Studio 15,
at which point Microsoft fixed a bug. A race can occur if you use the
following operations on earlier Windows platforms:
This function copies the locale state of the calling thread into the program
global locale, and converts the thread to use that global locale.
It is intended so that Perl can safely be used with C libraries that access the
global locale and which can't be converted to not access it. Effectively, this
means libraries that call C<L<setlocale(3)>> on non-Windows systems. (For
portability, it is a good idea to use it on Windows as well.)
A downside of using it is that it disables the services that Perl provides to
hide locale gotchas from your code. The service you most likely will miss
regards the radix character (decimal point) in floating point numbers. Code
executed after this function is called can no longer just assume that this
character is correct for the current circumstances.
To return to Perl control, and restart the gotcha prevention services, call
C<L</sync_locale>>. Behavior is undefined for any pure Perl code that executes
while the switch is in effect.
The global locale and the per-thread locales are independent. As long as just
one thread converts to the global locale, everything works smoothly. But if
more than one does, they can easily interfere with each other, and races are
likely. On Windows systems prior to Visual Studio 15 (at which point Microsoft
fixed a bug), races can occur (even if only one thread has been converted to
the global locale), but only if you use the following operations:
=over
Expand All @@ -6528,54 +6537,101 @@ following operations on earlier Windows platforms:
=back
The first item is not fixable (except by upgrading to a later Visual Studio
release), but it would be possible to work around the latter two items by using
the Windows API functions C<GetNumberFormat> and C<GetCurrencyFormat>; patches
release), but it would be possible to work around the latter two items by
having Perl change its algorithm for calculating these to use Windows API
functions (likely C<GetNumberFormat> and C<GetCurrencyFormat>); patches
welcome.
Without this function call, threads that use the L<C<setlocale(3)>> system
function will not work properly, as all the locale-sensitive functions will
look at the per-thread locale, and C<setlocale> will have no effect on this
thread.
Perl code should convert to either call
L<C<Perl_setlocale>|perlapi/Perl_setlocale> (which is a drop-in for the system
C<setlocale>) or use the methods given in L<perlcall> to call
XS code should never call plain C<setlocale>, but should instead be converted
to either call L<C<Perl_setlocale>|perlapi/Perl_setlocale> (which is a drop-in
for the system C<setlocale>) or use the methods given in L<perlcall> to call
L<C<POSIX::setlocale>|POSIX/setlocale>. Either one will transparently properly
handle all cases of single- vs multi-thread, POSIX 2008-supported or not.
Non-Perl libraries, such as C<gtk>, that call the system C<setlocale> can
continue to work if this function is called before transferring control to the
library.
Upon return from the code that needs to use the global locale,
L<C<sync_locale()>|perlapi/sync_locale> should be called to restore the safe
multi-thread operation.
=cut
*/

void
Perl_switch_to_global_locale()
Perl_switch_to_global_locale(pTHX)
{
dTHX;

#ifdef USE_THREAD_SAFE_LOCALE
# ifdef WIN32
#ifdef USE_LOCALE

_configthreadlocale(_DISABLE_PER_THREAD_LOCALE);
unsigned int i;
bool perl_controls;

DEBUG_L(PerlIO_printf(Perl_debug_log, "Entering switch_to_global; %s\n",
get_LC_ALL_display()));

# ifdef USE_THREAD_SAFE_LOCALE

/* In these cases, we use the system state to determine if we are in the
* global locale or not. */

# ifdef USE_POSIX_2008_LOCALE

perl_controls = LC_GLOBAL_LOCALE != uselocale((locale_t) 0);

# elif defined(WIN32) && defined(USE_THREAD_SAFE_LOCALE)

perl_controls = _ENABLE_PER_THREAD_LOCALE
== _configthreadlocale(_ENABLE_PER_THREAD_LOCALE);
# else
# error Unexpected Configuration
# endif
# else

{
unsigned int i;
/* In the other cases we rely on our own variable which we control. */

for (i = 0; i < LC_ALL_INDEX_; i++) {
setlocale(categories[i], querylocale_i(i));
perl_controls = PL_perl_controls_locale;

# endif

/* No-op if already in global */
if (! perl_controls) {
return;
}

# ifdef USE_THREAD_SAFE_LOCALE
# if defined(WIN32)

_configthreadlocale(_DISABLE_PER_THREAD_LOCALE);

# elif defined(USE_POSIX_2008_LOCALE)

{
const char * curlocales[NOMINAL_LC_ALL_INDEX + 1];

/* Save each category's current state */
for (i = 0; i < NOMINAL_LC_ALL_INDEX; i++) {
curlocales[i] = savepv(querylocale_i(i));
}

/* Switch to global */
uselocale(LC_GLOBAL_LOCALE);

/* Set the global to what was our per-thread state */
for (i = 0; i < NOMINAL_LC_ALL_INDEX; i++) {
porcelain_setlocale(categories[i], curlocales[i]);
Safefree(curlocales[i]);
}
}
# else
# error Unexpected Configuration
# endif
# else

PL_perl_controls_locale = FALSE;

# endif
# ifdef USE_LOCALE_NUMERIC

/* Switch to the underlying C numeric locale; the application is on its own
* */
PORCELAIN_SETLOCALE_LOCK;
porcelain_setlocale(LC_NUMERIC, PL_numeric_name);
PORCELAIN_SETLOCALE_UNLOCK;

# endif
#endif

Expand All @@ -6585,27 +6641,45 @@ Perl_switch_to_global_locale()
=for apidoc sync_locale
This function copies the state of the program global locale into the calling
thread, and converts that thread to using per-thread locales, if it wasn't
already, and the platform supports them. The LC_NUMERIC locale is toggled into
the standard state (using the C locale's conventions), if not within the
lexical scope of S<C<use locale>>.
Perl will now consider itself to have control of the locale.
Since unthreaded perls have only a global locale, this function is a no-op
without threads.
This function is intended for use with C libraries that do locale manipulation.
It allows Perl to accommodate the use of them. Call this function before
transferring back to Perl space so that it knows what state the C code has left
things in.
XS code should not manipulate the locale on its own. Instead,
L<C<Perl_setlocale>|perlapi/Perl_setlocale> can be used at any time to query or
change the locale (though changing the locale is antisocial and dangerous on
multi-threaded systems that don't have multi-thread safe locale operations.
(See L<perllocale/Multi-threaded operation>). Using the system
L<C<setlocale(3)>> should be avoided. Nevertheless, certain non-Perl libraries
called from XS, such as C<Gtk> do so, and this can't be changed. When the
locale is changed by XS code that didn't use
L<C<Perl_setlocale>|perlapi/Perl_setlocale>, Perl needs to be told that the
locale has changed. Use this function to do so, before returning to Perl.
(See L<perllocale/Multi-threaded operation>).
Using the libc L<C<setlocale(3)>> function should be avoided. Nevertheless,
certain non-Perl libraries called from XS, do call it, and their behavior may
not be able to be changed. This function, along with
C<L</switch_to_global_locale>>, can be used to get seamless behavior in these
circumstances, as long as only one thread is involved.
If the library has an option to turn off its locale manipulation, doing that is
preferable to using this mechanism. C<Gtk> is such a library.
The return value is a boolean: TRUE if the global locale at the time of call
was in effect; and FALSE if a per-thread locale was in effect. This can be
used by the caller that needs to restore things as-they-were to decide whether
or not to call
L<C<Perl_switch_to_global_locale>|perlapi/switch_to_global_locale>.
was in effect for the caller; and FALSE if a per-thread locale was in effect.
=cut
*/

bool
Perl_sync_locale()
Perl_sync_locale(pTHX)
{

#ifndef USE_LOCALE
Expand All @@ -6614,77 +6688,43 @@ Perl_sync_locale()

#else

const char * newlocale;
dTHX;

# ifdef USE_POSIX_2008_LOCALE

bool was_in_global_locale = FALSE;
locale_t cur_obj = uselocale((locale_t) 0);
bool was_in_global = TRUE;

/* On Windows, unless the foreign code has turned off the thread-safe
* locale setting, any plain setlocale() will have affected what we see, so
* no need to worry. Otherwise, If the foreign code has done a plain
* setlocale(), it will only affect the global locale on POSIX systems, but
* will affect the */
if (cur_obj == LC_GLOBAL_LOCALE) {
# ifndef USE_THREAD_SAFE_LOCALE

# ifdef HAS_QUERY_LOCALE

void_setlocale_c(LC_ALL, querylocale_c(LC_ALL));
PL_perl_controls_locale = TRUE;

# else
# if defined(WIN32)

unsigned int i;
was_in_global = _DISABLE_PER_THREAD_LOCALE
== _configthreadlocale(_ENABLE_PER_THREAD_LOCALE);
# elif defined(USE_POSIX_2008_LOCALE)

/* We can't trust that we can read the LC_ALL format on the
* platform, so do them individually */
for (i = 0; i < LC_ALL_INDEX_; i++) {
void_setlocale_i(i, querylocale_i(i));
}

# endif

was_in_global_locale = TRUE;
}
was_in_global = LC_GLOBAL_LOCALE == uselocale((locale_t) 0);

# else

bool was_in_global_locale = TRUE;

# error Unexpected Configuration
# endif
# ifdef USE_LOCALE_CTYPE
# endif /* USE_THREAD_SAFE_LOCALE */
# ifdef LC_ALL

newlocale = savepv(querylocale_c(LC_CTYPE));
DEBUG_Lv(PerlIO_printf(Perl_debug_log,
"%s:%d: %s\n", __FILE__, __LINE__,
setlocale_debug_string_c(LC_CTYPE, NULL, newlocale)));
new_ctype(newlocale);
Safefree(newlocale);
/* Use the external interface Perl_setlocale() to make sure all setup gets
* done */
Perl_setlocale(LC_ALL, stdized_setlocale(LC_ALL, NULL));

# endif /* USE_LOCALE_CTYPE */
# ifdef USE_LOCALE_COLLATE
# else

newlocale = savepv(querylocale_c(LC_COLLATE));
DEBUG_Lv(PerlIO_printf(Perl_debug_log,
"%s:%d: %s\n", __FILE__, __LINE__,
setlocale_debug_string_c(LC_COLLATE, NULL, newlocale)));
new_collate(newlocale);
Safefree(newlocale);
{
unsigned int i;
for (i = 0; i < NOMINAL_LC_ALL_INDEX; i++) {
Perl_setlocale(categories[i], stdized_setlocale(categories[i], NULL);
}
}

# endif
# ifdef USE_LOCALE_NUMERIC

newlocale = savepv(querylocale_c(LC_NUMERIC));
DEBUG_Lv(PerlIO_printf(Perl_debug_log,
"%s:%d: %s\n", __FILE__, __LINE__,
setlocale_debug_string_c(LC_NUMERIC, NULL, newlocale)));
new_numeric(newlocale);
Safefree(newlocale);

# endif /* USE_LOCALE_NUMERIC */

return was_in_global_locale;
return was_in_global;

#endif

Expand Down Expand Up @@ -6761,10 +6801,11 @@ Perl_thread_locale_init(pTHX)
"new thread, initial locale is %s\n",
porcelain_setlocale(LC_ALL, NULL)));

# if 0
if (! sync_locale()) { /* Side effect of going to per-thread if avail */
locale_panic_("Thread unexpectedly started not in global locale");
/*locale_panic_("Thread unexpectedly started not in global locale");*/
}

# endif
# ifdef LC_ALL

void_setlocale_c(LC_ALL, "C");
Expand All @@ -6782,6 +6823,9 @@ Perl_thread_locale_init(pTHX)

new_LC_ALL(NULL);

# ifndef USE_THREAD_SAFE_LOCALE
PL_perl_controls_locale = TRUE;
# endif
#endif

}
Expand Down

0 comments on commit ae962b4

Please sign in to comment.