locale.c: Skip locale utf8ness calculation if feasible
I originally wrote this to save time processing the strings returned by
localeconv().  If we know that the locale isn't UTF-8, then we don't
have to calculate the UTF8ness of each string returned.  Thus "Calculate
once, not many times".

But I hadn't realized that likely only one string is ever going to be
non-ASCII: the currency symbol.  (The decimal and thousands separators
could also be, but of the >500 locales on my Linux box, only ps_AF has
them so.  That is the Pashto language in Afghanistan; not a frequently
occurring locale.)
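
To repeat this kind of survey on another system, a small C helper can report how many localeconv() strings contain non-ASCII bytes.  This is only a sketch: has_non_ascii() and count_non_ascii_lconv() are hypothetical names, not functions from locale.c, and only the three fields mentioned above are inspected.

```c
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>

/* True if any byte of s has its high bit set, i.e. s is not pure ASCII. */
static bool has_non_ascii(const char *s)
{
    for (const unsigned char *p = (const unsigned char *) s; *p; p++) {
        if (*p >= 0x80) {
            return true;
        }
    }
    return false;
}

/* Count how many of the decimal point, thousands separator, and currency
 * symbol in lc contain at least one non-ASCII byte.  (Hypothetical helper;
 * struct lconv has further string fields that are ignored here.) */
static int count_non_ascii_lconv(const struct lconv *lc)
{
    const char *fields[] = {
        lc->decimal_point,
        lc->thousands_sep,
        lc->currency_symbol
    };
    int count = 0;

    for (size_t i = 0; i < sizeof fields / sizeof fields[0]; i++) {
        if (fields[i] && has_non_ascii(fields[i])) {
            count++;
        }
    }
    return count;
}
```

A caller would first select a locale that is installed on the machine with setlocale(LC_ALL, name) and then pass the result of localeconv() to count_non_ascii_lconv().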

So either approach is effectively "calculate once".  And it is generally
more expensive to calculate the UTF8ness of a locale than that of a
particular string in it, especially when those strings are going to be
ASCII, as in this case.
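
The cost argument can be illustrated with a string-first check: a pure-ASCII string is settled in one pass over its bytes, and the locale only matters for a string that contains well-formed multi-byte UTF-8.  The following is a minimal sketch under that assumption, not the implementation of get_locale_string_utf8ness_i(); the type and function names are hypothetical, and the UTF-8 validation is simplified.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-way result; these names are not Perl's. */
typedef enum {
    STR_IS_ASCII,     /* every byte < 0x80: no locale lookup needed */
    STR_NOT_UTF8,     /* high-bit bytes that are not well-formed UTF-8 */
    STR_NEEDS_LOCALE  /* well-formed UTF-8: only now is the locale relevant */
} str_class_t;

static bool is_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Classify a NUL-terminated string by looking only at its bytes.
 * Simplified: overlong 3- and 4-byte forms and surrogates are not
 * rejected, which a production check would have to do. */
static str_class_t classify_string(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    bool saw_high = false;

    while (*p) {
        size_t extra;

        if (*p < 0x80) {            /* ASCII byte: the cheap, common case */
            p++;
            continue;
        }
        saw_high = true;

        if ((*p & 0xE0) == 0xC0 && *p >= 0xC2) {
            extra = 1;              /* 2-byte sequence */
        } else if ((*p & 0xF0) == 0xE0) {
            extra = 2;              /* 3-byte sequence */
        } else if ((*p & 0xF8) == 0xF0 && *p <= 0xF4) {
            extra = 3;              /* 4-byte sequence */
        } else {
            return STR_NOT_UTF8;    /* stray continuation or invalid lead */
        }
        p++;

        /* A NUL here fails is_continuation(), so p never runs past the end. */
        for (size_t k = 0; k < extra; k++, p++) {
            if (!is_continuation(*p)) {
                return STR_NOT_UTF8;
            }
        }
    }
    return saw_high ? STR_NEEDS_LOCALE : STR_IS_ASCII;
}
```

With this ordering, a currency symbol such as "$" never triggers a locale query, while the UTF-8 encoding of the euro sign would be the only kind of string that does.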

This commit changes the code to no longer calculate the locale's UTF8ness
up front.
khwilliamson committed Nov 22, 2023
1 parent bf2be0d commit 2f559a6
Showing 1 changed file with 2 additions and 7 deletions.
9 changes: 2 additions & 7 deletions locale.c
@@ -5722,11 +5722,6 @@ S_my_localeconv(pTHX_ const int item)

     for (unsigned int i = 0; i < 2; i++) {  /* Try both types of strings */

-        const char * locale = locales[i];
-        if (! is_locale_utf8(locale)) {
-            continue;   /* No string can be UTF-8 if the locale isn't */
-        }

         /* Examine each string */
         for (const lconv_offset_t *strp = strings[i]; strp->name; strp++) {
             const char * name = strp->name;
@@ -5740,8 +5735,8 @@ S_my_localeconv(pTHX_ const int item)

             /* Determine if the string should be marked as UTF-8. */
             if (UTF8NESS_YES == (get_locale_string_utf8ness_i(SvPVX(*value),
-                                                  LOCALE_IS_UTF8,
-                                                  NULL,
+                                                  LOCALE_UTF8NESS_UNKNOWN,
+                                                  locales[i],
                                                   LC_ALL_INDEX_ /* OOB */)))
             {
                 SvUTF8_on(*value);