Do localeconv() adjustments immediately unthreaded
localeconv() must be called and its struct processed in a critical
section when other threads or embedded perls could also be using it.
Something like StructCopy can't be used to save its values, as a deep
copy is needed.  Some of the processing can be expensive, so is deferred
to a separate pass after the critical section.

But if no critical section is needed, it's cheaper to do the processing
as we go along doing the copy.  A comment removed in this commit said
that the reason this wasn't done was because of an extra maintenance
cost (having to maintain the code in two places).  But when I actually
looked at what it would look like to do this, I found it is essentially
just an extra function call, not enough "extra" to worry about.
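
As a rough illustration of the trade-off described above (this is not code from locale.c), the sketch below deep-copies one localeconv() field and runs a stand-in "expensive" check either during the copy or after the locked region, selected at compile time. Every name prefixed demo_ or DEMO_ is hypothetical; a pthread mutex stands in for perl's locale critical section, and a simple high-bit scan stands in for the utf8ness determination.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only; none of these names exist in locale.c. */

#ifdef DEMO_THREADED
#  include <pthread.h>
static pthread_mutex_t demo_locale_mutex = PTHREAD_MUTEX_INITIALIZER;
#  define DEMO_LOCK()   pthread_mutex_lock(&demo_locale_mutex)
#  define DEMO_UNLOCK() pthread_mutex_unlock(&demo_locale_mutex)
#else
   /* No other thread can call localeconv(), so no critical section */
#  define DEMO_LOCK()   ((void) 0)
#  define DEMO_UNLOCK() ((void) 0)
#endif

typedef struct {
    char *decimal_point;  /* private deep copy, owned by the caller */
    bool  is_non_ascii;   /* stand-in for the costly utf8ness result */
} demo_item;

/* Stand-in for the expensive per-string processing */
static bool demo_has_high_bit(const char *s)
{
    for (; *s; s++) {
        if ((unsigned char) *s & 0x80) {
            return true;
        }
    }
    return false;
}

static demo_item demo_fetch_decimal_point(void)
{
    demo_item item = { NULL, false };

    DEMO_LOCK();

    const struct lconv *lc = localeconv();
    assert(lc->decimal_point != NULL);

    /* A plain struct copy would copy only the pointer, which a later
     * localeconv() or setlocale() call may invalidate, so duplicate the
     * string itself. */
    size_t len = strlen(lc->decimal_point);
    item.decimal_point = malloc(len + 1);
    if (item.decimal_point != NULL) {
        memcpy(item.decimal_point, lc->decimal_point, len + 1);

#ifndef DEMO_THREADED
        /* Unthreaded: cheaper to process as we go along doing the copy */
        item.is_non_ascii = demo_has_high_bit(item.decimal_point);
#endif
    }

    DEMO_UNLOCK();

#ifdef DEMO_THREADED
    /* Threaded: the processing was deferred to this separate pass so the
     * lock was held only for the copy */
    if (item.decimal_point != NULL) {
        item.is_non_ascii = demo_has_high_bit(item.decimal_point);
    }
#endif

    return item;
}
```

In this sketch the two builds differ only in where the single call to demo_has_high_bit() sits, which is roughly the "just an extra function call" cost the message describes. The caller owns the returned string and must free() it.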
khwilliamson committed Nov 22, 2023
1 parent 0b3e948 commit ccdad10
Showing 1 changed file with 59 additions and 17 deletions.
76 changes: 59 additions & 17 deletions locale.c
@@ -5700,20 +5700,15 @@ S_my_localeconv(pTHX_ const int item)
strings, integers);
}

-/* Here, the hash has been completely populated.
-*
-* Now go through all the items and:
-* For string items, see if they should be marked as UTF-8 or not.
-* This would have been more convenient and faster to do while
-* populating the hash in the first place, but that operation has to be
-* done within a critical section, keeping other threads from
-* executing, so only the minimal amount of work necessary is done at
-* that time.
-* XXX On unthreaded perls, this code could be #ifdef'd out, and the
-* corrections determined at hash population time, at an extra maintenance
-* cost which khw doesn't think is worth it
-*/
+/* Here, the hash has been completely populated. */

+# ifdef MULTIPLICITY
+
+/* When the hash was populated during a critical section, the determination
+* of whether a string element should be marked as UTF-8 or not was deferred,
+* so as to minimize the amount of time in the critical section. But now we
+* have the hash specific to this thread, and can do the adjusting without
+* worrying about delaying other threads. */
for (unsigned int i = 0; i < 2; i++) { /* Try both types of strings */

/* The return from this function is already adjusted */
@@ -5743,6 +5738,8 @@ S_my_localeconv(pTHX_ const int item)
}
}

+# endif /* MULTIPLICITY */
+
return hv;

# endif /* End of must have one or both USE_MONETARY, USE_NUMERIC */
@@ -5970,20 +5967,65 @@ S_populate_hash_from_localeconv(pTHX_ HV * hv,
const PERL_UINT_FAST8_T i = lsbit_pos(working_mask);
working_mask &= ~ (1 << i);

-/* For each string field for the given category ... */
+/* Point to the string field list for the given category ... */
const lconv_offset_t * category_strings = strings[i];
+
+# ifdef MULTIPLICITY
+
+/* In a critical section, defer finding the utf8ness of all strings
+* until later (in an extra pass through the hash). */
+const bool utf8ness = false;
+
+# else
+# ifdef HAS_SOME_LANGINFO
+
+/* When not in a critical section, and the only possible call to this
+* function is to populate more than just a single field, we can find
+* every string's utf8ness now. This avoids the extra pass later */
+const bool calculate_utf8ness_here = true;
+bool utf8ness;
+
+# else
+
+/* When not in a critical section, and this is a call for just a single
+* field, we don't calculate this string's utf8ness now. (This case is
+* indicated by element [1] being a NULL marker, hence only one real
+* element.) It is instead done by our ultimate caller to avoid
+* potential infinite recursion in some Configurations and extra work
+* in the others. (There is no extra pass for just a single field.)
+* Otherwise we can find every string's utf8ness now */
+const bool calculate_utf8ness_here = (category_strings[1].name != NULL);
+bool utf8ness = false; /* Assume false when not calculating it */
+
+# endif
+# endif
+
+/* For each string field */
while (category_strings->name) {

/* We have set things up so that we know where in the returned
* structure, when viewed as a string, the corresponding value is.
* */
-const char *value = *((const char **)( lcbuf_as_string
-+ category_strings->offset));
+char *value = *((char **)( lcbuf_as_string
++ category_strings->offset));
if (value) { /* Copy to the hash */

# ifndef MULTIPLICITY

if (calculate_utf8ness_here) {
utf8ness =
( UTF8NESS_YES
== get_locale_string_utf8ness_i(value,
LOCALE_UTF8NESS_UNKNOWN,
locale,
LC_ALL_INDEX_ /* OOB */));
}
# endif

(void) hv_store(hv,
category_strings->name,
strlen(category_strings->name),
-newSVpv(value, strlen(value)),
+newSVpvn_utf8(value, strlen(value), utf8ness),
0);
}
