Just malloc instead of stack+malloc+memcpy when coercing int to string
Instead of itoa() writing into a stack-allocated buffer of the max
possible size and then MVM_malloc()ing a new buffer of the actual size
and memcpy()ing from the stack buffer into the new buffer, always
MVM_malloc() a buffer of the max size and just use that. This means we
don't use the stack as much and don't have to memcpy(). However, we are
now probably wasting some memory with every coerce that doesn't hit the
cache, since most coercions likely result in a smaller string than the
max possible size. One obvious solution would be to MVM_realloc() down
to the actual size, but that's actually slower than the original code.
This does mean that unmanaged_size() will slightly underreport for the
MVMStrings resulting from these coercions, but maybe the difference will
be small enough not to really matter?
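
For readers comparing the three allocation strategies discussed above, here is a minimal sketch in plain C. It is not the MoarVM code: `malloc`/`realloc` stand in for `MVM_malloc`/`MVM_realloc`, and `format_i64` is a hypothetical stand-in for `i64toa_jeaiii` that writes the digits without a terminator and returns the length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_I64_CHARS 20 /* "-9223372036854775808" is 20 characters */

/* Hypothetical stand-in for the real formatter: writes the decimal form of v
 * into out (no terminator) and returns the number of characters written. */
static size_t format_i64(int64_t v, char *out) {
    /* Negate in unsigned arithmetic so INT64_MIN is well defined. */
    uint64_t magnitude = v < 0 ? 0 - (uint64_t)v : (uint64_t)v;
    char digits[MAX_I64_CHARS];
    size_t n = 0, len = 0;
    do {
        digits[n++] = (char)('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude);
    if (v < 0)
        out[len++] = '-';
    while (n)
        out[len++] = digits[--n];
    assert(len <= MAX_I64_CHARS);
    return len;
}

/* Old shape: format on the stack, then malloc the exact size and copy. */
static char *coerce_stack_then_copy(int64_t v, size_t *len) {
    char buffer[MAX_I64_CHARS];
    *len = format_i64(v, buffer);
    char *blob = malloc(*len);
    if (blob)
        memcpy(blob, buffer, *len);
    return blob;
}

/* New shape: malloc the maximum size once and format straight into it. */
static char *coerce_malloc_only(int64_t v, size_t *len) {
    char *buffer = malloc(MAX_I64_CHARS);
    if (buffer)
        *len = format_i64(v, buffer);
    return buffer;
}

/* Rejected alternative: as above, then shrink to the actual length. */
static char *coerce_malloc_then_shrink(int64_t v, size_t *len) {
    char *buffer = coerce_malloc_only(v, len);
    if (buffer) {
        char *smaller = realloc(buffer, *len);
        if (smaller)
            buffer = smaller;
    }
    return buffer;
}
```

All three return the same bytes; they differ only in how many allocator calls and copies they make and in how much of the returned buffer goes unused.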

My test case, `MVM_SPESH_BLOCKING=1 nqp-m -e 'my str $s; my int $i := 0;
my $n := nqp::time; while $i++ < 10_000_000 { $s := $i };
say(nqp::div_n(nqp::time - $n, 1000000000e0)); say($s)'` decreased from
~0.30s to ~0.27s and the number of instructions reported by callgrind
(with only 1_000_000 iterations and not tracking the time) decreased
from ~642.2m to ~608.7m.

I logged all coercions that didn't hit the cache during the Rakudo build
and of the 8.2m, 7.8m were for length 13 (so wasting 7 bytes), and the
rest were no more than 10 (so wasting 10-19 bytes).

Allocate exact space for int->str conversion
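
The sizing idea can be sketched as follows: take the position of the highest set bit of the magnitude and look up the largest decimal digit count any value with that top bit can have. This is an illustration under stated assumptions, not the code in the diff: the table values are copied from `mag[]`, but `top_bit` and `i64_decimal_bound` are hypothetical names, the bit position is found with a portable loop rather than `__builtin_clzll`, the table is indexed by the zero-based bit position (0..63), and the magnitude is taken in unsigned arithmetic so that INT64_MIN is well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Entry b is the maximum number of decimal digits of any value whose highest
 * set bit is bit b, i.e. any value in [2^b, 2^(b+1) - 1]. */
static const unsigned char max_digits_for_top_bit[64] = {
     1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4,  4,  5,  5,  5,
     6,  6,  6,  7,  7,  7,  7,  8,  8,  8,  9,  9,  9, 10, 10, 10,
    10, 11, 11, 11, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15,
    15, 16, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19, 19, 19, 20
};

/* Zero-based position of the highest set bit; v must be non-zero. */
static unsigned top_bit(uint64_t v) {
    unsigned pos = 0;
    assert(v != 0);
    while (v >>= 1)
        pos++;
    return pos;
}

/* Upper bound on the characters needed to print v in decimal,
 * including a leading '-' for negative values. */
static size_t i64_decimal_bound(int64_t v) {
    const int is_negative = v < 0;
    const uint64_t magnitude = is_negative ? 0 - (uint64_t)v : (uint64_t)v;
    /* OR with 1 so that zero maps to bit 0 (one digit). */
    return max_digits_for_top_bit[top_bit(magnitude | 1)] + (size_t)is_negative;
}
```

The result is an upper bound rather than the exact length: 9 has its top bit at position 3, which it shares with 10..15, so the bound is 2 characters even though "9" needs only one.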

Add a Windows version of __builtin_clzll from LLVM
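
As a rough illustration of the fallback shape (count leading zeros of a 64-bit value by splitting it into two 32-bit halves), here is a portable sketch. The names `clz32` and `clz64` are hypothetical, and the 32-bit count uses a plain loop instead of a compiler intrinsic, so this is slower than what the commit uses and only meant to show the logic.

```c
#include <assert.h>
#include <stdint.h>

/* Leading zero bits of a non-zero 32-bit value, counted with a loop. */
static int clz32(uint32_t v) {
    int n = 0;
    assert(v != 0);
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Leading zero bits of a non-zero 64-bit value, from its two halves. */
static int clz64(uint64_t v) {
    const uint32_t high = (uint32_t)(v >> 32);
    const uint32_t low  = (uint32_t)(v & 0xFFFFFFFFu);
    assert(v != 0);
    if (high != 0)
        return clz32(high);
    return 32 + clz32(low);
}
```

Like GCC's `__builtin_clzll`, this sketch leaves the zero input undefined (here it trips the assertion); the caller in the diff avoids that case by OR-ing the argument with 1.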
MasterDuke17 committed May 2, 2024
1 parent 8ccb877 commit 62e6467
Showing 1 changed file with 41 additions and 16 deletions.
57 changes: 41 additions & 16 deletions src/core/coerce.c
@@ -4,6 +4,35 @@
 #if defined(_MSC_VER)
 #define strtoll _strtoi64
 #define snprintf _snprintf
+/* slightly adapted to use in MoarVM */
+//===-- int_lib.h - configuration header for compiler-rt -----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file is a configuration header for compiler-rt.
+// This file is not part of the interface of this library.
+//
+//===----------------------------------------------------------------------===//
+#if defined(_M_ARM) || defined(_M_X64)
+static int __inline __builtin_clzll(uint64_t value) {
+    unsigned long leading_zero = 0;
+    if (_BitScanReverse64(&leading_zero, value))
+        return 63 - leading_zero;
+    return 63;
+}
+#else
+static int __inline __builtin_clzll(uint64_t value) {
+    uint32_t msh = (uint32_t)(value >> 32);
+    uint32_t lsh = (uint32_t)(value & 0xFFFFFFFF);
+    if (msh != 0)
+        return __builtin_clz(msh);
+    return 32 + __builtin_clz(lsh);
+}
+#endif
 #endif
 
 MVMint64 MVM_coerce_istrue_s(MVMThreadContext *tc, MVMString *str) {
@@ -89,36 +118,33 @@ static char * i64toa_jeaiii(int64_t i, char* b) {
 }
 
 /* End code */
-
+static const int mag[] = { 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19, 19, 19, 20 };
 MVMString * MVM_coerce_i_s(MVMThreadContext *tc, MVMint64 i) {
-    char buffer[20];
-    int len;
     /* See if we can hit the cache. */
-    int cache = 0 <= i && i < MVM_INT_TO_STR_CACHE_SIZE;
+    const int cache = 0 <= i && i < MVM_INT_TO_STR_CACHE_SIZE;
     if (cache) {
         MVMString *cached = tc->instance->int_to_str_cache[i];
         if (cached)
             return cached;
     }
     /* Otherwise, need to do the work; cache it if in range. */
-    len = i64toa_jeaiii(i, buffer) - buffer;
+    const int is_negative = i < 0;
+    const int msb = 64 - __builtin_clzll((is_negative ? -i : i) | 1);
+    char *buffer = MVM_malloc(mag[msb] + is_negative);
+    const int len = i64toa_jeaiii(i, buffer) - buffer;
     if (0 <= len) {
-        MVMString *result = NULL;
-        MVMGrapheme8 *blob = MVM_malloc(len);
-        memcpy(blob, buffer, len);
-        result = MVM_string_ascii_from_buf_nocheck(tc, blob, len);
+        MVMString *result = MVM_string_ascii_from_buf_nocheck(tc, (MVMGrapheme8 *)buffer, len);
         if (cache)
             tc->instance->int_to_str_cache[i] = result;
         return result;
     }
     else {
+        MVM_free(buffer);
         MVM_exception_throw_adhoc(tc, "Could not stringify integer (%"PRId64")", i);
     }
 }
 
 MVMString * MVM_coerce_u_s(MVMThreadContext *tc, MVMuint64 i) {
-    char buffer[20];
-    int len;
     /* See if we can hit the cache. */
     int cache = i < MVM_INT_TO_STR_CACHE_SIZE;
     if (cache) {
@@ -127,17 +153,16 @@ MVMString * MVM_coerce_u_s(MVMThreadContext *tc, MVMuint64 i) {
             return cached;
     }
     /* Otherwise, need to do the work; cache it if in range. */
-    len = u64toa_jeaiii(i, buffer) - buffer;
+    char *buffer = MVM_malloc(20);
+    int len = u64toa_jeaiii(i, buffer) - buffer;
     if (0 <= len) {
-        MVMString *result = NULL;
-        MVMGrapheme8 *blob = MVM_malloc(len);
-        memcpy(blob, buffer, len);
-        result = MVM_string_ascii_from_buf_nocheck(tc, blob, len);
+        MVMString *result = MVM_string_ascii_from_buf_nocheck(tc, (MVMGrapheme8 *)buffer, len);
         if (cache)
             tc->instance->int_to_str_cache[i] = result;
         return result;
     }
     else {
+        MVM_free(buffer);
         MVM_exception_throw_adhoc(tc, "Could not stringify integer (%"PRIu64")", i);
     }
 }