Skip to content

Commit

Permalink
Use 2-char-at-a-time version of jeaiii's itoa
Browse files Browse the repository at this point in the history
I originally selected the 1-char-at-a-time version based on a comment
that the 2-char-at-a-time version didn't work on 32-bit and/or big endian
systems. However, the author commented that "2 chars at a time should work
as far as a know. If there is a system where it doesn't work, I'll help
make it work." - #1618 (comment)

This version is slightly faster when using my original test of
`MVM_SPESH_BLOCKING=1 ./nqp-m -e 'my str $s; my int $i := 0; my $n := nqp::time; while $i++ < 10_000_000 { $s := $i }; say(nqp::div_n(nqp::time - $n, 1000000000e0)); say($s)'
with the time dropping from ~0.34s to ~0.31s and instructions reported
by callgrind dropping from ~4.313b to ~4.062b.
  • Loading branch information
MasterDuke17 committed Oct 4, 2022
1 parent 4575c47 commit fd0cadb
Showing 1 changed file with 19 additions and 14 deletions.
33 changes: 19 additions & 14 deletions src/core/coerce.c
Expand Up @@ -34,20 +34,25 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE. */

#define W(N, I) b[N] = (char)(I) + '0'
#define A(N) t = ((uint64_t)(1) << (32 + N / 5 * N * 53 / 16)) / (uint32_t)(1e##N) + 1 - N / 9, t *= u, t >>= N / 5 * N * 53 / 16, t += N / 5 * 4, W(0, t >> 32)
#define D(N) t = (uint64_t)(10) * (uint32_t)(t), W(N, t >> 32)

#define L0 W(0, u)
#define L1 A(1), D(1)
#define L2 A(2), D(1), D(2)
#define L3 A(3), D(1), D(2), D(3)
#define L4 A(4), D(1), D(2), D(3), D(4)
#define L5 A(5), D(1), D(2), D(3), D(4), D(5)
#define L6 A(6), D(1), D(2), D(3), D(4), D(5), D(6)
#define L7 A(7), D(1), D(2), D(3), D(4), D(5), D(6), D(7)
#define L8 A(8), D(1), D(2), D(3), D(4), D(5), D(6), D(7), D(8)
#define L9 A(9), D(1), D(2), D(3), D(4), D(5), D(6), D(7), D(8), D(9)
typedef struct pair { char t, o; } pair;
#define P(T) T, '0', T, '1', T, '2', T, '3', T, '4', T, '5', T, '6', T, '7', T, '8', T, '9'
static const pair s_pairs[] = { P('0'), P('1'), P('2'), P('3'), P('4'), P('5'), P('6'), P('7'), P('8'), P('9') };

#define W(N, I) *(pair*)&b[N] = s_pairs[I]
#define A(N) t = ((uint64_t)(1) << (32 + N / 5 * N * 53 / 16)) / (uint32_t)(1e##N) + 1 + N/6 - N/8, t *= u, t >>= N / 5 * N * 53 / 16, t += N / 6 * 4, W(0, t >> 32)
#define S(N) b[N] = (char)((uint64_t)(10) * (uint32_t)(t) >> 32) + '0'
#define D(N) t = (uint64_t)(100) * (uint32_t)(t), W(N, t >> 32)

#define L0 b[0] = (char)(u) + '0'
#define L1 W(0, u)
#define L2 A(1), S(2)
#define L3 A(2), D(2)
#define L4 A(3), D(2), S(4)
#define L5 A(4), D(2), D(4)
#define L6 A(5), D(2), D(4), S(6)
#define L7 A(6), D(2), D(4), D(6)
#define L8 A(7), D(2), D(4), D(6), S(8)
#define L9 A(8), D(2), D(4), D(6), D(8)

#define LN(N) (L##N, b += N + 1)
#define LZ LN
Expand Down

0 comments on commit fd0cadb

Please sign in to comment.