hermantb/fast_convert

fast_convert

A fast implementation of some print and scan functions. This code can be used, for example, to write a very fast JSON implementation. I tested this on a JSON library and got a speedup of about 7 times.

Some speedup figures:

  • fast_ftoa is approx. 20 times as fast as sprintf (on x86_64).
  • fast_dtoa is approx. 26 times as fast as sprintf (on x86_64).
  • fast_strtof is approx. 4 times as fast as strtof (on x86_64).
  • fast_strtod is approx. 6 times as fast as strtod (on x86_64).

Some remarks:

  • All floating point routines have the same binary result as the glibc code (see tst_convert.c).
  • When ROUND_EVEN == 1 the strings produced by fast_ftoa and fast_dtoa are the same as sprintf.
  • The integer functions never overflow. Instead, they stop converting and set endptr to the character that would cause the overflow.
  • No checking is done on the size of the supplied strings.
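The overflow remark above can be illustrated with a minimal sketch. The helper parse_u32_no_overflow is hypothetical and is not taken from fast_convert; it only shows one way a parser can stop before the digit that would overflow and report that position through endptr. The library's actual behaviour should be checked against fast_convert.h and tst_convert.c.

```c
#include <stdint.h>

/* Hypothetical helper, not part of fast_convert: parse an unsigned decimal
 * number and stop before the digit that would overflow uint32_t. */
static uint32_t parse_u32_no_overflow(const char *str, char **endptr)
{
    uint32_t value = 0;
    const char *p = str;

    while (*p >= '0' && *p <= '9') {
        uint32_t digit = (uint32_t)(*p - '0');
        /* value * 10 + digit fits only if value <= (UINT32_MAX - digit) / 10. */
        if (value > (UINT32_MAX - digit) / 10)
            break;              /* this digit would overflow: leave it unconsumed */
        value = value * 10 + digit;
        p++;
    }
    if (endptr)
        *endptr = (char *)p;    /* first character that was not consumed */
    return value;
}
```

With this sketch, a caller can detect overflow by checking whether *endptr still points at a digit after the call.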

Functions

The following functions are implemented (see also fast_convert.h):

unsigned int fast_sint32 (int32_t v, char *str);
unsigned int fast_sint64 (int64_t v, char *str);
unsigned int fast_uint32 (uint32_t v, char *str);
unsigned int fast_uint64 (uint64_t v, char *str);

int32_t fast_strtos32 (const char *str, char **endptr, int base);
int64_t fast_strtos64 (const char *str, char **endptr, int base);
uint32_t fast_strtou32 (const char *str, char **endptr, int base);
uint64_t fast_strtou64 (const char *str, char **endptr, int base);

unsigned int fast_ftoa (float v, int size, char *line);
unsigned int fast_dtoa (double v, int size, char *line);
float fast_strtof (const char *str, char **endptr);
double fast_strtod (const char *str, char **endptr);

The fast_strto[su] functions support 0x (hex), 0 (octal) and decimal input.
The fast_strto[fd] functions support 0x (hex) and decimal input. inf and nan are also supported.
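For reference, the input forms listed above look like this when parsed with the libc functions. This is a sketch using only strtol and strtod; the wrapper names are made up for illustration, and it is an assumption (not confirmed by this README) that the fast_ variants detect the prefixes in the same way, for example with base 0.

```c
#include <math.h>
#include <stdlib.h>

/* Illustrative wrapper: base 0 lets strtol pick hex ("0x..."),
 * octal (leading "0") or decimal from the prefix. */
static long libc_parse_int_auto(const char *s, char **endptr)
{
    return strtol(s, endptr, 0);
}

/* Illustrative wrapper: strtod accepts decimal, hex floats such as
 * "0x1.8p1", and the special values "inf" and "nan". */
static double libc_parse_double(const char *s, char **endptr)
{
    return strtod(s, endptr);
}
```

Here "0x1f", "017" and "42" parse as integers in hex, octal and decimal respectively, and isinf() / isnan() from math.h identify the special floating point results.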

The above functions perform similar functions to the following libc functions:

int sprintf (str, "%d", v);
int sprintf (str, "%ld", v);
int sprintf (str, "%u", v);
int sprintf (str, "%lu", v);

long int strtol(const char *nptr, char **endptr, int base);
long long int strtoll(const char *nptr, char **endptr, int base);
unsigned long int strtoul(const char *nptr, char **endptr, int base);
unsigned long long int strtoull(const char *nptr, char **endptr, int base);

int sprintf (line, "%.*g", size, v); // v = float
int sprintf (line, "%.*g", size, v); // v = double
float strtof(const char *str, char **endptr);
double strtod(const char *str, char **endptr);
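As a point of comparison, the libc side of the floating point pairing above can be sketched as follows. The function names libc_dtoa and libc_roundtrips are invented for this example. That the return value mirrors a string length, as the unsigned int return type of fast_dtoa suggests, is an assumption. The caller must supply a large enough buffer, since sprintf does no size checking either.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative libc counterpart with the same argument order as fast_dtoa:
 * format v with `size` significant digits into `line`, return the length. */
static unsigned int libc_dtoa(double v, int size, char *line)
{
    int n = sprintf(line, "%.*g", size, v);
    return n < 0 ? 0u : (unsigned int)n;
}

/* 17 significant digits are enough to print a double and read back
 * exactly the same value with strtod. */
static int libc_roundtrips(double v)
{
    char line[64];
    libc_dtoa(v, 17, line);
    return strtod(line, NULL) == v;
}
```

A program switching to fast_dtoa and fast_strtod would presumably replace the sprintf and strtod calls one for one; the README states that the results match glibc.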

Performance

The performance of all floating point code was measured with tst_convert.c, which accepts the following options:

f test float convert
s test float sprintf convert
d test double convert
S test double sprintf convert
g test fast_strtof convert
G test fast_strtod convert
t test strtof convert
T test strtod convert
p test precision float
P test precision double
c count differences float
C count differences double
i test integer functions
If the option after the first one is 'n', then no check is done.

64 bits (i7-4700MQ + fedora 30):
f:    625.17 (2510.37 / 625.17 = 4.02)
s:   2510.37
d:   1477.76 (5799.09 / 1477.76 = 3.92)
S:   5799.09
fn:    92.23 (1870.78 / 92.23 = 20.28)
sn:  1870.78
dn:   160.43 (4194.16 / 160.43 = 26.14)
Sn:  4194.16
g:    217.58 ((597.76 - 92.23) / (217.58 - 92.23) = 4.03)
t:    597.76
G:    391.04 ((1463.43 - 160.43) / (391.04 - 160.43) = 5.65)
T:   1463.43
gn:   212.41 ((588.91 - 92.23) / (212.41 - 92.23) = 4.13)
tn:   588.91
Gn:   392.33 ((1453.78 - 160.43) / (392.33 - 160.43) = 5.58)
Tn:  1453.78
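The ratios in parentheses follow two patterns, which can be sketched as below. The function names are made up for this example. The print ratios divide the libc time by the fast time. The scan ratios first subtract the fast print time (fn or dn) from both measurements, presumably because the scan benchmarks generate their input strings with the fast print routine; that reason is an assumption, only the arithmetic is taken from the figures.

```c
/* Print benchmarks: libc time divided by fast time,
 * e.g. sn / fn for float or Sn / dn for double. */
static double print_speedup(double libc_time, double fast_time)
{
    return libc_time / fast_time;
}

/* Scan benchmarks: subtract the fast print time from both measurements
 * before dividing, e.g. (t - fn) / (g - fn). */
static double scan_speedup(double libc_time, double fast_time, double print_time)
{
    return (libc_time - print_time) / (fast_time - print_time);
}
```

For the 64-bit figures, print_speedup(1870.78, 92.23) gives about 20.28 and scan_speedup(597.76, 217.58, 92.23) gives about 4.03, matching the fn and g lines.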

32 bits (i7-4700MQ + fedora 30):
f:   1023.48 (2836.51 / 1023.48 = 2.77)
s:   2836.51
d:   2769.17 (8004.75 / 2769.17 = 2.89)
S:   8004.75
fn:   147.78 (1909.85 / 147.78 = 12.92)
sn:  1909.85
dn:   381.08 (5383.21 / 381.08 = 14.12)
Sn:  5383.21
g:    449.30 ((1083.00 - 147.78) / (449.30 - 147.78) = 3.10)
t:   1083.00
G:    941.57 ((2830.57 - 381.08) / (941.57 - 381.08) = 4.37)
T:   2830.57
gn:   436.55 ((1069.70 - 147.78) / (436.55 - 147.78) = 3.19)
tn:  1069.70
Gn:   927.53 ((2806.51 - 381.08) / (927.53 - 381.08) = 4.44)
Tn:  2806.51

raspberry pi (3b + raspbian buster):
f:   6005.82 (18654.65 / 6005.82 = 3.11)
s:  18654.65
d:  13205.23 (42775.59 / 13205.23 = 3.24)
S:  42775.59
fn:   819.58 (12308.98 / 819.58 = 15.02)
sn: 12308.98
dn:  1763.91 (30160.90 / 1763.91 = 17.10)
Sn: 30160.90
g:   2422.91 ((6340.77 - 819.58) / (2422.91 - 819.58) = 3.44)
t:   6340.77
G:   4387.61 ((19176.56 - 1763.91) / (4387.61 - 1763.91) = 6.64)
T:  18796.96
gn:  2373.73 ((6226.29 - 819.58) / (2373.73 - 819.58) = 3.48)
tn:  6226.29
Gn:  4347.37 ((18765.85 - 1763.91) / (4347.37 - 1763.91) = 6.58)
Tn: 18765.85

raspberry pi (4b + 32 bit raspbian buster):
f:   2813.47 (7676.53 / 2813.47 = 2.73)
s:   7676.53
d:   6802.98 (20389.28 / 6802.98 = 3.00)
S:  20389.28
fn:   407.26 (5141.19 / 407.26 = 12.62)
sn:  5141.19
dn:   996.83 (13827.43 / 996.83 = 13.87)
Sn: 13827.43
g:   1003.31 ((2838.14 - 407.26) / (1003.31 - 407.26) = 4.08)
t:   2838.14
G:   2101.57 ((6817.40 - 996.83) / (2101.57 - 996.83) = 5.27)
T:   6817.40
gn:   957.41 ((2819.12 - 407.26) / (957.41 - 407.26) = 4.38)
tn:  2819.12
Gn:  2081.88 ((6791.61 - 996.83) / (2081.88 - 996.83) = 5.34)
Tn:  6791.61

raspberry pi (4b + 64 bit raspbian buster):
f:   1769.54 (6087.83 / 1769.54 = 3.44)
s:   6087.83
d:   4589.72 (20370.82 / 4589.72 = 4.44)
S:  20370.82
fn:   264.18 (4675.01 / 264.18 = 17.70)
sn:  4675.01
dn:   526.18 (15347.77 / 526.18 = 29.16)
Sn: 15347.77
g:    684.24 ((1733.83 - 264.18) / (684.24 - 264.18) = 3.50)
t:   1733.83
G:   1213.86 ((4591.56 - 526.18) / (1213.86 - 526.18) = 5.91)
T:   4591.56
gn:   624.60 ((1715.98 - 264.18) / (624.60 - 264.18) = 4.03)
tn:  1715.98
Gn:  1213.86 ((4576.55 - 526.18) / (1213.86 - 526.18) = 5.89)
Tn:  4576.55

p:    286.03 fast: 0, libc: 0
P:   1090.16 fast: 0, libc: 0

c:   2072.77 0 0.00%
C:   4439.85 0 0.00%

Locale

The locale decimal point is set at startup. If an application needs to use a different locale, it has to call localeconv() to update the decimal point.
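As a small illustration of the libc side of this, the sketch below reads the decimal point of the current locale with localeconv(). The helper name is hypothetical. How fast_convert itself picks up the new value is not shown in this README, so this only demonstrates the standard call the paragraph refers to.

```c
#include <locale.h>

/* Hypothetical helper: return the decimal-point character of the
 * locale currently selected for LC_NUMERIC. */
static char current_decimal_point(void)
{
    const struct lconv *lc = localeconv();
    return lc->decimal_point[0];
}
```

An application would typically call setlocale(LC_NUMERIC, ...) first and query the decimal point afterwards. In the default "C" locale the result is '.'.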

License

Licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in fast_convert by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
