Faster c_count? Perhaps a GHC rather than bytestring issue? #274
Recent performance issues in `streaming-bytestring`'s `lineSplit` and `bytestring`'s `findIndices` indirectly led to observations about the relative performance of the C memchr(3) vs. native byte-by-byte comparison loops of the kind found in `bytestring`'s `c_count` function.

In particular, it is at first surprising that in a 32K block with not unreasonably short lines, counting the newlines via repeated calls to `memchr()` turns out to noticeably outperform a single call to `fps_count()`. The reason is of course that GCC's or clang's `memchr()` is not just a naïve byte-by-byte loop. It may use vector instructions of the CPU or clever portable tricks to efficiently check whether a 64-bit-aligned 64-bit block contains any instances of a given byte.

Some of these techniques can be found under an MIT license in repos with code extracted from Rust, such as:
https://github.com/Freaky/fast-bytecount
so could perhaps be integrated (with proper attribution, and perhaps after checking with the authors) into `bytestring` or GHC. Which brings me to the real question: do such optimisations belong in `bytestring`, or are they perhaps more properly implemented as primops in GHC itself, so that they do not require FFI calls? GHC might then support these for `ByteArray#` and/or `Ptr Word8`?

On some level I feel that CPU-optimised primitives of this sort belong in the compiler, and that libraries just need to use the provided primitives instead of rolling their own. There's some precedent for this with some of the SIMD vector primitives and the bit-level "popcount" primitives found in `GHC.Exts`.

On the other hand, if GHC for some reason is not the right vehicle for per-CPU-optimised primitives of this sort, then perhaps `bytestring` might at least include the clever portable C optimised versions, but I don't think that `bytestring` can reasonably support assembly optimisations targeted at given CPUs or LLVM code generation, ...

So this issue is fairly open-ended. Is this worth thinking about? And if so, is this a GHC topic or a `bytestring` topic?

Cc: @bgamari
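
For concreteness, here is a minimal C sketch of the three counting strategies discussed above: a plain byte-by-byte loop, repeated `memchr()` calls, and a portable word-at-a-time check. The function names (`count_naive`, `count_memchr`, `count_swar`, `has_zero_byte`) are made up for illustration; this is not the actual `fps_count()` source, nor what any particular libc does inside `memchr()`. The word-at-a-time variant uses the well-known "has zero byte" bit trick and loads words with `memcpy()` to sidestep alignment, which is an assumption of this sketch rather than something a tuned implementation would necessarily do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain byte-by-byte loop: the kind of native comparison loop the issue
 * contrasts with memchr(). Illustrative only, not bytestring's code. */
static size_t count_naive(const unsigned char *p, size_t n, unsigned char c)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (p[i] == c);
    return count;
}

/* Count occurrences by calling memchr() repeatedly, resuming just past
 * each hit. Any speed-up comes from whatever the libc memchr() does. */
static size_t count_memchr(const unsigned char *p, size_t n, unsigned char c)
{
    const unsigned char *end = p + n;
    size_t count = 0;
    while (p < end) {
        const unsigned char *hit = memchr(p, c, (size_t)(end - p));
        if (hit == NULL)
            break;
        count++;
        p = hit + 1;
    }
    return count;
}

/* Portable bit trick: the result is nonzero iff some byte of w is zero. */
static inline uint64_t has_zero_byte(uint64_t w)
{
    return (w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL;
}

/* Word-at-a-time count: skip 8-byte words that cannot contain c, and fall
 * back to a byte loop only for words that do. */
static size_t count_swar(const unsigned char *p, size_t n, unsigned char c)
{
    const uint64_t pattern = 0x0101010101010101ULL * c; /* c in every byte */
    size_t count = 0, i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, p + i, 8);            /* alignment-safe load */
        if (has_zero_byte(w ^ pattern))  /* a zero byte means a match */
            for (size_t j = 0; j < 8; j++)
                count += (p[i + j] == c);
    }
    for (; i < n; i++)                   /* remaining tail bytes */
        count += (p[i] == c);
    return count;
}
```

All three return the same count for any input; only how much work is done per byte differs. Which one wins for a given line length would have to be measured rather than assumed.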