Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Branchless integer parsing #2111

Closed
jfaure opened this issue Jan 12, 2024 · 0 comments
Closed

Branchless integer parsing #2111

jfaure opened this issue Jan 12, 2024 · 0 comments

Comments

@jfaure
Copy link

jfaure commented Jan 12, 2024

One of my parsers for a non-json language parses numbers branchlessly, I share the method below.

The idea is to keep around a stage1 bitstring indicating locations of digits for stage2.
Then stage2 computes digit lengths via uint64_t length = __builtin_ctz(notDigitMask >> firstDigitIdx) , which allows us to unconditinally parse n (probably 8) digits, discarding trailing nonsense using integer divide by (10 ^ (8 - length))

full standalone example:

#pragma GCC target("arch=haswell")
#include <stdio.h>
#include <immintrin.h>
#include <stdint.h>

// helper function as in in simdjson
// It actually parses 16 digits so we're potentially missing out on half the work
uint32_t parse_8_Digits(__m128i in) {
  __m128i const t0 = _mm_subs_epu8(in , _mm_set1_epi8 ('0'));
  __m128i const t1 = _mm_maddubs_epi16 (t0 , _mm_setr_epi8 (10,1,10,1,10,1,10,1,10,1,10,1,10,1,10,1));
  __m128i const t2 = _mm_madd_epi16 (t1 , _mm_setr_epi16(100,1,100,1,100,1,100,1));
  __m128i const t3 = _mm_madd_epi16 (_mm_packus_epi32 (t2 , t2) , _mm_setr_epi16(10000,1,10000,1,10000,1,10000,1));
  return _mm_cvtsi128_si32(t3);
}

// ~digitMask allows __builtin_ctz to directly count digits from an offset
uint64_t parseU64(char const *inp , int firstDigitIdx , uint64_t notDigitMask) {
  int length = __builtin_ctz(notDigitMask >> firstDigitIdx);
  uint64_t eightDigits = parse_8_Digits(_mm_lddqu_si128((__m128i*) (inp + firstDigitIdx)));
  static uint64_t pow10[8] = {1,10,100,1000,10000,100000,1000000,10000000};
  return eightDigits / pow10[8 - length];
}

// Digit indexes are hardcoded here
// They are otherwise obtianed as (digit) scalar pseudo-structurals
int main() {
  char const *str = "  32815 , 122 3              ";
  __m128i inp = _mm_lddqu_si128((__m128i*) str);
  uint32_t notDigitMask = _mm_movemask_epi8(_mm_or_si128
    ( _mm_cmpgt_epi8(inp , _mm_set1_epi8('9'))
    , _mm_cmpgt_epi8(_mm_set1_epi8('0') , inp)));

  printf("%ld %ld %ld\n" ,  parseU64(str , 2 , notDigitMask) , parseU64(str , 10 , notDigitMask) , parseU64(str , 14 , notDigitMask));
}```
@jfaure jfaure closed this as completed Jan 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant