Performance Optimization: Reduce strlen calls in literal matching. 500% faster. #119

Merged
soasme merged 2 commits into main from perf on Aug 31, 2021

@soasme soasme commented Aug 31, 2021

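Profiling exposed a performance bottleneck: strlen is called far too often. In the callgrind report below, __strlen_avx2 accounts for about 5.67 billion of the 17.15 billion instructions executed, roughly a third of the total.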

root@e9e1f33a922d:/app/buildlinux# callgrind_annotate callgrind.out.4864
--------------------------------------------------------------------------------
Profile data file 'callgrind.out.4864' (creator: callgrind-3.15.0)
--------------------------------------------------------------------------------
I1 cache:
D1 cache:
LL cache:
Timerange: Basic block 0 - 2999929205
Trigger: Program termination
Profiled target:  ./cli ast --grammar-file ../tests/golang-v1.17.peg --grammar-entry SourceFile ../tables.go (PID 4864, part 1)
Events recorded:  Ir
Events shown:     Ir
Event sort order: Ir
Thresholds:       99
Include dirs:
User annotated:
Auto-annotation:  off

--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
17,147,795,601  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir             file:function
--------------------------------------------------------------------------------
5,671,628,114  /build/glibc-eX1tMB/glibc-2.31/string/../sysdeps/x86_64/multiarch/strlen-avx2.S:__strlen_avx2 [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
2,285,786,104  /build/glibc-eX1tMB/glibc-2.31/stdio-common/vfprintf-internal.c:__vfprintf_internal [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
2,082,168,585  /build/glibc-eX1tMB/glibc-2.31/libio/genops.c:_IO_default_xsputn [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  751,814,539  /build/glibc-eX1tMB/glibc-2.31/malloc/malloc.c:_int_free [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  745,634,925  /app/peppa.c:push_frame [/app/buildlinux/libpeppa.so]
  727,083,049  /app/peppa.c:match_expression'2 [/app/buildlinux/libpeppa.so]
  491,138,084  /build/glibc-eX1tMB/glibc-2.31/malloc/malloc.c:malloc [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  448,750,552  /app/peppa.c:match_sequence'2 [/app/buildlinux/libpeppa.so]
  356,035,185  /build/glibc-eX1tMB/glibc-2.31/string/../sysdeps/x86_64/multiarch/strchr-avx2.S:__strchrnul_avx2 [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  337,972,930  /build/glibc-eX1tMB/glibc-2.31/stdio-common/_itoa.c:_itoa_word [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  281,391,553  /app/peppa.c:pop_frame [/app/buildlinux/libpeppa.so]
  270,636,280  /app/peppa.c:match_literal [/app/buildlinux/libpeppa.so]
  249,880,289  /build/glibc-eX1tMB/glibc-2.31/malloc/malloc.c:free [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  197,488,154  /app/peppa.c:match_choice'2 [/app/buildlinux/libpeppa.so]
  195,408,976  /build/glibc-eX1tMB/glibc-2.31/libio/iovsprintf.c:__vsprintf_internal [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  170,253,828  /app/peppa.c:match_repeat'2 [/app/buildlinux/libpeppa.so]
  145,621,806  /app/peppa.c:P4_GetWhitespaces [/app/buildlinux/libpeppa.so]
  145,144,850  /build/glibc-eX1tMB/glibc-2.31/stdio-common/../libio/libioP.h:__vfprintf_internal
  143,067,286  /build/glibc-eX1tMB/glibc-2.31/libio/strops.c:_IO_str_init_static_internal [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
  135,547,068  /app/peppa.c:match_reference'2 [/app/buildlinux/libpeppa.so]
  119,417,954  /app/peppa.c:match_range [/app/buildlinux/libpeppa.so]
  113,404,878  /app/peppa.c:P4_DeleteNode [/app/buildlinux/libpeppa.so]
   97,704,488  /build/glibc-eX1tMB/glibc-2.31/stdio-common/sprintf.c:sprintf [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
   92,107,686  /app/peppa.c:u8_next_char [/app/buildlinux/libpeppa.so]
   92,018,744  /build/glibc-eX1tMB/glibc-2.31/stdio-common/printf-parse.h:__vfprintf_internal
   87,236,285  /build/glibc-eX1tMB/glibc-2.31/libio/genops.c:_IO_setb [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
   84,817,686  /build/glibc-eX1tMB/glibc-2.31/libio/libioP.h:_IO_default_xsputn
   80,257,318  /build/glibc-eX1tMB/glibc-2.31/libio/genops.c:_IO_no_init [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
   73,278,412  /build/glibc-eX1tMB/glibc-2.31/libio/genops.c:_IO_old_init [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
   62,810,208  /build/glibc-eX1tMB/glibc-2.31/string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:__memset_avx2_unaligned_erms [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
   62,469,889  /build/glibc-eX1tMB/glibc-2.31/malloc/malloc.c:_int_malloc [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
   33,790,679  /app/peppa.c:P4_DiffPosition [/app/buildlinux/libpeppa.so]
   33,770,484  /app/peppa.c:cleanup_freep [/app/buildlinux/libpeppa.so]
   33,029,980  ???:0x000000000487c390 [???]
   29,849,862  /app/peppa.c:P4_CaseCmpInsensitive [/app/buildlinux/libpeppa.so]
   29,210,137  /build/glibc-eX1tMB/glibc-2.31/string/../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S:__memcmp_avx2_movbe [/usr/lib/x86_64-linux-gnu/libc-2.31.so]
   24,344,373  ???:0x000000000483a5f0 [???]
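
For readers who want to see the kind of change that removes strlen from a hot matching path, here is a minimal sketch. It is not the code from this pull request: the Literal struct and the literal_new and literal_match functions are hypothetical names invented for illustration, and the sketch assumes the input length is already known to the caller. The idea is to measure each literal once when the grammar is built, then compare bytes with memcmp on every match attempt.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical literal expression (not taken from peppa.c): the length is
 * measured once, when the grammar is built, not on every match attempt. */
typedef struct {
    const char *text;  /* NUL-terminated literal */
    size_t      len;   /* cached strlen(text) */
} Literal;

/* Build a literal; this is the only place strlen runs. */
Literal literal_new(const char *text) {
    Literal lit = { text, strlen(text) };
    return lit;
}

/* Try to match `lit` at byte offset `pos` of `input`, whose total length
 * `input_len` the caller already knows. Returns 1 on a match, 0 otherwise.
 * On a match the caller advances `pos` by `lit->len`. */
int literal_match(const Literal *lit,
                  const char *input, size_t input_len, size_t pos) {
    if (pos > input_len || lit->len > input_len - pos)
        return 0;  /* not enough input left for this literal */
    return memcmp(input + pos, lit->text, lit->len) == 0;
}
```

With this shape, each match attempt costs a bounds check plus one memcmp over the literal's bytes, and no rescan of either string to find its terminator.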

Timing before and after the change:

Before:

real	0m47.054s
user	0m44.259s
sys	0m0.234s

After:

real	0m12.089s
user	0m9.655s
sys	0m0.107s
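
One way to read these timings is as a speedup factor (old time divided by new time) and as the share of the original time that was eliminated. The helper names below are hypothetical and exist only to make the arithmetic explicit. Applied to the figures above, the wall-clock speedup is roughly 3.9x (47.054 s / 12.089 s) and the user CPU speedup is roughly 4.6x (44.259 s / 9.655 s), which corresponds to about 74% and 78% of the original time saved.

```c
/* Speedup factor: how many times faster the new run is than the old one. */
double speedup(double before_s, double after_s) {
    return before_s / after_s;
}

/* Percentage of the original running time that the change eliminated. */
double time_saved_pct(double before_s, double after_s) {
    return 100.0 * (before_s - after_s) / before_s;
}
```

For example, speedup(47.054, 12.089) gives the wall-clock factor and time_saved_pct(44.259, 9.655) gives the reduction in user CPU time.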


@soasme changed the title from "perf: delete unnecessary strlen calls, which is time consuming." to "Performance Optimization: Reduce strlen calls in literal matching. 500% faster." on Aug 31, 2021
@soasme merged commit ad8b281 into main on Aug 31, 2021
@soasme deleted the perf branch on August 31, 2021 04:44