Grapheme iterator cached #661

samcv merged 7 commits into MoarVM:master from samcv:grapheme_iterator_cached on Sep 4, 2017

samcv commented Aug 26, 2017

Here's a rundown on before/after this change:

Description                          caching      master
index with single codepoint needle   0.21844425   0.2195167
index with word needle               0.8477450    1.0048890
index with word needle STRAND        0.832730     2.67385181
index with word needle INDEXINGOPT   0.8357464    1.01405131

samcv added some commits Aug 25, 2017

Always cache even for flat haystacks with KMP
Shows a 20% improvement compared to calling MVM_string_get_grapheme_at_nocheck
on flat strings (counterintuitively).

Only use MVMGraphemeIter_cached for strands in KMP index
If the Haystack is a strand, use MVM_string_gi_cached_get_grapheme
since it retains its grapheme iterator over invocations unlike
MVM_string_get_grapheme_at_nocheck and caches the previous grapheme. It
is slower for flat Haystacks though (ever since I got
MVM_string_get_grapheme_at_nocheck to be inlined).

Use 16bit to store KMP jump table. Set max needle to 8192
Also malloc the memory instead of putting it onto the stack if it is
going to be more than 4069 bytes. Not 100% sure this is needed, but
don't want to allocate too much to the stack since some platforms
may have smaller stack size than current versions of Linux, Windows
or MacOS do.
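
For readers unfamiliar with the technique in the last commit, here is a minimal sketch of a KMP search that stores its jump table in 16-bit entries, keeps small tables on the stack, and uses malloc for larger ones. It is not MoarVM's code: the names kmp_index, kmp_build, naive_index and KMP_STACK_BYTES are hypothetical, the stack threshold is an illustrative value, the strings are plain int32_t arrays rather than grapheme strands, and the naive fallback for oversized needles is an assumption about what a caller might do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define KMP_MAX_NEEDLE  8192  /* max needle length, as in the commit title */
#define KMP_STACK_BYTES 4096  /* hypothetical stack budget for the table */

/* Plain quadratic search, used when KMP is not applicable. */
static long naive_index(const int32_t *hay, size_t h,
                        const int32_t *needle, size_t n) {
    for (size_t i = 0; i + n <= h; i++) {
        size_t j = 0;
        while (j < n && hay[i + j] == needle[j])
            j++;
        if (j == n)
            return (long)i;
    }
    return -1;
}

/* Fill fail[i] with the length of the longest proper prefix of
 * needle[0..i] that is also a suffix of it. */
static void kmp_build(const int32_t *needle, size_t n, int16_t *fail) {
    size_t k = 0;
    /* Every entry is < n, so it fits in int16_t when n <= 8192. */
    assert(n <= KMP_MAX_NEEDLE);
    fail[0] = 0;
    for (size_t i = 1; i < n; i++) {
        while (k > 0 && needle[i] != needle[k])
            k = (size_t)fail[k - 1];
        if (needle[i] == needle[k])
            k++;
        fail[i] = (int16_t)k;
    }
}

/* Return the index of the first occurrence of needle in hay, or -1. */
long kmp_index(const int32_t *hay, size_t h,
               const int32_t *needle, size_t n) {
    int16_t  on_stack[KMP_STACK_BYTES / sizeof(int16_t)];
    int16_t *fail   = on_stack;
    long     result = -1;
    size_t   k      = 0;

    if (n == 0)
        return 0;
    if (n > h)
        return -1;
    if (n > KMP_MAX_NEEDLE)          /* table entries would not fit */
        return naive_index(hay, h, needle, n);

    /* Table too large for the stack buffer: allocate it on the heap. */
    if (n * sizeof(int16_t) > sizeof(on_stack)) {
        fail = malloc(n * sizeof(int16_t));
        if (fail == NULL)
            return naive_index(hay, h, needle, n);
    }

    kmp_build(needle, n, fail);
    /* The haystack index i only ever moves forward. */
    for (size_t i = 0; i < h; i++) {
        while (k > 0 && hay[i] != needle[k])
            k = (size_t)fail[k - 1];
        if (hay[i] == needle[k])
            k++;
        if (k == n) {
            result = (long)(i + 1 - n);
            break;
        }
    }

    if (fail != on_stack)
        free(fail);
    return result;
}
```

Because the haystack index never moves backwards in this loop, each haystack element is requested in increasing order, which is the access pattern a cached iterator over strands can serve without restarting from the beginning.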

samcv force-pushed the samcv:grapheme_iterator_cached branch from 47bd20c to 1a7dafc on Sep 4, 2017

samcv merged commit 4a998d5 into MoarVM:master on Sep 4, 2017
