Improve the performance by ~30% for lexers that don't use predicates #16

Merged
merged 5 commits into from Feb 18, 2013

Contributor

dcoutts commented Feb 11, 2013

By specialising slightly for the case where left and right contexts
are not used we can simplify the processing of accept actions in
alex_scan_tkn.

When we have left or right contexts we end up with chains of AlexAcc
values and we have to traverse these chains. When there are no left or
right contexts these chains always have length 1, which makes the
traversal non-recursive. This change improves performance significantly
(presumably because, with a non-recursive traversal, GHC can inline and
transform further). A further slight improvement can be had by always
returning the input and length from alex_scan_tkn in a (,,) rather than
putting them into some of the AlexLastAcc constructors.

The 30% improvement was measured on a lexer for .cabal files, but I
don't think there is anything special about that example, so I would
expect similar improvements for other lexers that don't use contexts.
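
To illustrate the difference between the two traversals, here is a minimal sketch. It is not the code from the Alex templates: the names `Acc`, `Pred`, `Result`, `resolve` and `resolveNoPred` are hypothetical, and a context predicate is modelled as a plain `Int -> Bool` purely for illustration.

```haskell
-- Hypothetical stand-in for a left/right context predicate.
type Pred = Int -> Bool

-- Simplified accept chain: predicated entries carry the rest of the chain.
data Acc a
  = AccNone
  | Acc a
  | AccSkip
  | AccPred a Pred (Acc a)
  | AccSkipPred Pred (Acc a)

data Result a = NoMatch | Token a | Skipped
  deriving (Eq, Show)

-- General case: a failed predicate forces us to recurse down the chain.
resolve :: Int -> Acc a -> Result a
resolve _   AccNone = NoMatch
resolve _   (Acc a) = Token a
resolve _   AccSkip = Skipped
resolve ctx (AccPred a p rest)
  | p ctx     = Token a
  | otherwise = resolve ctx rest
resolve ctx (AccSkipPred p rest)
  | p ctx     = Skipped
  | otherwise = resolve ctx rest

-- No-predicate case: every chain has length 1, so a single
-- non-recursive case analysis is enough.
resolveNoPred :: Acc a -> Result a
resolveNoPred (Acc a) = Token a
resolveNoPred AccSkip = Skipped
resolveNoPred _       = NoMatch
```

Because `resolveNoPred` never calls itself, it is the kind of function a compiler can inline at its call site, which is the presumed source of the speed-up described above.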

Change the representation of the alex_accept elements
Previously it was a [AlexAcc a user], and AlexAcc has constructors for
accept, skip, and predicated accept and skip. Then alex_scan_tkn had
to go through this list. But in fact the only time the list has more
than one element is when it starts with a predicated accept or skip.
An unconditional accept or skip always terminates the search in
alex_scan_tkn. So we can use a slightly more specialised
representation. This both saves one indirection and makes the search
function simpler.

The real motivation here is to make the search function simpler because
that will allow for greater performance improvements in the common
special case that the lexer uses no left or right contexts. In that
case the "search" actually becomes non-recursive.
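
As a rough sketch of the representation change, the following converts a list-shaped accept set into a chained one. The types `OldAcc` and `NewAcc`, the `Pred` synonym and the functions `toChain` and `chainLength` are all hypothetical names for illustration; they are simplified and are not the definitions used in Alex.

```haskell
-- Hypothetical stand-in for a context predicate.
type Pred = Int -> Bool

-- Old shape: each state holds a list of these.
data OldAcc a
  = OAccept a
  | OSkip
  | OAcceptPred a Pred
  | OSkipPred Pred

-- New shape: only predicated entries can have a successor,
-- so the "list" is folded into the constructors themselves.
data NewAcc a
  = NNone
  | NAccept a
  | NSkip
  | NAcceptPred a Pred (NewAcc a)
  | NSkipPred Pred (NewAcc a)

-- An unconditional accept or skip ends the search, so anything
-- after it in the old list is unreachable and is dropped.
toChain :: [OldAcc a] -> NewAcc a
toChain []                       = NNone
toChain (OAccept a : _)          = NAccept a
toChain (OSkip : _)              = NSkip
toChain (OAcceptPred a p : rest) = NAcceptPred a p (toChain rest)
toChain (OSkipPred p : rest)     = NSkipPred p (toChain rest)

-- Number of entries a search could visit in the worst case.
chainLength :: NewAcc a -> Int
chainLength NNone                = 0
chainLength (NAccept _)          = 1
chainLength NSkip                = 1
chainLength (NAcceptPred _ _ r)  = 1 + chainLength r
chainLength (NSkipPred _ r)      = 1 + chainLength r
```

With no predicated entries in the input, `toChain` can only produce `NNone`, `NAccept` or `NSkip`, which is why the search degenerates to a single case analysis in that situation.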
Owner

dcoutts commented on src/Output.hs in fe081e8 Feb 11, 2013

Note that I'm assuming here that if we have an unconditional accept or skip, we have exactly one. Certainly the lookup code will stop at the first one, so emitting more than one here is pointless. However, if the rest of that list can be non-empty, then the pattern should be _ rather than []. I used [] there mainly as an assertion.
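
The difference between the two patterns can be sketched as follows. This is not the code in src/Output.hs: `Acc`, `firstStrict` and `firstLenient` are hypothetical names, and `Nothing` stands in for the pattern-match failure that the strict version would produce in the real generator.

```haskell
data Acc = Accept Int | Skip
  deriving (Eq, Show)

-- Matching the tail against [] asserts there is exactly one entry;
-- any other shape is rejected (modelled here as Nothing).
firstStrict :: [Acc] -> Maybe Acc
firstStrict [acc] = Just acc
firstStrict _     = Nothing

-- Matching the tail against _ tolerates trailing entries and ignores them.
firstLenient :: [Acc] -> Maybe Acc
firstLenient (acc : _) = Just acc
firstLenient []        = Nothing
```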

dcoutts added some commits Feb 11, 2013

Fix warnings in the .cabal file
Fix the cabal-version (since we're using a testsuite section now)
and add a description, taken from the docs.
Improve the performance by ~30% for lexers that don't use predicates
By specialising slightly for the case where left and right contexts
are not used we can simplify the processing of accept actions in
alex_scan_tkn.

When we have left or right contexts we end up with chains of AlexAcc
values and we have to traverse these chains. When there are no left or
right contexts these chains always have length 1, which makes the
traversal non-recursive. This change improves performance significantly
(presumably because, with a non-recursive traversal, GHC can inline and
transform further).

The 30% improvement was measured on a lexer for .cabal files, but I
don't think there is anything special about that example, so I would
expect similar improvements for other lexers that don't use contexts.

The change is only implemented for the -g case, but that's fine since
that's what you'd use if you care about performance, and whether it'd
be an improvement for other compilers is anyone's guess.
Update the .gitignore
For the ./AlexTemplate-ghc-nopred file
Contributor

dcoutts commented Feb 12, 2013

I amended the patches because I discovered that an extra "optimisation" I'd made was wrong. It was an extra change, not directly related to the main change here.

simonmar added a commit that referenced this pull request Feb 18, 2013

Merge pull request #16 from dcoutts/master
Improve the performance by ~30% for lexers that don't use predicates

@simonmar simonmar merged commit e9402c4 into simonmar:master Feb 18, 2013

Owner

simonmar commented Feb 18, 2013

Thanks!
