[ fixed #119 ] latin1 encoding: each byte counts as 1 char #156

Merged 1 commit on Jan 27, 2020

Commits on Jan 26, 2020

  1. [ fixed haskell#119 ] latin1 encoding: each byte counts as 1 char

    The computation of the length component of AlexToken was tailored to
    the utf8 encoding and did not work correctly for latin1.
    
    This is fixed by a new flag, ALEX_LATIN1, in
    templates/GenericTemplate.hs.  When set, it turns on code that
    increases the length by 1 for each byte; for utf8, the length is
    computed in a more sophisticated way.
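
To make the difference between the two encodings concrete, here is a minimal sketch of per-byte length counting. It is not the code from templates/GenericTemplate.hs: the `Encoding` type and the functions `isContinuation`, `stepLength`, and `tokenLength` are hypothetical names, and the utf8 branch assumes that the "more sophisticated" computation amounts to not counting continuation bytes (those of the form 10xxxxxx).

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- | Hypothetical encoding tag. The real template selects the behaviour
-- with the ALEX_LATIN1 preprocessor flag rather than a runtime value.
data Encoding = Latin1 | Utf8
  deriving (Eq, Show)

-- | True for UTF-8 continuation bytes, i.e. bytes of the form 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation b = b .&. 0xC0 == 0x80

-- | Token length (in characters) after consuming one more byte.
stepLength :: Encoding -> Int -> Word8 -> Int
stepLength Latin1 len _ = len + 1          -- every byte is one character
stepLength Utf8 len b
  | isContinuation b = len                 -- still inside the same character
  | otherwise        = len + 1             -- first byte of a new character

-- | Length in characters of a token given as its raw bytes.
tokenLength :: Encoding -> [Word8] -> Int
tokenLength enc = foldl (stepLength enc) 0
```

For the bytes `[0x61, 0xC3, 0xA9]`, `tokenLength Utf8` counts two characters ("a" followed by a two-byte "é"), whereas `tokenLength Latin1` counts three, one per byte.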
    
    The fix requires more template instances to be generated.  To streamline
    instance generation, all 2^4 = 16 template instances are now generated,
    one for each combination of the 4 flags
    
      - ghc
      - latin1
      - nopred
      - debug
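
The count of 16 instances follows from switching each of the four flags on or off independently. A small sketch of that enumeration is given here; `flagNames`, `allFlagSets`, and `enabledFlags` are hypothetical names chosen for illustration and are not taken from the project.

```haskell
import Control.Monad (replicateM)

-- | The four template flags named in the commit message.
flagNames :: [String]
flagNames = ["ghc", "latin1", "nopred", "debug"]

-- | Every on/off assignment for the flags: 2^4 = 16 in total.
allFlagSets :: [[Bool]]
allFlagSets = replicateM (length flagNames) [False, True]

-- | Names of the flags that are switched on in one assignment.
enabledFlags :: [Bool] -> [String]
enabledFlags bits = [name | (name, True) <- zip flagNames bits]
```

Mapping `enabledFlags` over `allFlagSets` lists the flag combination behind each of the 16 instances, from the empty list (no flag set) to all four names.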
    
    Template instances are referenced through a function
    
      templateFileName
    
    which is defined in both src/Main and gen-alex-sdist/Main.  The two
    copies need to be kept in sync, should more dimensions be added to the
    template.
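
As an illustration of what a function like templateFileName might do, the sketch below derives a file name from the four flags. The signature, the base name "AlexTemplate", and the "-flag" suffix scheme are all assumptions made for this example; the definitions in src/Main and gen-alex-sdist/Main may differ.

```haskell
-- | Hypothetical sketch: derive a template file name from the four flags.
-- The base name and the suffix format are assumptions for illustration,
-- not necessarily the naming scheme used by the project.
templateFileName :: Bool -> Bool -> Bool -> Bool -> FilePath
templateFileName ghc latin1 nopred debug =
  concat ("AlexTemplate" : ['-' : name | (name, True) <- flags])
  where
    -- Fixed flag order, so that every caller builds the same name.
    flags =
      [ ("ghc", ghc)
      , ("latin1", latin1)
      , ("nopred", nopred)
      , ("debug", debug)
      ]
```

Because the generator and the consumer of the templates both have to compute the same name, any new flag would need to be added at the same position in both copies of such a function.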
    
    (Putting this function into a separate file that is included by both
    modules could be an option, but did not seem to be in the spirit of
    cabal-organized projects.)
    andreasabel committed Jan 26, 2020
    Commit ae525e3