compile gtk2hs failed under locale zh_CN.utf8 #7

Closed
ian-ross opened this issue Aug 1, 2013 · 6 comments

Comments

@ian-ross
Member

ian-ross commented Aug 1, 2013

Bug imported from C2HS Trac

Trac ticket created: 2008-06-19T14:16:39-0700; last modified: 2013-02-08T19:41:17-0800


Bug previously reported in the GHC Trac: http://hackage.haskell.org/trac/ghc/ticket/2384


When compiling gtk2hs under the locale zh_CN.utf8, I got:

c2hsLocal: Error in C header file. <built-in>:1: (column 0) [FATAL]
    Lexical error!
    The character '#' does not fit here.

When I compile it using the POSIX locale, it succeeds.

@ian-ross
Member Author

ian-ross commented Aug 1, 2013

Replying to duncan:

> Bug previously reported in the ghc trac
>
> When compiling gtk2hs under the locale zh_CN.utf8, I got:
>
>     c2hsLocal: Error in C header file. <built-in>:1: (column 0) [FATAL]
>         Lexical error!
>         The character '#' does not fit here.
>
> When I compile it using the POSIX locale, it succeeds.

I can confirm that c2hs fails like this with the Serbian locale (sr_RS.UTF8) as well. I suppose c2hs fails to read Unicode characters. As a workaround, I added a small wrapper script around c2hs (just "LANG=C c2hs-bin $*") and moved the real c2hs to c2hs-bin. Maybe that could be included in c2hs as a quick fix until this is properly handled?
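
A minimal sketch of such a wrapper, assuming the real binary has been renamed to c2hs-bin and is on PATH. The function name run_c2hs and the C2HS_BIN override are hypothetical names used only for illustration. Compared with the one-liner, it also sets LC_ALL, because LC_ALL takes precedence over LANG when both are set, and it forwards arguments with "$@" so that arguments containing spaces stay intact.

```sh
#!/bin/sh
# Sketch only: run_c2hs and C2HS_BIN are illustrative names, not part of c2hs.
# Assumes the real binary was renamed to c2hs-bin and is on PATH.
run_c2hs() {
    # LC_ALL overrides LANG and every LC_* category, so both are set to C.
    # "$@" forwards each argument unchanged, unlike an unquoted $*.
    LC_ALL=C LANG=C "${C2HS_BIN:-c2hs-bin}" "$@"
}

# An installed wrapper script would end with:
#   run_c2hs "$@"
```

The locale change applies only to the c2hs process and the programs it starts, so the rest of the build keeps the user's locale.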

Best regards & keep up the good work,
Filip Brcic (brcha AT gna DOT org)

@ian-ross
Member Author

ian-ross commented Aug 1, 2013

The latest darcs version may fix this. It uses the latest language-c, which, when built with alex 3, may handle Unicode correctly. It would be worth someone trying this and confirming.

@ian-ross
Member Author

I've tried this on a small test case with Unicode characters (not yet on gtk2hs) and it works, so if it is an issue to do with reading Unicode, it should be OK in the latest version.

@ian-ross
Member Author

ian-ross commented Jan 8, 2014

This is still a problem in HEAD with the locale zh_CN.utf8, not just for gtk2hs, but even for simple test cases.

@ian-ross
Member Author

I've figured out what this is. It requires a change to the lexer in language-c, which doesn't accept all valid UTF-8 characters (just ASCII). I've asked the maintainer of language-c to look into it.

What's happening here is that C2HS runs CPP to generate a preprocessed header file, which it then attempts to parse using language-c. CPP inserts locale-dependent text into that file, which for some locales means UTF-8 text containing non-ASCII characters. The language-c lexer then fails to read the header file, and C2HS fails as a result.

Looking at the C99 standard, it appears that it is acceptable for strings in C source files to be encoded as UTF-8, so this should be fixable.
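
One way to check whether a given preprocessed header is affected is to look for bytes outside ASCII in the CPP output. The following is a rough sketch, not part of C2HS: the function name find_non_ascii is hypothetical, and the -a flag assumes GNU or BSD grep.

```sh
#!/bin/sh
# Sketch only: find_non_ascii is an illustrative helper, not part of C2HS.
# Prints "line-number:text" for every line of the file containing a byte that
# is neither printable ASCII nor whitespace (this also flags control bytes).
find_non_ascii() {
    # In the C locale, [:print:] and [:space:] cover only ASCII characters,
    # so any byte of a multi-byte UTF-8 sequence falls outside both classes.
    # -a (GNU/BSD grep) forces the input to be treated as text.
    LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1"
}
```

For example, with a hypothetical header foo.h, one could run `LANG=zh_CN.utf8 gcc -E foo.h > foo.i` followed by `find_non_ascii foo.i`. Any lines printed are ones that an ASCII-only lexer would be expected to reject.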

ian-ross added a commit that referenced this issue Jan 13, 2014
@ian-ross
Member Author

This should be fixed by a new release of language-c, coming soon from Benedikt Huber. I'll update the lower version bounds when this becomes available.
