Cabal config corrupted when using Unicode #2557

Closed
hvr opened this issue Apr 27, 2015 · 4 comments · Fixed by #5804

Comments

@hvr
Member

hvr commented Apr 27, 2015

$ git clone https://github.com/hvr/-.git 無
...
$ cd 無
$ runghc Setup.hs configure
Configuring 無-0...
$ runghc Setup.hs build
Setup.hs: Saved package config file header is corrupt. Try re-running the 'configure' command.
@ttuegel
Member

ttuegel commented Apr 27, 2015

Looks like we just need to let GHC handle the text encoding for us, rather than going through ByteString.

@23Skidoo
Member

Once this is fixed, we should also add a regression test.

@hvr
Member Author

hvr commented Dec 1, 2017

@ttuegel you self-assigned this bug 2 years ago... did you make any progress? Do you mind if I try giving this one a shot myself? ;-)

@ttuegel ttuegel removed their assignment Dec 2, 2017
@ttuegel
Member

ttuegel commented Dec 2, 2017

@hvr Have at it! I think at the time I had some reason to believe this would be simple to fix, but I never got to it.

@hvr hvr self-assigned this Dec 3, 2017
hvr added a commit to hvr/cabal that referenced this issue Dec 17, 2018
The config-state header is a human readable line prepended to the
binary serialisation which looks like

    Saved package config for pkgname-1.2.3 written by Cabal-2.5.0.0 using ghc-8.6

However, the functions generating and parsing this header didn't take into
account that package names are not limited to the ASCII subset and blindly used
the ByteString `pack` function, which truncates away the high bits of each `Char`
code point, resulting in a corrupted header with a nonsensical package name.

The fix is simply to serialise the package name with the UTF-8 encoding, which
works nicely with the rest of the UTF-8-unaware string handling functions.
Hence the fix is a lot shorter than this commit message.

Fixes haskell#2557
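
The truncation described in the commit message above can be reproduced in a few lines. This is only a minimal sketch, not the code from the actual fix: the helper names `packTruncating` and `packUtf8` are hypothetical, and the UTF-8 encoder used here (`stringUtf8` from `Data.ByteString.Builder`) is an assumption chosen because it ships with the `bytestring` package; Cabal may use a different encoder internally.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL

-- Hypothetical name: Char8.pack keeps only the low 8 bits of each Char.
packTruncating :: String -> BS.ByteString
packTruncating = BC.pack

-- Hypothetical name: encode each Char as UTF-8 instead of truncating it.
packUtf8 :: String -> BS.ByteString
packUtf8 = BL.toStrict . BB.toLazyByteString . BB.stringUtf8

main :: IO ()
main = do
  let name = "\x7121" -- the package name from the report (U+7121)
  print (BS.unpack (packTruncating name)) -- [33], i.e. the single byte '!'
  print (BS.unpack (packUtf8 name))       -- [231,132,161], i.e. E7 84 A1
```

With the truncating variant the header ends up naming a package `!`, which no longer matches what the parser expects; the UTF-8 variant keeps all three bytes of the original character.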
hvr added a commit to hvr/cabal that referenced this issue Dec 17, 2018
…encoding

This takes care of knock-on effects of haskell#2557

Specifically, the `Paths_*.hs` and `cabal_macros.h` files would end up being written
incorrectly by `rewriteFileEx`, which isn't UTF-8 capable.

Now the `cabal_macros.h` file is written out exactly like the `.h` file that `ghc`
generates internally; note, however, that standard CPP doesn't support
non-ASCII characters in CPP symbols, so such a file will not work with a standard
CPP preprocessor.
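
As a rough illustration of writing a generated file in a UTF-8-capable way, the sketch below sets the handle encoding explicitly and lets GHC's I/O layer do the encoding. It is an assumption about one possible approach, not the implementation of `rewriteFileEx` or of the fix; the helper names `writeFileUtf8` and `readFileUtf8` and the file name `cabal_macros_demo.h` are made up for the example.

```haskell
import System.IO

-- Hypothetical helper: write a String with an explicit UTF-8 handle encoding,
-- independent of the locale the process happens to run under.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path contents =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h contents

-- Hypothetical helper: read the file back as UTF-8, forcing the whole
-- string before the handle is closed.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    s <- hGetContents h
    length s `seq` return s

main :: IO ()
main = do
  let contents = "/* package \x7121-0 */\n"
  writeFileUtf8 "cabal_macros_demo.h" contents
  back <- readFileUtf8 "cabal_macros_demo.h"
  print (back == contents) -- True
```

The round trip only shows that the non-ASCII package name survives being written and read back; it says nothing about whether a standard CPP will accept the result.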
hvr added a commit to hvr/cabal that referenced this issue Jan 16, 2019
hvr added a commit to hvr/cabal that referenced this issue Jan 16, 2019
…encoding

hvr added a commit to hvr/cabal that referenced this issue Mar 3, 2019
hvr added a commit to hvr/cabal that referenced this issue Mar 3, 2019
…encoding

@hvr hvr closed this as completed in #5804 Mar 4, 2019