hakyll can't handle unicode? #614

Closed
davidspies opened this issue Mar 4, 2018 · 9 comments

@davidspies

Trying to build the default (hakyll-init) site I get this:

$ stack exec site build
Initialising...
  Creating store...
  Creating provider...
  Running rules...
Checking for out-of-date items
Compiling
  [ERROR] ./about.rst: hGetContents: invalid argument (invalid byte sequence)

It looks like it's a problem with the unicode character (it builds when I remove it):
The line:
1. Amamus Unicode 碁

Hakyll and site built with nix and stack --resolver=lts-10.7

@cbzehner

cbzehner commented Mar 4, 2018

I don't think Unicode is the root cause here. I have built Unicode pages successfully with Hakyll.

Could you check the encoding of your about.rst file? It might be in some other encoding like Big5?

@jaspervdj
Owner

This is a readFile exception that most commonly pops up if your system locale is set to something like ISO 8859-1 rather than UTF-8. On Linux this usually means that LANG / LC_ALL is set incorrectly.
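
To illustrate the mechanism (an editorial sketch, not part of the original reply): the UTF-8 bytes of 碁 are not a valid ASCII sequence, which is roughly the situation GHC is in when it decodes the file under a non-UTF-8 locale. The snippet uses `iconv` as a stand-in for GHC's locale-based decoder, and `decodes_as` is a hypothetical helper name.

```shell
# Sketch: iconv stands in for GHC's locale-based decoder.
# decodes_as is a hypothetical helper, not part of Hakyll or GHC.
decodes_as() {
  # $1 = encoding to assume for the bytes arriving on stdin.
  # Succeeds only if every byte sequence is valid in that encoding.
  iconv -f "$1" -t UTF-8 >/dev/null 2>&1
}

# The same three bytes (E7 A2 81, the UTF-8 form of 碁), read two ways:
if printf '\347\242\201' | decodes_as UTF-8; then
  echo "assuming UTF-8: decodes"
fi
if ! printf '\347\242\201' | decodes_as ASCII; then
  echo "assuming ASCII: invalid byte sequence"
fi
```

The file itself is the same in both cases; only the encoding the reader assumes changes.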

@davidspies
Author

$ file -i about.rst 
about.rst: text/plain; charset=utf-8

@davidspies
Author

$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

However I built with nix so maybe it wipes that out?

@davidspies
Author

davidspies commented Mar 4, 2018

Oh, yeah:

$ stack --nix exec -- locale
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

Any idea how to deal with that?
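
As an editorial aside, the same check can be scripted. This is a minimal sketch that assumes the `locale` utility is available in the environment being tested; `charset_is_utf8` is a hypothetical helper name.

```shell
# Sketch: report whether the current environment decodes text as UTF-8.
# charset_is_utf8 is a hypothetical helper name.
charset_is_utf8() {
  # $1 = charmap name, as printed by `locale charmap`
  case "$1" in
    UTF-8|utf-8|UTF8|utf8) return 0 ;;
    *) return 1 ;;
  esac
}

current=$(locale charmap 2>/dev/null)
if charset_is_utf8 "$current"; then
  echo "charmap is $current: UTF-8 input should decode"
else
  echo "charmap is ${current:-unknown}: non-ASCII input will likely fail to decode"
fi
```

Running `stack --nix exec -- locale charmap` shows the value the build itself sees, which may differ from the interactive shell, as the two `locale` outputs in this thread do.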

@jaspervdj
Owner

I'm not very experienced with Nix but just googling for "nix lang lc_all" led me to NixOS/nix#318 which provides a solution near the bottom of the thread. Closing the issue as it's not really related to Hakyll.

@teto

teto commented Aug 8, 2018

I've just had this. The problem stems from stack using nix in --pure mode by default (i.e., with an empty environment). A global fix seems to be writing a ~/.stack/config.yaml along these lines:

# see https://docs.haskellstack.org/en/stable/nix_integration/#configuration
# for configuration options
nix:
  enable: true
  # true by default but will cause problems such as https://github.com/jaspervdj/hakyll/issues/614
  pure: false
  # packages: [ zlib ] # zlib for hakyll

@dunnl

dunnl commented Aug 8, 2018

The problem is that operations like hGetContents require information about the locale settings and generated locale files. It's not super clear from the other thread, but in the Nix world this means having something like LOCALE_ARCHIVE = "${pkgs.glibcLocales}/lib/locale/locale-archive" in the environment, besides the regular LC / LANG variables.

LOCALE_ARCHIVE is a Nixpkgs-specific patch to glibc so that the locale archive can be stored in /nix/store, see this patch. For a real example see this blog post. Note that you do not need to put export statements in your builder script like the author does; it suffices to pass the above definition as an attribute in the mkDerivation parameters.

That's why pure: false works: LOCALE_ARCHIVE is set in your regular environment when you install Nix or run NixOS.

Some of this isn't exquisitely well-documented right now, unfortunately.
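
As an editorial sketch of the two conditions described above, the following helper checks whether `LOCALE_ARCHIVE` points to a readable file and whether `LANG` names a UTF-8 locale. The name `locale_env_problems` is hypothetical, and the check is deliberately simplified: it looks only at `LANG` and ignores `LC_ALL` and `LC_CTYPE`, which can override it.

```shell
# Sketch: check the two things described above.
# locale_env_problems is a hypothetical helper; it prints one line per problem.
locale_env_problems() {
  # $1 = value of LOCALE_ARCHIVE, $2 = value of LANG (either may be empty)
  if [ -z "$1" ]; then
    echo "LOCALE_ARCHIVE is not set"
  elif [ ! -r "$1" ]; then
    echo "LOCALE_ARCHIVE points to an unreadable file: $1"
  fi
  case "$2" in
    *.UTF-8|*.utf8) ;;  # looks like a UTF-8 locale name
    *) echo "LANG does not name a UTF-8 locale: '$2'" ;;
  esac
}

# Check the environment this script runs in; no output means no problem found.
locale_env_problems "${LOCALE_ARCHIVE:-}" "${LANG:-}"
```

Run inside the pure Nix environment, this would be expected to report both problems, matching the empty `LANG` shown earlier in the thread.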

@teto

teto commented Aug 9, 2018

Thanks for the clarification. I knew what --pure was, but I was surprised that it is the default for stack, since even in nix itself it's not advertised much. There must be a good reason for it.

teto added a commit to teto/home that referenced this issue Jan 8, 2019
... adding a stack.yaml to avoid the kind of problem I had with hakyll, along the lines of jaspervdj/hakyll#614:
  [ERROR] ./about.rst: hGetContents: invalid argument (invalid byte sequence)

because nix did not have LANG in the env