The character encoding of GITenberg books should always be UTF-8 (aka Unicode). Project Gutenberg source files were often created before Unicode existed and need to be converted.
If you get an error like the following, you will need to do this step:
pandoc: Cannot decode byte '\xb0': Data.Text.Encoding.Fusion.streamUtf8: Invalid UTF-8 stream
To re-encode a file as UTF-8, you need to use the iconv
commandline tool.
iconv -c -t "UTF-8" < input.html > output.asciidoc
For instance, if 164-h/164-h.htm