Clarify Binary properties of the streams #82

@ndmitchell

The process library is largely silent on whether it gets binary or text handles to the process input/output. The one place where it is explicit is runInteractiveCommand, and there it seems to be wrong (as far as I can tell). I think this confusion has reared its head as ndmitchell/hoogle#194, commercialhaskell/stack#2582, #59 and #36.

The command ghc-pkg dump is documented by GHC HQ to return UTF-8 content, and to a first approximation, let's believe them (I have no reason to doubt it). If we install QuickCheck then the copyright field will include Björn Bringert. On Linux, if I set my $LANG variable to the empty string, then run the code:

import System.Process
import System.IO

main = do
    (inp,out,err,pid) <- runInteractiveCommand "ghc-pkg dump"
    hClose inp
    src <- hGetContents out
    print $ length src

With the command:

runhaskell -package=process-1.4.3.0 ProcessTest.hs

I get the output:

ProcessTest.hs: fd:13: hGetContents: invalid argument (invalid byte sequence)

If I leave the $LANG variable as normal (e.g. en_GB.UTF-8) then it works. That suggests the handles are impacted by the $LANG variable, and thus they aren't really binary. If I modify the code to add:

    hSetBinaryMode out True

Then it works once more.
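
If the goal is to read the output as text rather than raw bytes, an alternative to binary mode is to set the encoding on the handle explicitly, so decoding no longer depends on $LANG. The following is only a sketch under that assumption: readUtf8Output is a hypothetical helper name, not something the process library provides, and it uses createProcess rather than runInteractiveCommand. It also prints what hGetEncoding reports for the fresh handle, which is one way to check whether a handle is in binary mode (Nothing) or text mode (Just an encoding).

```haskell
import System.IO
import System.Process

-- Hypothetical helper (not part of the process library): run a shell
-- command and decode its stdout as UTF-8, independent of $LANG.
readUtf8Output :: String -> IO String
readUtf8Output cmd = do
    (_, Just out, _, pid) <- createProcess (shell cmd) { std_out = CreatePipe }
    -- Nothing means binary mode; Just enc is the text encoding in use.
    enc <- hGetEncoding out
    hPutStrLn stderr ("encoding before override: " ++ show enc)
    -- Override whatever encoding the handle was given by default.
    hSetEncoding out utf8
    src <- hGetContents out
    -- Force the lazy contents before waiting for the process to exit.
    _ <- length src `seq` waitForProcess pid
    return src

main :: IO ()
main = do
    src <- readUtf8Output "ghc-pkg dump"
    print (length src)
```

Unlike hSetBinaryMode out True, which yields one Char per byte, this decodes multi-byte sequences such as the ö in Björn into single characters, so the reported length may differ between the two approaches.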

It would be nice if there were a general statement at the top of the documentation about whether the handles are in binary mode or not, and it would be nice if the statements that are there were accurate.
