Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The startProcess API is incomplete #7999

Open
zah opened this issue Jun 8, 2018 · 6 comments
Open

The startProcess API is incomplete #7999

zah opened this issue Jun 8, 2018 · 6 comments

Comments

@zah
Copy link
Member

zah commented Jun 8, 2018

At the moment startProcess allows you to provide flags such as poParentStreams and poStdErrToStdOut to control the behaviour of standard input and output stream of the new process.

This limits the usage to two possible outcomes:

  1. New pipes are created inside startProcess for all of the streams.
  2. All streams are inherited.

There are no options for inheriting only some of the streams or passing in already created pipes to be used.

The API can be further enhanced by introducing the notion of process groups, which allows you to manage indirect sub-processes (e.g. when launching a script that will execute multiple sub-commands):
https://www.boost.org/doc/libs/1_65_0/doc/html/boost_process/tutorial.html#boost_process.tutorial.group

@yglukhov
Copy link
Member

yglukhov commented Jun 8, 2018

Another useful consideration is async friendliness, that is, ensuring all relevant handles are accessible to be wrapped into async-flavoured api.

@FedericoCeratto
Copy link
Member

Partially related: execCmd / execCmdEx also have a different API from startProcess and execProcess , e.g. they don't support env and a timeout.

@timotheecour
Copy link
Member

timotheecour commented Dec 14, 2018

/cc @zah
as mentioned here (#9953 (comment)) I like D's API for std.process : https://dlang.org/phobos/std_process.html, which is organized into low-level, mid-level, high-level; Nim doesn't have the lowest level one, and has a clunkier API that is less flexible (eg: we can't redirect to a File, we can't selectively redirect , etc) (and, well, has issues like #9953 and #9975)

I think we could take inspiration from that design, but with some modifcations to take advantage of Nim language.
It's organized into (simplifying a bit):

# lowest level: just return Pid (containing process id, etc); take File inputs (defaulting to stdin/stdout/stderr), plus the rest
@trusted Pid spawnProcess(scope const(char[])[] args, File stdin = std.stdio.stdin, File stdout = std.stdio.stdout, File stderr = std.stdio.stderr, const string[string] env = null, Config config = Config.none, scope const char[] workDir = null); 

# middle level: return pipes
@safe ProcessPipes pipeProcess(scope const(char[])[] args, Redirect redirect = Redirect.all, const string[string] env = null, Config config = Config.none, scope const(char)[] workDir = null); 

# high level: return (exitCode, output)
@trusted auto execute(scope const(char[])[] args, const string[string] env = null, Config config = Config.none, size_t maxOutput = size_t.max, scope const(char)[] workDir = null); 

in Nim this could be:

proc spawnProcess(cmd: string, args: seq[string] = @[], stdinFile = stdin, stdoutFile = stdout, stderrFile = stderr, env: Table[string,string] = nil, config =initConfig(), workDir = ""): Pid
proc pipeProcess(cmd: string, args: seq[string] = @[], redirect: Redirect,  env: Table[string,string] = nil, config =initConfig(), workDir = ""): Pipes
proc execute(cmd: string, args: seq[string] = @[], env: Table[string,string] = nil, config =initConfig(), workDir = ""): (exitCode: int, output: string, error: string)

@timotheecour
Copy link
Member

timotheecour commented Nov 7, 2020

if you've ever seen garbled cgen errors like this:

/Users/timothee/git_clone/nim/timn/build/nimcache/@mt11253.nim.c:94:2: error: use of /Users/timothee/git_clone/nim/timn/build/nimcache/@mt11253b.nim.cundeclared:87:2: error:  identifieruse  'Foo'of undeclared
 identifier 'Foo'
        Foo a;        Foo a;

        ^        ^

/Users/timothee/git_clone/nim/timn/build/nimcache/@mt11253b.nim.c:89:22: error: use of undeclared identifier 'a'
        nimZeroMem((void*)(&a), sizeof(Foo));

it's related to this.

the problem is that poParentStreams is not so good when dealing with multiple child processes (execProcesses), as their stdout/stderr get interleaved (especially when partial lines are written), causing hard to interpret logs.

the fix is to implement proper stream capture, and a user supplied callback that handles what to do with each captured line/piece of stdout/stderr stream.
The way to do that correctly is the same way as to handle separate stdout/stderr streams: via the selectors API (which uses platform specific API's eg kqueue on OSX); each time a pipe gets data it calls the user supplied callback with that data. That way user can also attribute to which stream/child the data came from. It also doesn't block, nor cause any memory hog since it's up to user callback to handle.

minimized example

# main.nim:
when true:
  import osproc, strformat, random, os
  proc main()=
    let file = currentSourcePath.parentDir / "t11254b.nim"
    let outFile="/tmp/z01"
    const nim = getCurrentCompilerExe()
    doAssert execShellCmd(fmt"{nim} c -o:{outFile} {file}")==0
    var cmds: seq[string]
    let n = 3
    for i in 0..<n:
      cmds.add outFile
    let res = execProcesses(cmds, {poUsePath, poParentStreams}) # garbled
    # let res = execProcesses(cmds, {poUsePath, poParentStreams}, n=1) # ok
    doAssert res == 0
  main()

# t11254b.nim:
when true:
  proc main=
    for i in 0..<3:
      stderr.write "{"
      for j in 0..<10:
        stderr.write $j
      stderr.write "}\n"
  main()

output with n=1:

{0123456789}
{0123456789}
{0123456789}
{0123456789}
{0123456789}
{0123456789}
{0123456789}
{0123456789}
{0123456789}

output with n = countProcessors():

{{{000111222333444555666777888999}
}
}
{{{00011122233344455566677788899}
9}
{}
{0{010121232343454565676787898}
99}
}

@Araq
Copy link
Member

Araq commented Nov 9, 2020

Well I'm still waiting for your Nimble package that offers an osproc alternative. (The rule should still be Nimble first, fusion second.)

@disruptek
Copy link
Contributor

@leorize ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants