process substitution / anonymous named pipes #66

warpfork · 2012-07-07T08:50:31Z

I have a situation where I want to call one program which can only accept a certain kind of input via a file which must be named in the arguments, but I want to feed it content generated from another program.

In other words, I have a situation that would be expressed in bash with a process substitution like this:

 tail -f <(echo "generated")

(Relevant: https://en.wikipedia.org/wiki/Process_substitution )

In python, I can solve this with a tempfile fairly easily.

A step better: I can also solve it fairly easily with a named pipe created by a mkfifo call, which keeps the data in memory instead of needlessly writing it to disk.

However, that still leaves something to be desired; I have to pick a name for my fifo, and I have to remove it again when I'm done. If I get SIGKILL, I leave a dangling fifo hanging around on my filesystem. What would really be excellent is if I could tap into the magic stuff in the /proc/$pid/fd and /dev/fd/$fd areas common in a linux world... that would give me a system where the kernel itself is functioning as my cleanup.
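For reference, the mkfifo approach described above might look roughly like the sketch below. It is only an illustration, not something pbs provides: `run_with_fifo` and the fifo file name are made-up names, and it assumes Python 3 on a POSIX system and a child command that really does open the path it is given (if it never does, the writer thread blocks forever).

```python
import os
import shutil
import subprocess
import tempfile
import threading


def run_with_fifo(argv_prefix, data):
    """Run argv_prefix + [fifo_path], feeding `data` (bytes) through a named pipe.

    Hypothetical helper for illustration only.
    """
    tmpdir = tempfile.mkdtemp()  # private directory, so the fifo name cannot collide
    fifo_path = os.path.join(tmpdir, "input.fifo")
    os.mkfifo(fifo_path)

    def feed():
        # Opening a fifo for writing blocks until some process opens it for reading.
        with open(fifo_path, "wb") as writer:
            writer.write(data)

    thread = threading.Thread(target=feed)
    thread.start()
    try:
        return subprocess.check_output(list(argv_prefix) + [fifo_path])
    finally:
        thread.join()
        # This cleanup never runs if the process receives SIGKILL,
        # which is exactly the dangling-fifo problem described above.
        shutil.rmtree(tmpdir)
```

Something like `run_with_fifo(["cat"], b"generated\n")` should hand back the same bytes; the explicit naming and cleanup steps are the part the rest of this issue is trying to get rid of.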

That example of process substitution in bash up above does something clever like that. If you run that example and then look at what actually happened with ps, you'll see something like this:

tail -f /dev/fd/63

Bash created a fifo somewhere where I don't have to worry about it (I think it's somewhere under /proc/ so it just goes away when the processes die?); stdout of the echo writes into the fifo and the reading end of the fifo is made into file descriptor 63 for tail. And then the "/dev/fd/63" part is magic that happens to be a name for the fifo that is fd 63 to the current process.

What would really be excellent is if I could tap into the same level of magic up in the python world.

In the course of writing this, I ended up realizing that I can use "/dev/fd/0" as an argument to get a program to read its own standard in as a file, and since I don't happen to be using stdin already in my current case, this solves my immediate problem. A more general solution would still be excellent, though, and for that we would need the ability to pass arbitrarily numbered file descriptors into child processes, instead of being limited to stdin/stdout/stderr aka 0/1/2.

Also, I'm not sure how portable the "/dev/fd/$fd" stuff is; I feel a little uncomfortable hardcoding that in, and bash takes care of it for me, but I have no idea how I'd go about finding out in a cross platform way what the location is for the magic filenames-to-selfprocess-file-descriptors.
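To make the "arbitrarily numbered file descriptors" idea concrete, here is a minimal sketch using only the standard library rather than pbs. It assumes Python 3.2 or later (for the `pass_fds` argument of `subprocess.Popen`) and a platform where `/dev/fd/N` is available, as it is on Linux and macOS. `run_reading_fd` is a hypothetical name, and the sketch is only suitable for small inputs because the parent writes everything before it starts reading the child's output.

```python
import os
import subprocess


def run_reading_fd(argv_prefix, data):
    """Run argv_prefix + ["/dev/fd/N"], where N is the read end of an anonymous pipe.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()
    try:
        # pass_fds keeps read_fd open under the same number in the child;
        # the write end is not listed, so the child does not inherit it.
        proc = subprocess.Popen(
            list(argv_prefix) + ["/dev/fd/%d" % read_fd],
            stdout=subprocess.PIPE,
            pass_fds=(read_fd,),
        )
    finally:
        os.close(read_fd)  # the parent has no use for the read end
    with os.fdopen(write_fd, "wb") as writer:
        writer.write(data)  # closing the write end is what gives the child EOF
    out, _ = proc.communicate()
    return out
```

Nothing is created on the filesystem here, so there is nothing to clean up: once both processes are gone, the kernel discards the pipe.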


amoffat · 2012-07-10T23:51:09Z

I kind of follow what you're getting at. Could you write up a few pbs example use-cases here of how you envision it? It will become more clear to me then.

Are you thinking of something like this?

import pbs
pbs.tail(pbs.cat("/tmp/test", _out="fifo"))

warpfork · 2012-07-11T05:37:33Z

Okay, so suppose for example I want to do something like this in bash (this was my original use case):

git config -f <(curl http://raw.github.com/heavenlyhash/projectWhatever/master/.gitmodules) -l

Now I want to do that in PBS, and it's a little tough, but what I ended up hacking into being was this:

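import urllib
from contextlib import closing
from pbs import git

# githubRawUrl is defined elsewhere; urllib.urlopen is the Python 2 spelling
with closing(urllib.urlopen(githubRawUrl + "/.gitmodules")) as f:
    remoteModulesStr = f.read()
git.config("-f", "/dev/fd/0", "-l", _in=remoteModulesStr)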

And that works, because /dev/fd/0 is already a magic file on my system that refers to the standard input of whichever process opens it.

In the more general case though, what if stdin is already used by that process for something special? Or I want to do

diff <(curl http://thingy.com/resource1)  <(curl http://thingy.com/resource2)

Now that trick with stdin won't work; I need other channels, or several of them.

To see what bash is doing here, you can do something like this:

diff <(tail -f /dev/null) <(tail -f /dev/null) &
ps -f | grep diff

...and you'll see something like "diff /dev/fd/63 /dev/fd/62". Possibly exactly that.

So, the most direct way to expose this from pbs might look like this:

pbs.diff("/dev/fd/63", "/dev/fd/62", __63=inMemStrA, __62=inMemStrB)

That's a little ugly. Cooler would be maybe more like...

pbs.diff(pbs.stream(inMemStrA), pbs.stream(inMemStrB));

Actually, coolest might be somewhere in the middle. Gimme a syntax to pass arbitrarily numbered channels in and out (and then _in, _out, and _err become mere special cases of that system and are synonymous to __0, __1, and __2), just in case I'm interfacing with some crazy program that uses the higher numbers. Then also have a wrapper object that makes the pbs command invocation step aware that there's something here that should be shunted via an anonymous pipe, and there hide all the numbers (and more importantly, the /dev/ shenanigans) from the library user.
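A rough sketch of what such a wrapper object could do under the hood, written against plain `subprocess` rather than pbs. Everything named here (`Stream`, `run_with_streams`, `_feed`) is hypothetical and is not pbs API. It assumes Python 3.3 or later and a system that provides `/dev/fd/N`. Each `Stream` argument gets its own anonymous pipe, and a thread per pipe does the writing so that several inputs can be fed while the output is being read.

```python
import os
import subprocess
import threading


class Stream:
    """Hypothetical marker: feed these bytes to the child through its own pipe."""

    def __init__(self, data):
        self.data = data


def _feed(fd, data):
    # Write everything, then close the fd so the reader sees EOF.
    try:
        with os.fdopen(fd, "wb") as writer:
            writer.write(data)
    except BrokenPipeError:
        pass  # the child exited without reading all of its input


def run_with_streams(cmd, *args):
    """Run cmd, replacing each Stream argument with a /dev/fd/N path."""
    argv, read_fds, writers = [cmd], [], []
    for arg in args:
        if isinstance(arg, Stream):
            read_fd, write_fd = os.pipe()
            read_fds.append(read_fd)
            argv.append("/dev/fd/%d" % read_fd)
            writers.append(threading.Thread(target=_feed, args=(write_fd, arg.data)))
        else:
            argv.append(arg)
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, pass_fds=read_fds)
    for read_fd in read_fds:
        os.close(read_fd)  # only the child needs the read ends
    for thread in writers:
        thread.start()
    out, _ = proc.communicate()
    for thread in writers:
        thread.join()
    return proc.returncode, out
```

With this, the diff case would be written as `run_with_streams("diff", Stream(inMemStrA), Stream(inMemStrB))`, and the fd numbers and /dev/ paths stay hidden from the caller.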

amoffat · 2012-07-17T01:51:43Z

I think I'm going to hold off on this one for right now, but it will go on the roadmap. The dev branch desperately needs to get finished up and merged to master, and it has a complete rewrite of subprocess.Popen in it. So doing this feature might be easier on the dev branch.

brentp · 2013-04-17T01:54:44Z

Is there a newer way to do process substitution now, since this issue was originally opened?

amoffat · 2013-04-17T02:23:10Z

@brentp negative

StyXman · 2013-07-28T21:42:21Z

for the record, what bash does (as per what strace -ff shows):

[...]
pipe([3, 4])                            = 0
fcntl(63, F_GETFD)                      = -1 EBADF (Bad file descriptor)
dup2(3, 63)                             = 63
close(3)                                = 0
[...]
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe4425f59d0) = 16235
[...]
execve("/usr/bin/tail", ["tail", "-f", "/dev/fd/63"], [/* 40 vars */]) = 0

and on the echo side:

dup2(4, 1)                              = 1
[...]
write(1, "generated\n", 10)             = 10

So it's the same as setting up a pipe (|), but then using an arbitrary fd (63) to dup2() the reading end of it, and using /dev/fd/63 as the input file for tail (/dev/fd points to /proc/self/fd, where /proc/self points to the subdir in /proc for the current process). This should not be very difficult to reproduce in sh.
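The same sequence of calls can be approximated directly from Python with the `os` module. This is a sketch under stated assumptions, not how bash or sh implements it: `run_with_dev_fd` is a made-up name, fd 63 is simply copied from the trace and assumed to be unused, and the order differs slightly from bash in that the dup2() happens in the child after the fork rather than before it. It assumes Python 3.4 or later on a system with `/dev/fd`.

```python
import os

HIGH_FD = 63  # the number bash used in the trace; assumed to be unused here


def run_with_dev_fd(argv_prefix, data):
    """Fork, exec argv_prefix + ["/dev/fd/63"], and write `data` into the pipe.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()              # pipe([3, 4])
    pid = os.fork()                            # clone(...)
    if pid == 0:
        # Child: park the read end on HIGH_FD, then become the consumer.
        try:
            os.close(write_fd)
            if read_fd != HIGH_FD:
                os.dup2(read_fd, HIGH_FD)      # dup2(3, 63); the duplicate survives exec
                os.close(read_fd)              # close(3)
            else:
                os.set_inheritable(read_fd, True)
            path = "/dev/fd/%d" % HIGH_FD
            os.execvp(argv_prefix[0], list(argv_prefix) + [path])  # execve(...)
        finally:
            os._exit(127)                      # only reached if the exec failed
    # Parent: plays the role of echo and writes into the other end of the pipe.
    os.close(read_fd)
    with os.fdopen(write_fd, "wb") as writer:
        writer.write(data)                     # write(1, "generated\n", 10)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)              # assumes the child exited normally
```

For example, `run_with_dev_fd(["grep", "-q", "generated"], b"generated\n")` runs `grep -q generated /dev/fd/63` and returns grep's exit code.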

ecederstrand · 2021-05-24T16:11:15Z

@amoffat This is a really old suggestion and hasn't seen any support from others for the last many years. Maybe we should just close it without fixing?

fracai · 2021-05-24T17:05:51Z

I for one would still be interested in this. Granted, my use case is currently handled by just using sh to call a script that handles the named pipes, but in the interest of feature completeness I think it'd be useful to at least keep this on the roadmap.

amoffat · 2021-05-25T01:09:13Z

@ecederstrand I think I tend to agree with you. It is a cool idea, and it seems like when people need it, it would be very convenient, but it also seems like people don't need it very often. I'll close it and we can re-open if more momentum builds behind it.

StyXman mentioned this issue Sep 10, 2013

process substitution StyXman/ayrton#7

Open

amoffat added the feature label Dec 30, 2014

amoffat mentioned this issue Feb 18, 2016

Process Substitution Support? #300

Closed

amoffat added this to the release 2.0.0 milestone Apr 26, 2020

amoffat closed this as completed May 25, 2021
