## 23. Process Substitution

Piping the stdout of a command into the stdin of another is a powerful technique.  
But what if you need to pipe the stdout of *multiple* commands?  
This is where process substitution comes in.

Process substitution feeds the output of a process (or processes) into the stdin of another process.

#### Template

Command list enclosed within parentheses
##### >(command_list)
##### <(command_list)
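
As a minimal sketch of the input form `<(command_list)`: each substitution expands to a filename that the outer command opens like an ordinary file, so one command can read the output of several others. The sample values are made up for illustration.

```shell
# Each <(...) is replaced by a filename (such as /dev/fd/63)
# that paste opens and reads like a regular file.
# The two printf commands stand in for any commands producing output.
merged=$(paste -d, <(printf '%s\n' 1 2 3) <(printf '%s\n' 4 5 6))
echo "$merged"
```

Here `paste` receives two filename arguments, one per substitution, and joins their lines pairwise.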

#### About the device file directory /dev/fd

In [1]:
# /dev/fd is a symbolic link
# that points to the /proc/self/fd directory
ls -l /dev/fd

lrwxrwxrwx 1 root root 13  6月 16 04:35 /dev/fd -> /proc/self/fd


In [2]:
ls -l /proc/self/fd

total 0
lrwx------ 1 liheyi liheyi 64  8月 31 15:14 0 -> /dev/pts/3
lrwx------ 1 liheyi liheyi 64  8月 31 15:14 1 -> /dev/pts/3
lrwx------ 1 liheyi liheyi 64  8月 31 15:14 2 -> /dev/pts/3
lr-x------ 1 liheyi liheyi 64  8月 31 15:14 3 -> /proc/95437/fd


In [4]:
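# The symbolic links 0, 1, and 2 all point to the pseudo-terminal /dev/pts/3.
# Here 0, 1, and 2 can presumably be read as the three files
# that are opened automatically when a user logs in on a terminal:
# standard input  -- 0
# standard output -- 1
# standard error  -- 2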
file /dev/fd/0 /dev/fd/1 /dev/fd/2

/dev/fd/0: symbolic link to `/dev/pts/3' 
/dev/fd/1: symbolic link to `/dev/pts/3' 
/dev/fd/2: symbolic link to `/dev/pts/3' 


In [5]:
ls -l /dev/pts/

total 0
crw--w---- 1 root   tty  136, 0  8月 26 15:17 0
crw--w---- 1 liheyi tty  136, 1  8月 31 15:17 1
crw------- 1 liheyi tty  136, 2  8月 31 15:06 2
crw--w---- 1 liheyi tty  136, 3  8月 31 15:16 3
crw--w---- 1 root   tty  136, 5  6月 16 04:35 5
crw--w---- 1 root   tty  136, 6  6月 16 04:35 6
crw--w---- 1 root   tty  136, 8  6月 16 04:35 8
c--------- 1 root   root   5, 2  6月 16 04:35 ptmx


In [6]:
tty

/dev/pts/3


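Standard input, standard output, and standard error point to a different terminal depending on where you look:  
all three are connected to whichever terminal the command is run in.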

In [7]:
echo >(true)

/dev/fd/63


In [8]:
echo <(true)

/dev/fd/63


In [9]:
echo >(true) <(true)

/dev/fd/63 /dev/fd/62
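
The filenames printed above are handles to pipes rather than regular files. A small sketch to check this, using a hypothetical helper named `describe_path` that is not part of the original text:

```shell
# describe_path is a hypothetical helper for illustration only.
# It reports whether the filename it receives refers to a pipe.
describe_path() {
    if [[ -p $1 ]]; then
        echo "pipe"
    else
        echo "not a pipe"
    fi
}

kind_subst=$(describe_path <(true))   # filename supplied by process substitution
kind_null=$(describe_path /dev/null)  # a character device, for contrast
echo "<(true):   $kind_subst"
echo "/dev/null: $kind_null"
```

This is why a process substitution can be read or written only once and cannot be seeked like a file on disk.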


In [11]:
wc <(cat /usr/share/dict/american-english)

  99171   99171  938848 /dev/fd/63


In [13]:
grep script /usr/share/dict/american-english | wc

     67      67     845


In [14]:
wc <(grep script /usr/share/dict/american-english)

     67      67     845 /dev/fd/63
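
Note that `wc` prints the substituted filename (`/dev/fd/63`) when the process substitution is passed as an argument. If only the counts are wanted, one option is to redirect from the substitution so that `wc` reads stdin instead. A sketch with made-up input:

```shell
# Redirecting from <(...) makes wc read standard input,
# so no filename appears in its output.
line_count=$(wc -l < <(printf 'alpha\nbeta\ngamma\n'))
echo "lines: $line_count"
```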


Process substitution can compare the output of two different commands,   
or even the output of different options to the same command.

In [18]:
comm <(ls -l /home/liheyi/jupyter/linux) <(ls -al /home/liheyi/jupyter/linux)

total 60
comm: file 1 is not in sorted order
-rw-rw-r-- 1 liheyi liheyi 14387  8月 26 18:04 linux_console.ipynb
-rw-rw-r-- 1 liheyi liheyi 10451  8月 26 18:04 linux_startup_process.ipynb
-rw-rw-r-- 1 liheyi liheyi 30854  8月 26 17:17 The_difference_of_tty_pty_pts_tts.ipynb
	total 72
comm: file 2 is not in sorted order
	drwxrwxr-x 3 liheyi liheyi  4096  8月 26 18:04 .
	drwxrwxr-x 7 liheyi liheyi  4096  8月 31 09:49 ..
	drwxr-xr-x 2 liheyi liheyi  4096  8月 26 17:25 .ipynb_checkpoints
	-rw-rw-r-- 1 liheyi liheyi 14387  8月 26 18:04 linux_console.ipynb
	-rw-rw-r-- 1 liheyi liheyi 10451  8月 26 18:04 linux_startup_process.ipynb
	-rw-rw-r-- 1 liheyi liheyi 30854  8月 26 17:17 The_difference_of_tty_pty_pts_tts.ipynb
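
`comm` expects sorted input, which is why the run above reports `file 1 is not in sorted order`. One possible fix is to sort inside each substitution. The sketch below uses made-up word lists rather than the directory listings above, and exports `LC_ALL=C` so that `sort` and `comm` agree on ordering.

```shell
export LC_ALL=C

# -12 suppresses columns 1 and 2, leaving only the lines common to both inputs.
common=$(comm -12 <(printf '%s\n' pear apple fig | sort) \
                  <(printf '%s\n' fig kiwi apple | sort))
echo "$common"
```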


Process substitution can compare the contents of two directories,  
to see which filenames are in one but not the other.

In [19]:
# How to read the output:
# this is mainly used to compare the filenames in two directories.

# 23d22
#< data_box.config
# means that data_box.config exists only in config1.

# 30c29
#< data_common.config
#---
#> data_common.config.bak
# means that config1 contains data_common.config,
# while config2 contains data_common.config.bak.

# 109a109
# > data_ranking.config
# means that data_ranking.config exists only in config2.

diff <(ls config1) <(ls config2)

23d22
< data_box.config
30c29
< data_common.config
---
> data_common.config.bak
109a109
> data_ranking.config

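The directory comparison can be reproduced without the original `config1` and `config2` contents. The following is a sketch only: the temporary directories and file names are invented for illustration.

```shell
# Build two throwaway directories (all names are illustrative).
workdir=$(mktemp -d)
mkdir "$workdir/config1" "$workdir/config2"
touch "$workdir/config1/a.config" "$workdir/config1/b.config"
touch "$workdir/config2/b.config" "$workdir/config2/c.config"

# diff exits with status 1 when its inputs differ, so "|| true"
# keeps the script going if "set -e" is in effect.
report=$(diff <(ls "$workdir/config1") <(ls "$workdir/config2") || true)
echo "$report"

rm -r "$workdir"
```

Lines beginning with `<` name files found only in the first directory, and lines beginning with `>` name files found only in the second.
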

Some other uses of process substitution:

In [20]:
#  Read a list of random numbers from /dev/urandom,
#+ process with "od"
#+ and feed into stdin of "read" . . .
#  From "insertion-sort.bash" example script.
#  Courtesy of JuanJo Ciarlante.
read -a list < <( od -Ad -w24 -t u2 /dev/urandom )
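
The command above reads from `/dev/urandom`, which never ends, and `read` consumes only the first line of `od` output. A bounded variant, as a sketch: the `-An` and `-N 8` options are choices made here for illustration and are not part of the original script.

```shell
# -An drops the address column, -N 8 stops od after 8 bytes,
# and -t u2 prints those bytes as four unsigned 2-byte integers on one line.
read -r -a list < <(od -An -N 8 -t u2 /dev/urandom)

echo "got ${#list[@]} numbers: ${list[*]}"
```

Because `read` is fed by redirection rather than a pipe, the array `list` is set in the current shell and remains available afterwards.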



In [None]:
PORT=6881   # bittorrent
# Scan the port to make sure nothing nefarious is going on.
netcat -l $PORT | tee >(md5sum - > mydata-orig.md5) |
gzip | tee >(md5sum - | sed 's/-$/mydata.lz2/' > mydata-gz.md5) > mydata.gz

# Check the decompression:
gzip -d < mydata.gz | md5sum -c mydata-orig.md5
# The MD5sum of the original checks stdin and detects compression issues.

In [23]:
# Same as ls -ltr | cat
cat <(ls -ltr)

total 624
-rw-rw-r-- 1 liheyi liheyi   5546  6月 30 17:40 command_line_shortcut.ipynb
-rw-rw-r-- 1 liheyi liheyi  35745  8月  2 18:11 variables_and_parameters.ipynb
-rw-rw-r-- 1 liheyi liheyi   7252  8月  3 09:43 exit_and_exit_status.ipynb
-rw-rw-r-- 1 liheyi liheyi  55789  8月  3 12:18 test.ipynb
-rw-rw-r-- 1 liheyi liheyi  31133  8月 17 14:37 quoting.ipynb
-rw-rw-r-- 1 liheyi liheyi  28335  8月 17 15:24 operations_and_related_topics.ipynb
-rw-rw-r-- 1 liheyi liheyi 122146  8月 18 12:39 another_look_at_variables.ipynb
-rw-rw-r-- 1 liheyi liheyi  67279  8月 18 17:45 manipulate_variables.ipynb
-rw-rw-r-- 1 liheyi liheyi  19707  8月 19 10:49 command_substitution.ipynb
-rw-rw-r-- 1 liheyi liheyi 117471  8月 19 17:57 loops_and_branches.ipynb
-rw-rw-r-- 1 liheyi liheyi   4849  8月 23 15:07 arithmetic_expansion.ipynb
-rw-rw-r-- 1 liheyi liheyi   4253  8月 31 14:58 restricted_shells.ipynb
-rw-rw-r-- 1 liheyi liheyi  65223  8月 31 14:58 IO_redirection.ipynb
-rw-rw-r-- 1 liheyi liheyi  16004  

In [28]:
# Lists all the files in the 2 directories, and sorts by filename.
# Note that two (count 'em) distinct commands are fed to 'sort'.
sort -k9 <(ls -l ~/jupyter/bash/basics/) <(ls -l ~/jupyter/bash/commands/file_manage/file/data)

total 228
total 752
-rw-rw-r-- 1 liheyi liheyi 122146  8月 18 12:39 another_look_at_variables.ipynb
-rw-rw-r-- 1 liheyi liheyi   4849  8月 23 15:07 arithmetic_expansion.ipynb
-rw-rw-r-- 1 liheyi liheyi   5546  6月 30 17:40 command_line_shortcut.ipynb
-rw-rw-r-- 1 liheyi liheyi  19707  8月 19 10:49 command_substitution.ipynb
drwxrwxr-x 3 liheyi liheyi  12288  8月 31 15:51 config1
drwxrwxr-x 3 liheyi liheyi  12288  8月 31 15:52 config2
-rw-rw-r-- 1 liheyi liheyi   7252  8月  3 09:43 exit_and_exit_status.ipynb
-rw-rw-r-- 1 liheyi liheyi  65223  8月 31 14:58 IO_redirection.ipynb
-rw-rw-r-- 1 liheyi liheyi 14184  8月 23 15:30 linux_command_cat.ipynb
-rw-rw-r-- 1 liheyi liheyi 21502  8月  8 17:58 linux_command_cut.ipynb
-rw-rw-r-- 1 liheyi liheyi 29276  8月 11 17:58 linux_command_diff.ipynb
-rw-rw-r-- 1 liheyi liheyi  6485  8月  8 17:58 linux_command_fold.ipynb
-rw-rw-r-- 1 liheyi liheyi 30176  8月 23 18:18 linux_command_grep.ipynb
-rw-rw-r-- 1 liheyi liheyi 11950  8月  9 10:18 linux_comman

In [None]:
# Gives difference in command output.
diff <(command1) <(command2)

In [29]:
# Calls "tar cf /dev/fd/?? $directory_name", 
# and "bzip2 -c > file.tar.bz2".
tar cf >(bzip2 -c > jupyter.tgz2) /home/liheyi/jupyter

tar: Removing leading `/' from member names
tar: /home/liheyi/jupyter/bash/basics/jupyter.tgz2: file changed as we read it


In [30]:
ls -lh jupyter.tgz2

-rw-rw-r-- 1 liheyi liheyi 3.3M  8月 31 16:41 jupyter.tgz2


In [None]:
# Because of the /dev/fd/<n> system feature,
# the pipe between both commands does not need to be named.
#
# This can be emulated with a named pipe.
mkfifo pipe   # the named pipe must exist before it is used
bzip2 -c < pipe > file.tar.bz2 &
tar cf pipe /home/liheyi/anaconda2
rm pipe

# or
exec 3>&1
tar cf /dev/fd/4 /home/liheyi/anaconda2 4>&1 >&3 3>&- | bzip2 -c > file.tar.bz2 3>&-
exec 3>&-
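
A small runnable version of the named-pipe emulation, as a sketch: it swaps in `tr` for `bzip2` and a one-line string for the `tar` archive so that the result is easy to inspect. The temporary directory and file names are illustrative.

```shell
workdir=$(mktemp -d)
fifo="$workdir/pipe"
mkfifo "$fifo"

# Reader: runs in the background and blocks until a writer opens the FIFO.
tr 'a-z' 'A-Z' < "$fifo" > "$workdir/out.txt" &
reader_pid=$!

# Writer: plays the role of "tar cf pipe ..." in the example above.
printf 'hello from the writer\n' > "$fifo"

wait "$reader_pid"   # let the reader finish before looking at its output
result=$(cat "$workdir/out.txt")
echo "$result"

rm -r "$workdir"
```

With process substitution, bash takes care of creating the pipe and starting the second process, so none of this bookkeeping is needed.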

Here is a method of circumventing the problem of an echo piped to a while-read loop running in a subshell.
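
The problem in miniature, as a sketch with made-up input. It assumes bash's default settings. With `shopt -s lastpipe` in a non-interactive shell, the piped loop would behave differently.

```shell
# Piped loop: runs in a subshell, so the parent's counter is untouched.
piped_count=0
printf 'a\nb\nc\n' | while read -r line; do
    piped_count=$((piped_count + 1))
done

# Same loop fed by process substitution: runs in the current shell.
subst_count=0
while read -r line; do
    subst_count=$((subst_count + 1))
done < <(printf 'a\nb\nc\n')

echo "piped: $piped_count, process substitution: $subst_count"
# Prints: piped: 0, process substitution: 3
```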

#### Example 23-1. Code block redirection without forking

In [32]:
cat wr-ps.bash

#!/bin/bash
# wr-ps.bash: while-read loop with process substitution.

# This example contributed by Tomas Pospisek.
# (Heavily edited by the ABS Guide author.)

echo

echo "random input" | while read i
do
    global=3D": Not available outside the loop."
    # ... because it runs in a subshell.
done

echo "\$global (from outside the subprocess) = $global"
# $global (from outside the subprocess) =

echo; echo "----------"; echo

while read i
do
    echo $i
    global=3D": Available outside the loop."
    # ... because it does NOT run in a subshell.
done < <( echo "random input" )
#    ^ ^

echo "\$global (using process substitution) = $global"
# Random input
# $global (using process substitution) = 3D: Available outside the loop.

echo; echo "##########"; echo

# And likewise . . .
declare -a inloop
index=0
cat $0 | while read line
do
    inloop[$index]="$line"
    ((index++))
    # It runs in a subshell, so ...
done
echo "OUTPUT = "
echo ${inlo

In [33]:
./wr-ps.bash


$global (from outside the subprocess) = 

----------

random input
$global (using process substitution) = 3D: Available outside the loop.

##########

OUTPUT = 


----------

OUTPUT = 
#!/bin/bash # wr-ps.bash: while-read loop with process substitution. # This example contributed by Tomas Pospisek. # (Heavily edited by the ABS Guide author.) echo echo "random input" | while read i do global=3D": Not available outside the loop." # ... because it runs in a subshell. done echo "$global (from outside the subprocess) = $global" # $global (from outside the subprocess) = echo; echo "----------"; echo while read i do echo $i global=3D": Available outside the loop." # ... because it does NOT run in a subshell. done < <( echo "random input" ) # ^ ^ echo "$global (using process substitution) = $global" # Random input # $global (using process substitution) = 3D: Available outside the loop. echo; echo "##########"; echo # And likewise . . . declare -a inloop index=0 cat $0 | while 

This is a similar example.

#### Example 23-2. Redirecting the output of process substitution into a loop.

In [34]:
cat psub.bash

#!/bin/bash
# psub.bash

# As inspired by Diego Molina (thanks!).

declare -a array0
while read
do
    array0[${#array0[@]}]="$REPLY"
done < <( sed -e 's/bash/CRASH-BANG!/' $0 | grep bin | awk '{print $1}' )
#  Sets the default 'read' variable, $REPLY, by process substitution,
#+ then copies it into an array.

echo "${array0[@]}"

exit $?


In [35]:
bash psub.bash

#!/bin/CRASH-BANG! done
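
In bash 4 and later, the `mapfile` builtin can fill an array without an explicit loop. A sketch, assuming bash 4+ and using made-up input in place of the `sed | grep | awk` pipeline from the script:

```shell
# mapfile -t reads each input line into one array element
# and strips the trailing newline from each.
declare -a array0
mapfile -t array0 < <(printf '%s\n' 'first line' 'second line' 'third line')

echo "${#array0[@]} elements; last one: ${array0[2]}"
```

Feeding `mapfile` from a pipe instead would, by default, fill the array in a subshell and lose it, which is the same issue the `while read` loop avoids here.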


A reader sent in the following interesting example of process substitution.

In [None]:
# Script fragment taken from SuSE distribution:

# --------------------------------------------------------------#
while read des what mask iface; do
# Some commands ...
done < <(route -n)
#    ^ ^   First < is redirection, second is process substitution.

# To test it, let's make it do something.
while read des what mask iface; do
  echo $des $what $mask $iface
done < <(route -n)

# Output:
# Kernel IP routing table
# Destination Gateway Genmask Flags Metric Ref Use Iface
# 127.0.0.0 0.0.0.0 255.0.0.0 U 0 0 0 lo
# --------------------------------------------------------------#

#  As Stéphane Chazelas points out,
#+ an easier-to-understand equivalent is:
route -n |
  while read des what mask iface; do # Variables set from output of pipe.
    echo $des $what $mask $iface
  done   #  This yields the same output as above.
         #  However, as Ulrich Gayer points out . . .
         #+ this simplified equivalent uses a subshell for the while loop,
         #+ and therefore the variables disappear when the pipe terminates.
         
# --------------------------------------------------------------#

#  However, Filip Moritz comments that there is a subtle difference
#+ between the above two examples, as the following shows.

(
route -n | while read x; do ((y++)); done
echo $y   # $y is still unset

while read x; do ((y++)); done < <(route -n)
echo $y   # $y has the number of lines of output of route -n
)

# More generally speaking:
(
: | x=x
# seems to start a subshell like
: | ( x=x )
# while
x=x < <(:)
# does not
)

# This is useful when parsing CSV and the like.
# That is, in effect, what the original SuSE code fragment does.
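
A sketch of the CSV use case mentioned above, with made-up data and field names. Because the loop is fed by process substitution rather than a pipe, the running total is still set after the loop ends.

```shell
total=0
# IFS=, applies only to this read, splitting each line at the comma.
while IFS=, read -r item quantity; do
    total=$((total + quantity))
done < <(printf '%s\n' 'apples,3' 'pears,5' 'plums,4')

echo "total quantity: $total"
```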