Python has had many ways to run subprocesses over the years, including popen,
popen2, and os.exec*. Today, the best and simplest choice for managing child
processes is the subprocess built-in module.

Running a child process with subprocess is simple. Here, the Popen constructor starts
the process. The communicate method reads the child process’s output and waits for
termination.

In [2]:
import logging
from pprint import pprint
from sys import stdout as STDOUT


# Example 1
import subprocess
proc = subprocess.Popen(['echo', 'Hello from the child!'],
                        stdout=subprocess.PIPE)
out, err = proc.communicate()
print(out.decode('utf-8'))



Hello from the child!
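
On Python 3.5 and later, the same start, read, and wait sequence can also be written with the subprocess.run convenience function. The sketch below is not part of the original example. It assumes a child Python interpreter (sys.executable) as a stand-in for echo so that it runs on any platform, and the capture_output argument requires Python 3.7 or later.

```python
import subprocess
import sys

# Assumption: a child Python interpreter stands in for `echo` so the
# sketch does not depend on a particular shell utility.
result = subprocess.run(
    [sys.executable, '-c', "print('Hello from the child!')"],
    capture_output=True,   # pipe both stdout and stderr (Python 3.7+)
    check=True)            # raise CalledProcessError on a non-zero exit
message = result.stdout.decode('utf-8').strip()
print(message)
```

run blocks until the child exits, so it suits one-off commands. Popen remains the tool to reach for when the parent needs to keep working while the child runs.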



Child processes will run independently from their parent process, the Python interpreter.
Their status can be polled periodically while Python does other work.

In [3]:
# Example 2
from time import sleep, time
proc = subprocess.Popen(['sleep', '0.3'])
while proc.poll() is None:
    print('Working...')
    # Some time consuming work here
    sleep(0.2)

print('Exit status', proc.poll())

Working...
Working...
Exit status 0


Decoupling the child process from the parent means that the parent process is free to run
many child processes in parallel. You can do this by starting all the child processes
together upfront.

Here, I start ten child processes that each sleep for 0.1 seconds. If these processes ran
in sequence, the total delay would be at least 1 second, not the ~0.1 second I measured.

In [5]:
# Example 3
def run_sleep(period):
    proc = subprocess.Popen(['sleep', str(period)])
    return proc

start = time()
procs = []
for _ in range(10):
    proc = run_sleep(0.1)
    procs.append(proc)

# Example 4
for proc in procs:
    proc.communicate()
end = time()
print('Finished in %.3f seconds' % (end - start))


Finished in 0.134 seconds
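
To make the comparison with sequential execution concrete, here is a minimal sketch that times both approaches. The helper names run_sequentially and run_in_parallel are hypothetical, and a child Python interpreter that sleeps is assumed as a portable stand-in for the sleep command. Exact timings will vary by machine, so none are shown.

```python
import subprocess
import sys
from time import time

# Assumption: a child Python interpreter that sleeps stands in for the
# `sleep` command so the sketch runs on any platform.
SLEEP_CMD = [sys.executable, '-c', 'import time; time.sleep(0.2)']

def run_sequentially(count):
    start = time()
    for _ in range(count):
        # Wait for each child to exit before starting the next one.
        subprocess.Popen(SLEEP_CMD).wait()
    return time() - start

def run_in_parallel(count):
    start = time()
    # Start every child upfront, then wait for each of them to exit.
    procs = [subprocess.Popen(SLEEP_CMD) for _ in range(count)]
    for proc in procs:
        proc.wait()
    return time() - start

sequential = run_sequentially(5)
parallel = run_in_parallel(5)
print('Sequential: %.3f seconds' % sequential)
print('Parallel:   %.3f seconds' % parallel)
```

The sequential total grows with the number of children, while the parallel total stays close to the duration of the slowest child plus process startup overhead.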


You can also pipe data from your Python program into a subprocess and retrieve its
output. This lets you use other programs to do work in parallel. For example, say
you want to use the openssl command-line tool to encrypt some data. Starting the child
process with command-line arguments and I/O pipes is easy.

## Example 6
Here, I pipe random bytes into the encryption function, but in practice this would be user
input, a file handle, a network socket, etc.

## Example 7
The child processes will run in parallel and consume their input. Here, I wait for them to
finish and then retrieve their final output.

## Example 8
You can also create chains of parallel processes just like UNIX pipes, connecting the
output of one child process into the input of another, and so on. Here’s a function that
starts the md5 command-line tool as a child process that consumes an input stream. (The
md5 tool ships with macOS and the BSDs; on most Linux systems the equivalent command is
md5sum.)


Python’s hashlib built-in module provides the md5 function, so running a
subprocess like this isn’t always necessary. The goal here is to demonstrate how
subprocesses can pipe inputs and outputs.
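
For comparison, the in-process alternative mentioned in the note looks roughly like this. It is a minimal sketch that hashes a fixed, made-up input rather than encrypted data.

```python
import hashlib

data = b'hello'   # illustrative input, not taken from the examples
# hexdigest() returns the hash as a 32-character hexadecimal string.
digest = hashlib.md5(data).hexdigest()
print(digest)     # 5d41402abc4b2a76b9719d911017c592
```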


In [6]:
# Example 5
import os

def run_openssl(data):
    env = os.environ.copy()
    env['password'] = b'\xe24U\n\xd0Ql3S\x11'
    proc = subprocess.Popen(
        ['openssl', 'enc', '-des3', '-pass', 'env:password'],
        env=env,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE)
    proc.stdin.write(data)
    proc.stdin.flush()  # Ensure the child gets input
    return proc


# Example 6
procs = []
for _ in range(3):
    data = os.urandom(10)
    proc = run_openssl(data)
    procs.append(proc)


# Example 7
for proc in procs:
    out, err = proc.communicate()
    print(out[-10:])


# Example 8
def run_md5(input_stdin):
    proc = subprocess.Popen(
        ['md5'],
        stdin=input_stdin,
        stdout=subprocess.PIPE)
    return proc



b'\xc4\x1f \x96e\x11@\xeaI\x8a'
b'L~:\ruZ\n,j\xb4'
b'd\xa0\xa4p4S\x97\x8a\xf5F'
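
When all of the input is already in memory, communicate can deliver it as well: passing input= writes the bytes to the child's stdin, closes the pipe, and collects the output in one call. The following is a minimal sketch of that variation, assuming a child Python interpreter that upper-cases its input as a portable stand-in for openssl.

```python
import subprocess
import sys

# Assumption: this child upper-cases whatever arrives on stdin; it
# stands in for a real filter program such as openssl.
UPPER_CMD = [sys.executable, '-c',
             'import sys; sys.stdout.write(sys.stdin.read().upper())']

proc = subprocess.Popen(
    UPPER_CMD,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE)
# Write the input, close stdin, and read stdout until the child exits.
out, _ = proc.communicate(input=b'hello from the parent')
print(out)
```

Because communicate closes stdin after writing, the child sees end-of-file and can finish without any further action from the parent.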


Now, I can kick off a set of openssl processes to encrypt some data and another set of
processes to md5 hash the encrypted output.

## Example 10
The I/O between the child processes will happen automatically once you get them started.
All you need to do is wait for them to finish and print the final output.

In [8]:
# Example 9
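input_procs = []
hash_procs = []
for _ in range(3):
    data = os.urandom(10)
    proc = run_openssl(data)
    input_procs.append(proc)
    hash_proc = run_md5(proc.stdout)
    hash_procs.append(hash_proc)
    # The md5 child now holds its own copy of the pipe. Close the
    # parent's copy so communicate() can't steal the child's input
    # and so SIGPIPE reaches openssl if md5 exits early.
    proc.stdout.close()
    proc.stdout = None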

    
# Example 10
for proc in input_procs:
    proc.communicate()
for proc in hash_procs:
    out, err = proc.communicate()
    print(out.strip())


b'5ba63f05c1f82291b8e0f45cc4569a92'
b'07401ef3fcd6d9aa8e79f0e100a1a50f'
b'd09ab4fc7903600d741b2692a16b9bfe'
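
The chaining pattern does not depend on openssl or md5 in particular. Here is a portable sketch of the same idea, assuming two child Python interpreters as stand-ins: one produces a line of text and the other upper-cases whatever it reads. The parent closes its handle on the intermediate pipe once the downstream child has inherited it.

```python
import subprocess
import sys

# Assumption: two child Python interpreters stand in for openssl and md5.
PRODUCE_CMD = [sys.executable, '-c', "print('data from upstream')"]
UPPER_CMD = [sys.executable, '-c',
             'import sys; sys.stdout.write(sys.stdin.read().upper())']

upstream = subprocess.Popen(PRODUCE_CMD, stdout=subprocess.PIPE)
downstream = subprocess.Popen(
    UPPER_CMD,
    stdin=upstream.stdout,   # read directly from the upstream child
    stdout=subprocess.PIPE)
# The downstream child has its own copy of the pipe, so the parent
# releases its handle and never reads from it.
upstream.stdout.close()
upstream.stdout = None

out, _ = downstream.communicate()
upstream.wait()
print(out.decode('utf-8').strip())
```

The data flows from one child to the other without passing through the parent, which only collects the final output.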


If you’re worried about the child processes never finishing or somehow blocking on input
or output pipes, then be sure to pass the timeout parameter to the communicate
method. This will cause an exception to be raised if the child process hasn’t responded
within a time period, giving you a chance to terminate the misbehaving child.

Unfortunately, the timeout parameter is only available in Python 3.3 and later. In earlier
versions of Python, you’ll need to use the select built-in module on proc.stdin,
proc.stdout, and proc.stderr in order to enforce timeouts on I/O.

In [9]:
# Example 11
proc = run_sleep(10)
try:
    proc.communicate(timeout=0.1)
except subprocess.TimeoutExpired:
    proc.terminate()
    proc.wait()

print('Exit status', proc.poll())

Exit status -15
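
If the child might ignore a polite termination request, the timeout handling can be wrapped in a small helper that kills the process and then finishes communication, which is the pattern described in the subprocess documentation. The helper name communicate_or_kill is hypothetical, and a child Python interpreter is assumed as a portable stand-in for sleep 10.

```python
import subprocess
import sys
from time import time

def communicate_or_kill(proc, timeout):
    """Hypothetical helper: return (out, err), killing proc on timeout."""
    try:
        return proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()                # forceful stop the child cannot ignore
        return proc.communicate()  # reap the child and drain any pipes

# Assumption: a child Python interpreter stands in for `sleep 10`.
start = time()
proc = subprocess.Popen(
    [sys.executable, '-c', 'import time; time.sleep(10)'])
communicate_or_kill(proc, timeout=0.1)
elapsed = time() - start
print('Exit status', proc.returncode)
```

Calling communicate again after kill ensures the child is reaped and its return code is recorded, rather than leaving a zombie process behind.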


* Use the subprocess module to run child processes and manage their input and output streams.
* Child processes run in parallel with the Python interpreter, enabling you to maximize your CPU usage.
* Use the timeout parameter with communicate to avoid deadlocks and hanging child processes.