# Interacting with the shell

## `sh` module

The [`sh` module](https://amoffat.github.io/sh/) is very convenient to interact with the shell.  Note that `sh` is not part of Python's standard library, if you prefer not to use extra modules, use the `subprocess` module in the standard library.  The statements below will install `sh` using `pip` is it isn't already installed.

In [18]:
try:
    import sh
except ModuleNotFoundError:
    print('installing sh using pip')
    !pip install sh
    import sh

Any shell command can be executed by calling it as a function on the `sh` module, passing command line arguments as arguments.

In [19]:
sh.ls('-l')

'total 704\n-rw-r--r--   1 jwkidd3  staff    1941 Jul 31 11:17 README.md\n-rw-r--r--   1 jwkidd3  staff    9294 Jul 31 14:33 compressed_files.ipynb\n-rw-r--r--   1 jwkidd3  staff   25120 Jul 31 15:01 filesystem_interaction.ipynb\n-rw-r--r--   1 jwkidd3  staff    5311 Jul 31 11:17 julia.ipynb\n-rw-r--r--   1 jwkidd3  staff    6270 Jul 31 11:17 julia_omp.f90\n-rw-r--r--@  1 jwkidd3  staff  264739 Jul 31 11:21 python_for_systems_programming.pdf\n-rw-r--r--   1 jwkidd3  staff   18375 Jul 31 11:31 shell_interaction.ipynb\ndrwxr-xr-x  21 jwkidd3  staff     672 Jul 31 11:18 \x1b[34msource-code\x1b[m\x1b[m\n-rw-r--r--   1 jwkidd3  staff    8754 Jul 31 11:17 system_information.ipynb\n'

The output can be used by assigning the command to a variable, and using the result's `stdout` attribute.  Note that the latter is a sequence of bytes, so it has to be decoded into a UTF-8 string for further processing.

In [20]:
cmd = sh.ls('-l', '-a', _encoding='UTF-8')
cmd

'total 720\ndrwxr-xr-x   13 jwkidd3  staff     416 Jul 31 15:01 \x1b[34m.\x1b[m\x1b[m\ndrwxr-xr-x  141 jwkidd3  staff    4512 Jul 31 14:22 \x1b[34m..\x1b[m\x1b[m\n-rw-r--r--@   1 jwkidd3  staff    6148 Jul 31 14:29 .DS_Store\ndrwxr-xr-x    5 jwkidd3  staff     160 Jul 31 14:38 \x1b[34m.ipynb_checkpoints\x1b[m\x1b[m\n-rw-r--r--    1 jwkidd3  staff    1941 Jul 31 11:17 README.md\n-rw-r--r--    1 jwkidd3  staff    9294 Jul 31 14:33 compressed_files.ipynb\n-rw-r--r--    1 jwkidd3  staff   25120 Jul 31 15:01 filesystem_interaction.ipynb\n-rw-r--r--    1 jwkidd3  staff    5311 Jul 31 11:17 julia.ipynb\n-rw-r--r--    1 jwkidd3  staff    6270 Jul 31 11:17 julia_omp.f90\n-rw-r--r--@   1 jwkidd3  staff  264739 Jul 31 11:21 python_for_systems_programming.pdf\n-rw-r--r--    1 jwkidd3  staff   18375 Jul 31 11:31 shell_interaction.ipynb\ndrwxr-xr-x   21 jwkidd3  staff     672 Jul 31 11:18 \x1b[34msource-code\x1b[m\x1b[m\n-rw-r--r--    1 jwkidd3  staff    8754 Jul 31 11:17 system_information.ipynb\n'

In [23]:
lines = cmd.split('\n')

In [24]:
len(lines[1:-1])

13

In [25]:
lines[1:-1]

['drwxr-xr-x   13 jwkidd3  staff     416 Jul 31 15:01 \x1b[34m.\x1b[m\x1b[m',
 'drwxr-xr-x  141 jwkidd3  staff    4512 Jul 31 14:22 \x1b[34m..\x1b[m\x1b[m',
 '-rw-r--r--@   1 jwkidd3  staff    6148 Jul 31 14:29 .DS_Store',
 'drwxr-xr-x    5 jwkidd3  staff     160 Jul 31 14:38 \x1b[34m.ipynb_checkpoints\x1b[m\x1b[m',
 '-rw-r--r--    1 jwkidd3  staff    1941 Jul 31 11:17 README.md',
 '-rw-r--r--    1 jwkidd3  staff    9294 Jul 31 14:33 compressed_files.ipynb',
 '-rw-r--r--    1 jwkidd3  staff   25120 Jul 31 15:01 filesystem_interaction.ipynb',
 '-rw-r--r--    1 jwkidd3  staff    5311 Jul 31 11:17 julia.ipynb',
 '-rw-r--r--    1 jwkidd3  staff    6270 Jul 31 11:17 julia_omp.f90',
 '-rw-r--r--@   1 jwkidd3  staff  264739 Jul 31 11:21 python_for_systems_programming.pdf',
 '-rw-r--r--    1 jwkidd3  staff   18375 Jul 31 11:31 shell_interaction.ipynb',
 'drwxr-xr-x   21 jwkidd3  staff     672 Jul 31 11:18 \x1b[34msource-code\x1b[m\x1b[m',
 '-rw-r--r--    1 jwkidd3  staff    8754 Jul 31 11:17 s

In [26]:
_ = sh.mkdir('tmp', '-p')

In [27]:
sh.ls()

'\x1b[34m-p\x1b[m\x1b[m                                 python_for_systems_programming.pdf\nREADME.md                          shell_interaction.ipynb\ncompressed_files.ipynb             \x1b[34msource-code\x1b[m\x1b[m\nfilesystem_interaction.ipynb       system_information.ipynb\njulia.ipynb                        \x1b[34mtmp\x1b[m\x1b[m\njulia_omp.f90\n'

### Exit codes

When a shell command fails, an exception is thrown which contains the full command as it was run, the exit code, the standard output and error.

In [None]:
try:
    sh.ls('bla.txt')
except Exception as error:
    err_msg = error.stderr.decode(encoding='utf8').rstrip()
    print(f'command "{error.full_cmd}" exited with exit code {error.exit_code} and message "{err_msg}"')

### I/O redirection

Redirecting output can be done using the `_out` optional argument.

In [28]:
with open('tmp/date_file.txt', 'w') as file:
    for i in range(10):
        print(f'{i} ', end='', file=file, flush=True)
        sh.date(_out=file)
        sh.sleep('1')

Note the use of the `flush` optional argument in the print function.  If this is omitted, the Python interpreter will only flush the results of its own print calls after the `sh` modules has written its output.

In [29]:
sh.cat('tmp/date_file.txt')

'0 Mon Jul 31 15:06:00 CDT 2023\n1 Mon Jul 31 15:06:01 CDT 2023\n2 Mon Jul 31 15:06:02 CDT 2023\n3 Mon Jul 31 15:06:03 CDT 2023\n4 Mon Jul 31 15:06:04 CDT 2023\n5 Mon Jul 31 15:06:05 CDT 2023\n6 Mon Jul 31 15:06:06 CDT 2023\n7 Mon Jul 31 15:06:07 CDT 2023\n8 Mon Jul 31 15:06:08 CDT 2023\n9 Mon Jul 31 15:06:09 CDT 2023\n'

Input redirection works similarly using the optional `_in` argument.

In [30]:
with open('tmp/date_file.txt', 'r') as file:
    print(sh.wc('-l', _in=file))

      10



### Piping

The output of a command can be used as the input for another command.

Pipe the output of `ls` into `grep` to select only the files with names that end with `.py`.

In [31]:
sh.grep(sh.ls('-l'), r'\.ipynb$')

ErrorReturnCode_2: 

  RAN: /usr/bin/grep 'total 704
drwxr-xr-x   2 jwkidd3  staff      64 Jul 31 15:05 [34m-p[m[m
-rw-r--r--   1 jwkidd3  staff    1941 Jul 31 11:17 README.md
-rw-r--r--   1 jwkidd3  staff    9294 Jul 31 14:33 compressed_files.ipynb
-rw-r--r--   1 jwkidd3  staff   25120 Jul 31 15:01 filesystem_interaction.ipynb
-rw-r--r--   1 jwkidd3  staff    5311 Jul 31 11:17 julia.ipynb
-rw-r--r--   1 jwkidd3  staff    6270 Jul 31 11:17 julia_omp.f90
-rw-r--r--@  1 jwkidd3  staff  264739 Jul 31 11:21 python_for_systems_programming.pdf
-rw-r--r--   1 jwkidd3  staff   18330 Jul 31 15:05 shell_interaction.ipynb
drwxr-xr-x  21 jwkidd3  staff     672 Jul 31 11:18 [34msource-code[m[m
-rw-r--r--   1 jwkidd3  staff    8754 Jul 31 11:17 system_information.ipynb
drwxr-xr-x   3 jwkidd3  staff      96 Jul 31 15:06 [34mtmp[m[m
' '\.ipynb$'

  STDOUT:


  STDERR:
grep: brackets ([ ]) not balanced


Pipe the output of `cut` into `sort`.

In [32]:
sh.sort(sh.cut('-d', ' ', '-f', '5', 'tmp/date_file.txt'), '-r')

ErrorReturnCode_2: 

  RAN: /usr/bin/sort '15:06:00
15:06:01
15:06:02
15:06:03
15:06:04
15:06:05
15:06:06
15:06:07
15:06:08
15:06:09
' -r

  STDOUT:


  STDERR:
sort: No such file or directory


### Backgrounding & time out

Long running processes can be placed in the background.

In [None]:
process = sh.sleep(10, _bg=True)

In [None]:
for i in range(10):
    print(i)

In [None]:
process.wait()

In [None]:
print(process.exit_code)

A time out can be specified for a command, and on time out, the resulting exit code will be the number of the signal (SIGKILL by default).

In [None]:
try:
    process = sh.sleep(10, _bg=True, _timeout=3)
except TimeoutError as error:
    print(error)

### Clean up

Remove the `tmp` directory.

In [33]:
sh.rm('-rf', 'tmp')

''

## `subprocess` module

If you prefer to use standard library modules only, `subprocess` is a good choice.

In [34]:
import subprocess

This module has a high-level function `run` that can be used for almost any processing.  The API is still being improved in subsequent releases of Python.

In [35]:
process = subprocess.run(['ls', '-l'], stdout=subprocess.PIPE, encoding='utf8')

In [36]:
process.stdout.split('\n')

['total 736',
 'drwxr-xr-x   2 jwkidd3  staff      64 Jul 31 15:05 \x1b[34m-p\x1b[m\x1b[m',
 '-rw-r--r--   1 jwkidd3  staff    1941 Jul 31 11:17 README.md',
 '-rw-r--r--   1 jwkidd3  staff    9294 Jul 31 14:33 compressed_files.ipynb',
 '-rw-r--r--   1 jwkidd3  staff   25120 Jul 31 15:01 filesystem_interaction.ipynb',
 '-rw-r--r--   1 jwkidd3  staff    5311 Jul 31 11:17 julia.ipynb',
 '-rw-r--r--   1 jwkidd3  staff    6270 Jul 31 11:17 julia_omp.f90',
 '-rw-r--r--@  1 jwkidd3  staff  264739 Jul 31 11:21 python_for_systems_programming.pdf',
 '-rw-r--r--   1 jwkidd3  staff   32910 Jul 31 15:07 shell_interaction.ipynb',
 'drwxr-xr-x  21 jwkidd3  staff     672 Jul 31 11:18 \x1b[34msource-code\x1b[m\x1b[m',
 '-rw-r--r--   1 jwkidd3  staff    8754 Jul 31 11:17 system_information.ipynb',
 '']

Note that if you don't specify the `stdout` arugment, the output of the command will not be captured.  Python 3.7 makes this easier by adding a `capture_output` argument.

### Exit codes

The `run` function returns a `CompletedProcess` object that has an attribute for the exit code returned by the process.

In [37]:
process = subprocess.run(['mkdir', '-p', 'tmp'])

In [38]:
process.returncode

0

### I/O redirection

Output of a running command can be redirected to a file.

In [39]:
with open('tmp/data.txt', 'w') as file:
    for i in range(10):
        subprocess.run(['echo', '-n', str(i) + ' '], stdout=file)
        subprocess.run(['date'], stdout=file)
        subprocess.run(['sleep', '1'])

Note that mixed I/O from the `print` function and `run` doesn't work as expected.

In [40]:
print(subprocess.run(['cat', 'tmp/data.txt'], stdout=subprocess.PIPE,
                     encoding='utf8').stdout)

0 Mon Jul 31 15:08:49 CDT 2023
1 Mon Jul 31 15:08:50 CDT 2023
2 Mon Jul 31 15:08:51 CDT 2023
3 Mon Jul 31 15:08:52 CDT 2023
4 Mon Jul 31 15:08:53 CDT 2023
5 Mon Jul 31 15:08:54 CDT 2023
6 Mon Jul 31 15:08:55 CDT 2023
7 Mon Jul 31 15:08:56 CDT 2023
8 Mon Jul 31 15:08:57 CDT 2023
9 Mon Jul 31 15:08:58 CDT 2023



Input redirection is similar.

In [None]:
with open('tmp/data.txt', 'r') as file:
    process = subprocess.run(['wc', '-l'], stdin=file, stdout=subprocess.PIPE,
                             encoding='utf8')
    print(process.stdout)

### Piping

Piping can also be done using `subprocess`.  It is less user friendly than using the `sh` module, but it allows more control.  You will have to resort to the low-level `Popen` function.

In [None]:
p1 = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE)
p2 = subprocess.Popen(['grep', r'\.ipynb$'], stdin=p1.stdout, stdout=subprocess.PIPE, encoding='utf8')
p1.stdout.close()
output, _ = p2.communicate()
print(output)

In [None]:
p1 = subprocess.Popen(['cut', '-d', ' ', '-f', '5', 'tmp/data.txt'], stdout=subprocess.PIPE)
p2 = subprocess.Popen(['sort', '-r'], stdin=p1.stdout, stdout=subprocess.PIPE, encoding='utf8')
p1.stdout.close()
output, _ = p2.communicate()
print(output)

In [None]:
_ = subprocess.run(['rm', '-r', 'tmp'])

### Shell file globbing and environment variables

In [None]:
import os

For file globbing to work in subprocesses, provide the entire command, including all arguments as a string, rather than a list to `run`. Also, set `shell` to `True`.

In [None]:
process = subprocess.run('ls *.py', stdout=subprocess.PIPE, encoding='utf8', shell=True)

In [None]:
process.stdout.split()

The same applies when you want environment variables to expand.

In [None]:
process = subprocess.run('echo "hello ${USER}"', stdout=subprocess.PIPE, encoding='utf8', shell=True)
print(process.stdout.rstrip())

If you need to add or modify environment variables, it is good practice to do that on a copy of `os.environ`.

In [None]:
environ = os.environ.copy()
environ['greeting'] = 'bye'
process = subprocess.run('echo "${greeting} ${USER}"', stdout=subprocess.PIPE, encoding='utf8',
                         env=environ, shell=True)
print(process.stdout.rstrip())

### Clean up

Remove the `tmp` directory.

In [None]:
process = subprocess.run(['rm', '-rf', 'tmp'])

In [None]:
process.returncode