In [1]:
import sys
import os

# 8.4. The Shell and Command Line Tools


Nearly all computers provide access to a shell interpreter, such as sh or bash. Shell interpreters typically perform operations on the files on a computer, and they have their own language, syntax, and built-in commands.

We use the term command-line interface (CLI) tools to refer to the commands available in a shell interpreter. Although we only cover a few CLI tools in this section, there are many useful CLI tools that enable all sorts of operations on files. For instance, running the following command in the bash shell produces a list of all the files in the figures/ folder along with their file sizes:

ls -l -h figures/

The basic syntax for a shell command is:

command -options arg1 arg2

CLI tools often take one or more arguments, similar to how Python functions take arguments. In the shell, we separate arguments with spaces, not with parentheses and commas. The arguments appear at the end of the command line, and they are usually the name of a file or some text. In the ls example above, the argument to ls is figures/. Additionally, CLI tools support flags that provide additional options. These flags are specified immediately following the command name, and each one begins with a dash. In the ls example above, we provided the flags -l (to provide extra information about each file) and -h (to provide file sizes in a more human-readable format). Many commands have default arguments and options, and the man command displays the manual page for any command, which lists its acceptable options, examples, and defaults. For example, man ls describes the 30 or so flags available for ls.
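To make the command, flags, and arguments structure concrete, here is a minimal Python sketch that splits a command line the way the paragraph above describes. The helper name split_command is hypothetical, and the rule "anything starting with a dash is a flag" is a simplifying assumption: in practice the shell only splits the line into words, and each tool decides for itself how to interpret them.

```python
import shlex


def split_command(line):
    """Split a simple command line into (command, flags, arguments).

    Assumption for this sketch: every token that starts with a dash is a
    flag, and every other token after the command name is an argument.
    """
    tokens = shlex.split(line)  # split on whitespace, honoring quotes
    command, rest = tokens[0], tokens[1:]
    flags = [t for t in rest if t.startswith("-")]
    args = [t for t in rest if not t.startswith("-")]
    return command, flags, args


print(split_command("ls -l -h figures/"))
# prints ('ls', ['-l', '-h'], ['figures/'])
```

Notice that the spaces play the role that commas and parentheses play in a Python function call.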

:::{note}

All CLI tools we cover in this book are specific to the sh shell interpreter, the default interpreter for Jupyter installations on macOS and Linux systems at the time of writing. Windows systems have a different interpreter, and the commands shown in the book may not run on Windows, although Windows gives access to an sh interpreter through the Windows Subsystem for Linux (WSL).

The commands in this section can be run in a terminal application, or through a terminal opened by Jupyter.

:::



We begin with an exploration of the file system for this chapter, using the ls tool.

To dive deeper and list the files in the data/ directory, we provide the directory name as an argument to ls.

The command below also uses the -l flag, which formats the output so that each file appears on a separate line along with additional metadata. In particular, the fifth column of the listing shows the file size. To make the file sizes more readable, we use the -h flag. When we have multiple simple option flags like -l, -h, and -L, we can combine them as a shorthand:

In [2]:
ls -lLh data/

total 701008
-rw-r--r--@ 1 li2  _lpoperator   267M Jun 15 00:15 DAWN-Data.txt
-rw-r--r--  1 li2  _lpoperator    33M Jun  1 17:06 babynames.csv
-rw-r--r--@ 1 li2  _lpoperator    41M Jun  6 22:39 babynames.db
-rw-r--r--@ 1 li2  _lpoperator   645K Jun 15 00:22 businesses.csv
-rw-r--r--@ 1 li2  _lpoperator   455K Jun 14 23:52 inspections.csv
-rw-r--r--  1 li2  _lpoperator   391B Jun  1 17:06 nyt_names.csv
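For readers who want to see what the size column represents, the following Python sketch produces a similar name-and-size listing. The functions human_size and list_sizes and the file name example.bin are hypothetical names introduced for illustration. The rounding only approximates what ls -h does, so the exact strings may differ from the output of ls on your system.

```python
import os
import tempfile


def human_size(num_bytes):
    """Format a byte count with B, K, M, G, T suffixes (powers of 1024).

    An approximation of the ls -h style, not a reimplementation of it.
    """
    units = ["B", "K", "M", "G", "T"]
    size = float(num_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            break
        size /= 1024
    if unit == "B":
        return f"{int(size)}B"
    # One decimal place for small values, none for larger ones.
    return f"{size:.1f}{unit}" if size < 10 else f"{size:.0f}{unit}"


def list_sizes(directory):
    """Return (name, readable size) pairs for the entries in a directory."""
    rows = []
    with os.scandir(directory) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            info = entry.stat()  # follows symlinks by default, loosely like -L
            rows.append((entry.name, human_size(info.st_size)))
    return rows


# Demonstrate on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "example.bin"), "wb") as f:
        f.write(b"x" * 2048)
    demo_rows = list_sizes(tmp)

print(demo_rows)
# prints [('example.bin', '2.0K')]
```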


:::{note}

When working with datasets in this book, our code will often use an additional -L flag for ls and other CLI tools, such as du. We do this because we set up the datasets in our book using shortcuts (called symlinks). Usually, your code won't need the -L flag unless you're working with symlinks too.

:::
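To see why following symlinks matters for file sizes, here is a small Python sketch that contrasts the size of a link with the size of the file it points to. The file names real_data.csv and shortcut.csv are hypothetical. The correspondence to ls with and without -L is a loose analogy, and the sketch assumes a system where creating symlinks is permitted, such as macOS or Linux.

```python
import os
import tempfile

# Build a throwaway directory holding one real file and one symlink to it.
with tempfile.TemporaryDirectory() as tmp:
    target_path = os.path.join(tmp, "real_data.csv")
    link_path = os.path.join(tmp, "shortcut.csv")

    with open(target_path, "wb") as f:
        f.write(b"x" * 1000)  # a 1000-byte file
    os.symlink(target_path, link_path)

    is_link = os.path.islink(link_path)
    # lstat describes the link itself (roughly ls -l without -L).
    unfollowed_size = os.lstat(link_path).st_size
    # stat follows the link to its target (roughly ls -l with -L).
    followed_size = os.stat(link_path).st_size

print(is_link, followed_size)
# prints True 1000
```

The size reported for the link itself is typically just the length of the path it stores, which is why a listing without -L can show surprisingly small files.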