In [None]:
%run -i ../python/common.py
bashCmds("[[ -d mydir ]] && rm -rf mydir")
closeAllOpenTtySessions()
bash = BashSession()

#bash.output()
#bash.run(" history")

# The Shell - Part I: Having an ASCII conversation with the OS


As we read in the Unix [introduction](../unix/intro.ipynb), a key feature of Unix is its command line interface, which was developed for [ASCII terminal](../unix/terminal.ipynb) devices. Our goal in this chapter is to understand the general model of the shell and how to start working in the terminal environment.  If you have not done so, be sure to read the sections on the Unix [Kernel](../unix/intro.ipynb#UnixKernel_sec), [User Programs](../unix/intro.ipynb#UnixUser_sec), how we [visualize a running Unix system](../unix/intro.ipynb#UnixViz_sec), the introduction to [Terminal Emulators](../unix/terminal.ipynb#TerminalEmulators_sec) and the chapter introducing [Files and Directories](../unix/files.ipynb).

> <img style="display: inline; margin: 1em 1em 0px 0px;" align="left" width="40" src="../images/fyi.svg"> <p style="background-color:powderblue;"> It is a good idea to read this chapter with a terminal connection open so that you can explore the material as you read it.  Instructions and guidance on how to do this can be found in [Terminal Emulators](../unix/terminal.ipynb#TerminalEmulators_sec).

## Our interface to the Kernel

<img style="float: right; margin: 0px 0px 0px 10px;" width="50%" src="../images/UnixL01_SHCHT/04SHLLChat.png">

As discussed, the core functionality of the Unix operating system is implemented by the Kernel.  But the kernel is really only responsible for making it easier to run and write application/user programs. The Kernel does not have any support for humans to directly interact with it.  That is where the shell comes in. The shell acts as our primary point of contact/interface not only to the kernel but also to all the other programs installed.  

## Bash
Over the years many variants of the shell program have been developed, not only for Unix but for other operating systems as well (see [comparison of shells](https://en.wikipedia.org/wiki/Comparison_of_command_shells)).   While there are differences between the various Unix shell programs, the basic model of interaction we cover in this chapter is largely the same across all of them.   However, with respect to the exact syntax and details we will focus on the [Bourne Again Shell (BASH)](https://en.wikipedia.org/wiki/Bash_(Unix_shell)), as it is the default shell for Linux, which is the version of Unix we use throughout this book.

It is a Unix tradition that programs installed on the system, like Bash, include documentation on how they are used.  To access the manual pages you use the `man` program, eg. `man man` would bring up the documentation about the manual itself.  So `man bash` will bring up the manual page for the bash shell.  You can find detailed coverage of the syntax and how bash works in the manual.  After we cover some basics it will be much easier to read and understand the Bash manual page, as it assumes that you understand some basic Unix and generic shell concepts.  

## Shell Session
Remembering that ASCII terminals were the original devices created for humans to interactively work with computers, the shell is designed to manage and communicate with a human at a terminal. Thus the human and the shell are really exchanging bytes that encode information in ASCII.  For every new terminal connection, software directs the kernel to start an instance of a shell program to interact with a user at the terminal.  We consider this to be the start of a shell "session".  The session ends when either the communication between the shell and the terminal is disconnected or the user purposefully exits the shell.  


### Terminal Windows and the Shell
Today of course, we rarely use physical terminals any more.  Rather, on our physical client computers (eg. laptops, desktops, tablets, etc) we can start many terminal emulator windows to establish a shell session for each (see below). Each session will stay active until we exit or close the session and kill the associated shell.  Since each window is connected to its own shell they represent independent "conversations" in which we can do different things concurrently.

In [None]:
display(HTML(htmlFig(
    [[
        {'src':"../images/UnixL01_SHCHT/041SHLLChat.png",
         'cellwidth': '47.5%'},
        {'src':"../images/terminalwins.png",
         'cellwidth': '52%'}
     ]],
    caption="Figure: Today terminal emulators running on our desktop act as ASCII terminals. Every new window is a new terminal connection."
)))

> <img style="display:inline; margin: 1em 1em 0px 0px;" align="left" width="40" src="../images/do.svg"> <p style="background-color:lightgreen;">  While you are learning how to work with bash it can be really useful to keep a terminal open in which you keep the bash manual page open while you use another terminal shell session to do your exploring.

## Line oriented conversation

<img style="float: left; margin: 0px 10px 0px 0px;" width="30%" src="../images/shellloopcartoon.png">
The way we interact with the shell can be thought of as a structured, interactive, "conversation". The conversation is a series of back and forth exchanges. We, the user, type in a request in the form of a "command line" and in response the shell takes some action and generates a particular response, as byte values, which are sent back to the terminal.  The terminal interprets the response as ASCII characters, displaying/printing the appropriate images of the ASCII characters to the screen so we can read it.   

### Lines

Given our use of ASCII, what constitutes a line in Unix is very precise. It is a series of byte values terminated by a single 'new-line/line-feed' value.  Using an ASCII table we see that the byte value for the new-line character is `0b00001010` in binary or, expressed more concisely in hex, `0xA`. The common shorthand notation for the newline byte value is `\n`.  Pressing the *return* or *enter* key, depending on your keyboard, within a terminal window will generate this value.   We will simply refer to the key that generates a `\n` as the *enter* key.  
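
As a quick check of the byte value described above, here is a minimal Python sketch (Python is the language of this notebook's code cells). The variable names are ours and purely illustrative; the example command line `pwd` is just a sample.

```python
# Encode the newline character with the ASCII code and inspect its byte value.
newline_byte = "\n".encode("ascii")[0]

print(newline_byte)                     # decimal value
print(hex(newline_byte))                # hexadecimal value
print(format(newline_byte, "#010b"))    # binary, padded to 8 bits

# A "line" is a series of bytes terminated by a single newline byte.
line = "pwd\n".encode("ascii")
print(list(line))                       # byte values of 'p', 'w', 'd', '\n'
print(line.endswith(b"\n"))
```

Running this shows the decimal value 10, which is `0xa` in hex and `0b00001010` in binary.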

When `\n` is received by the shell it will begin processing the line as a command.  In this way it is line oriented.  If the user presses *enter* with no other preceding characters, this is considered a blank or empty line.  

In the other direction, the terminal emulator is configured to take the appropriate action when it receives a `\n`.  That is to say it moves the cursor to the next lower position in the window (scrolling if configured to do so) and moves it to the left edge of the new line (again scrolling the window if configured to).

    
><img style="display:inline; margin: 1em 1em 0px 0px;" align="left" width="60" src="../images/fyi.svg"> <p style="background-color:powderblue;"> There are two basic modes that the UNIX kernel can use when passing bytes received from the terminal connection to the program running on the connection (so far, in our case, this is the shell; later we will see that programs we run from the shell can be allowed to take over this role).  The two modes are character and line.  In character mode the bytes are sent along to the running program as soon as they are received, including the `\n`.  In line (or buffered) mode the kernel stores up the characters until it receives a `\n`, at which point it sends the complete line, including the `\n`, to the running program. The default mode is typically line mode.  For the most part it does not really matter to us, as the shell's behavior is largely the same: it is a line oriented program that is designed to process whole lines at a time.  However, later on this can be a source of confusion when we are writing programs that are not line oriented and instead assume that they will see characters as soon as keys are pressed by the user.  

    
### The Prompt

To visually let us know that the shell is ready for us to send it a line, it sends a configurable sequence of characters which we call the prompt string or simply the prompt.  When a new connection is established it sends the prompt to the terminal so that we know that it is ready for the session/conversation to begin.  The classic prompt string is often either the dollar symbol, `$`, or the greater than sign, `>`, followed by a single space character.  In our illustrations, and in the default configuration we use, the prompt is the classic dollar sign followed by a space.

><img style="display: inline; margin: 1em 1em 0px 0px;" align="left" width="60" src="../images/fyi.svg"> <p style="background-color:powderblue;"> Today most systems default to much more complicated prompt strings, where the prompt might state your user name, the date or time, what your current working directory is, etc.   Later when we learn about shell variables we will find out that there is a special variable `PS1` whose value is used to generate the prompt string every time the shell wants to send it.  Given that every terminal session/window is connected to a separate shell, it can be very useful to set each one with a unique prompt to help keep you organized eg. `export PS1='Term 1> '`.
If you are interested in the details use `man bash` to pull up the manual page for bash and look for the section called "Prompting".    

## Our first shell session

Before we more formally explain the operation of the shell let's poke it and see how it behaves.
Perhaps the simplest thing we can do is send it an empty line by pressing *enter* on its own.
What we should find is that the terminal will move to the next line and display the prompt again. 
But we want to be a little more careful to think about what happened behind the scenes so that we can build a more complete model in our mind about how the shell and terminal interactions work.

In [None]:
display(HTML(htmlFig(
    [
        [
            {'src':"../images/UnixL01_SHCHT/06SHLLChat.png",
             'caption':'A: Press Enter', 
             'border': '1px solid black',
             'padding':'1px',
             'cellwidth':'33.33%'
            },
            {'src':"../images/UnixL01_SHCHT/07SHLLChat.png",
             'caption':'B: Shell "blank line" input processing' , 
             'border': '1px solid black',
             'padding':'1px',
             'cellwidth':'33.33%'
            },
            {'src':"../images/UnixL01_SHCHT/08SHLLChat.png",
             'caption':'C: Shell sends Prompt back', 
             'border': '1px solid black',
             'padding':'1px',
             'cellwidth':'33.33%'
            },
        ]
    ],
    id="fig:shell-blankline",
    caption="<center> Figure: Shell blank line behavior </center>"
)))

What happened after we pressed and released the *enter* key? The terminal sent, through the Unix kernel, a byte to the instance of the shell program that our terminal is virtually connected to.  Precisely, it sent the binary pattern `0b00001010`, or in hex `0xa`, as the leftmost sub-figure (A) illustrates.  The [figure](#fig:shell-blankline) shows both the `\n` human readable ASCII symbol and the underlying byte value.  In other diagrams we might stop showing the byte values and only show the ASCII human readable symbol.  It is, however, important to remember that under the covers the terminal and shell are really just exchanging raw binary values that use the ASCII standard to encode the data.  If we were to use a special device, such as a [logic analyzer](https://en.wikipedia.org/wiki/Logic_analyzer), to peek at the electrical wires that make up the pathways transmitting data back and forth between the computer and the terminal, we would find electrical values that encode the binary for the characters being exchanged. 

The middle sub-figure (B) sketches the logic that the shell performs in response to receiving the blank line.  Specifically, it is programmed to do "nothing", simply sending the prompt back to the terminal so that the user knows that it, the shell, is ready for another command line to be sent.  If you are interested in why the terminal displayed a `\n`, causing the prompt sent by the shell to be on the next line of the terminal, read the fyi ECHO box below.

><img   style="display: inline; margin: 1em 1em 0px 0px;" align="left" width="50" src="../images/fyi.svg"> <p style="background-color:powderblue;"> **ECHO:**  You might be wondering why it is that you see the characters you type at the keyboard of the terminal appear on the screen of the terminal.  If pressing a key really sends the data to the Unix kernel and then to the program running, such as the shell, there is no reason we would see it.  Does the shell send back a copy of what it receives?  No, it does not.  Rather, the UNIX kernel is typically configured, by default, to send a copy of what it receives on a specific terminal connection back to the terminal itself.  This setting is called terminal ECHO mode.  So by default, even before the kernel sends data up to the destination program, it sends a copy back to the terminal so that the user can see what they typed.  This includes the `\n` that the user pressed to indicate the end of the command line.  Hence, in this default mode any data sent back by the shell will appear on a new line.  Programs can ask the kernel to disable this setting to have more control of what the user sees.  As a matter of fact, like many things in Unix, there is a command you can issue to have the kernel adjust the settings for your terminal connection, including turning off the echo behavior (`stty -echo` to turn it off and `stty echo` to turn it back on).  Doing so lets you observe what is really being sent back in response to your command and not what the kernel automatically generated as echo data.  Note that things might get confusing quickly if you turn echo off.   But Unix is all about understanding how things work and having the power to control them.  If you want to know more try `man stty`, but be warned: you will find that communication on a terminal is actually quite a bit more complex than it might seem -- there is a lot of history and skeletons buried here and the kernel is *cooking* the communications quite a bit ;-)

## Command lines

UNIX command lines can get quite complex. As a matter of fact, one of the hallmarks of UNIX expertise is the ability to compose long command lines that chain together many commands.  Such ability develops in time as you get familiar with working in the shell and learn to issue simple command lines.

The simplest command lines, other than the [blank command line](#fig:shell-blankline), are composed of a single "word".  The following are three examples of single word commands: `help`, `pwd` and `ls`.  



### Three simple examples

The `help` command is a "builtin" command that, well, provides you with some help :-)

In [None]:
bash.run("help", height='20em')

The `pwd` command is a builtin command that prints the shell's current working directory.  We will discuss what the working directory is a little [later](#wd).

In [None]:
bash.run("pwd")

The `ls` command is a standard UNIX external command that lists the contents of a directory.  As we will find out later, `ls` is a very powerful command that lets us explore the available directories and files.

In [None]:
bash.run("ls")

## Command line processing

When the shell receives a command line it goes through a series of steps to process it.  The rules of this processing define what is called the [Shell Grammar](https://man7.org/linux/man-pages/man1/bash.1.html#SHELL_GRAMMAR).  The Shell Grammar and builtin commands form an entire programming language.  Programs written in this language are called Shell Scripts and are used extensively to automate all sorts of tasks.     Over the years one learns more and more subtle aspects and usage of the shell grammar.  While it is possible to use the shell without really knowing its underlying grammar, having a working knowledge can really help in understanding why the shell behaves the way it does and its somewhat strange syntax.

In the remainder of this chapter we will build up our working knowledge of how the shell processes command lines and its grammar.  Let's begin by breaking down command line processing into the following six steps:

1. break up (split) the line into "blank" separated words (computer scientists also call these tokens)
2. perform expansions
3. parse redirections
4. execute simple commands
5. optionally wait for command to complete
6. print prompt and wait for new command line if in interactive mode
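
To make the shape of this loop concrete, here is a toy sketch in Python. It is not how bash is implemented: the function `toy_shell`, its single `echo` command, and the use of in-memory streams in place of a terminal are all assumptions made for illustration, and steps 2, 3 and 5 are left out entirely.

```python
import io
import shlex

PROMPT = "$ "  # the classic prompt: a dollar sign followed by a space


def toy_shell(terminal_input, terminal_output):
    """Toy read/split/execute loop supporting only a pretend `echo` command."""
    terminal_output.write(PROMPT)
    for line in terminal_input:              # each line ends with "\n"
        tokens = shlex.split(line)           # step 1: split into tokens
        # steps 2 and 3 (expansions, redirections) are omitted in this sketch
        if tokens:                           # a blank line yields no tokens
            name, args = tokens[0], tokens[1:]
            if name == "echo":               # step 4: execute the command
                terminal_output.write(" ".join(args) + "\n")
            else:
                terminal_output.write(name + ": command not found\n")
        terminal_output.write(PROMPT)        # step 6: prompt for the next line


# Simulate a user sending a blank line and then one command line.
out = io.StringIO()
toy_shell(io.StringIO("\necho hello   world\n"), out)
print(repr(out.getvalue()))
```

The captured output contains only what the toy shell sent: a prompt, another prompt in response to the blank line, the `echo` output, and a final prompt. The typed characters do not appear because nothing in this sketch plays the role of the kernel's echo.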

### Step 1: Breaking down a "Command line" into Simple Commands and Arguments

In this step the shell splits the command line up into sets of tokens that form "simple commands".   Several simple commands can make up a single command line. The common boundary that separates tokens is "blanks".  Blanks are `space` and `tab` characters ([blanks](https://man7.org/linux/man-pages/man1/bash.1.html#DEFINITIONS)).   In the ASCII code, space characters are encoded with a byte value of `0x20` and a tab is encoded by `0x09`.  So a sequence of one or more of these two characters will cause the shell to split what is before and after the sequence into two separate tokens.  

In addition to the blank `space` and `tab` characters, the following characters `|  & ; ( ) < > newline` will also indicate the separation of tokens.  Together, these characters are known as the shell "metacharacters".  In time we will learn how these characters affect the commands to be executed. Some will let us combine simple commands in various ways.

The first token of a simple command is the name of the command to execute.  The following tokens in the set will be passed to the command as arguments.     To know where the arguments of a particular simple command end, the shell looks for one of the following "control" operators.  Notice that some of the metacharacters, `| & ; ( )` and `newline`, both separate tokens and terminate the tokens of a simple command.    Using the control operators allows us to chain independent simple commands together. 


<center><em>Table Shell Control Operators</em></center>


| Control Operator | Description    |
| :--------------: | :-------------: |
|  newline   |  End of line |
|      $||$        |  Or list operator   |
|      $\&\&$      |  And list operator  |
|     $;$          |  Sequential list operator |
|     $|$          |  Pipe Operator |
|     $\&$         |  Background Operator | 
|    $($ | Subshell list begin Operator |
|    $)$ | Subshell list end Operator |
|     $;;$ | Case statement end matching Operator |      
|     $;\&$ | Case statement fall through Operator |
|     $;;\&$ | Case statement continue matching Operator |
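
The role of control operators as terminators can be sketched with Python's standard `shlex` module, whose `punctuation_chars=True` option treats runs of `( ) ; < > | &` as separate tokens. This only approximates bash's real tokenizer, and the helper `simple_commands` and the sample command line are hypothetical. Newline is not handled here because we pass in one line at a time.

```python
import shlex

# The control operators from the table above (newline is handled by
# feeding the function one line at a time).
CONTROL_OPERATORS = {"||", "&&", ";", "|", "&", "(", ")", ";;", ";&", ";;&"}


def simple_commands(line):
    """Group the tokens of one command line into simple commands."""
    lexer = shlex.shlex(line, posix=True, punctuation_chars=True)
    commands, current = [], []
    for token in lexer:
        if token in CONTROL_OPERATORS:
            # A control operator terminates the current simple command.
            if current:
                commands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        commands.append(current)
    return commands


print(simple_commands("echo hello ; pwd && ls"))
```

For this sample line the sketch yields three simple commands: `echo` with the argument `hello`, then `pwd`, then `ls`. It deliberately ignores what each operator means; it only shows where one simple command ends and the next begins.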





#### Comments

If a token begins with the pound character, `#`,  then it and all the remaining tokens on the command line are ignored.  As such the `#` lets us add comments to what we are doing.  This will be particularly useful when we write shell scripts.  But comments can be helpful even when using the shell interactively (see the section on command line history below).

For example, we can ignore all tokens of the command line by placing a `#` either as a token on its own at the beginning of a line,

In [None]:
bash.run("# This line is ignored.  The first token is a # on its own.")

or by adding it to the beginning of the first token.

In [None]:
bash.run("#This line is ignored. As the first token is #This which begins with #.")

In [None]:
bash.run(b'# the next line ends in a comment\necho hello # rest of tokens are ignored')

Another example

In [None]:
bash.run("pwd # print the current working directory")
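
The effect of `#` on the token list can be imitated with Python's `shlex.split` and its `comments=True` option. This is a rough stand-in for bash, not a faithful model: unlike bash, `shlex` also starts a comment when `#` appears in the middle of a word, so the sketch only uses lines where `#` begins a token.

```python
import shlex

lines = [
    "# This line is ignored.  The first token is a # on its own.",
    "echo hello # rest of tokens are ignored",
    "pwd # print the current working directory",
]

# With comments=True everything from the '#' to the end of the line is dropped.
for line in lines:
    print(shlex.split(line, comments=True))
```

The first line produces no tokens at all, which is why the shell treats it like a blank line; the other two keep only the tokens before the `#`.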

#### Simple Command Line Splitting examples

Let's look at a few examples of command lines and see if we can figure out what the command and the arguments are.
To help us denote spaces, tabs and newlines we will use the following notation respectively: `\ `, `\t` and `\n`.


1. Single word simple command
```bash
echo\n
```
    - Breakdown: 
          - Tokens:
               1. echo
          - Command Name: echo
    - Explanation:  Like our previous examples, this command line is formed of one simple, one word command. The ending newline terminates both the set of tokens forming our command and the line itself.   In this case the command is the `echo` command.  `echo` is a bash builtin command that prints back its arguments followed by a newline (see `help echo`).  Given that we are not passing any additional arguments we expect echo to simply print a blank line.

In [None]:
bash.run("echo")

2. command with one argument
```bash
echo\ hello\n
```
    - Breakdown: 
          - Tokens:
               1. echo
               2. hello
          - Command Name: echo
          - Arguments:
               1. hello
    - Explanation: Here our command line is composed of a single simple command which is composed of two tokens separated by a single space.  The first token is the command and the second is the first and only argument to the command.

In [None]:
bash.run('echo hello')

3. command with multiple arguments using multiple blanks as separators
```bash
\t\ echo\ hello\ \ goodbye\ \t\ me\ again\ \t\n
```
    - Breakdown: 
          - Tokens:
               1. echo
               2. hello
               3. goodbye
               4. me
               5. again
          - Command Name: echo
          - Arguments:
               1. hello
               2. goodbye
               3. me
               4. again
    - Explanation: Here our command line is composed of a single simple command which is composed of five tokens separated by differing combinations of spaces and tabs. To begin with we have a tab and a space preceding the first token.  Then we have our first token, `echo`, which is our command.  The next space separates `echo` from `hello`.  Two spaces then separate `hello` from `goodbye`.  Then the sequence of space, tab, space separates `goodbye` from `me`.  A single space separates `me` from `again`.  `again` is terminated as a token by the following space, tab and newline.  It is important to note that when the shell runs echo the separating blanks will be eliminated and only the tokens will be passed to echo as individual arguments.  

In [None]:
bash.run(b'\t echo hello  goodbye \t me again \t')
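
The third breakdown can be reproduced with Python's `shlex.split`, which splits on blanks in a way that matches the shell for simple lines like this one. It is an approximation of the shell's splitting step, not bash itself, and the variable names are ours.

```python
import shlex

# The command line from example 3, written with real tab, space and
# newline characters instead of the \t, "\ " and \n notation.
line = "\t echo hello  goodbye \t me again \t\n"

tokens = shlex.split(line)
command_name, arguments = tokens[0], tokens[1:]

print(tokens)
print(command_name)
print(arguments)

# The two blank characters and their ASCII byte values.
print(hex(ord(" ")), hex(ord("\t")))
```

The blanks themselves never appear in the token list; they only mark where one token ends and the next begins.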

#### Quotes and Command Line Splitting

There are times that one might want to include spaces or tabs in an argument to a command.  For example, let's say we want to use echo to print the following as a single argument that has the spaces as part of it (not three arguments: `hello`, `...`, and `goodbye`).

```
hello    ...    goodbye
```

To do this, the shell lets us use quotes to tell it not to do splitting.  The area enclosed in either double ("") or single ('') quotes will be treated as a single token.  We will see later that the reason we might choose between double and single quotes has to do with controlling the shell's expansion behaviour.  So, to send the above to echo as a single argument, we would use either of the following command lines.

In [None]:
bash.run('echo "hello    ...    goodbye"')

In [None]:
bash.run("echo 'hello    ...    goodbye'")

##### Escaping blanks

If we need finer control the shell also allows us to "escape" individual blank characters.  Escaped blanks will no longer be treated like a blank but instead be treated like a non-blank character.   To do this we put a backslash in front of the space or tab we want to escape. 

For example:

In [None]:
bash.run(r'echo hello\ \ \ ...\ \ \ goodbye')
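
To compare the token lists produced by the unquoted, quoted and escaped forms side by side, we can again lean on Python's `shlex.split` as an approximation of the shell's splitting rules. The variable names are illustrative only.

```python
import shlex

unquoted = shlex.split("echo hello    ...    goodbye")
double_quoted = shlex.split('echo "hello    ...    goodbye"')
single_quoted = shlex.split("echo 'hello    ...    goodbye'")
# A raw string keeps each backslash so the tokenizer sees "\ " pairs.
escaped = shlex.split(r"echo hello\ \ \ ...\ \ \ goodbye")

print(unquoted)       # four tokens: the blanks are only separators
print(double_quoted)  # two tokens: the blanks are part of the argument
print(single_quoted)  # two tokens, same as with double quotes
print(escaped)        # two tokens: each escaped space is kept
```

Note that the quote characters and backslashes are not part of the resulting argument; only the characters they protect remain.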

#### Summary

While the above can seem intimidating at first, it is worth remembering that the most common kind of command lines we use on a daily basis are composed of one simple command, often with no arguments specified or at most a few arguments separated by single spaces.  The following are examples of the kind of commands that one typically might use.

In [None]:
# we use a new shell for this to avoid pollution of our primary bash session
bash.run('''ls
mkdir mydir
ls
cd mydir
touch myfirstfile
ls
echo "Hello.  Hello." > myfirstfile
cat myfirstfile
cp myfirstfile mysecondfile
ls 
echo "Goodbye and farewell." > mysecondfile
diff myfirstfile mysecondfile
cd''')

In all these cases the line splitting stage of command line processing is very simple and intuitive.  

#### Command line history

Bash keeps a history of all the command lines we have executed.  Learning to view and navigate this history can really help speed up our daily work.    We will cover some basics to help you be productive.  You can find more details about working with your bash history here:
- Using the bash history https://www.gnu.org/software/bash/manual/bash.html#Using-History-Interactively
- searching the history https://www.gnu.org/software/bash/manual/bash.html#Searching
- keyboard short cuts for navigating the history https://www.gnu.org/software/bash/manual/bash.html#Commands-For-History


To view the entire history we can use the history command.

In [None]:
bash.run('history')

To rerun a particular command in the history you can enter the exclamation character, `!` followed by the history number, n,  of the command.  For example to rerun the command numbered 2 you would do the following:

In [None]:
bash.run('!2')

A shortcut for the last command is `!!`. Eg.

In [None]:
bash.run("pwd\n!!")

Bash also has some behavior that lets you avoid having commands recorded in the history.  A common default configuration (`HISTCONTROL=ignoreboth`) is that command lines whose first character is a space, as well as lines that duplicate the previous entry, will not be put into the history.  (See HISTIGNORE and HISTCONTROL in the bash manual for more details on how to control and modify this behavior).  For example, if we do not want our use of the history command to "pollute" our history we would add a space to the front of the line.  Eg.

In [None]:
bash.run(" history")
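
The recording rule can be written down in a few lines of Python. This is only a sketch of the policy bash documents for `HISTCONTROL=ignoreboth` (skip lines that start with a space, skip a line identical to the previous entry); the function `record` and the sample lines are made up for illustration and do not reflect how bash stores its history.

```python
def record(history, line):
    """Append line to history unless the ignoreboth-style rules reject it."""
    if line.startswith(" "):                  # ignorespace
        return
    if history and history[-1] == line:       # ignoredups
        return
    history.append(line)


history = []
for line in ["pwd", "pwd", " history", "ls", "pwd"]:
    record(history, line)

print(history)
```

Only a repeat of the immediately preceding entry is skipped, so the final `pwd` is recorded again after `ls`.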

Finally, perhaps the most important aspect for a beginner is the ability to use key sequences to: 
- go backward in the history : `previous-history`
- go forward in the history : `next-history`
- search history backwards : `reverse-search-history`
- search history forwards : `forward-search-history`

Each of these abilities is "bound" to various key sequences.  Usually previous and next will be mapped to your arrow keys.  If you want to learn what the default "bindings" are (which key sequences trigger each of the above abilities) you can use the following commands:

In [None]:
bash.run("bind -q previous-history\nbind -q next-history\nbind -q reverse-search-history\nbind -q forward-search-history")

The above output can seem very cryptic.  Here is a quick and dirty explanation.  Search the bash manual for "key bindings" for more details.

- `\C` means press the control key
- `\e` means press the escape key
- `-`  means that while pressing the preceding key you press and release the next key listed. Eg. `\C-p` means press and hold down the control key then press and release the `p` key.  
- if there is no `-` present then you press and release the keys as a sequence.  Eg. `\eOA` means press and release the escape key, then press and release the capital o key (`O`), followed by pressing and releasing the capital a key (`A`).
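
In keeping with the idea that the terminal and shell only exchange bytes, the notation above can be translated into the byte values a key sequence produces. The sketch below assumes the conventional ASCII behaviour of the control key (it keeps only the low five bits of the key's code); the helper `control` is hypothetical.

```python
ESC = 0x1B  # ASCII value of the escape character, written \e above


def control(key):
    """Byte for Control+key: the low five bits of the key's ASCII code."""
    return ord(key) & 0x1F


# \C-p : hold control, press p
print(hex(control("p")))

# \eOA : escape, then capital O, then capital A, sent as three bytes
sequence = bytes([ESC]) + b"OA"
print(sequence)
print([hex(b) for b in sequence])
```

On many terminals an arrow key sends a multi-byte escape sequence of this kind in a single key press, which is why such sequences show up in the binding output even though nobody types them by hand.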

The best thing to do is to play around with these features.  Eg. try out all the prior commands and then use the history key binding to navigate the history and re-run some commands.


### Step 2: Perform Expansions

After the shell has split a command line into tokens it performs "expansion" on the tokens.  An expansion is the substitution of some parts of a token with various other values.  In particular, it looks for specific special control characters to identify what needs expansion and what kind of expansion to do.  

Expansions add a lot of power to what you can do with command lines but they can also be a little overwhelming at first.    We will focus on understanding the three most common expansions.  In time, as you gain greater familiarity with the shell, we encourage you to explore others.  

See https://man7.org/linux/man-pages/man1/bash.1.html#EXPANSION for the details.

#### Nine Command line Expansions

Below is a brief overview of all nine expansions and our recommendations on the order you may want to explore them. 


##### Brace Expansion

- Priority: You can wait to learn this one
- Synopsis: Provides the ability to 

##### Tilde Expansion

- Priority:
- Synopsis:

##### Command Substitution

- Priority:
- Synopsis:

##### Arithmetic Expansion

- Priority:
- Synopsis:

##### Process Substitution

- Priority:
- Synopsis:

##### Word Splitting on Expansion

- Priority:
- Synopsis:

##### Filename Expansion

- Priority:
- Synopsis:

##### Quote Removal

- Priority:
- Synopsis:



### Step 3: Parse Redirections

### Step 4: Execute the command

### Step 5: Optionally wait for the command to complete

### Step 6: Print prompt if interactive

## Commands

There are two categories of shell commands: 1) Simple Commands and 2) [Compound Commands](https://www.gnu.org/software/bash/manual/bash.html#Compound-Commands).  Simple Commands are the bread and butter of how we use the shell and what we focus on in this chapter.  Compound commands allow one to group simple commands together into a single unit.  Compound commands add

### Simple Commands
As one might expect, the heart of shell command lines are commands. A command is identified by a name formed from ASCII characters.  In our simple examples above these were, `help`, `pwd` and `ls`.   There are two core types of commands:
1. builtin commands
2. external commands

To truly understand the Unix development environment and the philosophy behind it one needs to understand the distinction between shell built-in commands and external commands/programs.

### Variable Assignments

As part of processing a Simple Command the shell supports the ability to set "variables".  (
#### Built-in commands 

As we saw t

## Redirection

## Pipelines

# Summary 

Now that we have a sense for the basic model of interacting with the shell, Unix's ASCII oriented human interface, we can go on and learn about some of the actual commands, both internal and external.  Along with the commands are common ways to use them to get things/tasks done.

## Exercises

### Given the following command line determine how it will be split into tokens and which token is the command and which are the arguments.
Again we use `\ `, `\t` and `\n` to indicate space, tab and newline characters respectively.
```
\ foo\t\t\t\ bar\ blah\ goo\n
```


In [None]:
Answer('''
> There are four tokens: `foo`, `bar`, `blah` and `goo`.  The command token is `foo`. The arguments to `foo` will be; 1. `bar`, 2. `blah` and 3. `goo`
''')