# File->Save Notebook As...!!!

Save this notebook as **CapturingOutput.ipynb**!!! (remove "_orig")


# Capturing command output

It can be useful to save the output of a command to a variable for later use.

## Syntax
```BASH
     varname=value
     varname=$( command )
```
A common beginner error is to put spaces around the equal sign. That's OK in some languages, but not in BASH, where a space makes the shell treat part of the line as a command to run.

**WRONG:**
```

varname =value
varname= value

varname =$(command)
varname= $(command)

varname=(command)
 ```

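Notice that we need the `$` on the right side of the equal sign `=` when capturing a command. It must accompany the parentheses: without it, `varname=(command)` creates an array holding the word `command` instead of running it.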
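As a small sketch of the rules above, the following captures the output of `uname -s` (chosen only for illustration; it is not one of the exercises) and then shows what the shell does when a space sneaks in before the `=`. The command name `mycount` is hypothetical and assumed not to exist on your system.

```BASH
# Correct: no spaces around '=', and '$' directly in front of the parentheses.
kernel=$(uname -s)
echo "Kernel name: $kernel"

# Wrong: with a space before '=', the shell looks for a command called
# 'mycount' (hypothetical, assumed not to exist) and passes it '=5'.
status=0
mycount =5 2>/dev/null || status=$?
echo "Exit status of 'mycount =5': $status"
```

A non-zero exit status means the line was run as a command and failed, so no variable was assigned.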
 
--- 
Capture the output of the command `whoami` to the variable `myname`. Remember to enclose the command `whoami` in `$( )`

In [None]:
myname=...

Check the outcome of this expression with the command `echo`

In [None]:
echo ...

You can now wrap the variable in other text. Use echo to print 
```BASH
"My name is $myname. Hello!"
```

In [None]:
echo ...
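One detail worth knowing when wrapping a variable in text is how quoting affects it. The sketch below uses a made-up variable, `animal`, rather than the exercise variables, and is only meant to show the general behavior.

```BASH
animal=$(echo cat)

# Double quotes expand the variable; single quotes print the text literally.
echo "I have a $animal"
echo 'I have a $animal'

# Braces mark where the variable name ends when letters follow it directly.
plural="I have two ${animal}s"
echo "$plural"
```

Without the braces, the shell would look for a variable named `animals`, which is not set here.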

You can capture many kinds of data this way. Save the outcome of `hostname` to the variable `whereami`.

Now print a message using both of those variables.

In [None]:
echo ...

## More practice

Save `pwd` to varname `dir` and echo the following statement:

    "My current directory is $dir"


In [None]:
dir=...
echo ...

---
Save `date +%Y` to varname `year` and echo the following statement:
    
    "The best year so far is: $year"
    

In [None]:
year=...
echo ...

What does `+%Y` do? What would I use instead to get the Day of the Week?
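Without giving away the answer, here is a rough sketch of how several format codes can be combined in one `date` call. The codes `%m` (month) and `%d` (day of the month) and the variable name `stamp` are additions for illustration, not part of the exercise.

```BASH
# Combine several format codes into one string: year-month-day.
stamp=$(date +%Y-%m-%d)
echo "Today's date stamp is $stamp"
```

The full list of format codes is in `man date`.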

---
Save `du -h --summarize` to `usage` and echo the following statement:

    "My disk usage is $usage"

In [None]:
usage=...
echo ...

There is extra output (a '.' for the current dir) that is a little messy. We'll deal with parsing output later.

What are these arguments for `du`?
 * `-h`
 * `--summarize`
 
 ---

# Capturing output from pipes

It's OK to capture the outcome of a pipe, just like a single command.
```BASH
        pipeoutput=$( command1 | command2 )
```
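A minimal sketch of the idea, using `printf` to supply some made-up input so the result does not depend on your directory contents:

```BASH
# Three unsorted words, one per line; 'sort' orders them, 'head' keeps the first.
first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "First word alphabetically: $first"
```

Only the final output of the pipe ends up in the variable; the intermediate results are not stored anywhere.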

Let's sort our directory contents by size. Get directory contents in detail form below: `ls -l`

In [None]:
ls -l

Now, pipe those results to `sort`, `sort -k1`, `sort -k5`, and `sort -k5n`:

In [None]:
ls -l | sort  -k5n

What did each version do? Notice that sorting numerically on column 5 is the same as sorting by file size, which has a flag in `ls`.

These all sort by file size (note that `ls -S` puts the largest file first, while `sort -k5n` puts the smallest first):
```BASH

# get long-form from 'ls', then sort by column 5 (n = numerically)
ls -l | sort -k5n

# make `ls` sort by size, and do long form
ls -S -l

# same as above, flags combined
ls -Sl

```
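To see what the `n` in the sort key changes, here is a sketch on a small made-up table (a name in column 1 and a size in column 2) instead of real `ls -l` output. The data and variable names are hypothetical.

```BASH
# Hypothetical two-column data: a name and a size.
sizes=$(printf 'a 10\nb 9\nc 100\n')

# Without 'n', column 2 is compared as text, so "100" sorts before "9".
echo "$sizes" | sort -k2

# With 'n', column 2 is compared as a number.
echo "$sizes" | sort -k2n

smallest=$(echo "$sizes" | sort -k2n | head -n 1)
echo "Smallest entry: $smallest"
```

The same reasoning applies to column 5 of `ls -l`, where file sizes have different numbers of digits.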

Now let's select only the largest file by adding a final `tail` command.

```BASH

ls -l | sort -k5n | tail -n 1

```

Save it to the variable `largestFile` using the `$( ... )` construct. `echo` that variable on the second line.

In [None]:
largestFile=...
echo ...

Change the variable and the command to get the smallest file.

In [None]:
smallestFile=...
echo ...

Did that work? What command can you add to filter out the line "total"?

### Extracting the file name

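But it's cleaner to just grab the filename. **What command can we use to extract the file name?**

Unfortunately, `cut` does not split on "whitespace". It splits on a single delimiter character (TAB by default), so a run of several spaces produces empty fields.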

### Whitespace: definition ###

*Definition:* "whitespace". Any sequence of spaces, tabs, and newlines.

There is a command that will do it, but its syntax is more complex. It is `awk`.

Example: 
```BASH
ls -l | awk '{print $5}'
```

 will ***print the 5th column*** in a whitespace-delimited stream of input, line by line. In `ls -l`, the 5th column is the file size. 
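For comparison with `cut`, here is a small sketch on a made-up line of text in which the first two words are separated by several spaces. The variable names are for illustration only.

```BASH
line="alpha    beta gamma"   # four spaces between alpha and beta

# 'cut' treats every single space as a delimiter, so field 2 is empty.
with_cut=$(echo "$line" | cut -d' ' -f2)

# 'awk' treats any run of whitespace as one separator.
with_awk=$(echo "$line" | awk '{print $2}')

echo "cut gives: '$with_cut'"
echo "awk gives: '$with_awk'"
```

Since `ls -l` pads its columns with a varying number of spaces, `awk` is the more dependable choice there.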
 
 Try finding the column for the filename by changing the `5` to the right column number:

In [None]:
ls -l | awk ...

Now, add the `awk` command to pipes in order to make the `echo` statement accurate:

In [None]:
smallestFile=...
largestFile=...
echo "The smallest file in the current directory is: $smallestFile. The largest file is $largestFile."

---

This was educational for building up a pipeline of commands. Know that the entire pipeline can be handled with flags to `ls`, except for the `head` and `tail` commands.

**Man-page Challenge:** 
* How do you rewrite the pipes with just two commands each? 
* Understanding `ls` output: What additional `grep` command could you use to exclude directories?

## Script Version

Now, copy the following code to ***a new file in your terminal*** called `smallestLargest.sh`

```BASH
#!/bin/bash

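# Sort the long listing by size (column 5), drop the "total" line,
# and keep only the file name (column 9). Quoting "$@" protects names with spaces.
smallestFile=$(ls -l "$@" | sort -k5n | grep -v total | head -n 1 | awk '{print $9}')
largestFile=$(ls -l "$@" | sort -k5n | grep -v total | tail -n 1 | awk '{print $9}')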
echo "The smallest file in the input is: $smallestFile." 
echo "The largest file in the input is $largestFile."
```
*Note- the syntax highlighting above breaks down...too bad.*


Save the file and set execute permissions. `chmod 755 smallestLargest.sh`

Now run the script. `./smallestLargest.sh`

I have introduced a new special variable here, `$@`, which a script uses to refer to all of the arguments given on the command line.
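`$@` behaves the same way inside a shell function, which makes it easy to try out in a notebook cell. The sketch below uses a hypothetical function, `show_args`, and also the related variable `$#` (the number of arguments), which is not covered in this notebook.

```BASH
# Hypothetical helper function, only for illustration.
show_args() {
    echo "Got $# argument(s)"
    # "$@" expands to each argument as a separate, intact word.
    for arg in "$@"; do
        echo "  -> $arg"
    done
}

show_args one "two words" three
```

Because `"$@"` is quoted, `two words` stays together as a single argument.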

Now you may run the script with specific files: `./smallestLargest.sh *.ipynb`

What did that do differently?

We will explore script variables next.

---

### New commands in this notebook:

* `sort -k5n`
* `awk '{print $5}'`
* `du -h --summarize`
* `date +%Y`

### New syntax in this notebook:

* Capturing output: `$( )`
* Special script variable: `$@`

### New definitions in this notebook:
* *TAB* is a special character, not just a bunch of spaces.
* *Whitespace* is a sequence of TABs, spaces, or newlines (although most commands we use work a line at a time.)