# Connecting Commands in Unix

### Questions:
- How can you use pipes to connect commands and perform powerful functions in Unix?

### Objectives:
- Learn how to connect Unix commands to analyze and refine data.

### Keypoints:

- Unix commands can be put together in powerful ways.


In [None]:
# Make sure you have the most up to date code
%cd ~/be487-fall-2024
!git stash
!git pull

### Section 1: Finding the Number of Unique Users

Find the number of unique users on a shared system.

We know that "w" will tell us the users logged in.  Try it now on a system that has many users (i.e., the HPC) and see the output.  We'll connect the output of "w" to "head" using a pipe ("|") so that we see only the first five lines:

In [None]:
'''
Type the commands below, and run the cell
!w | head -5
'''

#### What do you get?

You should get something that looks like this...

```
$ w | head -5
 14:36:01 up 21 days, 21:51, 176 users,  load average: 3.83, 4.31, 4.47
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
antontre pts/1    149.165.156.129  Sat14    3days  0.10s  0.10s /bin/sh -i
huddack  pts/3    128.4.131.189    09:38    4:56m  0.15s  0.15s -bash
```

What if we want to see the first five _users_, not the first five lines of output?  To skip the two header lines from "w", we first pipe "w" into "awk" and tell it to print a line only when the record number (NR), which counts the lines read so far, is greater than 2:

In [None]:
'''
Type the commands below, and run the cell
!w | awk 'NR>2' | head -5
'''

#### You should see this...

```
antontre pts/1    149.165.156.129  Sat14    4days  0.10s  0.10s /bin/sh -i
huddack  pts/3    128.4.131.189    09:38    5:13m  0.15s  0.15s -bash
antontre pts/5    149.165.156.129  Sun19    2days  0.14s  0.14s /bin/sh -i
minyard  pts/8    129.114.64.18    29Jul16  4:24m  3:46m  3:46m top
antontre pts/11   149.165.156.129  Sun23    2days  0.24s  0.24s /bin/sh -i
```
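To see what "NR>2" does without depending on who happens to be logged in, here is a minimal sketch you can run in a terminal on canned input.  The `sample_w` function is a hypothetical stand-in for "w" (its lines are copied from the output above), and the "tail -n +3" line is just an alternative way to skip the headers, not something this exercise requires.

```shell
# Hypothetical stand-in for `w`: two header lines, then one line per login.
sample_w() {
  printf '%s\n' \
    ' 14:36:01 up 21 days, 21:51, 176 users,  load average: 3.83, 4.31, 4.47' \
    'USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT' \
    'antontre pts/1    149.165.156.129  Sat14    3days  0.10s  0.10s /bin/sh -i' \
    'huddack  pts/3    128.4.131.189    09:38    4:56m  0.15s  0.15s -bash' \
    'antontre pts/5    149.165.156.129  Sun19    2days  0.14s  0.14s /bin/sh -i'
}

# NR is the number of the line awk is currently reading.
# A pattern with no action prints every line for which the pattern is true.
body=$(sample_w | awk 'NR>2')
echo "$body"

# tail can do the same job: start printing at line 3.
sample_w | tail -n +3
```

Both commands print the same three login lines; only the two headers are dropped.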

Let's "cut" out just the first column of data.  The manpage for "cut" says that it defaults to using the tab character to determine columns, so we'll need to tell it to use spaces:

In [None]:
'''
Type the commands below, and run the cell
!w | awk 'NR>2' | head -5 | cut -d ' ' -f 1
'''

#### You should see this...

```
antontre
huddack
antontre
minyard
antontre
```
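One thing to watch with a space delimiter: "cut" treats every single space as a separator, so the padding between columns creates empty fields.  Field 1 works here because nothing comes before the user name, but later columns are harder to reach.  The sketch below uses one line copied from the output above; the variable names are only for illustration.

```shell
# One line of w-style output; the columns are padded with runs of spaces.
line='huddack  pts/3    128.4.131.189    09:38'

# Field 1 is everything before the first space.
first=$(echo "$line" | cut -d ' ' -f 1)
echo "field 1: [$first]"

# Field 2 is the empty string between the two spaces after the user name.
second=$(echo "$line" | cut -d ' ' -f 2)
echo "field 2: [$second]"

# awk splits on runs of whitespace by default, so $2 is the TTY column.
tty=$(echo "$line" | awk '{print $2}')
echo "awk \$2:  [$tty]"
```

If you ever need a column other than the first from output like this, "awk" is usually the easier tool.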

We can see right away that some users like "antontre" are logged in multiple times, so let's "uniq" that output:

In [None]:
'''
Type the commands below, and run the cell
!w | awk 'NR>2' | head -5 | cut -d ' ' -f 1 | uniq
'''

#### You should see something like this

```
antontre
huddack
antontre
minyard
antontre
```

Hmm, that's not right.  Remember I said earlier that "uniq" only works _on sorted input_?  It only collapses duplicate lines that are next to each other, so let's sort those names first:

In [None]:
'''
Type the commands below, and run the cell
!w | awk 'NR>2' | head -5 | cut -d ' ' -f 1 | sort | uniq
'''

#### Now you should just see a unique list of names, nothing repeated...

```
antontre
huddack
minyard
```
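Here is a small sketch of why the sort matters, using the five names from the output above as fixed input.  The "sort -u" and "uniq -c" lines are optional extras that the exercise itself does not use.

```shell
# The five names in the order `w` reported them.
names=$(printf '%s\n' antontre huddack antontre minyard antontre)

# uniq compares each line only with the line right before it,
# so nothing is removed when duplicates are not adjacent.
adjacent_only=$(echo "$names" | uniq)
echo "$adjacent_only"

# Sorting puts the duplicates next to each other, so uniq can collapse them.
deduped=$(echo "$names" | sort | uniq)
echo "$deduped"

# sort -u does the sort and the de-duplication in one step.
echo "$names" | sort -u

# uniq -c also reports how many times each name appeared.
echo "$names" | sort | uniq -c
```

The first "echo" prints all five names unchanged; the second prints three.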

OK, that is correct.  Now let's remove the "head -5" and use "wc" to count all the lines \(-l\) of input:

In [None]:
'''
Type the commands below, and run the cell
!w | awk 'NR>2' | cut -d ' ' -f 1 | sort | uniq | wc -l
'''
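Once a pipeline works, you can wrap it in a shell function and reuse it.  The sketch below is one way to do that: `count_unique_users` is a hypothetical name, and it merges the "awk" and "cut" steps (and the "sort" and "uniq" steps) to show that several stages can often be combined.  The sample input is copied from the "w" output shown earlier so that the result is predictable.

```shell
# Read w-style output on STDIN and print the number of distinct user names.
count_unique_users() {
  # awk skips the two header lines and prints only column 1;
  # sort -u sorts the names and drops duplicates; wc -l counts what is left.
  awk 'NR>2 {print $1}' | sort -u | wc -l
}

# Canned input: two header lines followed by four logins from three users.
sample=$(printf '%s\n' \
  ' 14:36:01 up 21 days, 21:51, 176 users,  load average: 3.83, 4.31, 4.47' \
  'USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT' \
  'antontre pts/1    149.165.156.129  Sat14    4days  0.10s  0.10s /bin/sh -i' \
  'huddack  pts/3    128.4.131.189    09:38    5:13m  0.15s  0.15s -bash' \
  'antontre pts/5    149.165.156.129  Sun19    2days  0.14s  0.14s /bin/sh -i' \
  'minyard  pts/8    129.114.64.18    29Jul16  4:24m  3:46m  3:46m top')

echo "$sample" | count_unique_users
```

On a real system you would feed it live data with `w | count_unique_users`.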

So what you see is that we're connecting small, well-defined programs together, using pipes to feed the "standard output" (STDOUT) stream of one program into the "standard input" (STDIN) stream of the next.  There's a third basic file handle in Unix called "standard error" (STDERR) that we'll come across later.  It's a way for programs to report problems without simply dying, and without mixing the messages into their regular output.  You can redirect errors into a file like so:

```
$ program 2>err
$ program 1>out 2>err
```

The first example puts STDERR into a file called "err" and lets STDOUT print to the terminal.  The second example captures STDOUT into a file called "out" while STDERR goes to "err."

> Protip: Sometimes a program will complain about things that you cannot fix, e.g., "find" may complain about file permissions that you don't care about.  In those cases, you can redirect STDERR to a special file called "/dev/null" where the messages are forgotten forever.  Kind of like the "memory hole" in 1984.

```
find / -name my-file.txt 2>/dev/null
```
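To watch the two output streams go to different places, you can try a sketch like the following in a terminal.  The `noisy` function is a hypothetical stand-in for any program that writes to both STDOUT and STDERR, and the files live in a temporary directory made with "mktemp" so nothing in your working directory is touched.

```shell
# Hypothetical program that writes one line to each stream.
noisy() {
  echo "result"        # goes to STDOUT (file descriptor 1)
  echo "warning" >&2   # goes to STDERR (file descriptor 2)
}

workdir=$(mktemp -d)

# Capture each stream in its own file.
noisy 1>"$workdir/out" 2>"$workdir/err"
cat "$workdir/out"
cat "$workdir/err"

# A pipe carries only STDOUT, so wc counts one line here.
noisy 2>/dev/null | wc -l

# 2>&1 sends STDERR to the same place as STDOUT, so wc now counts two lines.
noisy 2>&1 | wc -l
```

The last two commands show why STDERR matters in pipelines: error messages do not travel through the pipe unless you ask for that explicitly.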

### Section 2: Counting "oo" Words

On almost every Unix system, you can find "/usr/share/dict/words."  Let's use "grep" to find how many of those words contain the "oo" vowel combination.  It's a long list, so I'll pipe it into "head" to see just the first five:

In [None]:
'''
Type the commands below, and run the cell
!grep 'oo' /usr/share/dict/words | head -5
'''

#### You should see something like this

```
abloom
aboon
aboveproof
abrood
abrook
```

Yes, that works, so redirect those words into a file and count them:

In [None]:
'''
Type the commands below, and run the cell
!grep 'oo' /usr/share/dict/words > oo-words
!wc -l oo-words
'''

#### You should see something like this

`10460 oo-words`
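A few details about redirecting into a file are easy to trip over, so here is a small sketch using a made-up three-word list instead of the real dictionary.  The file is written to a temporary directory created with "mktemp"; the words themselves are just placeholders.

```shell
workdir=$(mktemp -d)

# ">" creates the file, or overwrites it if it already exists.
printf '%s\n' bloom book moon > "$workdir/oo-words"

# Given a file name, wc prints the count followed by that name.
wc -l "$workdir/oo-words"

# Reading from STDIN instead, wc prints only the count.
wc -l < "$workdir/oo-words"

# ">>" appends to the file instead of overwriting it.
echo scooter >> "$workdir/oo-words"
wc -l < "$workdir/oo-words"
```

The final count is one higher than the previous one because the append added a line and kept the first three.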

Let's count them directly out of "grep":

In [None]:
'''
Type the commands below, and run the cell
!grep 'oo' /usr/share/dict/words | wc -l
'''
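Piping "grep" into "wc -l" is the general pattern, but "grep" can also count matching lines itself with the "-c" option.  A minimal sketch, using a made-up word list in place of the dictionary:

```shell
# Hypothetical word list; four of the six words contain "oo".
words=$(printf '%s\n' bloom book cow moon snow scooter)

# Count matching lines by piping them to wc.
piped=$(echo "$words" | grep 'oo' | wc -l)
echo "$piped"

# Let grep count the matching lines itself.
counted=$(echo "$words" | grep -c 'oo')
echo "$counted"
```

Both approaches report the same number; the pipe version is worth knowing because it works with any command, not just "grep."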

#### How many of those words additionally contain the "ow" sequence?

In [None]:
'''
Type the commands below, and run the cell
!grep 'oo' /usr/share/dict/words | grep 'ow' | wc -l
'''

#### How many _do not_ contain the "ow" sequence?

In [None]:
'''
Type the commands below, and run the cell
!grep 'oo' /usr/share/dict/words | grep -v 'ow' | wc -l
'''

In [None]:
'''
Do those numbers add up?
Type the commands below, and run the cell
!bc <<< 158+10302
'''
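The reason the numbers should add up is that every "oo" word either contains "ow" or it doesn't, so the two filtered counts split the "oo" list into two groups with no overlap.  Here is a sketch of the same check on a tiny made-up word list, using "grep -c" to count matching lines and shell arithmetic in place of "bc":

```shell
# Hypothetical word list: five contain "oo", and two of those also contain "ow".
words=$(printf '%s\n' bloom book cow moonglow snow foothold woodrow)

total=$(echo "$words" | grep -c 'oo')                    # all "oo" words
with_ow=$(echo "$words" | grep 'oo' | grep -c 'ow')      # "oo" and "ow"
without_ow=$(echo "$words" | grep 'oo' | grep -vc 'ow')  # "oo" but not "ow"

echo "with: $with_ow  without: $without_ow  total: $total"

# $(( )) does integer arithmetic in the shell itself.
echo "sum: $((with_ow + without_ow))"
```

Note that "cow" and "snow" are never counted, because the first "grep" already removed every word without "oo."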

In [None]:
# The End
!cp ~/be487-fall-2024/exercises/02_intro_unix/ex02-04_connecting_commands.ipynb  /xdisk/bhurwitz/bh_class/$netid/exercises/02_intro_unix/ex02-04_connecting_commands.ipynb