# Command-Line Programs

---

### Objectives

- Use the values of command-line arguments in a program
- Handle flags and files separately in a command-line program
- Read data from standard input in a program so that it can be used in a pipeline

---

#### Background

- Jupyter Notebook and interactive tools are great for prototyping
- For processing large datasets, we need command-line programs
- Goal: Create programs that work like Unix command-line tools
- Example: Program to read a dataset and print average inflammation per patient

---

#### Switching to Shell Commands

- We'll use a shell terminal window (e.g., bash) instead of the Python interpreter
- `$` is the shell prompt and marks a shell command; do not type it

Examples:

**Example 1**: Print the average inflammation per patient for a given `inflammation-01.csv` file


```bash
$ python ../code/readings_04.py --mean ../data/inflammation-01.csv
```

**Example 2**: Print the minimum of the first four lines

```bash
$ head -4 ../data/inflammation-01.csv | python ../code/readings_06.py --min
```

**Example 3**: Print the maximum inflammation per patient for several files, one after another

```bash
$ python ../code/readings_04.py --max ../data/inflammation-*.csv
```

#### Script Requirements

1. Read from standard input if no filename is given
2. Read data from files if filenames are provided
3. Report statistics for each file separately
4. Use flags (`--min`, `--mean`, `--max`) to determine which statistic to print

---

#### Command-Line Arguments

- Use the `sys` module to handle command-line arguments
- `sys.argv` is a list of strings holding the command-line arguments; its first element is the script name

Example: Using the text editor of your choice, save the following Python code in a text file named `argv_list.py` in the `code` directory.

```python
import sys
print('sys.argv is', sys.argv)
```


Use the bash shell to run the code.
```bash
$ python ../code/argv_list.py
```
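
To see what the list holds when extra arguments are supplied, the following minimal sketch simulates an argument list rather than reading the real `sys.argv`. The helper name `split_argv` and the sample arguments are assumptions made for illustration; they are not part of the lesson's scripts.

```python
def split_argv(argv):
    """Split an argv-style list into (script_name, other_arguments)."""
    return argv[0], argv[1:]


# Simulates what sys.argv would hold for:
#   python ../code/argv_list.py first second third
example_argv = ['../code/argv_list.py', 'first', 'second', 'third']

script, arguments = split_argv(example_argv)
print('script name:', script)
print('other arguments:', arguments)
```

Every element is a string, so numeric arguments have to be converted explicitly (for example with `int` or `float`) before doing arithmetic on them.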

---

**Example**: Consider the script `../code/readings_02.py`, which calculates the mean of each row in a data file.

```bash
$ cat ../code/readings_02.py
```

```python
import sys
import numpy

def main():
    script = sys.argv[0]  # always the name of our script
    filename = sys.argv[1]
    data = numpy.loadtxt(filename, delimiter=',')
    for row_mean in numpy.mean(data, axis=1):
        print(row_mean)


if __name__ == '__main__':
    main()
```

**Data file** `../data/small-01.csv`

```bash
$ cat ../data/small-01.csv
```

```csv
0,0,1
0,1,2
```

**Run the script**

```bash
$ python ../code/readings_02.py ../data/small-01.csv
```
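
As a rough check on what the script computes, `axis=1` asks NumPy for one mean per row (averaging across the columns). The sketch below reproduces that calculation in plain Python for the two rows of `small-01.csv`; the function name `row_means` is hypothetical, and the data is typed in by hand rather than read from the file.

```python
def row_means(rows):
    """Return the arithmetic mean of each row in a list of rows."""
    return [sum(row) / len(row) for row in rows]


# The two rows of small-01.csv, entered by hand for illustration.
small = [[0, 0, 1],
         [0, 1, 2]]

for mean in row_means(small):
    print(mean)
```

The first row averages to one third and the second to 1.0, so the script should print two values, one per row.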

---

#### Handling Multiple Files

- Use a loop to process each file separately
- Be careful with `sys.argv[0]` (script name)
- Use `sys.argv[1:]` to get only the filenames

---

**Example**: Consider the script `../code/readings_03.py`, which calculates the mean of each row for multiple data files.

```bash
$ cat ../code/readings_03.py
```


```python
import sys
import numpy

def main():
    script = sys.argv[0]
    for filename in sys.argv[1:]:
        data = numpy.loadtxt(filename, delimiter=',')
        for row_mean in numpy.mean(data, axis=1):
            print(row_mean)

if __name__ == '__main__':
    main()
```


**Run in bash**

```bash
$ python ../code/readings_03.py ../data/small-01.csv ../data/small-02.csv
```

---

#### Handling Command-Line Flags

- Check for flags before processing files
- Use `assert` to validate flag input
- Separate file processing into a function

---

**Example**: Let us inspect the content of `../code/readings_05.py`

As usual, the first argument is the script name. The second argument is a flag, one of `--min`, `--mean`, or `--max`. The third and any later arguments are the filenames to process.

```bash
$ cat ../code/readings_05.py
```


```python
import sys
import numpy

def main():
    script = sys.argv[0]
    action = sys.argv[1]
    filenames = sys.argv[2:]
    assert action in ['--min', '--mean', '--max'], \
           'Action is not one of --min, --mean, or --max: ' + action
    for filename in filenames:
        process(filename, action)

def process(filename, action):
    data = numpy.loadtxt(filename, delimiter=',')

    if action == '--min':
        values = numpy.amin(data, axis=1)
    elif action == '--mean':
        values = numpy.mean(data, axis=1)
    elif action == '--max':
        values = numpy.amax(data, axis=1)

    for val in values:
        print(val)

if __name__ == '__main__':
    main()
    
```

**Run the script**

```bash
$ python ../code/readings_05.py --max ../data/small-01.csv
```
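
One caveat about `assert`: Python skips assertions when it is started with the `-O` option, so they are better suited to catching programming mistakes than to checking user input. A minimal alternative sketch, assuming the hypothetical names `VALID_ACTIONS` and `parse_arguments` (neither appears in `readings_05.py`), raises an ordinary exception instead:

```python
VALID_ACTIONS = ('--min', '--mean', '--max')


def parse_arguments(argv):
    """Return (action, filenames) from an argv-style list.

    Raises ValueError if the flag is missing or not recognized.
    """
    if len(argv) < 2:
        raise ValueError('expected one of ' + ', '.join(VALID_ACTIONS))
    action = argv[1]
    if action not in VALID_ACTIONS:
        raise ValueError('Action is not one of --min, --mean, or --max: ' + action)
    return action, argv[2:]


# Simulated command lines; a real script would pass sys.argv instead.
print(parse_arguments(['readings.py', '--max', 'a.csv', 'b.csv']))
try:
    parse_arguments(['readings.py', '--median', 'a.csv'])
except ValueError as error:
    print('error:', error)
```

In a real script, `main` could catch the `ValueError`, print the message, and call `sys.exit(1)` so that the shell sees a non-zero exit status.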

---

#### Handling Standard Input

- Read from `sys.stdin` if no filenames are provided
- Use `<` to redirect file content to standard input

---

**Example 1**:
Inspect the content of `../code/count_stdin.py`

```bash
$ cat ../code/count_stdin.py
```

```python
import sys

count = 0
for line in sys.stdin:
    count += 1

print(count, 'lines in standard input')
```

```bash
$ python ../code/count_stdin.py < ../data/small-01.csv
```
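
`sys.stdin` behaves like an open file, which is why the `for` loop in `count_stdin.py` can iterate over it line by line. The sketch below makes that explicit by moving the loop into a function (the name `count_lines` is an assumption) and feeding it an in-memory stand-in for standard input built with `io.StringIO`:

```python
import io


def count_lines(stream):
    """Count the lines in any file-like object, such as sys.stdin."""
    count = 0
    for _ in stream:
        count += 1
    return count


# io.StringIO stands in for standard input so the example runs on its own.
fake_stdin = io.StringIO('0,0,1\n0,1,2\n')
print(count_lines(fake_stdin), 'lines in standard input')
```

In a script, calling `count_lines(sys.stdin)` would count whatever is piped or redirected into the program.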

**Example 2**:

Inspect the file `../code/readings_06.py`

```bash
$ cat ../code/readings_06.py
```

```python
import sys
import numpy

def main():
    script = sys.argv[0]
    action = sys.argv[1]
    filenames = sys.argv[2:]
    assert action in ['--min', '--mean', '--max'], \
           'Action is not one of --min, --mean, or --max: ' + action
    if len(filenames) == 0:
        process(sys.stdin, action)
    else:
        for filename in filenames:
            process(filename, action)

def process(filename, action):
    data = numpy.loadtxt(filename, delimiter=',')

    if action == '--min':
        values = numpy.amin(data, axis=1)
    elif action == '--mean':
        values = numpy.mean(data, axis=1)
    elif action == '--max':
        values = numpy.amax(data, axis=1)

    for val in values:
        print(val)

if __name__ == '__main__':
    main()
```


**Run the script**

```bash
$ python ../code/readings_06.py --mean < ../data/small-01.csv
```
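
Passing `sys.stdin` to `process` works because `numpy.loadtxt` accepts an open file-like object as well as a filename. The sketch below illustrates this with `io.StringIO` standing in for standard input; the sample text matches `small-01.csv`, and the function name `row_means_from` is hypothetical.

```python
import io

import numpy


def row_means_from(source):
    """Load comma-separated data from a filename or file-like object
    and return the mean of each row."""
    data = numpy.loadtxt(source, delimiter=',')
    return numpy.mean(data, axis=1)


# io.StringIO plays the role of sys.stdin in this self-contained example.
fake_stdin = io.StringIO('0,0,1\n0,1,2\n')
for value in row_means_from(fake_stdin):
    print(value)
```

Because the same function handles both kinds of input, the script needs only one code path for loading data, whether it comes from a named file or from a pipeline.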

---

#### Final Program Structure

1. Parse command-line arguments
2. Validate input
3. Process files or standard input
4. Apply requested operation (min, mean, max)
5. Print results
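
The five steps above can be sketched as a skeleton. This is an illustrative outline rather than the lesson's `readings_06.py`: it uses the standard-library `statistics` module in place of NumPy so that it runs without extra packages, and the names `ACTIONS`, `read_rows`, `summarize`, and `run` are assumptions.

```python
import io
import statistics
import sys

# Map each flag to the function that computes the statistic for one row.
ACTIONS = {'--min': min, '--mean': statistics.mean, '--max': max}


def read_rows(stream):
    """Parse comma-separated numbers from a file-like object into rows."""
    return [[float(field) for field in line.split(',')]
            for line in stream if line.strip()]


def summarize(rows, action):
    """Apply the requested statistic to every row."""
    return [ACTIONS[action](row) for row in rows]


def run(argv, stdin=sys.stdin):
    """Carry out the steps for an argv-style list and return the results."""
    # 1. Parse command-line arguments.
    action, filenames = argv[1], argv[2:]
    # 2. Validate input.
    if action not in ACTIONS:
        raise ValueError('Action is not one of --min, --mean, or --max: ' + action)
    results = []
    if not filenames:
        # 3. No filenames were given, so read from standard input.
        rows = read_rows(stdin)
        # 4. Apply the requested operation.
        results.extend(summarize(rows, action))
    for filename in filenames:
        # 3 and 4 again, once per named file.
        with open(filename) as handle:
            results.extend(summarize(read_rows(handle), action))
    return results


# 5. Print results (simulated input stands in for a real pipeline).
for value in run(['readings.py', '--max'], stdin=io.StringIO('0,0,1\n0,1,2\n')):
    print(value)
```

Mapping flags to functions in a dictionary replaces the `if`/`elif` chain; supporting another statistic would then only need one more dictionary entry.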

---

### Exercises

---

#### Exercise 1: Arithmetic on the Command Line

Write a Python program that performs arithmetic operations:

```bash
$ python arith.py --add 1 2
$ python arith.py --subtract 3 4
```

- Implement addition, subtraction, multiplication, and division
- Parse command-line arguments for operation and numbers

---

#### Exercise 2: Finding Particular Files

Use the `glob` module to create a simple version of `ls`:

```bash
$ python my_ls.py py
left.py
right.py
zero.py
```

- List files in the current directory with a specific suffix
- Use a command-line argument to specify the suffix

#### Exercise 3: Changing Flags

Duplicate the file `../code/readings_06.py` into a new file `readings.py`. Modify `readings.py` to use shorter flags:

- Replace `--min` with `-n`
- Replace `--mean` with `-m`
- Replace `--max` with `-x`

Consider:
- Is the code easier to read?
- Is the program easier to understand?

#### Exercise 4: Adding a Help Message

Modify `readings-07.py` to display usage information:

- Print a help message if no parameters are given
- Explain how to use the program
- Include examples of valid commands

#### Exercise 5: Adding a Default Action

Enhance `readings.py` with a default behavior:

- If no action is specified, display the means of the data
- Keep existing functionality for when actions are specified

#### Exercise 6: A File-Checker

Create a program called `check.py`:

- Take names of inflammation data files as arguments
- Check that all files have the same number of rows and columns
- Design and implement tests for your program

#### Exercise 7: Counting Lines

Write a program called `line_count.py` similar to Unix `wc`:

- If no filenames are given, report the number of lines in standard input
- If filenames are given, report the number of lines in each file and the total
- Handle both file input and standard input

#### Exercise 8: Generate an Error Message

Create `check_arguments.py`:

- Print usage and exit if no arguments are provided
- Use `sys.exit()` to terminate the program

Example output:
```bash
$ python check_arguments.py
usage: python check_arguments.py filename.txt

$ python check_arguments.py filename.txt