

# Computer Programming

## Programs 8: Using the Command Line

Finally, we will create some "standalone" programs that will run from the command-line. These will mainly just be variations on some of the programs from last week, with the difference that the file name will come from the command-line rather than being hard-coded into the program.

*Hint: There are examples of using the command-line in the program library that came with your module repo. As with last week, that would be an excellent place to start.*
*As usual, once finished, make sure this Notebook ends up in your GitHub repo.*

## The Command Line

Windows (and the Apple OS that went before it) has a lot to answer for.

Before starting these programs, make sure you can access the command-line on your chosen OS. On Linux, you already have it (it will be on a menu). For Mac, you are looking for `Terminal`. For Windows, you are best off seeking `PowerShell`, or maybe just switching to a proper Operating System.

Make sure that issuing the command to start Python works at your command-line. On Linux or Mac this will be `python3`. Windows is, as usual, more complicated: it will probably be `py`; if not, try `python` or possibly `python3`.
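
One quick way to confirm which command works is to save a tiny file and run it from the command-line. The sketch below assumes you call the file `check.py` (a made-up name for illustration); it only reports which interpreter ran it.

```python
import sys

# Report which Python interpreter is running this file, and its version.
print("Interpreter:", sys.executable)
print("Version:", sys.version.split()[0])
```

If the version printed starts with `3`, the command you used is the one to use for the rest of these programs.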

**All this means that you will need to create the programs below in separate files so that you can run them from the command-line. It's probably easiest to create a new folder below your Notebooks for this, and then to copy-and-paste the final programs into this Notebook. Remember they will NOT run correctly in the Notebook.**

*Note: Another possibility for Windows users is to look into the "Windows Subsystem for Linux", which should provide a proper Operating System (Linux) running on a Windows machine.*

---

## Practice

Answer the following before trying the programs below.

*Suppose we have a program called `useful.py` that processes a data file. The name of the data file is provided on the command-line. What is the command to run this program with a file called `interesting.dat`?*
**`python3 useful.py interesting.dat`** (or **`py useful.py interesting.dat`** on Windows)

*What module must be imported in order to use the command-line?*
**`import sys`**

*What is the name of the variable that is populated with the items from the command line?*
**`sys.argv`**

*What is found in the first element of the variable containing the command-line?*
**The name of the program itself (e.g., `"useful.py"`).**

*Suppose a program is run as so:*

```
$ python3 my_program.py cheese banana haloumi
```

*What are the command-line arguments? What variable would each be found in?*
**They are:**

* `"cheese"` → `sys.argv[1]`
* `"banana"` → `sys.argv[2]`
* `"haloumi"` → `sys.argv[3]`
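
The mapping above can be sketched with a stand-in list. The helper `describe_args` and the list `sample` are assumptions made so the example runs anywhere; in a real program you would import `sys` and pass `sys.argv` itself.

```python
def describe_args(argv):
    """Return one 'sys.argv[i] = value' string per item in argv."""
    return [f"sys.argv[{i}] = {value!r}" for i, value in enumerate(argv)]


# Stand-in for what sys.argv would hold for the command shown above.
sample = ["my_program.py", "cheese", "banana", "haloumi"]

for line in describe_args(sample):
    print(line)
```

Note that every item is a string, even if it looks like a number on the command-line.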

*If the first command-line argument is the name of a file that the program will process, what is the easiest way to check that this file exists and can be read?*
**Try opening it inside a `try/except` block** (as in previous weeks).
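
A minimal sketch of that check, wrapped in a hypothetical helper called `can_open` (the name and the test file name are assumptions for illustration). It catches `OSError`, which is the parent of the errors Python raises for a missing file, a directory, or a permissions problem.

```python
def can_open(filename):
    """Return True if filename can be opened for reading, else False."""
    try:
        with open(filename, "r"):
            return True
    except OSError:
        # Missing file, a directory, or no permission to read it.
        return False


# This file name is assumed not to exist in the current folder.
print(can_open("surely-no-such-file.dat"))
```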

*And what Exception would be thrown if the command-line argument in the above example was missing? When?*
**`IndexError`**, thrown **when accessing `sys.argv[1]` if it does not exist**.
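
To see when the exception happens, here is a small sketch using ordinary lists in place of `sys.argv`. The helper name `first_argument` is an assumption for illustration.

```python
def first_argument(argv):
    """Return argv[1], or None if no argument was supplied."""
    try:
        return argv[1]
    except IndexError:
        # Raised at the moment argv[1] is read, because argv has one item.
        return None


print(first_argument(["useful.py"]))                     # no argument given
print(first_argument(["useful.py", "interesting.dat"]))  # argument present
```

Checking `len(sys.argv)` first, as the programs below do, avoids the exception altogether; either approach handles the missing argument.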

---

## Programs

Now complete these. They should all be simple variations on the programs from last time, so a goodly amount of judicious cut-and-paste will be needed here!

The changes from last week's programs are **in bold**.

*Hint: The programs in your program library all run from the command-line. Pick one and use it as a template.*

---

*Write a program that simply tells you whether a file exists and can be opened. Take the name of a file as an argument on the command-line. Remember that you will also need to handle the case that the argument is not present.*

```python
import sys

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

try:
    with open(filename, "r") as f:
        print("File exists and can be opened.")
except OSError:  # missing file, a directory, or no permission
    print("File cannot be opened.")
```

---

*Now modify that code to display how many characters there are in the file, assuming it exists. Just display that it cannot be opened otherwise.*

```python
import sys

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

try:
    with open(filename, "r") as f:
        text = f.read()
        print("Characters:", len(text))
except (OSError, UnicodeDecodeError):  # cannot open, or not readable as text
    print("File cannot be opened.")
```

---

*Copy it below, and modify the code again so that it reports how many lines there are in the file. (This should be a very, very small change.)*

```python
import sys

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

try:
    with open(filename, "r") as f:
        lines = f.readlines()
        print("Lines:", len(lines))
except (OSError, UnicodeDecodeError):  # cannot open, or not readable as text
    print("File cannot be opened.")
```

---

*Make another file, and populate it with some integers, one on each line. Create a program to print the total of all the numbers in the file, with the name of the file provided on the command-line. Assume a Happy Path as regards the file content (that is, there is one number on each line, and it really is a number).*

```python
import sys

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

with open(filename, "r") as f:
    total = 0
    for line in f:
        total += int(line.strip())

print("Total:", total)
```

---

*Now use the `random` module to create a file containing 1000 random numbers between 0 and 100 inclusive.*

```python
import random

with open("randoms.txt", "w") as f:
    for _ in range(1000):
        f.write(str(random.randint(0, 100)) + "\n")
```

---

*If the numbers are really random, the average of the numbers in your file should be round about 50. Write a program below that reads that file, and tells you what the average is. (Remember that you will need the output file of the previous code block here.)*

```python
import sys

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

with open(filename, "r") as f:
    nums = [int(x) for x in f]
    avg = sum(nums) / len(nums)

print("Average:", avg)
```
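
As a sanity check on the "round about 50" figure: `random.randint(0, 100)` picks each of the 101 integers from 0 to 100 with equal probability, so the expected average is simply the mean of those values. A short sketch of the arithmetic:

```python
# All 101 possible values, each equally likely.
values = range(0, 101)

# Mean of the possible values: 5050 / 101.
expected = sum(values) / len(values)
print(expected)  # 50.0
```

With only 1000 samples, the average of your file will usually be close to this value but rarely exactly equal to it.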

---

*Write a program that is intended to process a file containing just numbers. Have it report if there is a line in the file that does not contain an integer. For bonus marks this time, print a message (just the once) if the file looks OK.*

```python
import sys

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

ok = True

with open(filename, "r") as f:
    for lineno, line in enumerate(f, start=1):
        try:
            int(line.strip())
        except ValueError:
            print("Non-integer on line", lineno)
            ok = False
            break

if ok:
    print("File looks OK.")
```

---

*Now take your random number file, and use it to create another file of random numbers, this time in sorted order. Take both file names as arguments on the command-line. Assume a Happy Path.*

```python
import sys

if len(sys.argv) < 3:
    print("Usage: python3 prog.py inputfile outputfile")
    sys.exit()

infile = sys.argv[1]
outfile = sys.argv[2]

with open(infile, "r") as f:
    nums = sorted(int(x) for x in f)

with open(outfile, "w") as f:
    for n in nums:
        f.write(str(n) + "\n")
```

---

*Modify your program so that it sorts the same file. That is, the output sorted file overwrites the original file. This **should** be a trivial change.*

```python
import sys

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

with open(filename, "r") as f:
    nums = sorted(int(x) for x in f)

with open(filename, "w") as f:
    for n in nums:
        f.write(str(n) + "\n")
```

---

## Challenge

*A trainspotter has a file of all the numbers of all the locomotives they have seen. It looks something like:*

```
50321
47362
78919
50321
```

*Only it is much longer.*

*The first two digits denote the "class" of the locomotive. Write a program that counts how many locomotives of each class our trainspotter has recorded, and displays the four most common. Obviously the same locomotive may have been spotted more than once (so will be in the file more than once) but should only be counted once.*

*Assume the file is in the format above.*

```python
import sys
from collections import Counter

if len(sys.argv) < 2:
    print("No filename supplied.")
    sys.exit()

filename = sys.argv[1]

with open(filename, "r") as f:
    ids = {line.strip() for line in f if line.strip()}   # remove duplicates, skip blank lines

classes = [loco[:2] for loco in ids]     # first two digits

counts = Counter(classes)

for cls, count in counts.most_common(4):
    print(cls, count)
```
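
To check the "counted once" rule by hand, the same idea can be tried on the four sample lines from the question. This is a sketch only; the helper name `class_counts` and the in-memory list `sample` are assumptions so that it runs without a data file.

```python
from collections import Counter


def class_counts(lines):
    """Count unique locomotive numbers per two-digit class."""
    # A set keeps each locomotive number once, however often it was seen.
    unique = {line.strip() for line in lines if line.strip()}
    return Counter(number[:2] for number in unique)


# The four lines shown in the question; 50321 appears twice.
sample = ["50321\n", "47362\n", "78919\n", "50321\n"]
counts = class_counts(sample)

print(counts["50"])  # 1, because the repeated 50321 is counted once
```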

---

## Reflection

That's it!

You should now have a collection of programs that run without the need for Notebooks. They have used all the programming concepts from the module, so you have examples of everything.

A final thing. Check your programs against the program library. Have you got the layout and so on spot on? Maybe ask a friend to check? Remember that code is read much more often than it's run, so presentation really does matter.
