# Parsing arguments

## Introduction

This notebook is about creating command line interface (CLI) programs. The topic is not covered by the book but there are good tutorials available online, such as **[Real Python's tutorial on argparse](https://realpython.com/command-line-interfaces-python-argparse/)**, which we recommend you read.

If you're not familiar with how command-line tools typically work, we highly recommend that you first understand them from a user's point of view. The summary below explains the most important conventions you need to know.

### Optional resources

For more in-depth coverage of `argparse`:

- Python documentation: [Argparse Tutorial](https://docs.python.org/3/howto/argparse.html)
- Python documentation: [`argparse` — Parser for command-line options, arguments and sub-commands](https://docs.python.org/3/library/argparse.html)
- PyMOTW ("Python module of the week"): [argparse — Command-Line Option and Argument Parsing](https://pymotw.com/3/argparse/index.html)

There are also some great third-party libraries which make it easier to parse command-line arguments with a more [declarative](https://en.wikipedia.org/wiki/Declarative_programming) approach. The most common ones are [Click](https://click.palletsprojects.com) and [Typer](https://typer.tiangolo.com/).

## Summary

### Command-line interfaces

You can run command-line tools right from the Jupyter notebook by prefixing them with a `!` sign. Try this by running the cell below, which shows you the files in the current directory:

In [None]:
!ls

Such command-line tools can have arguments. For example, the `ls` tool accepts a path to show, instead of the current working directory:

In [None]:
!ls /home/jovyan/work/AutPy

Arguments are separated by spaces. If you pass an argument containing spaces, you need to *quote* it to prevent it from being split up:

In [None]:
!ls "/home/jovyan/work/Originale (nicht schreibbar)"
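
To see the effect of quoting without leaving Python, the standard library's `shlex.split` can be used as a rough stand-in for how a shell splits a command line into arguments. This is only a sketch: `shlex` approximates shell behavior, and the folder name `my folder` is a made-up example.

```python
import shlex

# Without quotes, the space splits the path into two separate arguments.
unquoted = shlex.split('ls my folder')
print(unquoted)  # ['ls', 'my', 'folder']

# With quotes, the path stays together as one argument.
quoted = shlex.split('ls "my folder"')
print(quoted)  # ['ls', 'my folder']
```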

Commands typically also accept *flags*, which consist of a single letter preceded by a hyphen/minus (`-`). The `-l` flag tells `ls` to produce **l**ong output:

In [None]:
!ls -l

Multiple flags can be given. If we add `-a`, `ls` will show us **a**ll entries, including hidden ones whose names start with a dot, such as `.` (current directory) and `..` (parent directory):

In [None]:
!ls -l -a

Those single-letter flags can be combined into one, so this is equivalent:

In [None]:
!ls -la

Often, a long variant of those flags exists as well. Long flags begin with a double minus (`--`). While `-l` has no long equivalent, `-a` does (`--all`). Thus, this is again equivalent to the command above:

In [None]:
!ls -l --all

Flags might have a value associated with them. This can be passed as `-x VALUE`, `--option VALUE`, or `--option=VALUE`. For example, with `ls`, we can use `--sort=...` to specify how to sort the files:

In [None]:
# show the newest blocks first instead of sorted by name
!ls --sort=time /home/jovyan/work/AutPy

These tools usually document their usage with a terse "syntax" specification, with
square brackets (`[...]`) denoting optional arguments, pipes (`|`) denoting
alternatives, `CAPS` denoting variables, and `...` denoting repetition. Such a
summary is typically shown when you run a tool with `--help`.

The subset you've learned about `ls` above could be specified as:

```bash
ls [-l] [-a|--all] [--sort=ORDER] [FILE]...
```


## argparse

When you use a library such as `argparse` for argument parsing, all of the conventions above
(including `--help` output) will be taken care of. All you need to do is to declare
which flags/arguments exist, and what their behavior should be.

You use `argparse` by creating a "parser" object, adding arguments to it, and then telling it to parse the arguments given on the command line:

```python
import argparse
parser = argparse.ArgumentParser(description='Does something.')
# ... add arguments
arguments = parser.parse_args()
```

Add positional arguments with `add_argument`:

```python
parser.add_argument('input_file')
```

Add optional arguments by prefixing them with hyphens:

```python
parser.add_argument('-f', '--format', help='select output format')
```

To add optional boolean parameters, use `action='store_true'`.
This will result in getting `True` in Python if the flag was given, and `False` otherwise.

```python
parser.add_argument('-v', '--verbose', help='show more verbose output', action='store_true')
```
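
Putting the three `add_argument` calls together, the following sketch shows what the parsed result looks like. It passes a list of strings to `parse_args` instead of reading the real command line. The file name `data.txt` and the format `json` are made-up values for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description='Does something.')
parser.add_argument('input_file')
parser.add_argument('-f', '--format', help='select output format')
parser.add_argument('-v', '--verbose', help='show more verbose output', action='store_true')

# The list of strings stands in for what a user would type after the script name.
args = parser.parse_args(['data.txt', '--format=json', '-v'])
print(args.input_file, args.format, args.verbose)  # data.txt json True

# Flags that are left out fall back to their defaults (None and False here).
defaults = parser.parse_args(['data.txt'])
print(defaults.format, defaults.verbose)  # None False
```

Each argument becomes an attribute on the returned object, named after the positional argument or the long flag.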

The full list of available actions is [documented in the Python docs](https://docs.python.org/3/library/argparse.html#action).


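The `add_argument` calls above would result in a command-line application with help output like this (the exact wording varies between Python versions; for example, Python 3.10 and later title the last group `options:`):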

```
usage: your_script.py [-h] [-f FORMAT] [-v] input_file

Does something.

positional arguments:
  input_file

optional arguments:
  -h, --help            show this help message and exit
  -f FORMAT, --format FORMAT
                        select output format
  -v, --verbose         show more verbose output
```
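
To tie this back to the conventions from the summary, here is a minimal sketch of a parser that mirrors the `ls [-l] [-a|--all] [--sort=ORDER] [FILE]...` subset. It only parses arguments and does not list any files. The `prog='ls'` name, the help texts and the directory `some_dir` are assumptions made for this illustration.

```python
import argparse

# Hypothetical parser mirroring: ls [-l] [-a|--all] [--sort=ORDER] [FILE]...
parser = argparse.ArgumentParser(prog='ls')
parser.add_argument('-l', action='store_true', help='long output')
parser.add_argument('-a', '--all', action='store_true', help='include hidden entries')
parser.add_argument('--sort', metavar='ORDER', help='sort order, e.g. time')
parser.add_argument('files', metavar='FILE', nargs='*', help='paths to list')

# Combined short flags and both ways of passing a value give the same result.
combined = parser.parse_args(['-la', '--sort=time', 'some_dir'])
separate = parser.parse_args(['-l', '-a', '--sort', 'time', 'some_dir'])
print(combined == separate)  # True

# argparse generates the terse usage line on its own.
print(parser.format_usage())
```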

## The `__main__` block

When writing a script in Python, often there's some code you want to run when the file is executed as a command-line tool -- typically, some sort of `main` function.

However, importing a file (`import yourscript` in Python, rather than running `python yourscript.py` in a shell) **also** runs all the code in it! This is a problem if you want to, for example, try a function from your file interactively or write automated tests for your code. You **should always be able** to import your code without it magically starting to run.

Python automatically defines a special `__name__` variable (with two leading/trailing underscores). If your script is launched from the command-line instead of being imported, that variable is set to the special string `"__main__"`. You don't need to understand this in detail, but if you want to see more details about how this works, check [this Stack Overflow answer](https://stackoverflow.com/a/419185/2085149).
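
The difference can be observed with a small experiment. The sketch below writes a throwaway file (the name `show_name.py` is made up) and executes it twice with the standard library's `runpy` module: once under the name `"__main__"`, as when launched as a script, and once under its own module name, which approximates what an import does.

```python
import pathlib
import runpy
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    script = pathlib.Path(tmp) / 'show_name.py'  # hypothetical file name
    script.write_text('seen_name = __name__\n')

    # Similar to `python show_name.py`: __name__ is "__main__".
    as_script = runpy.run_path(str(script), run_name='__main__')
    # Similar to `import show_name`: __name__ is the module's own name.
    as_module = runpy.run_path(str(script), run_name='show_name')

print(as_script['seen_name'])  # __main__
print(as_module['seen_name'])  # show_name
```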

Thus, it's considered good practice for scripts to implement a `main` function, and call that at the very bottom in an `if __name__ == "__main__":` block, like so:

```python
import argparse

def run(name):
    print(f"Hello, {name}!")

def main():
    parser = argparse.ArgumentParser(description='Say hello')
    parser.add_argument('name', help='Your name')
    args = parser.parse_args()
    run(args.name)

if __name__ == '__main__':
    main()
```
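
One benefit of this structure is that the pieces can be called directly, for example from a test or another notebook cell. The following variation is a sketch, not part of the original script: `run` returns the greeting instead of printing it, and `main` accepts an optional `argv` list that is handed to `parse_args`.

```python
import argparse

def run(name):
    # Returning the text makes the function easy to check in a test.
    return f"Hello, {name}!"

def main(argv=None):
    parser = argparse.ArgumentParser(description='Say hello')
    parser.add_argument('name', help='Your name')
    # With argv=None, argparse reads the real command line instead.
    args = parser.parse_args(argv)
    print(run(args.name))

# Called directly with a list of strings, no shell involved.
main(['World'])  # prints: Hello, World!
```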


## Exercises

### Exercise 1: Creating a Parser
Create a parser with the following properties:
* Takes two positional arguments `first_name` and `last_name`.
* Optional argument `--title` that defaults to "Lady" if not provided.
* Optional argument `--underage`, a boolean flag, sets value `True` if set.
* **Note:** When trying your code with an invalid argument, you will see "an exception has occurred" and a `UserWarning`, because `argparse` is trying to exit on errors (which in this case tries to "exit" the notebook). You can ignore the exception and warning.

In [1]:
import argparse

def get_parser():
    parser = argparse.ArgumentParser()
    # todo: add arguments
    parser.add_argument('first_name')
    parser.add_argument('last_name')
    parser.add_argument('--title', default='Lady', required=False)
    parser.add_argument('--underage', action='store_true', required=False)
    return parser

Use this separate cell to try out your code.
Your code should work with the example below, but you're free to change it.

In [2]:
parser = get_parser()

# We can pass a list of strings to parse_args instead of launching this as a
# command-line tool.
args = parser.parse_args(["Margaret", "Thatcher"])

print(f"{args.title} {args.first_name} {args.last_name} (underage: {args.underage})")

Lady Margaret Thatcher (underage: False)
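
If you would rather observe the exit attempt mentioned in the note than ignore it, one option is to catch the `SystemExit` exception that `argparse` raises on invalid input. This is a minimal sketch with a made-up parser (`prog='demo'`), separate from the exercise solution.

```python
import argparse

parser = argparse.ArgumentParser(prog='demo')  # hypothetical minimal parser
parser.add_argument('first_name')

exit_code = None
try:
    parser.parse_args([])  # the required positional argument is missing
except SystemExit as exc:
    # argparse has already printed usage and an error message to stderr.
    exit_code = exc.code

print(f"argparse tried to exit with status {exit_code}")  # status 2
```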


### Exercise 2: Drink-Generator as CLI-Application

Remember the drink generator you wrote in one of the earlier labs? Write a new version of it, but this time as a command-line application! Requirements:

* `-l` / `--list` lists all possible drinks that are available, no arguments needed.
* `-d` / `--drink` followed by a drink name as argument. Prints all the required ingredients for a drink.
* `-i` / `--ingredients` followed by **at least one** ingredient prints the possible drink based on the argument. Every ingredient gets passed as a separate argument.
    - **Note:** The older drink generator exercise allowed only three ingredients, and suggested a matching drink even with some ingredients missing. This time, fewer or more ingredients can be specified, and all ingredients needed for a drink need to be available to make it.
    - **Remember:** Arguments containing spaces will need to be quoted to avoid the shell splitting them up, e.g. `-i ice "gin tonic" water`. Your code will then get a list of strings: `["ice", "gin tonic", "water"]`.
    - **Hint:** `nargs`
* If a wrong number of ingredients is given, an error is shown.
* The three "modes" (list/drink/ingredients) are mutually exclusive: if you e.g. enter `-l`, it is not allowed to also enter `-d` or `-i`, and the same holds for any other combination.
    - However, one of those always needs to be given.
* For most of the requirements above, you shouldn't check for them manually; `argparse` provides ways to do this for you. In the examples below, this is specified as "(output by argparse)".
* **Important:** The cell writes a file and it's very tempting to directly edit it. Please be aware that what is used for submission (and then for grading) is the _content of the cell, not the file that is written_.
* To get you started, the core code of the drink generator is provided. Other than the parts with `# todo:` comments, you shouldn't change anything.
* If you want to print additional output (e.g. to look at the values in `args`), use the provided `print_debug` function instead of `print`, in order to not affect the tests.
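
As a reminder of the two `argparse` features hinted at above, here is a small sketch on an unrelated toy example. The flags `--names`, `--upper` and `--lower` are made up for illustration and are not part of the exercise.

```python
import argparse

parser = argparse.ArgumentParser(prog='demo')
# nargs='+' collects one or more values into a list.
parser.add_argument('--names', nargs='+')

# Exactly one flag of a required mutually exclusive group must be given.
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument('--upper', action='store_true')
group.add_argument('--lower', action='store_true')

args = parser.parse_args(['--upper', '--names', 'ada', 'grace hopper'])
print(args.names)  # ['ada', 'grace hopper']

conflict_status = None
try:
    parser.parse_args(['--upper', '--lower'])  # conflicting flags
except SystemExit as exc:
    conflict_status = exc.code  # argparse reports the conflict and exits
print(conflict_status)  # 2
```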

Expected output -- **make sure you use the same messages, including trailing periods (`.`)!**:

```bash
# Getting available drinks
$ python drink_generator_cli.py -l
These drinks are available:
* caipirinha
* gin tonic
* mojito
* vodka martini

# Getting ingredients for a drink
$ python drink_generator_cli.py -d "gin tonic"
gin, ice, tonic water

# Specifying a drink which doesn't exist
$ python drink_generator_cli.py -d nodrink
nodrink does not exist.

# Getting a drink based on ingredients
$ python drink_generator_cli.py -i ice "tonic water" gin
gin tonic

# If no drink can be made with the specified ingredients
$ python drink_generator_cli.py -i ice not gin
No drink found.

# No ingredients given (output by argparse)
$ python drink_generator_cli.py -i
usage: drink_generator_cli.py ...
drink_generator_cli.py: error: argument -i/--ingredients: expected at least one argument

# If script is run without any flags (output by argparse)
$ python drink_generator_cli.py
usage: drink_generator_cli.py ...
drink_generator_cli.py: error: one of the arguments -l/--list -d/--drink -i/--ingredients is required

# Mutually exclusive arguments (output by argparse)
$ python drink_generator_cli.py -l -i "ice, tonic water, gin"
usage: drink_generator_cli.py ...
drink_generator_cli.py: error: argument -i/--ingredients: not allowed with argument -l/--list
```

In [51]:
%%writefile drink_generator_cli.py
import argparse
import sys


DRINKS = {
    "caipirinha": {"cachaca", "sugar", "lime"},
    "mojito": {"white rum", "sugar cane juice", "lime juice", "soda water", "mint"},
    "gin tonic": {"gin", "tonic water", "ice"},
    "vodka martini": {"vodka", "vermouth", "ice", "olives"},
}


def print_debug(*args):
    # Unpack the values so they print like a normal print() call, but to stderr.
    print(*args, file=sys.stderr)


def find_drink(ingredients):
    for drink, drink_ingredients in DRINKS.items():
        if set(ingredients) == drink_ingredients:
            return drink
    return None


def run_list():
    print("These drinks are available:")
    for drink in sorted(DRINKS):
        print(f"* {drink}")


def run_find_by_name(name):
    drink = DRINKS.get(name, None)
    if drink:
        print(', '.join(sorted(drink)))
    else:
        print(f"{name} does not exist.")


def run_find_by_ingredients(ingredients):
    drink = find_drink(ingredients)
    if drink:
        print(drink)
    else:
        print("No drink found.")


def run(args):
    # todo:
    # - Check which of the flags have been given
    # - Do additional manual checking, where needed
    # - Call correct run_... function
    # The parser guarantees that exactly one of the three modes is set.
    if args.list:
        run_list()
    elif args.ingredients is not None:
        run_find_by_ingredients(args.ingredients)
    elif args.drink is not None:
        run_find_by_name(args.drink)


def get_parser():
    parser = argparse.ArgumentParser()
    # todo: Configure parser correctly
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('-l', '--list', action='store_true')
    group.add_argument('-i', '--ingredients', nargs='+')
    group.add_argument('-d', '--drink')
    return parser


def main(args=None):
    parsed = get_parser().parse_args(args)
    run(parsed)


# todo: call main() with no arguments, but only when run as a script
if __name__ == "__main__":
    main()

Overwriting drink_generator_cli.py


Then we use `!python` as an external subprocess to run your CLI application:

In [49]:
!python drink_generator_cli.py

usage: drink_generator_cli.py [-h]
                              (-l | -i INGREDIENTS [INGREDIENTS ...] | -d DRINK)
drink_generator_cli.py: error: one of the arguments -l/--list -i/--ingredients -d/--drink is required
