# Command line inputs with argparse

Relying on positional command line arguments read from `sys.argv` is fine for quick scripts. However, if you are writing something that you will use often, share with others, or that should support different modes of use, raw positional arguments are much harder to work with than a proper command line interface. argparse is a Python standard library module for processing command line inputs. It can accept inputs of different types, optional or required inputs, and mutually exclusive inputs, and it automatically generates a help message for your program.

As a Jupyter notebook is not a command line application, I will not be able to demonstrate the command line components of argparse here. Instead, this document will illustrate a few of the basic options argparse provides to define arguments. You are then encouraged to try them out in your own scripts to see how they work in practice. In addition, you should look over [the argparse documentation](https://docs.python.org/3/library/argparse.html) to see what sorts of things you can do using this module.

## Setting up argparse

The main component of the argparse module is the `ArgumentParser` class. That class provides all the methods to define the inputs of your program. The first step in setting up command line arguments is to create an instance of that class. You can then add each desired argument to that instance.

Let's create an instance of the `argparse.ArgumentParser` class and print it to see what it looks like.

In [1]:
import argparse

p = argparse.ArgumentParser()

print(p)

ArgumentParser(prog='ipykernel_launcher.py', usage=None, description=None, formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True)


As you can see, the printed representation of the instance includes a few of its attributes. Of note, it stored the name of the program in which it was created in the "prog" attribute, and it has an "add_help" attribute that is set to `True`. The "add_help" setting means that this parser will print a help message if we call our program with "-h" or "--help". We can also view the help message using the following command.

In [2]:
p.print_help()

usage: ipykernel_launcher.py [-h]

options:
  -h, --help  show this help message and exit


That usage statement will be more sensible if you try this in your own script rather than a Jupyter notebook. Otherwise, you can see that without any effort, we have an object that can produce a simple help message. Let's add to it and see how it changes. First, let's recreate the instance and add a description of our overall program.

In [3]:
p = argparse.ArgumentParser(description="This is a test program that doesn't do anything yet")

p.print_help()

usage: ipykernel_launcher.py [-h]

This is a test program that doesn't do anything yet

options:
  -h, --help  show this help message and exit


Adding a description upon instantiation results in a help message that includes a description of our program. We'll leave the other settings alone for this demonstration. Instead, let's move on to adding some arguments.
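As a brief aside on one of those other settings: the "prog" value shown earlier can be set explicitly, which is handy in a notebook where the detected name is not meaningful. The following is a minimal sketch, assuming a hypothetical program name `my_script.py`. It also uses `format_help()`, which returns the help text as a string rather than printing it.

```python
import argparse

# "my_script.py" is a hypothetical program name chosen for illustration
p = argparse.ArgumentParser(
    prog="my_script.py",
    description="This is a test program that doesn't do anything yet",
)

# format_help() returns the same text that print_help() would print
help_text = p.format_help()
print(help_text)
```

In a real script you would normally leave `prog` alone, as argparse derives it from the name of the script being run.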

## Adding an argument to the parser

The most basic argument you can add stores a positional argument. You can name the argument so that when the arguments are processed, you can refer to the one you want.

In [4]:
p.add_argument("foo")

p.print_help()

usage: ipykernel_launcher.py [-h] foo

This is a test program that doesn't do anything yet

positional arguments:
  foo

options:
  -h, --help  show this help message and exit


As you can see, there is now a positional argument described in both the usage and in the help message below. However, the help message isn't very useful as it doesn't say what "foo" is. We can add a help message for our arguments as well.

Let's start again with a fresh instance of the `ArgumentParser` class and add another argument. We need to start with a new instance, as otherwise we would have two positional arguments named "foo".

In [5]:
p = argparse.ArgumentParser(description="This is a test program that doesn't do anything yet") # Replace p with a new instance

p.add_argument("foo", help="Thing you want the script to analyze")

p.print_help()

usage: ipykernel_launcher.py [-h] foo

This is a test program that doesn't do anything yet

positional arguments:
  foo         Thing you want the script to analyze

options:
  -h, --help  show this help message and exit


Now that we have an argument set up so that users of our script know what inputs the script takes, how do we go about using the inputs they provide within the script? All we need to do is tell our `ArgumentParser` instance to parse the arguments. This part is going to look different in the Jupyter notebook than how it will look in a command line application so I strongly recommend you try it out yourself.

## Parsing command line arguments

To parse command line arguments, you simply use the `ArgumentParser.parse_args()` method. The default behaviour of that method is to get its inputs from `sys.argv[1:]`. However, in this notebook, we don't have access to any useful `sys.argv` inputs, so I'm going to give it input as a `list` of `str`s (essentially what `sys.argv` is). This is the part that will look different in your script: within a script you can just call `parse_args()` without input, while here we have to pass arguments to that method. In either case, `parse_args()` will do what the name suggests and parse the args. It will then return an object to us that includes each argument and its associated input.

In [6]:
args = p.parse_args(["input"])

print(args)

Namespace(foo='input')


As we had only set up our `ArgumentParser` instance with one positional argument named "foo", that's what our single input was assigned to. We can now work with our "foo" input as an attribute of our new `args` object.

In [7]:
print(args.foo)

input


That's all there is to it. If you simply wanted named positional arguments and a help message, you could just add more arguments to the `ArgumentParser` instance and it would generate the help message and handle organizing the command line inputs into a namespace for you.
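To make that concrete, here is a minimal sketch of how these pieces might be arranged in a script. The function names `build_parser` and `main` and the argument names `source` and `destination` are hypothetical, chosen only for illustration. The `argv=None` default means `parse_args()` falls back to `sys.argv[1:]` when nothing is passed, so the same function works both in a terminal and in a notebook.

```python
import argparse


def build_parser():
    """Build a parser with two named positional arguments."""
    parser = argparse.ArgumentParser(description="Copy a file (illustration only)")
    parser.add_argument("source", help="File to read from")
    parser.add_argument("destination", help="File to write to")
    return parser


def main(argv=None):
    # When argv is None, parse_args() reads from sys.argv[1:]
    args = build_parser().parse_args(argv)
    print(f"Would copy {args.source} to {args.destination}")
    return args


# In a notebook we pass the inputs explicitly.
# In a script you would call main() with no arguments instead.
args = main(["in.txt", "out.txt"])
```

Positional inputs are assigned in the order the arguments were added, so "in.txt" ends up in `args.source` and "out.txt" in `args.destination`.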

However, we can also set up more complex command line interfaces using argparse. Let's take a look at some other argument settings we can use.

## Options, types, and required arguments

Positional arguments like the one we just used above aren't much different than just using `sys.argv` in that everything is determined by the order in which inputs are provided. You may prefer to use options to provide your inputs in any order while still being able to indicate what each is. You can specify options for your arguments in the `add_argument()` call you make to add an argument. You can specify short or long options or both.

In [8]:
%%capture 
# Ignore the above, it just suppresses the printed output from this code as we're not interested in seeing it

p = argparse.ArgumentParser(description="This is a test program that doesn't do anything yet")

p.add_argument(
    "-f", "--file", # both a short and a long option
    help="Input file to analyze"
)
p.add_argument(
    "-o", # just a short option
    help="Path to output file"
)


Now let's try providing those arguments and see what we end up with.

In [9]:
args = p.parse_args(["-f", "input_file.txt", "-o", "out_file.png"])

print(args)

Namespace(file='input_file.txt', o='out_file.png')


As the printed output above suggests, we can now refer to those inputs using the attributes "file" and "o". Note that when you define a long option, its name is used as the attribute name for that input. If you don't provide a long option, then the short option name is used.

In [10]:
print(args.file)
print(args.o)

input_file.txt
out_file.png
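One detail worth knowing about those attribute names: a long option containing hyphens cannot be used directly as a Python attribute, so argparse converts the hyphens to underscores. A minimal sketch, using a hypothetical `--output-file` option that is not part of the example above:

```python
import argparse

p = argparse.ArgumentParser()
# "--output-file" is a hypothetical option name used for illustration
p.add_argument("-o", "--output-file", help="Path to output file")

args = p.parse_args(["--output-file", "out_file.png"])

# The hyphen in the option becomes an underscore in the attribute name
print(args.output_file)
```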


We can provide those inputs in any order. argparse will check the list of inputs and associate each option with the string that follows it in the list. It only matters that you give the associated input straight after each option.

In [11]:
args = p.parse_args(["-o", "out_file.png", "-f", "input_file.txt"])

print(args)

Namespace(file='input_file.txt', o='out_file.png')
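Besides separating an option and its value with a space, argparse also accepts `--option=value` for long options and a value attached directly to a short option. The sketch below rebuilds the same two options (without help text, for brevity) and compares the resulting namespaces.

```python
import argparse

p = argparse.ArgumentParser()
p.add_argument("-f", "--file")
p.add_argument("-o")

# Option and value as separate strings, in either order
spaced = p.parse_args(["-o", "out_file.png", "-f", "input_file.txt"])

# Long option joined with "=", short option with the value attached
joined = p.parse_args(["--file=input_file.txt", "-oout_file.png"])

# Namespace objects compare equal when their attributes match
print(spaced == joined)
```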


What if we don't specify one of the inputs that are defined in our parser?

In [12]:
args = p.parse_args(["-f", "input_file.txt"])

print(args)

Namespace(file='input_file.txt', o=None)


The default setting is for each option to be optional. If you don't provide an input for it, the value is simply set to `None`. However, that might not be what you want. Perhaps you want to make certain inputs mandatory. You can do that with the `required` setting when adding an argument.

In [13]:
%%capture 
# Ignore the above, it just suppresses the printed output from this code as we're not interested in seeing it

p = argparse.ArgumentParser(description="This is a test program that doesn't do anything yet")

p.add_argument(
    "-f", "--file", # both a short and a long option
    help="Input file to analyze",
    required=True
)
p.add_argument(
    "-o", # just a short option
    help="Path to output file",
    required=False
)

Now we will always have to specify an input file, but the output file is optional. The output file was actually already optional as we saw above. That is the default setting, but now it is explicit.

In [14]:
args = p.parse_args(["-f", "input_file.txt"])

print(args)

Namespace(file='input_file.txt', o=None)
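If `None` is not a convenient value for an omitted option, `add_argument()` also takes a `default` setting. A minimal sketch, assuming we want to fall back to an arbitrary output path when `-o` is not given:

```python
import argparse

p = argparse.ArgumentParser(description="This is a test program that doesn't do anything yet")
p.add_argument("-f", "--file", help="Input file to analyze", required=True)
# "out_file.png" is an arbitrary fallback chosen for this sketch
p.add_argument("-o", help="Path to output file", default="out_file.png")

# -o is omitted, so its default is used
args = p.parse_args(["-f", "input_file.txt"])

print(args)
```

With a default in place, the rest of the script does not need to check for `None` before using `args.o`.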


In [15]:
args = p.parse_args(["-o", "out_file.png"])

print(args)

usage: ipykernel_launcher.py [-h] -f FILE [-o O]
ipykernel_launcher.py: error: the following arguments are required: -f/--file


SystemExit: 2

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)


The above error will look clearer in a terminal. Jupyter notebook doesn't handle that error very well because argparse responds to a parsing error by calling an exit function. Don't worry too much about the specifics of how the above error looks. Try it in your own script and you'll see a message printed to the terminal telling you that you omitted a required argument, and the script will abort.
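If you do want to experiment with failing inputs inside a notebook, one option is to catch the `SystemExit` that argparse raises. This is a sketch for exploration only, not something you would normally do in a script, where letting the program exit is the desired behaviour.

```python
import argparse

p = argparse.ArgumentParser()
p.add_argument("-f", "--file", required=True)
p.add_argument("-o")

try:
    # -f is required but missing, so argparse prints an error and exits
    args = p.parse_args(["-o", "out_file.png"])
except SystemExit as exc:
    # exc.code holds the exit status argparse used
    exit_code = exc.code
    print(f"argparse exited with status {exit_code}")
```

The usage and error messages are still written to standard error, but the notebook cell finishes normally.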

Next, let's take a look at a couple more things you might want to control about an argument: the type of object it is stored as, and the name you will use to refer to it within the argument namespace. Those are both things that you can set when defining your arguments. We've already seen that the default names within the namespace object are derived from the long or short options. In addition, if you look at the printed arguments above, you'll see that our inputs are being stored as `str` instances. Let's read in some numbers using short options and specify what we want to call them.

In [16]:
p = argparse.ArgumentParser(description="This is a test program that doesn't do anything yet")

p.add_argument(
    "-1", # You can use numbers for your options if you like
    help="First number",
    required=True,
    dest="num_one", # This is going to be the name of our args attribute
    type=int
)
p.add_argument(
    "-2", 
    help="Second number",
    required=False,
    dest="num_two",
    type=int
)

args = p.parse_args(["-1", "10", "-2", "5"])

print(args)

Namespace(num_one=10, num_two=5)


As you can see, we now have the provided numbers stored in our `args` object using the names we specified. You can see that the inputs are stored as ints now, as they have no quotes in the printed output. We can also check that directly.

In [17]:
print(type(args.num_one))

<class 'int'>


We can therefore use that attribute just like we would any `int` object.

In [18]:
print(args.num_one + args.num_two)

15
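The `type` setting is not limited to `int`. It accepts any callable that takes the input string and returns the converted value. The sketch below uses `float` and `pathlib.Path`. The option names `--scale` and `--outdir` are hypothetical and only serve the illustration.

```python
import argparse
from pathlib import Path

p = argparse.ArgumentParser()
# Hypothetical options used only for this illustration
p.add_argument("--scale", type=float, help="Scaling factor")
p.add_argument("--outdir", type=Path, help="Directory for output files")

args = p.parse_args(["--scale", "2.5", "--outdir", "results"])

# Each attribute already has the requested type
print(type(args.scale))
print(args.outdir / "plot.png")
```

If the conversion fails, for example when "abc" is given where an `int` is expected, argparse reports an error and exits in the same way as for a missing required argument.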


Finally, let's add an argument to change the mode of operation for our program. argparse can create args which store boolean values (`True` and `False`). You can use these booleans within your script to do things like control `if`/`else` blocks.

In [19]:
p = argparse.ArgumentParser(description="This is a test program that doesn't do anything yet")

p.add_argument(
    "-1", # You can use numbers for your options if you like
    help="First number",
    required=True,
    dest="num_one", # This is going to be the name of our args attribute
    type=int
)
p.add_argument(
    "-2", 
    help="Second number",
    required=False,
    dest="num_two",
    type=int
)
p.add_argument(
    "--subtract", 
    help="operation mode",
    required=False,
    action="store_true" # Default is false
)

args = p.parse_args(["-1", "10", "-2", "5", "--subtract"])

if args.subtract:
    print(args.num_one - args.num_two)
else:
    print(args.num_one + args.num_two)

5
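To see the "Default is false" behaviour noted in the comment above, here is a small sketch that parses the same flag twice, once with it present and once without:

```python
import argparse

p = argparse.ArgumentParser()
p.add_argument("--subtract", help="operation mode", action="store_true")

with_flag = p.parse_args(["--subtract"])
without_flag = p.parse_args([])

# The flag takes no value: its presence alone stores True
print(with_flag.subtract, without_flag.subtract)
```

Note that a `store_true` option does not consume the string that follows it, unlike the options we defined earlier.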


There are lots of other settings you can control with argparse, such as specifying fixed choices for inputs, controlling the number of inputs to be stored for positional arguments, and defining mutually exclusive groups of arguments. [The documentation](https://docs.python.org/3/library/argparse.html) has great information about all the possibilities that argparse offers.
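As a starting point for exploring those settings, here is a minimal sketch that combines the three just mentioned. The argument names (`numbers`, `--op`, `--verbose`, `--quiet`) are hypothetical and chosen only for illustration. The documentation covers these features in much more detail.

```python
import argparse
import math

p = argparse.ArgumentParser(description="Combine some integers")

# nargs="+" collects one or more inputs into a list
p.add_argument("numbers", nargs="+", type=int, help="One or more integers")

# choices restricts the accepted values for this option
p.add_argument("--op", choices=["add", "multiply"], default="add")

# At most one option from a mutually exclusive group may be given
group = p.add_mutually_exclusive_group()
group.add_argument("--verbose", action="store_true")
group.add_argument("--quiet", action="store_true")

args = p.parse_args(["2", "3", "4", "--op", "multiply", "--verbose"])

if args.op == "multiply":
    result = math.prod(args.numbers)
else:
    result = sum(args.numbers)

print(result)
```

Passing a value for `--op` that is not in `choices`, or giving both `--verbose` and `--quiet`, makes argparse print an error and exit.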