<a href="https://colab.research.google.com/github/OSGeoLabBp/tutorials/blob/master/english/data_processing/lessons/commandlineparameters.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Command line arguments

Command line arguments make our programs more flexible and more useful in automation tasks. They are widely used in CLI (Command Line Interface) programs.

Command line arguments are given after the program name in the command, for example:

```
python my_program.py something.csv ';'
```

In the example above, a file name and a separator character are given in the command line.

Command line parameters are usually handled in one of two ways:

*   **argv** variable from **sys** module;
*   **argparse** module.




**Important comment**

>On your own Windows machine you have to open a **command window** to pass command line parameters to your program.

## Using argv list

The argv list contains all parameters from the command line as strings. The first item (at zero index) is the name of the program.

Note

In Colab, command line arguments cannot be passed directly. We have to save the Python code to a file on the Colab virtual machine and run it with **!python**. On your own machine, the exclamation mark is not needed.

In [32]:
code = """
from sys import argv
print(f"{len(argv)} arguments given in the command line")
for i, arg in enumerate(argv):
    print(f"{i}th parameter: {arg} ({type(arg)})")
"""
with open("argv_test.py", "w") as f:
    print(code, file=f)

In [33]:
!python argv_test.py abc 12 this.txt

4 arguments given in the command line
0th parameter: argv_test.py (<class 'str'>)
1th parameter: abc (<class 'str'>)
2th parameter: 12 (<class 'str'>)
3th parameter: this.txt (<class 'str'>)


Note that all parameters are stored as strings, even 12.
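
Because every item arrives as a string, numeric parameters have to be converted explicitly. The following is only a minimal sketch, not part of the original program: the helper name `read_params`, the parameter order and the optional third parameter (number of header lines) are assumptions made for illustration.

```python
def read_params(argv):
    """Return (file name, separator, header lines) from an argv-style list.

    Hypothetical helper: the parameter order and the defaults are assumptions.
    """
    if len(argv) < 2:
        raise ValueError(f"Usage: {argv[0]} file [separator] [header_lines]")
    fname = argv[1]                                        # always a string
    sep = argv[2] if len(argv) > 2 else ","                # default separator
    header_lines = int(argv[3]) if len(argv) > 3 else 0    # str -> int
    return fname, sep, header_lines

# simulate a command line; in a real program pass sys.argv instead
params = read_params(["my_program.py", "something.csv", ";", "2"])
print(params)
```

Note that `int()` raises a `ValueError` if the third parameter is not a whole number, so a real program would probably want to catch that and print a usage message.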

Try to run the argv_test.py program with different parameters and check the output.

If you would like to try the example on your own machine then copy the following code into the *argv_test.py* file.

```
from sys import argv
print(f"{len(argv)} arguments given in the command line")
for i, arg in enumerate(argv):
    print(f"{i}th parameter: {arg} ({type(arg)})")
```

Open a command window with the **cmd** command and enter:

```
python argv_test.py here you can add your parameters
```

## Argparse module

While **argv** is mainly used for positional parameters (e.g. the first parameter is the input file name, the second is the field separator, etc.), the **argparse** module makes optional parameters easier to handle. It supports switches and default values.

There are two types of switches, the short one like "-h" and the long one like "--help".

In [34]:
pcode = """
import argparse

parser = argparse.ArgumentParser()   # create a parser object
# definition of parameters
parser.add_argument('-i', '--input', # short and long switch
                    type=str,        # text string parameter
                    required=True,   # obligatory parameter
                    help="name of input text file")
parser.add_argument('-s', '--separator',
                    type=str, default=',',
                    help="field separator character in input file")
parser.add_argument('-l', '--headerlines',
                    type=int, default=0,
                    help="Number of header lines in input file")
# get parameters
args = parser.parse_args()
print(f"input file: {args.input}")
print(f"field separator: {args.separator}")
print(f"number of header lines: {args.headerlines}")
"""
with open("argparse_test.py", "w") as f:
    print(pcode, file=f)

In [35]:
!python argparse_test.py -h

usage: argparse_test.py [-h] -i INPUT [-s SEPARATOR] [-l HEADERLINES]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        name of input text file
  -s SEPARATOR, --separator SEPARATOR
                        field separator character in input file
  -l HEADERLINES, --headerlines HEADERLINES
                        Number of header lines in input file


In [36]:
!python argparse_test.py --input file_to_process.csv --separator ";"

input file: file_to_process.csv
field separator: ;
number of header lines: 0


In [37]:
!python argparse_test.py -i file_to_process.csv -l 2

input file: file_to_process.csv
field separator: ,
number of header lines: 2


In [38]:
!python argparse_test.py

usage: argparse_test.py [-h] -i INPUT [-s SEPARATOR] [-l HEADERLINES]
argparse_test.py: error: the following arguments are required: -i/--input
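
In a notebook it can be handy to try a parser without writing a file first, because `parse_args` accepts an explicit list of strings in place of the real command line. The sketch below rebuilds the parser of the example inside a function (the name `build_parser` is an assumption, used only for illustration). It shows that short and long switches are interchangeable, and that a missing required argument ends the program with `SystemExit`.

```python
import argparse

def build_parser():
    """Same switches as argparse_test.py; the function name is hypothetical."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--input', type=str, required=True,
                        help="name of input text file")
    parser.add_argument('-s', '--separator', type=str, default=',',
                        help="field separator character in input file")
    parser.add_argument('-l', '--headerlines', type=int, default=0,
                        help="Number of header lines in input file")
    return parser

parser = build_parser()
# an explicit list replaces the real command line
short_args = parser.parse_args(['-i', 'file_to_process.csv', '-l', '2'])
long_args = parser.parse_args(['--input', 'file_to_process.csv',
                               '--headerlines', '2'])
print(short_args == long_args)        # Namespace objects compare by content
print(type(short_args.headerlines))   # converted to int because of type=int

try:
    parser.parse_args([])             # the required -i/--input is missing
except SystemExit as err:
    print(f"parser exited with code {err.code}")
```

Before exiting, argparse also writes the usage and error messages to the standard error stream.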


If you would like to try the example on your own machine then copy the following code into the argparse_test.py file.

```
import argparse

parser = argparse.ArgumentParser()   # create a parser object
# definition of parameters
parser.add_argument('-i', '--input', # short and long switch
                    type=str,        # text string parameter
                    required=True,   # obligatory parameter
                    help="name of input text file")
parser.add_argument('-s', '--separator',
                    type=str, default=',',
                    help="field separator character in input file")
parser.add_argument('-l', '--headerlines',
                    type=int, default=0,
                    help="Number of header lines in input file")
# get parameters
args = parser.parse_args()
print(f"input file: {args.input}")
print(f"field separator: {args.separator}")
print(f"number of header lines: {args.headerlines}")
```

Open a command window with the **cmd** command and enter, for example:

```
python argparse_test.py -i file_to_process.csv -l 2
```


## Working example for argv

In the following example we write a simple Python program that filters the lines of the input text files with a regular expression. The first parameter is the regexp pattern and the remaining parameters are the input files. The filtered output is printed to the command window.

In [39]:
sample_code = """
from sys import argv
from os.path import exists
import re

if len(argv) < 3:
    print(f"Usage {argv[0]} regexp file1 [file2] ... [filen]")
    exit()
try:
    pattern = re.compile(argv[1])
except re.error:    # raised by re.compile for an invalid pattern
    print(f"ERROR invalid regexp: {argv[1]}")
    exit()
for fname in argv[2:]:
    if not exists(fname):
        print(f"ERROR {fname} does not exist")
        continue
    with open(fname, 'r') as fp:
        for line in fp:
            if pattern.search(line):    # look for the pattern anywhere in the line
                print(line, end="")
"""
with open("grep.py", "w") as f:
    print(sample_code, file=f)

Let's create two data files for our program.

In [40]:
with open("hamlet.txt", "w") as fo:
    print("""To be, or not to be: that is the question:
Whether ’tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles,
And by opposing end them? To die: to sleep;
No more; and, by a sleep to say we end
The heart-ache and the thousand natural shocks
That flesh is heir to, ’tis a consummation
Devoutly to be wish’d. To die, to sleep;
To sleep: perchance to dream: ay, there’s the rub;
For in that sleep of death what dreams may come
When we have shuffled off this mortal coil""", file=fo)
with open("zen_of_python.txt", "w") as fo:
    print("""The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!""", file=fo)

Usage examples

In [41]:
!python grep.py "^T" hamlet.txt # "T" at the beginning of the line

To be, or not to be: that is the question:
The slings and arrows of outrageous fortune,
The heart-ache and the thousand natural shocks
That flesh is heir to, ’tis a consummation
To sleep: perchance to dream: ay, there’s the rub;


In [42]:
!python grep.py "^[TS]" hamlet.txt zen_of_python.txt # "T" or "S" at the beginning of the line

To be, or not to be: that is the question:
The slings and arrows of outrageous fortune,
The heart-ache and the thousand natural shocks
That flesh is heir to, ’tis a consummation
To sleep: perchance to dream: ay, there’s the rub;
The Zen of Python, by Tim Peters
Simple is better than complex.
Sparse is better than dense.
Special cases aren't special enough to break the rules.
There should be one-- and preferably only one --obvious way to do it.


In [43]:
!python grep.py "[abc" hamlet.txt

ERROR invalid regexp: [abc


In [44]:
!python grep.py ";$" zen.txt

ERROR zen.txt does not exist
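
When experimenting with patterns, keep in mind that the `match` method of a compiled pattern only tests the beginning of a line, while `search` scans the whole line, which is closer to how the Unix `grep` tool behaves. A short sketch on in-memory lines (the helper `filter_lines` and its `anywhere` parameter are hypothetical, introduced only for this comparison):

```python
import re

def filter_lines(pattern, lines, anywhere=True):
    """Return the lines accepted by the regular expression.

    Hypothetical helper: anywhere=True uses search(), False uses match().
    """
    regex = re.compile(pattern)
    test = regex.search if anywhere else regex.match
    return [line for line in lines if test(line)]

lines = ["And by opposing end them? To die: to sleep;",
         "To sleep: perchance to dream: ay, there’s the rub;",
         "Readability counts."]

print(filter_lines("^To", lines))                  # anchored pattern: same result either way
print(filter_lines(";$", lines))                   # search() finds the ';' at the end of a line
print(filter_lines(";$", lines, anywhere=False))   # match() looks only at the start: []
```

With `match`, a pattern such as `;$` can never succeed on these lines, because the semicolon would have to be the very first character.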


A more complex example for the **argparse** module is available at the end of the *Text file processing in Python* lesson.