<img src='img/logo.png'>
<img src='img/title.png'>
<img src='img/py3k.png'>

# Table of Contents
* [Learning Objectives:](#Learning-Objectives:)
* [Command line interface](#Command-line-interface)
	* [Basic argument and flag parsing](#Basic-argument-and-flag-parsing)
		* [Import argparse and add arguments](#Import-argparse-and-add-arguments)
	* [Flags and Options](#Flags-and-Options)
	* [Adding Descriptions](#Adding-Descriptions)
	* [Files and stdin](#Files-and-stdin)


# Learning Objectives:

After completion of this module, learners should be able to:

* Understand the types of arguments passed to command line interfaces
* Use `argparse` to define simple positional arguments
* Use `argparse` to define optional arguments
* Use `argparse` to read `FileType` arguments, including `sys.stdin`

# Command line interface

Python offers several ways to read and parse command line arguments. The following functions and packages are among the most common.

* `input`: read a line of text directly from `stdin`. Not recommended for command line interfaces
* `sys.argv`: grab individual space-separated items from the command line
* `argparse`: reads `sys.argv` and makes user friendly command line interfaces
* `optparse`: deprecated in favor of `argparse`
* `docopt`: use docstrings to program command line arguments

For this section we will use [`argparse`](https://docs.python.org/3/library/argparse.html), which is part of the Python Standard Library. It has a good set of out-of-the-box features to illustrate many aspects of argument handling within an application.

There are two types of items that are parsed on the command line:

* **arguments** are positional specifiers that have no preceding dashes and are generally required by the Command Line Interface (CLI). Filenames are good examples of **arguments**; think of programs like `grep` and `less`. When more than one **argument** is required the order in which they are parsed is important.
* **flags** (or **options**) are specifiers that can appear in any order (and between **arguments**) and are preceded by one or two dashes. By convention a single dash is used with a one- or two-letter flag (`-h`) and two dashes are used with whole words (`--help`). Flags can be used as switches to enable or disable features of your program. They can also be followed immediately by a **value** string to provide keyword or arbitrary input (`-f input-filename` or `--file input-filename`).
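
The distinction between arguments and flags can be sketched with a small parser. The names used here (`demo`, `pattern`, `--ignore-case`, and the sample values) are hypothetical and chosen only for illustration; passing a list to `parse_args` stands in for a real command line.

```python
import argparse

# Hypothetical parser used only to illustrate arguments versus flags
parser = argparse.ArgumentParser(prog='demo')
parser.add_argument('pattern')     # first positional argument
parser.add_argument('filename')    # second positional argument
parser.add_argument('-i', '--ignore-case', action='store_true')  # switch
parser.add_argument('-f', '--file')  # flag followed by a value

# Flags may appear before, between, or after the positional arguments,
# but the positional arguments themselves are matched in order
args = parser.parse_args(
    ['-i', 'needle', '--file', 'extra.txt', 'haystack.txt'])

print(args.pattern, args.filename, args.ignore_case, args.file)
```

Swapping `needle` and `haystack.txt` would swap the values of `pattern` and `filename`, whereas moving `-i` to the end of the list would change nothing.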

## Basic argument and flag parsing

For the remainder of this section we will be using the Jupyter notebook to write scripts and run them as if they were running on the command line.

In [None]:
%%file tmp/wc.py
#!/usr/bin/env python
"""
word count

Similar to wc on Unix systems. For a given input file returns

number-of-lines number-of-words number-of-characters filename
"""

import sys

def wc(myFile):
    with open(myFile) as f:
        lines=f.readlines()
    
    words = [word for words in lines for word in words.split()]
    chars = ''.join(lines)
    
    ret = '%8d %7d %7d %s' % (len(lines),len(words),len(chars),myFile)
    
    return ret


if __name__ == '__main__':
    print(wc(sys.argv[1]))

Run the script on itself.

In [None]:
!python tmp/wc.py tmp/wc.py

What happens if we forget to pass the filename?

In [None]:
!python tmp/wc.py
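
Because the script indexes `sys.argv[1]` directly, running it without a filename fails with an `IndexError`. Without `argparse`, the script has to validate its own input. A minimal sketch of such a check follows; the helper name `get_filename` and the usage text are assumptions made for this example.

```python
import sys


def get_filename(argv):
    """Return the first item after the script name, or exit with a usage hint."""
    if len(argv) < 2:
        # sys.exit with a string prints it to stderr and exits with status 1
        sys.exit('usage: %s filename' % argv[0])
    return argv[1]


# A list stands in for sys.argv so the check can be tried interactively
print(get_filename(['wc.py', 'notes.txt']))
```

This kind of hand-written checking grows quickly as more arguments are added, which is the work `argparse` takes over.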

### Import argparse and add arguments

In [None]:
%%file tmp/wc.py
#!/usr/bin/env python
"""
word count

Similar to wc on Unix systems. For a given input file returns

number-of-lines number-of-words number-of-characters filename
"""

import sys
import argparse

def wc(myFile):
    """Read the file and return number of lines, words and characters"""
    with open(myFile) as f:
        lines=f.readlines()
    
    words = [word for words in lines for word in words.split()]
    chars = ''.join(lines)
    
    counts = '%8d %7d %7d %s' % (len(lines),len(words),len(chars),myFile)
    
    return counts


def cli():
    """Define the command line interface"""
    parser = argparse.ArgumentParser()
    parser.add_argument('filename')
    
    # parse_args automatically reads from sys.argv
    return parser.parse_args()
    

if __name__ == '__main__':
    args=cli()
    print(wc(args.filename))

In [None]:
!python tmp/wc.py tmp/wc.py

The `filename` argument is now required, and `argparse` automatically sets up the `-h/--help` flags to display usage information.

In [None]:
!python tmp/wc.py

In [None]:
!python tmp/wc.py --help
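
The same behavior can be explored without leaving Python by passing a list to `parse_args` instead of letting it read `sys.argv`. The following is a sketch under that assumption; the `prog` value and the sample filename are illustrative only.

```python
import argparse

parser = argparse.ArgumentParser(prog='wc.py')
parser.add_argument('filename')

# The usage line is generated from the arguments that were added
usage = parser.format_usage()
print(usage)

# A list of strings stands in for the command line
args = parser.parse_args(['notes.txt'])
print(args.filename)

# A missing required argument makes argparse print an error and exit
try:
    parser.parse_args([])
except SystemExit as exc:
    exit_status = exc.code
    print('exit status:', exit_status)
```

Passing an explicit list like this is also a convenient way to test a command line interface.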

## Flags and Options

We can use flags/options to change the behavior of `wc.py` to only show certain information. 

The `ArgumentParser.add_argument` method has a keyword argument called `action` whose default value is `store`. This means that the value provided to the CLI is stored, which is what allowed `args.filename` to be used in the above example. Other provided actions include

* `store_true`: `True` if the flag is provided, `False` otherwise
* `store_false`: The opposite of the above
* `store_const`: Store the value given by the `const` keyword argument when the flag is provided
* `append`: Append the option value to a list. Useful for repeated usage of a flag.
* `append_const`: Append the value given by the `const` keyword argument to a list
* `count`: Count the number of occurrences of the flag
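
Several of these actions can be tried side by side in a short sketch. The flag names (`--quiet`, `--no-color`, `--fast`, `-I`, `-v`) and their `dest` names are hypothetical and are not part of `wc.py`.

```python
import argparse

parser = argparse.ArgumentParser(prog='actions-demo')
# store_true: False unless the flag is given
parser.add_argument('--quiet', action='store_true')
# store_false: True unless the flag is given
parser.add_argument('--no-color', dest='color', action='store_false')
# store_const: store the value of const when the flag is given
parser.add_argument('--fast', dest='mode', action='store_const',
                    const='fast', default='normal')
# append: collect the value of every occurrence in a list
parser.add_argument('-I', dest='include', action='append')
# count: count how many times the flag appears
parser.add_argument('-v', dest='verbosity', action='count', default=0)

args = parser.parse_args(['--fast', '-I', 'a', '-I', 'b', '-vv'])
print(args.quiet, args.color, args.mode, args.include, args.verbosity)
```

Flags that were not supplied (`--quiet` and `--no-color` here) keep their defaults, which is why `args.quiet` is `False` and `args.color` is `True`.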

In [None]:
%%file tmp/wc.py
#!/usr/bin/env python
"""
word count

Similar to wc on Unix systems. For a given input file returns

number-of-lines number-of-words number-of-characters filename
"""

import sys
import argparse

# wc now takes all of the parsed arguments as input
def wc(args):
    """Read the file and return number of lines, words and characters"""
    with open(args.filename) as f:
        lines=f.readlines()
    
    counts = ''
    words = [word for words in lines for word in words.split()]
    chars = ''.join(lines)
    
    lineCount = '%8d' % len(lines)
    wordCount = '%8d' % len(words)
    charCount = '%8d' % len(chars)
    byteCount = '%8d' % len(chars.encode('utf-8'))
    
    if(args.l):
        counts += lineCount
    if(args.w):
        counts += wordCount
    if(args.m):
        counts += charCount
    if(args.c):
        counts += byteCount
    
    if( not (args.l or args.w or args.c or args.m)):
        counts += lineCount+wordCount+charCount
    
    counts += ' %s' % args.filename
    
    return counts


def cli():
    """Define the command line interface"""
    parser = argparse.ArgumentParser()
   
    # filename is a required argument
    parser.add_argument('filename')
    
    # flags are used to perform actions
    parser.add_argument('-c',action='store_true') #print number of bytes
    parser.add_argument('-l',action='store_true') #print number of lines
    parser.add_argument('-m',action='store_true') #print number of characters
    parser.add_argument('-w',action='store_true') #print number of words
    
    # parse_args automatically reads from sys.argv
    return parser.parse_args()
    

if __name__ == '__main__':
    args=cli()
    print(wc(args))  

In [None]:
!python tmp/wc.py -l tmp/wc.py

Only the flags defined in the parser can be provided to the CLI; any other flag produces an error.

In [None]:
!python tmp/wc.py -z tmp/wc.py

## Adding Descriptions

Use the `help` keyword argument in `add_argument` to provide a description of the argument that is displayed with `--help`. Here we also show that both the short and long forms of an option/flag can be provided. Notice that the `args` object now only has attributes named after the long version of each option.

In [None]:
%%file tmp/wc.py
#!/usr/bin/env python
"""
word count

Similar to wc on Unix systems. For a given input file returns

number-of-lines number-of-words number-of-characters filename
"""

import sys
import argparse

# wc now takes all of the parsed arguments as input
def wc(args):
    """Read the file and return number of lines, words and characters"""
    with open(args.filename) as f:
        lines=f.readlines()
    
    counts = ''
    words = [word for words in lines for word in words.split()]
    chars = ''.join(lines)
    
    lineCount = '%8d' % len(lines)
    wordCount = '%8d' % len(words)
    charCount = '%8d' % len(chars)
    byteCount = '%8d' % len(chars.encode('utf-8'))
    
    if(args.lines):
        counts += lineCount
    if(args.words):
        counts += wordCount
    if(args.chars):
        counts += charCount
    if(args.bytes):
        counts += byteCount
    
    if( not (args.lines or args.words or args.chars or args.bytes)):
        counts += lineCount+wordCount+charCount
    
    counts += ' %s' % args.filename
    
    return counts


def cli():
    """Define the command line interface"""
    parser = argparse.ArgumentParser(description='A basic line/word/character counting script.')
   
    # filename is a required argument
    parser.add_argument('filename',help='File to be parsed')
    
    # flags are used to perform actions
    parser.add_argument('-c','--bytes',action='store_true',
                        help='Print the number of bytes in the file')
    parser.add_argument('-l','--lines',action='store_true',
                        help='Print the number of lines in the file')
    parser.add_argument('-m','--chars',action='store_true',
                        help='Print the number of characters in the file')
    parser.add_argument('-w','--words',action='store_true',
                        help='Print the number of words in the file')
    
    # parse_args automatically reads from sys.argv
    return parser.parse_args()
    

if __name__ == '__main__':
    args=cli()
    print(wc(args))  

In [None]:
!python tmp/wc.py --help

In [None]:
!python tmp/wc.py -l tmp/wc.py --words
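
How the attribute names on `args` are chosen can be checked with a small sketch. The `--line-count` option is hypothetical and is included only to show that dashes inside a long option name become underscores.

```python
import argparse

parser = argparse.ArgumentParser(prog='dest-demo')
parser.add_argument('-l', '--lines', action='store_true',
                    help='Print the number of lines in the file')
# Hypothetical option with a dash inside its long name
parser.add_argument('-n', '--line-count', action='store_true',
                    help='Illustrative option only')

# The short form is used on the command line...
args = parser.parse_args(['-l'])

# ...but the attribute is named after the long form
print(vars(args))
print(hasattr(args, 'l'))
```

If a different attribute name is wanted, the `dest` keyword argument of `add_argument` sets it explicitly.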

## Files and stdin

A good practice when developing CLI utilities is to be able to take the output of one command and *pipe* it to another over the standard input. This is similar to the `input` function we saw earlier in this course. `argparse` supports this usage through `FileType`, which is passed as the `type` of an argument. A `FileType` argument accepts `-` on the command line, which instructs `argparse` to use `sys.stdin` instead of opening a file.

The `type` keyword argument to `add_argument` allows the developer to specify the expected type of the argument and automatically perform the type casting and error checking. 
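
As a minimal sketch of the conversion and error checking that `type` performs, consider a parser with numeric arguments. The argument names `count` and `--scale` are assumptions made for this example.

```python
import argparse

parser = argparse.ArgumentParser(prog='type-demo')
parser.add_argument('count', type=int)                    # converted with int()
parser.add_argument('--scale', type=float, default=1.0)   # converted with float()

# Values arrive as strings and are converted before being stored
args = parser.parse_args(['3', '--scale', '2.5'])
result = args.count * args.scale
print(result)

# A value that cannot be converted is reported as a usage error
try:
    parser.parse_args(['three'])
except SystemExit as exc:
    exit_status = exc.code
    print('exit status:', exit_status)
```

Any callable that takes a single string and returns the converted value can be used as `type`, which is how `argparse.FileType('r')` fits into the same mechanism.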

In [None]:
%%file tmp/wc.py
#!/usr/bin/env python
"""
word count

Similar to wc on Unix systems. For a given input file returns

number-of-lines number-of-words number-of-characters filename
"""

import sys
import argparse

# wc now takes all of the parsed arguments as input
def wc(args):
    """Read the file and return number of lines, words and characters"""
    lines=args.filename.readlines()
    
    counts = ''
    words = [word for words in lines for word in words.split()]
    chars = ''.join(lines)
    
    lineCount = '%8d' % len(lines)
    wordCount = '%8d' % len(words)
    charCount = '%8d' % len(chars)
    byteCount = '%8d' % len(chars.encode('utf-8'))
    
    if(args.lines):
        counts += lineCount
    if(args.words):
        counts += wordCount
    if(args.chars):
        counts += charCount
    if(args.bytes):
        counts += byteCount
    
    if( not (args.lines or args.words or args.chars or args.bytes)):
        counts += lineCount+wordCount+charCount
    
    # args.filename is now an open file object, so report its name
    counts += ' %s' % args.filename.name
    
    return counts


def cli():
    """Define the command line interface"""
    parser = argparse.ArgumentParser(description='A basic line/word/character counting script.')
   
    # filename is a required argument
    parser.add_argument('filename',help='File to be parsed', type=argparse.FileType('r'))
    
    # flags are used to perform actions
    parser.add_argument('-c','--bytes',action='store_true',
                        help='Print the number of bytes in the file')
    parser.add_argument('-l','--lines',action='store_true',
                        help='Print the number of lines in the file')
    parser.add_argument('-m','--chars',action='store_true',
                        help='Print the number of characters in the file')
    parser.add_argument('-w','--words',action='store_true',
                        help='Print the number of words in the file')
    
    # parse_args automatically reads from sys.argv
    return parser.parse_args()
    

if __name__ == '__main__':
    args=cli()
    print(wc(args))  

In [None]:
!head tmp/wc.py | python tmp/wc.py -
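
What `FileType` hands back can be inspected directly. The sketch below assumes a throwaway temporary file in place of a real input file; the argument name `infile` is hypothetical.

```python
import argparse
import os
import sys
import tempfile

parser = argparse.ArgumentParser(prog='filetype-demo')
parser.add_argument('infile', type=argparse.FileType('r'))

# '-' is mapped to the already-open standard input stream
stdin_args = parser.parse_args(['-'])
print(stdin_args.infile is sys.stdin)

# Any other value is treated as a path and opened for reading
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('one two\nthree\n')

file_args = parser.parse_args([tmp.name])
with file_args.infile as f:
    lines = f.readlines()
print(len(lines), file_args.infile.name == tmp.name)

# Clean up the temporary file
os.remove(tmp.name)
```

Because the parser opens the file itself, the script is responsible for closing it, which is why the file object is used in a `with` block here.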

<img src='img/copyright.png'>