# Setup

To run these examples, you'll need python, jupyter, and the following packages:

* numpy
* pandas


The cleanest and easiest way to get the requirements is to use the anaconda packaging and environment system. First navigate here:

https://conda.io/miniconda.html

Then download either python 2.7 or python 3. Both should work for these examples. When you run the installer, you'll have the option to choose the install path. You'll also choose whether to make this the default python interpreter. You can always edit your bashrc file to change this behavior.

# Introduction

## Scope

This tutorial focuses on the key concepts necessary for working with scientific data in python. To that end, we are interested in how to represent, transform, and visualize scientific data. 

* Objects and Data types
* Functions
* Flow control

# Numerical Types

Python has several built-in data types available to you. Use the ```type``` function if you need to see what you're working with. 

In [519]:
i = 1 #create a new integer
type(i)

int

In [520]:
j = 1.0 #create a new real number (float)
type(j)

float

In [521]:
type(i*j)

float

You can also use type-casting to convert one type to another where possible.

In [522]:
int(3.0)

3

Python automatically converts between integers and real numbers, so multiplying an int by a float produces another float. While extremely convenient, this can lead to arithmetic errors if you're not careful.
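A few of these pitfalls can be sketched as follows. This assumes python 3; in python 2.7, dividing two integers with ```/``` would give ```3``` instead of ```3.5```.

```python
ratio = 7 / 2                    # true division returns a float in python 3
floored = 7 // 2                 # floor division keeps the result an int
truncated = int(3.9)             # int() truncates toward zero, it does not round
is_equal = (0.1 + 0.2 == 0.3)    # floats are binary approximations

print(ratio, floored, truncated, is_equal)  # 3.5 3 3 False
```

When comparing floats, it is usually safer to test that the difference is small rather than test for exact equality.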

## Strings

Strings hold text (sequences of characters) of arbitrary length.

In [523]:
mystr = 'hello world!'
mystr

'hello world!'

In [524]:
mystr = """
Hello.
This is a multi-lined string"""
print(mystr)


Hello.
This is a multi-lined string


Strings have their own methods - operations you can perform to produce other strings.

In [525]:
mystr = "{0} x {1} = {2:3.2f}".format(3, 5, 15) # a formatted string
print(mystr)

3 x 5 = 15.00
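In the format string above, ```{2:3.2f}``` means "take argument 2 and print it as a float with a minimum width of 3 and 2 decimal places". A short sketch of a few other common string methods (the sample strings are made up for illustration):

```python
title = 'hello world!'.title()            # capitalize each word
words = 'a,b,c'.split(',')                # split into a list of strings
joined = '-'.join(words)                  # join a list back into one string
padded = '{:>8.3f}'.format(3.14159)       # right-align in 8 characters, 3 decimals

print(title)   # Hello World!
print(words)   # ['a', 'b', 'c']
print(joined)  # a-b-c
print(repr(padded))
```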


Try tab completion to look for other methods that might be useful. 

    mystr.<tab>
    
This will pop up a list of other commands you can perform.

Use python's built-in ```help``` function to see more info.

In [526]:
help(mystr.format)

Help on built-in function format:

format(...) method of builtins.str instance
    S.format(*args, **kwargs) -> str
    
    Return a formatted version of S, using substitutions from args and kwargs.
    The substitutions are identified by braces ('{' and '}').



Alternatively, put ```?``` at the end of a command and the documentation will appear at the bottom of the notebook (hit the ```esc``` key to remove it).

More documentation of string formatting can be found here

https://docs.python.org/2/library/string.html#format-string-syntax

# Lists

Lists are created with brackets and commas, and can contain a combination of data types, including other lists.

In [527]:
mylist = [3,'four',5., ['another', 'list']]
mylist

[3, 'four', 5.0, ['another', 'list']]

You can iterate over lists using a for loop.

In [528]:
for x in mylist:
    print(x)

3
four
5.0
['another', 'list']


In [529]:
mylist[0] # access the first element

3

In [530]:
mylist[-1] # access the last element

['another', 'list']

In [531]:
mylist[-2] # access the second-to-last element

5.0

In [532]:
mylist.append(8) # insert at the end of the list
mylist

[3, 'four', 5.0, ['another', 'list'], 8]

In [533]:
mylist.pop(0) # take off the first element

3

In [534]:
mylist

['four', 5.0, ['another', 'list'], 8]

**Lists are best when dealing with smaller numbers of elements (fewer than about 1,000).** For numerical data, numpy arrays are better, especially for larger datasets.

## Tuples

Tuples are like lists, but are immutable (cannot be edited). Tuples are created with parentheses and commas:

In [535]:
mytuple = (3,4,5)
mytuple

(3, 4, 5)

In [536]:
# mytuple[2] = 0 # Uncomment to see what happens when you try to change an element of a tuple.
mytuple[2]

5

You can also make a tuple without parentheses, which comes in handy when assigning multiple variables:

In [537]:
x, y, z = 3, 4, 5

In [538]:
print(x,y,z)

3 4 5
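Tuple unpacking also makes it easy to swap variables or to return several values at once. A minimal sketch, where ```min_max``` is a hypothetical helper written for this example:

```python
x, y = 1, 2
x, y = y, x            # swap without a temporary variable

def min_max(values):
    """Hypothetical helper: return the smallest and largest value as a tuple."""
    return min(values), max(values)

lo, hi = min_max([3, 1, 4])   # unpack the returned tuple
print(x, y, lo, hi)           # 2 1 1 4
```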


## Dictionaries

Dictionaries are collections of {key, value} pairs. There are several ways to create a dictionary:

In [539]:
dict(a = 1, b = 'two', c = 3) # constructor notation

{'a': 1, 'b': 'two', 'c': 3}

In [540]:
{'a': 1, 'b': 'two', 'c': 3} # literal (curly brace) notation

{'a': 1, 'b': 'two', 'c': 3}

In [541]:
mylist = [('c',3),('a',1),('b','two')] # from a list of 2-element tuples
d = dict(mylist)
d

{'c': 3, 'a': 1, 'b': 'two'}

Elements of a dictionary are accessible using bracket notation. A ```KeyError``` will be raised if the key is missing.

In [542]:
d['a']

1

In [543]:
d['a'] = 'one'
d

{'c': 3, 'a': 'one', 'b': 'two'}
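If a key might be missing, you can catch the ```KeyError```, test for membership with ```in```, or use the ```get``` method to supply a fallback. A short sketch (the key ```'z'``` is just an example of a missing key):

```python
d = {'c': 3, 'a': 'one', 'b': 'two'}

try:
    d['z']
except KeyError as e:
    print('missing key:', e)

has_a = 'a' in d            # membership test on the keys
value = d.get('z')          # returns None instead of raising
fallback = d.get('z', 0)    # returns the supplied default
print(has_a, value, fallback)  # True None 0
```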

You can iterate over dictionaries using the ```items``` method.

In [544]:
for k,v in d.items():
    print(k, v)

c 3
a one
b two


In versions of python before 3.7, the order of a dictionary is not guaranteed, but you can use an ordered dictionary where necessary:

In [545]:
from collections import OrderedDict
mylist = [('youfirst',3),('second',1),('third','two')] 
OrderedDict(mylist) # Will be created in the same order as the list

OrderedDict([('youfirst', 3), ('second', 1), ('third', 'two')])

As of python 3.7, dictionaries preserve insertion order by default. So, if you are running this notebook in python 3.7 or later, the following dictionary should be in the same order as the above ```OrderedDict```:

In [546]:
dict(mylist)

{'youfirst': 3, 'second': 1, 'third': 'two'}

## DefaultDictionaries

These allow you to supply a default factory (such as ```list```) that creates the value for any key that is missing.

In [547]:
from collections import defaultdict

d = defaultdict(list)
d['a'].append(2) #creates a list at 'a' and inserts 2 as the first element
d['a']

[2]
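Another common use is counting, where ```int``` serves as the factory and missing keys start at 0. A minimal sketch with made-up data:

```python
from collections import defaultdict

counts = defaultdict(int)               # missing keys start at int() == 0
for fruit in ['pear', 'apple', 'pear']:
    counts[fruit] += 1                  # no need to check if the key exists

print(dict(counts))  # {'pear': 2, 'apple': 1}
```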

## None Type

It is worth noting the special ```None``` value (of type ```NoneType```), which is useful when you need a default value that is not numeric.

In [548]:
a = 0
print(a == 0)
print(a is None)

True
False
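Because ```0``` and ```None``` are different, ```None``` can signal "no value was given" even when 0 is a legitimate input. A small sketch, where ```describe``` is a hypothetical function written for this example:

```python
def describe(value=None):
    """Hypothetical helper: report whether a value was supplied."""
    if value is None:          # use `is`, not `==`, to test for None
        return 'no value given'
    return 'got {}'.format(value)

print(describe())    # no value given
print(describe(0))   # got 0
```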


# Scientific Data Types

## Numpy Arrays

These are highly optimized data structures suitable for number crunching. The methods available to numpy objects are coded in C or Fortran. Most python analysis packages use numpy under the hood.

In [549]:
import numpy as np # this puts the numpy library in the np namespace

In [550]:
a = np.linspace(0, 1, 12) # 12 evenly spaced real numbers on [0,1]
a.shape

(12,)

In [551]:
a = a**2 # square the array
a

array([0.        , 0.00826446, 0.03305785, 0.07438017, 0.1322314 ,
       0.20661157, 0.29752066, 0.40495868, 0.52892562, 0.66942149,
       0.82644628, 1.        ])

In [552]:
a.resize(4,3) # resize the array in-place
a

array([[0.        , 0.00826446, 0.03305785],
       [0.07438017, 0.1322314 , 0.20661157],
       [0.29752066, 0.40495868, 0.52892562],
       [0.66942149, 0.82644628, 1.        ]])

In [553]:
a[1,:] # grab the second row (index 1), fortran-style slicing

array([0.07438017, 0.1322314 , 0.20661157])

In [554]:
np.linalg.norm(a, axis = 1) # compute the Euclidean norm of each row

array([0.03407525, 0.25633161, 0.72957   , 1.45984197])
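Note that ```np.linalg.norm``` returns the length of each row; it does not rescale the array. To actually normalize the rows to unit length, you could divide by those norms. A minimal sketch, rebuilding the same 4x3 array so the example is self-contained:

```python
import numpy as np

a = (np.linspace(0, 1, 12)**2).reshape(4, 3)

# keepdims=True gives shape (4, 1), so the division broadcasts across each row
norms = np.linalg.norm(a, axis=1, keepdims=True)
unit_rows = a / norms

print(np.linalg.norm(unit_rows, axis=1))  # each row now has length 1
```

This assumes no row is all zeros; otherwise the division would produce ```nan``` values.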

### Iterating with numpy arrays
While you *could* iterate over numpy arrays, you probably do not need to: numpy has an optimized C or Fortran version for almost every numerical operation you'll need. As an example, suppose we want to square a very large array. 

In [555]:
a = np.linspace(0,2,10000000) #10 million points on [0,2]

Let's create a copy of a, which we will use to store the result. 

In [556]:
b = a.copy() 

Note: we could have used ```b = a```, but then changing b would also change a
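The difference between an alias and a copy can be checked on a small array. A minimal sketch with made-up values:

```python
import numpy as np

a = np.zeros(3)
alias = a           # a second name for the same array
copied = a.copy()   # an independent array with the same values

alias[0] = 1.0      # also changes a
copied[1] = 2.0     # leaves a untouched

print(a)            # [1. 0. 0.]
print(alias is a, copied is a)  # True False
```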

The slow way of squaring the array:

In [557]:
for i in range(len(a)):
    b[i] = a[i]**2
b[-1]

4.0

The fast way:

In [558]:
b = a**2
b[-1]

4.0

Note that the same would not work with a list or a ```range``` object:

In [559]:
a = range(10) # handy way of creating a sequence of integers from [0,9] (a list in python 2)
try:
    a**2
except TypeError as e:
    print(e)

unsupported operand type(s) for ** or pow(): 'range' and 'int'
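For a plain python sequence you would need to loop, for example with a list comprehension, whereas the numpy equivalent operates on the whole array at once. A short sketch comparing the two:

```python
import numpy as np

squares = [x**2 for x in range(10)]   # element-by-element in python
np_squares = np.arange(10)**2         # vectorized in numpy

print(squares)
print(np_squares)
```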


**If you are dealing with numerical data you should probably use numpy.** It avoids the need to iterate in python and it probably has everything you need.

# Pandas DataFrame

On their surface, pandas data types look much like Excel spreadsheets. Under the hood, they are built on numpy arrays and they bring together many powerful features we've seen in other data types, making them ideal for data processing. Together, pandas and numpy have become a mainstay in the data science community.

To see how pandas works, let's start with its most ubiquitous type, the ```DataFrame```:

In [560]:
import pandas as pd

In [561]:
names = [('elvis', 'presley', 85), ('bob','smith', 30), ('jane','doe', 32)]

names = pd.DataFrame(names, columns = ['First','Last', 'Age'], index = ['first','second','third'])
names

|  | First | Last | Age |
|---|---|---|---|
| first | elvis | presley | 85 |
| second | bob | smith | 30 |
| third | jane | doe | 32 |


The above dataframe renders like a spreadsheet when viewed in a jupyter notebook.

### Accessing columns
A given column (a ```pd.Series``` type) can be retrieved using dictionary-like syntax.

In [562]:
names['Age']

first     85
second    30
third     32
Name: Age, dtype: int64

We can also access the same column through dot notation, provided the column name follows python's naming conventions and is not already used by one of dataframe's methods.

In [563]:
names.First

first     elvis
second      bob
third      jane
Name: First, dtype: object
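To see why the caveat about dot notation matters, consider a column named ```count```, which collides with the ```DataFrame.count``` method, and a column whose name contains a space. A small sketch with a made-up dataframe:

```python
import pandas as pd

df = pd.DataFrame({'count': [1, 2], 'first name': ['bob', 'jane']})

is_method = callable(df.count)   # True: dot notation finds the method, not the column
column = df['count']             # bracket notation always returns the column
spaced = df['first name']        # names with spaces only work with brackets

print(is_method)
print(column.tolist(), spaced.tolist())
```

Bracket notation is therefore the safer choice in scripts, while dot notation is convenient for interactive work.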

In [564]:
names.sort_values('Age')

|  | First | Last | Age |
|---|---|---|---|
| second | bob | smith | 30 |
| third | jane | doe | 32 |
| first | elvis | presley | 85 |


In [565]:
names.Age**2

first     7225
second     900
third     1024
Name: Age, dtype: int64

### Accessing rows

A given row may be accessed using the ```loc``` and ```iloc``` indexers, both of which will return a ```pd.Series``` object. Use ```loc``` if you know the index of the row by name, and ```iloc``` if you know its integer position.

In [566]:
names.loc['second']

First      bob
Last     smith
Age         30
Name: second, dtype: object

In [567]:
names.iloc[1]

First      bob
Last     smith
Age         30
Name: second, dtype: object

You may also provide a boolean series object as the indexer.

In [568]:
names[names.Age > 30]

|  | First | Last | Age |
|---|---|---|---|
| first | elvis | presley | 85 |
| third | jane | doe | 32 |
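Boolean series can be combined with ```&``` (and), ```|``` (or) and ```~``` (not), as long as each condition is wrapped in parentheses. They can also be passed to ```loc``` together with a column name. A minimal sketch that rebuilds the same dataframe:

```python
import pandas as pd

names = pd.DataFrame(
    [('elvis', 'presley', 85), ('bob', 'smith', 30), ('jane', 'doe', 32)],
    columns=['First', 'Last', 'Age'],
    index=['first', 'second', 'third'])

# rows where both conditions hold
mask = (names.Age > 30) & (names.Last != 'presley')
selected = names[mask]

# select rows by mask and a single column by name
older_first = names.loc[names.Age > 30, 'First']

print(selected)
print(older_first.tolist())  # ['elvis', 'jane']
```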


## Multi-indexed data

Suppose we want to represent a regular grid of 24 values indexed by i,j,k. We must first construct a ```MultiIndex```.

In [569]:
multi_index = pd.MultiIndex.from_product([range(2), range(3), range(4)], names = ['i','j','k'])

Now we need our data in a flattened array of compatible length.

In [570]:
data = np.linspace(0,1,24)

In [571]:
df = pd.DataFrame(data, index = multi_index)

df.head(10) #get the first 10 rows 

| i | j | k | 0 |
|---|---|---|---|
| 0 | 0 | 0 | 0.0 |
| 0 | 0 | 1 | 0.043478 |
| 0 | 0 | 2 | 0.086957 |
| 0 | 0 | 3 | 0.130435 |
| 0 | 1 | 0 | 0.173913 |
| 0 | 1 | 1 | 0.217391 |
| 0 | 1 | 2 | 0.26087 |
| 0 | 1 | 3 | 0.304348 |
| 0 | 2 | 0 | 0.347826 |
| 0 | 2 | 1 | 0.391304 |


In [572]:
df.loc[1,:,2] # get all values for i = 1, k = 2

| i | j | k | 0 |
|---|---|---|---|
| 1 | 0 | 2 | 0.608696 |
| 1 | 1 | 2 | 0.782609 |
| 1 | 2 | 2 | 0.956522 |
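The same selection can be written more explicitly with ```pd.IndexSlice``` or with the ```xs``` (cross-section) method. This is an alternative sketch rather than the only way to do it; it rebuilds the same multi-indexed dataframe so it can run on its own:

```python
import numpy as np
import pandas as pd

multi_index = pd.MultiIndex.from_product(
    [range(2), range(3), range(4)], names=['i', 'j', 'k'])
df = pd.DataFrame(np.linspace(0, 1, 24), index=multi_index)

idx = pd.IndexSlice
subset = df.loc[idx[1, :, 2], :]          # i = 1, any j, k = 2; all columns

cross = df.xs((1, 2), level=['i', 'k'])   # same rows, with the i and k levels dropped

print(subset)
print(cross)
```

```IndexSlice``` keeps all three index levels in the result, while ```xs``` drops the levels you selected on, leaving only ```j```.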


Documentation on multi-indexers can be found here: https://pandas.pydata.org/pandas-docs/stable/advanced.html

## Time Series Data

For data indexed by time, we may use the special DatetimeIndex.

In [573]:
timerange = pd.date_range('Jan 1, 2018', 'Dec 31, 2018', freq = 'D')
doy = pd.Series(1, index = timerange)
doy.head()

2018-01-01    1
2018-01-02    1
2018-01-03    1
2018-01-04    1
2018-01-05    1
Freq: D, dtype: int64

Now we can group data by month and sum each group individually. In this case, we get the total number of days in each month.

In [574]:
doy.groupby(pd.Grouper(freq = '1 M')).sum()

2018-01-31    31
2018-02-28    28
2018-03-31    31
2018-04-30    30
2018-05-31    31
2018-06-30    30
2018-07-31    31
2018-08-31    31
2018-09-30    30
2018-10-31    31
2018-11-30    30
2018-12-31    31
Freq: M, dtype: int64
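An alternative sketch of the same monthly total groups on the month number taken from the index, rather than on a ```Grouper```. The result is then indexed by the integers 1 through 12 instead of by month-end dates:

```python
import pandas as pd

timerange = pd.date_range('2018-01-01', '2018-12-31', freq='D')
doy = pd.Series(1, index=timerange)

# doy.index.month holds the month number (1-12) of every timestamp
days_per_month = doy.groupby(doy.index.month).sum()

print(days_per_month.loc[2])   # 28 days in February 2018
print(days_per_month.sum())    # 365
```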

We can also group by column:

In [575]:
df = pd.DataFrame(dict(count = range(6),
                 fruit =['orange','apple','pear','orange','pear','pear']))
df

|  | count | fruit |
|---|---|---|
| 0 | 0 | orange |
| 1 | 1 | apple |
| 2 | 2 | pear |
| 3 | 3 | orange |
| 4 | 4 | pear |
| 5 | 5 | pear |


In [576]:
df.groupby('fruit').mean()

| fruit | count |
|---|---|
| apple | 1.0 |
| orange | 1.5 |
| pear | 3.666667 |
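Several statistics can be computed per group in one call with ```agg```. A minimal sketch on the same fruit data:

```python
import pandas as pd

df = pd.DataFrame(dict(count=range(6),
                       fruit=['orange', 'apple', 'pear', 'orange', 'pear', 'pear']))

# one row per fruit, one column per statistic
summary = df.groupby('fruit')['count'].agg(['sum', 'mean', 'max'])
print(summary)
```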


# Functions

Now that we have the basic data types and operations, we can use them to write our own functions. There are two ways to create functions: lambda syntax and def syntax, but most functions are written using the ```def``` syntax. Here the first line must be a ```def``` statement. The remaining lines must be indented from the first line. The ```return``` statement sends the result back to the expression that called it.

In [577]:
def myfunc(x):
    return x + 1

print("{} + 1 = {}".format(2, myfunc(2)))

2 + 1 = 3


You may include optional values as arguments.

In [578]:
def myfunc(x = 2):
    return x + 1

print(myfunc())

3


Documentation strings can be included as the first statement in the body of your function.

In [579]:
def myfunc(x = 2):
    """Adds 1 to any input value"""
    return x + 1

Now you can view the documentation by calling the help function.

In [580]:
help(myfunc) # or use myfunc? 

Help on function myfunc in module __main__:

myfunc(x=2)
    Adds 1 to any input value



### Multiple arguments

Functions can have multiple positional and optional arguments, but the optional ones must be listed last.

In [581]:
def myfunc(x,y, z = 0):
    return x + y + z
myfunc(3,4)

7

In [582]:
try:
    myfunc(z = 1)
except TypeError as e:
    print(e)

myfunc() missing 2 required positional arguments: 'x' and 'y'


### Decorators
Functions that take a function as input and return a new function are known as decorators.

In [583]:
def verbose(func):
    def wrapper(*args,**kwargs):
        print('calling {} with arguments'.format(func.__name__),args,kwargs)
        return func(*args,**kwargs)
    return wrapper

newfunc = verbose(myfunc)
newfunc(2,3,z = 1)

calling myfunc with arguments (2, 3) {'z': 1}


6

Now ```newfunc``` refers to a wrapped version of ```myfunc```. 


Instead of keeping around an extra function definition, python provides the ```@decorator``` syntax:

In [584]:
@verbose
def myfunc(x,y, z = 0):
    return x + y + z

In [585]:
myfunc(3,4)

calling myfunc with arguments (3, 4) {}


7

This "syntactic sugar" allows us to easily modify our functions with minimal changes to our code.
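One side effect of wrapping is that the decorated function takes on the name and docstring of the inner ```wrapper```. The standard library's ```functools.wraps``` copies that metadata over from the original. A sketch of the same ```verbose``` decorator with this addition (the ```add``` function is made up for the example):

```python
import functools

def verbose(func):
    @functools.wraps(func)   # copy __name__ and __doc__ from func onto wrapper
    def wrapper(*args, **kwargs):
        print('calling {} with arguments'.format(func.__name__), args, kwargs)
        return func(*args, **kwargs)
    return wrapper

@verbose
def add(x, y, z=0):
    """Adds up to three numbers"""
    return x + y + z

result = add(1, 2, z=3)
print(result, add.__name__, add.__doc__)
```

Without ```functools.wraps```, ```add.__name__``` would be ```'wrapper'``` and ```help(add)``` would not show the docstring.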

## Lambda functions
These are single-line functions that are defined with the following syntax:

In [586]:
a = lambda x, y: x + y

Now call the function with parentheses:

In [587]:
a(2,3) 

5

Lambdas are useful if your function is a single expression that doesn't need to be used anywhere else in your code and doesn't require documentation. Kamodo uses lambda functions liberally.
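A typical place for a lambda is as a short, throwaway argument to another function, such as the ```key``` used by ```sorted```. A minimal sketch with made-up data:

```python
people = [('elvis', 85), ('bob', 30), ('jane', 32)]

# sort by the second element of each tuple (the age)
by_age = sorted(people, key=lambda person: person[1])

print(by_age)  # [('bob', 30), ('jane', 32), ('elvis', 85)]
```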

# Custom Types
To create a custom data type, you will need to write a ```class```, which is a blueprint that describes how to instantiate objects of your new ```type```. Classes define what functions are available to their objects, as well as the attributes associated with them.

By convention, we capitalize the name of the class to distinguish between the object's blueprint and the object itself.

In [588]:
class Person(object): 
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
    
    def greet(self):
        print('hello, my name is {} {}'.format(self.first_name, self.last_name))

The above code defines a ```Person``` class with two attributes, ```first_name``` and ```last_name```, as well as a ```greet``` method whose output will be different for every person.

In [616]:
asher = Person('Asher', 'Pembroke')

The above line instantiates a new person. Python automatically calls the ```__init__``` method, so our greeting will give the expected result.

In [617]:
asher.greet()

hello, my name is Asher Pembroke


Notice that we called ```greet``` with no arguments and python did not raise an error, even though ```self``` was a named argument in the definition of ```greet```: unlike functions, the first argument to a method refers to the class instance itself and is passed automatically to the body of the function. (We could have used any other name for the first argument, but ```self``` is the most common.)

Like methods, attributes are also accessed through dot notation:

In [591]:
asher.last_name

'Pembroke'

## Inheritance
Types can be reused and extended to cover specific use cases.

In [622]:
class Animal(object): # base class inherits from object in python 2.7. In python 3, "class Animal:" is equivalent
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
        return self.name
    
    def greet(self):
        print('my name is {}'.format(self.name))

Let's create two new animal types for ```Dog``` and ```Cat```, which will inherit the properties of ```Animal``` but change the ```greet``` method.

In [623]:
class Dog(Animal):
    def greet(self):
        print('bark!')

class Cat(Animal):
    def greet(self):
        print('meow!')

In [624]:
dog = Dog('Princess')
dog.greet()

bark!


In [625]:
dog.name

'Princess'

In [626]:
cat = Cat('Penelope')
cat.greet()

meow!
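A subclass can also extend, rather than replace, the behavior of its parent by calling ```super```. The sketch below adds a hypothetical ```Puppy``` class with an extra ```age``` attribute; the two-argument form of ```super``` is used because it works in both python 2.7 and python 3.

```python
class Animal(object):
    def __init__(self, name):
        self.name = name

    def greet(self):
        print('my name is {}'.format(self.name))

class Puppy(Animal):  # hypothetical subclass for illustration
    def __init__(self, name, age):
        super(Puppy, self).__init__(name)  # let Animal store the name
        self.age = age

    def greet(self):
        super(Puppy, self).greet()         # reuse Animal's greeting
        print('I am {} years old'.format(self.age))

puppy = Puppy('Rex', 1)
puppy.greet()
```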


In addition to ```__init__```, there are many other special methods you can use to customize a new class. For instance, ```__str__``` defines how the instance should be converted to its ```str``` representation.

In [631]:
print(cat)

Penelope


For a complete list of these special methods, see: https://docs.python.org/3.4/reference/datamodel.html#basic-customization
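As a further sketch of special methods, the hypothetical ```Vector``` class below defines ```__repr__``` (how an instance is displayed), ```__add__``` (the ```+``` operator) and ```__eq__``` (the ```==``` operator):

```python
class Vector(object):  # hypothetical class for illustration
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Vector({}, {})'.format(self.x, self.y)

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

total = Vector(1, 2) + Vector(3, 4)
print(total)                   # Vector(4, 6)
print(total == Vector(4, 6))   # True
```

Since no ```__str__``` is defined here, ```print``` falls back to ```__repr__```.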

# File Input/Output

The ```with``` statement creates a context that allows us to open, edit, and automatically close a file. This is generally the safest way to write to a file.

In [632]:
with open('test.txt', 'w') as f:
    f.write('testing testing ....\n')
    f.write('\t1...2...3')
    
# File automatically closes

Trying to write outside the file context will result in an exception:

In [633]:
try:
    f.write('3')
except ValueError as e:
    print(e)

I/O operation on closed file.


We can view the file using cat:

In [634]:
!cat test.txt # the ! symbol executes cat on the command line

testing testing ....
	1...2...3

You can even read and write binary data this way, though pandas and numpy are better suited in most cases.

Read more about the ```with``` statement here http://effbot.org/zone/python-with-statement.htm
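The same ```with``` pattern works for reading. A minimal sketch that writes the two lines to a temporary directory (an assumption made here so the example does not overwrite anything) and reads them back:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'test.txt')

with open(path, 'w') as f:
    f.write('testing testing ....\n')
    f.write('\t1...2...3')

with open(path) as f:                           # default mode is 'r' (read text)
    lines = [line.rstrip('\n') for line in f]   # iterate over the file line by line

print(lines)
print(f.closed)  # True: the file was closed when the block ended
```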

### Pandas I/O
Pandas can also be used to read and write tabular data.

In [635]:
df = pd.DataFrame(dict(count = [i**2 for i in range(6)], 
                       fruit = ['apple', 'orange','pear','apple','orange','pear']))
df

|  | count | fruit |
|---|---|---|
| 0 | 0 | apple |
| 1 | 1 | orange |
| 2 | 4 | pear |
| 3 | 9 | apple |
| 4 | 16 | orange |
| 5 | 25 | pear |


In [636]:
df.to_csv('fruit.csv', index_label='id')

In [637]:
!cat fruit.csv #verify contents of file

id,count,fruit
0,0,apple
1,1,orange
2,4,pear
3,9,apple
4,16,orange
5,25,pear


In [638]:
pd.read_csv('fruit.csv', index_col = 'id')

| id | count | fruit |
|---|---|---|
| 0 | 0 | apple |
| 1 | 1 | orange |
| 2 | 4 | pear |
| 3 | 9 | apple |
| 4 | 16 | orange |
| 5 | 25 | pear |


You can write to json using the ```to_json``` method and read it back using ```pd.read_json```.

In [639]:
print(df.to_json())

{"count":{"0":0,"1":1,"2":4,"3":9,"4":16,"5":25},"fruit":{"0":"apple","1":"orange","2":"pear","3":"apple","4":"orange","5":"pear"}}


Provide a filename to write to file:

In [640]:
df.to_json('fruit.json')

In [641]:
!cat fruit.json #verify contents of file

{"count":{"0":0,"1":1,"2":4,"3":9,"4":16,"5":25},"fruit":{"0":"apple","1":"orange","2":"pear","3":"apple","4":"orange","5":"pear"}}

Load the data back into python:

In [642]:
pd.read_json('fruit.json')

|  | count | fruit |
|---|---|---|
| 0 | 0 | apple |
| 1 | 1 | orange |
| 2 | 4 | pear |
| 3 | 9 | apple |
| 4 | 16 | orange |
| 5 | 25 | pear |


# Python scripting

Any python source code can be converted into a command line script by inspecting the global ```__name__``` variable.

In [767]:
def main():
    print('executing main program')

if __name__ == "__main__":
    main()

executing main program


By checking if ```__name__ == "__main__"```, we can tell whether the file is being run from the command line (rather than imported as a module), and we can perform further operations.

## Using Argparse
As of this writing, the most commonly-used argument parser is ```argparse```, which is included in python 2 and 3. The approach is to obtain the command line arguments from ```sys.argv``` and pass them to the ```parse_args``` method of an ```argparse.ArgumentParser```.

This example borrows heavily from https://docs.python.org/2/library/argparse.html#adding-arguments

In [797]:
import argparse, sys

def main(argv=sys.argv[1:]):
    parser = argparse.ArgumentParser(description = "command line test")
    parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
    parser.add_argument('--sum', dest='accumulate', action='store_const',
                        const=sum, default=max,
                        help='sum the integers (default: find the max)')
    args = parser.parse_args(argv)
    print(args.accumulate(args.integers))

The ```main``` function will parse the positional input arguments, storing them in ```args.integers```. An optional ```--sum``` flag will be stored in ```args.accumulate```. 

For ```integers``` we have specified that the input type must be ```int``` and that we may accept a variable number of arguments ```nargs='+'```. 

For the ```--sum``` flag, if the flag is set, the action to be taken is ```action='store_const'```. In this case ```const=sum``` means that the ```sum``` function will be stored. However, if the flag is not set, the default behavior is to store the ```max``` function using ```default=max```.

In [778]:
main([str(i) for i in [3, 4, 5]]) # max(3, 4, 5)

5


In [779]:
main([str(i) for i in [3, 4, 5, '--sum']]) # 3+4+5

12
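To see what ```store_const``` actually stores, you can inspect the parsed namespace directly instead of printing the result. A minimal sketch using the same two arguments (help strings omitted for brevity):

```python
import argparse

parser = argparse.ArgumentParser(description='command line test')
parser.add_argument('integers', metavar='N', type=int, nargs='+')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max)

with_flag = parser.parse_args(['3', '4', '5', '--sum'])
without_flag = parser.parse_args(['3', '4', '5'])

print(with_flag.integers)                 # [3, 4, 5], already converted to int
print(with_flag.accumulate is sum)        # True: the flag stored the sum function
print(without_flag.accumulate is max)     # True: the default was used
```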


Let's write our main method to a ```cli``` subdirectory (which must already exist) as an executable python script.

In [921]:
argparse_cli = '''
import argparse,sys

def main(argv=sys.argv[1:]):
    parser = argparse.ArgumentParser(description = "command line test")
    parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
    parser.add_argument('--sum', dest='accumulate', action='store_const',
                        const=sum, default=max,
                        help='sum the integers (default: find the max)')
    args = parser.parse_args(argv)
    print(args.accumulate(args.integers))
    

if __name__ == '__main__':
    main(sys.argv[1:])
'''

In [922]:
with open('cli/test_argparse_cli.py', 'w') as f:
    f.write(argparse_cli)

Run the script and view the auto-generated help documentation

In [924]:
!python cli/test_argparse_cli.py -h

usage: test_argparse_cli.py [-h] [--sum] N [N ...]

command line test

positional arguments:
  N           an integer for the accumulator

optional arguments:
  -h, --help  show this help message and exit
  --sum       sum the integers (default: find the max)


Test with and without the ```--sum``` flag:

In [925]:
!python cli/test_argparse_cli.py 3 4 5 8 --sum 

20


In [926]:
!python cli/test_argparse_cli.py 3 4 5 8

8


## Click

An alternative to argparse is click. It supports decorator syntax, which allows the specification of command line arguments to happen outside of the main function. It also allows complex applications to be composed of several smaller sub-apps. Let's show how we would write the same example using click.

    pip install click


In [958]:
%%python 
# the above line is just used for testing (not needed outside jupyter)

import click
import sys # this line needed for testing

@click.command()
@click.argument('integers', nargs = -1, type = int)
@click.option('--sum', 'accumulate', default=False, is_flag = True, 
              help='Whether to sum over inputs (default is max)')
def main(integers, accumulate):
    if accumulate:
        print(sum(integers))
    else:
        print(max(integers))
        

if __name__ == '__main__':
    sys.argv = ['', '3', '4', '5'] #for testing
    main() #click automatically reads sys.argv 

5


Write the script to file, removing all testing lines:

In [959]:
with open('cli/test_click_cli.py', 'w') as f:
    f.write('''
import click 

@click.command()
@click.argument('integers', nargs = -1, type = int)
@click.option('--sum', 'accumulate', default=False, is_flag = True, help='Whether to sum over inputs (default is max)')
def main(integers, accumulate):
    if accumulate:
        print(sum(integers))
    else:
        print(max(integers))
        

if __name__ == '__main__':
    main()
''')

In [962]:
!python cli/test_click_cli.py --help

Usage: test_click_cli.py [OPTIONS] [INTEGERS]...

Options:
  --sum   Whether to sum over inputs (default is max)
  --help  Show this message and exit.


In [963]:
!python cli/test_click_cli.py 3 4 5 --sum

12



To find out more on Click, visit [Click](https://click.palletsprojects.com/en/7.x/).

# Executable scripts

In the above examples, we invoke the script using

    python myscript.py <args> <options>

As long as we are running the script in a conda environment, this will work fine. But if you want to run the script as an executable outside an environment, there are two approaches:

## Using #!/path/to/python (not recommended)

The easiest way to make a script executable is to hard-code a path to the appropriate python executable at the top of the script, preceded by ```#!``` (a "shebang" line):

In [972]:
!which python #find out the path to your python interpreter

/Users/apembrok/miniconda2/envs/python101/bin/python


paste it as the first line at the top of your script

In [974]:
#!/Users/apembrok/miniconda2/envs/python101/bin/python

if __name__ == '__main__':
    print('executing script')

executing script


Now write to file and use ```chmod``` to make the script executable.

One reason this is not recommended is that it makes the installation of the script non-portable. Other reasons can be found here:

http://click.palletsprojects.com/en/7.x/setuptools/?highlight=setuptools
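For completeness, the write-and-chmod step can be sketched in python itself. This is only an illustration under a few assumptions: the script is written to a temporary directory, the file name ```hello_script``` is made up, and the interpreter path comes from ```sys.executable``` rather than from ```which python```.

```python
import os
import stat
import sys
import tempfile

# The first line of the script is the hard-coded interpreter path
script = "#!{}\nprint('executing script')\n".format(sys.executable)

path = os.path.join(tempfile.mkdtemp(), 'hello_script')  # hypothetical file name
with open(path, 'w') as f:
    f.write(script)

# Equivalent to running `chmod u+x hello_script` on the command line
mode = os.stat(path).st_mode
os.chmod(path, mode | stat.S_IXUSR)

is_executable = bool(os.stat(path).st_mode & stat.S_IXUSR)
print(path, is_executable)
```

On Linux or macOS the file could then be run directly from a shell by typing its path, but it would stop working as soon as the hard-coded interpreter moved.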

## Using setuptools (recommended)

First we write the program to file as before

In [935]:
with open('cli/test_click_exe.py', 'w') as f:
    f.write('''
import click 

@click.command()
@click.argument('integers', nargs = -1, type = int)
@click.option('--sum', 'accumulate', default=False, is_flag = True, 
    help='Whether to sum over inputs (default is max)')
def main(integers, accumulate):
    if accumulate:
        print(sum(integers))
    else:
        print(max(integers))
''')

Note that we did not require the ```__main__``` check as before. That's because we will specify the ```entry_point``` for the executable in ```setup.py```.

In [948]:
with open('cli/setup.py','w') as f:
    f.write("""
from setuptools import setup

setup(
    name='test_click_exe',
    version='0.1',
    py_modules=['test_click_exe'],
    install_requires=[
        'Click',
    ],
    entry_points='''
        [console_scripts]
        test_click_exe=test_click_exe:main
    ''',
)
""")

In [949]:
!cat cli/setup.py


from setuptools import setup

setup(
    name='test_click_exe',
    version='0.1',
    py_modules=['test_click_exe'],
    install_requires=[
        'Click',
    ],
    entry_points='''
        [console_scripts]
        test_click_exe=test_click_exe:main
    ''',
)


## Test your executable

Now, with your environment activated:

    (python101) cd cli
    (python101) pip install --editable .

This will place the executable in your environment's bin directory.

In [970]:
!which test_click_exe

/Users/apembrok/miniconda2/envs/python101/bin/test_click_exe


In [968]:
!test_click_exe 3 4 5 --sum

12


Now we no longer need to have the environment activated to run it:

    /Users/apembrok/miniconda2/envs/python101/bin/test_click_exe 3 3 3 --sum
    9

This means we can symlink the executable and make it available to other scripts.

In [971]:
!cat /Users/apembrok/miniconda2/envs/python101/bin/test_click_exe

#!/Users/apembrok/miniconda2/envs/python101/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'test-click-exe','console_scripts','test_click_exe'
__requires__ = 'test-click-exe'
import re
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(
        load_entry_point('test-click-exe', 'console_scripts', 'test_click_exe')()
    )
