### Python notebooks

* interactive
* contain code and presentation
* facilitate collaboration
* easy to write and test code
* provide quick results
* easy to display graphs

#### Start coding

In [None]:
"""
    MyNewShinyDataType - A class for demonstration purposes.
    The class has 2 attributes:
    - attribute1 - text (str)
    - attribute2 - numeric (int or float)
    
    The class allows for the update of the numeric attribute.
    - method1 updates attribute2
"""
class MyNewShinyDataType:
    def __init__(self, parameter1 = "default value", parameter2 = 0):
        self.attribute1 = parameter1
        self.attribute2 = parameter2
        
    def __str__(self):
        return f"MyNewShinyDataType object: attribute1 = '{self.attribute1}', attribute2 = {self.attribute2}"
        
    def __repr__(self):
        return f"MyNewShinyDataType('{self.attribute1}',{self.attribute2})"
    
    def method1(self, parameter1 = 0):
        """
        Add parameter value to attribute2.

        Keyword arguments:
        numeric: parameter1 - the number to add (0)
        
        Returns:
        numeric: updated attribute2
        """         
        old_value = self.attribute2
        try:
            self.attribute2 = self.attribute2 + parameter1
        except TypeError: 
            self.attribute2 = self.attribute2 + 2
            print(f"'{parameter1}' is not a numeric value, we added 2 instead")
        finally:
            print(f"Old value was {old_value}, new value is {self.attribute2}")
        return self.attribute2

#### Do more coding

In [None]:
"""
    EnhancedNewShinyDataType - A class for demonstration purposes.
    The class extends the MyNewShinyDataType:
    - method2 - updates attribute1
    - method3 - a len-based update of attribute2
    
"""
class EnhancedNewShinyDataType(MyNewShinyDataType):
    
    def method2(self, parameter1 = ""):
        """
        Add parameter text to attribute1.

        Keyword arguments:
        str: parameter1 - the string to add ("")
        
        Returns:
        str: updated attribute1
        """        
        old_value = self.attribute1
        try:
            self.attribute1 = self.attribute1 + " " + parameter1
            index = self.attribute1.index("test")
        except TypeError: 
            self.attribute1 = self.attribute1 + " " + str(parameter1)
            print(f"'{parameter1}' is not a string, we made the conversion and added it")
        except ValueError: 
            # report first, while attribute1 still lacks 'test', then append it
            print(f"'{self.attribute1}' does not contain 'test', we added 'test' to it")
            self.attribute1 = self.attribute1 + " test"

        finally:
            print(f"Old value was '{old_value}', new value is '{self.attribute1}'")
        return self.attribute1
    

    def method3(self, parameter1 = ""):
        """
        Add parameter length to attribute2.
        """
        pass # implement this method



### From exploration work to production

### Python scripts

In [None]:
!touch test.py
!echo '#!/usr/bin/env python' > test.py
!echo 'print("This is a python script")' >> test.py
!chmod u+x test.py

In [None]:
import test

In [None]:
!python test.py

In [None]:
!./test.py

#### Adding a function

In [None]:
def test_function():
    print("This is a function in a python script")

In [None]:
test_function()

In [None]:
import test as t

In [None]:
dir(t)

In [None]:
t.test_function()

In [None]:
# add a test variable, restart kernel, import
import test as t
dir(t)

In [None]:
t.test_variable

### `__main__` - Top-level script environment

`'__main__'` is the name of the scope in which top-level code executes. A module's `__name__` is set equal to `'__main__'` when read from standard input, a script, or from an interactive prompt.

A module can discover whether or not it is running in the main scope by checking its own `__name__`, which allows a common idiom for conditionally executing code in a module when it is run as a script or with `python -m` but not when it is imported.

```python
if __name__ == "__main__":
    # execute only if run as a script
    main() # function that contains the code to execute
```

https://docs.python.org/3/library/__main__.html
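
A minimal sketch of the idiom, assuming a hypothetical `main()` that returns its message so the behaviour is easy to check. The `math` import is only there to show that an imported module reports its own name rather than `'__main__'`.

```python
import math


def main():
    """Hypothetical entry point; returns the message it prints."""
    message = "running as a script"
    print(message)
    return message


# An imported module carries its own name, never "__main__".
print(math.__name__)

# __name__ is "__main__" only when this file is executed directly,
# so importing the file skips the call below.
if __name__ == "__main__":
    main()
```

Saving this in a file and importing it would define `main` without calling it; running the file with `python` would call it.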

In [1]:
list.__name__

'list'

In [None]:
def main():
    test_variable = 10
    print(f'The test variable value is {test_variable}')

In [None]:
main()

In [None]:
# add main, restart kernel, import
import test as t

In [None]:
dir(t)

In [None]:
!python test.py

#### `sys.argv`

The list of command line arguments passed to a Python script. argv[0] is the script name (it is operating system dependent whether this is a full pathname or not). <br>
If the command was executed using the -c command line option to the interpreter, argv[0] is set to the string '-c'. <br>
If no script name was passed to the Python interpreter, argv[0] is the empty string.

The Python `sys` module provides access to the command-line arguments through the `sys.argv` list.

Every element of `sys.argv` is a string.<br>
`len(sys.argv)` is the number of command-line arguments, including the script name in `argv[0]`.

Add to the script

```python
import sys

print('Number of arguments:', len(sys.argv))
print('Argument List:', str(sys.argv))
```
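
The layout of `sys.argv` can be explored without a real command line. The sketch below uses a hypothetical helper, `describe_arguments`, and a hand-written list standing in for `sys.argv`; in a script you would pass `sys.argv` itself.

```python
def describe_arguments(argv):
    """Split an argv-style list into the script name and the remaining arguments."""
    script_name = argv[0] if argv else ""
    arguments = argv[1:]
    return script_name, arguments


# Hypothetical stand-in for the command line: ./test.py [1,2,4] message 1
example_argv = ["./test.py", "[1,2,4]", "message", "1"]

script_name, arguments = describe_arguments(example_argv)
print("Script:", script_name)
print("Number of arguments:", len(example_argv))
print("Arguments after the script name:", arguments)

# Every entry is a string, so numeric values need an explicit conversion.
number = int(arguments[2])
print(number + 1)
```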

In [7]:
!./test.py

The test variable value is 10
Number of arguments: 1
Argument List: ['./test.py']
[1 2 3]


#### Give some arguments

In [19]:
!./test.py [1,2,4] message 1

The test variable value is 10
Number of arguments: 4
Argument List: ['./test.py', '[1,2,4]', 'message', '1']
[1 2 4]
(3,)
array shape: (3,)


```import numpy as np```

In [20]:
import numpy as np
np.array("[1, 2 , 3]".strip('][').split(','), dtype = int)

array([1, 2, 3])

#### Argument parsing

`import getopt`
    
`opts, args = getopt.getopt(argv, 'a:b:', ['foperand=', 'soperand='])`

The signature of the getopt() method looks like:

`getopt.getopt(args, shortopts, longopts=[])`

* `args` is the list of arguments taken from the command-line.
* `shortopts` is a string of option letters. A letter followed by a colon (for example `a:`) takes a value, so the script is called with `-a <value>`. Any number of options can be listed. On the command line each short option is prefixed with `-`.
* `longopts` is a list of long option names, written without the leading `--`. A name followed by `=` (for example `foperand=`) takes a value. On the command line each long option is prefixed with `--`.
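
A small sketch of how these three parameters interact, using a hand-written argument list in place of `sys.argv[1:]`. The option names (`-l`/`--list`, `-s`/`--string`, `-n`/`--number`) and the `values` dictionary are assumptions chosen for illustration.

```python
import getopt

# Hypothetical argument list, standing in for sys.argv[1:].
example_args = ["-l", "[1,2,4]", "--string", "message", "-n", "3", "extra"]

# "l:s:n:" means -l, -s and -n each expect a value.
# "list=", "string=", "number=" mean the long forms expect a value too.
opts, remaining = getopt.getopt(example_args, "l:s:n:", ["list=", "string=", "number="])
print(opts)       # (option, value) pairs in command-line order
print(remaining)  # whatever is left after the options

# Match on the option name rather than on its position in opts.
values = {}
for option, value in opts:
    if option in ("-l", "--list"):
        values["list"] = [int(item) for item in value.strip("[]").split(",")]
    elif option in ("-s", "--string"):
        values["string"] = value
    elif option in ("-n", "--number"):
        values["number"] = int(value)
print(values)

# An option that was not declared raises getopt.GetoptError.
try:
    getopt.getopt(["-x"], "l:s:n:")
except getopt.GetoptError as error:
    error_message = str(error)
    print(error_message)
```

Looking options up by name keeps the script working when the user supplies them in a different order.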

https://www.datacamp.com/community/tutorials/argument-parsing-in-python
https://docs.python.org/2/library/getopt.html
https://www.tutorialspoint.com/python/python_command_line_arguments.htm

```python
    try:
        # Define the getopt parameters; a trailing ':' or '=' means the option takes a value
        opts, args = getopt.getopt(sys.argv[1:], 'l:s:n:', ['list=', 'string=', 'number='])
        print(len(opts))
        if len(opts) != 3:
            print('usage: test.py -l <list_operand> -s <string_operand> -n <number_operand>')
        else:
            print(opts)
            # opts keeps the command-line order, so this assumes -l, -s, -n are given in that order
            test_array = np.array(opts[0][1].strip('][').split(','), dtype = int)
            string_text = opts[1][1]
            number_text = int(opts[2][1])
            test_array = test_array * number_text 
            print(f'Info {string_text}, for updated list {test_array}')
    except getopt.GetoptError:
        print('usage: test.py -l <list_operand> -s <string_operand> -n <number_operand>')
```

In [30]:
!./test.py -l [1,2,4] -s message

The test variable value is 10
Number of arguments: 5
Argument List: ['./test.py', '-l', '[1,2,4]', '-s', 'message']
[1 2 4]
(3,)
array shape: (3,)
2
usage: test.py -l <list_operand> -s <string_operand> -n <number_operand>


#### `argparse` - increased readability
`import argparse`

`class argparse.ArgumentParser(prog=None, usage=None, description=None, epilog=None, parents=[], formatter_class=argparse.HelpFormatter, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True, allow_abbrev=True)`<br>
https://docs.python.org/3/library/argparse.html#argumentparser-objects

Argument definition<br>
`ArgumentParser.add_argument(name or flags...[, action][, nargs][, const][, default][, type][, choices][, required][, help][, metavar][, dest])`<br>
https://docs.python.org/3/library/argparse.html#the-add-argument-method

`ap.add_argument("-i", "--ioperand", required=True, help="important operand")`

* `-i` - short (single-letter) version of the argument
* `--ioperand` - long version of the argument
* `required` - whether the argument must be supplied
* `help` - meaningful description shown in the help message
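
The sketch below shows these parameters together with `type`, which lets `argparse` convert each value as it parses. The helper `parse_int_list` and the description text are assumptions for illustration; the list passed to `parse_args()` stands in for the real command line.

```python
import argparse


def parse_int_list(text):
    """Hypothetical helper: turn a string such as "[1,2,4]" into [1, 2, 4]."""
    return [int(item) for item in text.strip("[]").split(",")]


parser = argparse.ArgumentParser(description="Scale a list of integers.")
parser.add_argument("-l", "--list_operand", required=True, type=parse_int_list,
                    help="list operand, e.g. [1,2,4]")
parser.add_argument("-s", "--string_operand", required=True, help="string operand")
parser.add_argument("-n", "--number_operand", required=True, type=int,
                    help="number operand")

# Passing a list to parse_args() avoids reading the real command line.
args = parser.parse_args(["-l", "[1,2,4]", "--string_operand", "message", "-n", "3"])

scaled = [value * args.number_operand for value in args.list_operand]
print(f"Info {args.string_operand}, for updated list {scaled}")
```

With `type` set, the values arrive already converted, so no `int(...)` call is needed after parsing.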

https://www.datacamp.com/community/tutorials/argument-parsing-in-python
https://docs.python.org/3/library/argparse.html
https://realpython.com/command-line-interfaces-python-argparse/

```python
    ap = argparse.ArgumentParser()

    # Add the arguments to the parser
    ap.add_argument("-l", "--list_operand", required=True, help="list operand")
    ap.add_argument("-s", "--string_operand", required=True, help="string operand")
    ap.add_argument("-n", "--number_operand", required=True, help="number operand")

    args = vars(ap.parse_args())
    print(args)
    test_array = np.array(args['list_operand'].strip('][').split(','), dtype = int)
    string_text = args['string_operand']
    number_text = int(args['number_operand'])
    test_array = test_array * number_text 

    print(f'With argparse. Info {string_text}, for updated list {test_array}')
```


In [46]:
!./test.py -h

usage: test.py [-h] -l LIST_OPERAND -s STRING_OPERAND -n NUMBER_OPERAND

optional arguments:
  -h, --help            show this help message and exit
  -l LIST_OPERAND, --list_operand LIST_OPERAND
                        list operand
  -s STRING_OPERAND, --string_operand STRING_OPERAND
                        string operand
  -n NUMBER_OPERAND, --number_operand NUMBER_OPERAND
                        number operand


In [44]:
!./test.py -l [1,2,4] --string_operand message -n 3

{'list_operand': '[1,2,4]', 'string_operand': 'message', 'number_operand': '3'}
With argparse. Info message, for updated list [ 3  6 12]


##### `action` parameter - count example
https://docs.python.org/3/library/argparse.html#action

'count' - This counts the number of times a keyword argument occurs. For example, this is useful for increasing verbosity levels:

`ap.add_argument("-v", "--verbose", action='count', default=0)`
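
A short, self-contained sketch of the `count` action. The argument lists are made up for illustration and are passed directly to `parse_args()` instead of coming from the command line.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0)

quiet = parser.parse_args([])                    # flag absent: the default is used
chatty = parser.parse_args(["-vvv"])             # three occurrences in one token
mixed = parser.parse_args(["-v", "--verbose"])   # short and long forms both count

print(quiet.verbose, chatty.verbose, mixed.verbose)
```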


In [48]:
!./test.py -l [1,2,4] --string_operand message -n 3 -vvv

{'list_operand': '[1,2,4]', 'string_operand': 'message', 'number_operand': '3', 'verbose': 3}
With argparse. Info message, for updated list [ 3  6 12]


### Modules

https://docs.python.org/3/tutorial/modules.html
https://www.python.org/dev/peps/pep-0008/#package-and-module-names

If you want to write a somewhat longer program, you are better off <b>using a text editor to prepare the input for the interpreter and running it with that file as input instead. This is known as creating a script.</b> 
    
As your program gets longer, you may want to split it into several files for easier maintenance. You may also want to use a handy function that you’ve written in several programs without copying its definition into each program.

A module is a file containing Python definitions and statements. <b>The file name is the module name with the suffix .py appended</b>. Within a module, the module’s name (as a string) is available as the value of the global variable `__name__`.
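
To make the link between file name and module name concrete, the sketch below writes a tiny module to a temporary directory and imports it. The module name `demo_shiny_module` and its contents are assumptions for illustration; in practice you would create the file in a text editor.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents.
module_source = '''
GREETING = "hello"

def shout(text):
    """Return the text in upper case."""
    return text.upper()

# Inside a module, __name__ holds the module's own name.
module_name = __name__
'''

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "demo_shiny_module.py").write_text(module_source)
    sys.path.insert(0, folder)       # make the folder importable
    importlib.invalidate_caches()    # the file was created after the interpreter started
    try:
        demo = importlib.import_module("demo_shiny_module")
    finally:
        sys.path.remove(folder)

print(demo.__name__)               # the file name without the .py suffix
print(demo.module_name)            # the same value, as seen from inside the module
print(demo.shout(demo.GREETING))
```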

In [5]:
#Let's create a module for our classes
!touch base_shiny_type.py

In [16]:
import base_shiny_type as bst

In [17]:
bst.MyNewShinyDataType()

MyNewShinyDataType('default value',0)

In [18]:
!touch enhanced_shiny_type.py

In [19]:
import enhanced_shiny_type as est

In [20]:
est.EnhancedNewShinyDataType()

MyNewShinyDataType('default value',0)

### Packages

https://docs.python.org/3/tutorial/modules.html#packages

<b>Packages are a way of structuring</b> Python’s module namespace by using “dotted module names”. <b>For example, the module name A.B designates a submodule named B in a package named A</b>. Just like the use of modules saves the authors of different modules from having to worry about each other’s global variable names, the use of dotted module names saves the authors of multi-module packages like NumPy from having to worry about each other’s module names.
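
The dotted naming can be seen with a throwaway package. This sketch builds a hypothetical package `demo_pkg` with one submodule, `shapes`, in a temporary directory; both names and the `area` function are assumptions for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    package_dir = Path(folder, "demo_pkg")          # hypothetical package name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")    # marks the folder as a package
    (package_dir / "shapes.py").write_text(
        "def area(width, height):\n    return width * height\n"
    )

    sys.path.insert(0, folder)       # make the parent folder importable
    importlib.invalidate_caches()
    try:
        shapes = importlib.import_module("demo_pkg.shapes")
    finally:
        sys.path.remove(folder)

print(shapes.__name__)      # dotted name: package.submodule
print(shapes.__package__)   # the package the submodule belongs to
print(shapes.area(3, 4))
```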

In [10]:
!mkdir demoCM

mkdir: demoCM: File exists


In [11]:
!cp test.py demoCM
!cp base_shiny_type.py demoCM
!cp enhanced_shiny_type.py demoCM

In [12]:
!touch demoCM/__init__.py

In [13]:
from demoCM import test as tt

In [14]:
tt.test_function()

This is a function in a python script


In [16]:
from demoCM import base_shiny_type as bst1

In [17]:
bst1.MyNewShinyDataType()

MyNewShinyDataType('default value',0)

In [36]:
dir(bst)

['MyNewShinyDataType',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [18]:
from demoCM import enhanced_shiny_type as est1

In [4]:
# restart kernel

#dir()

In [2]:
from demoCM import *

In [5]:
#dir()

In [7]:
base_shiny_type

<module 'demoCM.base_shiny_type' from '/Users/mitrea/Documents/CLASSES/BIOINF 575 FA 2019/demoCM/base_shiny_type.py'>

In [8]:
base_shiny_type.MyNewShinyDataType()

MyNewShinyDataType('default value',0)

https://towardsdatascience.com/5-advanced-features-of-python-and-how-to-use-them-73bffa373c84

#### A <b>`lambda` function</b> is a small, anonymous function - it has no name

https://docs.python.org/3/reference/expressions.html#lambda<br>
https://www.geeksforgeeks.org/python-lambda-anonymous-functions-filter-map-reduce/<br>
https://realpython.com/python-lambda/<br>

`lambda arguments : expression`

A lambda function can take <b>any number of arguments</b>, but must always have <b>only one expression</b>.
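
A brief sketch of both points, with made-up data: a lambda taking two arguments, and a lambda used inline as a sort key. The named function `add_named` is included only for comparison.

```python
# Several arguments, one expression.
add = lambda x, y: x + y
print(add(2, 3))

# Lambdas are often passed directly as a short key function.
pairs = [("ACE", 21), ("BAC", 7), ("AML", 5)]
by_value = sorted(pairs, key=lambda pair: pair[1])
print(by_value)


# The equivalent named function.
def add_named(x, y):
    return x + y


print(add_named(2, 3) == add(2, 3))
```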

In [13]:
nameless_function = lambda x: x**3

In [15]:
nameless_function(4)

64

In [24]:
import numpy as np
import pandas as pd
test_series = pd.Series([1,2,3,4])
test_series

0    1
1    2
2    3
3    4
dtype: int64

In [None]:
test_series.apply(lambda x: x**3)

In [25]:
test_series.apply(lambda x: x % 2 == 0)

0    False
1     True
2    False
3     True
dtype: bool

In [26]:
test_df = pd.DataFrame([[1,2,3,4],[5,6,7,8]])
test_df

   0  1  2  3
0  1  2  3  4
1  5  6  7  8


In [None]:
Compute, for each column, the square of the value in row 0 multiplied by the value in row 1.

In [31]:
test_df.apply(lambda x:x[0]**2*x[1], axis = 0)

0      5
1     24
2     63
3    128
dtype: int64

#### Useful functions
https://docs.python.org/3/library/functions.html

`zip` - make an iterator that aggregates elements from each of the iterables.
https://docs.python.org/3/library/functions.html#zip


`zip(*iterables)`

Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The iterator stops when the shortest input iterable is exhausted. With a single iterable argument, it returns an iterator of 1-tuples. With no arguments, it returns an empty iterator.
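
The edge cases in this description can be checked directly. The sketch below uses made-up lists; `itertools.zip_longest` is shown as a standard-library alternative for when truncation is not wanted.

```python
from itertools import zip_longest

short = [1, 2]
long = ["A", "B", "C"]

truncated = list(zip(short, long))        # stops at the shortest input
singles = list(zip(long))                 # one iterable gives 1-tuples
empty = list(zip())                       # no arguments gives an empty iterator
padded = list(zip_longest(short, long))   # pads the shorter input with None

print(truncated)
print(singles)
print(empty)
print(padded)
```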

In [47]:
combined_res = zip([1,2,3],["A","B","C"],[True,False,True])
combined_res

<zip at 0x117440b08>

In [48]:
list(combined_res)

[(1, 'A', True), (2, 'B', False), (3, 'C', True)]

In [49]:
dict(zip([1,2,3],["A","B","C"]))

{1: 'A', 2: 'B', 3: 'C'}

In [None]:
#try unequal sizes



In [55]:
# unzip list
x, y = zip(*zip([1,2,3],[4,5,6]))
print(x,y)
x, y = zip(*[(1,4),(2,5),(3,6)])
print(x,y)

(1, 2, 3) (4, 5, 6)
(1, 2, 3) (4, 5, 6)


`map` - apply function to every element of an iterable
https://docs.python.org/3/library/functions.html#map


`map(function, iterable, ...)`

Return an iterator that applies function to every item of iterable, yielding the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted.
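
A small sketch of these behaviours with made-up inputs: a one-argument function, a two-argument function over iterables of unequal length, and the lazy, single-pass nature of the returned iterator.

```python
# One iterable, one-argument function.
lengths = list(map(len, ["spear", "print", "practice"]))
print(lengths)

# Two iterables, two-argument function; stops when the shorter list runs out.
powers = list(map(pow, [2, 3, 4], [3, 2]))
print(powers)

# map is lazy: values are produced on demand and only once.
squares = map(lambda x: x * x, [1, 2, 3])
first = next(squares)
rest = list(squares)
print(first, rest)
```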

In [32]:
map(abs,[-2,3,-5,6,-7])

<map at 0x117186b00>

In [33]:
for i in map(abs,[-2,3,-5,6,-7]):
    print(i)

2
3
5
6
7


https://www.geeksforgeeks.org/python-map-function/

In [41]:
numbers1 = [1, 2, 3] 
numbers2 = [4, 5, 6] 
  
result = map(lambda x, y: x + y, numbers1, numbers2) 
list(result)

[5, 7, 9]

Use a lambda function and the map function to compute a result from the following 3 lists.<br>
If the element in the third list is divisible by 3, return the sum of the elements from the first two lists; otherwise return the difference.

In [43]:
numbers1 = [1, 2, 3, 4, 5, 6] 
numbers2 = [7, 8, 9, 10, 11, 12] 
numbers3 = [13, 14, 15, 16, 17, 18] 




`filter` - keep the elements of an iterable for which a function returns true
https://docs.python.org/3/library/functions.html#filter

`filter(function, iterable)`

Construct an iterator from those elements of iterable for which function returns true. iterable may be either a sequence, a container which supports iteration, or an iterator. If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
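
A quick sketch of the `None` case and of the equivalent list comprehension, using made-up values.

```python
mixed = [0, 1, "", "text", None, [], [0]]

# With None as the function, falsy elements are dropped.
truthy = list(filter(None, mixed))
print(truthy)

# A predicate keeps only the elements for which it returns true.
evens = list(filter(lambda x: x % 2 == 0, range(10)))

# The same selection written as a list comprehension.
evens_comprehension = [x for x in range(10) if x % 2 == 0]
print(evens, evens == evens_comprehension)
```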

In [58]:
test_list = [3,4,5,6,7]
result = filter(lambda x: x>4, test_list)
result

<filter at 0x11728f048>

In [59]:
list(result)

[5, 6, 7]

In [60]:
# Find all anagrams of a word in a list of strings.
from collections import Counter

word_list = ["spear", "print", "spare", "practice", "parse"]
word = "pears"

# Use an anonymous function to keep the anagrams of word:
# two words are anagrams when their letter counts are equal.
# https://www.geeksforgeeks.org/anagram-checking-python-collections-counter/
result = list(filter(lambda x: Counter(word) == Counter(x), word_list))
  
# printing the result 
print(result)

['spear', 'spare', 'parse']


Return all the elements with a value divisible by 7 and a key that starts with A in the following dictionary.

In [65]:
d = {"ACE": 21, "BAC":7, "AML":5, "ABL":14, "MAP":3}



{'ACE': 21, 'ABL': 14}

`reduce` - cumulatively apply a function of two arguments to the items of an iterable, reducing it to a single value
https://docs.python.org/3/library/functools.html#functools.reduce

`functools.reduce(function, iterable[, initializer])`

<b>Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value</b>. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned.
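
The role of the initializer is easiest to see side by side. The sketch below uses made-up inputs; the `raised` flag is just a way to record that an empty iterable without an initializer raises `TypeError`.

```python
from functools import reduce

add = lambda x, y: x + y

total = reduce(add, [1, 2, 3, 4, 5])            # ((((1+2)+3)+4)+5)
with_start = reduce(add, [1, 2, 3, 4, 5], 100)  # the initializer goes first
empty_default = reduce(add, [], 0)              # the initializer is the default
single = reduce(add, [42])                      # one item: returned unchanged

# Without an initializer, an empty iterable is an error.
raised = False
try:
    reduce(add, [])
except TypeError:
    raised = True

print(total, with_start, empty_default, single, raised)
```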

In [66]:
from functools import reduce

In [None]:
reduce(lambda x,y: x+y, [47,11,42,13])

<img src = https://www.python-course.eu/images/reduce_diagram.png width=300/>

https://www.python-course.eu/lambda.php

https://www.geeksforgeeks.org/reduce-in-python/
https://www.tutorialsteacher.com/python/python-reduce-function

In [67]:
test_list = [1,2,3,4,5,6]

In [72]:
# compute factorial of n
n=5
reduce(lambda x,y: x*y, range(1,n+1))

120

In [75]:
#intersection of multiple lists
#https://stackoverflow.com/questions/15995/useful-code-which-uses-reduce
    
test_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

result = reduce(set.intersection, map(set, test_list))
result

{3, 4, 5}