# Python session - 2.2

## Functions and modules

## Functions

Functions are reusable blocks of code that you can name and execute any number of times from different parts of your script(s). Executing a function is known as "calling" it. Functions are important building blocks of software.

Python provides a number of built-in functions, which can be called anywhere (and any number of times) in your program. You have already been using some of them, for example `len()`, `range()`, `sorted()`, `max()`, `min()` and `sum()`.

#### Structure of writing a function:

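- `def` (keyword) + function name (you choose) + `()` + `:`.
- A new line indented by 4 spaces or a tab + the block of code. Note: code at indentation level 0 (not inside any function) is executed as soon as the script runs, whereas the function body runs only when the function is called.
- Call your function using its name followed by `()`.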
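The steps above can be sketched as follows. This is only a minimal illustration; the function name `say_hello` and its message are made up for the example.

```python
# Define the function: 'def' + name + '()' + ':'
def say_hello():
    # The indented lines form the body; they run only when the function is called
    print("Hello from inside the function")

# Code at indentation level 0 runs as soon as the script reaches it
say_hello()
say_hello()  # a function can be called any number of times
```

Because `say_hello` contains no `return` statement, calling it gives back `None`.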

In [None]:
## Non parametric function
# Define a function that prints a sum of number1 and number2 defined inside the function

get_sum()

In [23]:
# Parametric function
# Define a function that prints a sum of number1 and number2 provided by the user
# Hint: get_sum_param(number1, number2)


In [26]:
# Returning values
# Define a function that 'returns' a sum of number1 and number2 provided by the user
# Hint: print(get_sum_param(number1, number2))
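One possible way to write a function that takes parameters and returns a value is sketched below. The name `get_sum_param` comes from the hints above; the numbers 2 and 3 are arbitrary example inputs.

```python
def get_sum_param(number1, number2):
    """Return the sum of the two arguments."""
    # 'return' hands the result back to the caller instead of printing it
    return number1 + number2

# The returned value can be printed, stored, or used in further calculations
print(get_sum_param(2, 3))
```

A function that only prints its result cannot pass that result on to other code, which is why returning values is usually preferred.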


In [32]:
# Local Vs. global variable

# Define a function that returns a sum of number1 and number2 to a variable
# and print it after calling the function
# Hint: returned_value = get_sum_param(number1, number2)
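The difference between local and global variables can be illustrated with a short sketch. The variable name `total` and the example values are assumptions made for this illustration.

```python
total = 100  # global variable, defined at indentation level 0

def get_sum_param(number1, number2):
    # This 'total' is a local variable; it exists only inside the function
    total = number1 + number2
    return total

returned_value = get_sum_param(2, 3)
print(returned_value)  # 5
print(total)           # 100: the global variable is unchanged
```

Assigning to `total` inside the function creates a new local name, so the global `total` keeps its original value.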


### Exercise: rewrite older code as a function

In [30]:
# Optional exercise
# Let’s take one of our older pieces of code and rewrite it as a function


### Using Modules

One of the great things about Python is the free availability of a _huge_ number of modules that can be imported into your code and used. Modules are developed with the aim of solving some particular problem or providing particular, often domain-specific, capabilities.

Just as functions are reusable parts of a program, packages (also known as libraries) are reusable collections of several modules.

In order to import a module, it must first be installed and available on your system. We will cover this briefly later in the course.  

A large number of modules are already available for import in the standard distribution of Python: this is known as the standard library. If you installed the Anaconda distribution of Python, you have even more modules already installed - mostly aimed at data science.

Importing a module is easy:

- `import` (keyword) + module name, for example:
    - `import os`           # contains functions for interacting with the operating system
    - `import sys`          # contains utilities to process command line arguments

More at: https://pypi.python.org/pypi

In [34]:
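import os
print(os.getcwd())                           # current working directory
os.makedirs("new_dir_name", exist_ok=True)   # create a directory; no error if it already exists
help(os)                    # manual page created from the module's docstrings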

In [37]:
import sys
print(sys.argv)

### Using loops to iterate through files in a directory

In [46]:
# define a function that lists all the files in the folder called demo_folder

import os

def read_each_filename(pathname):
    ...
pathname = 'demo_folder' # name of path with multiple files
read_each_filename(pathname)

In [44]:
# define a function that reads and prints each line of each file in the folder called demo_folder

import os

def read_each_line_of_each_file(pathname): # name of path with multiple files
    ...
pathname = 'demo_folder' # name of path with multiple files
read_each_line_of_each_file(pathname)

# Hints:
# Options for opening files
# option-1: with open("{}/{}".format(pathname, filename)) as in_fh:
# option-2: with open('%s/%s' % (pathname, filename)) as in_fh:
# option-3: with open(pathname + '/' + filename) as in_fh:
# option-4: with open(os.path.join(pathname, filename)) as in_fh:
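A possible implementation using option-4 (`os.path.join`) is sketched below. To keep the example runnable anywhere, it creates a temporary folder with two small files instead of relying on `demo_folder`; the file names `a.txt` and `b.txt`, their contents, and the returned list of filenames are assumptions added for illustration.

```python
import os
import tempfile

def read_each_line_of_each_file(pathname):
    """Print every line of every regular file in pathname; return the filenames read."""
    filenames_read = []
    for filename in sorted(os.listdir(pathname)):
        full_path = os.path.join(pathname, filename)
        if not os.path.isfile(full_path):
            continue  # skip sub-directories
        with open(full_path) as in_fh:
            for line in in_fh:
                print(filename, line.rstrip("\n"))
        filenames_read.append(filename)
    return filenames_read

# Build a throwaway folder with two small files so the sketch runs anywhere
with tempfile.TemporaryDirectory() as demo_dir:
    for name, text in [("a.txt", "first\nsecond\n"), ("b.txt", "third\n")]:
        with open(os.path.join(demo_dir, name), "w") as out_fh:
            out_fh.write(text)
    files_seen = read_each_line_of_each_file(demo_dir)
```

`os.path.join` inserts the correct path separator for the operating system, which makes it more portable than joining strings with `'/'` by hand.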

In [41]:
# Exercise: Go through each filename in the directory 'demo_folder' and
# open only those files whose names end with '.csv' or '.fasta'.
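One way to select files by extension is `str.endswith()`, which also accepts a tuple of suffixes. The sketch below only filters a list of names; the helper `has_wanted_extension` and the names `notes.txt` and `data.csv.bak` are hypothetical.

```python
def has_wanted_extension(filename, extensions=(".csv", ".fasta")):
    """Return True if filename ends with one of the given extensions."""
    # str.endswith accepts a tuple and returns True if any suffix matches
    return filename.endswith(extensions)

example_names = ["human_kinase.csv", "fasta_human_kinase.fasta", "notes.txt", "data.csv.bak"]
selected = [name for name in example_names if has_wanted_extension(name)]
print(selected)
```

In the exercise, the same check could be applied to each name returned by `os.listdir('demo_folder')` before opening the file.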

In [47]:
# Optional exercises (we will cover these in session 3)

# 1. Extract the length of fasta sequence for kinases from the file 'fasta_human_kinase.fasta'
# 2. Extract the UniProt for kinases from the file 'human_kinase.csv'

#### Examples of importing basic modules.

In [6]:
# import numpy
# array_one = numpy.array([1,2,3,4,5,6])
# print(array_one)

In [7]:
# import pandas
# data = pandas.read_table()
# data.plot()

#### Aside: Namespaces
Python uses namespaces a lot, to ensure appropriate separation of functions, attributes, methods etc. between modules and objects. When you import an entire module, the functions and classes available within that module are loaded under the module's namespace - `pandas` in the example above.  
It is possible to customise the namespace at the point of import, allowing you to e.g. shorten/abbreviate the module name to save some typing:

In [9]:
# import numpy as np
# array_two = np.array([10, 11, 12, 13, 14])
# print(array_two)

In [10]:
# import pandas as pd
# data = pd.read_table()
# data.plot()

If you need only a single function from a module, such as `array` or `read_table` in the examples above, you can import it directly into your main namespace, where you don't need to specify the module before the name of the function:

In [12]:
# from numpy import array
# array_three = array([1, 1, 2, 3, 5, 8])
# print(array_three)

In [14]:
# from pandas import read_table
# data = read_table()
# data.plot()
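The three import styles can be compared side by side using the standard library module `math`, chosen here only because it needs no installation. The alias `m` is an arbitrary choice for the illustration.

```python
import math                 # whole module, used as math.<name>
import math as m            # same module under a shorter alias
from math import sqrt       # a single function placed in the main namespace

print(math.sqrt(16))   # 4.0
print(m.sqrt(16))      # 4.0
print(sqrt(16))        # 4.0
print(m is math)       # True: the alias refers to the same module object
```

All three forms call the same function; they differ only in the name under which it is reachable.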

#### Conventions
- You should perform all of your imports at the beginning of your program. This ensures that
  - users can easily identify the dependencies of a program, and
  - any missing dependencies (causing fatal `ImportError` exceptions) are caught early in execution.
- Shortening `numpy` to `np` and `pandas` to `pd` is very common, and there are other such abbreviations too - watch out for them when e.g. reading docs, guides or Stack Overflow answers online.

### Exercises - Importing

In [16]:
# --- numpy
# series_a = numpy.array([5, 5, 5, 5, 5])
# series_b = ---.array([1, 2, 3, 4, 5])
# series_c = series_a - series_b
# print(series_c)

In [17]:
# import pandas --- pd
# data = pd.read_table()

#### Aside: Your Own Modules
Whenever you write some Python code and save it as a script with the `.py` file extension, you are creating your own module. If you define functions within that module, you can import them into other scripts and sessions.
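As a rough sketch of how this works, the example below writes a small module file and then imports it. The module name `my_tools`, its content, and the use of a temporary directory are assumptions for illustration; normally you would simply save the `.py` file next to the script that imports it.

```python
import importlib
import os
import sys
import tempfile

# Source code of a tiny, hypothetical module
module_source = '''
def get_sum_param(number1, number2):
    """Return the sum of the two arguments."""
    return number1 + number2
'''

# Save the code as my_tools.py in a temporary directory
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my_tools.py"), "w") as out_fh:
    out_fh.write(module_source)

# Tell Python where to look for the new module, then import it
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import my_tools

print(my_tools.get_sum_param(2, 3))
```

Python finds modules by searching the directories listed in `sys.path`, which includes the directory of the running script by default.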

### Some Interesting Modules and Libraries to Investigate
- os
- sys
- shutil
- random
- collections
- math
- argparse
- time
- datetime
- numpy
- scipy
- matplotlib
- pandas
- scikit-learn
- requests
- biopython
- openpyxl