# A brief, basic introduction to Python for scientific computing - Chapter 4
## Functions

Functions are an important part of any program.  Some programming languages make a distinction between "functions" that return values and "subroutines" that do not return anything but rather do something.  In Python, there is only one kind, functions, but these can return single, multiple, or no values at all.  In addition, like everything else, functions in Python are objects.  That means that they can be included in lists, tuples, or dictionaries, or even sent to other functions.  This makes Python extraordinarily flexible.

To make a function, use the `def` statement:

In [1]:
def add(arg1, arg2):
   x = arg1 + arg2
   return x

Here, `def` signals the creation of a new function named `add`, which takes two arguments.  All of the commands associated with this function are then indented underneath the `def` statement, similar to the syntactic indentation used in loops.  The `return` statement tells Python to do two things: exit the function and, if a value is provided, use that as the `return` value -- the value given back to the user or to the code which calls the function.

Unlike in statically typed programming languages, functions in Python do not need to specify the types of the arguments sent to them.  Python evaluates these at runtime every time the function is called.  Using the above example, we could apply our function to many different types:

In [2]:
add(1, 2)

3

In [3]:
add("house", "boat")

'houseboat'

In [4]:
add([1, 2, 3], [4, 5, 6])

[1, 2, 3, 4, 5, 6]

In [5]:
add(1, "house")

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In the last example, an error occurs because the addition operator is not defined between an integer and a string.  This error is raised only when we call the function with inappropriate arguments, not when the function is defined.

The `return` statement can appear anywhere within a function:

In [6]:
def power(x, y):
   if x <= 0:
     return 0.
   else:
     return x**y

If no `return` statement is present within a function, or if the `return` statement is used without a return value, Python automatically returns the special value `None`:

In [7]:
def test(x):
   print("%11.4e" % x)
   return

In [8]:
ret = test(1)

 1.0000e+00


In [9]:
ret == None

True

In [10]:
ret is None

True

`None` is a reserved, special object in Python, similar to `True` and `False`.  It essentially means nothing, and the interactive prompt displays no output when an expression evaluates to `None` (an explicit `print(None)` does print the text `None`).  As seen in the above example, one can test for `None` using equality or the `is` operator; `is None` is the conventional form.

If one wants a function that modifies its behavior depending on the type of the argument, it is possible to test for different types using the `type` function (though often it is regarded as bad practice to write functions which require the types of objects to be known or manipulated):

In [11]:
def add(arg1, arg2):
   # test to see if one is a string and the other is not
   if type(arg1) is str and type(arg2) is not str:
     arg1convert = type(arg2)(arg1)
     return arg1convert + arg2
   elif type(arg1) is not str and type(arg2) is str:
     arg2convert = type(arg1)(arg2)
     return arg1 + arg2convert
   else:
     return arg1 + arg2

In [12]:
add(1, "40")

41

In [13]:
add(40., "1")

41.0

Notice that in this example, the `type(arg2)` statement is also used to return the function that converts generic objects to the type of `arg2`, e.g., `int`, `float`, or `complex`.  Thus the statement `type(arg2)(arg1)` actually runs this type-conversion function on the string `arg1` to convert it to the type of `arg2`.  
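
As a small sketch of the idea above, the value returned by `type` can be stored in a variable and called like any other function.  The names `value` and `converter` are chosen here only for illustration; the `isinstance` check at the end is a commonly used alternative to comparing types directly.

```python
value = 2.5

# type(value) is the class float, which can be called to convert objects
converter = type(value)
print(converter)              # <class 'float'>
print(converter("40"))        # 40.0

# isinstance is a common alternative to comparing type(...) directly
print(isinstance(value, float))   # True
```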

Functions can return more than one value using Python's tuple capabilities.  To do so, specify a comma-separated list after the `return` statement:

In [14]:
def test(x, y):
   a = x / y
   b = x % y
   return a, b

In [15]:
test(5, 2)

(2.5, 1)

In [16]:
c, d = test(5,2)
c

2.5

In [17]:
d

1

## Optional arguments in functions

Arguments of functions can be optional.  Such optional arguments must have a default value, specified in the `def` statement.  If optional arguments are given when a function is called, the arguments will take on the supplied values.  If not, they will assume the default values:

In [18]:
def fmtWithUnits(x, format = "%.3f", unit = "inches"):
   return format % x + " " + unit

In [19]:
fmtWithUnits(7)

'7.000 inches'

In [20]:
fmtWithUnits(7, "%.1f")

'7.0 inches'

In [21]:
fmtWithUnits(7, "%.1f", "feet")

'7.0 feet'

In [22]:
fmtWithUnits(7, unit = "feet")

'7.000 feet'

Notice that in the last call we needed to name the optional `unit` argument explicitly, since we skipped the optional `format` argument.  In general, it is good practice to name optional arguments in this way whether or not one needs to, since this makes it clearer that the arguments in the call are optional:

In [23]:
fmtWithUnits(7, format = "%.1f", unit = "feet")

'7.0 feet'
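
One consequence of naming optional arguments is that their order in the call no longer matters.  The sketch below repeats the `fmtWithUnits` definition from above so that it runs on its own.

```python
def fmtWithUnits(x, format="%.3f", unit="inches"):
    return format % x + " " + unit

# Named (keyword) arguments can be given in any order
print(fmtWithUnits(7, unit="feet", format="%.1f"))   # 7.0 feet
print(fmtWithUnits(7, format="%.1f", unit="feet"))   # 7.0 feet
```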

## Function namespaces

Argument variables within functions exist in their own namespace.  This means that assignment of an argument to a new value does not affect the original value outside of the function.  Consider the following:

In [24]:
def increment(a):
   a = a + 1
   return a

In [25]:
a = 5
increment(a)

6

In [26]:
a

5

What happened here?  Because `a` is an argument variable defined in the `def` statement, it is treated as a new variable that exists only within the function.  Once the function has finished, this local `a` ceases to exist, and any object it referred to that is no longer in use is reclaimed by Python's garbage collector.  The `a` that we defined outside of the function remains the same.

How, then, does one modify variables using functions?  In other programming languages, you may have been used to sending variables to functions to change their values directly.  This is not a "Pythonic" way of doing things (a way which is considered good style or appropriate in the Python language).  Instead, the Pythonic approach is to assign the function's return value back to the variable.  This is arguably clearer than the approach of many other programming languages because it shows explicitly that the variable is being changed upon calling the function:

In [27]:
def increment(a):
   return a + 1

In [28]:
a = 5
a = increment(a)
a

6

There is one subtlety to this issue.  Mutable objects can actually be changed by functions if one uses object functions (methods) and/or element access.  Consider the following example that uses both to modify a list:

In [29]:
def modifylist(l):
   l.append(5)
   l[0] = 20

In [30]:
l = [1, 2, 3]
modifylist(l)
l

[20, 2, 3, 5]

The reason for the distinction with mutable objects has to do with Python's name-binding approach.  Consider a generic function `fn(arg)` whose body contains the single assignment `arg = newvalue`, called as `fn(x)`.

When one calls `fn(x)`, Python creates the new variable `arg` within the function namespace and points it to the data residing in the spot of memory to which `x` points.  Setting `arg` equal to another value within the function simply has the effect of pointing `arg` to a new location in memory corresponding to `newvalue`, rather than changing the existing spot in memory associated with `x`.  Therefore, `x` remains unaffected.

On the other hand, suppose the body of `fn` instead contains the statement `arg[index] = newvalue`.

Here, the bracket notation tells Python to do the following: find the area in memory where the element of `arg` at position `index` resides and put `newvalue` in it.  This occurs because the brackets after `arg` are actually treated as an object function (method) of `arg`, and thus act on the memory and data to which `arg` points.  A similar case would exist if we had called some object function that modified its contents, like `arg.sort()`.  In these cases, `x` would be modified outside of the function.
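
The two cases can be compared side by side in a short sketch.  The function names `rebind` and `mutate` are made up for this illustration.

```python
def rebind(arg):
    # Assignment points the local name arg at a new object;
    # the caller's object is left untouched.
    arg = [0, 0, 0]

def mutate(arg):
    # Element assignment and methods act on the object itself,
    # which the caller's variable also refers to.
    arg[0] = 20
    arg.sort()

x = [3, 1, 2]

rebind(x)
print(x)    # [3, 1, 2]

mutate(x)
print(x)    # [1, 2, 20]
```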

## Functions as objects

As alluded to previously, functions are objects and thus can be sent to other functions as arguments.  Consider the following:

In [31]:
def squareme(x):
   return x*x

In [32]:
def applytolist(l, fn):
   return [fn(ele) for ele in l]

In [33]:
l = [1, 7, 9]
applytolist(l, squareme)

[1, 49, 81]

Here, we sent the `squareme` function to the `applytolist` function.  Notice that when we send a function to another function, we do not supply arguments.  If we had supplied arguments, we would have instead sent the return value of the function, rather than the function itself.
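
The difference between sending a function and sending its return value can be seen in the following sketch, which repeats the definitions above so that it runs on its own.  The variable `error_raised` is only there to record what happened.

```python
def squareme(x):
    return x * x

def applytolist(l, fn):
    return [fn(ele) for ele in l]

# Sending the function itself: no parentheses, no arguments
print(applytolist([1, 7, 9], squareme))   # [1, 49, 81]

# Sending squareme(2) sends the integer 4 instead of a function,
# so the call fn(ele) inside applytolist fails
error_raised = False
try:
    applytolist([1, 7, 9], squareme(2))
except TypeError as err:
    error_raised = True
    print("TypeError:", err)
```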

**Python shows us that a function is an object**.  Consider, from the above example:

In [34]:
squareme

<function __main__.squareme>

The information printed shows that this is a function.  We can also test the type:

In [35]:
type(squareme)

function

Like other objects, we can perform assignment using functions:

In [36]:
def a(x, y):
   return x+y

In [37]:
b = a
b(1, 4)

5
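
Because functions are ordinary objects, they can also be stored in containers such as dictionaries.  In this sketch, `cubeme` and the dictionary `operations` are hypothetical names introduced for illustration.

```python
def squareme(x):
    return x * x

def cubeme(x):
    return x * x * x

# A dictionary whose values are function objects
operations = {"square": squareme, "cube": cubeme}

for name in ("square", "cube"):
    fn = operations[name]     # look up the function by name
    print(name, fn(3))        # prints "square 9", then "cube 27"
```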

## Function documentation

Functions can be self-documenting in Python.  A docstring can be written after the `def` statement to describe what a function does.  This is extremely useful for documenting your code and providing explanations that both you and subsequent users can use.  The built-in `help` function uses docstrings to provide help about functions.  **Get in the habit of always writing docstrings for your functions**, even before you write the code itself.

In [38]:
def a(x, y):
   """Adds two variables x and y, of any type.  Returns single value."""
   return x + y

In [39]:
help(a)

Help on function a in module __main__:

a(x, y)
    Adds two variables x and y, of any type.  Returns single value.



It is typical to enclose docstrings in triple quotes, since complex functions might require longer, multi-line documentation.  Code should really not be regarded as complete until it has useful docstrings.  Each docstring should contain three pieces of information: (1) a basic description of what the function does, (2) what the function expects as arguments, and (3) what the function returns (including the variable types).
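
As one possible layout (not a required convention), a multi-line docstring covering these three pieces of information might look like the following.  The function `fmt_with_units` is a renamed variant of the earlier formatting example, used here only as an illustration.

```python
def fmt_with_units(x, fmt="%.3f", unit="inches"):
    """Format a number followed by a unit label.

    Arguments:
        x    -- number (int or float) to be formatted
        fmt  -- string, printf-style format applied to x
        unit -- string, unit label appended after a space

    Returns:
        string containing the formatted number and the unit
    """
    return fmt % x + " " + unit

# The docstring is stored on the function object and used by help()
print(fmt_with_units.__doc__)
```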

## Writing scripts

So far, the examples we have covered have involved commands interpreted directly from the Python interactive prompt.  Python also supports scripts: files containing lists of commands, function definitions, and any other Python constructs, similar to source code in other programming languages.  The contents of a script are no different from the commands and instructions that you would enter at the interactive prompt.  Python scripts end in the extension `.py` on all platforms.

Consider a script file called [`primes.py`](primes.py) (also present in this directory) that finds all prime numbers less than or equal to 50.
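
The listing of `primes.py` is not reproduced in this text.  The sketch below is a hypothetical reconstruction: only the names `nextprime` and `l` come from the text; the function's argument and the algorithm are assumptions made for illustration, and the actual file may differ.

```python
# primes.py (hypothetical reconstruction, see the note above)

def nextprime(primelist):
    """Return the smallest prime larger than the last entry of primelist.

    primelist -- list of all primes found so far, in increasing order
    """
    candidate = primelist[-1] + 1
    while True:
        # candidate is prime if none of the smaller primes divides it
        if all(candidate % p != 0 for p in primelist):
            return candidate
        candidate += 1

# Build the list of all primes less than or equal to 50
l = [2]
while True:
    p = nextprime(l)
    if p > 50:
        break
    l.append(p)

print(l)
```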

We can run this program from the command line by calling Python with an argument that is the name of our script, for example `python primes.py`.  Python will run the contents of the file as if we typed them at the interactive prompt and then exit.

## Modules

It is also possible to import scripts from within the Python interpreter.  When files of Python commands are imported in this way they are termed modules.  Modules are a major basis of programming efforts in Python as they allow you to organize reusable code that can be imported as necessary in specific programming applications.  Considering the previous example:

In [40]:
import primes

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]


In [41]:
type(primes)

module

In [42]:
primes.nextprime

<function primes.nextprime>

In [43]:
primes.l

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Notice several features of this example:

* Scripts are imported using the `import` command.  Upon processing the `import` statement, Python immediately executes the contents of the file `primes.py`.

* One does not use the `.py` extension in the `import` command; Python assumes the file ends in this and is accessible in the current directory (if unchanged, the same directory from which Python was started).  If Python does not find the script to be imported in the current directory, it will search a specific path called `PYTHONPATH`, discussed later.

* When Python executes the imported script, it creates an object from it of type `module`.

* Any objects created when running the imported file are not deleted but are placed as members of the module object.  In this way, we can access the functions and variables that were part of the module program by using dot notation, like `primes.l` and `primes.nextprime`.


By making script objects members of the module, Python gives us a powerful way to write reusable code, i.e., code with generic functions and variables that we can import into programs.  Modules can also import other modules, so that we can have hierarchies of code with variable degrees of generality.


Module objects can be created and modified just like any other object in Python:

In [44]:
primes.l = []
primes.l

[]

In [45]:
primes.k = 5   #create new object in primes module
primes.k

5

Importing a module twice does not execute it twice:

In [46]:
import primes

Python will import a module only once, for reasons of efficiency (in the case, for instance, that many modules import the same sub-module).  This can be overridden using the reload function, which is itself in `importlib`:

In [47]:
from importlib import reload
reload(primes)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]


<module 'primes' from '/Users/dmobley/github/drug-computing/other-materials/python-intro/primes.py'>

Sometimes we want scripts to behave differently when we execute them at the command line versus import them into other programs.  Commonly we want the script to execute certain commands when run from the command line, but need to suppress this behavior when imported.  To achieve this, we need to test whether or not the program is being run from the command line.  Consider a program `test.py` that defines a function `multiply(x, y)` and, only when run directly, prints the result of `multiply(4, 5)`.
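
The listing of `test.py` is not reproduced in this text.  The following is a minimal reconstruction that is consistent with the behavior described here (a `multiply` function, and a guarded call that prints `multiply(4, 5)`); the exact contents of the original file are an assumption.

```python
# test.py (reconstruction based on the described behavior)

def multiply(x, y):
    """Return the product of x and y."""
    return x * y

# This block runs only when the file is executed directly,
# not when it is imported as a module.
if __name__ == "__main__":
    print(multiply(4, 5))
```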

The script tests whether it has been run from the command line with the condition `if __name__ == "__main__":`, placed just before the commands to be guarded.  The variable `__name__` is a special variable that Python creates to hold the name of the current module.  (There are many such special variables, and they are always identified by leading and trailing double underscores.)  Python assigns the value `"__main__"` to `__name__` if and only if the file is being run as the main program (i.e., not imported).  Here is the behavior of our program at the command line:
```
python test.py
20
```

And here is its behavior if we import it:

In [48]:
import test
test.multiply(2, 3)

6

Notice that Python does not execute the `multiply(4, 5)` command when we import, but we still have access to any functions or objects defined in `test.py`.


It is not possible to use path names in the `import` statement.  Instead, by default, Python will look for modules in three places: (1) the current working directory, (2) a special directory or set of directories called `PYTHONPATH`, and (3) the standard Python installation.  The second location makes it convenient to store user-written reusable code in a common folder.  `PYTHONPATH` is actually a system environment variable that Python looks for and can point to such a folder.  To set it on Windows machines, one needs to edit the system/environment variables (the exact procedure depends on your version of Windows).  Then, `PYTHONPATH` can be added to the User Variables category with a value that is the path of the folder where your common scripts are.  On Mac, this is done by setting the environment variable in a shell startup file such as `~/.bash_profile`; help for this is provided elsewhere or via Google.
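
The directories that Python actually searches can be inspected at runtime through the list `sys.path`.  The sketch below prints that list and the current value of `PYTHONPATH`; the folder name `~/my_python_modules` is a hypothetical example, not a location that Python defines.

```python
import os
import sys

# sys.path is the list of directories searched, in order, by import
for directory in sys.path:
    print(directory)

# PYTHONPATH is an ordinary environment variable and may be unset
print(os.environ.get("PYTHONPATH", "(PYTHONPATH is not set)"))

# A folder can also be added for the current session only
# (hypothetical folder name, used for illustration)
common_dir = os.path.expanduser("~/my_python_modules")
sys.path.append(common_dir)
```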

## Standard modules

Python has a "batteries included" philosophy and therefore comes with a huge library of pre-written modules that accomplish a tremendous range of possible tasks.  It is beyond the scope of this tutorial to cover all but a small few of these.  However, here is a brief list of some of these modules that can come in handy for scientific programming:

* `os` – functions for various operating system operations

* `os.path` – functions for manipulating directory/folder path names

* `sys` – system-specific parameters and functions (e.g., command-line arguments and the module search path)

* `time` – functions for program timing and returning the current time/date in various formats

* `filecmp` – functions for comparing files and directories

* `tempfile` – functions for automatic creation and deletion of temporary files

* `glob` – functions for matching wildcard-type file expressions (e.g., `*.txt`)

* `shutil` – functions for high-level file operations (e.g., copying, moving files)

* `struct` – functions for storing numeric data as compact, binary strings

* `gzip, bz2, zipfile, tarfile` – functions for writing to and reading from various compressed file formats

* `pickle` – functions for converting most Python objects to a byte string that can be written to or subsequently read from a file

* `hashlib` – functions for computing cryptographic hashes (e.g., MD5, SHA-256) of data

* `socket` – functions for low-level networking 

* `subprocess` – functions for running other programs and capturing their output (this replaces the older `popen2` module, which is not available in Python 3)

* `urllib` – functions for grabbing data from internet servers

* `smtplib` – functions for interfacing with SMTP (e-mail) servers

* `audioop` – functions for manipulating raw audio data (deprecated in Python 3.11 and removed in Python 3.13; the related `imageop` module for raw image data existed only in Python 2)
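
As a brief illustration of how a few of these modules fit together, the following sketch writes an object to a temporary file with `pickle`, finds the file with `glob`, reads it back, and computes a hash of its contents.  The file name `data.pkl` and the contents of `data` are made up for this example.

```python
import glob
import hashlib
import os
import pickle
import tempfile

# tempfile: a scratch directory that is deleted when the block exits
with tempfile.TemporaryDirectory() as workdir:
    # os.path: build a platform-independent file name
    filename = os.path.join(workdir, "data.pkl")

    # pickle: write a Python object to a file as bytes
    data = {"temperature": 298.15, "steps": [1, 2, 3]}
    with open(filename, "wb") as handle:
        pickle.dump(data, handle)

    # glob: find files matching a wildcard expression
    matches = glob.glob(os.path.join(workdir, "*.pkl"))
    print(len(matches))        # 1

    # pickle: read the object back from the file
    with open(filename, "rb") as handle:
        restored = pickle.load(handle)
    print(restored == data)    # True

    # hashlib: SHA-256 digest of the file contents, as hexadecimal text
    with open(filename, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    print(len(digest))         # 64
```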


A complete listing of all of the modules that come with Python is given in the Python Library Reference in the Python Documentation.  In addition to these modules, scientific computing often makes extensive use of various extremely valuable add-on modules that form part of a relatively standard scientific Python "stack".  These include `numpy`, `scipy`, `matplotlib`, `scikit-learn`, and many others.