# Structuring python code

We have learned the basics of python by now, and have seen how to use various python concepts and powerful libraries from ipython notebooks. This is a perfectly viable way to work, as long as projects are not too complicated. 

Real-life projects, however, usually turn out to be more complex, with code accumulating over time. In that case, structuring your code is vital.

## Python modules

### Collecting commonly used code

Typically, in your project you will develop some core functionality that you will use over and over again - for example some function

    def do_some_fancy_stuff(arguments):
        ...

You then use this function for many purposes. Over time, you will have several notebooks using it, and then you would need to copy the function and all related code to each notebook!

This is impractical and error-prone: if you fix a bug in one copy, all the other copies still contain it.

In this case you should put the function and all related code into a python `module`. This means you put the code in some text file with a name ending in ``.py``, let's say
``module.py``. Then you can use this function in any notebook you want by importing the module:

    import module
    
    module.do_some_fancy_stuff(...)
    
Alternatively, you can also write

    from module import do_some_fancy_stuff
    
    do_some_fancy_stuff(...)

but you already saw that syntax in the basic introduction in ``from math import ...``.

### Separation in different namespaces

Another advantage of modules is that they help you avoid errors that may arise if too much code gets intermixed. Say you have code like this:

In [6]:
c = 1

def f(x):
    return c * x

The function ``f(x)`` depends on a global variable ``c``. Now say you write a lot of code in between, and you accidentally overwrite ``c`` by using the name for some other purpose (you forgot it was already in use). Then you will change the behavior of ``f(x)``! Try that below:

In [7]:
f(1)

1

In [8]:
c = 2
f(1)

2

If you separate the code into a module ``module.py`` which reads

    c = 1
    
    def f(x):
        return c * x
        
and you use it as

    import module
    
    module.f(1)
    
    c = 2
    
    module.f(1)

then no problem arises. ``f`` uses the variable ``c`` from the namespace of the module, whereas you used the variable ``c`` from the namespace of your notebook. You could still change ``c`` in the module, but then you need to write ``module.c = 2`` - which makes it immediately clear what you are doing.
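
The difference between the two namespaces can be tried out with the following sketch. It assumes that we may write ``module.py`` into a temporary folder and put that folder on the import path; in a real project the file would simply sit next to your notebook. The names ``module_dir``, ``before`` and ``after`` are made up for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write the module from the text into a temporary folder (illustration only).
module_dir = Path(tempfile.mkdtemp())
(module_dir / "module.py").write_text(
    "c = 1\n"
    "\n"
    "def f(x):\n"
    "    return c * x\n"
)

# The import system looks through the folders listed in sys.path.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import module

c = 2                  # a name in the namespace of the notebook
before = module.f(1)   # f still sees module.c == 1

module.c = 2           # explicitly rebind the name inside the module
after = module.f(1)    # now f sees module.c == 2

print(before, after)   # prints: 1 2
```

Only the explicit assignment ``module.c = 2`` changes the result of ``module.f``.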

Namespaces are a powerful concept in python. They also apply for example to functions - if you write

In [10]:
c = 1

def f(x):
    c = 2
    return c * x

In [14]:
c = 1
print(f(1))
c = 3
print(f(1))

2
2


then the variable ``c`` defined inside ``f(x)`` is used, no matter what value the global ``c`` has. Note however that in this case you cannot access the variable ``c`` as ``f.c`` (try what happens if you do that).
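
A small sketch of that experiment, catching the error so that the cell keeps running. The names ``result`` and ``message`` are only used for this illustration.

```python
c = 1

def f(x):
    c = 2          # local to f; the global c is not touched
    return c * x

result = f(1)

# Local variables of a function are not attributes of the function object,
# so looking up f.c raises an AttributeError.
try:
    f.c
except AttributeError as error:
    message = str(error)

print(result, c)   # prints: 2 1
print(message)
```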

This was a simple example of nested namespaces. If you want to know more, then google it!

## Docstrings

As your code becomes more complex, you need to add documentation. A convenient way in python is to add documentation directly to the function by writing a string right after the ``def`` line:

In [25]:
def f(x, y):
    """This is a function that does bla bla ...
    
    Parameters
    ----------
    x : value for bla bla
    y : value for bla bla
    
    Returns
    -------
    z : some value that depends on x and y
    """
    pass

You can then access the documentation from the ipython notebook simply by writing

- ``f`` and hitting `SHIFT + TAB` to show the first part of the docstring
- ``f?`` or ``help(f)`` to get the full docstring

You can also add a docstring to the top of a module file.
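
Outside the notebook, the docstring is also available from code. The sketch below assumes a made-up function body (a simple sum) so that there is something to document; ``inspect.getdoc`` from the standard library returns the docstring with the indentation removed, which is roughly what ``help(f)`` displays.

```python
import inspect

def f(x, y):
    """Return the sum of x and y.

    Parameters
    ----------
    x : first summand
    y : second summand

    Returns
    -------
    z : the sum x + y
    """
    return x + y

# The raw docstring is stored on the function object itself.
first_line = f.__doc__.splitlines()[0]

# inspect.getdoc strips the common leading indentation.
cleaned = inspect.getdoc(f)

print(first_line)
print(cleaned)
```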

## Testing

Especially when code becomes more and more complex, unit tests are an incredibly useful tool.

It is very easy and common for everybody, even the most experienced programmer, to introduce bugs while working on their code. Often those bugs do not affect what you are doing now, but break something you did before! On many occasions you can catch these problems by writing tests alongside your code.

(There are several frameworks for keeping track of tests in python - we are using ``nose`` here. Note that ``nose`` is no longer actively maintained; ``pytest`` follows the same ``test_`` naming convention for files and functions.)

What you have to do is simple: once you have written some code in a module, add another python file whose name starts with ``test_``, and add a function to it whose name also starts with ``test_`` (that's easy to remember, right;). For example, in the module ``module.py`` you might have:

    def add_together(x, y):
        return x + y

In ``test_module.py`` you would write:

    import module
    
    def test_add_together():
        # the function lives in the namespace of the module
        assert module.add_together(1, 2) == 3
        assert module.add_together("abc", "def") == "abcdef"
        
We introduced a new statement, ``assert``. Let's check here in the notebook what it does:

In [27]:
assert 1 == 1

Nothing happens in this case. But now let's see what happens if we assert a statement that is not true:

In [28]:
assert 1 < 1

AssertionError: 

In this case, an ``AssertionError`` is raised.

The key is to introduce test functions that raise an ``AssertionError`` if something goes wrong. Usually, you do this using ``assert`` statements, but you can also write more
complicated tests with ``if`` statements etc., and raise the error manually:

    raise AssertionError("some error message")
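
As a sketch of such a more complicated test: the function below checks several inputs and raises one ``AssertionError`` listing all mismatches. The test name, the list ``cases`` and the error messages are made up for this illustration. The last lines show that an ``assert`` statement can carry a message as well, given after a comma.

```python
def add_together(x, y):
    return x + y

def test_add_together_many_inputs():
    # Pairs of (arguments, expected result).
    cases = [((1, 2), 3), ((0, 0), 0), (("abc", "def"), "abcdef")]
    mismatches = []
    for args, expected in cases:
        result = add_together(*args)
        if result != expected:
            mismatches.append(
                f"add_together{args} gave {result!r}, expected {expected!r}"
            )
    # Raise the error manually, with all problems collected in one message.
    if mismatches:
        raise AssertionError("; ".join(mismatches))

test_add_together_many_inputs()  # passes silently

# A deliberately failing assert with a message, caught so the cell keeps running.
try:
    assert add_together(1, 1) == 3, "expected 3"
except AssertionError as error:
    message = str(error)

print(message)  # prints: expected 3
```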
    
So now you have a python file with lots of tests. You can run them all automatically by calling, from the command line,

    nosetests
    
within the folder containing the modules and tests. It will run all the tests it finds there, and show you all failures!

To make these tests useful, you want to run them as often as possible. When you design them, make them work on problems that are as small as possible, so that they run *fast*. That way you will run them often, and catch many errors.