# Programming Extras

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Testing" data-toc-modified-id="Testing-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Testing</a></span><ul class="toc-item"><li><span><a href="#Docstrings" data-toc-modified-id="Docstrings-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Docstrings</a></span></li><li><span><a href="#Doctest" data-toc-modified-id="Doctest-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Doctest</a></span></li><li><span><a href="#Unit-testing" data-toc-modified-id="Unit-testing-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Unit testing</a></span></li></ul></li><li><span><a href="#Debugging" data-toc-modified-id="Debugging-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Debugging</a></span></li><li><span><a href="#Profiling" data-toc-modified-id="Profiling-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Profiling</a></span><ul class="toc-item"><li><span><a href="#Within-jupyter-notebook" data-toc-modified-id="Within-jupyter-notebook-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Within jupyter notebook</a></span></li><li><span><a href="#Profiling-your-entire-code" data-toc-modified-id="Profiling-your-entire-code-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Profiling your entire code</a></span></li><li><span><a href="#Lineprofiling-your-code" data-toc-modified-id="Lineprofiling-your-code-3.3"><span class="toc-item-num">3.3&nbsp;&nbsp;</span>Lineprofiling your code</a></span></li></ul></li><li><span><a href="#Speed-up-your-code" data-toc-modified-id="Speed-up-your-code-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Speed up your code</a></span><ul class="toc-item"><li><span><a href="#Ufuncs" data-toc-modified-id="Ufuncs-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Ufuncs</a></span></li><li><span><a href="#Numba" data-toc-modified-id="Numba-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Numba</a></span></li></ul></li><li><span><a href="#Git(hub)" 
data-toc-modified-id="Git(hub)-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Git(hub)</a></span><ul class="toc-item"><li><ul class="toc-item"><li><span><a href="#What-can-it-look-like?" data-toc-modified-id="What-can-it-look-like?-5.0.1"><span class="toc-item-num">5.0.1&nbsp;&nbsp;</span>What can it look like?</a></span></li></ul></li></ul></li><li><span><a href="#Github" data-toc-modified-id="Github-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Github</a></span></li><li><span><a href="#Publishing-code" data-toc-modified-id="Publishing-code-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Publishing code</a></span><ul class="toc-item"><li><ul class="toc-item"><li><span><a href="#Software-Citation-Principles" data-toc-modified-id="Software-Citation-Principles-7.0.1"><span class="toc-item-num">7.0.1&nbsp;&nbsp;</span>Software Citation Principles</a></span></li></ul></li></ul></li></ul></div>

## Testing
*Ensure your code never breaks*

### Docstrings

In [None]:
def func(arg1, arg2):
    """Summary line.

    Extended description of function.

    Args:
        arg1 (int): Description of arg1
        arg2 (str): Description of arg2

    Returns:
        bool: Description of return value

    Raises:
        ValueError: If `arg2` is equal to `arg1`.

    Examples:
        Examples should be written in doctest format, and should illustrate how
        to use the function.

        >>> a = [1,2,3]
        >>> print([x + 3 for x in a])
        [4, 5, 6]

    """
    if arg1 == arg2:
        raise ValueError('arg1 may not be equal to arg2')
    return True
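
A docstring is stored on the function object itself, which is how `help()` and editors can display it. A minimal sketch, assuming a hypothetical function `area` that is not part of the original notebook:

```python
import inspect


def area(width, height):
    """Return the area of a rectangle.

    Args:
        width (float): Width of the rectangle.
        height (float): Height of the rectangle.

    Returns:
        float: The product of width and height.
    """
    return width * height


# The raw docstring lives in the __doc__ attribute
summary = area.__doc__.splitlines()[0]
print(summary)

# inspect.getdoc() returns the docstring with its indentation cleaned up
clean_doc = inspect.getdoc(area)
print(clean_doc)
```

In a notebook, `area?` or `help(area)` shows the same text.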

### Doctest

In [None]:
def fib(n):
    """Calculates the n-th Fibonacci number.  

    >>> fib(0)
    0
    >>> fib(15)
    610

    """
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a

The doctests can be run from the command line with
```
$ python3 -m doctest -v <file>
```

This produces
```
Trying:
    fib(0)
Expecting:
    0
ok
Trying:
    fib(15)
Expecting:
    610
ok
1 items had no tests:
    test
1 items passed all tests:
   2 tests in test.fib
2 tests in 2 items.
2 passed and 0 failed.
Test passed.
```
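
Doctests can also be run from inside Python, which is handy in a notebook where there is no file to pass to the command line. The sketch below uses the standard-library `DocTestFinder` and `DocTestRunner` on a single function; calling `doctest.testmod()` is the more common shortcut when the tests live in a module.

```python
import doctest


def fib(n):
    """Calculate the n-th Fibonacci number.

    >>> fib(0)
    0
    >>> fib(15)
    610
    """
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Collect the examples from the docstring of this one function
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(fib, name="fib", globs={"fib": fib}):
    runner.run(test)

# summarize() returns a (failed, attempted) result
results = runner.summarize(verbose=False)
print(results)
```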

### Unit testing

In [None]:
import unittest

# Define the function
def fun(x):
    return x + 1

# Define the tests
class MyTest(unittest.TestCase):
    def test(self):
        self.assertEqual(fun(3), 4) # Specifies that if the code works, fun(x=3) should return 4

# Run the unit test (the argv and exit arguments are only needed in Jupyter notebooks)
if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], exit=False)
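
Tests can also check that code fails in the way it should. A minimal sketch, assuming a hypothetical function `safe_divide` that is not part of the original notebook; the test case is loaded and run explicitly, which avoids the `argv` workaround:

```python
import unittest


def safe_divide(a, b):
    """Divide a by b, refusing to divide by zero (hypothetical example)."""
    if b == 0:
        raise ValueError("b may not be zero")
    return a / b


class SafeDivideTest(unittest.TestCase):
    def test_regular_division(self):
        self.assertEqual(safe_divide(6, 3), 2)

    def test_zero_division_raises(self):
        # This test passes only if the expected exception is raised
        with self.assertRaises(ValueError):
            safe_divide(1, 0)


# Load and run just this test case
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SafeDivideTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```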

## Debugging
*When your computer makes you feel stupid*

Most people simply use `print()` statements to debug. But you can do better than that...

In [None]:
import time

def complicated_function():
    time.sleep(2)
    x, y, z = 1, '2', 3
    
    # Usually you might do this
    print(y)
    
    return x+y+z 
    
complicated_function()

In [None]:
import time

def complicated_function():
    time.sleep(0.5)
    x, y, z = 1, '2', 3
    
    # But how about
    import IPython; IPython.embed()
    
    return x+y+z
    
complicated_function()
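
The built-in `breakpoint()` (Python 3.7 and later) drops you into the standard `pdb` debugger in much the same way, and the `%debug` magic opens it after an exception in a notebook. Both are interactive. As a non-interactive sketch of the same idea, the code below catches the error and looks up the local variables of the frame in which it was raised:

```python
def complicated_function():
    x, y, z = 1, '2', 3
    return x + y + z  # fails: cannot add an int and a str


try:
    complicated_function()
except TypeError as error:
    # Walk to the innermost frame of the traceback, where the error occurred
    tb = error.__traceback__
    while tb.tb_next is not None:
        tb = tb.tb_next

    # Copy the local variables of that frame into a plain dict
    local_vars = dict(tb.tb_frame.f_locals)
    print(f"{type(error).__name__}: {error}")
    print(local_vars)
```

Seeing that `y` is the string `'2'` rather than the integer `2` points straight at the bug.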

## Profiling
*Find the bottleneck in your code*

### Within jupyter notebook

In [None]:
%%time

def upper_func(x):
    return x + 1

def middle_func(x):
    [upper_func(i) for i in range(10000)]
    return upper_func(x) + 1

def lower_func(x):
    return middle_func(x) + 1

lower_func(5)

In [None]:
%%timeit

def upper_func(x):
    return x + 1

def middle_func(x):
    [upper_func(i) for i in range(10000)]
    return upper_func(x) + 1

def lower_func(x):
    return middle_func(x) + 1

lower_func(5)
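
Outside a notebook, the standard-library `timeit` module does the same job as `%%timeit`. A minimal sketch that reuses the functions above; the `number` and `repeat` values are arbitrary choices:

```python
import timeit


def upper_func(x):
    return x + 1


def middle_func(x):
    [upper_func(i) for i in range(10000)]
    return upper_func(x) + 1


def lower_func(x):
    return middle_func(x) + 1


# Call lower_func(5) 100 times per measurement, and take 5 measurements
timings = timeit.repeat(lambda: lower_func(5), number=100, repeat=5)

# The fastest measurement is the one least disturbed by other processes
best = min(timings) / 100
print(f"best of 5: {best * 1e6:.1f} microseconds per call")
```

Here only the call to `lower_func` is timed, not the function definitions.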

### Profiling your entire code

Try profiling a whole script with `cProfile` and viewing the result in `snakeviz`. A bash function saves some typing:

In [None]:
profile() { python3 -m cProfile -o ~/Downloads/temp.profile "$1"; snakeviz ~/Downloads/temp.profile;}
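
The same profiler can be driven from Python, which is useful when only one call needs profiling. A minimal sketch using the standard-library `cProfile` and `pstats` modules on the functions from the timing example; the limit of 10 rows is an arbitrary choice:

```python
import cProfile
import io
import pstats


def upper_func(x):
    return x + 1


def middle_func(x):
    [upper_func(i) for i in range(10000)]
    return upper_func(x) + 1


def lower_func(x):
    return middle_func(x) + 1


# Profile a single call
profiler = cProfile.Profile()
profiler.runcall(lower_func, 5)

# Write the statistics to a string, sorted by cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The `ncalls` column of the report shows that `upper_func` is called 10001 times, which is where the time goes.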

### Lineprofiling your code

If that is not detailed enough, place the `@profile` decorator above a function in your code and run the script through `kernprof` (part of the `line_profiler` package) to get timings per line:

In [None]:
lineprofile() { kernprof -l -v "$1";}

## Speed up your code
*Speed up for-loops*

### Ufuncs

In [9]:
import numpy as np

In [10]:
g = np.array([1, 2, 3, 4])
np.sin(g)  # A NumPy ufunc is applied element by element to the whole array

array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 ])

In [11]:
def step_function(x): # Not a ufunc!
    if x > 0:
        return 1
    else:
        return 0

ar = np.array([-10, 10, 100])
step_function(ar)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [12]:
ustep_function = np.vectorize(step_function)  # wraps our function so it is applied per element (a convenience, not a speed-up)

ustep_function(ar)

array([0, 1, 1])
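
Because `np.vectorize` still calls the Python function once per element, an expression built from real array operations is usually the faster choice. A minimal sketch of the same step function written with `np.where` and with a boolean mask:

```python
import numpy as np

ar = np.array([-10, 10, 100])

# np.where evaluates the condition for every element at once
step_where = np.where(ar > 0, 1, 0)
print(step_where)  # [0 1 1]

# Casting the boolean mask to integers gives the same result
step_mask = (ar > 0).astype(int)
print(step_mask)  # [0 1 1]
```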

### Numba

In [5]:
ar = np.random.random(12345678)

# Step function as a pure Python loop (slow)
def step_function_python(a):
    output = np.zeros_like(a)
    for i, nr in enumerate(a):
        if nr > 0:
            output[i] = 1
    return output

%time step_function_python(ar)

CPU times: user 4.75 s, sys: 21.4 ms, total: 4.77 s
Wall time: 4.79 s


In [6]:
# NumPy version of the step function
def step_function_numpy(a):
    output = np.zeros_like(a)
    output[a > 0] = 1  # boolean mask: write to the output, leave the input untouched
    return output

%time step_function_numpy(ar)

CPU times: user 45.2 ms, sys: 28.2 ms, total: 73.4 ms
Wall time: 71.9 ms


In [7]:
import numba as nb

In [8]:
@nb.jit(nopython=True)
def step_function_numba(a):
    output = np.zeros_like(a)
    for i, nr in enumerate(a):
        if nr > 0:
            output[i] = 1
    return output

# The first call includes compilation time; later calls run the compiled code
%time step_function_numba(ar)
%time step_function_numba(ar)
%time step_function_numba(ar)

CPU times: user 210 ms, sys: 55.9 ms, total: 266 ms
Wall time: 304 ms
CPU times: user 26.5 ms, sys: 18.4 ms, total: 44.9 ms
Wall time: 44.8 ms
CPU times: user 24.3 ms, sys: 19.9 ms, total: 44.2 ms
Wall time: 44 ms


## Git(hub)
*Version control your software*

Everyone should use Git. Seriously. You'll no longer need to worry about breaking a working version of your code. Don't worry about learning all the commands: these days there are GUIs like GitKraken which do the hard work for you.

![final_version](media/final.png)

#### What can it look like?
![git](media/git.png)

For a full introduction, see [this presentation](https://davidgardenier.com/talks/201710_git.pdf).

## Github
*Backup your code*

Want to have a backup of your code? Or collaborate on code without having to send files or code fragments back and forth? Check out GitHub and apply for a Student Developer Pack or an Academic Research Pack.

Want to share a snippet of code? Try using gists.

Want your code to be tested automatically when it arrives on GitHub? Try linking it up with Travis.

And want to know what percentage of your code is covered by tests? Then try Coveralls.

## Publishing code
*How to ensure your software is accessible*

> Integrity of research depends on transparency and reproducibility

Quote by Alice Allen

#### Software Citation Principles
* Importance | Software is as important as a paper
* Credit and attribution | Software should be cited
* Unique identification | Identifiers should be globally unique
* Persistence | The identifiers have to persist
* Accessibility | The code, data, etc. should be available
* Specificity | Cite the specific version of the software

* Astrophysics Source Code Library (ASCL, ascl.net) | A place to put software

What do you need to do?
* Release your code
* Specify how you want your code to be cited
* License your code
* Register your code
* Archive your code