# Running Python code

## Interactively (demo)

To run Python code interactively, one can use the standard Python prompt, which can be launched by typing ``python`` in your standard shell:

    $ python
    Python 3.4.1 (default, May 21 2014, 21:17:51) 
    [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

The ``>>>`` indicates that Python is ready to accept commands. If you type ``a = 1`` then press enter, this will assign the value ``1`` to ``a``. If you then type ``a`` you will see the value of ``a`` (in interactive mode, this is equivalent to ``print(a)``):

    >>> a = 1
    >>> a
    1

The Python shell can execute any Python code, even multi-line statements, though it is often more convenient to use Python non-interactively for such cases.

The default Python shell is limited, and in practice, you will want instead to use the IPython (or interactive Python) shell. This is an add-on package that adds many features to the default Python shell, including the ability to edit and navigate the history of previous commands, as well as the ability to tab-complete variable and function names. To start up IPython, type:

    $ ipython
    Python 3.4.1 (default, May 21 2014, 21:17:51) 
    Type "copyright", "credits" or "license" for more information.

    IPython 2.1.0 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]:

The first time you start up IPython, it will display a message which you can skip over by pressing ``ENTER``. The ``>>>`` symbols are now replaced by ``In [x]``, and output, when present, is prefixed with ``Out[x]``. If we now type the same commands as before, we get:

    In [1]: a = 1

    In [2]: a
    Out[2]: 1

If you now type the up arrow twice, you will get back to ``a = 1``.

## Running scripts (demo)

While the interactive Python mode is very useful for exploring and trying out code, you will eventually want to write a script to record and reproduce what you did, or to do things that are too complex to type in interactively (defining functions, classes, etc.). To write a Python script, just use your favorite code editor to put the code in a file with a ``.py`` extension. For example, we can create a file called ``test.py`` containing:

    a = 1
    print(a)

On Linux machines, you can use for example the ``emacs`` editor which you can open by typing:
    
    emacs &
    
(ignore the warnings that it prints to the terminal).

We can then run the script on the command-line with:

    $ python test.py
    1

Note: The call to ``print`` is necessary, because typing ``a`` on its own only prints the value in interactive mode. In scripts, printing has to be requested explicitly with the ``print`` function. To print multiple values, pass them to ``print`` separated by commas:

    print(a, 1.5, "spam")

## Combining interactive and non-interactive use (demo)

It can sometimes be useful to run a script to set things up, and to continue in interactive mode. This can be done using the ``%run`` IPython command to run the script, which then gets executed. The IPython session then has access to the last state of the variables from the script:

    $ ipython
    Python 3.4.1 (default, May 21 2014, 21:17:51) 
    Type "copyright", "credits" or "license" for more information.

    IPython 2.1.0 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]: %run test.py
    1

    In [2]: a + 1
    Out[2]: 2

# Using the Jupyter notebook

The Jupyter *notebook* is a browser-based application that allows you to write notebooks. The advantage of doing this is that you can include text, code, and plots in the same document.

## Starting up

The normal way to start up the Jupyter notebook is:

    jupyter notebook
    
Once you do this, your web browser should open and go to a page showing a list of folders.

## First steps

At first glance, a notebook looks like a fairly typical application - it has a menubar (File, Edit, View, etc.) and a tool bar with icons. Below this, you will see an empty cell, in which you can type any Python code. You can write several lines of code, and once it is ready to run, you can **press shift-enter** and it will get executed:

In [None]:
a = 1
print(a)

You can then click on that cell, change the Python code, and press shift-enter again to re-execute the code. Once you have executed a cell once, a new cell will appear below. You can again enter some code, then press shift-enter to execute it.

## Plotting

To make plots, enter any Matplotlib commands (see later lectures), and just press shift-enter. Note that all commands for a plot should be entered in one cell; you cannot split them up over multiple cells:

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([1,2,3],[4,5,6])
plt.xlabel("x")
plt.ylabel("y")

As before, you can always go back and edit the cell to re-make the plot. If you want to save it, make sure you include ``plt.savefig(filename)`` as the last command, where ``filename`` is the name of the plot, such as ``my_plot.png``.

## Text

It is likely that you will want to enter actual text (non-code) in the notebook. To do this, click on a cell, and in the drop-down menu in the toolbar, select 'Markdown'. This is a specific type of syntax for writing text. You can just write text normally and press shift-enter to *render* it:

    This is some plain text

To edit it, double click on the cell. You can also enter section headings using the following syntax:

    This is a title
    ===============

    This is a sub-title
    -------------------

which will look like:

This is a title
===============

This is a sub-title
-------------------

Finally, if you are familiar with LaTeX, you can enter equations using:

    $$E = m c^2$$

on a separate line, or:

    The equation $p=h/\lambda$ is very important

to include it in a sentence. This will look like:

$$E = m c^2$$

The equation $p=h/\lambda$ is very important

For more information about using LaTeX for equations, see [this guide](http://en.wikibooks.org/wiki/LaTeX/Mathematics).

## Splitting/deleting/moving cells

You can split, delete, and move cells by going to 'Edit' and selecting the appropriate command. Some of the commands are also available in the toolbar - put your mouse over the icon and wait for a second, and it will tell you what it does.

## Important notes

A few important notes about using the notebook:

* Save often! There is an auto-save in the notebook, but better to also save explicitly from time to time.

* Code *can* be executed in an order different from top to bottom, but note that if you do this variables will not be reset. So for example if you type:

In [None]:
a = 1

then do:

In [None]:
print(a)

followed by:

In [None]:
a = 2

and now go back to execute the ``print`` cell above, you will now see the result is ``2``.

To make sure that your code works from top to bottom, go to the 'Cell' menu item and go to **All Output** -> **Clear** then in the **Cell** menu, select **Run All**.

In addition, even if you remove a cell, then variables set in that cell still exist unless you restart the notebook. If you want to restart a notebook, you can select **Kernel** -> **Restart & Clear Output**. This removes any variables from memory, and you have to start running the notebook from the start.

# Numbers, Strings, and Lists

Python supports a number of built-in types and operations. This section covers the most common types, but information about additional types is available [here](https://docs.python.org/3/library/stdtypes.html).

## Basic numeric types

The basic numeric data types are similar to those found in other languages, including:

**Integers (``int``)**

In [None]:
x = 1
y = 219089
z = -21231

In [None]:
print(x, y, z)

**Floating point values (``float``)**

In [None]:
a = 4.3
b = -5.2111222
c = 3.1e33

In [None]:
print(a, b, c)

**Complex values (``complex``)**

In [None]:
d = 4 - 1j

In [None]:
print(d)

Manipulating these behaves the way you would expect: an operation (``+``, ``-``, ``*``, ``**``, etc.) on two values of the same type produces another value of the same type (with one exception, ``/``; see below), while an operation on two values with different types produces a value of the more 'advanced' type:

Adding two integers gives an integer:

In [None]:
1 + 3

Multiplying two floats gives a float:

In [None]:
3. * 2.

Subtracting two complex numbers gives a complex number:

In [None]:
(2 + 4j) - (1 + 6j)

Multiplying an integer with a float gives a float:

In [None]:
3 * 9.2

Multiplying a float with a complex number gives a complex number:

In [None]:
2. * (-1 + 3j)

Multiplying an integer and a complex number gives a complex number:

In [None]:
8 * (-3.3 + 1j)
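
The promotion rules shown in the cells above can also be checked directly with ``type``. This is only a small sketch; the variable names are chosen here for illustration and do not appear elsewhere in the notebook:

```python
# Check the type produced by each kind of operation.
int_sum = 1 + 3                    # int + int
float_product = 3 * 9.2            # int * float
complex_product = 2. * (-1 + 3j)   # float * complex

print(type(int_sum))
print(type(float_product))
print(type(complex_product))
```

In each mixed case the result takes the more general of the two types involved.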

However, we now get to a special case. If you are using Python 2, the following divisions return an integer; if you are using Python 3, they return a floating-point value:

In [None]:
7/4

In [None]:
3/2

In Python 2, this returns ``1`` because the result is rounded down to an integer. However, this has been recognized to be risky, and therefore in Python 3 the behavior has been changed so that ``/`` always returns a floating-point value.

If you ever need to work with Python 2 code, the safest approach is to add the following line at the top of the script:

    from __future__ import division
    
and the division will then behave like a Python 3 division. Note that in Python 3 you can also specifically request integer division using ``3 // 2``.
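
As a quick illustration of the difference between the two division operators, here is a minimal sketch assuming Python 3 (the variable names are illustrative only):

```python
# True division always gives a float in Python 3.
true_quotient = 7 / 4

# Floor division discards the fractional part (rounding down).
floor_quotient = 7 // 4

# The remainder operator complements floor division.
remainder = 7 % 4

print(true_quotient, floor_quotient, remainder)
```

Note that ``//`` rounds down rather than towards zero, so ``-7 // 4`` gives ``-2``.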

## Exercise 1

The operator for raising one value to the power of another is ``**``. Try calculating $4^3$, $2+3.4^2$, and $(1 + i)^2$. What is the type of the output in each case, and does it make sense?

In [None]:
print(4**3)
print(2+3.4**2)
# enter your solution here

## Strings

Strings (``str``) are sequences of characters:

In [None]:
s = "Xpam egg spam spay'"

In [None]:
s[0],s[5],s[17]

You can use either single quotes (``'``), double quotes (``"``), or triple quotes (``'''`` or ``"""``) to enclose a string (the last one is used for multi-line strings). To include single or double quotes inside a string, you can either use the opposite quote to enclose the string:


In [None]:
"I'm"

In [None]:
'"hello"'

or you can *escape* them:

In [None]:
'I\'m'

In [None]:
"\"hello\""

You can access individual characters or chunks of characters using the item notation with square brackets ``[]``:

In [None]:
s[5]

Note that in Python, indexing is *zero-based*, which means that the first element of a string or list has index zero:

In [None]:
s[0]

Note that strings are **immutable**, that is you cannot change the value of certain characters without creating a new string (the following raises a ``TypeError``):

In [None]:
s[0] = 'X'

You can easily find the length of a string:

In [None]:
len(s)

You can use the ``+`` operator to combine strings:

In [None]:
"hello," + " " + "world!"

Finally, strings have many **methods** associated with them, here are a few examples:

In [None]:
s.upper()  # An uppercase version of the string

In [None]:
s.index('pay')  # An integer giving the position of the sub-string

In [None]:
s.split()  # A list of strings

In [None]:
g="KathSa rahRuby"

In [None]:
g.split()

## Lists

There are several ways of storing sequences in Python, the simplest being the ``list``, which is simply a sequence of *any* Python objects.

In [None]:
li = [4, 5.5, "spam"]
list_1=[1,2,3,4]
print(li)
print(list_1)
list_1[3]

Accessing individual items is done like for strings:

In [None]:
li[0]

In [None]:
li[1]

In [None]:
li[2]

Values in a list can be changed, and it is also possible to append or insert elements:

In [None]:
li[1] = -2.2

In [None]:
li

In [None]:
li.append(-3)

In [None]:
li

In [None]:
li.insert(1, 3.14)

In [None]:
li

Similarly to strings, you can find the length of a list (the number of elements) with the ``len`` function:

In [None]:
len([1,2,3,4,5])

## Slicing

We already mentioned above that it is possible to access individual elements from a string or a list using the square bracket notation. You will also find this notation for other object types in Python, for example tuples or Numpy arrays, so it's worth spending a bit of time looking at this in more detail.

In addition to using positive integers, where ``0`` is the first item, it is possible to access list items with *negative* indices, which count from the end: ``-1`` is the last element, ``-2`` is the second to last, etc.:

In [None]:
li = [4, 67, 8, 2, 9, 6]
x = li[-2], li[-1], li[0]
print(x)

In [None]:
li[-1]

You can also select **slices** from a list with the ``start:end:step`` syntax. Be aware that the element at index ``end`` is *not* included!

In [None]:
li[0:2]

In [None]:
jane = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(jane[2:])
print(jane[4:8])  # elements at indices 4 to 7
li[:2]  # ``start`` defaults to zero

In [None]:
li[2:]  # ``end`` defaults to the end of the list

In [None]:
print(jane)
print(jane[3::3])  # specify a step size
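
To summarize the slicing rules in one place, here is a short sketch. It assumes only built-in behavior; the names ``numbers`` and ``word`` are made up for this example:

```python
numbers = [4, 67, 8, 2, 9, 6]

first_two = numbers[:2]        # start defaults to 0, index 2 is excluded
every_other = numbers[::2]     # step of 2 over the whole list
last_two = numbers[-2:]        # negative start counts from the end
reversed_copy = numbers[::-1]  # a negative step walks backwards

# The same notation works on strings.
word = "spam"
middle = word[1:3]

print(first_two, every_other, last_two, reversed_copy, middle)
```

Slicing always returns a new object, so ``numbers`` itself is left unchanged.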

## Exercise 2

Given a string such as the one below, make a new string that does not contain the word ``egg``:

In [None]:
a = "Hello, egg world!"

# enter your solution here
b = a.split()
print(b[0])
print(b[2])
c = b[0] + " " + b[2]
print(c)

Try changing the string above to see if your solution works (you can assume that ``egg`` appears only once in the string).

## A note on Python objects (demo)

Most things in Python are objects.  But what is an object?

Every constant, variable, or function in Python is actually an object with a
type and associated attributes and methods. An *attribute* is a property of the
object that you get or set by giving the ``<object_name>.<attribute_name>``, for example ``img.shape``. A *method* is a function that the object provides, for example ``img.argmax(axis=0)`` or ``img.min()``.
    
Use tab completion in IPython to inspect objects and start to understand
attributes and methods. To start off create a list of 4 numbers:

In [None]:
li = [3, 1, 2, 1]
li.append(20)
print(li)

then in the next cell, type ``li.`` then press ``<TAB>``:

This will show the available attributes and methods for the Python list
``li``.

**Using ``<TAB>``-completion and help is a very efficient way to learn and later
remember object methods!**

    In [2]: li.
    li.append   li.copy     li.extend   li.insert   li.remove   li.sort
    li.clear    li.count    li.index    li.pop      li.reverse 
    
If you want to know what a function or method does, you can use a question mark ``?``:
    
    In [9]: li.append?
    Type:       builtin_function_or_method
    String Form:<built-in method append of list object at 0x1027210e0>
    Docstring:  L.append(object) -> None -- append object to end

## Exercise 3

In the following string, find out (with code) how many times the letter "A" appears.

In [None]:
s = "CAGTACCAAGTGAAAGAT"

In [None]:
# your solution here

Given two lists, try making a new list that contains the elements from both previous lists:

In [None]:
a = [1, 2, 3]
b = [4, 5, 6]
print(b[1]+a[2], b[0]-a[1], a[2]*b[0])

In [None]:
# your solution here

Note that there are several possible solutions!

## Dynamic typing

One final note on Python types - unlike many other programming languages where types have to be declared for variables, Python is *dynamically typed* which means that variables aren't assigned a specific type:

In [None]:
a = 1
type(a)

In [None]:
a = 2.3
type(a)

In [None]:
a = 'hello'
type(a)
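
The same idea can be shown in a single cell by recording the type each time the name is rebound. This is just an illustrative sketch; ``types_seen`` is a name invented for this example:

```python
a = 1
types_seen = [type(a).__name__]

a = 2.3                      # the same name now refers to a float
types_seen.append(type(a).__name__)

a = 'hello'                  # ... and now to a string
types_seen.append(type(a).__name__)

print(types_seen)
```

The type belongs to the value, not to the name ``a``, which is why no declaration is needed.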

## Converting between types

There may be cases where you want to convert a string to a floating point value, an integer to a string, etc. For this, you can simply use the ``int()``, ``float()``, and ``str()`` functions:

In [None]:
int('1')

In [None]:
float('4.31')

For example:

In [None]:
int('5') + float('4.31')

is different from:

In [None]:
'5' + '4.31'

Similarly:

In [None]:
str(1)

In [None]:
str(4.5521)

In [None]:
str(3) + str(4)

Be aware of this, for example, when concatenating strings with numbers, as you can only concatenate identical types this way. The following raises a ``TypeError``:

In [None]:
'The value is ' + 3

Instead do:

In [None]:
'The value is ' + str(3)
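
The failing and working versions can be put side by side in one cell. The sketch below uses ``try``/``except``, which has not been introduced in this section, only so that the cell keeps running after the error; the names ``value`` and ``message`` are illustrative:

```python
value = 3

try:
    # Mixing str and int with + is not allowed.
    message = 'The value is ' + value
except TypeError as error:
    print("Concatenation failed:", error)
    # Convert the number explicitly before concatenating.
    message = 'The value is ' + str(value)

print(message)
```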

## Rounding floating point numbers to integers

By default, ``int`` will truncate floating point values towards zero, which for positive values means rounding **down**:

In [None]:
int(14.99)

If you want to round to the nearest integer, you can instead use ``round`` or ``np.round``:

In [None]:
round(14.9)

In Python 2, ``round(14.9)`` returns ``15.0`` so to be safe, you should do:

In [None]:
int(round(14.9))
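
One detail worth checking is how these functions treat negative values: ``int`` moves towards zero, which is not the same as rounding down. The sketch below compares it with ``math.floor`` from the standard library (modules are covered later); the variable names are illustrative:

```python
import math

truncated_positive = int(14.99)       # fractional part dropped
truncated_negative = int(-14.99)      # moves towards zero, not down
floored_negative = math.floor(-14.99) # always rounds down
nearest = int(round(14.9))            # nearest integer, as an int

print(truncated_positive, truncated_negative, floored_negative, nearest)
```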

# Booleans, Tuples, and Dictionaries

## Booleans

A ``boolean`` is one of the simplest Python types, and it can have two values: ``True`` and ``False`` (with uppercase ``T`` and ``F``):

In [None]:
a = True
b = False

Booleans can be combined with logical operators to give other booleans:

In [None]:
True and False

In [None]:
True or False

In [None]:
(False and (True or False)) or (False and True)

Standard comparison operators can also produce booleans:

In [None]:
1 == 3

In [None]:
1 != 3

In [None]:
3 > 2

In [None]:
3 <= 3.4

## Exercise 1

Write an expression that returns ``True`` if ``x`` is strictly greater than 3.4 and smaller or equal to 6.6, or if it is 2, and try changing ``x`` to see if it works:

In [None]:
x = 2
(x>3.4 and x<=6.6) or x==2
# your solution here


## Tuples

Tuples are, like lists, a type of sequence, but they use round parentheses rather than square brackets:

In [None]:
t = (1, 2, 3)
T = [1, 2, 3]
print(t)
print(T)

In [None]:
T[1] = 4  # the list can be modified in place
print(t)
print(T)

They can contain heterogeneous types like lists:

In [None]:
t = (1, 2.3, 'spam')

and also support item access and slicing like lists:

In [None]:
t[1]

In [None]:
t[:2]

The main difference between lists and tuples is that tuples are **immutable** (like strings), while lists are mutable:

In [None]:
t[1] = 2

We will not go into the details right now of why this is useful, but you should know that these exist as you may encounter them in examples.
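
As a small sketch of what immutability means in practice, the cell below attempts the assignment, catches the resulting ``TypeError`` (``try``/``except`` is used here only to keep the cell running), and then builds a new tuple instead. The names ``assignment_worked`` and ``t_new`` are invented for this example:

```python
t = (1, 2.3, 'spam')

try:
    t[1] = 2                    # item assignment is not allowed on tuples
    assignment_worked = True
except TypeError as error:
    assignment_worked = False
    print("Could not modify the tuple:", error)

# Build a new tuple from slices of the old one instead.
t_new = t[:1] + (2,) + t[2:]
print(t_new)
```

The original tuple ``t`` is untouched; ``t_new`` is a separate object.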

## Dictionaries

One of the data types that we have not talked about yet is called *dictionaries* (``dict``). If you think about what a 'real' dictionary is, it is a list of words, and for each word is a definition. Similarly, in Python, we can assign definitions (or 'values') to words (or 'keys').

Dictionaries are defined using curly brackets ``{}``:

In [None]:
d = {'a':1, 'b':2, 'c':3}

Items are accessed using square brackets and the 'key':

In [None]:
d['a']

In [None]:
d['c']

Values can also be set this way:

In [None]:
d['r'] = 2.2

In [None]:
print(d)

The keys don't have to be strings; they can be many (but not all) Python objects, as long as the object is hashable (numbers and strings are, for example, but lists are not):

In [None]:
e = {}
e['a_string'] = 3.3
e[3445] = 2.2
e[complex(2,1)] = 'value'

In [None]:
print(e)

In [None]:
e[3445]

If you try and access an element that does not exist, you will get a ``KeyError``:

In [None]:
e[4]

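Also, note that dictionaries are not sequences: you cannot ask for the 'first' or 'last' element by position. (Before Python 3.7 dictionaries did not keep any particular order; from Python 3.7 onwards they remember the order in which keys were inserted.)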

It is easy to check if a specific key is in a dictionary, using the ``in`` operator:

In [None]:
"a" in d

In [None]:
"t" in d

Note that this also works for lists:

In [None]:
3 in [1,2,3]
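
One point that is easy to miss is that, for dictionaries, ``in`` looks at the keys only. The following sketch makes this explicit; it uses the dictionary method ``values()``, and the variable names are illustrative:

```python
d = {'a': 1, 'b': 2, 'c': 3}

key_present = 'a' in d             # 'a' is a key
value_as_key = 1 in d              # 1 is a value, not a key
value_present = 1 in d.values()    # search the values explicitly
key_absent = 't' not in d          # ``not in`` is the opposite test

print(key_present, value_as_key, value_present, key_absent)
```

Checking with ``in`` before accessing ``d[key]`` is one way to avoid a ``KeyError``.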

## Exercise 2

Try making a dictionary to map between a few words in two different languages:

In [None]:

# your solution here


# If statements and loops

We now know how to set variables of various types:

In [None]:
a = 1
b = 3.14
c = 'hello'
d = [a, b, c]

but this doesn't get us very far. Essential parts of programming are **if statements** and **loops**, which allow us to control how the program proceeds, for example by running code only under certain conditions or by running parts of the program multiple times.

## ``if`` statements

The simplest form of control flow is the ``if`` statement, which executes a block of code only if a certain condition is true (and optionally executes code if it is *not* true). The basic syntax for an if-statement is the following:

    if condition:
        # do something
    elif condition:
        # do something else
    else:
        # do yet something else

Notice that there is no statement to end the if statement, and the
presence of a colon (``:``) after each control flow statement. Python relies
on **indentation and colons** to determine whether it is in a specific block of
code.

For example, in the following code:

In [None]:
a = 1

if a == 1:
    print("a is 1, changing to 2")
    a = 2
elif a > 1:
    print("a is greater than 1")

print(a)
print("I am done.")

The first ``print`` call and the ``a = 2`` assignment only get executed if
``a`` is 1. On the other hand, ``print("I am done.")`` gets executed regardless,
once Python exits the if statement.

**Indentation is very important in Python, and the convention is to use four spaces (not tabs) for each level of indent.**

Back to the if-statements: the conditions in the statements can be anything that returns a boolean value. For example, ``a == 1``, ``b != 4``, and ``c <= 5`` are valid conditions because they return either ``True`` or ``False`` depending on whether the statements are true or not.

Standard comparisons can be used (``==`` for equal, ``!=`` for not equal, ``<=`` for less or equal, ``>=`` for greater or equal, ``<`` for less than, and ``>`` for greater than), as well as logical operators (``and``, ``or``, ``not``). Parentheses can be used to isolate different parts of conditions, to make clear in what order the comparisons should be executed, for example:

    if (a == 1 and b <= 3) or c > 3:
        # do something

More generally, any function or expression that ultimately returns ``True`` or ``False`` can be used.
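
To see why the parentheses matter, here is a small sketch with values picked so that the two groupings disagree. The values and variable names are chosen purely for illustration:

```python
# Values chosen so that the two groupings give different answers.
a, b, c = 2, 5, 4

grouped_left = (a == 1 and b <= 3) or c > 3    # (False and False) or True
grouped_right = a == 1 and (b <= 3 or c > 3)   # False and (False or True)
ungrouped = a == 1 and b <= 3 or c > 3         # ``and`` binds tighter than ``or``

print(grouped_left, grouped_right, ungrouped)
```

Without parentheses Python evaluates ``and`` before ``or``, so the ungrouped form matches the first grouping. Writing the parentheses anyway makes the intent clear to the reader.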

## ``for`` loops

Another common structure that is important for controlling the flow of execution is the loop. Loops can be used to execute a block of code multiple times. The most common type of loop is the ``for`` loop. In its most basic form, it
is straightforward:

    for value in iterable:
        # do things

The ``iterable`` can be any Python object that can be iterated over. This
includes lists or strings.

In [None]:
for a in [3, 1.2, 'a']:
    print(a)

In [None]:
for letter in 'akwaaba':
    print(letter)

A common type of for loop is one where the value should go between two integers with a specific step size. To do this, we can use the ``range`` function. If given a single value, it will allow you to iterate from 0 to the value minus 1:

In [None]:
for i in range(11):
    print(i)

In [None]:
for i in range(1, 12):
    print(i)

In [None]:
for i in range(2, 20, 2):  # the third entry specifies the "step size"
    print(i)

If you try iterating over a dictionary, it will iterate over the **keys** (not the values). Before Python 3.7 the order is arbitrary; from Python 3.7 onwards the keys come out in insertion order:

In [None]:
d = {'a':1, 'b':2, 'c':3}
for key in d:
    print(key)

But you can easily get the value with:

In [None]:
for key in d:
    print(key, d[key])
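
An alternative to looking up ``d[key]`` inside the loop is the dictionary method ``items()``, which yields each key together with its value. This is a minimal sketch; the list ``pairs`` is introduced here only to collect the results:

```python
d = {'a': 1, 'b': 2, 'c': 3}

pairs = []
for key, value in d.items():
    # Each iteration unpacks one (key, value) pair.
    pairs.append((key, value))

# Sort before printing so the output does not depend on dictionary order.
print(sorted(pairs))
```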

## Building programs

These different control flow structures can be combined to form programs. For example, the following program will print out a different message depending on whether the current number in the loop is less, equal to, or greater than 10:

In [None]:
for value in [2, 55, 4, 5, 12, 8, 9, 10, 22]:
    if value > 10:
        print("Value is greater than 10 (" + str(value) + ")")
    elif value == 10:
        print("Value is exactly 10")
    else:
        print("Value is less than 10 (" + str(value) + ")")
print('I am done.')

## Exercise 1

Write a program that will print out all the prime numbers (numbers divisible only by one and themselves) below 1000.

Hint: the ``%`` operator can be used to find the remainder of the division of an integer by another:

In [None]:
20 % 3

In [None]:

# enter your solution here


## Exiting or continuing a loop

There are two useful statements that can be called in a loop - ``break`` and ``continue``. When called, ``break`` will exit the loop it is currently in:

In [None]:
for i in range(10):
    print(i)
    if i == 3:
        break

The other is ``continue``, which will ignore the rest of the loop and go straight to the next iteration:

In [None]:
for i in range(10):
    if i == 2 or i == 8:
        continue
    print(i)

## Exercise 2

When checking if a value is prime, as soon as you have found that the value is divisible by a single value, the value is therefore not prime and there is no need to continue checking whether it is divisible by other values. Copy your solution from above and modify it to break out of the loop once this is the case.

In [None]:

# enter your solution here


## ``while`` loops

Similarly to other programming languages, Python also provides a ``while`` loop which is similar to a ``for`` loop, but where the number of iterations is defined by a condition rather than an iterator:

    while condition:
        # do something

For example, in the following code:

In [None]:
a = 1
while a < 10:
    print(a)
    a = a * 1.5
print("Once the while loop has completed, a has the value", a)

the loop is executed until ``a`` is equal to or exceeds 10.
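
Because the number of iterations of a ``while`` loop is not fixed in advance, it can be useful to count them. The sketch below repeats the loop above with a counter added; ``iterations`` is a name introduced only for this example:

```python
a = 1
iterations = 0

while a < 10:
    iterations += 1   # count how many times the body runs
    a = a * 1.5

print(iterations, a)
```

The condition is checked before every pass, so the loop stops the first time ``a`` is no longer below 10.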

## Exercise 3

Write a program (using a ``while`` loop) that will find the Fibonacci sequence up to (and excluding) 100000. The two first numbers are 0 and 1, and every subsequent number is the sum of the two previous ones, so the sequence starts ``[0, 1, 1, 2, 3, 5, ...]``.

Optional: Store the sequence inside a Python list, and only print out the whole list to the screen once all the numbers are available. Then, check whether any of the numbers in the sequence are a square (e.g. ``0*0``, ``1*1``, ``2*2``, ``3*3``, ``4*4``) and print out those that are.

In [None]:

# enter your solution here


# Functions

## Syntax

You now know how to run Python code, assign variables, and write control flow statements, which allows us to write programs that can do calculations. In fact, this is all you *really* need to write programs (except for being able to read in and write out data which we will talk about later). However, with only this, programs will quickly become very long and unreadable. So one very important rule in programming is to **avoid repetition**.

The syntax for a **function** is:
    
    def function_name(arguments):
        # code here
        return values

Functions are the **building blocks** of programs - think of them as basic units that are given a certain input and accomplish a certain task. Over time, you can build up more complex programs while preserving readability.

Similarly to ``if`` statements and ``for`` and ``while`` loops, indentation is very important because it shows where the function starts and ends.

**Note**: it is a common convention to always use lowercase names for functions.

A function can take multiple arguments...

In [None]:
def add(a, b):
    return a + b

print(add(1,3))
print(add(1.,3.2))
print(add(4,3.))

... and can also return multiple values:

In [None]:
def double_and_halve(value):
    return value * 2., value / 2.

print(double_and_halve(5.))

If multiple values are returned, you can store them in separate variables.

In [None]:
d, h = double_and_halve(5.)

In [None]:
print(d)

In [None]:
print(h)

Functions can call other functions:

In [None]:
def do_a():
    print("doing A")
    
def do_b():
    print("doing B")
    
def do_a_and_b():
    do_a()
    do_b()

In [None]:
do_a_and_b()

**Just because you can put code in functions doesn't mean you always should**. Only use functions to avoid repeating code, or if it makes the program clearer. It's best to try and break up the code into units that make sense - in the end, your function should ideally have a name that everyone can understand.

## Exercise 1

Copy your code that finds prime numbers here and modify it so as to make it a function that given a number will return ``True`` or ``False`` depending on whether it is prime.

In [None]:

# your solution here


## Exercise 2

Try and write a function that will return the factorial of a number (e.g. ``5!=5*4*3*2*1``). First you can try and write a function that uses a loop internally.

It is possible for functions to call themselves (**recursive** functions), so see if you can write a function that uses **no** loops!

In [None]:

# enter your solution here


## Optional Arguments

In addition to normal arguments, functions can take **optional arguments** (also called **keyword arguments**) that can default to a certain value. For example, in the following case:

In [None]:
def say_hello(first_name, middle_name='', last_name=''):
    print("First name: " + first_name)
    if middle_name != '':
        print("Middle name: " + middle_name)
    if last_name != '':
        print("Last name: " + last_name)

we can call the function either with one argument:

In [None]:
say_hello("Michael")

or we can also give one or both optional arguments (and the optional arguments can be given in any order):

In [None]:
say_hello("Michael", last_name="Palin")

In [None]:
say_hello("Michael", middle_name="Edward", last_name="Palin")

In [None]:
say_hello("Michael", last_name="Palin", middle_name="Edward")

## Built-in functions

Some of you may have already noticed that there are a few functions that are defined by default in Python:

In [None]:
x = [1,3,6,8,3]

In [None]:
len(x)

In [None]:
sum(x)

In [None]:
int(1.2)

A full list of built-in functions is available [here](http://docs.python.org/3/library/functions.html). Note that there are not *that* many - these are only the most common functions. Most functions are in fact kept inside **modules**, which we will cover later.

## Exercise 3

Write a function that takes a list, and returns the mean of the values. Test it with the following list:

In [None]:
l = [1, 3, 4, 5, 6, 7]

# enter your solution here

## Arbitrary number of arguments in functions

In some cases, you may want to write a function that can accept an arbitrary number of positional or keyword arguments. To do this, you can use the ``*`` and ``**`` notation respectively:

In [None]:
def show_args(*args):
    """
    This function can take any number of positional
    arguments, and args is collected into a tuple.
    """
    print(args)

In [None]:
show_args(1, 4, 'a')

In [None]:
def show_kwargs(**kwargs):
    """
    This function can take any number of keyword
    arguments, and kwargs is converted to a dict.
    """
    print(kwargs)

In [None]:
show_kwargs(a=1, b='3')

These can be combined:

In [None]:
def show_allargs(*args, **kwargs):
    """
    This function can take any number of positional
    and keyword arguments.
    """
    print('args:', args)
    print('kwargs:', kwargs)

In [None]:
show_allargs(3, 5, 'a', a='hello', j=3)

This syntax can also be used when calling functions, to expand a normal Python list into multiple arguments:

In [None]:
def show_abc(a, b, c):
    print('a=', a)
    print('b=', b)
    print('c=', c)

In [None]:
values = [1, 2, 3]

In [None]:
show_abc(*values)

Likewise, a dictionary can be expanded into keyword arguments:

In [None]:
def show_def(d=None, e=None, f=None):
    print('d=', d)
    print('e=', e)
    print('f=', f)

In [None]:
values = {'f':3, 'd':1}

In [None]:
show_def(**values)

# Modules

One of the strengths of Python is that there are many built-in add-ons - or
*modules* - which contain existing functions, classes, and variables which allow you to do complex tasks in only a few lines of code. In addition, there are many other third-party modules (e.g. Numpy, Scipy, Matplotlib) that can be installed, and you can also develop your own modules that include functionalities you commonly use.

The built-in modules are referred to as the *Standard Library*, and you can
find a full list of the available functionality in the [Python Documentation](http://docs.python.org/3/library/index.html).

To use modules in your Python session or script, you need to **import** them. The
following example shows how to import the built-in ``math`` module, which
contains a number of useful mathematical functions:

In [None]:
import math

In [None]:
math.sin(3.2)

You can then access functions and other objects in the module with ``math.<function>``, for example:

In [None]:
a = math.sin(2.3)
b = math.cos(3)
print(a / b)

In [None]:
from math import factorial
print(factorial(3))

In [None]:
math.pi

Because these modules exist, if what you want to do is very common, a function for it probably already exists, and you won't need to write it yourself (which also makes your code easier to read).

For example, the ``numpy`` module, which we will talk about soon, contains useful functions for finding e.g. the mean, median, and standard deviation of a sequence of numbers:

In [None]:
import numpy as np

In [None]:
li = [1,2,7,3,1,3]

np.mean(li)

In [None]:
np.median(li)

In [None]:
np.std(li)

Notice that in the above case, we used:

    import numpy as np
    
instead of:

    import numpy
    
which shows that we can rename the module so that it's not as long to type in the program.

Finally, it's also possible to simply import the functions needed directly:

In [None]:
from math import sin, cos
print(sin(3.4))
print(cos(3.4))


You may find examples on the internet that use e.g.

    from module import *
    
but this is **not** recommended, because it makes programs harder to debug: neither a reader nor tools that analyze the source code can tell which names have been imported, or from which module.
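
A further practical problem is that two modules can define the same name, in which case a second ``import *`` silently replaces what the first one imported. As a small sketch of how this can happen, the standard ``math`` and ``cmath`` modules share many function names (the variable names below are only illustrative):

```python
import math
import cmath

# Public names defined by each module
math_names = {name for name in dir(math) if not name.startswith('_')}
cmath_names = {name for name in dir(cmath) if not name.startswith('_')}

# Names that 'from math import *' followed by 'from cmath import *'
# would define twice, with the cmath version winning
shared = sorted(math_names & cmath_names)
print(shared)

# The two versions do not behave the same way
print(math.sqrt(4))    # a float
print(cmath.sqrt(4))   # a complex number
```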

## Where to find modules and functions

How do you know which modules exist in the first place? The Python documentation contains a [list of modules in the Standard Library](http://docs.python.org/3/library), but you can also simply search the web. Once you have a module that you think should contain the right kind of function, you can either look at the documentation for that module, or you can use the tab-completion in IPython:
    
    In [2]: math.<TAB>
    math.acos       math.degrees    math.fsum       math.pi
    math.acosh      math.e          math.gamma      math.pow
    math.asin       math.erf        math.hypot      math.radians
    math.asinh      math.erfc       math.isinf      math.sin
    math.atan       math.exp        math.isnan      math.sinh
    math.atan2      math.expm1      math.ldexp      math.sqrt
    math.atanh      math.fabs       math.lgamma     math.tan
    math.ceil       math.factorial  math.log        math.tanh
    math.copysign   math.floor      math.log10      math.trunc
    math.cos        math.fmod       math.log1p      
    math.cosh       math.frexp      math.modf    
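
Outside IPython, a similar list can be obtained with the built-in ``dir`` function, and the documentation of a single function is available through ``help`` or its ``__doc__`` attribute. A minimal sketch:

```python
import math

# All public names in the math module (names starting with '_' are skipped)
public_names = [name for name in dir(math) if not name.startswith('_')]
print(public_names)

# The docstring of one function, similar to what help(math.factorial) shows
print(math.factorial.__doc__)
```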

## Exercise 1

Does the ``math.cos`` function take radians or degrees? Are there functions that can convert between radians and degrees? Use these to find the cosine of 60 degrees, and the sine of pi/6 radians.

In [None]:
import math
print(math.cos(math.radians(60)))  # cosine of 60 degrees
print(math.sin(math.pi / 6))       # sine of pi/6 radians
# enter your solution here


# Introduction to Numpy

Python lists:

* are very flexible
* don't require uniform numerical types
* are very easy to modify (inserting or appending objects).

However, flexibility often comes at the cost of performance, and lists are not the ideal object for numerical calculations.

This is where **Numpy** comes in. Numpy is a Python module that defines a powerful n-dimensional array object that uses C and Fortran code behind the scenes to provide high performance.

The downside of Numpy arrays is that they have a more rigid structure, and require a single numerical type (e.g. floating point values), but for a lot of scientific work, this is exactly what is needed.

The Numpy module is imported with:

In [None]:
import numpy

Although in the rest of this course, and in many packages, the following convention is used:

In [None]:
import numpy as np

This is because Numpy is so often used that it is shorter to type ``np`` than ``numpy``.

## Creating Numpy arrays

The easiest way to create an array is from a Python list, using the ``array`` function:

In [None]:
a = np.array([10, 20, 30, 40])

In [None]:
a

Numpy arrays have several attributes that give useful information about the array:

In [None]:
a.ndim  # number of dimensions

In [None]:
a.shape  # shape of the array

In [None]:
a.dtype  # numerical type

*Note: Numpy arrays actually support more than just one integer type and one floating point type - they support signed and unsigned 8-, 16-, 32-, and 64-bit integers, and 16-, 32-, and 64-bit floating point values.*

There are several other ways to create arrays. For example, there is an ``arange`` function that can be used similarly to the built-in Python ``range`` function, with the exception that it can take floating-point input:

In [None]:
np.arange(10)

In [None]:
np.arange(3, 12, 2)

In [None]:
np.arange(1.2, 4.4, 0.1)

Another useful function is ``linspace``, which can be used to create linearly spaced values between and including limits:

In [None]:
np.linspace(11., 12., 11)

and a similar function can be used to create logarithmically spaced values between and including limits:

In [None]:
np.logspace(1., 4., 7)
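
Note that the limits passed to ``logspace`` are the exponents (base 10 by default), not the values themselves. The sketch below checks this by comparing against ``linspace``:

```python
import numpy as np

# Seven values from 10**1 to 10**4, evenly spaced in the exponent
values = np.logspace(1., 4., 7)
print(values)

# The same values built explicitly from linearly spaced exponents
exponents = np.linspace(1., 4., 7)
same = np.allclose(values, 10. ** exponents)
print(same)
```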

Finally, the ``zeros`` and ``ones`` functions can be used to create arrays initially set to ``0`` and ``1`` respectively:

In [None]:
np.zeros(10)

In [None]:
np.ones(5)

## Exercise 1

Create an array which contains 11 values logarithmically spaced between $10^{-20}$ and $10^{-10}$.

In [None]:
# your solution here

Create an array which contains the value 2 repeated 10 times.

In [None]:
# your solution here

Try using ``np.empty(10)`` and compare the results to ``np.zeros(10)`` - why do you think there is a difference?

In [None]:
# your solution here

Create an array containing 5 times the value 0, as a 32-bit floating point array (this is harder).

In [None]:
# your solution here

## Combining arrays

Numpy arrays can be combined numerically, element by element, using the standard ``+``, ``-``, ``*``, ``/``, and ``**`` operators:

In [None]:
x = np.array([1,2,3])
y = np.array([4,5,6])

In [None]:
x + 2 * y

In [None]:
x ** y

Note that this differs from lists, for which ``+`` concatenates and ``*`` repeats:

In [None]:
x = [1,2,3]
y = [4,5,6]

In [None]:
x + 2 * y

## Accessing and Slicing Arrays

Similarly to lists, items in arrays can be accessed individually:

In [None]:
x = np.array([9,8,7])

In [None]:
x[0]

In [None]:
x[1]

and arrays can also be **sliced** by specifying the start and end of the slice (where the last element is exclusive):

In [None]:
y = np.arange(10)

In [None]:
y[0:5]

optionally specifying a step:

In [None]:
y[0:10:2]

As for lists, the start, end, and step are all optional, and default to ``0``, ``len(array)``, and ``1`` respectively:

In [None]:
y[:5]

In [None]:
y[::2]

## Exercise 2

Given an array ``x`` with 10 elements, find the array ``dx`` containing 9 values where ``dx[i] = x[i+1] - x[i]``. Do this without loops!

In [None]:
# your solution here

## Multi-dimensional arrays

Numpy can be used for multi-dimensional arrays:

In [None]:
x = np.array([[1.,2.],[3.,4.]])

In [None]:
x.ndim

In [None]:
x.shape

In [None]:
y = np.ones([3,2,3])  # ones takes the shape of the array, not the values

In [None]:
y

In [None]:
y.shape

Multi-dimensional arrays can be sliced differently along different dimensions:

In [None]:
z = np.ones([6,6,6])

In [None]:
z[::3, 1:4, :]
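
Because an array of ones makes it hard to see which elements were selected, here is a small sketch with distinct values. It uses ``reshape``, which is not introduced above, to turn 12 consecutive integers into a 3x4 array:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
print(a)

# Every second row (rows 0 and 2), and columns 1 and 2
sub = a[::2, 1:3]
print(sub)

# A single index removes that dimension: this is row 1 as a 1-d array
row = a[1, :]
print(row)
```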

## Functions

In addition to an array class, Numpy contains a number of **vectorized** functions, which means functions that can act on all the elements of an array, typically much faster than could be achieved by looping over the array.

For example:

In [None]:
theta = np.linspace(0., 2. * np.pi, 10)

In [None]:
theta

In [None]:
np.sin(theta)
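
To make the comparison with a loop concrete, the following sketch computes the same sines element by element with the standard ``math`` module and checks that the results agree:

```python
import math
import numpy as np

theta = np.linspace(0., 2. * np.pi, 10)

# Explicit loop over the elements, one call to math.sin per value
looped = np.array([math.sin(t) for t in theta])

# Vectorized call acting on the whole array at once
vectorized = np.sin(theta)

agree = np.allclose(looped, vectorized)
print(agree)
```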

Another useful part of Numpy is the ``np.random`` sub-package, which can be used to generate random numbers quickly:

In [None]:
# uniform distribution between 0 and 1
np.random.random(10)

In [None]:
# 10 values from a gaussian distribution with mean 3 and sigma 1
np.random.normal(3., 1., 10)

A very useful function in Numpy is [numpy.loadtxt](http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html), which makes it easy to read column-based data from a text file. For example, given the following file:

In [None]:
%cat data/columns.txt

We can either read it in using a single multi-dimensional array:

In [None]:
data = np.loadtxt('data/columns.txt')
data

Or we can read the individual columns:

In [None]:
date, temperature = np.loadtxt('data/columns.txt', unpack=True)

In [None]:
date

In [None]:
temperature

There are additional options to skip header rows, ignore comments, and read only certain columns. See the documentation for more details.
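
As a sketch of these options, the example below reads from an in-memory string rather than a file, so that it is self-contained. The contents and column names are invented for illustration; ``skiprows``, ``comments``, and ``usecols`` are arguments of ``np.loadtxt``:

```python
import io
import numpy as np

# Made-up file contents: one header line, one comment line, three columns
text = """date temperature flag
# this line is a comment
1901.042 25.326 0
1901.125 28.643 1
"""

date, temperature = np.loadtxt(
    io.StringIO(text),
    skiprows=1,       # skip the header line
    comments='#',     # ignore anything after a '#' (the default)
    usecols=(0, 1),   # read only the first two columns
    unpack=True,      # return one array per column
)
print(date)
print(temperature)
```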

By default, many functions are applied to the whole arrays:

In [None]:
values = np.random.random((3,6))

In [None]:
np.sum(values)

In [None]:
np.median(values)

But in some cases you can make use of the ``axis=`` argument to tell Numpy to only apply the statistic along one dimension (remember that the first axis is 0):

In [None]:
np.sum(values, axis=0)

In [None]:
np.median(values, axis=1)
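
A small array with known values makes it easier to see which axis is collapsed. In this sketch the array has 2 rows and 3 columns, so ``axis=0`` gives one result per column and ``axis=1`` one result per row:

```python
import numpy as np

values = np.array([[1, 2, 3],
                   [4, 5, 6]])

total = np.sum(values)               # over all elements
per_column = np.sum(values, axis=0)  # collapses the rows
per_row = np.sum(values, axis=1)     # collapses the columns

print(total)
print(per_column)
print(per_row)
```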

There are many more functions in Numpy - for instance, you can find a list of all mathematical functions [here](http://docs.scipy.org/doc/numpy/reference/routines.math.html), and a list of all functions that modify arrays (such as reshaping, transposing, etc.) [here](http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html).

## Masking

The index notation ``[...]`` is not limited to single-element indexing or multiple-element slicing: one can also pass a list or array of indices:

In [None]:
x = np.array([1,6,4,7,9,3,1,5,6,7,3,4,4,3])
x[[1,2,4,3,3,2]]

which returns a new array composed of elements 1, 2, 4, etc. from the original array.

Alternatively, one can also pass a boolean array of ``True/False`` values, called a **mask**, indicating which items to keep:

In [None]:
x[np.array([True, False, False, True, True, True, False, False, True, True, True, False, False, True])]

Written out like this, a mask is too verbose to be very useful, but consider that carrying out a comparison with the array returns exactly such a boolean array:

In [None]:
x > 3.4

It is therefore possible to extract subsets from an array using the following simple notation:

In [None]:
x[x > 3.4]

Conditions can be combined:

In [None]:
x[(x > 3.4) & (x < 5.5)]
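
Conditions are combined with ``&`` (and), ``|`` (or), and ``~`` (not) rather than the Python keywords ``and``, ``or``, and ``not``, and each comparison needs its own parentheses because these operators bind more tightly than the comparisons. A short sketch with arbitrary values:

```python
import numpy as np

x = np.array([1, 6, 4, 7, 9, 3])

outside = x[(x < 3) | (x > 6)]    # smaller than 3 OR larger than 6
inside = x[~((x < 3) | (x > 6))]  # everything else
print(outside)
print(inside)
```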

Of course, the boolean **mask** can be derived from a different array than ``x``, as long as it has the same size:

In [None]:
x = np.linspace(-1., 1., 14)
y = np.array([1,6,4,7,9,3,1,5,6,7,3,4,4,3])

In [None]:
y[(x > -0.5) & (x < 0.4)]

Since the mask itself is an array, it can be stored in a variable and used as a mask for different arrays:

In [None]:
keep = (x > -0.5) & (x < 0.4)
x_new = x[keep]
y_new = y[keep]

In [None]:
x_new

In [None]:
y_new

A mask can also appear on the left hand side of an assignment:

In [None]:
y[y > 5] = 0.

In [None]:
y

### NaN values

In arrays, some of the values are sometimes NaN - meaning *Not a Number*. If you multiply a NaN value by another value, you get NaN, and if there are any NaN values in a summation, the total result will be NaN. One way to get around this is to use ``np.nansum`` instead of ``np.sum`` in order to find the sum:

In [None]:
x = np.array([1,2,3,np.nan])

In [None]:
np.nansum(x)

In [None]:
np.nanmax(x)

You can also use ``np.isnan`` to tell you where values are NaN. For example, ``array[~np.isnan(array)]`` will return all the values that are not NaN (because ~ means 'not'):

In [None]:
np.isnan(x)

In [None]:
x[np.isnan(x)]

In [None]:
x[~np.isnan(x)]
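
One further point worth knowing is that NaN never compares equal to anything, including itself, which is why ``np.isnan`` is needed to locate NaN values. A brief sketch:

```python
import numpy as np

x = np.array([1, 2, 3, np.nan])

print(np.nan == np.nan)   # comparison with NaN is always False
print(x == np.nan)        # so this cannot be used to find NaN values
print(np.isnan(x))        # this can

print(np.sum(x))          # NaN propagates through the ordinary sum
print(np.nanmean(x))      # mean of the values that are not NaN
```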

## Exercise 3

The [data/temperatures_ghana.txt](data/temperatures_ghana.txt) data file gives the average monthly temperature in Ghana from 1900 to 2012 in the following format:

    1901.042 25.326
    1901.125 28.643
    1901.208 28.397
    1901.292 28.612
    1901.375 27.477
    1901.458 26.638
    1901.542 25.647
    1901.625 25.338
    1901.708 25.836
    1901.792 26.874
    ...

Read in the file using ``np.loadtxt``:

In [None]:

# your solution here


Now try and use masking to extract arrays that contain the dates and temperatures only since 1990:

In [None]:

# your solution here


## Masked arrays

In some cases, it can be useful to represent arrays with missing values - for this we can use [masked arrays](http://docs.scipy.org/doc/numpy/reference/routines.ma.html). These are created using the Numpy ``ma`` sub-package:

In [None]:
masked_array = np.ma.array([1,2,3], mask=[0,1,0])
masked_array

A mask value of ``1`` or ``True`` indicates that the value should be **ignored** and a value of ``0`` or ``False`` indicates that the value is valid. The mask can also be set after the array is created:

In [None]:
values = np.round(np.random.random((10,10)),2)
masked_array = np.ma.array(values)
masked_array.mask = masked_array > 0.8
print(masked_array)

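One of the advantages of using masked arrays is that many Numpy functions are designed to take the masking into account. For some statistics, such as the median, it is safest to use the mask-aware version provided in the ``np.ma`` sub-package:

In [None]:
print(np.ma.median(masked_array))

In [None]:
print(np.ma.median(masked_array, axis=1))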
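
Masked arrays and NaN values are related: ``np.ma.masked_invalid`` builds a masked array in which NaN and infinite values are masked. A minimal sketch with made-up values:

```python
import numpy as np

data = np.array([1., np.nan, 3.])

masked = np.ma.masked_invalid(data)
print(masked)

n_valid = masked.count()   # number of values that are not masked
mean = masked.mean()       # the masked value is left out
print(n_valid, mean)
```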

# Introduction to Matplotlib

Now that we can start doing serious numerical analysis with Numpy arrays, we also reach the stage where we can no longer print out hundreds or thousands of values, so we need to be able to make plots to show the results.

The **Matplotlib** package can be used to make scientific-grade plots. You can import it with:

In [None]:
import matplotlib.pyplot as plt

If you use the Jupyter notebook, add a cell containing:

In [None]:
%matplotlib inline

and the plots will appear inside the notebook.

## Basic plotting

The main plotting function is called ``plot``:

In [None]:
plt.plot([1,2,3,6,4,2,3,4])

In the above example, we only gave a single list, so Matplotlib assumes that the x values are the indices of the list/array.

However, we can instead specify the x values:

In [None]:
plt.plot([3.3, 4.4, 4.5, 6.5], [3., 5., 6., 7.])

Matplotlib can take Numpy arrays, so we can do for example:

In [None]:
import numpy as np
x = np.linspace(0., 10., 50)
y = np.sin(x)
plt.plot(x, y)
z = np.cos(x)
plt.plot(x, z)  # a second call adds another line to the same plot

The ``plot`` function is actually quite complex, and for example can take arguments specifying the type of point, the color of the line, and the width of the line:

In [None]:
plt.plot(x, y, marker='*', color='black', linewidth=1)

The line can be hidden with:

In [None]:
plt.plot(x, y, marker='o', color='green', linewidth=0)

If you are interested, you can specify some of these attributes with a special syntax, which you can read up more about in the Matplotlib documentation:

In [None]:
plt.plot(x, y, 'go')  # means green and circles

## Exercise 1

We start off by loading the [data/temperatures_ghana.txt](data/temperatures_ghana.txt) file which we encountered in the Numpy notes:

In [None]:
import numpy as np
date, temperature = np.loadtxt('data/temperatures_ghana.txt', unpack=True)

Now that the data has been read in, plot the temperature against time:

In [None]:
plt.plot(date, temperature)
# your solution here



Next, try plotting the data against the fraction of the year (all years on top of each other). Note that you can use the ``%`` (modulo) operator to find the fractional part of the dates:

In [None]:
moddate = date % 1  # fractional part of the year
plt.plot(moddate, temperature)
# your solution here


## Other types of plots

### Scatter plots

While the ``plot`` function can be used to show scatter plots, it is mainly used for line plots, and the ``scatter`` function is more often used for scatter plots, because it allows finer control of the markers:

In [None]:
x = np.random.random(100)
y = np.random.random(100)
plt.scatter(x, y)

### Histograms

Histograms are easy to plot using the ``hist`` function:

In [None]:
v = np.random.uniform(0., 10., 100)
h = plt.hist(v)  # we do h= to capture the output of the function, but we don't use it

In [None]:
h = plt.hist(v, range=[-5., 15.], bins=100)
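
The value returned by ``hist`` is a tuple containing the counts in each bin, the bin edges, and the drawn patches. The sketch below unpacks it; the ``Agg`` backend is selected only so that the example also runs without a display, and is not needed in the notebook:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: no display available
import matplotlib.pyplot as plt
import numpy as np

v = np.random.uniform(0., 10., 100)
counts, edges, patches = plt.hist(v, range=[0., 10.], bins=20)

print(len(counts))    # one count per bin
print(len(edges))     # one more edge than there are bins
print(counts.sum())   # every value falls in exactly one bin here
```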

### Images

You can also show two-dimensional arrays with the ``imshow`` function:

In [None]:
array = np.random.random((64, 64))
plt.imshow(array)

And the colormap can be changed:

In [None]:
plt.imshow(array, cmap=plt.cm.cool)

## Customizing plots

You can easily customize plots. For example, the following code adds axis labels, and sets the x and y ranges explicitly:

In [None]:
x = np.random.random(100)
y = np.random.random(100)
plt.scatter(x, y)
plt.xlabel('Temperature ($^o$C)')
plt.ylabel(r'Pressure ($\pi$ kg$_{length}$)')
plt.xlim(0., 1.)
plt.ylim(0., 1.)
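
Titles and legends can be added in the same way. This sketch uses ``plt.title`` and ``plt.legend`` together with the ``label`` argument of ``plot``; the data and label text are arbitrary, and the ``Agg`` backend is only there so that the example runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # only needed when running without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0., 10., 50)
plt.plot(x, np.sin(x), color='black', label='sin(x)')
plt.plot(x, np.cos(x), color='green', label='cos(x)')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Sine and cosine')
legend = plt.legend()  # uses the label given to each line
```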

## Making subplots

In some cases, you might want to include multiple sub-plots in a single figure. You can do this easily by using the ``plt.subplot`` command, which takes the number of rows, columns, and the index of the current plot (counting from left to right then top to bottom):

In [None]:
plt.subplot(2,2,1)
plt.plot([1,2,3])
plt.subplot(2,2,2)
plt.plot([3,1,2])
plt.subplot(2,2,3)
plt.plot([19,29,34,56])
plt.subplot(2,2,4)
plt.plot([19,29,34,56],[2,4,6,8])
plt.savefig('4plot.png')

If there is not enough or too much space between sub-plots, you can adjust this with [plt.subplots_adjust](http://matplotlib.org/api/pyplot_api.html?highlight=subplots_adjust#matplotlib.pyplot.subplots_adjust):

In [None]:
plt.subplot(1,2,1)
plt.plot([1,2,3])
plt.subplot(1,2,2)
plt.plot([3,1,2])
plt.subplots_adjust(wspace=0.4)
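
An alternative that you may see in other examples is ``plt.subplots``, which creates the figure and all the sub-plots in one call and returns them. A sketch of the two-panel figure above written that way (again selecting the ``Agg`` backend only so that it runs without a display):

```python
import matplotlib
matplotlib.use('Agg')  # only needed when running without a display
import matplotlib.pyplot as plt

# One row, two columns; axes is an array with one entry per sub-plot
fig, axes = plt.subplots(1, 2)
axes[0].plot([1, 2, 3])
axes[1].plot([3, 1, 2])
fig.subplots_adjust(wspace=0.4)
```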

## Saving plots to files

To save a plot to a file, you can do for example:

In [None]:
plt.savefig('my_plot.png')

and you can then view the resulting file like you would view a normal image. On Linux, you can also do:

    $ xv my_plot.png

in the terminal.
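
The file format is taken from the extension, and ``savefig`` accepts options such as ``dpi`` (resolution) and ``bbox_inches='tight'`` (trim surrounding white space). The sketch below writes into a temporary directory; the file names and the ``dpi`` value are arbitrary:

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # only needed when running without a display
import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 6, 4, 2, 3, 4])

outdir = tempfile.mkdtemp()
png_path = os.path.join(outdir, 'my_plot.png')
pdf_path = os.path.join(outdir, 'my_plot.pdf')

plt.savefig(png_path, dpi=150, bbox_inches='tight')  # raster image
plt.savefig(pdf_path)                                # vector graphics
```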

## Learning more

The easiest way to find out more about a function and available options is to use the ``?`` help in IPython:

    In [11]: plt.hist?

    Definition: plt.hist(x, bins=10, range=None, normed=False, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, hold=None, **kwargs)
    Docstring:
    Plot a histogram.

    Call signature::

      hist(x, bins=10, range=None, normed=False, weights=None,
             cumulative=False, bottom=None, histtype='bar', align='mid',
             orientation='vertical', rwidth=None, log=False,
             color=None, label=None, stacked=False,
             **kwargs)

    Compute and draw the histogram of *x*. The return value is a
    tuple (*n*, *bins*, *patches*) or ([*n0*, *n1*, ...], *bins*,
    [*patches0*, *patches1*,...]) if the input contains multiple
    data.

    etc.

But sometimes you don't even know how to make a specific type of plot, in which case you can look at the [Matplotlib Gallery](http://matplotlib.org/gallery.html) for example plots and scripts.


## Exercise 2

Use Numpy to generate 10000 random values following a Gaussian/Normal distribution, and make a histogram. Try changing the number of bins to properly see the Gaussian. Try overplotting a Gaussian function on top of it using a colored line, and adjust the normalization so that the histogram and the line are aligned.

In [None]:

# your solution here


## Exercise 3

The [central limit theorem](http://en.wikipedia.org/wiki/Central_limit_theorem) states that the arithmetic mean of a large number of independent random samples (from any distribution with finite variance) will be approximately normally distributed. You can easily test this with Numpy and Matplotlib:

1. Create an array ``total`` with 10000 values, all set to 0
2. Generate 10000 random values uniformly between 0 and 1
3. Add these values to the ``total`` array
4. Repeat steps 2 and 3 10 times
5. Divide ``total`` by 10 to get the mean of the values you added
6. Make a histogram of the values in ``total``

You can also see how the histogram of ``total`` values changes at each step, if you want to see the evolution!

In [None]:

# your solution here
