# Functions and numpy

## Functions

A function is a useful way of reusing code to do a common task over and over again without having to repeat yourself.
This makes it easier for you to write code and makes it simpler for others to read.

We can start with a really simple example: adding two numbers together.
In Python we can do this by typing:

In [1]:
a = 2
b = 3
c = a + b
print(c)

5


In practice we would never make something so straightforward into a function, but if we do want to make this into a function it would look like this:

In [2]:
def add(a,b):
    c = a + b
    return c

or just:

In [3]:
def add(a,b):
    return a + b

There are several elements to this. Firstly, the `def` indicates that you are defining a function, and the name of this function is `add`. 
Next, this function takes two inputs: `a` and `b`. This is what the function needs the user to put in to do its job.
Everything that comes after the colon is the operation that the function will do. Note that everything in the function after the colon is indented – this is important! Finally the part after the `return` is what the function returns when you call it.

We can call the function by typing:

In [4]:
c = add(5,6)
print(c)

11
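
As a small extension of the call above, the following sketch shows that the inputs can be variables as well as literal numbers, and that they can be passed by name. The docstring and the variable names `x`, `y` and `total` are additions made purely for illustration.

```python
def add(a, b):
    """Return the sum of a and b."""
    return a + b

x = 1.5
y = 2.5
total = add(x, y)        # the inputs can be variables
print(total)             # prints 4.0

print(add(a=5, b=6))     # the inputs can also be passed by name; prints 11
```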


### Exercise

We often measure greenhouse gases in the units mole of the substance per mole of air.
Sometimes it's useful to instead have the units in mass concentration, so mass of the substance per volume of air.

In Spyder, open a new Python script file and write the code to do the following in it:
1. Create a function that takes the concentration of a gas in the units $\text{mol/mol}$ and the molar mass of that gas and converts it into mass concentration in units $\text{g/m}^3$. Assume that the molar volume of air is $0.0224\,\text{m$^3$/mol}$.
2. Use this function to work out what the mass concentration of $1 \times 10^{-6}~\text{mol/mol}$ of methane ($\mathrm{CH_{4}}$) is in $\text{g/m}^3$

## Numpy

A collection of many functions is called a module.
Luckily, many useful modules already exist which perform many of the tasks that we need to do.
One of the most useful modules is called *numpy* (**num**erical **Py**thon) – it contains many functions to deal with numerical programming.

To use the functions contained within `numpy` we first need to *import* it.

In [5]:
import numpy as np

The `as` statement gives us a shorthand to use in the code when we want to access numpy, in this case `np`.

To use the functions within numpy, e.g. the square root function, we can type:

In [6]:
np.sqrt(9.)

3.0

We need the `np.` at the start of the function name to tell Python that the function is contained within the numpy module.
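
To illustrate why the prefix matters, here is a minimal sketch of what happens with and without it. The value `16.` and the variable name `root` are chosen only for this example.

```python
import numpy as np

root = np.sqrt(16.)      # works: sqrt is looked up inside the numpy module
print(root)              # prints 4.0

try:
    sqrt(16.)            # no prefix, so Python looks for a plain name "sqrt"
except NameError as err:
    # Python does not know any function called sqrt on its own
    print("NameError:", err)
```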

Numpy is also useful as we can create something called an *array* and assign it to a variable. This is a collection of numbers similar to a vector.
For example we could create an array using:

In [7]:
arr = np.array([1.,1.,2.,3.,5.,8.])

We can then apply the same operation to every element of the array at once, just as we would with a single number, e.g. to multiply all numbers in the array by two:

In [8]:
print(arr * 2)

[ 2.  2.  4.  6. 10. 16.]


Contrast this with the behaviour we saw earlier with lists - can you recall what the difference was?

In a similar way to lists and strings, we can access an element of an array using an index. You should recall that, in Python, **indexes start from 0 and not 1**.

For example, we can type:

In [9]:
print(arr[2] * 2)

4.0


This gives us access to the 3rd element in the array (not the 2nd) and multiplies it by 2.

To access the first 3 elements of the array we can type:

In [10]:
print(arr[0:3])

[1. 1. 2.]


That is all elements up to, but not including, the one with index 3.
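
A few more slices of the same array are sketched below. The variable names are made up for this example, and negative indexes (counting back from the end) are shown as an extra that works the same way for lists.

```python
import numpy as np

arr = np.array([1., 1., 2., 3., 5., 8.])

first_three = arr[:3]    # leaving out the start is the same as arr[0:3]
rest = arr[3:]           # from index 3 up to the end of the array
last = arr[-1]           # negative indexes count back from the end

print(first_three)       # prints [1. 1. 2.]
print(rest)              # prints [3. 5. 8.]
print(last)              # prints 8.0
```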

Just like for scalar variables, we can also perform a function on an array, for example to find the sum of an array we can type:

In [11]:
print(np.sum(arr))

20.0
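
Other summary functions in numpy are used in the same way as `np.sum`. The sketch below applies a few of them to the same array; the variable names are chosen only for illustration.

```python
import numpy as np

arr = np.array([1., 1., 2., 3., 5., 8.])

smallest = np.min(arr)   # smallest value in the array
largest = np.max(arr)    # largest value in the array
average = np.mean(arr)   # arithmetic mean of all elements
spread = np.std(arr)     # standard deviation of all elements

print(smallest)          # prints 1.0
print(largest)           # prints 8.0
print(average)
print(spread)
```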


We can also replace one or more numbers in the array by assigning a new value to an index (or to a slice), for example:

In [12]:
arr[0] = 0.
print(arr[0])

0.0
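
Replacing several numbers at once works by assigning to a slice instead of a single index. This is a minimal sketch using the original array; the replacement values are arbitrary.

```python
import numpy as np

arr = np.array([1., 1., 2., 3., 5., 8.])

arr[0:2] = 0.              # set the first two elements to the same value
print(arr)                 # prints [0. 0. 2. 3. 5. 8.]

arr[4:6] = [50., 80.]      # give a separate new value for each element
print(arr)
```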


### Reading text and csv files with numpy

Numpy also has functions to do other useful tasks, such as reading text files.
The file "ch4_macehead_2014.csv" contains one year's worth of daily averaged methane concentration measured at the Mace Head site on the west coast of Ireland. You can open this up in a text editor to have a look.
The left column contains the dates and the right column contains the average concentration on that day (in the units $\text{mol/mol}$).

To read the file into an array you can do the following:

In [13]:
ch4array = np.genfromtxt('data/ch4_macehead_2014.csv', delimiter=',', skip_header=1)

The `genfromtxt` function generates an array from a text file. The `delimiter` argument says that the values in each column are separated by a ',' and `skip_header=1` tells numpy to ignore the first row, as this is just the header information and not values to go into the array.
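
To see the effect of `skip_header` without needing a file on disk, the sketch below reads a short made-up piece of text instead. The contents of `text` are invented for this example, and `io.StringIO` (from the Python standard library) is used only so that numpy can treat the string as if it were an open file.

```python
import io
import numpy as np

# Made-up file contents: one header row followed by two rows of numbers
text = "x,y\n1.0,2.0\n3.0,4.0\n"

# Skipping the header leaves only the two rows of values
with_skip = np.genfromtxt(io.StringIO(text), delimiter=',', skip_header=1)
print(with_skip.shape)     # prints (2, 2)

# Without skip_header the header row is read as data too,
# so the array gains an extra row
without_skip = np.genfromtxt(io.StringIO(text), delimiter=',')
print(without_skip.shape)  # prints (3, 2)
```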

The variable called `ch4array` now contains all of the dates and concentrations in two columns. To use just part of the data in the array we need to access the correct columns and rows.

For example, to print just the first ten days of data we can type:

In [14]:
print(ch4array[0:10,:])

[[           nan 1.90616602e-06]
 [           nan 1.91029199e-06]
 [           nan 1.92304534e-06]
 [           nan 1.93461050e-06]
 [           nan 1.93035617e-06]
 [           nan 1.94106017e-06]
 [           nan 1.92089650e-06]
 [           nan 1.92795616e-06]
 [           nan 1.91506836e-06]
 [           nan 1.90238485e-06]]


This tells Python to print the first ten rows of data; the colon on its own means print all of the columns in those rows.

But something is wrong here! All of the dates have been replaced with `nan` (meaning 'not a number').
This is because, by default, `genfromtxt` tries to read every value as a number, whereas the first column in our file is formatted as a date.
There are other ways to read in data containing dates and/or strings, which you will learn about later.
For now we will just extract the measurements that we're interested in and work with those.

If we want to print the entire column of concentrations we can type:

In [15]:
print(ch4array[:,1])

[1.90616602e-06 1.91029199e-06 1.92304534e-06 ... 1.91967665e-06
 1.92091567e-06 1.91707725e-06]
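
The column selection can be tried out on a small made-up example that has the same layout as the methane file (a date column and a concentration column). The values in `text` are invented, not real measurements. The last step uses the `usecols` argument of `genfromtxt` to read only the second column; this is one possible alternative, not the approach used above.

```python
import io
import numpy as np

# Made-up file contents in the same layout as the methane file
text = "date,ch4\n2014-01-01,1.9e-06\n2014-01-02,1.8e-06\n2014-01-03,2.0e-06\n"

data = np.genfromtxt(io.StringIO(text), delimiter=',', skip_header=1)

dates = data[:, 0]             # every row, first column
conc = data[:, 1]              # every row, second column
print(np.isnan(dates).all())   # prints True: no date could be read as a number
print(conc.shape)              # prints (3,)

# usecols asks genfromtxt to read only the listed columns
conc_only = np.genfromtxt(io.StringIO(text), delimiter=',',
                          skip_header=1, usecols=(1,))
print(conc_only)
```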


### Exercise

In Spyder, open a new Python script file and write the code to do the following in it:
 1. Read "ch4_capegrim_2012-2016.csv" from within the "data" directory as a numpy array. This contains **monthly** methane concentrations from Cape Grim in Australia from 2012 to 2016. *Hint: You may want to look at the file first to see which options from the above example you need to update.*
 2. Find the mean and standard deviation of the measured concentration of methane for 2012 in $\text{g/m}^3$.

## Next Topic

When you are ready, you can move on to the next topic:

### [Xarray and netCDF files](xarray.ipynb)

To view the introduction page containing the list of topics click [here](introduction.ipynb)