# Introduction to Python

This is an introduction to Python! This is the first in a set of Jupyter notebooks that will provide you with a brief overview of Python. The range of things that are possible with Python is huge, so it is not possible to cover all of them in a few lectures (or even a year-long module!). Hopefully these notebooks provide you with enough information to start your journey into scientific data analysis.


As a companion to these notes, I suggest the following books:

[Python Data Science Handbook](https://www.amazon.co.uk/Python-Data-Science-Handbook-Essential-ebook/dp/B01N2JT3ST) - Jake VanderPlas (also available [online](https://jakevdp.github.io/PythonDataScienceHandbook/))

[Python for Data Analysis](https://www.amazon.co.uk/Python-Data-Analysis-Wrangling-IPython-ebook/dp/B075X4LT6K/ref=pd_cp_351_1/258-0652017-7669268?_encoding=UTF8&pd_rd_i=B075X4LT6K&pd_rd_r=28e28162-c20d-44c9-9258-58418e174dcb&pd_rd_w=CUoAI&pd_rd_wg=dq0RK&pf_rd_p=01704ebe-a86a-4b47-8c36-0f9f5bbc2882&pf_rd_r=6M4AQ1MKNZHM125KCBFN&psc=1&refRID=6M4AQ1MKNZHM125KCBFN) - Wes McKinney


Throughout the notes I also provide links to many online resources that are useful for learning Python. I recommend exploring these websites in more detail to help with your learning.

If you find any useful resources that help you understand an aspect of Python better, you can help contribute to these notes. Please let me know what you've found and future years' students will benefit from your contribution.

# Launching the Jupyter Environment

First, let's create a folder for our Python code
> mkdir python_files

> cd python_files 

To launch the Notebook Dashboard, type the following in the command line
> jupyter notebook

This will open up a window in your browser showing the following:
![Screenshot%202019-02-01%20at%2015.28.00.png](attachment:Screenshot%202019-02-01%20at%2015.28.00.png)


There are many useful resources online that will help you get to grips [with using Jupyter](https://jupyter.brynmawr.edu/services/public/dblank/Jupyter%20Notebook%20Users%20Manual.ipynb#4.3.5-Notebook-Internal-Links).

***

# Launching PyCharm and Python

You can work with Python in many ways. We will use PyCharm, which is known as an _integrated development environment_ (IDE) and contains many useful features.

You will have a PyCharm launch icon on your desktop. When you run this for the first time you will be asked to create a project and name it.

Call it something sensible, like *KL7002_astromodule_intro* .

The PyCharm environment should open and you will be presented with a Terminal console, a Python console and an editor window, amongst other things.

We'll first focus on the terminal. If we click on the terminal tab, it tells us we are working in the directory 
_ username\PycharmProjects\astromodule>. The last part refers to the current project directory.


## IPython
We will work with the IPython shell - an enhanced interactive version of the standard Python shell that adds conveniences such as tab completion and the %run command used below.

In the terminal window, enter the following into the command line

> \>ipython

We are now in a Python shell and we are able to execute python statements and commands.

E.g. input a=6 into the command line, enter, then a followed by enter.
It should look like the following

> \>In[1]:a=6

> \>In[2]:a

> \>Out[2]:6

You can exit the IPython environment with: 

> \>exit()


## Python files
In the Project tab, right click on the project root (in my examples called *astro_mod*), select _new_ then _Python file_.

This will then open a new tab and ask you to name it. Let's start off with the classic *hello_world* (you will notice the file automatically gets the extension *.py*, to show it is a Python file).

In the file enter the following:

> print('Hello World!')

and save (*ctrl+s*).

Now we return to the Python shell and you can run your newly created Python file with the %run command

> \> %run hello_world

***

# Data types

In order to undertake any form of analysis you will need to work with some form of data! In its most basic variety, this will usually take the form of numbers or letters.

In Python, we call fundamental bits of data _values_ and they can have different classes. For example, the number 4 belongs to the class _integer_ and 'Hello' belongs to the class _string_.

To find out the class of a value, we can use a *function* called type

In [2]:
type('Hello')

str

Notice that the string has single quotation marks surrounding it. Double quotation marks can also be used ("hello"). If required, you can also use three of each, e.g. """Hello"""

A whole number is an integer:

In [2]:
type(4)

int

You can also have floats, which represent decimals:

In [4]:
type(5.67)

float

If inputting a large integer, you do not need commas, e.g., 97,000 should be input as 97000. You can see what happens if the comma is included!

In [2]:
42000

42000
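
As a quick illustration of the comma pitfall (a minimal sketch; the variable names are only for illustration), Python reads the comma as a separator, so `42,000` becomes a tuple of two integers rather than one number. If you want a visual separator, Python 3.6 and later accept underscores inside numeric literals.

```python
with_comma = 42,000        # the comma builds a tuple of two integers
print(with_comma, type(with_comma))

with_underscore = 42_000   # underscores are ignored in numeric literals (Python 3.6+)
print(with_underscore, type(with_underscore))
```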

You can also have complex numbers:

In [5]:
type(1+2j)

complex

***

# Variables
A variable is a way to associate a name with a value and is performed using the = sign (this is referred to as an assignment token and not to be confused with equals, which is == ).

In the following I assign the word 'cat' to variable _a_ and the float 4.8 to variable _b_ . I am also using the _print_ function, which prints variables (and other things) to screen.

In [1]:
a='cat'
print(a)
b=4.8
print(b)

cat
4.8


Variables, as the name suggests, can have their value changed.

In [13]:
month='January'
print(month)
month="February"
print(month)

January
February

Variable names can be of any length and can contain both letters and numbers (although they have to begin with a letter or an underscore). They are also case-sensitive, i.e., _month_ is different from _Month_. The convention is to use lower-case letters only though. The underscore can also be used in variable names, e.g., month_1985.

Improperly defined variables result in a syntax error. For example, if your variable starts with a number:

In [14]:
1month='December'

SyntaxError: invalid syntax (<ipython-input-14-6613a1c621f7>, line 1)

You may also get a syntax error if you choose a variable name that is a __[Python keyword](https://www.w3schools.com/python/python_ref_keywords.asp)__.

Finally, when you define variables in your code, the name should be useful for a human to read! If you just give it a random name or _var1, var2, var3_, etc., your code will be very difficult to follow.

__Be kind to your future self - make your code easy to read!__

>_“Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. ...[Therefore,] making it easy to read makes it easier to write.”_


>― Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship


***

# Expressions

An expression is a combination of variables, values, operations or functions.

We will begin by looking at expressions containing mathematical operators.

## Mathematical Operators
Addition

In [None]:
2+3

Subtraction

In [15]:
9-10

-1

Multiplication

In [16]:
2*3

6

Division

In [17]:
10/2

5.0

Raising exponents

In [19]:
3**3

27

Finding the remainder after division using the modulo operator

In [43]:
18%4

2

We can perform the same operations with variables that are numerical

In [4]:
a=9
b=10
c=a*b
print(c)

90


There is also something called _floor division_, // , which divides and then rounds the result down to the nearest whole number:

In [3]:
print(7/2)
print(7//2)

3.5
3
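
The 'rounding down' matters most for negative numbers. The following short sketch (the values are chosen only for illustration) compares floor division with simply discarding the decimal part:

```python
print(7 // 2)       # 3
print(-7 // 2)      # -4: floor division rounds down, towards negative infinity
print(int(-7 / 2))  # -3: int() instead discards the fractional part
print(7.0 // 2)     # 3.0: the result is a float if either number is a float
```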


## Boolean values, logical and relational operators

A Boolean value is either True or False. They are usually the outcome of testing statements with relational operators.

Here are the most common relational operators, and their Boolean output

In [19]:
print(5==3) # This says is 5 equal to 3
print(5!=3) #           is 5 not equal to 3
print(5<3)  #           is 5 less than 3
print(5>3)  #           is 5 greater than 3
print(5<=3) #           is 5 less than or equal to 3
print(5>=3) #           is 5 greater than or equal to 3

False
True
False
True
False
True


The Boolean values are useful for sorting data and conditional execution of statements - all of which we will see later.

We can combine relational expressions using the logical operators _and_, _or_ and _not_

In [50]:
print(5>2 and 5<7)
print(5>2 and 5>7)
print(5>2 or 5<7)
print(not(5>2 and 5<7)) #Reverses the boolean, i.e. if true gives false

True
False
True
False


## Operations with strings

You can also manipulate strings in Python.

In [2]:
word='Alphabet'
word2='Soup'
print(word+' '+word2)
print(word*3)

Alphabet Soup
AlphabetAlphabetAlphabet

This includes accessing specific positions in the string (counting starts at 0):

In [3]:
print(word[3])

h


There are many more things that can be achieved with string manipulation, but this will not be a focus of this course. For now it's enough that you know it exists!

***

# Storing data

You will often be working with many bits of data that come in various forms. Python contains a number of 'containers' that will hold data (these are known as compound data types).

## Lists

The most versatile, but maybe not the most appropriate for data analysis, is the _list_. The list is defined by comma separated entries enclosed in square brackets. It **can contain a mix of data types** - although it typically should only be used to contain one type.

In [2]:
lis=[5,'apple',6.7,'goat']

Elements in the list are accessed by a numerical value, starting at zero. Negative indices count backwards.

In [3]:
print(lis[0],lis[3])
print(type(lis[0]),type(lis[3]))
print(lis[-1])

5 goat
<class 'int'> <class 'str'>
goat


Lists can be _sliced_ to create new lists. The colon is used to do the _'slicing'_:

In [25]:
lis=[0,1,2,3,4,5,6,7,8,9]
print(lis[0:4])
print(lis[-3:])

[0, 1, 2, 3]
[7, 8, 9]
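
The slicing rules can be summarised with a few more cases. This is a small sketch (the list name `numbers` is just for illustration): the start index is included, the stop index is excluded, and an optional third number sets the step.

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(numbers[2:5])    # start index included, stop index excluded
print(numbers[:3])     # a missing start means 'from the beginning'
print(numbers[::2])    # the third number is the step

first_three = numbers[:3]   # slicing a list creates a new list
first_three[0] = 99
print(numbers[0])           # the original list is unchanged
```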


You can also create lists of lists. Suppose we have the following information on people's heights (in m): Jeff - 1.60, Paul - 1.80, Sally - 1.65. We can store this as a list of lists, e.g. 

In [45]:
heights=[['Jeff',1.60],
 ['Paul',1.80],
 ['Sally',1.65]
]
print(heights[0])   # Gives back Jeff's details
print(heights[1][1]) # Gives back Paul's height

['Jeff', 1.6]
1.8


## Tuples

There is a special type of list called a _tuple_ - which is immutable (see [the section on mutable vs immutable](#Mutable-vs-immutable)). Basically you cannot change any aspects of a tuple, e.g. size, content. They should be used if you want to avoid data in them being changed.

They are defined with parentheses rather than square brackets.

In [35]:
atup=('the','number','is',10)
print(atup[0])
atup[3]=13              # trying to change an element value results in error

the


TypeError: 'tuple' object does not support item assignment

As with lists, tuples can store a mixture of data types:


In [5]:
mytup=tuple(([1,2],'text',False))

You can 'unpack' a tuple, which is a useful operation when dealing with many functions in Python. To unpack a tuple you do the following:

In [8]:
alist,sometext,abool=mytup
print(alist)
print(sometext)

[1, 2]
text
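
As an example of why unpacking is handy with functions, the built-in `divmod` function returns its two results as a tuple. The sketch below (variable names are illustrative) unpacks that tuple and also shows the common one-line swap:

```python
# divmod is a built-in function that returns a tuple (quotient, remainder)
result = divmod(17, 5)
print(result, type(result))

quotient, remainder = result  # unpack the tuple into two variables
print(quotient, remainder)

# Unpacking also allows two variables to be swapped in one line
first, second = 'left', 'right'
first, second = second, first
print(first, second)
```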



## Dictionaries

A dictionary is a collection of **keys** and **values**, where each value is looked up by its key rather than by its position. Dictionaries used to be unordered, but from Python 3.7 onwards they are guaranteed to keep their entries in insertion order. They are defined with curly brackets.

In [9]:
mydict={
    "fruit": "Apple",
    "value": 1.20,
    "quantity": 100,
}
print(mydict)

{'fruit': 'Apple', 'value': 1.2, 'quantity': 100}


Here the keys are "fruit", "value" and "quantity". The values are "Apple", 1.20, and 100.

You use the keys to access the values.

In [10]:
print(mydict['fruit'])

Apple


If we want to find all the keys or values in the dictionary, we can use some in-built methods:

In [12]:
print(mydict.keys())
print(mydict.values())

dict_keys(['fruit', 'value', 'quantity'])
dict_values(['Apple', 1.2, 100])


We can explore the dictionary in a number of ways. One example is using a _for_ loop to iterate through the collection of keys.

In [4]:
for key in mydict:
    print(key)

fruit
value
quantity


Here is another example looking at the value and type for each key:

In [5]:
for key in mydict:
    print(mydict[key],type(mydict[key]))
    

Apple <class 'str'>
1.2 <class 'float'>
100 <class 'int'>


We can check if a key exists in the dictionary

In [6]:
if 'fruit' in mydict:
    print("It's here")    #Here I have used double quotation marks for the string as the text contains a '

It's here


To add a new key and value we can do the following

In [7]:
mydict['color']='red'
print(mydict)

{'fruit': 'Apple', 'value': 1.2, 'quantity': 100, 'color': 'red'}
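
A few other dictionary operations are worth knowing. The following is a minimal sketch using an illustrative dictionary called `stock` (not part of the examples above): `items()` gives keys and values together, `get()` avoids an error for a missing key, and `del` removes an entry.

```python
stock = {"fruit": "Apple", "value": 1.20, "quantity": 100}  # illustrative dictionary

# items() gives each key together with its value
for key, value in stock.items():
    print(key, '->', value)

# get() returns a default rather than raising a KeyError for a missing key
print(stock.get('color', 'unknown'))

# del removes a key and its value
del stock['quantity']
print(stock)
```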


For more on dictionaries, see e.g. _[https://realpython.com/iterate-through-dictionary-python/](https://realpython.com/iterate-through-dictionary-python/)_



## Arrays
The most common way of working with numerical data is to use an array. Arrays are not native to Python but can be created and manipulated with add-on libraries. The main library for this is _numpy_.

First, let us import the _numpy_ library. Full details on the _numpy_ library can be found at __[https://docs.scipy.org/doc/numpy/index.html](https://docs.scipy.org/doc/numpy/index.html)__

In [2]:
import numpy as np #The second part of this statement lets us refer to the library with a shorthand notation, e.g., np

### Array creation 
We can create an array in many ways (depending on what you want from the array). A simple 1D array is given by providing a list to the numpy _array_ function:

In [13]:
arr1=np.array([6,7,8]) # provide a list of numbers to the np.array function
print(arr1)
print(type(arr1))


[6 7 8]
<class 'numpy.ndarray'>


A 2D array is given by a list of tuples, where each tuple is a row. Hence the following matrix

$$
\begin{pmatrix}
6 & 7 & 8 \\
1 & 2 & 3 
\end{pmatrix}
$$

is given by

In [23]:
arr2=np.array([(6.,7.,8.),(1.,2.,3.)])
print('Shape of array is',arr2.shape) #what is the size of the array
print('Dimensions of array are',arr2.ndim)  #how many dimensions does it have
print('Size of array is ',arr2.size)  #how many elements does it have
print('Sum of each row is ',arr2.sum(axis=1)) #sum the elements in each row (axis=1 sums across the columns)

Shape of array is (2, 3)
Dimensions of array are 2
Size of array is  6
Sum of each row is  [21.  6.]


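The first three commands use [attributes](#Attributes-and-Methods) of the array (_shape_, _ndim_ and _size_), which tell us some information about the array. The last, _sum_, is a method: it is called with parentheses and performs a calculation on the array's data.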

We are not limited to real numbers:

In [34]:
c = np.array( [ [1,2], [3,4] ], dtype=complex )
c

array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])

We can also create arrays of zeros or ones easily with _numpy_: 

In [35]:
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [36]:
np.ones((2,3))

array([[1., 1., 1.],
       [1., 1., 1.]])

We can also create arrays containing ordered values:

In [38]:
np.arange(0,10,1)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [21]:
b=np.arange(0,1,0.1).reshape((2,5)) #There are other recommended ways to create floating point arrays in numpy
print(b)
print('Shape of b is',b.shape)

[[0.  0.1 0.2 0.3 0.4]
 [0.5 0.6 0.7 0.8 0.9]]
Shape of b is (2, 5)
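
One alternative that the comment above may be hinting at (this is my assumption) is `np.linspace`, where you specify the number of points rather than the step size. With a floating point step, rounding can make the length of an `arange` result hard to predict, whereas `linspace` always returns exactly the number of values requested. A minimal sketch:

```python
import numpy as np

# 10 evenly spaced values from 0 to 0.9, both end points included
b = np.linspace(0, 0.9, 10).reshape((2, 5))
print(b)

# endpoint=False excludes the stop value, mimicking arange(0, 1, 0.1)
c = np.linspace(0, 1, 10, endpoint=False)
print(c.size)
```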


### Basic operations
Basic arithmetic operations can be performed on arrays.

In [42]:
a=np.array([5,10,15,20])
b=a+4
print(b)

[ 9 14 19 24]


Multiplication is performed element-wise, as opposed to typical matrix multiplication. There are ways to perform linear algebra, e.g. the dot product, but we will not discuss them here.

In [43]:
c=a*b
print(c)

[ 45 140 285 480]


In [44]:
print(3*np.sin(a)) #here we use numpy to get access to mathematical functions, e.g. sin

[-2.87677282 -1.63206333  1.95086352  2.73883575]


There are more basic mathematical functions available with _numpy_: [https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html](https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html)


We can access the different elements of the array by indexing; one-dimensional arrays behave similarly to lists. There are also tricks to slice arrays and manipulate them ([https://docs.scipy.org/doc/numpy/user/quickstart.html](https://docs.scipy.org/doc/numpy/user/quickstart.html)).

The following type of actions are called slicing.

In [52]:
a=np.arange(10)
print(a[0:5:2]) # this selects every 2nd element from index 0 up to (but not including) index 5
print(a[::-1])  # this reverses the array

[0 2 4]
[9 8 7 6 5 4 3 2 1 0]


Notice what happens when we use the indices 0 to 5 to select array elements (the element at index 5 is not included):

In [53]:
print(a[0:5])

[0 1 2 3 4]


In [56]:
print(a[:-1])
print(a[:])

[0 1 2 3 4 5 6 7 8]
[0 1 2 3 4 5 6 7 8 9]


We can also use Boolean logic to help us investigate arrays 

In [2]:
print(a<5)      # This is what happens when we use a comparison operator with the array
print(a[a<5])   # Now we use logic to return the values that satisfy our comparison (this is not slicing)
print(a[a==5])
print(a[a>5])

[ True  True  True  True  True False False False False False]
[0 1 2 3 4]
[5]
[6 7 8 9]


What happens if we want to know the array locations where a statement is true? (Note - the following is equivalent to the IDL _where_ function. _np.where_ returns a tuple of index arrays, and the comma after _index_ unpacks that tuple.)

In [3]:
index,=np.where(a<5)
print(index)
index2,=np.where( (a>2) & (a<5) )
print(index2.size,index2)

[0 1 2 3 4]
2 [3 4]
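
To see why the comma is needed, it can help to look at what `np.where` returns before unpacking. In this sketch (the array `grid` is an illustrative addition) the result is a tuple with one index array per dimension, so a 2D array gives two arrays:

```python
import numpy as np

a = np.arange(10)
result = np.where(a < 5)
print(type(result), len(result))   # a tuple holding one index array

grid = np.arange(9).reshape(3, 3)
rows, cols = np.where(grid > 5)    # one index array per dimension
print(rows, cols)
print(grid[rows, cols])            # the matching values
```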


In [None]:
index,=((a>2) & (a<5)).nonzero()   # alternative to np.where

For multi-dimensional arrays, the indexing and slicing is similar but requires extra terms. (Note: array referencing is opposite to IDL!)

In [61]:
a=np.arange(0,9,1).reshape(3,3)   # Create 2D array of size 3 by 3
print(a)
print('Elements in first row',a[0,:])       # Select all elements in first row
print('Elements in second column',a[:,1])       # Select all elements in second column     

[[0 1 2]
 [3 4 5]
 [6 7 8]]
Elements in first row [0 1 2]
Elements in second column [1 4 7]


In [25]:
a.sum(axis=1)

array([ 3, 12, 21])
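
The `axis` argument is easy to mix up, so here is a small side-by-side sketch on a 3 by 3 array (redefined here so the example is self-contained). The axis you name is the one that gets collapsed:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
print(a.sum(axis=0))   # collapse the rows: one total per column
print(a.sum(axis=1))   # collapse the columns: one total per row
print(a.sum())         # no axis given: total of every element
```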

### Array concatenation
Different arrays can be combined together via concatenation.

I demonstrate two options here. We will investigate the attributes of the new array from the concatenation.

In [27]:
a=np.floor(10.*np.random.random((2,2))) # Create a 2 by 2 array of random numbers and use floor operator 
                                        # to round values down
print(a)
print('Shape',a.shape)
b=np.floor(10.*np.random.random((2,2)))
print(b)
c=np.hstack((a,b))      # horizontal stacking (column wise)
print('New array \n',c) # \n indicates start a new line
print('Shape',c.shape)

[[9. 3.]
 [3. 7.]]
Shape (2, 2)
[[1. 9.]
 [6. 4.]]
New array 
 [[9. 3. 1. 9.]
 [3. 7. 6. 4.]]
Shape (2, 4)


In [28]:
c=np.vstack((a,b))  # Vertical stack (row wise)
print('New array \n',c) 
print('Shape',c.shape)

New array 
 [[9. 3.]
 [3. 7.]
 [1. 9.]
 [6. 4.]]
Shape (4, 2)


A thing to note: when operating on and manipulating arrays (this also applies to lists), some actions copy the data into a new, unique array (operations on the new array don't influence the previous version). However, some actions do not create a unique copy of the array (operations on the new array affect the old array).

In [31]:
a=np.floor(10.*np.random.random((2,4)))
print('Shape of a is ',a.shape)
b=a                 # no new object is created

print(id(a),id(b))  # Python gives each object a unique identifier, hence if a and b are the same, 
                    # the id should be the same
    
print('Is b the same object as a? ',b is a)  # is checks whether two things are the same object
b.shape=4,2
print('New shape of a is ',a.shape)      # Changing shape of b changes a 

Shape of a is  (2, 4)
4606922144 4606922144
Is b the same object as a?  True
New shape of a is  (4, 2)


Similar behavior happens with new arrays created by slicing (called _shallow_ or _view_ copies).

In [32]:
a=np.floor(10.*np.random.random((2,4)))
print('This is a\n',a)
s=a[0,:]
print('This is s\n',s)
s[0]=20.
print('This is a\n',a)      # a's data changes


This is a
 [[5. 9. 5. 7.]
 [6. 6. 5. 0.]]
This is s
 [5. 9. 5. 7.]
This is a
 [[20.  9.  5.  7.]
 [ 6.  6.  5.  0.]]


To make a copy, you need to use the copy method

In [15]:
a=np.floor(10.*np.random.random((2,4)))
b=a.copy()
print('Is b the same object as a? ',b is a)
print(b)
b.shape=(4,2)   # Change shape of b
b[0,0]=20.      # Change value in b
print('This is a\n',a)
print(a.shape)

Is b the same object as a?  False
[[7. 8. 6. 7.]
 [1. 4. 6. 4.]]
This is a
 [[7. 8. 6. 7.]
 [1. 4. 6. 4.]]
(2, 4)
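
If you are unsure whether two arrays share their data, numpy can tell you. The sketch below uses `np.shares_memory` on illustrative arrays (`view` and `independent` are names I have made up for the example):

```python
import numpy as np

a = np.zeros((2, 4))
view = a[0, :]                 # slicing gives a view onto a's data
independent = a[0, :].copy()   # copy() gives independent data

print(np.shares_memory(a, view))          # the view uses a's data
print(np.shares_memory(a, independent))   # the copy does not

view[0] = 20.          # changes a as well
independent[1] = 30.   # leaves a untouched
print(a)
```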


## Mutable vs immutable

Different data types are either mutable (can be changed) or immutable (can't be changed). For example, a list is mutable and a string is immutable.

The following two examples show this. Trying to change a string creates an error:

In [None]:
string='Hello'
string[1]='a'  # Trying to change second letter of Hello to a

TypeError: 'str' object does not support item assignment

In [62]:
list_example=[1,2,3]
list_example[1]='grape'
print(list_example)

[1, 'grape', 3]


## Attributes and Methods

While we will not go deeply into this aspect, I note that Python is an [object-orientated language](https://en.wikipedia.org/wiki/Object-oriented_programming). This is why we said earlier that, e.g., the number 4 belongs to the integer class ([Section 4](#Data-types)) - a class refers to a group of objects with common properties.

Given that Python is an object-orientated language, most data types and containers have two features known as attributes and methods.

Attributes are properties of the class.

Methods are essentially operations or functions that belong to the object and act on its data.

We have already encountered attributes and methods before. You can see some of the attributes and methods of each object by using tab-complete. The available attributes and methods differ for different objects. 


In [None]:
list_example.
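
Outside of an interactive shell, where tab-complete is not available, the built-in `dir` function gives similar information. This is a minimal sketch (the list is redefined so the example runs on its own):

```python
list_example = [1, 'grape', 3]

# dir() lists every attribute and method name; here we skip the special '_' ones
public_names = [name for name in dir(list_example) if not name.startswith('_')]
print(public_names)

list_example.append(4)      # append is a method: it acts on the list's data
print(list_example)
```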

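We can also define our own classes. The following class has two attributes (_x_ and _y_) and one method (_distance_from_origin_):

In [34]: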
class Point:
    """ Point class represents and manipulates x,y coords. """

    def __init__(self, x=0, y=0):
        """ Create a new point at x, y """
        self.x = x
        self.y = y
        
    def distance_from_origin(self):
        """ Compute my distance from the origin """
        return ((self.x ** 2) + (self.y ** 2)) ** 0.5
    
    

In [38]:
p=Point(3,4)   # create an object p
print(type(p))
print(p.x,p.y) # attributes
print(p.distance_from_origin()) # method

<class '__main__.Point'>
3 4
5.0


***

# Conditional Statements

In order to create programs that are able to perform useful tasks for us, we will need to invoke conditional statements and loops (with logical operators). I have already used a couple in previous sections.

## The _if_ statement

The _if_ statement is a basic conditional statement and looks like this:

In [39]:
x=2
if x==2:               # The expression after the if is the condition
    print('Hi there')  # This line has to be indented 

Hi there


The _if_ statement basically evaluates the expression in the condition. If the expression is True, then the following indented lines of code are run. Here, if the expression is false, then nothing happens.

You may actually want one of two actions to be performed depending on whether the condition is true or false. We will then need to include the _else_ statement

In [5]:
bird1=0
bird2=0
spotted='pigeon'
if spotted=='pigeon':
    bird1 +=1              # This operation adds 1 onto the previous value of the variable
else:
    bird2 +=1
print(bird1,bird2)

1 0


Change the spotted variable to 'crow' and see what happens.

The above code can be thought of as _if (logic satisfied) do task1 else do task2_ .

Sometimes there are more than 2 choices, and we can use the _elif_ statement

In [None]:
choice='a'
if choice == 'a':
    print("You chose 'a'.")
elif choice == 'b':
    print("You chose 'b'.")
elif choice == 'c':
    print("You chose 'c'.")
else:
    print("Invalid choice.")

Change the variable choice to see what happens.


## _for_ loops

A _for_ loop will loop over a section of code for the number of times specified. The _for_ loops in Python might look slightly different from those in other languages, e.g., Matlab, IDL. The _for_ loops in Python are 'collection-based' and they iterate over a collection of objects (see [here for additional details](https://realpython.com/python-for-loop/)).

A simple example is:

In [7]:
for x in range(5):           #object with elements [0,1,2,3,4]
    print('The number is ',x)

The number is  0
The number is  1
The number is  2
The number is  3
The number is  4


The range function here is similar to _arange_ in numpy - it generates a sequence of evenly spaced integers. The _for_ loop then iterates over the integers.

Here is another example with strings in a list:

In [40]:
a=['apple','banana','carrot']
for i in a:
    print(i)

apple
banana
carrot


In both of these cases the _for_ loop creates an _iterator_ from the object it is given, which provides a consistent way to step through sequences. Here is another example with a dictionary (similar to an example shown previously):

In [1]:
some_dict={'a':1,'b':3,'c':5}
for keys in some_dict:
    print(keys)

a
b
c
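
To make the idea of an iterator more concrete, the following sketch does by hand what a _for_ loop does behind the scenes, using the built-in `iter` and `next` functions (the list `fruits` is just an illustration):

```python
fruits = ['apple', 'banana', 'carrot']

iterator = iter(fruits)    # ask the list for an iterator
print(next(iterator))      # each call to next() hands back the following item
print(next(iterator))
print(next(iterator))
# A further next(iterator) would raise StopIteration,
# which is the signal a for loop uses to know it has finished.
```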


You can also control the flow of _for_ loops with _break_ and _continue_ statements.

A break statement will cause a loop to terminate. In the following example the loop breaks when a conditional statement is satisfied.

In [9]:
for x in range(5):           
    if x==3:        # If our iterator gets to 3 then break
        break
    print('The number is ',x)

The number is  0
The number is  1
The number is  2


A continue statement terminates the current iteration but the loop continues.

In [10]:
for x in range(5):           
    if x==3:
        continue
    print('The number is ',x)

The number is  0
The number is  1
The number is  2
The number is  4


## While loops

Like the _for_ statement, the _while_ statement starts a loop. However, the looping does not stop after a set number of iterations. It only stops when a given conditional statement is False, i.e., loops while it is True.   

In [41]:
prompt = "Can you guess what number I am thinking of? "

number=input(prompt)            # The input function enables some user interaction - number variable is string

while number != "42":           # This says ' loop while number is not equal to 42'
    number =  input('No, keep going! ')
    
print("Yes, well done!")

Can you guess what number I am thinking of? 3
No, keep going! 2
No, keep going! 42
Yes, well done!
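
Here is a second _while_ example that does not need any user input, so it can be run as-is. It is only a sketch with made-up variable names: it keeps adding successive whole numbers until the running total reaches at least 20.

```python
count = 0
total = 0
while total < 20:      # the condition is re-checked before every pass
    count += 1
    total += count     # running total: 1, 3, 6, 10, 15, 21
print(count, total)
```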


***

# Functions

So far, we have been using various pre-existing functions in Python. However, we can, and will often want to, develop our own functions to perform specific tasks. There are two methods for defining functions. The second method is discussed in [Modules & Imports](#Modules-&-Imports).


## In-line functions 
The first method is one that can be done 'in-line', in the interpreter. 

We have to use the _def_ keyword when defining the function. This is then followed by the name of the function and parentheses, between which you give any arguments for the function.

In [1]:
def welcome_me(name):       # function name - welcome_me, argument - name
    print('Hello '+name)    # These two lines are the function definition
    
welcome_me('Richard')       # Here I am calling the function

Hello Richard


Information is passed into the function as an argument. Here, _name_ is the argument.


The function can also return information, and that information be stored in variables.

In [9]:
def square_number(x):
    return x**2

y=square_number(4)  # Here x in the function takes on the value 4
print(y)


16


In theory, you can have as many arguments as you want, but when calling the function you will have to respect the order in which they are defined.

You can also assign arguments by their names, and here order doesn't matter. 



In [3]:
def subtract_numbers(x,y):
    return x-y

z=subtract_numbers(4,3) # Here x in the function takes on the value 4 and y is 3
print(z)

z=subtract_numbers(3,4) # Here x in the function takes on the value 3 and y is 4
print(z)

z=subtract_numbers(y=3,x=4) # assign argument by name
print(z)

1
-1
1


### A small aside
Note that what happens to an argument inside a function depends on whether the object passed in is mutable or immutable.

In programming, arguments are often described as being _passed-by-reference_ or _passed-by-value_. Strictly speaking, Python does neither: the function always receives a reference to the same object that the caller holds (sometimes called pass-by-object-reference).

For mutable data types this behaves like pass-by-reference: the function works on the original object rather than a copy, so if the function modifies the object in place (e.g. by appending to a list), the change is reflected back in the calling code.

This is slightly technical, but is illustrated with a simple example.

Remember, the _id()_ function gives us the unique id number for the Python object.

In [30]:
def change_list(lis):
    lis.append([1,2,'cat'])
    print('Inside function ',id(lis))
    return

lis=['dog',3,4]
print('Outside function ',id(lis))
change_list(lis)
print(lis)

Outside function  4377955720
Inside function  4377955720
['dog', 3, 4, [1, 2, 'cat']]


For immutable data types, e.g., integers, the behaviour looks like pass-by-value. An immutable object cannot be modified in place, so an operation such as x*=5 creates a new object and attaches the local name x to it (note the different id inside the function). There is no change to the variable outside the function.

Again, a simple example highlights this.

In [31]:
def multiply_number_by_five(x):
    x*=5
    print('Inside function ',id(x))
    return x

x=4
print('Outside function ',id(x))
print(multiply_number_by_five(x))
print(x)                            # x is unchanged

Outside function  4316223952
Inside function  4316224464
20
4
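
The distinction between modifying an object in place and re-assigning a name also applies to mutable types. In this sketch (the function names `mutate` and `rebind` are hypothetical, chosen for illustration), appending to the list is visible to the caller, but assigning a brand new list to the argument name is not:

```python
def mutate(items):
    items.append('new')        # changes the object the caller also sees

def rebind(items):
    items = ['replaced']       # makes the local name point to a new list
    print('Inside rebind ', items)

data = ['dog', 3, 4]
mutate(data)
print(data)                    # the appended element is visible here
rebind(data)
print(data)                    # unchanged by rebind
```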


Further reading on this can be found here [https://www.python-course.eu/python3_passing_arguments.php](https://www.python-course.eu/python3_passing_arguments.php)


## More properties of functions

### Returning multiple values

A function can also return multiple values, and there are a number of ways to access these multiple values.
The next two examples use the same function but show two different ways to access the values.

In this first example we provide a single variable to assign the function output to. The result is that the variable is a tuple.

In [33]:
def isaac_details():
    name = "Isaac Newton"
    dob  = "4th January 1643"
    pob  = "Woolsthorpe"
    return name,dob,pob

details=isaac_details()
print(details,type(details))

('Isaac Newton', '4th January 1643', 'Woolsthorpe') <class 'tuple'>


This time, we provide 3 variables to assign the function output to.

In [36]:
name,dob,place=isaac_details()
print(name)
print(dob)
print(place)

Isaac Newton
4th January 1643
Woolsthorpe


Note that the first method returns a tuple, hence we can't alter the contents of elements in it (tuples are immutable).

### Global vs local variables

Variable names defined outside of functions are called _global_ variables. Depending on how the function is defined, the function can access these global variables.

The following function will use global variables:

In [38]:
def print_x():
    print(x)
    
x=10
print_x()

10


In the following function, we create a _local_ variable in the function (through the argument definition). Even though it has the same name as the global variable, Python does not fall back on the global value: the argument _x_ must be supplied when the function is called, so calling it without one gives us an error

In [39]:
def print_x(x):
    print(x)
    
x=10
print_x()

TypeError: print_x() missing 1 required positional argument: 'x'

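The following will also give an error. Because _x_ is assigned somewhere inside the function, Python treats _x_ as a local variable throughout the function, even though the assignment comes after the print statement.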

In [40]:
def print_x():
    print(x)
    x=20
    
x=10
print_x()

UnboundLocalError: local variable 'x' referenced before assignment

What happens if we now just define a _local_ variable in the function that doesn't correspond to a global variable?

In [42]:
def f(y):
    temp=y
    
x=10
f(x)
print(temp)

NameError: name 'temp' is not defined

I tried to print the local variable _temp_ after running the function, however, we get an error saying it is not defined. This is because temp only exists inside the function and is effectively destroyed once the function ends. 


### Optional Parameters

Functions can also have optional (or default) arguments - they don't have to be given when the function is called.

Too many arguments can be cumbersome when actually using the function. In many situations you will want the function to perform a specific task, but also allow it to be customisable when the need arises. Hence, one way to achieve this flexibility is to define default arguments.

In the following, first, I don't give the function anything for _value_. I then provide the optional argument in two ways.

In [45]:
def price_of(value='£1000'):
    print('The price of an Ipad is '+value)
    
price_of()
price_of('£500')
price_of(value='£400')


The price of an Ipad is £1000
The price of an Ipad is £500
The price of an Ipad is £400


You can also define the argument as None - which is a null value. Note, this is not the same as 0.

In [9]:
def price_of(value=None):
    if value is None: 
        value='£500'
        
    print('The price of an Ipad is '+value)
    
price_of()
price_of('$200')

The price of an Ipad is £500
The price of an Ipad is $200
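
One common reason for using None as the default is that default values are evaluated only once, when the function is defined. A mutable default such as an empty list would therefore be shared between calls. The sketch below (the function `add_reading` is a hypothetical example, not part of these notes) uses the None pattern to get a fresh list each time:

```python
def add_reading(value, readings=None):
    """Append value to readings, creating a fresh list if none is given."""
    if readings is None:
        readings = []          # a new list is created on every call
    readings.append(value)
    return readings

print(add_reading(1.5))
print(add_reading(2.5))             # a separate list, not a continuation of the first
print(add_reading(3.5, [1.0, 2.0])) # an existing list can still be passed in
```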


# Modules & Imports

We have already discussed running a list of instructions in a Python file (i.e. a file ending in _.py_). 

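However we can also create and use modules, which are .py files containing definitions (variables, functions, classes) that are intended to be imported into other code. A collection of modules is known as a package.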

We have already come across modules, such as numpy, and modules are imported as:

> import some_module

There are many such modules, each containing special functions, methods or types, and each designed to tackle a particular problem.

We can create our own modules simply. Enter the following in a new python file and save it as test_module.py

In [None]:
#test_module.py
const=4.67

def addFour(x):
    return x+4

def multip(a,b):
    return a*b


We can load the module in with import, i.e.
> \>import test_module

and use the constant and functions in the module as such:

> \>res=test_module.addFour(7)

> \>print(test_module.const)

We can also import the functions as

> \>from test_module import addFour, multip, const

> \>res=multip(5,const)

Or we can do:
> \>import test_module as tm

> \>res=tm.addFour(tm.const)

# Error messages

A final word in this notebook is on error messages.

As we have seen, Python will throw error messages if you do something incorrectly. The error messages are useful for understanding what has gone wrong, and will help you correct any problems with your code.

Any time you see an error message you don't understand - try [googling](https://www.google.com) it.

They may appear cryptic, but will help you fix any issues with your code! The quicker you start to use and understand the error messages, the easier it will be to diagnose and fix problems. 