
## ARCHER COURSE

# SCIENTIFIC PYTHON : INTRODUCTION


<br>

## Website:  http://www.archer.ac.uk 

## Helpdesk: support@archer.ac.uk

<br>

<img src="../images/epsrclogo.png" style="float: center">
<br>
<img src="../images/nerclogo.png" style="float: center">
<br>
<img src="../images/craylogo.png" style="float: center">

<br>
<img src="../images/epcclogo.png" style="float: center">

<br>
<img src="../images/ediunilogo.png" style="float: center"> 

<br>
<br>



<img src="../images/reusematerial.png" style="float: center; width: 90" >

<br>

# Scientific Programming with Python: Introduction


## Presenter: Adrian Jackson

#### Contributing authors: 
#### Adrian Jackson, Neelofer Bangawala, Arno Proeme, Kevin Stratford, Andy Turner 

<br>
<br>

<br>

## Scientific computing 


A typical workflow might include:



* Generate data
  * Perhaps from simulation on HPC facilities
  * Perhaps from experiment


* Process data
    * Compute/extract appropriate results from data


* Visualise results
  * To understand the significance of our work and gain scientific insight


* Communicate results
  * Publications, presentations, web, etc.


<br>

## Why Python?

https://www.python.org/

* Can write scripts (cf. `bash`, `perl`, etc.)

  * Easy to learn and write
  * Many many standard packages available


* Interactive interface(s)

  * Relatively easy to do relatively hard things
  * Exploratory work has low overhead


* Use as single interactive environment

    * Viable (free and open-source) alternative to, e.g., Matlab, R




<br>

## Using the bare Python shell


Interactive access to Python interpreter via the **`python`** command

Type commands directly at the prompt, e.g.,
```bash
bash-3.2$ python
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello World!")
Hello World!

>>> quit()
```

Use `CTRL-d` or `quit()` to exit

* OK for testing simple operations

#### Exercise
* From the `File` menu at the top left, select `File->New->Terminal`
and try out the commands above. 



<br>

## A Python script


<br>

We may place a series of python commands in a file.

Use a text editor / IDE / whatever to edit a file **<code>hello.py</code>**
```python
# <- Comments are introduced via a hash
print("Hello World!")
print("...from file")
```

This may be executed from the shell command line:
```bash
bash-3.2$ python hello.py
```


* Ideal for persistent, reusable, large (complex) code

#### Exercise

1. From the `File` menu at the top left, select `File->New->Text File`.
2. Enter a python command in the new window
3. Save the file with a `.py` file extension via `File->Save File As...`
4. From the terminal window, run the new script

Note. You may need to change directory in the terminal, e.g.:
```bash
bash-3.2$ cd lectures/user-intro
```
The present working directory can be identified via the command `pwd`.

<br>

## Unix execution

You may see, at the top of a script, something like
```python
#!/usr/bin/env python
...
```

This is a "shebang", which is an instruction to the Unix program loader on how to execute the contents (here "find python in the usual place in the current environment"). Windows users: see following section.

#### Exercise
1. Add the shebang to your python script `hello.py`
2. From the terminal, set the script to executable mode, and run via
```bash
bash-3.2$ chmod +x hello.py
bash-3.2$ ./hello.py
```

Note. You can check the mode of the file via, e.g.,
```bash
bash-3.2$ ls -l hello.py
-rwxr-xr-x  1 kevin  staff  41 Oct 24 09:55 hello.py
```
You should see `x` for executable in the mode on the left.



## Using Python on Windows (10)

From the main Windows Start Menu, it should be possible to locate
the Anaconda Launcher, which allows you to start various python-related
activities.

These include `jupyter-lab` which can be used as the basis of this
exercise. If you have a preferred _modus operandi_, use it.

Once started, the `File->New->Terminal` menu option provides `powershell`.

#### Exercise

Check you can a) start the bare python shell from the command line
as described above, and b) create and run a simple python script contained in a file. That is, run from the command line
```bash
$ python hello.py
```



<br>

## IPython shell

An enhanced interactive python shell

http://ipython.readthedocs.org/en/stable/overview.html


Includes

* Shell (or system) commands `ls`, `cd`, `pwd` etc
* Tab completion

* Getting help: use <b>`help`</b> or <b>`?`</b>, e.g., `help(int)`, `?int` or `int?`

* 'Magic' commands (commands to ipython itself rather than python):
  * <code> %hist</code> (history of commands)
  * <code> %save</code> (save input lines to a file for later use)
  * <code> %run</code> (run python script within shell)

<b>quickref</b> command gives a summary of capabilities



<br>

## jupyter-notebook (`.ipynb` files)

A browser interface to the IPython shell

* Input is split into cells
* So far we have seen "Markdown" cells
* Enter python commands into "code" cells (click to focus)
* Execute cells with SHIFT+ENTER
* Can clear single cell output: focus, and `Edit->Clear Outputs`

<br>


In [None]:

print("Hello World")



<br>

## jupyter-lab

This combines notebooks with other useful facilities such as the text editor
and terminal shell, as we are using here.


----

<br>

## Python basics

#### Assignment

One can assign values to variables


In [None]:

# Integers, e.g.,
nmax = 11

# Floating point numbers, e.g.,
pi = 3.14159

# Strings, e.g, 
string1 = 'single quotes'
string2 = "or double quotes"


#### type


Note. There are no variable 'declarations'. All python variables or objects have a type, and type is determined from context.

In assignment, type is determined by what is on the right hand side.

To determine the type of a variable, use the built-in `type()` function.


In [None]:

nmax = 11
pi = 3.14159
string1 = 'single quotes'

print("The type of nmax is:   ", type(nmax))
print("The type of pi is:     ", type(pi))
print("The type of string1:   ", type(string1))
print("The type of type() is: ", type(type))
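
Because there are no declarations, a name can later be assigned a value of a different type. A minimal sketch of this (the name `value` is purely illustrative):

```python
# A name simply refers to whatever was last assigned to it
value = 11
print(type(value))      # <class 'int'>

value = "eleven"
print(type(value))      # <class 'str'>

value = 11 / 2
print(type(value))      # <class 'float'>
```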


<br>

#### Operations

Operations are again determined by context:




In [None]:

nmax = 11
pi = 3.14159
string1 = 'single quotes'
string2 = 'or double quotes'

print("Integer addition:        ", nmax + nmax)
print("Floating point addition: ", pi + pi)
print("String addition:         ", string1 + " " + string2)

# The result of mixed operations...
# ...is via "promotion"

result = nmax + pi
print("The result is: ", result, " of type: ", type(result))



<br>

#### Exercise

One special case requiring care is division. Python has two operators:

* floating point division `/`
* integer division `//`

Check the results of the following. How does integer division behave? What happens if you try integer division with floating point numbers and vice-versa?

In [None]:

# Floating point

f1 = 2.0; f2 = 1.0; f3 = -1.0

print("f1 f2 = ", f1/f2, f2/f1)
print("f1 f3 = ", f1/f3, f3/f1)


In [None]:
# Integer

i1 = 2 ; i2 = 1; i3 = -1

print("i1 i2 ", i1//i2, i2//i1)
print("i1 i3 ", i1//i3, i3//i1)

In [None]:
# Floating point and Integer
f1 = 2.5

i1 = 2
# Try mixing floating point and integer numbers by completing the print statements below
print("f1/i1 =", )
print("i1/f1 =", )
print("f1//i1 =", )
print("i1//f1 =", )


<br>

## Lists

Lists of objects are introduced by square brackets
* A general container for "things"
* Access via [index] **starting at zero**


In [None]:

mylist = [False, 1, 2.0, "three"]

print("mylist is of type:         ", type(mylist))
print("The first element is:      ", mylist[0])
print("The length of the list is: ", len(mylist))


In [None]:

# Lists support operations

mylist1 = [1, 2, 3, 4, 5]
mylist2 = [6, 7, 8, 9, 10]

mylist1 + mylist2


In [None]:

# Lists support many "class" methods or functions

empty_list = []
empty_list.append(1)

print("No longer empty_list: ", empty_list)


<br>

## Tuples

Essentially a list which does not support assignment of elements
(it is _immutable_)

* Introduced via parentheses (usually)
* Indexed via square brackets (like list)


In [None]:

mytuple1 = (0, 1, 2)
mytuple2 = 1,

print("The first element is: ", mytuple1[0])
print("Is mytuple2 a tuple?: ", type(mytuple2) == tuple)


In [None]:

# Attempt at assignment will result in an error

mytuple1 = (-1, 0, 1)
mytuple1[0] = 1
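
If a changed version of a tuple is needed, one option is to build a new tuple rather than modify the existing one. A small sketch (the name `mytuple3` is illustrative only):

```python
mytuple1 = (-1, 0, 1)

# Concatenation creates a new tuple; the original is unchanged
mytuple3 = mytuple1 + (2,)

print(mytuple1)     # (-1, 0, 1)
print(mytuple3)     # (-1, 0, 1, 2)
```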


<br>

## Iteration via `for`

Many data types support **iteration**

* A general mechanism to move through elements one-by-one
* Note the colon at the end of the `for` statement
* Structured block identified by indentation


In [None]:

for c in "Hello":
    print("Character:", c)
    
print("World")


In [None]:

alist = [4, 3, 2, 1]

for value in alist:
    print("Element has value ", value)



<br>

#### enumerate()

The built-in function _`enumerate(list)`_ produces a counter alongside each element of the list:

In [None]:

blist = ["Andy", "Arno", "Kevin", "Neelofer"]

for index, name in enumerate(blist):
    print("Element", index, " has value ", name, blist[index])



<br>

#### Ranges

Built-in function _`range(min, max[, step])`_ 

* produces a range of integers `[min, max-1]`
* If `step` is omitted, it defaults to 1

In [None]:


for n in range(0, 4):
    print("Iteration", n)

print("End of iteration")
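
The optional `step` argument changes the increment between successive values. The following sketch uses arbitrary values chosen for illustration:

```python
# Start at 0, stop before 10, in steps of 3
for n in range(0, 10, 3):
    print("Iteration", n)

# A range can be converted to a list to inspect its contents
print(list(range(0, 10, 3)))    # [0, 3, 6, 9]

# A single argument is taken as max, with min = 0
print(list(range(4)))           # [0, 1, 2, 3]
```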



<br>

## Logic and conditionals

Logical values: `True` and `False`

There is also `None` (an empty or null value, which evaluates to `False`)

Logical (Boolean) operations are: `and`, `or`; unary `not`


In [None]:

# Conditional branches are indented
# Note the colons again:

expression = False

if expression:
    print("True")
else:
    print("False")

print("True and False is", True and False)
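
The remaining operators, and the behaviour of `None` in a conditional, can be checked in the same way. A short sketch (the name `value` is illustrative):

```python
value = None

# None is treated as False in a conditional
if not value:
    print("value is None (or otherwise empty)")

print("True or False is ", True or False)     # True
print("not True is      ", not True)          # False

# Empty containers and zero also count as False
print(bool([]), bool(0), bool("text"))        # False False True
```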


In [None]:

# Arithmetic comparison

i = 1

if i <= 2:
    print("i is less than or equal to 2")


<br>

## Functions


Built-in functions, e.g., _`print()`_, _`type()`_, _`range()`_
 https://docs.python.org/3/library/functions.html


* _Class methods_ are accessed with dot operator : `object.method()` e.g. `mylist.append()`
  
* Define your own function, e.g.:
```python
def my_square(x):
    return x*x
```
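
Once defined, such a function is called in the same way as a built-in function. A short usage sketch, with arbitrary arguments:

```python
def my_square(x):
    return x*x

print(my_square(3))      # 9
print(my_square(1.5))    # 2.25

# The result can be used in further expressions
total = my_square(2) + my_square(3)
print(total)             # 13
```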


#### Exercise

Write a function which takes as its argument a list of integers,
iterates through the list to compute the sum of the elements,
and returns that sum. Check you have the expected answer.




In [None]:

# Define the function here ...

# The sum for mylist should be 15

mylist = [1, 2, 3, 4, 5]

<br>

## Importing standard library modules

https://docs.python.org/3/library/

* There are a number of variations of the _import_ statement:

  * `import module`
  * `from module import name`
  


For example, the standard library contains a module `random` for the generation of random numbers. The module contains the functions `random()` and `choice()` (amongst others).

In [None]:

# First, functions may be accessed via module name...

import random

print(random.random())
print(random.choice(["yes", "no", "maybe"]))


In [None]:

# Alternatively

from random import random, choice

print(random())
print(choice(["yes", "no", "maybe"]))



<br>

## A Python module

A file with extension `.py` is a "module" i.e. something with python code in it.

It is sometimes useful to be able to execute the same file as a script, or to import it as
a module.

This is often done by checking the special variable `__name__`,
which takes on different values in different contexts.

```python
if __name__ == "__main__":
    
    # we execute in the context of a script...
    ...
```
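
As an illustration, a complete module might be laid out as follows. This is only a sketch: the file name `greeting.py` and the function `greet()` are hypothetical, and are not the contents of any file supplied with the course.

```python
# greeting.py (hypothetical file name)

def greet(name):
    """Return a greeting for the given name."""
    return "Hello " + name

if __name__ == "__main__":
    # Reached via "python greeting.py", but not via "import greeting"
    print(greet("from a script"))
```

When the file is run as a script, `__name__` is `"__main__"` and the greeting is printed. When it is imported, `__name__` is the module name, so only the definition of `greet()` takes place.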

#### Exercise

In the directory with this notebook is a python file `mymodule.py`.
Have a look at its contents.

Check you can run `mymodule.py` from the terminal as before:
```bash
bash-3.2$ python mymodule.py
```
and note the output.

Now check you can import and run the `my_name()` function from
the following cell.

In [None]:

# We can also import the module

import mymodule

mymodule.my_name("here!")



<br>

## Horrible Error! Help!

Errors produce a _stack trace_ which can be extremely long.

Scroll to the bottom and read the last message first.





In [None]:
# Not a correctly formed python statement

Whoops!


#### Exercise

The following cells have common errors (here deliberate). Look at the error message to
try to understand what is wrong:

In [None]:
# One

def my_function(a):
    print ("The argument is: ", a)


In [None]:
# Two

print ("The value of a is ", a)


In [None]:
# Three

if (True):
    print("Statement is true")
 else:
    print("Statement is false")

In [None]:
# Four

bells = ["St Clement's", "St Martin's", "Old Bailey", "Shoreditch"]

print("When I grow rich, say the bells at", bells[4])



<br>



## Some common problems


#### Dynamic typing


In [None]:

# Typos can cause bugs...

x1 = 1.0
xl = x1 + 2.0
print("I expect x1 to be: ", x1)



<br>

#### You may destroy something you didn't mean to...


In [None]:

print = "yes"
print("no!")



<br>

This can give rise to some weird and unexpected errors. E.g., in the above,
the intrinsic name _print_ has been redefined to be a string (it's no longer a callable function).

In this particular case, it is possible to recover by explicitly deleting the new string:

In [None]:

print = "yes"

del print
print("okay!")





Ultimately, you may need to restart the python kernel to recover sanity.
From the menu `Kernel->Restart Kernel...`

<br>

#### White space is significant

Errors in indentation can result in incorrect iteration or logic, e.g.:


In [None]:

def my_function(arg):

    if arg:
        # ... run the True branch
        result = True
    else:
        # ...perhaps a long and highly indented False branch...
        result = False
        # ...
    
        # Is this in the false branch?
        print("The result is: ", result)

my_function(False)


Prefer splitting long and/or highly indented regions into separate functions.
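
One way to restructure the example above is sketched here. The helper names `true_branch()` and `false_branch()` are hypothetical, and a `return` has been added for convenience; the point is only that the final `print()` now sits visibly outside both branches.

```python
def true_branch():
    # ... work for the True case
    return True

def false_branch():
    # ... perhaps long and complex work for the False case
    return False

def my_function(arg):
    if arg:
        result = true_branch()
    else:
        result = false_branch()

    # Clearly outside both branches
    print("The result is: ", result)
    return result

my_function(False)
my_function(True)
```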


<br>

#### Notebooks have memory

The order of execution of cells may be important. Variables and values are carried forward.

Editing long and complex notebooks can introduce unintended dependencies
on the order of execution of the cells.


In [None]:

# This cell must be executed ...

i10 = 1


In [None]:

# ...before this cell.

print("The value of i10 is", i10)



Use ipython `%whos` to check the current namespace and values.

If unexpected errors persist, it may be necessary to clear all the output
and history from the menu `Edit->Clear All Outputs`, and re-execute cells in the correct order.

While notebooks are good for experimentation and short tests, substantial working code is probably best kept as a final version in a `.py` file and imported. This is a common development pattern.


<br>

## Some other cautions 

 
* Some compatibility issues between versions 2.x and 3.x

  * E.g., `print` is a statement in python2 but a function in python3
  * In python2, `/` applied to two integers performs integer division
  * Some older packages are python2
  * See https://wiki.python.org/moin/Python2orPython3
  * Formal support for python2 ended in January 2020
  
  
* Using many packages risks complex dependency problems

    * Package A depends on specific version of package B depends on...
    * Package managers such as anaconda are often used
    

* Python is "interpreted"; can be "slow"

    * ...compared with a compiled language version of the same thing
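
A rough feel for this overhead can be had by timing an explicit Python loop against the built-in `sum()`, which in the standard CPython interpreter is implemented in compiled C. This is an illustrative sketch only: the function name `loop_sum()` is made up for the example, and the timings will vary from machine to machine.

```python
import timeit

values = list(range(100000))

def loop_sum(numbers):
    # Explicit loop: every iteration is handled by the interpreter
    total = 0
    for n in numbers:
        total += n
    return total

# Time 20 repetitions of each approach
t_loop = timeit.timeit(lambda: loop_sum(values), number=20)
t_builtin = timeit.timeit(lambda: sum(values), number=20)

print("Explicit loop: ", t_loop, "seconds")
print("Built-in sum():", t_builtin, "seconds")
```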


<br>

## Summary

* We have reviewed some Python syntax, data structures, functions, and modules

* We have also been introduced to the IPython shell

* The following lectures will look at some standard packages

  * NumPy
  * Matplotlib
  * SciPy







<br>

## Final Exercise: Median


Define a function <code>my_median()</code> that takes a list
of dates of birth (just the year) of individuals as its single
argument, and returns the median age from the list.

The function must do a number of things:

1. Compute a new list of ages "now" from the dates of birth;
2. Sort the resulting list (how!?);
3. Decide whether there is an even or odd number in the sample;
4. Return the appropriate median

Assume "now" is 2019, and someone born in 2018 is one year old etc.

If `dates_of_birth = [1989, 1955, 2011, 1943, 1976]`,
and `year_now` is 2019, then the result should be
```python
print(my_median(dates_of_birth))
43
```


Use the above list of numbers as input to check you get the right answer.



In [None]:

# Type your solution in the notebook,
# or use an editor to create a separate script in a file,
# or use an IDE.
# Use the approach you are most comfortable with...

year_now = 2019

def my_median(dobs):

    # Compute the median from a list of dates of birth
    
    return median


dates_of_birth = [1989, 1955, 2011, 1943, 1976]

print(my_median(dates_of_birth))



Clearly, a hardwired
```python
year_now = 2019
```
is not very robust (the answer will be wrong unless you happen to be interested in 2019).

Can you find a method from the standard library which will provide the current year?


<br>

## Solution: Median


In [None]:
# Uncomment and execute this cell to see a solution
#%load mymedian.py


<br>

## Extra exercise

#### Using an external file

Suppose the list of dates-of-birth was stored in a file (we have provided one called 
<code>years.txt</code>). The file is structured as follows:
```
total number of years
1 year1
2 year2
...
```

If we wanted to read the data from the file and store in a list,
we can do something like the following. The file is opened for reading,
or in mode 'r', and the content may be read line-by-line:

```python
...
infile = open(filename, 'r')    # avoid shadowing the built-in input()
line = infile.readline()
line = line.strip()             # strip() returns a new string
tokens = line.split()
...
```
At this point, the variable `tokens` will contain a list of the tokens, or the strings on
the line that has been read in. (Find out exactly what the string functions `strip()` and
`split()` do.) We are then in a position to convert the strings to integers
and treat them appropriately.
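
For a single line held in a string, these steps might look like the following sketch. The sample text `" 1 1989\n"` is made up for illustration and is not taken from `years.txt`.

```python
line = " 1 1989\n"

# strip() returns a copy without leading/trailing white space
line = line.strip()
print(repr(line))           # '1 1989'

# split() breaks the string into a list of strings at white space
tokens = line.split()
print(tokens)               # ['1', '1989']

# Convert the strings to integers
index = int(tokens[0])
year = int(tokens[1])
print(index, year)          # 1 1989
```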

Use the cell below to attempt to read the data from the file.

If we wished to store some processed results to a new file at the end of the
procedure, we could open a new file for writing (mode 'w'), schematically:

```python

output = open(filename, 'w')
...
output.write("{0:2d} {1:2d}\n".format(1, years[0]))
...
output.close()

```

Again, check the string `format()` function in the documentation.
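
As a quick check, the format string used above can be tried on literal values before anything is written to a file. The values here are made up for illustration:

```python
# {0:2d} formats argument 0 as an integer at least 2 characters wide
text = "{0:2d} {1:2d}\n".format(1, 1989)
print(repr(text))               # ' 1 1989\n'

# A width smaller than the number of digits is not an error
print("{0:2d}".format(12345))   # 12345
```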

Have a go in the cell below.


In [None]:
# Read the data from the file and compute the median

# E.g., write the corresponding ages to a new file


In [None]:
# Example solution. You can try to import the python code in the file ex_file.py and
# use the functions therein. Have a look to see what they do.

import ex_file



We will see more convenient ways to deal with external files later.