# Foundations of Python

This notebook introduces students to some of the most common parts of the Python programming language. It is intended as a *cheat sheet* for students to refer to when working on the batch processing script. Look through the aspects of Python below and have a go at running each cell. Modify the code (eg. add new variables or functions) and run it again to help you understand how the computer is interpreting and acting on the instructions.


## Introduction


This is a ***Jupyter notebook***. It is a neat way to have runnable Python code alongside notes and embedded graphs and outputs from the code. Notebooks are commonly used to develop bits of research code. The notebook is made up of code cells, which you can edit and run, and markdown cells, where text can be written.

The cell below includes a bit of python code. Click on the box and then press the ***Run*** button at the top of the screen to see it run.

In [None]:
print('Hello world')

### The basics

The cell below gives a brief overview of some of the most common bits of syntax in python. It consists of a set of comments (which the computer will not try to run as commands), and a set of commands, where you tell the computer what you want it to do.

The code below sets some ***variables***, prints them out and does some maths on them. Note that there are two broad types of ***variables*** below: numbers (known as ***floats*** or ***integers***) and words (known as ***strings*** or ***characters***). Python treats these data types differently, allowing you to build up complex programs.

It is **very important** that you keep track of what type each piece of data is saved as and only use the right type of data in each operation. For example, $\sqrt{x}$ would not mean anything if ***x="cabbage"***.
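As a minimal sketch of why types matter, the cell below uses the built-in ***type()*** function to check what each variable holds, then tries the square root example. The variable names ***number*** and ***label*** are made up for illustration.

```python
import math

number = 9.0          # a float
label = "cabbage"     # a string

# type() reports what kind of data a variable holds
print(type(number))
print(type(label))

# the square root of a number works as expected
print(math.sqrt(number))

# the square root of a word does not: python raises a TypeError
try:
    math.sqrt(label)
except TypeError as err:
    print("Cannot take the square root of a word:", err)
```

If you are ever unsure why an operation fails, printing ***type()*** of each variable involved is a good first check.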


**Click and run each of the code blocks below to see what it does. Have a go at modifying values and rerunning. Make sure you understand how the code is making the computer perform tasks**.

In [None]:
'''
Examples of the basic aspects of python3
'''

# Comments can be behind "#"
'''Or behind three quotation marks'''
# use # for comments within the code, and ''' for descriptions of functions


# print something to the screen
print("Welcome to python3")


In [None]:

# set a variable. The variable will hold this value until you over-write it.
x=2.718281828459045


# write variable to screen
print(x)


In [None]:

# set a "string", ie a word. Note we have overwritten "x" set earlier and 
# changed it from a number to a word
x="This is a word"


# print a string variable to the screen
print(x)


In [None]:

# Multiple variables can be printed out
x=2.718281828459045
y=6.52
z="a note"
print(z,x,y)


In [None]:

# basic maths can be done in python
x=2.718281828459045
y=3.2
z=x*y
print(x,"times",y,"equals",z)


## Arrays and lists

We can also store lists of things in a single variable, which is a useful way of passing sets of data around (it would get very tedious to have to define thousands of separate variables to hold every brightness value from a 1 Mpixel photograph).

In [None]:
'''
Basics of lists and 
arrays in python3
'''


# We can also store lists of things in a single variable
x=[1,2,3,4,5]

# This can be used to pass around large amounts of data without needing thousands of variables
print(x)


In [None]:

# it can be accessed by telling python which "element" you want
print(x[0])


# we can also count from the back
print(x[-1])
print(x[-2])


# Note that the elements start from "0" and go, in this case, to 4.
print(x[0],x[4])


In [None]:
# lists can be of any type
x=["This","is","now","a","string"]


# or even a mix
x=[1,2,"sausage",4,5]
print(x)


There are many other ways to hold large amounts of data in Python, eg. dictionaries and tuples, but lists are perhaps the most common.
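For reference, here is a brief sketch of the two other containers mentioned above. The names and values (***point***, ***site***, ***"plot_1"***) are invented purely for illustration.

```python
# a tuple is like a list, but it cannot be changed once made
point = (55.95, -3.19)
print(point[0], point[1])

# a dictionary stores values under named "keys" rather than positions
site = {"name": "plot_1", "height_m": 23.5}
print(site["name"], site["height_m"])

# dictionary values can be updated by key
site["height_m"] = 24.0
print(site)
```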

In GIS and EO, we often want to hold large arrays of numbers. This can be done efficiently using a "numpy" array, which has been specifically designed for this.
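One practical difference is how arithmetic behaves. The sketch below, using made-up values, multiplies a list and a numpy array by 2: the list is repeated, whereas the array has every element doubled.

```python
import numpy as np

values = [4, 3, 2, 1]

# multiplying a list repeats it
asList = values * 2
print(asList)     # [4, 3, 2, 1, 4, 3, 2, 1]

# multiplying an array acts on every element
asArray = np.array(values) * 2
print(asArray)    # [8 6 4 2]
```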

In [None]:

import numpy as np


# set some data (most often we will read from file)
x=np.array([4,3,2,1])


# It can be accessed in a similar way to lists
print(x[0],x[2])


# but now we can perform maths on the whole array
y=x/2
print("original",x,"modified",y)


In [None]:
# arrays can also be multi-dimensional. Let us make one full of "-1"
x=np.full((100,100),-1)

# the "np.full()" command tells it to make an array full of a given value
# the "(100,100)" tells it to make it 100*100 elements
# the "-1" tells it what value to fill the array with


Now we can store large amounts of data in a single variable. Let us see how we can analyse that.
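A first step in analysing an array is often to check its size and look at a few elements. The sketch below assumes a 100 by 100 array filled with -1; the variable name ***grid*** is chosen just for this example.

```python
import numpy as np

# a 100 by 100 array filled with -1
grid = np.full((100, 100), -1)

# .shape gives the number of elements along each dimension
print(grid.shape)     # (100, 100)

# elements of a 2D array are accessed with two indices: [row, column]
grid[0, 5] = 7
print(grid[0, 5])

# a colon selects a range; here the first three elements of row 0
print(grid[0, :3])    # [-1 -1 -1]
```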

# Plotting data

Often we need to visualise data. Python has a very useful plotting package, ***matplotlib***.

In [None]:

# import a useful plotting package.
import matplotlib.pyplot as plt


# let us read some data in.
import numpy as np
filename='data/practice/practice_data.csv'  # this is the name of a file on the disk
x,y=np.loadtxt(filename,delimiter=',',unpack=True,dtype=float,comments="#")


# set the labels
plt.xlabel('x')
plt.ylabel('y')

# tell it which data to plot, and what format. '.' means point
plt.plot(x,y,'.')

# print to screen
plt.show()


In [None]:
# or we can plot with a line
plt.plot(x,y)
plt.show()

# Program flow: Loops, functions and if's

## Functions
So far we have written "monolithic" code. We can create reusable chunks of code called "functions". We have already been using these from some packages (eg ***plt.plot()***), but can create our own.

To create a function, define it, then populate it with commands, each indented by the same number of spaces, as in the cell below. We can then reuse this function throughout our code. This allows us to build libraries of tools which we can reuse throughout our work for decades to come.


In [None]:
def myFunc(x):   # function name and input variables
  '''Describe the function here'''
  y=x*x       # do some analysis
  return(y)   # pass the results back


What happens when you run the cell above?

The answer is hopefully nothing. Here you have defined a set of instructions for the computer, but you have not asked it to carry out those instructions. You can think of defining a function a bit like planning a hillwalking route. Everything needed to complete the walk is there in the instructions, but the walking (the time consuming part) does not start until you set off.

We set off a function by ***calling*** it.

In [None]:

# set some variable ready to pass to our function
x=2

# call our function
res=myFunc(x)

# write the results to screen
print("The result of",x,'times',x,"is",res)


The important thing to understand with functions is what data goes in (*the bit inside the brackets after the function name*) and what data comes out (*the bit in the return*). If you pass the wrong data to or from a function, it will not give you an accurate result.

The function above takes a single variable, which it will then try to multiply by itself. This must be a number (as a word times a word has no meaning). **Importantly**, whatever data you pass to the function will be renamed within it, so the variable names inside the function can be different from those outside it. This lets you iteratively call a function and pass a whole list of variables to it (eg. a filename list).
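The renaming can be seen in the minimal sketch below. The function ***square()*** and the variables ***width*** and ***sizeList*** are hypothetical names chosen for this example; inside the function the data is called ***value***, whatever it was called outside.

```python
def square(value):
    '''Return the input multiplied by itself'''
    result = value * value
    return result

# outside the function the data is called "width"
width = 3
area = square(width)
print(width, "squared is", area)

# the same function can be called on each element of a list in turn
sizeList = [1, 2, 3]
for size in sizeList:
    print(size, "squared is", square(size))
```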

It is good practice to group all commands into functions to make them easier to rerun and your code more reusable. We will make extensive use of functions in the biomass mapping exercise.

## Loops


The real power of a computer comes from its ability to perform repetitive tasks quickly. We can "loop" over a set of tasks, repeating them one by one.

Below is an example of a simple loop:


In [None]:

# make a list of numbers
numList=[1,2,3,4,5,6,7,8,9]

# the loop below will set the variable, i, equal to each element
# of the list, numList, then carry out whatever instructions are
# indented within the loop
for i in numList:
  print(i)



There are also built-in functions, such as ***range()***, which can make sequences of numbers between given bounds.
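To see exactly which numbers ***range()*** produces, it can be converted to a list and printed, as in the sketch below. Note that the upper bound is not included. The variable names are made up for illustration.

```python
# range(start, stop) runs from start up to, but not including, stop
firstTen = list(range(0, 10))
print(firstTen)     # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# with a single value, range() starts from 0
firstFive = list(range(5))
print(firstFive)    # [0, 1, 2, 3, 4]

# a third value sets the step size
evens = list(range(2, 11, 2))
print(evens)        # [2, 4, 6, 8, 10]
```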

In [None]:

# loop over a list made by the range() function
for i in range(0,10):
  print(i)


Below is an example of reading data through a function and then looping over it to write the elements out one by one.

Here ***len(x)*** returns the length of the array. This way we can read different sized data files and the code will automatically scale the loop to fit the data.
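The same looping pattern can be tried without a data file. The sketch below assumes two short arrays typed in by hand (the values are made up) and shows the ***range(len())*** loop alongside the built-in ***zip()*** function, which steps through two arrays together.

```python
import numpy as np

# two small arrays standing in for data read from a file
x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])

# loop over the length of the array, using i as the element number
for i in range(0, len(x)):
    print(x[i], y[i])

# zip() pairs up the elements, so no element number is needed
for xVal, yVal in zip(x, y):
    print(xVal, yVal)
```

Changing the number of values in ***x*** and ***y*** changes how many times each loop runs, without any other edits to the code.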

In [None]:
import numpy as np

def readData(filename):
  '''A function to read some data'''
  x,y=np.loadtxt(filename,delimiter=',',unpack=True,dtype=float,comments="#")
  return(x,y)

x,y=readData('data/practice/practice_data.csv')

# loop over the length of the array
for i in range(0,len(x)):
  print(x[i],y[i])

## If's

We won't use these in the course today, but they are included for completeness. Feel free to skip to the second notebook now. If you want to learn more python, read on.

An if statement lets us control the flow of a program, allowing it to do different things in different conditions. This allows our program to make "decisions", making it more flexible.

In [None]:

z=0.34
threshold=0.5

if z>threshold:
  print(z,"is bigger than",threshold)
else:
  print(z,"is not bigger than",threshold)


This can be used to change the behaviour of the program based on the data that is passed to it. For example, the code below will only print out elements of a dataset if they are more than a certain number of standard deviations from the mean.

Here we have made use of the ***mean()*** and ***std()*** functions within the ***numpy*** package to calculate the mean and standard deviation of our data.
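As a sketch of the same thresholding idea without a data file, the cell below uses a small hand-typed array (the values are made up, with one obvious outlier). Instead of a loop and an if statement, it selects the outlying values in one step; when comparing whole arrays, the element-wise ***|*** operator is used, with each comparison in its own brackets.

```python
import numpy as np

# made-up data with one value far from the rest
y = np.array([1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 9.0])

# thresholds at 2 standard deviations either side of the mean
meanY = np.mean(y)
stdevY = np.std(y)
minThresh = meanY - 2 * stdevY
maxThresh = meanY + 2 * stdevY

# keep only the values outside the thresholds
outliers = y[(y < minThresh) | (y > maxThresh)]
print(outliers)
```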

In [None]:
# read the data (the same practice file used in the plotting section)
filename='data/practice/practice_data.csv'
x,y=np.loadtxt(filename,delimiter=',',unpack=True,dtype=float,comments="#")


# eg, only print values more than 2 standard deviations from the mean
meanY=np.mean(y)
stdevY=np.std(y)
# set thresholds
minThresh=meanY-2*stdevY
maxThresh=meanY+2*stdevY

# only write those values
for i in range(0,len(x)):
  if (y[i]<minThresh) or (y[i]>maxThresh):
    print(x[i],y[i])
