# Introduction to python

In this notebook, you can play about with some fundamental python techniques, before we go on to do some simple image processing in python.


## How to use a python notebook
The idea of a python notebook is that you can quickly iterate on your code without mucking about in the command line. 

A python notebook consists of 'cells', each of which is a semi-independent chunk of code. Cells can also contain text written in 'markdown' (like this one), which is a bit like HTML and renders into nicely formatted text.

Each section in gray is a cell; cells with 'In [ ]:' beside them are input cells for code. The output of the code is shown below the code cell. Note that only the last output will be automatically shown, so if you want to see results from further up, you have to explicitly print them.

To run your code, highlight a cell by clicking on it (it should then be outlined in green), then either click the play button at the top or press Shift + Enter. Alternatively, you can press CTRL + Enter, which runs the cell without moving you into the next one and can be useful sometimes.

Note that to use things defined in a cell, it must be executed - this can confuse you if you change something in a bit of code but forget to re-run it!

Go ahead and run the cell below this text to try it.

In [None]:
# Python is really cool
import this

That should have printed some text telling you the Zen of python. The Zen is a good thing to keep in mind, but you probably don't have to worry about it yet!

# Contents
Below are links to sections of this notebook that explain some important concepts, or give example code. If you click them, you should be taken to roughly the right place in the notebook to get some more information.


[Data Types and Variables](#DataTypesAndVariables)

[Operators](#Operators)

[Compound Operators](#CompoundOperators)

[Other Data Types](#OtherDataTypes)

[Data Structures](#DataStructures)

[Tuples](#Tuples)

[Dictionaries](#Dictionaries)

[Lists](#Lists)

[Program Control](#ProgramControl)

[For Loop](#ForLoop)

[While Loop](#WhileLoop)

[A Note on Whitespace](#WhitespaceNote)

[Break and Continue](#BreakAndContinue)

[Example: Finding prime numbers](#PrimesExample)

[Functions](#Functions)

[Plotting](#Plotting)

[A Very brief introduction to numpy](#NumpyQuickstart)

[Example: Plotting the sin$^4$ function](#Sin4Example)

[Filtering Signals](#FilteringSignals)

[Finding Peaks in data](#PeakFinding)

## Data Types and Variables
<a id="DataTypesAndVariables"></a>
The idea of a data type and a variable go hand in hand, so let's look at them together.

A variable is like a labelled box that you can put something in. For example:

In [2]:
aVariable = 10

In this example, the variable is called `aVariable` (i.e. the label on the box), and the value stored in it is `10`. You can name your variable whatever you like, as long as it isn't one of the reserved names (listed below), doesn't start with a number, and doesn't use non-alphanumeric characters. The only non-alphanumeric character you're allowed is the underscore (_).

The protected names are:

    and       del       from      not       while    
    as        elif      global    or        with     
    assert    else      if        pass      yield    
    break     except    import    print              
    class     exec      in        raise              
    continue  finally   is        return             
    def       for       lambda    try

Each of these does something in python, so you shouldn't name a variable with one of these.

Another important thing is the use of code comments. In any line of python code, everything to the right of a # symbol is ignored. The usual use of this is to put little notes in your code so that six-months-from-now you can work out what is going on. Comments are very important and you should get into the habit of making them.
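
As a small illustration of the comment syntax (the variable names here are made up for the example), note that a `#` inside a string is part of the string and is not treated as a comment:

```python
# A whole-line comment: python skips this line entirely
sampleCount = 12 # An end-of-line comment: only the text after the # is skipped

# A # inside a string is part of the string, not a comment
label = "Sample #3"
print(label)

# Comments are also handy for switching a line off temporarily:
# sampleCount = 0
print(sampleCount)
```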

The data type of `aVariable` is determined by what is stored in it - this is what makes python a dynamically typed language. In the case above, `aVariable` is an integer:

In [3]:
print(type(aVariable))

<type 'int'>


Dynamic typing means that we can change the data type stored by `aVariable` whenever we want:

In [None]:
aVariable = 10.0
print(type(aVariable))

This introduces two of the fundamental data types in python: integers and floats. Integers hold only whole numbers, while floats are needed to represent real numbers (numbers with a fractional part).
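
A short sketch of moving between the two types, using the built-in `int()` and `float()` functions (the variable names are only for illustration):

```python
wholeNumber = 7
realNumber = float(wholeNumber) # Convert an int to a float
print(realNumber)               # 7.0
print(type(realNumber))

truncated = int(3.99)           # Converting a float to an int drops the fractional part
print(truncated)                # 3

mixed = wholeNumber + 0.5       # Mixing an int and a float gives a float
print(mixed)                    # 7.5
print(type(mixed))
```

Note that `int()` truncates towards zero rather than rounding, so `int(3.99)` is 3 and not 4.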

## Operators
<a id="Operators"></a>

You can do all the usual arithmetic operations with ints and floats: addition, subtraction, division, multiplication and exponentiation, as well as some less common (in everyday use at least) operations like floor division and modulo division:

In [None]:
# Make up a couple of variables:
variable1 = 10
variable2 = 3

# Addition
print(variable1 + variable2)# <-- Expect 13
# Subtraction
print(variable1 - variable2)# <-- Expect 7
# Division
print(variable1 / variable2)# <-- 10/3 should be 3.333333... Why is it only 3?
# Multiplication
print(variable1 * variable2)# <-- Expect 30

In [None]:
# Now do the same thing with floats:
variable3 = 10.0
variable4 = 3.0

# Addition
print(variable3 + variable4)# <-- Expect 13.0
# Subtraction
print(variable3 - variable4)# <-- Expect 7.0
# Division
print(variable3 / variable4)# <-- Expect 3.3333...
# Multiplication
print(variable3 * variable4)# <-- Expect 30.0

print

In [None]:
# Now look at some of the stranger operators

# Exponentiation
print(variable1 ** variable2)# 10^3 = 1000
print(variable3 ** variable4)# Float version gives 1000.0

# Modulo division
print(variable1 % variable2)# 10/3 = 3 remainder 1 (so should get 1)
print(variable3 % variable4)# Float version gives 1.0

# Floor division
print(variable1 // variable2)# Same as normal division for integers
print(variable3 // variable4)# Should be 3.0

__Important Note__:

Because of how integer division works, it is possible to get totally incorrect results if you're not careful. For example:

In [None]:
print(1/2)# Both integer, so what happens?
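
One way to sidestep this, sketched below, is to make sure at least one of the numbers is a float before dividing, and to use `//` when you really do want the whole-number result. These three lines give the same answers in Python 2 and Python 3:

```python
half = 1.0 / 2           # One operand is a float, so we get real division
alsoHalf = float(1) / 2  # Converting explicitly works too
floored = 1 // 2         # Floor division: we are asking for the whole-number part

print(half)     # 0.5
print(alsoHalf) # 0.5
print(floored)  # 0
```

Python 2 also provides `from __future__ import division`, which makes `/` behave as it does in Python 3 for the whole file.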

## Compound operators
<a id="CompoundOperators"></a>
Compound operators do two things at once, for example:

In [None]:
someVariable = 10
someVariable *= 2
print(someVariable)

In this case, we multiplied `someVariable` by 2, then assigned the result of that back into `someVariable`. This is equivalent to the following code:

In [None]:
someVariable = 10
someVariable = someVariable * 2
print(someVariable)

but a bit tidier.

There are a bunch of operators like this: `+=`, `-=`, `*=`, `**=`, `/=`, `%=`, and `//=`. They should be used with caution, but can make your code much cleaner.
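
A quick sketch chaining several of these operators on one made-up variable. `/=` is left out here because its result on integers depends on whether you are running Python 2 or Python 3:

```python
counter = 10
counter += 5   # 10 + 5   -> 15
counter -= 3   # 15 - 3   -> 12
counter *= 2   # 12 * 2   -> 24
counter //= 5  # 24 // 5  -> 4
counter **= 2  # 4 ** 2   -> 16
counter %= 7   # 16 % 7   -> 2
print(counter) # 2
```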

## The other data types
<a id="OtherDataTypes"></a>
There are a few other data types you need to know about, starting with the simplest: booleans.

Boolean variables can have one of two values: `True` or `False`. Probably the most common place you will see them is as a result of comparison operators:

In [None]:
aBigNumber = 1000
aSmallerNumber = 250

# The equality operator
print(aBigNumber / aSmallerNumber == 4)
print(type(aBigNumber / aSmallerNumber == 4))

# The greater than operator:
print(aBigNumber > aSmallerNumber)
print(type(aBigNumber > aSmallerNumber))

# The greater than or equal to operator:
print(aBigNumber >= aSmallerNumber)
print(type(aBigNumber >= aSmallerNumber))

# The less than operator:
print(aBigNumber < aSmallerNumber)
print(type(aBigNumber < aSmallerNumber))

# The less than or equal to operator:
print(aBigNumber <= aSmallerNumber)
print(type(aBigNumber <= aSmallerNumber))

Booleans can be used with a few conditional operators:

In [None]:
someNumber = 36
factor1 = 3
factor2 = 5

# Using the and operator
# Are both 3 and 5 factors of 36?
print(someNumber % factor1 == 0 and someNumber % factor2 == 0)

# Well maybe one of them is, use the or operator
print(someNumber % factor1 == 0 or someNumber % factor2 == 0)

# What if I want to know if neither is a factor?
print(not(someNumber % factor1 == 0) and not(someNumber % factor2 == 0))# Extremely contrived example!

The last data type is the string. 

Strings are denoted using double quotes (`"`), though they can also be denoted using single quotes (`'`). 

Strings are pretty simple, but have a few gotchas:

In [None]:
aString = "Hello "# Simple Enough
print(aString)

# Addition works with strings!
print(aString + "World!")

# Multiplication also works! (but maybe not how you expected...)
print(aString*3)

# To get the result of calculations into strings, you need to do some conversions...
tenSquared = 10 **2
print("Ten squared is: " + str(tenSquared))# First way - okay if you want the number at the end
print("%d is ten squared"%(tenSquared))# This way allows you to put the number anywhere, but you need to learn some format codes
print("I told you, {0} is ten squared!".format(tenSquared))# Preferred way - can put multiple results in easily.

The `.format()` method for putting things into strings is preferred because you can place several values anywhere in the string and control how each one is displayed. To restrict the precision of floating point numbers, you can specify the number of decimal places to use:

In [None]:
# Approximate value of pi:
approxPi = 22.0/7.0 # Note - used float division!

print("My approximate pi is accurate to 0 DP: {0:.0f}".format(approxPi))
print("My approximate pi is accurate to 1 DP: {0:.1f}".format(approxPi))# Specify a number of DP after a point
print("My approximate pi is accurate to 2 DP: {0:.2f}".format(approxPi))
print("But not 3 DP: {0:.3f}".format(approxPi))# Note - it will round the printed value!
print("The approximate value is: {0}".format(approxPi))


One really annoying 'feature' of Windows is that the backslash (\\) is used in directory names.

Unfortunately, the backslash is also used to escape characters in strings. What does this mean? Consider the directory C:\newStuff:

In [None]:
print("C:\newStuff")

This will most certainly cause headaches in code. Fortunately, there are ways round it:
    

In [None]:
print(r"C:\newStuff")# Tell python to treat it as a 'raw string'
print("C:\\newStuff")# Escape the escape character!

Which trick you use is up to you, and will depend on the characters needed in the directory. This is one place where it would be a good idea to put a comment explaining why you did it the way you did!

Strings can be treated like arrays of characters, so you can access any element, but you can't change it.
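
A minimal sketch of what that means in practice. The `try`/`except` block is not covered in this notebook; it is used here only so that the failed assignment prints a message instead of stopping the cell:

```python
word = "python"
print(word[0])   # First character: p
print(word[-1])  # Negative indices count from the end: n
print(word[1:4]) # A slice from index 1 up to (not including) 4: yth
print(len(word)) # Number of characters: 6

# Trying to change a character raises a TypeError
try:
    word[0] = "P"
except TypeError as error:
    print("Cannot change a string in place: {0}".format(error))

# Build a new string instead
capitalised = "P" + word[1:]
print(capitalised) # Python
```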

# Data Structures
<a id="DataStructures"></a>
Data structures are the mechanisms by which data is organised into a structure. In python there are three ways to do this.

## Tuples
<a id="Tuples"></a>
Tuples can contain any number of elements, but once you create one, you can't change it. Tuples are created using 'normal' parentheses:

In [None]:
# Make a tuple of three numbers
aTuple = (1,2,3)
print(aTuple)# This is fine
print(aTuple[0])# This is also fine
#aTuple[1] = 4 # This will not work!

# Tuples can have anything in them - eg tuple of tuples
anotherTuple = ((1,0,0),(0,1,0),(0,0,1))# Happens to be an identity matrix!
print(anotherTuple)# But python doesn't know that...

# You can even have different data types in a tuple:
yetAnotherTuple = ('a', (1,2), 3)
print(yetAnotherTuple)

# Some arithmetic works on tuples, but might not do what you expected
print(aTuple*3)
print(aTuple + aTuple) # Addition only works between two tuples!

It might seem obvious to use tuples to hold vectors in your code, but the fact that they cannot be changed means it probably isn't a good idea.

Tuples are used by some modules to return several things at once.
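
For instance, the built-in `divmod()` function returns the quotient and remainder together as a tuple. The sketch below also shows 'unpacking', where each element of the tuple is assigned to its own variable:

```python
# divmod gives back (quotient, remainder) as a single tuple
result = divmod(17, 5)
print(result) # (3, 2)

# Unpacking: one variable per element of the tuple
quotient, remainder = divmod(17, 5)
print(quotient)  # 3
print(remainder) # 2
```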

#### A Side note:
If you look in the top left of this notebook, you will probably see that we are using Python 2. While this is fine, Python 2 will be retired soon (Python 3 is already out). 

One big difference between Python 2 and Python 3 is that the `print` statement becomes a function. That means that you **must** put the things you want to print in parentheses. In Python 2 this is not the case, so something like

    print "Hello World!"

is fine; in Python 3 however, this would cause an error. The workaround is to pretend that we're using the function version of `print` by giving tuples to the statement version.

If that is really confusing, don't worry about it. The main reason for highlighting this is that when you use 3rd party libraries, you might see some old Python 2 code and not know how to fix it to work with your shiny Python 3 code. When writing new code, follow the example in this notebook, and you should be fine.

## Dictionary
<a id="Dictionaries"></a>

The dictionary in python is a map from one thing to another. Dictionaries are formed of key=value pairs which makes them useful for containing e.g. DICOM headers. Dictionaries are created using curly braces:

In [None]:
aDictionary = {'a':1, 'b':2}
print(aDictionary)# This is okay, but probably confusing
print(aDictionary['a'])

# It is okay to change values in a dictionary
aDictionary['b'] = 3
print(aDictionary['b'])

# And it is very easy to add to a dictionary:
aDictionary['c'] = 4
print(aDictionary)# Note the weird order - this is not guaranteed to be the same every time!

# Any unchangeable (hashable) object, such as a number, string or tuple, can be a key; anything can be a value
anotherDictionary = {(1,2,3) : 'abc', (4,5,6): 'def'}
anotherDictionary['ghi'] = "Yep"# You can even mix types!
print(anotherDictionary)

# Note - arithmetic doesn't work on dictionaries

Dictionaries have a few useful functions built into them for handling the key and value lists. This is important when you want to do something to every value in a dictionary.

The two lists can be accessed using the `.keys()` and `.values()` functions:

In [None]:
print(aDictionary.keys())

print(aDictionary.values())

Dictionaries are handy, but the most general data structure is the list.

## Lists
<a id="Lists"></a>
Lists are, as the name implies, lists of items. The items can be different types, and you can change the values after initialising the list. 

In [None]:
# Make a list
aList = [1,2,3,4]
print(aList)

# You can change things!
aList[0] = 5
print(aList)

# Use a list of lists to make something like a matrix:
anotherList = [[1,0,0],[0,1,0],[0,0,1]]
print(anotherList)# Python still doesn't print it right...

# Arithmetic works again, but probably not as you expect
print(aList * 2)
print(aList + anotherList)# Addition only works between two lists!

You could use python lists to contain an N-dimensional image. There would be no problem doing that from the point of view of memory, but the performance of python lists is not great. There are specialised libraries for dealing with numerical data like images, which we will look at later.

Lists have some useful built in functions, like the append function, which (wait for it) appends data to the end of the list:


In [None]:
# Contrived Example!!
# Create an empty list:
emptyList = []
emptyList.append(1)
emptyList.append(2)
print(emptyList)

You can also extend a list with another list, which can be handy sometimes:

In [None]:
print(emptyList)
emptyList.extend([3,4,5])
print(emptyList)

Lists in python can be any length, and can contain any data type, even mixed ones.
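
A short sketch of a mixed list, along with a few ways of getting at its contents (the list itself is made up for the example):

```python
mixedList = [1, 2.5, "three", (4, 5), [6, 7]]

print(len(mixedList))  # Number of items: 5
print(mixedList[-1])   # Negative indices count from the end: [6, 7]
print(mixedList[1:3])  # A slice gives a new list: [2.5, 'three']
print(mixedList[4][0]) # Index twice to reach inside a nested list: 6
```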

# Program control
<a id="ProgramControl"></a>
So, now you can create lists, dictionaries and tuples of different data types and do some operations on them. The next bit of the jigsaw is being able to control the flow of your program. 

Program control flow is done using if/else statements and loops. If/else statements are pretty simple:

In [None]:
# Another contrived example!
aNumber = 10
firstFactor = 7
secondFactor = 5

# Is 7 a factor of 10? If not, is 5 a factor of 10?

if aNumber % firstFactor == 0:
    print("{0} is a factor of {1}".format(firstFactor, aNumber))
elif aNumber % secondFactor == 0:
    print("{0} is not a factor of {1}, but {2} is".format(firstFactor, aNumber, secondFactor))
else:
    print("Neither {0} or {1} is a factor of {2}".format(firstFactor, secondFactor, aNumber))



You can play about with the numbers in the above example and see how re-running under different conditions changes which branch of the if-else tree we go down.

# Loops

Unlike other languages, there are only two types of loop in python.

## For loop
<a id="ForLoop"></a>
The for loop is used when you have something to iterate over, for example:

In [None]:
for i in [1,2,3,4,5]:
    print(i)

In this example, the syntax is 

**for** *loop variable name* **in** *something to iterate over*:

Obviously, typing `[1,2,3,4,5...]` is impractical, so there is a built in function to do that for you, called range. Range works like this:

**range**(*start*, *stop*, *step*)

but the step argument is optional (it defaults to 1). Also, it is important to note that range stops before the stop value, so the equivalent code to that above would be:

In [None]:
for i in range(1,6):
    print(i)

The `range()` function has a few gotchas, the most annoying of which is that it can only give integer values. This is really annoying when you want to get some floating point numbers in a range (for example to plot a function with). Fortunately, the library we will be using later to do some image processing has a few ways of getting floating point numbers in a range-like way, so we can re-visit that problem later.
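
In the meantime, here is a sketch of the step argument, plus one simple workaround for floats: loop over integers and scale each one. Wrapping `range()` in `list()` makes the values print the same way in Python 2 and Python 3.

```python
evens = list(range(0, 10, 2))     # Step of 2
print(evens)                      # [0, 2, 4, 6, 8]

countdown = list(range(5, 0, -1)) # A negative step counts down
print(countdown)                  # [5, 4, 3, 2, 1]

# Workaround for floats: count in integers, then scale
quarterSteps = []
for i in range(0, 5):
    quarterSteps.append(i * 0.25)
print(quarterSteps)               # [0.0, 0.25, 0.5, 0.75, 1.0]
```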

The for loop can be used to do things called comprehensions. These are compact ways to build data structures from a loop in a single line, for example:

In [None]:
# Generate a list of squares:
listOfSquares = [a**2 for a in range(1, 10)]
print(listOfSquares)

# Can also do dictionaries:
dictOfSquares = {a:a**2 for a in range(1, 10)}
print(dictOfSquares)

## While loop
<a id="WhileLoop"></a>
The while loop is used to execute some code while a condition is true, for example:


In [None]:
# Count up in threes below 10
currentNumber = 0
while currentNumber < 10:
    print(currentNumber)
    currentNumber += 3

The syntax is pretty obvious:

**while** *condition*:
Do Stuff

Note that the colon after the condition is part of the statement!

### A note on whitespace
<a id="WhitespaceNote"></a>
As you have probably noticed, there isn't really anything splitting up the different bits of a Python program. In C for example, copious use of semicolons and curly braces is made to show where lines end and how the code is made out of chunks.

In Python, code is chunked according to how indented it is, i.e. how far to the right a line of code starts. 

This can be quite easy to mess up, so when writing code use an IDE (Integrated Development Environment) that knows Python and will auto-indent for you. If the indentation goes wrong, use the TAB key to indent your code until it looks right - at that point it will probably run, but you may need to tweak it.

Oh, and watch out for accidental spaces at the start of lines!
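
A small sketch of how indentation decides what belongs to a loop. The two indented lines run on every pass; the final line is not indented, so it runs once after the loop has finished:

```python
total = 0
for i in range(1, 4):
    total += i                                    # Indented: part of the loop
    print("Running total: {0}".format(total))     # Indented: printed three times
print("Final total: {0}".format(total))           # Not indented: printed once
```

If you indent the last line to match the others, it becomes part of the loop and prints three times instead.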

## Break and continue
<a id="BreakAndContinue"></a>
There are two important keywords for use with loops, `break` and `continue`.

Break makes the program jump out of the loop and go to the next thing after it - this is useful when you want to stop looping as soon as a condition is satisfied.

Continue makes the loop jump to the next iteration without doing any of the code in the body of the loop below the continue statement. This can be useful when you need to e.g. skip over an iteration of the loop when some condition is met.
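
A minimal sketch using both keywords in one loop (the numbers are arbitrary): even numbers are skipped with `continue`, and the loop is abandoned with `break` at the first odd number above 9.

```python
oddNumbers = []
for n in range(1, 20):
    if n % 2 == 0:
        continue # Even number: skip the rest of the body and go to the next n
    if n > 9:
        break    # Stop looping altogether
    oddNumbers.append(n)

print(oddNumbers) # [1, 3, 5, 7, 9]
```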

### Example: Finding primes below 100
<a id="PrimesExample"></a>
We can put the while and for loops together to find a list of all primes less than 100.

In [None]:
currentNumber = 2
listOfPrimes = [2]# Need to start with 2 because *everything* is divisible by 1
while currentNumber < 100:# Only looking for numbers less than 100
    isPrime = False
    for p in listOfPrimes:
        if currentNumber % p == 0:# See if the current number is divisible by any of our known primes
            isPrime = False # If it is, it can't be prime!
            break# No point checking the rest of the list
        else:
            isPrime = True # It might be prime - need to check the other numbers in our list
            continue # Jump to the next iteration
        
    if isPrime:
        listOfPrimes.append(currentNumber)# Add the latest number to the primes list, if it is prime.
    currentNumber += 1 # Increment the current number

print(listOfPrimes)

By now we can write reasonably full python programs that can solve actual problems! 

# Functions
<a id="Functions"></a>
Functions are a way to wrap up bits of python code that we may want to run several times in such a way that we can just give the function some arguments to work with, and let it give us back the result.

We have seen a few functions so far, for example the `range()` function, but haven't tried making our own yet. Function definitions start with the keyword `def`, after which comes the function name, then a list of arguments in parentheses followed by a colon. The same indentation rules apply for functions as for if/else statements and loops.

Let's write a simple function that calculates the nth root of a number, and returns it to us:

In [None]:
# Calculating the nth root of a number, using a function.
def nthRoot(number, n):
    """
    This is a docstring. It is a short comment that tells you what the function does, what arguments it expects and 
    what values it returns. You may also want to put some info about who wrote the function, and when it was written in here too.
    
    Docstrings are enclosed in triple-double quotes.
    
    Function to return the nth root of a number.
    Arguments:
    number       The number whose root we want to find
    n            The root we want
    
    Returns:
    theRoot      The nth root of the given number
    """
    exponent = 1.0/n # Easy way to get nth root
    theRoot = number ** exponent
    return theRoot

This function is not hugely complicated; it just looks big because of the long docstring. We can then use the function just like you would any other:

In [None]:
print(nthRoot(4, 2))# Square root of 4 should be 2

print(nthRoot(8, 3))# cube root of 8 should also be 2

It is also possible to have optional arguments in functions; to do this, simply give the default value in the function definition. Redefining the nthRoot function to default to square roots:

In [None]:
# Calculating the nth root of a number, using a function.
def nthRoot(number, n=2):
    """
    This is a docstring. It is a short comment that tells you what the function does, what arguments it expects and 
    what values it returns. You may also want to put some info about who wrote the function, and when it was written in here too.
    
    Docstrings are enclosed in triple-double quotes.
    
    Function to return the nth root of a number.
    Arguments:
    number       The number whose root we want to find
    n            The root we want. Default 2
    
    Returns:
    theRoot      The nth root of the given number
    """
    exponent = 1.0/n # Easy way to get nth root
    theRoot = number ** exponent
    return theRoot

We can now call the function with only 1 argument to get square roots, and supply a value of n if we want anything else:

In [None]:
print(nthRoot(4))# Should be 2
print(nthRoot(9))# should be 3
print(nthRoot(9, 3))# Still able to do other roots

If you have some code that you want to apply several times, or re-use in another program, you should put it into a function. Using functions also makes it easier to figure out what is going wrong when you try to fix bugs in your code.
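
As one more sketch, here is a hypothetical function (`circleProperties` is not used anywhere else in this notebook) that returns two results at once as a tuple, and is called with its arguments given by name:

```python
import math

def circleProperties(radius, precision=2):
    """
    Return the circumference and area of a circle as a tuple,
    each rounded to `precision` decimal places.
    """
    circumference = round(2.0 * math.pi * radius, precision)
    area = round(math.pi * radius ** 2, precision)
    return circumference, area

# Unpack the two returned values
circumference, area = circleProperties(1.0)
print(circumference) # 6.28
print(area)          # 3.14

# Arguments can also be given by name
print(circleProperties(radius=2.0, precision=1)) # (12.6, 12.6)
```

Naming the arguments in the call makes it harder to mix them up when a function takes several.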

Next we will tackle some plotting and data analysis tricks, and then make a start on image processing.

# Plotting
<a id="Plotting"></a>
Plotting in python is almost always done using the excellent matplotlib library. To use it we simply import the pyplot module from it. 

Python has a handy feature that allows us to import a module then rename it within our program. This can be really useful when a module has a long name. It is quite common to import `matplotlib.pyplot` as `plt` to save typing out the `matplotlib.pyplot` bit every time:

In [None]:
import matplotlib.pyplot as plt

When using matplotlib in a notebook, we have to tell it to run in inline mode (otherwise it will put all plots in one figure). To do this, put the following line after importing matplotlib.

In [None]:
%matplotlib inline

Now we're ready to plot!

## A super-short primer on numpy
<a id="NumpyQuickstart"></a>

The other library we will make extensive use of is the `numpy` library. Numpy is designed for numerical calculation, so it is ideal for a lot of image processing tasks. Importing numpy is as simple as:

In [None]:
import numpy as np

A primer on numpy is waaaay beyond the scope of this notebook. Reference material can be found [here](https://docs.scipy.org/doc/numpy/reference/), but may be a bit technical. 

An excellent resource for novice programmers is the [stackoverflow](http://stackoverflow.com/) site, where your question has almost certainly been asked before, and if it hasn't you can create an account and ask it yourself.

Numpy arrays are quite different to the other data structures, so we should look at them a little bit.

A numpy array can only have one data type, which is inferred from the data it is given. One way to generate a numpy array is from a normal Python list:

In [None]:
anArray = np.array([1,2,3])
print(type(anArray), anArray.dtype)

Arithmetic on numpy arrays works element-wise in the way you would probably expect:

In [None]:
arrayOfTwos = np.array([2,2,2,2])
print(arrayOfTwos)

# Multiply by 2:
print(arrayOfTwos * 2)

# Add 3
print(arrayOfTwos + 3)

# Even compound operators work
print(arrayOfTwos)
arrayOfTwos /= 2
print(arrayOfTwos)

Arrays can be multiplied together element-wise, so long as they have the same shape:


In [None]:
# Create an array of threes:
arrayOfThrees = np.array(5*[3])# There is a better way to do this

# Create another array counting up from 1 to 5
countUp = np.arange(1, 6)# This is the numpy version of the range() function - it is much better

print(arrayOfThrees * countUp)


Part of what makes numpy useful for image processing is that it can easily handle multi-dimensional arrays, for example:


In [None]:
# Use a numpy function to make a 4x4 identity matrix:
identity = np.eye(4)
print(identity)

Numpy can even create arrays of random numbers with any shape:

In [None]:
# Create a 4x4x4 array of random numbers
randArray = np.random.rand(4,4,4)
# print(randArray) # uncomment this for a bunch of random numbers!

## Array Indexing
Numpy allows you to do some pretty cool tricks with array access:

In [None]:
print("Print all numbers in the nth slice of an image:")
print(randArray[:,:,2])# Change the 2 to something else to see what happens

print("Print every other slice")
print(randArray[::2])# Try changing the 2 here too!


There are a lot of other things you can do, which this notebook will not go into. There is a whole huge article on array indexing in the numpy manual [here](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html).
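
Random numbers make it hard to see exactly what an index picked out, so here is a sketch of a few of the same tricks on a small made-up array whose contents are easy to follow:

```python
import numpy as np

# A 3x4 array holding the numbers 0 to 11:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
grid = np.arange(12).reshape(3, 4)

print(grid[1, 2])     # Row 1, column 2: 6
print(grid[0])        # The whole first row: [0 1 2 3]
print(grid[:, 1])     # Column 1 of every row: [1 5 9]
print(grid[::2])      # Every other row (rows 0 and 2)
print(grid[grid > 8]) # Only the elements bigger than 8: [ 9 10 11]
```

The last line uses a comparison to build an array of booleans, which is then used as the index.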

Now we understand arrays a bit better, we can do a little plotting.

## Example: Plotting the sin$^4$ function
<a id="Sin4Example"></a>

The sin$^4$ function is sometimes used as an approximation to breathing motion, and might be interesting to look at. First we create a set of points linearly spaced between two endpoints, then calculate and plot the `sin` of the numbers.

In [None]:
# Create a 1024 length array of numbers between 0 and 2 pi
N = 1024
xValues = np.linspace(0, 2.0*np.pi, N, endpoint=True)
sinValues = np.sin(xValues)# Note: numpy knows how to handle arrays!

# Now plot the data - simple x-y line plot
plt.plot(xValues, sinValues)


You should see a nice sin curve above. Using what you know about python:
1. Plot a sin$^4$ function over the same range
2. Add a phase argument to the sin$^4$ function, and plot it alongside the original
3. Add some random noise to the sin$^4$ function.

In [None]:
# Plotting a sin^4 curve

# Recall the exponentiation operator in python? Use it to convert your sin data to sin^4 data

sin4 = sinValues ** 4

# Now plot it in a new plot

plt.plot(xValues, sin4)

In [None]:
# Adding phase to the sin^4 curve.

# Phase is just a constant offset in time, so we can emulate it by adding a constant to the x values used to 
# create the sin data

sinWithPhase = np.sin(xValues + np.pi/3)
sin4WithPhase = sinWithPhase ** 4

# Now plot the two together:
plt.plot(xValues, sin4)
plt.plot(xValues, sin4WithPhase)

In [None]:
# Adding random noise to a signal.

# We know about the numpy.random.rand function, so we could try using that:
sinWithNoise = sin4 + np.random.rand(sin4.shape[0])
plt.plot(xValues, sinWithNoise)

# This probably isn't very realistic though. Let's use Gaussian random noise
sinWithRealNoise = sin4 + np.random.normal(loc=0.0, scale=0.1, size=sin4.shape[0])

plt.figure()# Put it in a new figure
plt.plot(xValues, sinWithRealNoise)

# Filtering Signals
<a id="FilteringSignals"></a>
Taking noise out of a signal is done using some kind of filter, or smoothing window. There are a lot of possible smoothing windows and you should choose one that is well suited to the job you're trying to do. 

To apply a smoothing window, we use a mathematical technique called convolution. Convolution takes two functions and looks at the extent to which they overlap. Fortunately, you can do numerical convolution using numpy!
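
To get a feel for what `np.convolve` returns, here is a sketch on two tiny made-up arrays. By default the output is longer than the input (`mode='full'`); `mode='same'` trims it back to the length of the longer input, which is what we want when smoothing a signal.

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0])
kernel = np.array([0.0, 1.0, 0.5])

# Full convolution: length is len(signal) + len(kernel) - 1 = 5
full = np.convolve(signal, kernel)
print(full) # Values: 0, 1, 2.5, 4, 1.5

# 'same' keeps only the central part, matching the length of the input
same = np.convolve(signal, kernel, mode='same')
print(same) # Values: 1, 2.5, 4
```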

The very simplest smoothing window is the moving averages window, which is essentially a convolution with a top hat function:

In [None]:
# Moving averages window function
N = xValues.shape[0]//32 # Floor division keeps N an integer. Try changing the window width to see the effect on the filtered signal
window = np.ones(N)
convolved = np.convolve(window/window.sum(), sinWithRealNoise, mode='same')# Note - divide by the sum of the window to 
                                                                        #    maintain normalisation
plt.plot(xValues, convolved)
plt.plot(xValues, sin4, linewidth=2)

More complicated windows can be constructed and applied. There is a decent list of windows on [Wikipedia](https://en.wikipedia.org/wiki/Window_function), from where you could try implementing a few of the others. Below I've implemented a Hamming window (not to be confused with a Hanning window!), Sin window and Nuttall Window.

In [None]:
# Hamming window
alpha = 0.54
beta = 1.0 - alpha
N = xValues.shape[0]//32 # Floor division keeps N an integer
# The Hamming window is alpha - beta*cos(2*pi*i/(N-1))
HW = alpha - beta*np.cos(np.array([2.0*np.pi*i / (N-1) for i in range(0, N)]))
hwConvolved = np.convolve(HW/HW.sum(), sinWithRealNoise, mode='same')
plt.plot(xValues, hwConvolved)
plt.plot(xValues, sin4, linewidth=2)

In [None]:
# Sin window
N = xValues.shape[0]//16 # Floor division keeps N an integer
SW = np.sin(np.array([np.pi*i / (N-1) for i in range(0, N)]))
sinConvolved = np.convolve(SW/SW.sum(), sinWithRealNoise, mode='same')
plt.plot(xValues, sinConvolved)
plt.plot(xValues, sin4, linewidth=2)

In [None]:
# Nuttall window
a0 = 0.355768
a1 = 0.487396
a2 = 0.144232
a3 = 0.012604

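N = xValues.shape[0]//32 # Floor division keeps N an integer
windowValues = np.array([np.pi*i / (N-1) for i in range(0, N)])
# The signs of the cosine terms alternate: a0 - a1*cos + a2*cos - a3*cos
NW = a0 - a1*np.cos(2.0*windowValues) + a2*np.cos(4.0*windowValues) - a3*np.cos(6.0*windowValues)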
nuttallConvolved = np.convolve(NW/NW.sum(), sinWithRealNoise, mode='same')
plt.plot(xValues, nuttallConvolved)
plt.plot(xValues, sin4, linewidth=2)
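
Numpy also ships a few common windows ready-made, including `np.hamming`, `np.hanning`, `np.bartlett` and `np.blackman`. As a sanity check, the sketch below builds a Hamming window from the textbook formula (0.54 - 0.46 cos(2πn/(N-1))) and compares it with numpy's version. The window length of 32 is arbitrary.

```python
import numpy as np

N = 32
n = np.arange(N)

# Hamming window written out by hand
manualHamming = 0.54 - 0.46 * np.cos(2.0 * np.pi * n / (N - 1))

# The same window from numpy
builtInHamming = np.hamming(N)

print(np.allclose(manualHamming, builtInHamming)) # True
```

Comparing against a built-in like this is a cheap way to catch a sign or scaling slip in a hand-written window.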

# Peak Finding
<a id="PeakFinding"></a>
Now you have smoothed your data, you can try to find some peaks in it. The manual way to do this is quite simple:


In [None]:
# Let's use the version filtered with the Hamming window
plt.plot(xValues, hwConvolved)
plt.plot(xValues, sin4, linewidth=2)

# This tells us the maximum value, and its location in the array
print(np.max(hwConvolved), np.argmax(hwConvolved))# Two useful functions!

# We will now try to resolve each peak in the data (there should only be two)
# It is usual to use a threshold, we will set ours to 75% of the maximum value in the data
threshold = 0.75 * np.max(hwConvolved)
# Delta is a parameter used to distinguish peaks - we require that the data fall at least delta below a maximum before we call it a peak
delta = 0.1

# Make some variables to hold the peak information
maxtab = []
mx =-np.inf
mxpos = np.nan

for i in np.arange(len(hwConvolved)):
    this = hwConvolved[i]
    
    if this < threshold:# Skip points below the threshold
        mx = -np.inf# Forget the running maximum so that the next peak is found independently
        mxpos = np.nan
        continue
        
    if this > mx:
        mx = this
        mxpos = xValues[i]# This value is greater than the previous max, so might be a peak!
        
    if this < mx-delta:# We have fallen at least delta below the running maximum, so it was a peak
        if not (mxpos, mx) in maxtab:
            maxtab.append((mxpos, mx))# If we haven't already, record this peak
        
plt.plot(np.array(maxtab)[:,0], np.array(maxtab)[:,1], color='red', linewidth=0, marker='.', markersize=10)

We could even wrap all this up into a function so it can be re-used:

In [None]:
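def peakFind(xData, yData, thresholdFrac=0.75, deltaFrac=0.1, convolvedData=False):
    """
    Smooth yData with a Hamming window, then find the peaks in the smoothed data.
    
    Arguments:
    xData          Array of x positions
    yData          Array of y values (same length as xData)
    thresholdFrac  Points below this fraction of the smoothed maximum are ignored. Default 0.75
    deltaFrac      How far the data must fall below a maximum for it to count as a peak. Default 0.1
    convolvedData  If True, also return the smoothed data. Default False
    
    Returns:
    maxtab         List of (position, value) tuples, one per peak
    """
    #First we smooth the data by applying a hamming window.
    alpha = 0.54
    beta = 1.0 - alpha
    N = xData.shape[0]//32 # Floor division keeps N an integer
    HW = alpha - beta*np.cos(np.array([2.0*np.pi*i / (N-1) for i in range(0, N)]))
    yDataConvolved = np.convolve(HW/HW.sum(), yData, mode='same')
    
    # Now we run the peak detection algorithm over the smoothed data
    maxtab = []
    mx =-np.inf
    mxpos = np.nan
    
    threshold = thresholdFrac * np.max(yDataConvolved)
    delta = deltaFrac #* np.max(yData)

    for i in np.arange(len(yDataConvolved)):
        this = yDataConvolved[i]

        if this < threshold:# Skip points below the threshold
            mx = -np.inf# Forget the running maximum so that the next peak is found independently
            mxpos = np.nan
            continue

        if this > mx:
            mx = this
            mxpos = xData[i]# Use the x data passed in, not the global xValues

        if this < mx-delta:# We have fallen at least delta below the running maximum, so it was a peak
            if not (mxpos, mx) in maxtab:
                maxtab.append((mxpos, mx))# If we haven't already, record this peak
    if convolvedData:
        return (maxtab, yDataConvolved)
    return maxtab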

In [None]:
# Now use the peak finder on some noisy data

peaks, yCData = peakFind(xValues, sinWithRealNoise, convolvedData=True)
print(peaks)


plt.plot(xValues, yCData)
plt.plot(np.array(peaks)[:,0], np.array(peaks)[:,1], color='red', linewidth=0, marker='.', markersize=10)
plt.plot(xValues, sin4, linewidth=2)


%timeit peakFind(xValues, sinWithRealNoise)

You should now be more than able to tackle the next notebook, in which real physiological data is supplied for you to work with.