# Energy Materials: Design, Discovery and Data

## 4. Python for Science and Engineering

## Advance preparation

Modules one and two from [Intro to Python for Data Science](https://www.datacamp.com/courses/intro-to-python-for-data-science)

## Lecture Slides 
On [Speakerdeck](https://speakerdeck.com/lucydot)


## Acknowledgment
A lot of this notebook has been adapted from [Ben Morgan's Tutorials](https://github.com/bjmorgan/jupyter_cc_1) 

The Control Flow section has been adapted from the GitBook [A Byte of Python](https://www.gitbook.com/book/swaroopch/byte-of-python/details)

## Contents

- [Mathematical functions and modules](#functions_and_modules)
- [Variables](#variables) 
- [Data types](#data_types)
 - [numbers](#numbers)
 - [strings](#strings)
 - [lists](#lists)
 - [dictionaries](#dictionaries)
- [Defining functions](#functions)
- [Control flow](#control_flow)
 - [The `if` statement](#if_statement)
 - [The `for` statement](#for_statement)
- [Common mistakes](#common_mistakes)
- [Reading in data with Pandas](#pandas)
- [Plotting data with Matplotlib](#matplotlib)
- [Fitting data with Numpy](#numpy)
- [Putting it all together](#together)
- [Extension task](#extension)
- [Resources](#resources)

## Mathematical functions and modules<a id="functions_and_modules"></a>

In programming a **function** converts an input into an output. For example, if we want to calculate a square root, we can use the `sqrt()` function.

>```python
sqrt(4)
```

This has given us an error:  

<span style="font-family:monospace"><span style="color:#890004">NameError</span><span style="color:#046308">: name 'sqrt' is not defined</span></span>. 

Python has a *lot* of built-in commands (functions). Many further commands (such as mathematical functions) are collected in **modules** that we can load to make them available in our notebook.

`sqrt` lives in the `math` module. We can load it like this:

>```python
from math import sqrt
sqrt(4)
```

Or we can import the entire math module:

>```python
import math
math.sqrt(4)
```

You can think of `math.sqrt()` as instructing the computer to &ldquo;use the `sqrt()` function provided inside the  `math` module&rdquo;.

The `math` module contains a [large set of common mathematical functions](https://docs.python.org/3/library/math.html), and the constants $\pi$ and $\mathrm{e}$ (the base of the natural logarithm).

>```python
from math import pi, sin, e, log
```
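As a quick check that the imported names behave as expected, the sketch below evaluates a few of them. The values chosen are only illustrative and are not part of the original tutorial.

```python
from math import pi, sin, e, log

# log() with one argument is the natural logarithm, so log(e) should be 1
print(log(e))

# sin() expects its argument in radians, so pi/2 corresponds to 90 degrees
print(sin(pi / 2))

# the circumference of a circle with a radius of 1 cm, in cm
circumference = 2 * pi * 1
print(circumference)
```

Note that the trigonometric functions in `math` always work in radians, not degrees.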



<div class="alert alert-success">
Calculate the area of a circle with a radius of 2cm. Add a comment to your code (#) stating which unit your answer is in. <br/>
Note: the power operator is a double asterisk. E.g. 3 to the power 4 is written 3**4
</div>

<div class="alert alert-success">
Calculate $\sin(2\pi)$. Discuss with the person next to you why it does not give you the answer you expect.
</div>

##  Variables<a id="variables"></a>

<span style="font-family:monospace"><span style="color:#046308">print</span>()</span> is another function, like `math.sqrt()` or `math.log()`. Instead of performing a mathematical calculation, <span style="font-family:monospace"><span style="color:#046308">print</span>()</span> just prints whatever is inside the brackets. 

<span style="font-family:monospace"><span style="color:#046308">print</span>()</span> can print more than one variable if these are separated by commas:

>```python
print("72/4 =",72/4)
```

Storing results in computer memory is called **assigning** **variables**. To access the value stored in the variable, we can use the variable name to refer to the original result.

>```python
# calculate 2 + 3 and store the result in the variable `my_result`
my_result = 2 + 3
```

A variable `my_result` is created, and the value returned by the calculation is stored here.  

Variable names can be nearly anything, but they cannot begin with a number (although they can contain numbers), and they cannot contain spaces. Underscores are commonly used instead of spaces to keep the code readable.

To check the value stored in a variable we can just type the variable name (which returns the stored value).

>```python
my_result
```

Or we can use `print()`.

>```python
print( my_result )
```

Variables can be used to store raw numbers, and can then be used for calculations.

>```python
the_number_six = 6
my_result + the_number_six
```

Any code that uses variables may itself return a further result, which can be assigned to a new variable, and used later (and so on).

>```python
yet_another_variable = my_result + the_number_six
print (yet_another_variable) 
```

If you refer to a variable that has not yet been created you will get an error.

>```python
favourite_fruit = bananas # this will return an error
```

You can add an integer to a variable using shorthand notation:

>```python
x = 1
x += 4
print (x)
```
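The same shorthand exists for the other arithmetic operators. The short sketch below (the starting value is arbitrary) applies three of them in turn:

```python
x = 10
x -= 3   # same as x = x - 3, so x is now 7
x *= 2   # same as x = x * 2, so x is now 14
x /= 7   # same as x = x / 7; division always gives a float, so x is now 2.0
print(x)
```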

<div class="alert alert-success">
Create three variables, $x$, $y$, and $z$, and use them to store the numbers $5,6,7$.  

Using these variables, calculate:  
$5+6+7$, and  
$(5+6)\times7$.
</div>

## Data types<a id="data_types"></a>

### Numbers: *int* and *float*<a id="numbers"></a>

- Whole numbers, without decimal points are integers or &ldquo;ints&rdquo;, e.g. <span style="color:#108714; font-family:monospace">1</span>, <span style="color:#108714; font-family:monospace">6</span>, <span style="color:#108714; font-family:monospace">2331</span>.  

- Numbers with decimal points are floating point numbers or &ldquo;floats&rdquo;, e.g. <span style="color:#108714; font-family:monospace">1.0</span>, <span style="color:#108714; font-family:monospace">232.141</span>.

Note that <span style="color:#108714; font-family:monospace">1</span> and <span style="color:#108714; font-family:monospace">1.0</span> are different:

>```python
type(1) # `type()` returns the data-type of something
```





>```python
type(1.0)
```

>```python
1 is 1.0 # `is` tests whether two things are the same
```

Even though they both represent the number one, and have equal values (yes, this can be confusing), `1` and `1.0` are not **the same** because the first is an integer and the second is a float.  

To reassure ourselves slightly, we can test whether two things are equal using `==`

>```python
1 == 1.0
```
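Ints and floats can be mixed in calculations and converted into one another. The following sketch, using arbitrary numbers, shows a few cases that are worth knowing:

```python
print(type(1 + 1.0))   # mixing an int and a float gives a float
print(float(3))        # convert an int to a float
print(int(3.9))        # convert a float to an int (the decimal part is discarded, not rounded)
print(7 / 2)           # division with / always returns a float
print(7 // 2)          # floor division of two ints returns an int
```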

Very large and very small numbers can be written using **scientific notation**.

<span style="color:#108714; font-family:monospace">1.3e5</span> uses scientific notation and is shorthand for <span style="color:#108714; font-family:monospace">130000.0</span>.

<span style="color:#108714; font-family:monospace">2.41e-7</span> uses scientific notation and is shorthand for <span style="color:#108714; font-family:monospace">0.000000241</span>.
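Scientific notation is convenient for physical constants. As an illustration (the constants below are rounded values and the variable names are chosen only for this sketch), multiplying the Avogadro and Boltzmann constants gives the molar gas constant:

```python
avogadro_constant = 6.022e23     # particles per mole (rounded)
boltzmann_constant = 1.381e-23   # joules per kelvin (rounded)

# their product is the molar gas constant, roughly 8.3 J/(mol K)
gas_constant = avogadro_constant * boltzmann_constant
print(gas_constant)

# numbers written in scientific notation are always floats
print(type(1e3))
```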

 <br/>
<div class="alert alert-success">
Add 0.00054 and 4700000 using scientific notation
</div>

### Strings<a id="strings"></a>

Strings are any sequence of text. We indicate that a sequence of text is a string, and not a Python command, by enclosing it in single or double quotes. Being able to use either quote type allows strings that themselves contain quotes.

>```python
'this is a string using single quotes'
```

>```python
"this is a string using double quotes"
```

We can turn an integer into a string using the str() function:

>```python
x = 4
print (type(x))
y = str(x)
print (type(y))
```

We can format strings so that they contain variables:

>```python
subject = "materials science"
time = 100
print ("I have loved {0} for {1} years".format(subject, time)) # The use of curly braces {} in a string is called string formatting
```
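In Python 3.6 and later, the same result can be written more compactly with an "f-string", where the variable names go directly inside the curly braces. This is an alternative to the `.format()` approach used in this notebook, shown here as a sketch; `lattice_parameter` is a made-up value for illustration.

```python
subject = "materials science"
time = 100

# the same sentence built two ways
with_format = "I have loved {0} for {1} years".format(subject, time)
with_f_string = f"I have loved {subject} for {time} years"
print(with_f_string)
print(with_format == with_f_string)

# a format specification after a colon controls how a number is shown
lattice_parameter = 5.6789  # made-up value
print(f"a = {lattice_parameter:.2f} angstrom")
```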

### Lists<a id="lists"></a>

Python also contains built-in data types for collections of things. For data analysis we often deal with sets of numbers. These can be collected in **lists**.

A list is denoted by a series of items separated by commas, and enclosed in square brackets:

>```python
my_list = [ 1, 2, 3, 4 ]
my_list
```

Lists are not restricted to numbers; they can contain any mix of Python objects:

>```python
my_other_list = [ 4, 1.5, 'peach' ]
my_other_list
```

To refer to one element in a list, use the **index** of that element. Index numbering counts the number of jumps along the sequence, so starts at zero.

>```python
# 1st element (zero jumps along the sequence)
print( my_other_list[0] )
# 2nd element (one jump along the sequence)
print( my_other_list[1] ) 
# 3rd element (two jumps along the sequence)
print( my_other_list[2] ) 
```

Using an index outside the range of elements in the list will produce an error. For example, `my_other_list` has three elements, but `my_other_list[3]` tries to return the *4th* element (which does not exist)

>```python
print( my_other_list[3] ) # this will return an error
```

>```python
# run this cell to create the list `alphabet`
alphabet = [ 'a', 'b', 'c', 'd', 'e', 'f', 'g', 
             'h', 'i', 'j', 'k', 'l', 'm', 'n', 
             'o', 'p', 'q', 'r', 's', 't', 'u',
             'v', 'w', 'x', 'y', 'z' ]
```

>```python
alphabet[3:8]
```

→ this is a **slice**: start from three jumps and stop before eight jumps, i.e. the 4th to the 8th elements (`'d'` to `'h'`).

Negative numbers count backwards from the end of the sequence.

>```python
alphabet[-8:-3]
```

→ 8th from the end up to, and including, the 4th from the end (`'s'` to `'w'`).

And leaving out one of the numbers in the range will include all elements up to the start or end of the sequence.

>```python
alphabet[14:]
```

>```python
alphabet[:14]
```
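The slicing rules can be checked on a shorter list, where the result is easy to read off. This is a minimal sketch: `letters` is a stand-in for `alphabet`, and the step in the third example is an extra that is not covered above.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']  # a short stand-in for `alphabet`

print(letters[1:4])       # index 1 up to, but not including, index 4
print(letters[-2:])       # the last two elements
print(letters[::2])       # every second element, using a step of 2
print(len(letters[1:4]))  # a slice [start:stop] holds stop - start elements
```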

We can append items to a list using the `append` method

>```python
alphabet.append('end of alphabet')
alphabet
```

We can create a list of integers using the `list()` and `range()` functions. 

`list(range(6))` will create a list of all integers up to, but not including, 6:

>```python
list(range(6))
```


`list(range(10,15))` will create a list of all integers from 10 and up to, but not including, 15:


>```python
list(range(10,15))
```


`list(range(10,20,2))` will create a list of integers in steps of 2 from 10 up to, but not including, 20:

>```python
list(range(10,20,2))
```


<div class="alert alert-success">
Print the multiples of 3 from 12 to 90 using the <span style="font-family:monospace"><span style="color:#046308">range()</span></span> function
</div>

### Dictionaries<a id="dictionaries"></a>

A dictionary is a simple database for storing and organising data.

>```python
my_dictionary = {'name': 'Lucy', 'age': "Hey, don't be so cheeky", 'shoe size': 5}
my_dictionary
```

You can access dictionary values using the dictionary key:

>```python
my_dictionary['name']
```

If you try to access a dictionary item with a key which is not part of the dictionary you get an error:

>```python
my_dictionary['favourite pet'] # this will return an error
```
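If you are not sure whether a key exists, you can test for it first or ask for a fallback value. The sketch below uses a separate, hypothetical dictionary called `person` so that it does not change `my_dictionary`:

```python
person = {'name': 'Lucy', 'shoe size': 5}

# `in` tests whether a key is present
print('name' in person)
print('favourite pet' in person)

# .get() returns None (or a default you supply) instead of raising a KeyError
print(person.get('favourite pet'))
print(person.get('favourite pet', 'unknown'))
```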

We can update a dictionary by adding a new entry or key-value pair

>```python
my_dictionary['favourite pet'] = "Gorilla"
my_dictionary
```


We can also modify an existing entry

>```python
my_dictionary['age'] = "I'm still not telling you"
my_dictionary
```

Dictionary values can be any type of python object:

>```python
my_dictionary['favourite foods'] = ['worms','snails','goo'] # The dictionary value is a python list
my_dictionary
```

Dictionary keys must be immutable objects, such as strings, numbers or tuples.

*Note: an immutable object is one that cannot be changed after it is created*

>```python
my_dictionary[['test','key']] = "does it work?" # A list is mutable, this will give an error
```

>```python
my_dictionary[5.25] = "five point two five" # A number is immutable, this should work
my_dictionary
```

<div class="alert alert-success">
Create a dictionary with the following information about you: name, favourite element and favourite number
</div>

## Defining functions<a id="functions"></a>

You can create your own python commands by defining a function.

Function blocks begin with the keyword `def` followed by a function name and parentheses `()` and a colon `:`

>```python
def my_function(): # This will give an error because the function has no code block
```


The code block is indented by 4 spaces
>```python
def my_function(): 
    print ("yeh, my function works!")   
```


To execute the function you must call it (including parentheses):
    
>```python
my_function()
```

You can create a function with arguments
>```python
def print_details(name,age): # This function contains two arguments: name and age
    print ("Your name is {0} and your age is {1}".format(name,age))
```

Remember to call it with the arguments
>```python
print_details() # this won't work because there are no arguments
```

>```python
print_details("Snow White",17)
```

The `return` statement exits a function, optionally passing back an expression to the caller

>```python
def square_number(number): # This function contains one argument: number
    y = number*number
    return y
```

Call the function and use the returned value

>```python
large_square_number = square_number(456)
fourth_power_number = square_number(large_square_number) # squaring a square gives the fourth power
fourth_power_number
```
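Functions that return a value are a natural fit for formulas. As a sketch, the hypothetical function below (not part of the original tutorial) returns the kinetic energy $\frac{1}{2}mv^2$ and documents its units in a docstring:

```python
def kinetic_energy(mass, velocity):
    """Return the kinetic energy in joules for a mass in kg and a velocity in m/s."""
    energy = 0.5 * mass * velocity**2
    return energy

# the returned value can be stored, printed, or used in further calculations
energy_of_ball = kinetic_energy(2.0, 3.0)
print(energy_of_ball)

# arguments can also be given by name, in which case their order does not matter
print(kinetic_energy(velocity=3.0, mass=2.0))
```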

<div class="alert alert-success">
Write a function which returns the hypotenuse of a right angled triangle when given the length of the opposite and adjacent edge.<br/><br/>

Call the function to find the hypotenuse of a triangle with opposite edge = 5cm and adjacent edge = 7cm.
</div>

## Control flow<a id="control_flow"></a>
Control flow statements allow the code to do different things depending upon the situation.

### The `if` statement<a id="if_statement"></a>
The `if` statement is used to check a condition; when the condition is met the `if` code block runs. 
>```python
number = 42
guess = int(input('Enter an integer : '))
if guess == number: # a common mistake would be to use `if guess = number` here
    print ("wow, you're a mind reader!")
```


We can add an `else` statement. When the `if` condition is not met the `else` code block runs.

>```python
number = 42
guess = int(input('Enter an integer : '))
if guess == number: 
    print ("wow, you're a mind reader!")
else:
    print ("bad luck")
```
    

The `elif` statement combines an `else` and `if` statement into one:

>```python
number = 42
guess = int(input('Enter an integer : '))
if guess == number: 
    print ("wow, you're a mind reader!")
elif guess < number:
    print ("your guess is too small")
else:
    print ("your guess is too big")
```
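The same `if`/`elif`/`else` logic can be wrapped in a function, which makes it easy to try several guesses without typing them in each time. This is only a sketch; `check_guess` is a hypothetical name and is not used elsewhere in this notebook.

```python
def check_guess(guess, number):
    """Return a message comparing `guess` with `number`."""
    if guess == number:
        return "wow, you're a mind reader!"
    elif guess < number:
        return "your guess is too small"
    else:
        return "your guess is too big"

# only one of the three branches runs for each call
print(check_guess(7, 42))
print(check_guess(42, 42))
print(check_guess(100, 42))
```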

### The `for` statement<a id="for_statement"></a>

The `for..in` statement is a looping statement which iterates over a sequence of objects, i.e. it goes through each item in a sequence in turn.

>```python
for i in range(1, 5):  
    print(i)
```


We can use nested `for` loops:

>```python
for i in range(1, 5): 
    for j in ('a','b','c'):
        print(i,j)
```



We can use the `enumerate()` function to add a counter to our loop:

>```python
for index,item in enumerate(('pizza','curry','burger')):
    print ("Choice number {0} is {1}".format(index,item))
```
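By default the counter starts at zero, like list indices. If a count starting from one reads better, `enumerate()` accepts a `start` argument, as in this short sketch:

```python
choices = ('pizza', 'curry', 'burger')
for index, item in enumerate(choices, start=1):
    print("Choice number {0} is {1}".format(index, item))

# enumerate() produces (counter, item) pairs, which we can collect in a list
numbered_choices = list(enumerate(choices, start=1))
print(numbered_choices)
```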

The `append` list method can be combined with a `for` loop:

>```python
square_numbers = [] # create an empty list
for number in range(11):
    number_squared = number*number
    square_numbers.append(number_squared)
square_numbers
```
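This "create an empty list, then append in a loop" pattern is so common that Python has a one-line form for it, the list comprehension. The sketch below builds the same list both ways and checks that they agree; the variable names are chosen only for this example.

```python
# the loop-and-append version
square_numbers_loop = []
for number in range(11):
    square_numbers_loop.append(number * number)

# the equivalent list comprehension
square_numbers_comprehension = [number * number for number in range(11)]

print(square_numbers_loop == square_numbers_comprehension)
```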

## Common mistakes<a id="common_mistakes"></a>

You are about half way through the tutorial - congratulations! Let's take a breath and recap some of what has been mentioned previously with this short task.

The code below contains several common mistakes

>```python
kitten property  = "cute" # variable names cannot include spaces, and cannot begin with a number
string = "I think kittens are {0}".format(Kitten_property) # Python is case-sensitive
print string # print statements need ()
    kitten_food = ["mice","treats","milk"] # Python is sensitive to indentation level
print ("mmmm, I'd love a nice glass of {0}".format(kitten_food[3])) # Indexing begins at 0
```



<div class="alert alert-success"> Correct the code above so that the code runs.</div>

<img src="./images/LPAGP-Zeno.png">

## Reading in data with Pandas<a id="pandas"></a>

We are going to read in data using the [`Pandas`](http://pandas.pydata.org) module.

>```python
import pandas as pd # We load the pandas module so we can access all the useful functions
```

The tutorial files include a `data` directory that contains an example data file. To read in the data use the `read_csv` function.

>```python
data = pd.read_csv( 'data/example.csv', header=1) # the data headings are located on row 1, counting from 0 (the second line of the file)
```

The data has been saved as a [pandas dataframe object](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

>```python
data
```


We can access the values under the time heading using the following syntax 

>```python
time = data['time']
print (time)
```


>```python
temperature = data['temperature']
print (temperature)
```
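To see what the `header` argument of `read_csv` does without needing a file on disk, the sketch below reads a few lines of made-up CSV text from memory. The contents of `csv_text` are invented for illustration and are not the contents of `data/example.csv`.

```python
import io
import pandas as pd

# made-up file contents: one description line, then the headings, then the data
csv_text = """# example measurement (made-up values)
time,temperature
0,20.0
1,21.5
2,23.1
"""

# header=1 means the headings are on row 1 (counting from 0); row 0 is skipped
example_data = pd.read_csv(io.StringIO(csv_text), header=1)

print(example_data.columns.tolist())       # the column headings
print(example_data['temperature'].mean())  # a column can be used in calculations
```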

## Plotting data with matplotlib<a id='matplotlib'></a>

To plot data we use another module: [`matplotlib`](http://matplotlib.org). This is a very powerful (and complicated) plotting library that can be used for quick analysis of experimental data, or to generate publication-quality figures. It supports an enormous number of plot types. We are going to start with simple 2D $x,y$ plots.

>```python
import matplotlib.pyplot as plt
# the 'magic' below specifies that matplotlib figures should be shown directly in the notebook
%matplotlib inline
```

We create a plot using `plt.plot()`. Remember, we have assigned `plt` as shorthand for `matplotlib.pyplot`.
>```python
plt.plot(time, temperature)
plt.show()
```

Matplotlib can also be used to plot user-defined functions. This can be used for plotting $y$ as a function of $x$, e.g. $y=x^2$.

>```python
x = [0, 1, 2, 3, 4, 5] # remember that Python lists use square brackets
y = [n**2 for n in x] # this is called list comprehension and it is a very useful python feature
plt.plot( x, y)
plt.show()
```

<div class="alert alert-success">
Plot $y=2x^3$ for $x=0$ to $5$.<br/>
</div>

The default plot shows a connected line. To plot individual points, we can add a third argument to `plt.plot()` that specifies the appearance for that data set:

>```python
plt.plot( x, y, "o" )
plt.show()
```

Adding axis labels and a title uses the `xlabel()`, `ylabel()`, and `title()` commands. Save the figure using `savefig()`.

>```python
plt.plot( x, y, 'o' )
plt.xlabel( 'x' )
plt.ylabel( r'$y$' ) # the r'$ $' notation formats a python raw string as latex
plt.title( r'$y = x^2$' )
plt.savefig('my_figure.pdf',bbox_inches='tight') # this must be in the same cell as the 
# plotting commands. The bbox_inches='tight' keyword removes excess white space from the figure.
plt.show()
```

## Fitting data with numpy<a id='numpy'></a>
We can fit our data using the [`numpy`](http://www.numpy.org) module:

>```python
import numpy as np
```

Let's use the time and temperature data from the previous section. 

>```python
plt.plot(time, temperature,'.')
plt.show()
```

We will use the [`polyfit`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html) least-squares function to fit a quadratic to our data.

The polyfit function takes three arguments. The third argument is the degree of the fitting polynomial.

>```python
fit = np.polyfit(time,temperature,2) # We want to fit a polynomial degree 2 (quadratic)
print (fit) # `fit` contains the coefficients of our polynomial, highest power first. 
```

To plot our fit, create 100 evenly spaced numbers from the start time to the end time using the function `linspace`:

>```python
time_hundred_steps = np.linspace(time.min(),time.max(),100) # time.min() is the start time (minimum time), time.max() is the end time (maximum time).
```

Now use the [`polyval`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyval.html#numpy.polyval) function to calculate the temperature at these times using our calculated fit

>```python
temperature_fit = np.polyval(fit,time_hundred_steps)
```
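A useful way to build confidence in `polyfit` and `polyval` is to try them on data where the answer is already known. The sketch below uses made-up points that lie exactly on $y = 2x^2 + 3x + 1$, so the fitted coefficients should come out close to 2, 3 and 1. The variable names are chosen only for this example.

```python
import numpy as np

x_values = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_values = 2 * x_values**2 + 3 * x_values + 1   # exact quadratic, no noise

# fit a polynomial of degree 2; coefficients are returned highest power first
coefficients = np.polyfit(x_values, y_values, 2)
print(coefficients)

# evaluate the fitted polynomial at a point that was not in the data
value_at_two_point_five = np.polyval(coefficients, 2.5)
print(value_at_two_point_five)
```

With real, noisy measurements the coefficients will not match a simple formula like this, but the order in which they are returned is the same.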

Plot the original data and our fit on the same figure

>```python
plt.plot(time, temperature, 'o', time_hundred_steps, temperature_fit,'-')
plt.savefig("time_temperature.pdf") # save the plot
plt.show()
```

## Putting it all together: Plotting the deformation potential
<a id='together'></a>

<div class="alert alert-success"> 
We are now going to put everything you have learnt so far together. You are going to read in, plot and fit a polynomial to temperature-dependent powder X-ray diffraction data published [`here`](http://pubs.rsc.org/en/content/articlehtml/2013/ta/c3ta10518k). <br/><br/>

**Step 1)** Create a new notebook called "[Your name here]-thermalexpansion". <br/><br/>

**Step 2)** Import the matplotlib, pandas and numpy modules. <br/><br/>

**Step 3)** Read in the datafile "data/thermalexpansion.csv" and assign values to the variables 'temperature' and 'volume'. <br/><br/>

**Step 4)** Fit a polynomial to the data using numpy (you will have to determine the suitable order of the polynomial). <br/><br/>

**Step 5)**  Plot the data and the polynomial fit using matplotlib. Label the axes and give your plot a title.  <br/><br/>

**Step 6)** Save the figure as "[Your name here]-thermalexpansion.pdf" and send it to <span style='font-family:monospace'>lucywhalley@gmail.com</span>. <br/><br/>
*Hint: It may help to split this work across several cells in your new notebook; errors will be easier to debug.*
</div> 



##  \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_Congratulations! You've finished!\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

<img src="./images/6_chester_wink01.gif"> 

## Extension work
<a id='extension'></a>

Pick an extension task (they can be done in any order):
-  Improve the appearance of your plots. There's lots online but you could start with [`linestyles`](http://matplotlib.org/api/lines_api.html#matplotlib.lines.Line2D.set_linestyle) or [`colours`](http://matplotlib.org/mpl_examples/color/named_colors.hires.png) 
- Use the [`Pandas`](http://pandas.pydata.org) library to remove data outliers and improve the fit


## Resources<a id='resources'></a>

### Python practice and tutorial websites
[Practice Python](http://www.practicepython.org/)

[Introduction to scientific computing with Python](http://nbviewer.jupyter.org/github/jrjohansson/scientific-python-lectures/blob/master/Lecture-0-Scientific-Computing-with-Python.ipynb) (in jupyter notebook format)

[A Byte of Python](https://www.gitbook.com/book/swaroopch/byte-of-python/details) (in GitBook format)

[Simple Programming Problems](https://adriann.github.io/programming_problems.html)

[Check iO game](https://py.checkio.org) (fun, addictive, educational)

### Python documentation and resources
[Python 3 official documentation](http://docs.python.org/3/) 

[Python Crash Course cheat sheets](https://ehmatthes.github.io/pcc/cheatsheets/README.html)

### General background to scientific computing
[Good Enough Practices for Scientific Computing](https://swcarpentry.github.io/good-enough-practices-in-scientific-computing/)