# Python for Programmers

This is a crash course on Python for people who already have some experience with programming (usually in scripting languages such as MATLAB or R).

---
### Lesson Objectives:

* learn basic syntax (functions, loops)

* learn about Python's basic data structures and the differences between them
    - lists, strings, dictionaries, tuples

* learn how to manipulate data with numpy arrays

* learn how to create a simple Class and instantiate objects from it

* learn how to make simple plots


Jupyter Notebooks are great for getting started, sharing results with collaborators, and teaching.

Other development environments: [Spyder](https://pythonhosted.org/spyder/), [PyCharm](https://www.jetbrains.com/pycharm/), [Atom](https://atom.io/) ...

---
### Python's Types
![](pydatatypes.jpg)
Image from http://vinodsrivastava.com/Learning-Hub/



* Python's libraries provide additional data structures
* We can create our own data types
* We can create our own objects

We will learn about all these data types by processing some brain image datasets.

---
### Manipulating Numerical Data

Download the data from https://github.com/valentina-s/PythonForProgrammers/blob/code/data.zip. It is best to have two folders, `notebooks/` and `data/`, in your lesson working directory.

Libraries: some picture.

In [None]:
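# we import this library so that we can read the images in the dataset
# (note: ndimage.imread was removed in SciPy 1.2; on newer versions
# another image reader, for example imageio.imread, is needed instead)
from scipy import ndimage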

In [None]:
image = ndimage.imread('../data/OAS1_0001_MR1_mpr_n4_anon_111_t88_masked_gfc_fseg_tra_90.gif')

In [None]:
type(image)

In [None]:
import numpy as np

In [None]:
# the dimensions of the 2D array (rows,columns)
image.shape

In [None]:
image 

Let's calculate some summary statistics:

In [None]:
np.max(image)

In [None]:
image.max()

In [None]:
image.min()

In [None]:
image.mean()

In [None]:
image.std()

In [None]:
# importing a library for plotting
import matplotlib.pyplot as plt
# telling the notebook to keep the plots inline (instead of opening in a new window)
%matplotlib inline

In [None]:
# displaying a two dimensional image
plt.imshow(image,cmap = 'gray')

In [None]:
# Accessing individual elements
image[0,0]

In [None]:
# checking the type of the array's entries
type(image[100,100])

In Python, indices start from zero!

In [None]:
# extracting a part from the array
image[0:5,0:5]

In Python the last index of a range is excluded: here we read indices `0,...,4`.
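
As a minimal sketch of this half-open convention, the example below uses a small synthetic array (`row` is a made-up stand-in, not part of the dataset) to show that a slice `start:stop` contains `stop - start` elements:

```python
import numpy as np

# a small synthetic 1D array standing in for one row of the image
row = np.arange(10)          # values 0, 1, ..., 9

first_five = row[0:5]        # indices 0, 1, 2, 3, 4 (index 5 is excluded)
next_five = row[5:10]        # indices 5, ..., 9 (no overlap with first_five)

print(first_five)
print(len(first_five))       # stop - start = 5 elements
```

Because the stop index is excluded, consecutive slices such as `0:5` and `5:10` split an array without overlapping.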

In [None]:
# let's extract half of this image
leftHemisphere = image[0:208,0:88]

In [None]:
plt.imshow(leftHemisphere, cmap = 'gray')

In [None]:
leftHemisphere.shape

In Python you can drop the first or the last index:

In [None]:
leftHemisphere = image[:208,:88]

In [None]:
leftHemisphere = image[:,:88]

We can also skip indices:

In [None]:
#every other element:
small_image = image[0:208:2,0:176:2]

In [None]:
small_image.shape

In [None]:
plt.imshow(small_image, cmap = 'gray')

In [None]:
# shortcut
small_image = image[::2,::2]

In [None]:
plt.imshow(small_image,cmap = 'gray')

In [None]:
# we can transpose rows with columns
plt.imshow(small_image.T,cmap = 'gray')

From 2D to 1D.

In [None]:
image.flatten().shape

Or is it 0D? The shape has a single entry, so the flattened array is 1D.

In [None]:
all_values = image.flatten()

In [None]:
imageSlice = image[180,:]

In [None]:
plt.plot(imageSlice)
plt.title('Row 181 Profile')

In [None]:
plt.plot(image.mean(axis = 0))
plt.title('Mean Profile')

In [None]:
plt.hist(all_values)
plt.title('Histogram of Intensity Values')

We observe 4 levels of values. Let's explore these individual levels.

In [None]:
# let's look at the distinct intensity values
np.unique(all_values)

In [None]:
# checking conditions
type((image == 0)[0,0])

In [None]:
plt.imshow(image==63, cmap = 'gray')
plt.title('CSF')

In [None]:
plt.imshow(image==127, cmap = 'gray')
plt.title('Grey Matter')

In [None]:
plt.imshow(image==191, cmap = 'gray')
plt.title('White Matter')

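In [None]:
# the built-in sum adds along the first axis only: one count per column
volume = sum(image==191)

In [None]:
# np.sum adds over all entries: the total number of pixels in the region
volume = np.sum(image==191)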

In [None]:
volume
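
The difference between the built-in `sum` and `np.sum` is easier to see on a tiny array. The sketch below uses a hypothetical 2x3 array of labels (`labels` is made up for illustration, not taken from the dataset):

```python
import numpy as np

# hypothetical 2x3 label array, not the real image
labels = np.array([[191, 0, 191],
                   [191, 63, 0]])
mask = labels == 191

per_column = sum(mask)      # built-in sum adds the rows: one count per column
total = np.sum(mask)        # np.sum adds over all entries

print(per_column)           # [2 0 1]
print(total)                # 3
```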

TODO: Counting from the end, any numpy array as indices.
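
A possible sketch for this topic, using a made-up array `values`: negative indices count from the end, and a numpy array (of integers or of booleans) can itself be used as an index.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])

last = values[-1]                   # counting from the end: the last element
last_two = values[-2:]              # the final two elements

picked = values[np.array([0, 3])]   # integer array as indices
large = values[values > 25]         # boolean array as indices
```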

TODO: = vs. copy
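
A minimal sketch of the distinction, with hypothetical variable names: assigning with `=` only creates a second name for the same array, while `.copy()` creates an independent array.

```python
import numpy as np

original = np.zeros((2, 2))

alias = original                # '=' gives a second name for the same array
alias[0, 0] = 1                 # this also changes `original`

independent = original.copy()   # a separate array with its own data
independent[1, 1] = 5           # `original` is unaffected

print(original)
```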

---
### Creating Functions

In [None]:
def calculateRegionVolume(image,region):
    volume = np.sum(image == region)
    return volume

In [None]:
calculateRegionVolume(image,0)

Exercise: write a function that calculates the volumes of the three tissue regions (CSF, grey matter, white matter). Hint: a function can return more than one value.

In [None]:
def calculateVolumes(image):
    volume1 = np.sum(image == 63)
    volume2 = np.sum(image == 127)
    volume3 = np.sum(image == 191)
    return volume1, volume2, volume3

TODO: Explain Scope by removing region.

TODO: Show we can have default parameters, play with order.
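
One way this could look, as a sketch: `region_volume` is a hypothetical variant of the volume function with a default value for `region`, and `toy` is a made-up array.

```python
import numpy as np

def region_volume(image, region=191):
    """Count the pixels of `image` equal to `region` (default: 191)."""
    return np.sum(image == region)

toy = np.array([[191, 63], [191, 0]])

default_volume = region_volume(toy)             # uses region=191
csf_volume = region_volume(toy, 63)             # positional argument
swapped = region_volume(region=0, image=toy)    # keyword arguments, any order
```

Parameters with default values must come after the parameters without defaults in the function definition.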

---
### Iterations: Lists and For-loops

We need something to iterate over: usually that is a Python list.

In [None]:
myList = [1,2,3,4,5]

In [None]:
myList

In [None]:
## We can add more elements easily
myList.append(6)
myList

What is the difference with numpy arrays?
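
One difference shows up in arithmetic. In the sketch below (variable names are illustrative), the same operators repeat or concatenate a list but act element by element on a numpy array:

```python
import numpy as np

my_list = [1, 2, 3]
my_array = np.array([1, 2, 3])

doubled_list = my_list * 2      # repeats the list: [1, 2, 3, 1, 2, 3]
doubled_array = my_array * 2    # multiplies each element: [2, 4, 6]

joined_list = my_list + [4]     # concatenation: [1, 2, 3, 4]
shifted_array = my_array + 4    # adds 4 to each element: [5, 6, 7]
```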

We can put anything in lists:

In [None]:
mixedList = [1,'a',3,'avs',True]

In [None]:
type(mixedList[4])

In [None]:
# we can even put another list
mixedList.append([1,23])

In [None]:
mixedList

Python For-loops = iterating over lists

In [None]:
for item in mixedList:
    print(item)

The `range` function is very useful when working with loops.

In [None]:
for i in range(6):
    print(mixedList[i])

TODO: help of functions, write help for my own functions
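
A small sketch of how this might be covered: a string placed on the first line of a function body (a docstring) is what `help` displays. The function `count_pixels` is a hypothetical example.

```python
import numpy as np

def count_pixels(image, region):
    """Return the number of pixels in `image` whose value equals `region`."""
    return np.sum(image == region)

# help() prints the signature and the docstring
help(count_pixels)

# the docstring is also available as an attribute
doc = count_pixels.__doc__
```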

TODO: change volume to percent (discuss division between python 2 and 3)
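
A sketch of the percent calculation on a made-up array `toy`, assuming Python 3, where `/` is always true division and `//` is floor division:

```python
import numpy as np

toy = np.array([[191, 191, 0, 63]])

volume = np.sum(toy == 191)          # 2 pixels
percent = 100 * volume / toy.size    # true division in Python 3: 50.0

# in Python 2, `/` between two plain integers behaved like `//` does here
halves = 3 // 2       # floor division: 1
true_half = 3 / 2     # true division in Python 3: 1.5
```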

We have 44 patients and we want to calculate the volumes for all of them. We want to iterate over all image files in the folder. To extract a list of names for files in a folder matching a certain pattern, we can use a Python library called `glob`.

In [None]:
import glob

In [None]:
image_filenames = glob.glob('../data/*.gif')

In [None]:
print(image_filenames)

Exercise: calculate the volume for the region with intensity 191 for all images and store these volumes in a list. 

Hint: you can initialize an empty list with 

`l = []`

In [None]:
listOfVolumes = []
for filename in image_filenames:
    image = ndimage.imread(filename)
    listOfVolumes.append(calculateRegionVolume(image,191))

In [None]:
listOfVolumes

Warning: `glob.glob` returns the filenames in arbitrary order, which may differ between machines!
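
One way to get a reproducible order is to sort the result. The sketch below creates a few empty files in a temporary folder (the file names are hypothetical) so that it can run without the dataset:

```python
import glob
import os
import tempfile

# create a few empty files in a temporary folder (hypothetical names)
folder = tempfile.mkdtemp()
for name in ['b.gif', 'a.gif', 'c.gif']:
    open(os.path.join(folder, name), 'w').close()

# sorted() returns the matches in alphabetical order on every run
filenames = sorted(glob.glob(os.path.join(folder, '*.gif')))
basenames = [os.path.basename(f) for f in filenames]
print(basenames)    # ['a.gif', 'b.gif', 'c.gif']
```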

How can we make the results safer and clearer?

We want to keep the name of the files: i.e. we need a mapping from the filename to the volume. We can use a dictionary data structure for that.

---
### Dictionaries

In [None]:
myDict = {'John':21,'Pamela': 32,'Eli':20}

In [None]:
# Accessing values
myDict['Pamela']

In [None]:
# the keys
myDict.keys()

In [None]:
# the values
myDict.values()

In [None]:
# keys and values
myDict.items()

In [None]:
# Create new elements:
myDict['Jim'] = 20

In [None]:
myDict

In older versions of Python (before 3.7) the order of dictionary entries was not guaranteed, so the 'Jim' entry might not appear at the end. Since Python 3.7, dictionaries preserve insertion order.

Either way, access dictionary entries by key, not by position!

In [None]:
# Empty Dictionary:
dictOfVolumes = {}

In [None]:
for filename in image_filenames:
    image = ndimage.imread(filename)
    dictOfVolumes[filename] = calculateRegionVolume(image,191)

In [None]:
dictOfVolumes

In [None]:
for filename in image_filenames:
    image = ndimage.imread(filename)
    dictOfVolumes[filename] = calculateVolumes(image)

In [None]:
dictOfVolumes

From list of volumes for each subject to list of volumes for each region: the `zip` function.

In [None]:
v = list(zip(*list(dictOfVolumes.values())))
print(v)
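
To see what `zip(*...)` does, here is a sketch on a small made-up list of per-subject tuples (the numbers are hypothetical, not real volumes):

```python
# hypothetical (CSF, grey matter, white matter) volumes for three subjects
per_subject = [(10, 20, 30), (11, 21, 31), (12, 22, 32)]

# zip(*...) regroups the tuples: one tuple per region instead of per subject
per_region = list(zip(*per_subject))
csf, grey, white = per_region

print(per_region)   # [(10, 11, 12), (20, 21, 22), (30, 31, 32)]
```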

In [None]:
plt.scatter(v[1],v[2])
plt.xlabel('Gray Matter Volume')
plt.ylabel('White Matter Volume')

---
### Classes

Reading a text file:

In [None]:
f = open('../data/OAS1_0002_MR1.txt', 'r')

In [None]:
first_line = f.readline()
second_line = f.readline()

In [None]:
f.close()

In [None]:
second_line

In [None]:
second_line[-3:-1]
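
The slice above counts from the end of the string. As a sketch on a hypothetical metadata line (the real file contents may differ), it picks out the two characters before the trailing newline; splitting on `:` is an alternative that does not depend on the number of digits:

```python
# hypothetical metadata line; the real file contents may differ
line = 'AGE:          74\n'

tail = line[-3:-1]               # the two characters before the newline: '74'
age = int(line.split(':')[1])    # split on ':' and convert the rest: 74
```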

In [None]:
class neuroImage:
    image = image
    subjectID = 'OAS1_0002_MR1'
    def calculateVolumes(self):
        return calculateVolumes(self.image)

In [None]:
nim = neuroImage()

In [None]:
nim.subjectID

In [None]:
nim.calculateVolumes()

Initializing values of an object: adding a constructor.

In [None]:
class neuroImage:
    
    def __init__(self,subjectID):
        self.image = ndimage.imread('../data/'+subjectID+'_mpr_n4_anon_111_t88_masked_gfc_fseg_tra_90.gif')
        self.subjectID = subjectID
        
    def calculateVolumes(self):
        return calculateVolumes(self.image)

In [None]:
nim = neuroImage('OAS1_0002_MR1')

In [None]:
nim.subjectID
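
Each object keeps its own copy of the values set in the constructor. The sketch below uses a hypothetical class `ToyImage`, which takes an array directly instead of reading a file, to show two objects holding different data:

```python
import numpy as np

class ToyImage:
    """Hypothetical stand-in for neuroImage that takes an array directly."""

    def __init__(self, subjectID, image):
        self.subjectID = subjectID
        self.image = image

    def calculateVolume(self, region):
        # use the image stored on this object, not a global variable
        return np.sum(self.image == region)

first = ToyImage('subject_A', np.array([[191, 0], [191, 191]]))
second = ToyImage('subject_B', np.array([[0, 0], [191, 63]]))
volumes = [first.calculateVolume(191), second.calculateVolume(191)]
```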

Exercise: create a list of `neuroImage` objects, one per subject, using the corresponding image file and metadata file. Each object should have the fields: image, subjectID, age.

In [None]:
metadata_filenames = glob.glob('../data/*.txt')
print(metadata_filenames)