## A bit more on functions
Defining a function lets you **reuse** the same code multiple times

A function, **f**, with inputs, **x**, **y**, **z**, is called through **f(x, y, z)**

The keyword **return** inside the function specifies which value is handed back to the caller
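
As a minimal sketch of the points above (the function name `add_three` is made up for illustration), here is a function that is defined once and then called twice with different inputs:

```python
def add_three(x, y, z):
    # return hands the computed value back to the caller
    return x + y + z

total = add_three(1, 2, 3)     # call the function with three inputs
print(total)                   # 6
print(add_three(10, 20, 30))   # 60
```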

## Another example of a function
This one returns the **arithmetic mean, geometric mean, and harmonic mean** of two input values

In [1]:
def get_means(x, y):
    return (x + y) / 2, (x * y) ** 0.5, 2 / ((1 / x) + (1 / y))

Side note: the log of the geometric mean is the arithmetic mean of the logs, since log((x * y) ** 0.5) = 0.5 * log(x * y) = 0.5 * (log(x) + log(y))
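
The log identity above can be checked numerically. This is only a quick sketch using the standard `math` module; the inputs 1000 and 10 are borrowed from the test below, and `math.isclose` is used because floating-point results may differ in the last digit:

```python
import math

x, y = 1000, 10

geometric_mean = (x * y) ** 0.5
mean_of_logs = 0.5 * (math.log(x) + math.log(y))

# both sides of the identity, compared with a tolerance
print(math.log(geometric_mean), mean_of_logs)
print(math.isclose(math.log(geometric_mean), mean_of_logs))  # True
```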

### Test

In [4]:
print(get_means(1000, 10))

(505.0, 100.0, 19.801980198019802)


In [6]:
print(get_means(200, 30))

(115.0, 77.45966692414834, 52.173913043478265)


## print() is a function that we will be using a lot
**print(x)** will show the value of **x**

**print(x, y)** will show the values of **x** and **y**

In [7]:
x = 1 + 1 + 1/2 + 1/6 + 1/24 + 1/120 + 1/720
print(x)
print(x, x ** 2)
print('the mean value of feature x is', x)

2.7180555555555554
2.7180555555555554 7.38782600308642
the mean value of feature x is 2.7180555555555554


## print() can also readily handle other Python objects
Like **list**

In [9]:
y = [3,0,1,1,9,7,9]
print(y)
print(get_means(3, 1))

[3, 0, 1, 1, 9, 7, 9]
(2.0, 1.7320508075688772, 1.5)


In [14]:
print(y[-2])

7


Or **dictionary**

In [15]:
z = {4:'four', 'two':2, 'three':3}
print(z)

{4: 'four', 'two': 2, 'three': 3}


In [18]:
print(z['three'])

3


In [19]:
gene_to_symbol = {'ENSG0000001231236': 'DGS3'}
print(gene_to_symbol['ENSG0000001231236'])

DGS3


## A dictionary is a generalized mapping from key to value
A list is a mapping from index to value: y[0] -> 3 acts as a mapping from index 0 to value 3

But a dictionary's key can be a non-integer: z['two'] -> 2
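
A short sketch of the comparison, reusing **y** and **z** from the cells above. The key `'ten'` is a made-up example of a key that is not present; `in` and `.get()` are two standard ways to avoid an error when a key may be missing:

```python
y = [3, 0, 1, 1, 9, 7, 9]
z = {4: 'four', 'two': 2, 'three': 3}

print(y[0])          # 3     (list: index -> value)
print(z['two'])      # 2     (dictionary: string key -> value)
print(z[4])          # four  (dictionary keys can be integers too)

print('ten' in z)    # False (check whether a key exists)
print(z.get('ten'))  # None  (get() returns None instead of raising KeyError)
```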

## Let's focus on list for now
First, let's create a list with some names. We can use **list.index()** to ask where something is located in the list

**list** is a **data structure** and **index()** is one of its built-in methods

In [20]:
department = ['medicine', 'radiology', 'pathology', 'pediatrics']
print(department.index('pediatrics'))

3


In [21]:
print(department.index('AI'))

ValueError: 'AI' is not in list

## Whoops! 'AI' is not in the list and so we got an error
We have to be careful when using **index()**

A good way is to check first whether the thing we look for is present **in** the list

In [24]:
if 'radiology' in department:
    print('we have radiology!')
else:
    print(':(')

we have radiology!
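
One way to package this check, sketched under the assumption that returning `None` is an acceptable signal for "not found" (the helper name `find_position` is hypothetical):

```python
department = ['medicine', 'radiology', 'pathology', 'pediatrics']

def find_position(items, target):
    # hypothetical helper: index of target, or None when it is absent
    if target in items:
        return items.index(target)
    return None

print(find_position(department, 'pathology'))  # 2
print(find_position(department, 'AI'))         # None
```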


## List indexing, the other way around
department = ['medicine', 'radiology', 'pathology', 'pediatrics']

We can use **negative index** to quickly access the **end** of the list

In [25]:
department = ['medicine', 'radiology', 'pathology', 'pediatrics']
print('location 0,1:', department[0][1])
print('location 2,5:', department[2][5])

location 0,1: e
location 2,5: l


In [26]:
print('location -1,-3:', department[-1][-3])

location -1,-3: i


In [27]:
print('location -2,-4:', department[-2][-4])

location -2,-4: l
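
The cells above index twice: first into the list, then into the string that was picked. A small sketch that separates the two steps may make this easier to follow:

```python
department = ['medicine', 'radiology', 'pathology', 'pediatrics']

print(department[-1])      # pediatrics (last element of the list)
print(department[-2])      # pathology  (second to last element)
print(department[-1][-3])  # i          (third character from the end of 'pediatrics')

# index -1 refers to the same element as index len(department) - 1
print(department[-1] == department[len(department) - 1])  # True
```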


## A list is also useful for specifying the range of a for loop
A for loop visits each element in turn. The same syntax works on a dictionary, where it visits the keys; here we loop over **z**

In [30]:
for x in z:
    print(z[x])

four
2
3
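
For comparison, here is a sketch of the same loop syntax applied to the list **department**, where the loop variable takes each element in order (the variable name `name` is arbitrary):

```python
department = ['medicine', 'radiology', 'pathology', 'pediatrics']

for name in department:
    # name is 'medicine' first, then 'radiology', and so on
    print(name, len(name))
```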


### We can run a for loop the old-fashioned way
**range()** is a built-in Python function that returns numbers, starting from 0 by default.

**range(10)** returns 0, 1, ..., and 9 in that order

In [31]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


We can further control the **start**, **stop**, and **step** of **range()**

In [32]:
for i in range(1, 10, 2):
    print(i)

1
3
5
7
9


We can even go **backward**

In [33]:
for i in range(10, 1, -2):
    print(i)

10
8
6
4
2
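
A quick way to inspect what **range()** produces without writing a loop is to wrap it in **list()**. Note in this sketch that **stop** itself is never included:

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]
print(list(range(1, 10, 2)))   # [1, 3, 5, 7, 9]
print(list(range(10, 1, -2)))  # [10, 8, 6, 4, 2]

# counting up from 5 to 1 is impossible, so the range is empty
print(list(range(5, 1)))       # []
```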


## Instead of looping over elements in a list, we can loop over the indices
Recall that **department** is a **list** which maps 0 -> 'medicine', 1 -> 'radiology', ...

**len()** is a built-in Python function that returns the **size** of an object

In [34]:
department = ['medicine', 'radiology', 'pathology', 'pediatrics', 'surgery', 'immunology', 
              'microbiology', 'anesthesiology', '']

for i in range(0, len(department), 2):
    print(i, department[i])

0 medicine
2 pathology
4 surgery
6 microbiology
8 
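
When both the index and the element are needed for every item, the built-in **enumerate()** is a common alternative to **range(len(...))**. A minimal sketch, using a shorter list for brevity:

```python
department = ['medicine', 'radiology', 'pathology', 'pediatrics']

# enumerate yields (index, element) pairs
for i, name in enumerate(department):
    print(i, name)
```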


## Exercises: Let's apply what we learned to do some analyses

In [35]:
patient_age = [18,        47,   12,     8,      4,     65,      17,      34,      77]
patient_name = ['Alice', 'Bob', 'Clare', 'Don', 'Eric', 'Fei', 'Gabriel', 'Henry', 'Ivan']

### Task 1: Count number of patients

In [36]:
print('the number of patients is', len(patient_name))
print('the number of patients is', len(patient_age))

the number of patients is 9
the number of patients is 9


### Task 2: Calculate geometric mean of age

In [41]:
product_age = 1
num_patient = len(patient_age)

for x in patient_age:
    product_age = product_age * x
    
geomean_age = product_age ** (1 / num_patient)
    
print('geometric mean of patient age is', geomean_age)

geometric mean of patient age is 21.39622008180148
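
Multiplying many numbers together can produce a very large product. One alternative, sketched here with the standard `math` module, is to average the logs and exponentiate the result; it should agree with the value above up to floating-point rounding:

```python
import math

patient_age = [18, 47, 12, 8, 4, 65, 17, 34, 77]

# sum the logs instead of multiplying the raw values
log_sum = 0.0
for x in patient_age:
    log_sum = log_sum + math.log(x)

geomean_age = math.exp(log_sum / len(patient_age))

print('geometric mean of patient age is', geomean_age)  # about 21.396
```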


In [42]:
print('average of patient age is', sum(patient_age) / len(patient_age))

average of patient age is 31.333333333333332


### Task 3: Find the lowest and highest ages

In [44]:
min_age = 1000
max_age = 0

for x in patient_age:
    if x < min_age:
        min_age = x
    
    if x > max_age:
        max_age = x
    
print('the youngest patient\'s age is', min_age)
print('the oldest patient\'s age is', max_age)

the youngest patient's age is 4
the oldest patient's age is 77


In [43]:
print(max(patient_age), min(patient_age))

77 4


### Task 4: Find the name of the youngest patient

In [45]:
min_age = 1000

for x in patient_age:
    if x < min_age:
        min_age = x

min_patient_index = patient_age.index(min_age)
   
print("the youngest patient's name is", patient_name[min_patient_index])

the youngest patient's name is Eric
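
The same task can be sketched with the built-ins **min()** and **zip()**. The second version relies on the fact that tuples are compared element by element, so the age decides first:

```python
patient_age = [18, 47, 12, 8, 4, 65, 17, 34, 77]
patient_name = ['Alice', 'Bob', 'Clare', 'Don', 'Eric', 'Fei', 'Gabriel', 'Henry', 'Ivan']

# version 1: find the smallest age, then look up its position
youngest_index = patient_age.index(min(patient_age))
print(patient_name[youngest_index])  # Eric

# version 2: pair each age with its name and take the smallest pair
youngest_age, youngest_name = min(zip(patient_age, patient_name))
print(youngest_name, youngest_age)   # Eric 4
```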


### Task 5: Count the number of patients above 60 years old

In [46]:
num_patient_over60 = 0

for x in patient_age:
    if x > 60:
        num_patient_over60 = num_patient_over60 + 1
    
print('number of patients over 60 is', num_patient_over60)

number of patients over 60 is 2


### Task 6: List the names of patients above 60 years old

In [47]:
for i in range(len(patient_age)):
    if patient_age[i] > 60:
        print(patient_age[i], patient_name[i])

65 Fei
77 Ivan


## And now, the BEST feature of Python list
This is a technique called **list comprehension**

Can you guess what the following code will output?

In [48]:
y = [i for i in range(5)]
print(y)

[0, 1, 2, 3, 4]


In [49]:
y = [i for i in range(5) if i % 2 == 0]
print(y)

[0, 2, 4]


In [50]:
y = []

for i in range(5):
    if i % 2 == 0:
        y.append(i)

print(y)

[0, 2, 4]


In [51]:
y = [i for i in range(100) if i > 10 and i ** 2 < 900]
print(y)

[11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]


In [52]:
department = ['medicine', 'radiology', 'pathology', 'pediatrics', 'surgery', 'immunology', 
              'microbiology', 'anesthesiology']

print([x for x in department if x[-1] == 'y'])

['radiology', 'pathology', 'surgery', 'immunology', 'microbiology', 'anesthesiology']


In [53]:
print([x for x in department if not x[-1] == 'y'])

['medicine', 'pediatrics']


### Let's solve Task 5 again with list comprehension

In [54]:
patient_age = [18, 47, 12, 8, 4, 65, 17, 34, 77]
num_above_60 = len([x for x in patient_age if x > 60])

print(num_above_60, [x for x in patient_age if x > 60])

2 [65, 77]


### Let's solve Task 6 again with list comprehension

In [57]:
patient_age = [18, 47, 12, 8, 4, 65, 17, 34, 77]
patient_name = ['Alice', 'Bob', 'Clare', 'Don', 'Eric', 'Fei', 'Gabriel', 'Henry', 'Ivan']

patient_above_60 = [[patient_name[i], patient_age[i]] for i in range(len(patient_name)) if patient_age[i] > 60]
print(patient_above_60)

[['Fei', 65], ['Ivan', 77]]
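
A variation on the cell above, as a sketch: the built-in **zip()** pairs names with ages directly, which avoids indexing with **i**. Here each result is a tuple rather than a list:

```python
patient_age = [18, 47, 12, 8, 4, 65, 17, 34, 77]
patient_name = ['Alice', 'Bob', 'Clare', 'Don', 'Eric', 'Fei', 'Gabriel', 'Henry', 'Ivan']

# zip yields (name, age) pairs; keep only those with age above 60
patient_above_60 = [(name, age) for name, age in zip(patient_name, patient_age) if age > 60]
print(patient_above_60)  # [('Fei', 65), ('Ivan', 77)]
```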


## Finally, we can access more than one element of a list at a time
This is called **slicing**

In [58]:
patient_name = ['Alice', 'Bob', 'Clare', 'Don', 'Eric', 'Fei', 'Gabriel', 'Henry', 'Ivan']
print(patient_name[1:5])

['Bob', 'Clare', 'Don', 'Eric']


In [59]:
print(patient_name[2:3])
print(patient_name[4:4])

['Clare']
[]


Similar to **range()**, we can also define **start**, **stop**, and **step** with **slicing**

In [60]:
print(patient_name[1:5:2])

['Bob', 'Don']


We can also slice with **negative** indices, which count from the end of the list

In [61]:
print(patient_name[-3:-1])

['Gabriel', 'Henry']


Note that **slicing** assumes default **start = 0** and **stop = end of list**

In [62]:
print(patient_name[:2])
print(patient_name[-3:])

['Alice', 'Bob']
['Gabriel', 'Henry', 'Ivan']
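
Two more properties of slicing that follow from these defaults, shown as a brief sketch: omitting both **start** and **stop** gives a new list with the same elements, and a **stop** beyond the end of the list is simply clipped instead of raising an error:

```python
patient_name = ['Alice', 'Bob', 'Clare', 'Don', 'Eric', 'Fei', 'Gabriel', 'Henry', 'Ivan']

copy_of_names = patient_name[:]
print(copy_of_names == patient_name)  # True  (same elements)
print(copy_of_names is patient_name)  # False (but a separate list object)

print(patient_name[7:100])            # ['Henry', 'Ivan']
```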


### Task 7: Use slicing to get just 'Alice', 'Don', and 'Gabriel'

In [63]:
patient_name = ['Alice', 'Bob', 'Clare', 'Don', 'Eric', 'Fei', 'Gabriel', 'Henry', 'Ivan']
print(patient_name[::3])

['Alice', 'Don', 'Gabriel']


In [64]:
print(patient_name[::-1])

['Ivan', 'Henry', 'Gabriel', 'Fei', 'Eric', 'Don', 'Clare', 'Bob', 'Alice']
