[View in Colaboratory](https://colab.research.google.com/github/tompollard/buenosaires2018/blob/master/1_intro_to_python.ipynb)

# Santiago and Carolina are the best

In this part of the workshop, we will introduce some basic programming concepts in Python. We will then explore how these concepts allow us to carry out an analysis that can be reproduced.

## Working with variables

You can get output from Python by typing math into a code cell. Try executing a sum below (for example: 3 + 5).

In [0]:
3+5

However, to do anything useful, we will need to assign values to `variables`. Assign a height in cm to a variable in the cell below.

In [0]:
height_cm = 180

Now that the value has been assigned to our variable, we can display it with `print`.

In [0]:
print('Height in cm is:', height_cm)

We can also do arithmetic with the variable. Convert the height in cm to metres, then print the new value as before. (Warning! In Python 2, dividing an integer by an integer returns an integer, so `180 / 100` gives `1`. In Python 3, `/` always returns a float.)

In [0]:
height_m = height_cm / 100
print('height in metres:', height_m)
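
The division behaviour described in the warning can be checked directly. The following is a minimal sketch, assuming Python 3; the names `height_m_true` and `height_m_floor` are introduced here purely for illustration.

```python
height_cm = 180

# In Python 3, `/` is true division and always returns a float
height_m_true = height_cm / 100

# `//` is floor division: it discards the fractional part
height_m_floor = height_cm // 100

print('true division:', height_m_true)
print('floor division:', height_m_floor)
```

Floor division reproduces what Python 2 did when both operands were integers, which is why the warning matters when running older code.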

We can check which variables are available in memory with the special command: `%whos`

In [0]:
%whos

We can see that each of our variables has a type (in this case `int` and `float`), describing the type of data held by the variable. We can use `type` to check the data type of a variable.

In [0]:
type(height_cm)

Another data type is a `list`, which can hold a series of items. For example, we might measure a patient's heart rate several times over a period. 

In [0]:
heartrate = [66, 64, 63, 62, 66, 69, 70, 75, 76]

In [0]:
print(heartrate)

In [0]:
type(heartrate)

## Repeating actions in loops

We can access individual items in a list using an index (note: in Python, indexing begins at 0!). For example, let's view the first `[0]` and second `[1]` heart rate measurements.

In [0]:
print(heartrate[0])

In [0]:
print(heartrate[1])
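
Because indexing starts at 0, the last valid index is one less than the length of the list. The sketch below shows a few related ways of reaching into a list; the variable names other than `heartrate` are illustrative only.

```python
heartrate = [66, 64, 63, 62, 66, 69, 70, 75, 76]

# len() gives the number of items, so valid indices run from 0 to len - 1
n_measurements = len(heartrate)

# Negative indices count from the end of the list
last_measurement = heartrate[-1]

# A slice [start:stop] returns a new list; the item at `stop` is not included
first_three = heartrate[0:3]

print(n_measurements, last_measurement, first_three)
```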

We can iterate through a list with the help of a `for` loop. Let's try looping over our list of heart rates, printing each item as we go.

In [0]:
for i in heartrate:
    print('the heart rate is:', i)
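
Loops become more useful when we do something with each item besides printing it. As one possible sketch, the loop below keeps a running total in order to compute the mean heart rate, and uses the built-in `enumerate` to track each item's position. The names `position`, `rate`, `total` and `mean_heartrate` are chosen for this example.

```python
heartrate = [66, 64, 63, 62, 66, 69, 70, 75, 76]

total = 0
# enumerate() yields (index, item) pairs, with the index starting at 0
for position, rate in enumerate(heartrate):
    print('measurement', position, 'is:', rate)
    total = total + rate

# Divide the accumulated total by the number of measurements
mean_heartrate = total / len(heartrate)
print('mean heart rate:', mean_heartrate)
```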

## Making choices

Sometimes we want to take different actions depending on a set of conditions. We can do this using an `if/else` statement. Let's write a statement to test if a mean arterial pressure (`meanpressure`) is high or low.

In [0]:
meanpressure = 70

if meanpressure < 60:
    print('Low pressure')
elif meanpressure > 100:
    print('High pressure')
else:
    print('Normal pressure')
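
The same test can be applied to several values by placing it inside a loop. This is only a sketch: the values in `pressures` are hypothetical examples rather than measurements from a dataset, and the thresholds are simply those used in the cell above.

```python
# Hypothetical example values, chosen to hit each branch once
pressures = [55, 70, 110]

labels = []
for meanpressure in pressures:
    if meanpressure < 60:
        label = 'Low pressure'
    elif meanpressure > 100:
        label = 'High pressure'
    else:
        # Reached when 60 <= meanpressure <= 100
        label = 'Normal pressure'
    labels.append(label)
    print(meanpressure, label)
```

Note that the conditions are checked in order and only the first matching branch runs, so values of exactly 60 or 100 fall through to `else`.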

## Writing our own functions

To help organise our code and to avoid replicating the same code again and again, we can create functions. 

Let's create a function to convert temperature in Fahrenheit to Celsius, using the following formula:

`celsius = (fahrenheit - 32) * 5/9`

In [0]:
def fahr_to_celsius(temp):
    """Convert a temperature from Fahrenheit to Celsius."""
    celsius = (temp - 32) * 5 / 9
    return celsius

Now we can call the function `fahr_to_celsius` to convert a temperature from Fahrenheit to Celsius.

In [0]:
body_temp_f = 98.6
body_temp_c = fahr_to_celsius(body_temp_f)
print('Patient body temperature is:', body_temp_c, 'celsius')
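
One way to gain confidence in a function is to write its inverse and check that a round trip returns the starting value. The sketch below assumes a helper named `celsius_to_fahr`, which is not part of the workshop material and is introduced here only for illustration.

```python
def fahr_to_celsius(temp):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (temp - 32) * 5 / 9


def celsius_to_fahr(temp):
    """Convert a temperature from Celsius to Fahrenheit (hypothetical inverse helper)."""
    return temp * 9 / 5 + 32


body_temp_c = fahr_to_celsius(98.6)
round_trip_f = celsius_to_fahr(body_temp_c)

# round() hides tiny floating-point differences when displaying the result
print(round(body_temp_c, 1), round(round_trip_f, 1))
```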

## Reusing code with libraries

Python is a popular language for data analysis, so thankfully we can benefit from the hard work of others with the use of libraries. Pandas, a popular library for data analysis, introduces the `DataFrame`, a convenient data structure similar to a spreadsheet. Before using a library, we will need to import it.

In [0]:
# let's assign pandas an alias, pd, for brevity
import pandas as pd

We have shared a demo dataset online containing physiological data relating to 1000 patients admitted to an intensive care unit in Boston, Massachusetts, USA. Let's load this data into our new data structure.


In [0]:
url = "https://raw.githubusercontent.com/tompollard/tableone/master/data/pn2012_demo.csv"
data = pd.read_csv(url)

The variable `data` should now contain our new dataset. Let's view the first few rows using `head()`. Note: parentheses `()` are required when we call a method, that is, when we ask Python to perform an action. In this case, the action is to select the first few rows (five by default).

In [0]:
data.head()

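We can perform other operations on the dataframe. For example, we can use `mean()` to get the average of each numeric column.

In [0]:
# numeric_only=True skips text columns, which pandas 2.0 and later no longer drop automatically
data.mean(numeric_only=True)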

If we are unsure what a method does, we can display its documentation by adding `?` after its name. For example, what is `max`?

In [0]:
data.max?

In [0]:
data.max()

We can access a single column in the data by specifying the column name after the variable. For example, we can select the column of ages with `data.Age` (equivalently, `data['Age']`), and then find the mean for this column in a similar way to before.

In [0]:
print('The mean age of patients is:', data.Age.mean())
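
Column selection can be tried without downloading anything by building a small dataframe by hand. In the sketch below, `toy` and its values are hypothetical and are not rows from the demo dataset. It shows that attribute access and bracket access return the same column.

```python
import pandas as pd

# Hypothetical toy values for illustration; not rows from the demo dataset
toy = pd.DataFrame({'Age': [54, 76, 44], 'Height': [170.0, 165.5, 180.2]})

# Attribute access and bracket access select the same column (a pandas Series)
ages_attr = toy.Age
ages_bracket = toy['Age']

mean_age = ages_bracket.mean()
print(type(ages_bracket).__name__, mean_age)
```

Bracket notation is the more general of the two, because it also works for column names that contain spaces or that clash with existing dataframe attributes.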

Pandas also provides a convenient `plot` method for plotting data. Let's plot the distribution of patient ages in our dataset as a kernel density estimate (`kind='kde'`).

In [0]:
data.Age.plot(kind='kde', title='Age of patients in years')