# Python

Most of you will have already used Python, but to make sure everyone starts from the same point, please complete the following exercises.

1. Define variables.
2. Types of variables: Sets, dictionaries, lists etc.
3. Working with variables
4. Writing a function, iteration
5. List comprehension
6. Pandas and common functions
7. Matplotlib and common functions

## Defining variables

A variable is a name that refers to a value stored in memory. Once a variable is defined, we can refer to it repeatedly in our code.

In Python, we define a variable as follows:

    a = 1
    
Here, we have created the variable 'a', and assigned the value 1 to it. This is an integer type. There are also other types available. 

If you assign a new value to a variable name that already exists, the new value overwrites the previous one.
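
As a small illustration of overwriting, the sketch below assigns two different values to the same name in turn. The name `score` is made up for this example and is not used elsewhere in the exercises.

```python
# 'score' is a hypothetical variable name used only for this illustration.
score = 1         # 'score' now refers to the integer 1
score = "one"     # assigning again replaces the previous value

# Only the most recent assignment is kept.
print(score)
```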

Create a new variable for each of the following values:

    1.04
    [1,2,3]
    {"a":1, "b":2, "c":3}
    "hello"

In [None]:
#Work here


## Types

You can then use a handy 'type' function to determine what type of value your variable holds.

    type(a)
    
What are the various types for each of the variables you created above?

In [None]:
#work here


As you can see, having different types available can be very handy. You'll see later that each type has its own functions available.

Let's look at the 'dict' type. You created a variable above containing the value:

    {"a":1, "b":2, "c":3}
    
The value to the left of each colon is the 'key', whilst the value to the right of each colon is the 'value'. Together, these form key-value pairs. Multiple key-value pairs, separated by commas, are combined into a dictionary.

It may be easier to see it written as follows:

    new_dict = {
        "a" : 1,
        "b" : 2,
        "c" : 3
    }

You don't have to use a string as a key, or an integer as the value. 

Create a dictionary in which the keys are integers, and the values are strings.

In [None]:
#work here


It's very easy to access keys and values within a dictionary.
Try these out:

    new_dict.keys() 
    new_dict.values()
    new_dict[1]
    
If 1 is not one of your keys, Python will raise a KeyError; replace it with a key you have used.

In [None]:
#work here


We can also perform transformations between types. For example, if we want to take all our dictionary values and put them into a list, we can use the 'list' function.

We can also do this when assigning new variables!

    list(new_dict.values())
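
A minimal sketch of doing the conversion while assigning to a new variable, reusing the `new_dict` from earlier. The names `values_list` and `keys_list` are made up for this example.

```python
# Same dictionary as defined earlier in this section.
new_dict = {"a": 1, "b": 2, "c": 3}

# Convert the values (and keys) to lists and store each result in a new variable.
values_list = list(new_dict.values())
keys_list = list(new_dict.keys())

print(values_list)
print(keys_list)
```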
    
Have a go at creating a variable containing the integer 1.

Turn it into a float, using the function 
    
    float()
    
Turn it back into an integer using the function

    int()

In [None]:
#work here 


### <i>Extra info</i>

If you have a list with many repeated values, you can create a 'set' containing only its unique values. A set is mutable: it can be changed after being created.

We won't cover the difference between mutable and immutable here, but if you are interested: https://medium.com/@meghamohan/mutable-and-immutable-side-of-python-c2145cf72747

See what happens:

    my_list = [1,1,2,3,3,3]
    my_set = set(my_list)

In [None]:
#work here


## Working with variables

We can perform all sorts of operations on variables:

    a + b
    c / d
    my_set.add(5)
    my_list.append(1)
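
The lines above assume the variables already exist. As a runnable sketch, here are the same operations with some made-up numeric values filled in; the values and the names `total` and `ratio` are assumptions for illustration.

```python
# Made-up values so that each operation can actually run.
a = 7
b = 2

total = a + b      # addition
ratio = a / b      # division; in Python 3 this gives a float

my_set = {1, 2, 3}
my_set.add(5)      # sets grow with .add()

my_list = [2, 3]
my_list.append(1)  # lists grow with .append()

print(total, ratio)
print(my_set)
print(my_list)
```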

Create new variables 'a' and 'b' containing strings. What happens when you add them?

In [None]:
#work here


If you create a set:
    
    my_set = set([1,2,3,3])
    
Try to add another 2 to it.

    my_set.add(2)

What happens?

Do the same with a list.

    my_list.append(2)

What happens?

In [None]:
#work here


## Writing a function

If we want to create a new function, we can do so using 'def':
    
    def sum_times_difference(a,b): 
        my_sum = a + b 
        my_difference = abs(a-b)
        answer = my_sum * my_difference 
        return answer 
        
We can then call the function:

    sum_times_difference(x,y) 
    
Note that the variables you pass as arguments do not need to have the same names as the parameters given in the function definition.
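
A short sketch of calling the function with differently named variables. The values of `x` and `y` are made up for this example.

```python
def sum_times_difference(a, b):
    # Same function as above, repeated so this example runs on its own.
    my_sum = a + b
    my_difference = abs(a - b)
    answer = my_sum * my_difference
    return answer

# The caller's names (x, y) differ from the parameter names (a, b).
x = 2
y = 5
result = sum_times_difference(x, y)  # (2 + 5) * |2 - 5| = 7 * 3 = 21
print(result)
```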
        
Write this function below, and add comments (starting with a '#') to describe what is happening on each line. I've started the first for you.

In [None]:
#work here

def sum_times_difference(a,b): #defined sum_times_difference, taking two arguments.
    my_sum = a + b 
    my_difference = abs(a-b)
    answer = my_sum * my_difference 
    return answer

If a = 1, and b = 4, what is the returned value for sum_times_difference?

In [None]:
#work here
a = 1
b = 4


## Iteration

If we have a list of values, or a range that we wish to iterate through, this is easy to do in Python:

    for elem in my_list:
        print(elem)
        
You will notice we've started using print statements now!
        
Create the list 'my_list' with some values, and write the above for loop.

In [None]:
#work here


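A shorter way to get a sequence of numbers is to use the built-in 'range' function:
    
    range(0,11) 
    
This produces every integer from 0 up to (but not including) 11. In Python 3, range returns a 'range' object rather than a list; wrap it in list() if you need an actual list.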
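
As a small sketch, assuming Python 3, the snippet below prints a range object directly and then converts it with list(). It uses a shorter range than the exercise, and the variable names are illustrative.

```python
numbers = range(0, 4)    # a range object; the values are produced on demand
as_list = list(numbers)  # convert to an actual list when one is needed

print(numbers)   # prints: range(0, 4)
print(as_list)   # prints: [0, 1, 2, 3]
```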

Replace my_list with range(0,11) in the for loop. What happens?

In [None]:
#work here


We can also iterate over dictionaries and sets. Any object that holds a collection of values (an 'iterable') can be used in a for loop.

    my_dict = {"A":1, "B":2, "C":3}

    for key, val in my_dict.items():
        print("Key: %s, Value: %i" %(key,val))
        
We can use .items() for dictionaries to extract the keys and values at each iteration. 

Whereas:

    for key in my_dict:
        print(key)

will extract only the keys, and:

    for val in my_dict.values():
        print(val)

will extract only the values.

The print statement has become more complex too. We can use % placeholders to mark where in the string a variable or value should be inserted.

E.g.:

    "Key %s" %key 
    
Here, %s is replaced by the variable 'key', formatted as a string (s). We can also use %i for integers, %f for floats, and so on.
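
A minimal sketch of % formatting with several placeholders at once. The values below are made up for illustration; `%.2f` is a standard variant of `%f` that limits the output to two decimal places.

```python
# Made-up values for illustration.
name = "Ben"
age = 44
height_m = 1.72

# A single value can follow % directly; several values go in a tuple.
line_one = "Name: %s" % name
line_two = "%s is %i years old and %.2f m tall" % (name, age, height_m)

print(line_one)
print(line_two)
```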

Add a line to the loop that uses sum_times_difference(val,2), and print the answer.

What happens if you don't 'print'?

In [None]:
#work here


## List comprehension

List comprehensions can be a bit of a brain ache, but they can turn your for loops into one-liners.

Let's look at an example:

    my_list = [0,1,2,3,4,5,6]
    [i*2 for i in my_list]
    
Can you tell what it is doing? Hint: read right to left. Run it in the cell below to see.
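
To show how a comprehension relates to a for loop, here is a sketch that builds the same list both ways. It deliberately uses a different operation (adding 10) from the example above, and the names `loop_result` and `comp_result` are made up.

```python
my_list = [0, 1, 2, 3]

# For-loop version: build the result one element at a time.
loop_result = []
for i in my_list:
    loop_result.append(i + 10)

# List-comprehension version: the same logic on a single line.
comp_result = [i + 10 for i in my_list]

print(loop_result)
print(comp_result)
```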

In [None]:
#work here


Write yourself a new function that takes a single argument and does the following:
1. Squares the value and saves it to a variable
2. Cubes the value and saves it to a variable
3. Sums the results of steps 1 and 2 and returns the value.

You get a bonus point if you can put steps 1 and 2 on a single line!

Hint, 'to the power of' can be expressed as '**', so $2^2$ becomes:

    2 ** 2

In [None]:
#work here


Now, write a list comprehension using

    my_list = [0,1,2,3,4,5,6]
    
but in each iteration, call your function and print the answer!

In [None]:
#work here


# Python packages (Pandas)

Making Python even nicer to use is the wide range of publicly available packages that you can use in your code.

To import a package, simply write:

    import N
    
(where N is the name of the package). Some packages come preinstalled when you install Python or conda, others may need to be installed - but don't worry about this, you've already got everything you'll need!

Let's have a look at pandas.

In [None]:
import pandas as pd

We like to be lazy and abbreviate some package names to make our code easier to write. Pandas is typically abbreviated to 'pd'. You'll see the same with numpy ('np') and matplotlib.pyplot ('plt').

[Pandas](https://pandas.pydata.org/) is a package used for data structures and data analysis. We can use it to load in data, save data to files, and perform a variety of functions on the data.

Let's load an example CSV file by running the code below. Calling .head() shows us the first 5 rows; .head(10) will show the first 10.

    data = pd.read_csv("../example.csv")
    data.head()
    
To see the other arguments for read_csv(), you can run

    help(pd.read_csv)
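
As a sketch of passing extra arguments to read_csv, the example below reads from an in-memory string rather than a file, so it runs without example.csv. The CSV text and the name `demo` are made up; `sep` and `nrows` are two of the optional arguments listed by help(pd.read_csv).

```python
import io
import pandas as pd

# Made-up CSV text standing in for a file on disk (an assumption for this sketch).
csv_text = """Name,Score
Ann,3
Bob,5
Cat,4
"""

# read_csv accepts a file path or a file-like object.
# sep sets the delimiter; nrows limits how many rows are read.
demo = pd.read_csv(io.StringIO(csv_text), sep=",", nrows=2)
print(demo.head())
```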

In [81]:
#work here


The data has been loaded as a DataFrame. We can use .columns to find the column names, and .index to find the index labels.

In [82]:
data.columns

Index(['Name', 'Aged', 'Height'], dtype='object')

Looking at the columns, we can see that one is possibly misspelled: 'Aged' should be 'Age'. Let's rename it. Luckily, pandas has a function for this. We pass a dictionary mapping the old name to the new one and set axis=1 (for columns). With inplace=True, the DataFrame is modified directly, so we don't have to save the result as a new variable.

In [86]:
data.rename({"Aged":"Age"}, axis=1, inplace=True)

In [87]:
data.head()

          Name  Age  Height
    0   Angela    8     120
    1      Ben   44     172
    2  Chrissy   10     130
    3     Dave   22     166
    4    Edgar   14     150
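
For comparison, here is a sketch of renaming without inplace=True, using the `columns=` keyword of rename instead of axis=1. The small DataFrame is made up for illustration and is not the exercise data.

```python
import pandas as pd

# Small made-up frame with the same misspelled column name.
demo = pd.DataFrame({"Name": ["Ann", "Bob"], "Aged": [30, 25]})

# Without inplace=True, rename returns a new DataFrame
# and leaves the original unchanged.
renamed = demo.rename(columns={"Aged": "Age"})

print(list(demo.columns))     # original still has 'Aged'
print(list(renamed.columns))  # the new frame has 'Age'
```

Whether to modify in place or keep a renamed copy is a matter of preference; keeping a copy makes it easier to rerun a cell without side effects.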


Like a dictionary, we can call particular columns.

    data["Age"]

Returns all 'Age' values.

    data["Age"].iloc[0]
    
Returns the age for the positional indexer 0 (the first row).

    data["Age"].loc[2]
    
Returns the age for the row whose index label is 2.
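
With a default index (0, 1, 2, ...), .iloc and .loc often return the same row, which can hide the difference. The sketch below uses a made-up Series with a shuffled index to make it visible.

```python
import pandas as pd

# Made-up data with a non-default index order.
ages = pd.Series([8, 44, 10], index=[2, 0, 1], name="Age")

first_by_position = ages.iloc[0]  # first row by position, whatever its label
by_label = ages.loc[0]            # row whose index label is 0

print(first_by_position)  # prints: 8
print(by_label)           # prints: 44
```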


In [None]:
#work here


We can also use functions and methods such as 'len' and 'describe'.

    len(data) 

returns how many rows are in the data.

    data.describe() 
    
will return statistics for numerical column types.
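
As a sketch on made-up data (not the exercise file), the example below shows len and describe together, and how to pull a single statistic out of the describe() result with .loc.

```python
import pandas as pd

# Made-up data for illustration.
demo = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cat", "Dan"],
    "Score": [2, 4, 6, 8],
})

n_rows = len(demo)         # number of rows
summary = demo.describe()  # by default, only numeric columns are summarised

# describe() returns a DataFrame; rows are statistics, columns are the numeric columns.
mean_score = summary.loc["mean", "Score"]

print(n_rows)
print(mean_score)
```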

What is the mean for age and height for this dataset?

How many entries are there?

In [None]:
#work here


A handy tool is .value_counts(), which will return the number of entries per unique value.

    data["Age"].value_counts()
    
This returns a Series with each age as the index and the number of entries with that age as the value (that is, how many people are of age X).
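
A minimal sketch of value_counts on made-up data, including how to look up the count for one particular value. The Series `colours` is an assumption for illustration.

```python
import pandas as pd

# Made-up data for illustration.
colours = pd.Series(["red", "blue", "red", "green", "red", "blue"])

# The result is sorted with the most frequent value first.
counts = colours.value_counts()
print(counts)

# Index the result by a value to get its count.
red_count = counts["red"]
print(red_count)  # prints: 3
```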

How many people are aged 10?

In [None]:
#work here
