# STAT40800 Data Programming with Python
## Jake Mac Uilliam


# Week 1
## Introduction to Python

Python is an open-source, easy-to-learn programming language. It is often regarded as faster than R for general-purpose programming, but it is not as fast as compiled C-like languages. Python was developed with an emphasis on readability, allowing users to write concise, logical code. It supports both object-oriented and structured programming, and unlike C-like languages the memory management is automatic. Its high-level data structures allow data to be manipulated with ease.

Python has an array of packages and modules. Packages are like expansion packs which add different functionalities. We will predominantly use NumPy, SciPy, pandas and matplotlib.


#### Important differences between Python and R
1. Indentation rather than braces
2. Pass-by-reference
3. Indexing starts at 0 rather than 1
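The first difference in the list above can be illustrated with a minimal sketch. Loops are not covered in this section, and the names `total` and `value` are hypothetical; the point is only that indentation, not braces, decides which lines belong to a block.

```python
# Indentation marks which lines belong to the loop body.
total = 0
for value in [1, 2, 3]:
    total = total + value   # indented: runs on every pass through the loop
print(total)                # not indented: runs once, after the loop ends
```

Removing the indentation from the third line would raise an `IndentationError`, because the `for` statement would have no body.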

#### Fun fact
Python is named after the British comedy group [Monty Python](https://en.wikipedia.org/wiki/Monty_Python). While developing Python, Guido van Rossum was reading the scripts from [Monty Python's Flying Circus](https://en.wikipedia.org/wiki/Monty_Python%27s_Flying_Circus).

## Basic operations
We showed in the *Jupyter Tips* notebook that Python can perform basic arithmetic:

In [None]:
7+5*3

In [None]:
6/4

Other useful mathematical operations include:
* `**` : to the power of, e.g. `3**2` is $3^2=9$
* `//` : integer division, e.g. `14//3` would return $4$
* `%` : returns remainder of division, e.g. `14%3` would return $2$


In [None]:
4**5

In [None]:
16//3

In [None]:
16%3
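Integer division and the remainder operator fit together: multiplying the quotient by the divisor and adding the remainder gives back the original number. The short sketch below checks this for the values used above; the variable names are chosen only for illustration.

```python
# Values taken from the cells above; names are illustrative.
a = 16
b = 3
quotient = a // b     # integer division
remainder = a % b     # remainder of the division
print(quotient, remainder)

# Multiplying back and adding the remainder recovers the original number.
print(b * quotient + remainder == a)
```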

## Variables
In practice, we will want to store and manipulate information using variables. For example, we can assign the value of `2+3` to the variable $x$ and then print $x$. The `=` sign is the assignment operator. The code below performs the arithmetic as before, but also stores the result in $x$ for later use.

In [None]:
x = 2+3
print(x)


In Python, variables are designed to hold specific types of information. For example, after the command above is executed, the variable $x$ is an integer. There are several types of information that can be stored:

* *Integer:* An integer is a number without a fractional part, e.g. -4, 5, 0, -3.
* *Float:* A number with a fractional part, stored in floating-point form, e.g. 3.432.
* *Boolean:* Variables of this type can be either True or False.
* *String:* Any sequence of characters, e.g. "hello".

The string “5” and integer 5 are completely different entities to Python, despite their similar appearance. You’ll see the importance of this in the next section.
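A minimal sketch of the difference, using the hypothetical names `text_five` and `number_five`: the `+` operator joins strings but adds integers, and the built-in `int` function converts a string of digits to an integer.

```python
text_five = "5"      # a string
number_five = 5      # an integer

print(text_five + text_five)       # strings are joined end to end
print(number_five + number_five)   # integers are added
print(text_five == number_five)    # the two values are not equal

# int() converts a string of digits to an integer.
converted = int(text_five)
print(converted + number_five)
```

Trying `text_five + number_five` directly would raise a `TypeError`, since Python does not mix the two types automatically.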

To check a variable type use the function `type`.

In [None]:
print(type(x))

Let's try other variable types:

In [None]:
a = True
b = 5
c = 3.2
d = "Python"
print(type(a))
print(type(b))
print(type(c))
print(type(d))

#### Exercise 1 

Write a piece of code that computes `4+1` and stores it in a variable X and prints out the value of X, then computes `8-2` and stores it in Y and prints out the value of Y (4 lines).

## Indexing and slicing
A string is made up of characters, which you may want to extract individually. Square brackets are used to access individual characters of a string. **Indexing in Python starts at zero**, so if `word` is a string, `word[0]` returns the first character and `word[1]` returns the second. This will also be important when we consider collections of variables.

In [None]:
word = 'introduction'
print(word[1])
print(word[3])

Using a negative number as the index counts from the right instead:

In [None]:
print(word[-1])
print(word[-3])

The colon operator can be used to access multiple characters. This is known as *slicing*. Run the code below to see how it works:

In [None]:
print(word[5:9])
print(word[:5])
print(word[8:])
print(word[:])
print(word[:-5])
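The rule behind these results is that a slice `word[start:stop]` includes the character at `start` but excludes the one at `stop`. The sketch below checks two consequences of that rule; `piece` and `k` are hypothetical names.

```python
word = 'introduction'

# The slice includes index 5 but excludes index 9,
# so it contains 9 - 5 = 4 characters.
piece = word[5:9]
print(piece, len(piece))

# Splitting at any position and joining the halves gives back the original.
k = 5
print(word[:k] + word[k:] == word)
```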

#### Exercise 2
Write a piece of code that stores the string 'programming' as a variable and prints out the 8th letter (m), the 4th to 7th letters (gram) and the last 3 letters (ing) (4 lines of code).


## Collections of variables

Variables store a single piece of information, but more often than not, you will need to store more than one piece of information. Python comes with a number of data structures that allow us to store collections of variables.
The most commonly used built-in data structures are *lists*, *tuples* and *dictionaries*. We'll look quickly at each of these and how they work.

### Lists

Lists are probably the handiest and most flexible compound data type. They are declared with square brackets [ ].  As with strings, individual elements of a list can be selected using the syntax `a[ind]`.

Create a list:

In [None]:
numbers = [2,5,3,9,1,5,7,6,8]
print(numbers)
print(type(numbers))

Access individual elements:

In [None]:
print('The first item is', numbers[0])
print('The last item is', numbers[-1])

Lists are mutable, which means that individual elements can be altered:

In [None]:
numbers[1] = 20
print(numbers)

#### Methods for lists
Methods are operations that can be applied to an object, such as a list. The syntax for methods is `object.method(arguments)`. Below are examples of some methods that can be applied to lists. Run the code to see what each method does.

In [None]:
letters = ['c','y','a','g','e','A','k','a','r','2','t','h','f','m']

print(letters.count('a'))

letters.insert(2,'b')
print(letters)

letters.remove('g')
print(letters)

letters.sort()
print(letters)

letters_new = ['n','j','s','e']
letters.extend(letters_new)
print(letters)

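**Note:** Methods such as `insert`, `remove`, `sort` and `extend` change the list they are applied to and return nothing (`None`), so the list must be printed afterwards to see the change. Other methods, such as `count`, leave the list unchanged and return a value.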
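A minimal sketch of this behaviour, using a hypothetical list `values`: storing the result of `sort` shows that the method itself returns `None`, while the list has been reordered.

```python
values = [3, 1, 2]       # hypothetical example list

result = values.sort()   # sorts the list in place
print(result)            # the method itself returns None
print(values)            # the list has been changed

# count does not change the list; it returns a value instead.
print(values.count(2))
```

A common mistake that follows from this is writing `values = values.sort()`, which replaces the list with `None`.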

**TIP:** To see all of the methods an object has, use the `dir` function and look at the names that do not start with `__`.

In [None]:
dir(letters)

If you are unsure what a particular method does, place a question mark after the method name, i.e. `object.method?`. When you run the code the help file will appear at the bottom of your browser.

In [None]:
letters.sort?

#### Exercise 3

Amy is 1.65m, Brian is 1.7m, Conor is 1.95m, David is 1.8m and Edel is 1.75m.
1. Store these heights in a list called `heights`.
2. Append Frank's height (1.9m) to the list.
3. Print the last height in the list.

__Bonus__

Extract the last value in two different ways: first, by using the index for the last item in the list, and second, presuming that you do not know how long the list is.

__HINT:__ `len()` can be used to find the length of a collection


In [None]:
len(letters)

### Tuples

Tuples are similar to lists, but they are **immutable** (elements cannot be changed). Tuples are created using round brackets (). We still use square brackets to access individual elements.

In [None]:
people = ('Adam','Ben','Charlie')
print(people[2])

In [None]:
people[2] = 'Daniel'  # this will result in an error message
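The error raised by the cell above is a `TypeError`. The sketch below catches it so that the rest of the code can continue, and then builds a new tuple instead of altering the old one. The `try`/`except` statement is not covered in this section, and `new_people` is a hypothetical name.

```python
people = ('Adam', 'Ben', 'Charlie')

try:
    people[2] = 'Daniel'      # tuples do not support item assignment
except TypeError as err:
    print('TypeError:', err)

# A new tuple can be built instead, leaving the original untouched.
new_people = people[:2] + ('Daniel',)
print(new_people)
print(people)
```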

### Dictionaries

We use dictionaries (dicts for short) when we want to store and retrieve things by their names rather than their position. Each item is a *key:value* pair. Keys are typically strings or numbers, while values can be of any type. Dictionaries are declared using {}.

In [None]:
weight = {'Adam' : 97,'Ben' : 105,'Charlie' : 80}

print(weight['Adam'])

Dicts are mutable, so we can change individual values and also add new key:value pairs:

In [None]:
weight['Daniel'] = 86
print(weight)
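Assigning to a key that already exists overwrites its value rather than adding a new pair. The sketch below shows this, along with the `in` operator for checking whether a key is present; the new weight and the name `'Zoe'` are made up for illustration.

```python
weight = {'Adam': 97, 'Ben': 105, 'Charlie': 80}

weight['Ben'] = 100       # overwrite the value stored under an existing key
print(weight['Ben'])

print('Adam' in weight)   # check whether a key is present
print('Zoe' in weight)    # 'Zoe' is a hypothetical name that is not a key
print(len(weight))        # number of key:value pairs is unchanged
```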

#### Exercise 4

1. Take the heights from Exercise 3, and store them in a dict instead of a list. Use the names as the keys.
2. Print Conor's height.

## Pass-by-reference
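Assigning an existing object to a new name, e.g. `b = a`, does not create a copy: both names refer to the same object. If the object is mutable (such as a list), changing it through `b` will also change what we see through `a`. This is known as **pass-by-reference**.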

In [None]:
fruit = ['banana','apple','orange']
fruit_copy = fruit
print("original:", fruit)
print("copy:", fruit_copy)
fruit_copy.append('mango')
print("copy after we append 'mango':", fruit_copy)
print("original after we change the copy:", fruit)

__TIP:__ Use `copy` to create a distinct copy that will not alter the original object

In [None]:
num_list = [1,2,3,4,5,6] 
num_list_copy = num_list.copy()
print('original:', num_list)
num_list_copy.extend([7,8,9,10])
print('copy:', num_list_copy)
print('original list after we change the copy:', num_list)
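One way to check whether two names refer to the same object is the `is` operator, which is not otherwise covered in this section. The sketch below compares plain assignment with `copy`; the names `fruit_alias` and `fruit_clone` are hypothetical.

```python
fruit = ['banana', 'apple', 'orange']
fruit_alias = fruit           # a second name for the same list
fruit_clone = fruit.copy()    # a separate list with the same contents

print(fruit_alias is fruit)   # True: both names refer to one object
print(fruit_clone is fruit)   # False: the copy is a different object
print(fruit_clone == fruit)   # True: the contents are still equal
```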