# Lecture 1: Getting started with Python

Based on Software Carpentry's "Programming with Python" https://software-carpentry.org/lessons/ and Data Carpentry's "Data Analysis and Visualization in Python for Ecologists" https://datacarpentry.org/lessons/

Recommended setup: Anaconda / miniconda on Linux or Mac (Windows Subsystem for Linux if on Windows).

### Questions
- How do I program in Python?
- How can I represent my data in Python?
- How can I do the same operations on many different values?

### Objectives
- Perform mathematical operations in Python using basic operators.
- Define the following data types in Python: strings, integers, and floats.
- Define the following as they relate to Python: lists, tuples, and dictionaries.
- Correctly write for loops to repeat simple calculations.

## Operators, variables, data types

### Operators and variables

Any Python interpreter can be used as a calculator:

In [1]:
2 + 2

4

In [2]:
43 - 6

37

In [3]:
6 * 8

48

In [4]:
4 / 3

1.3333333333333333

In [5]:
12 % 5

2

In [6]:
2**3

8

This works, but it is not very interesting. To do anything useful with data, we need to assign its value to a variable, which Python does with the equals sign `=`. For example, we can track the weight of a patient who weighs 60 kilograms by assigning the value 60 to a variable `weight_kg`:

In [7]:
weight_kg = 60

We can then use the variable `weight_kg` in other operations. For example, we can convert it to pounds and store the result in a new variable:

In [8]:
weight_lbs = weight_kg * 2.2
weight_lbs

132.0

To carry out common tasks with data and variables in Python, the language provides us with several *built-in functions*. To display information to the screen, we use the print function:

In [9]:
print(weight_kg)

60


We can display multiple things at once using only one `print` call:

In [10]:
print('Weight in kilograms:', weight_kg)

Weight in kilograms: 60


Another important built-in function is `type`, which tells us what kind of value a variable contains:

In [11]:
type(weight_kg)

int

In [12]:
type(weight_lbs)

float

Note that while `weight_kg` is an integer (a number without decimals), `weight_lbs` became a float (a number with decimals) when we multiplied `weight_kg` by 2.2.
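
As a small illustration of how integers and floats interact, the sketch below repeats the conversion and adds two divisions. The names `half` and `whole` and the floor division operator `//` are not part of the lecture so far; they are only used here for illustration.

```python
# Multiplying an int by a float gives a float
weight_kg = 60
weight_lbs = weight_kg * 2.2
print(type(weight_kg), type(weight_lbs))  # <class 'int'> <class 'float'>

# True division (/) returns a float even when both operands are ints
half = weight_kg / 2
print(half, type(half))  # 30.0 <class 'float'>

# Floor division (//) of two ints rounds down and stays an int
whole = weight_kg // 2
print(whole, type(whole))  # 30 <class 'int'>
```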

We can also do arithmetic operations on the variable itself and store the result under the same name:

In [13]:
weight_kg = weight_kg + 3

A shorthand version of the same expression is:

In [14]:
weight_kg += 3

The same shorthand works with the operators `-`, `*`, and `/`:

In [15]:
a = 40
a -= 10  # a is now 30
a *= 2   # a is now 60
a /= 6   # a is now 10.0 (division with / always returns a float)

Whenever we use a variable such as `weight_kg`, Python substitutes the value currently assigned to it. In layman’s terms, a variable is a name for a value.

In Python, variable names:
- can include letters, digits, and underscores
- cannot start with a digit
- are case sensitive.

This means that, for example:
- `weight0` is a valid variable name, whereas `0weight` is not
- `weight` and `Weight` are different variables
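
One way to check these rules without causing a syntax error is the string method `str.isidentifier()`, which reports whether a piece of text follows Python's naming rules. The sketch below is only an illustration; the names `valid_name` and `invalid_name` and the value 75 are made up for this example.

```python
# str.isidentifier() returns True if the text follows Python's naming rules
valid_name = 'weight0'.isidentifier()
invalid_name = '0weight'.isidentifier()
print(valid_name)    # True
print(invalid_name)  # False

# Names are case sensitive, so these are two separate variables
weight = 60
Weight = 75
print(weight, Weight)  # 60 75
```

Note that `isidentifier()` also returns `True` for reserved words such as `for`, which still cannot be used as variable names.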

### Basic data types: Strings, integers, and floats

Python knows various types of data. Three common ones are:
- integer numbers
- floating point numbers
- strings.

In the example above, variable `weight_kg` has an integer value of 60. If we want to more precisely track the weight of our patient, we can use a floating point value by executing:

In [16]:
weight_kg = 60.3

If you reassign a variable with a value of a different data type, the variable takes on the new type. We can confirm this with the built-in function `type`:

In [17]:
type(weight_kg)

float

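Python is *dynamically typed*: a variable name is not tied to one data type, and its type is determined by the value it currently refers to. In many statically typed languages, such as C or Java, you need to declare the data type of a variable and will get an error message if you try to assign a value of a different type to it. The related term "duck typing" describes how Python judges an object by what it can do rather than by its declared type: if it looks like a duck, and quacks like a duck, it probably is a duck.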
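
The effect of reassigning a variable with values of different types can be sketched as follows. The variable name `value` and the list `seen_types` are only illustrative and do not appear elsewhere in the lecture.

```python
seen_types = []

value = 60            # an integer
seen_types.append(type(value))

value = 60.3          # the same name now refers to a float
seen_types.append(type(value))

value = 'sixty'       # and now to a string; Python raises no error
seen_types.append(type(value))

print(seen_types)  # [<class 'int'>, <class 'float'>, <class 'str'>]
```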

To do slightly more interesting math, we can import the `math` *module* from Python's standard library:

In [18]:
import math
math.pi

3.141592653589793

In [19]:
# Area of a circle with radius 2
math.pi * 2**2

12.566370614359172

In [20]:
math.sin(0)

0.0

In [21]:
math.sqrt(16)

4.0

To create a string, we add single or double quotes around some text. To identify and track a patient throughout our study, we can assign each person a unique identifier by storing it in a string:

In [22]:
patient_id = 'pid-001'

In [23]:
type(patient_id)

str

Strings can also be built from the values of variables, and there are several ways to format a string:

In [24]:
patient_weight_0 = 'The patient weighs ' + str(weight_kg) + ' kg.'
patient_weight_1 = 'The patient weighs {} kg.'.format(weight_kg)
patient_weight_2 = 'The patient weighs %s kg.' % weight_kg

print(patient_weight_0)
print(patient_weight_1)
print(patient_weight_2)

The patient weighs 60.3 kg.
The patient weighs 60.3 kg.
The patient weighs 60.3 kg.
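
A fourth option, available from Python 3.6 onwards, is the formatted string literal or "f-string", where the variable is written directly inside curly braces. The sketch below assumes `weight_kg = 60.3` as above; the names `patient_weight_3` and `patient_weight_rounded` are made up for this example.

```python
weight_kg = 60.3

# Prefix the string with f and put the variable name in curly braces
patient_weight_3 = f'The patient weighs {weight_kg} kg.'
print(patient_weight_3)  # The patient weighs 60.3 kg.

# A format specification after the colon controls the number of decimals
patient_weight_rounded = f'The patient weighs {weight_kg:.0f} kg.'
print(patient_weight_rounded)  # The patient weighs 60 kg.
```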


## Collections: Lists, tuples, and dictionaries

Lists are a common data structure for holding an ordered sequence of elements. Every element has a number associated with its position, its *index*, and indices start at 0. This means that we can access elements in a list using their indices. For example, we can get the first number in the list `numbers` by using `numbers[0]`.

In [25]:
numbers = [1, 2, 3]
numbers[0]

1

Python lets you assign several variables in a single statement:

In [26]:
a, b = 1, 2
print(a, b)

1 2


For example, we can use this feature to "unpack" a list:

In [27]:
a, b, c = numbers
print(a, b, c)

1 2 3


To add elements to the end of a list, we can use the `append` method. Methods are a way to interact with an object (a list, for example). We can invoke a method using the dot `.` followed by the method name and a list of arguments in parentheses. Let’s look at an example using `append`:

In [28]:
numbers.append(4)
print(numbers)

[1, 2, 3, 4]


To find out what methods are available for an object, we can use the built-in `help` function:

In [29]:
help(numbers)

Help on list object:

class list(object)
 |  list(iterable=(), /)
 |  
 |  Built-in mutable sequence.
 |  
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate sign

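The `pop` method does the opposite of `append`: it removes the last element from the list and returns it.

In [30]:
a = numbers.pop()
print(a)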

4
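
The following sketch runs `append` and `pop` back to back to show that both change the list in place. Calling `pop` with an index, and the names `last` and `first`, go beyond what the lecture shows and are included only as an illustration.

```python
numbers = [1, 2, 3]

# append adds one element to the end of the list
numbers.append(4)
print(numbers)  # [1, 2, 3, 4]

# pop removes the last element and returns it
last = numbers.pop()
print(last, numbers)  # 4 [1, 2, 3]

# pop also accepts an index: here the first element is removed
first = numbers.pop(0)
print(first, numbers)  # 1 [2, 3]
```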


A *tuple* is similar to a list in that it’s an ordered sequence of elements. However, tuples cannot be changed once created (they are “immutable”). Tuples are created by placing comma-separated values inside parentheses `()`.

In [31]:
# Tuples use parentheses
a_tuple = (1, 2, 3)
another_tuple = ('blue', 'green', 'red')

# Note: lists use square brackets
a_list = [1, 2, 3]

The big difference between lists and tuples is that tuples are *immutable*, i.e. you can't reassign their elements:

In [32]:
a_tuple[0] = 3

TypeError: 'tuple' object does not support item assignment
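
If a changed version of a tuple is needed, one possible approach is to build a new tuple instead of modifying the old one. The sketch below is only an illustration: `try`/`except` is not covered in this lecture and is used here so that the example runs to the end, and the names `message`, `as_list`, and `new_tuple` are made up.

```python
a_tuple = (1, 2, 3)
message = ''

# Assigning to an element of a tuple raises a TypeError
try:
    a_tuple[0] = 3
except TypeError as error:
    message = str(error)
    print('TypeError:', message)

# Convert to a list, change the list, and convert back to a new tuple
as_list = list(a_tuple)
as_list[0] = 3
new_tuple = tuple(as_list)

print(a_tuple)    # (1, 2, 3), the original is unchanged
print(new_tuple)  # (3, 2, 3)
```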

Sometimes we need to access only part of a list or tuple. This is called slicing and is done by specifying the starting index (inclusive) and the final index (exclusive), separated by a colon. The index `-1` always refers to the last item in a tuple or list.

In [None]:
l = [1, 2, 3, 4, 5]
t = (1, 2, 3, 4, 5)

a = l[1:3]
b = t[2:-1]

print(a, b)
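
To make the inclusive start and exclusive end concrete, the sketch below repeats the two slices with their results and adds a few variations. Leaving out the start or the end of a slice is not described in the text above; it is shown here as an additional illustration, and the names `first_two`, `from_third`, and `last_item` are made up.

```python
l = [1, 2, 3, 4, 5]
t = (1, 2, 3, 4, 5)

a = l[1:3]   # elements at indices 1 and 2
b = t[2:-1]  # from index 2 up to, but not including, the last item
print(a, b)  # [2, 3] (3, 4)

# Omitting the start or the end means "from the beginning" or "to the end"
first_two = l[:2]
from_third = l[2:]
last_item = l[-1]
print(first_two, from_third, last_item)  # [1, 2] [3, 4, 5] 5
```

Slicing a list gives a list and slicing a tuple gives a tuple, while a single index such as `l[-1]` gives the element itself.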

A *dictionary* is a container that holds pairs of keys and values.

In [None]:
translation = {'one': 1, 'two': 2}
translation['one']

In [None]:
cards = {"J": 11, "Q": 12, "K": 13, "A": 14}

Dictionaries work a lot like lists, except that you index them with keys instead of positions. You can think of a key as a name or unique identifier for the value it corresponds to.

In [None]:
rev = {1: 'one', 2: 'two'}
rev[1]

To add an item to the dictionary we assign a value to a new key:

In [None]:
rev = {1: 'one', 2: 'two'}
rev[3] = 'three'
rev
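
A few other common dictionary operations can be sketched in the same way. The `in` operator, the `get` method, and the `del` statement are not introduced in the lecture text and are shown here only as an illustration; the value `'drei'` and the names `has_two`, `has_four`, and `fallback` are made up.

```python
rev = {1: 'one', 2: 'two'}

# Assigning to a new key adds an item; assigning to an existing key overwrites it
rev[3] = 'three'
rev[3] = 'drei'
print(rev)  # {1: 'one', 2: 'two', 3: 'drei'}

# The in operator checks whether a key is present
has_two = 2 in rev
has_four = 4 in rev
print(has_two, has_four)  # True False

# get returns a default value instead of raising an error for a missing key
fallback = rev.get(4, 'unknown')
print(fallback)  # unknown

# del removes a key together with its value
del rev[3]
print(rev)  # {1: 'one', 2: 'two'}
```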

## Booleans and conditionals

There is one more important data type that we haven't examined yet: booleans. A boolean takes one of two values, `True` or `False`, and is used to express logical truth values.

In [None]:
a = True
b = False

Using comparison operators, we can create boolean expressions that give the truth value of any logical condition we might want to check:

- `>`: greater than
- `<`: less than
- `==`: equal to
- `!=`: not equal to
- `>=`: greater than or equal to
- `<=`: less than or equal to

In [None]:
print(20 > 10)
print(20 < 10)
print(4 == 4)
print(11 != 12)
print(11 >= 11)
print(11 <= 3)

Based on boolean expressions, we can ask Python to take different actions depending on a condition, using an `if` statement with an optional `else`.

If the condition of an `if` statement is met, the code block underneath the statement is executed. The code block belonging to an `else` is executed if and only if none of the previous conditions are met.

In [None]:
num = 37
if num > 100:
    print('greater')
else:
    print('not greater')
print('done')

The second line of this code uses the keyword `if` to tell Python that we want to make a choice. If the test that follows the if statement is true, the body of the `if` (i.e., the set of lines indented underneath it) is executed, and “greater” is printed. If the test is false, the body of the `else` is executed instead, and “not greater” is printed. Only one or the other is ever executed before continuing on with program execution to print “done”.

Conditional statements don’t have to include an `else`. If there isn’t one, Python simply does nothing if the test is false:

In [None]:
num = 53
print('before conditional...')
if num > 100:
    print(num, 'is greater than 100')
print('...after conditional')

We can also chain several tests together using `elif`, which is short for “else if”. The following Python code uses `elif` to print the sign of a number.

In [None]:
num = -3

if num > 0:
    print(num, 'is positive')
elif num == 0:
    print(num, 'is zero')
else:
    print(num, 'is negative')

We can also combine tests using `and` and `or`. An `and` expression is true if and only if both parts are true:

In [None]:
if (1 > 0) and (-1 >= 0):
    print('both parts are true')
else:
    print('at least one part is false')

An `or` expression, in contrast, is true if at least one part is true:

In [None]:
if (1 < 0) or (1 >= 0):
    print('at least one test is true')

Sometimes it is useful to check whether some condition is not true. The boolean operator `not` does this explicitly:

In [None]:
if not (1 < 0):
    print('1 is not smaller than 0')
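
Because the result of a boolean expression is itself a value, it can be stored in a variable and combined further. The sketch below is a hypothetical example: the range from 0 to 100 and the names `in_range` and `out_of_range` are made up for illustration.

```python
num = 37

# and: both comparisons must be true
in_range = (num >= 0) and (num <= 100)

# or: at least one comparison must be true
out_of_range = (num < 0) or (num > 100)

print(in_range)          # True
print(out_of_range)      # False
print(not out_of_range)  # True

# Stored booleans can be used directly in an if statement
if in_range and not (num == 50):
    print(num, 'is between 0 and 100, but it is not 50')
```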

## Loops and Iterables

Especially when working with lists, we often have a set of tasks that we would like to repeat for each element of the list. One way to do this is to access every element by its index:

In [None]:
l = [1, 2, 3, 4]
print(l[0])
print(l[1])
print(l[2])
print(l[3])

This is a bad approach for three reasons:

1. Not scalable. Imagine you need to print a list that has hundreds of elements. Writing one `print` line per element would take about as much effort as typing out the elements by hand.
2. Difficult to maintain. If we want to decorate each printed element with an asterisk or any other character, we would have to change four lines of code. While this might not be a problem for small lists, it would definitely be a problem for longer ones.
3. Fragile. If we use it with a list that has more elements than what we initially envisioned, it will only display part of the list’s elements. A shorter list, on the other hand, will cause an error because it will be trying to display elements of the list that do not exist.

In [None]:
l = [1, 2, 3]
print(l[0])
print(l[1])
print(l[2])
print(l[3])
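
Running this cell stops with an `IndexError` at the last line, because a list with three elements has no index 3. The sketch below catches the error so that its message can be inspected; `try`/`except` is not covered in this lecture, and the name `message` is made up for illustration.

```python
l = [1, 2, 3]
message = ''

# Valid indices for a list of three elements are 0, 1, and 2
try:
    print(l[3])
except IndexError as error:
    message = str(error)
    print('IndexError:', message)  # IndexError: list index out of range
```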

`for`-loops solve this problem for us:

In [None]:
l = [1, 2, 3, 4]
for item in l:
    print(item)

This is shorter — certainly shorter than something that prints every number in a hundred-number list — and more robust as well:

In [None]:
l = [1, 2, 3, 4, 5, 6, 7, 8]
for item in l:
    print(item)

Because we can iterate over a list using a for-loop, it is called an iterable object. Tuples and dictionaries are also iterables:

In [None]:
t = (4, 5, 6, 7, 8, 9)
for item in t:
    print(item)

In [None]:
d = {1: 'one', 2: 'two', 3: 'three', 4: 'four'}
for key, value in d.items():
    print("{}: {}".format(key, value))

The general form of a loop is:

```
for variable in iterable:
    # do things using variable, such as print
```

Applied to the list `l` from the earlier example, the loop works like this:

Each number in the variable `l` is accessed one at a time, stored in the variable `item`, and printed. We can call the loop variable anything we like, but there must be a colon at the end of the line starting the loop, and we must indent anything we want to run inside the loop. It is a good idea to choose variable names that are meaningful, otherwise it would be more difficult to understand what the loop is doing.

In [None]:
l = [1, 2, 3, 4]
for x in l:
    print(x)

Note that a loop variable is a variable that is being used to record progress in a loop. It still exists after the loop is over, and we can re-use variables previously defined as loop variables as well:

In [None]:
print(item, x)

Here’s another loop that repeatedly updates a variable:

In [None]:
length = 0
names = ['Curie-Sklodowska', 'Darwin', 'Turing']
for value in names:
    length = length + 1
print('There are', length, 'names in the list.')

It’s worth tracing the execution of this little program step by step. Since there are three names in `names`, the statement on line 4 will be executed three times. The first time around, `length` is zero (the value assigned to it on line 1) and `value` is `'Curie-Sklodowska'`. The statement adds 1 to the old value of `length`, producing 1, and updates `length` to refer to that new value. The next time around, `value` is `'Darwin'` and `length` is 1, so `length` is updated to be 2. After one more update, `length` is 3; since there is nothing left in `names` for Python to process, the loop finishes and the `print` function on line 5 tells us our final answer.
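
One way to follow this trace on screen is to add a `print` call inside the loop body. This is only a sketch for inspection; the extra `print` line is not part of the original program.

```python
length = 0
names = ['Curie-Sklodowska', 'Darwin', 'Turing']
for value in names:
    length = length + 1
    # Show the loop variable and the counter after each update
    print('value:', value, '| length is now', length)
print('There are', length, 'names in the list.')
```

The indented `print` runs once per name, while the final `print` runs only once, after the loop has finished.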

In [None]:
name = 'Rosalind'
for name in names:
    print(name)
print('after the loop, name is', name)

Finding the length of an iterable object is such a common task that Python has a built-in function for it, `len`:

In [None]:
len(names)

Python also has a built-in function called `range` that generates a sequence of numbers. `range` can accept 1, 2, or 3 parameters.

- If one parameter is given, `range` generates a sequence of that length, starting at zero and incrementing by 1. For example, `range(3)` produces the numbers `0, 1, 2`.
- If two parameters are given, `range` starts at the first and ends just before the second, incrementing by one. For example, `range(2, 5)` produces `2, 3, 4`.
- If `range` is given 3 parameters, it starts at the first one, ends just before the second one, and increments by the third one. For example, `range(3, 10, 2)` produces `3, 5, 7, 9`.

In [None]:
range(3)

In [None]:
for i in range(10):
    print(i)
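
A `range` does not display its numbers on its own; in Python 3, `range(3)` is shown as `range(0, 3)`. To see the numbers without writing a loop, one option is to convert the range to a list with the built-in `list` function, as sketched below for the three examples above. The names `one_arg`, `two_args`, and `three_args` are made up for illustration.

```python
# A range object only stores start, stop, and step
print(range(3))  # range(0, 3)

# Converting to a list shows the numbers it generates
one_arg = list(range(3))
two_args = list(range(2, 5))
three_args = list(range(3, 10, 2))

print(one_arg)     # [0, 1, 2]
print(two_args)    # [2, 3, 4]
print(three_args)  # [3, 5, 7, 9]
```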

**Iterables** Any Python object that can be looped over with a for-loop is called an *iterable*. Lists, tuples, and dictionaries are iterables, and so is the result of the `range` function.

In [None]:
t = (1, 2, 3, 4, 5)
for item in t:
    print(item)

In [None]:
d = {1: 'one', 2: 'two', 3: 'three', 4: 'four'}
for key, value in d.items():
    print("{}: {}".format(key, value))
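
Besides `items()`, a dictionary can be looped over directly, which yields its keys, or through `values()`, which yields its values. The sketch below collects both into lists; it assumes Python 3.7 or later, where dictionaries keep their insertion order, and the names `keys` and `values` are made up for illustration.

```python
d = {1: 'one', 2: 'two', 3: 'three', 4: 'four'}

# Looping over the dictionary itself yields the keys
keys = []
for key in d:
    keys.append(key)
print(keys)  # [1, 2, 3, 4]

# Looping over d.values() yields the values
values = []
for value in d.values():
    values.append(value)
print(values)  # ['one', 'two', 'three', 'four']
```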

If you want to store the results of a loop in a list, there's a shorthand version of the for-loop called a *list comprehension*. For example, a list comprehension that stores the squares of a range of numbers looks like this:

In [None]:
x = [i**2 for i in range(5)]
print(x)

This also works with dictionaries. A dictionary comprehension that stores both the value and its square looks like this:

In [None]:
x = {i: i**2 for i in range(5)}
print(x)

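A comprehension can also contain a condition, so that only some of the values are kept. Here, only the even numbers are stored:

In [None]: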
x = {i: i**2 for i in range(5) if i % 2 == 0}
print(x)
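
To see what a comprehension stands for, it can help to write out the equivalent for-loop next to it. The sketch below does this for the list of squares and for the filtered dictionary; the names `squares_loop`, `squares_comp`, `even_loop`, and `even_comp` are made up for illustration.

```python
# List of squares, written as an explicit loop
squares_loop = []
for i in range(5):
    squares_loop.append(i**2)

# The same list as a comprehension
squares_comp = [i**2 for i in range(5)]
print(squares_loop, squares_comp)  # [0, 1, 4, 9, 16] [0, 1, 4, 9, 16]

# Filtered dictionary, written as an explicit loop with an if statement
even_loop = {}
for i in range(5):
    if i % 2 == 0:
        even_loop[i] = i**2

# The same dictionary as a comprehension
even_comp = {i: i**2 for i in range(5) if i % 2 == 0}
print(even_loop, even_comp)  # {0: 0, 2: 4, 4: 16} {0: 0, 2: 4, 4: 16}
```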