# Lab 01 &mdash; Introduction to Python and NumPy
(some content adapted from [w3schools tutorial](https://www.w3schools.com/python), [Python language reference](https://docs.python.org/3/reference/index.html) and [Python standard library](https://docs.python.org/3/library/index.html))

## Roadmap
* [Syntax](#Syntax)
* [Built-in types and operators](#Built-in-types-and-operators)
* [Control flow statements](#Control-flow-statements)
* [Functions](#Functions)
* [List comprehension](#List-comprehension)
* [Deleting elements](#Deleting-elements)
* [Modules](#Modules)
* [Classes and objects](#Classes-and-objects)
* [How to get help](#How-to-get-help)
* [numpy arrays](#numpy-arrays)


## Syntax

### Indentation

* While other programming languages use `{` `}` to define scopes and blocks of code, Python uses _indentation_
* So indentation is not just for readability and aesthetics: wrong indentation generates errors
* __It's important to be consistent!__
    * never mix tabs and spaces for indentation
    * use the same number of spaces for every indentation level
    * _Recommended style:_ use 4 spaces per indentation level
    
See also [Python's official style guide](https://www.python.org/dev/peps/pep-0008)

In [1]:
if 5 > 2:
    print("Five is greater than two!")

Five is greater than two!


In [2]:
if 5 > 2:
print("Five is greater than two!")

IndentationError: expected an indented block (<ipython-input-2-a314491c53bb>, line 2)

### Comments
* Comments start with a `#`. Python will ignore everything on that line after the `#`
* Python doesn't have a syntax for multi-line comments. You can use multiple `#`

In [3]:
# This is a comment
# written in
# more than just one line

### Line structure
* Python just needs a newline character to terminate a command/statement, unlike most languages that use `;`
* Long lines can be split into multiple lines _explicitly_, with a `\`
* Lines can also be split _implicitly_, with just a newline, inside expressions enclosed in parentheses `()`, square brackets `[]`, or curly braces `{}`

In [4]:
year = 2019
month = 10
day = 20
hour = 12
minute = 21
if 1900 < year < 2100 and 1 <= month <= 12 \
    and 1 <= day <= 31 and 0 <= hour < 24 \
    and 0 <= minute < 60:   
        print("Looks like a valid date")

Looks like a valid date


In [5]:
year = 2019
month = 10
day = 20
hour = 12
minute = 21
if 1900 < year < 2100 and 1 <= month <= 12 
    and 1 <= day <= 31 and 0 <= hour < 24 
    and 0 <= minute < 60:   
        print("Looks like a valid date")

SyntaxError: invalid syntax (<ipython-input-5-993a02bc3ab5>, line 6)

In [6]:
# list expressions use brackets -- we don't need explicit line continuation
month_names = ['Januari', 'Februari', 'Maart',      # These are the
               'April',   'Mei',      'Juni',       # Dutch names
               'Juli',    'Augustus', 'September',  # for the months
               'Oktober', 'November', 'December']   # of the year
print(month_names)

['Januari', 'Februari', 'Maart', 'April', 'Mei', 'Juni', 'Juli', 'Augustus', 'September', 'Oktober', 'November', 'December']
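
Implicit continuation is not limited to lists. As a minimal sketch, the date check from the explicit-continuation example can be rewritten without `\` by wrapping the condition in parentheses (the variable `is_valid` is introduced here only for illustration):

```python
year, month, day, hour, minute = 2019, 10, 20, 12, 21

# the parentheses let the condition span several lines without any backslash
is_valid = (1900 < year < 2100 and 1 <= month <= 12
            and 1 <= day <= 31 and 0 <= hour < 24
            and 0 <= minute < 60)

if is_valid:
    print("Looks like a valid date")
```

This form is usually preferred over `\`, since a stray space after a backslash is enough to break the continuation.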


### Variables
* In Python, you don't have commands for declaring a variable (e.g. `int num;`). A variable is created as soon as you assign a value to it.
* No type declaration is required either (e.g. `num = 5` instead of `int num = 5;`)
* Variable naming follows the usual rules (start with letter or `_`)
* You can use both `''` and `""` for string expressions (and triple quotes for strings spanning multiple lines)

In [7]:
x = 5 # type int
y = "Johnny" # type string
z = 'Mary' # type string
print(x)
print(y+' and '+z)

5
Johnny and Mary


Note that, in a notebook, the value of the last expression in a cell is displayed automatically, even without `print()`

In [8]:
# Multiple variables can be assigned in a single line as follows
x1 = x2 = 5.0 # x1 and x2 are both assigned 5.0
y, z = 'Johnny', 'Mary'

print(x1==x2) # print whether x1 and x2 are equal
print(y+' and '+z)

True
Johnny and Mary


In [9]:
lipsum = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod 
tempor incididunt ut labore et dolore magna aliqua. Dolor sed viverra 
ipsum nunc aliquet bibendum enim. In massa tempor nec feugiat. Nunc aliquet 
bibendum enim facilisis gravida. Nisl nunc mi ipsum faucibus vitae aliquet 
...
"""

## Built-in types and operators

Python has the following built-in types (see also https://docs.python.org/3/library/stdtypes.html for docs on built-in types and https://docs.python.org/3/library/functions.html for built-in functions):
* Null values: the type is called `NoneType`, whose sole value is `None`. `None` denotes absence of value
* Boolean: `bool`
    * Constants: `True`, `False`
    * Logical operators: `or`, `and`, `not`
    * Comparison operators: `<`, `<=`, `>=`, `>`, `==`, `!=`
* Numeric types: `int`, `float`, `complex`
    * Constants and operators as usual, just note 
        * `/` is for (float) division and `//` for floor (integer) division
        * `x ** y` is `x` to the power `y` (or, `pow(x,y)`)
        * syntax of complex constants: `3+5j` (`3` real part, `5` imaginary part)
* String: `str`
    * Main operators: `+` (concatenation), `[i]` (character at index `i`, counting from 0), `[i:j]` (substring from index `i` up to, but excluding, index `j`), `len`, `replace`, `format`
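
The numeric operators listed above can be tried out with a few one-liners. This is only a small sketch; the specific numbers are arbitrary:

```python
print(7 / 2)        # true division always returns a float: 3.5
print(7 // 2)       # floor division: 3
print(-7 // 2)      # floor division rounds towards minus infinity: -4
print(2 ** 10)      # power: 1024
print(pow(2, 10))   # same as 2 ** 10

c = 3 + 5j
print(c.real, c.imag)  # real and imaginary parts are stored as floats: 3.0 5.0
```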

### Casting

* Casting, i.e., type conversion, is done using the constructor functions `int()`, `float()`, `str()`

In [10]:
c = 3+5j # a complex number
# let's inspect the types of the following expressions
print(type(True))
print(type('False'))
print(type(x))
print(type(x1))
print(type(c))
print(type(None))

<class 'bool'>
<class 'str'>
<class 'int'>
<class 'float'>
<class 'complex'>
<class 'NoneType'>


In [11]:
# examples of casting
x = int("1")
y = int(1.8)
w = float(x)
z = str(w)
print(x)
print(y)
print(w)
print(z)

1
1
1.0
1.0
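
Not every conversion succeeds. The sketch below shows two details of `int()` that are easy to trip over: it truncates floats towards zero, and it refuses strings that contain a decimal point.

```python
print(int(float("1.8")))  # convert to float first, then truncate: 1

try:
    int("1.8")            # int() does not parse strings with a decimal point
except ValueError as err:
    print("ValueError:", err)

print(int(-1.8))          # truncation is towards zero: -1
print(round(1.8))         # round() gives the nearest integer instead: 2
```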


In [12]:
# some operations on strings
a = "Hello, World!"
print(a[1])   # 2nd character
print(a[1:6]) # substring from the 2nd to the 6th character (index 6 is excluded)
print(len(a)) # string length
print(a.replace("World", "Class"))   # replace all occurrences of a string
print("My age is " + str(18))        # concatenation
b = "My age is {}, I was born in {}" # {} is a placeholder for the format method
print(b.format(25,2020-25))

e
ello,
13
Hello, Class!
My age is 18
My age is 25, I was born in 1995


### Collection types

* Sequence types, i.e., ordered collections that allow duplicates
    * `list` (mutable), `tuple` (immutable)
    * Operators similar to `str` (which is itself an immutable sequence of characters)
* Sets: `set`, unordered, doesn't allow duplicates. 
    * Supports the main set operations (e.g., `union()`, `intersection()`, `difference()`, etc.)
* Dictionaries: `dict`, collections of `key : value` pairs indexed by key; keys cannot be duplicated. Since Python 3.7, dictionaries preserve insertion order.

**Note:** Python collections need not be type-consistent, i.e., you can mix elements of different types

In [13]:
# list elements are within []
a_list = ["apple", "banana", "cherry", 4, 5, True]
first_el = a_list[0]
a_sublist1 = a_list[1:3] # slice of a_list from index 1 to index 3 (excluded)
a_sublist2 = a_list[::2] # slice of a_list from start to end with step 2
print(first_el)
print(a_sublist1)
print(a_sublist2)

apple
['banana', 'cherry']
['apple', 'cherry', 5]


In [14]:
a_list[1] = 0      # put a 0 in 2nd position
a_list.insert(1,0) # insert at 2nd position (without replacing)

# check if "banana" is still in a_list
if "banana" in a_list:
    print("Yes, 'banana' is in the list") 
else:
    print("No, 'banana' is not in the list") 

No, 'banana' is not in the list


In [15]:
# remove element 'cherry' from list
a_list.remove('cherry') 
# but then, append it at the end
# equivalent expressions:
#- a_list.append("cherry")
#- a_list+=["cherry"]
a_list=a_list+["cherry"]

# now add a_list to itself twice using * operator
a_list*=2

# loop through the list and print each element
for x in a_list:
    print(x)

apple
0
0
4
5
True
cherry
apple
0
0
4
5
True
cherry


In [16]:
# tuple elements are within ()
e_1 = ('Clapham J', 'Richmond')
e_2 = (e_1[1],'Twickenham')
e_3 = (e_2[1],'Egham')
# set elements are within {}
stations_graph = {e_1,e_2,e_2} # try to put a repeated element 
print(stations_graph)

{('Clapham J', 'Richmond'), ('Richmond', 'Twickenham')}


In [17]:
# add element to stations_graph set
stations_graph.add(e_3)
print(stations_graph)
# check if now {e_1,e_3} is a subset
print({e_1,e_3}.issubset(stations_graph))

{('Twickenham', 'Egham'), ('Clapham J', 'Richmond'), ('Richmond', 'Twickenham')}
True
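
The set operations mentioned earlier (`union()`, `intersection()`, `difference()`) can be sketched as follows. The two sets are made-up examples, and `sorted()` is used only so that the printed order is predictable:

```python
set_a = {"Richmond", "Twickenham", "Egham"}   # hypothetical example sets
set_b = {"Twickenham", "Egham", "Staines"}

print(sorted(set_a.union(set_b)))         # elements in either set (same as set_a | set_b)
print(sorted(set_a.intersection(set_b)))  # elements in both sets (same as set_a & set_b)
print(sorted(set_a.difference(set_b)))    # elements only in set_a (same as set_a - set_b)
```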


In [18]:
# dictionary elements are of the form key : value 
#     and are within {}
my_journey = {
    "start_station" : 'Clapham J',
    "connections" : stations_graph
}
print(my_journey)
# access dictionary item by referring to key
print(my_journey["start_station"])


{'start_station': 'Clapham J', 'connections': {('Twickenham', 'Egham'), ('Clapham J', 'Richmond'), ('Richmond', 'Twickenham')}}
Clapham J
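
A few more common dictionary operations, as a minimal sketch. The dictionary `journey` and its keys are hypothetical and separate from `my_journey` above:

```python
journey = {"start_station": "Clapham J", "end_station": "Egham"}  # hypothetical data

journey["changes"] = 2                 # assigning to a new key adds an item
journey["end_station"] = "Richmond"    # assigning to an existing key overwrites its value

print(journey.get("ticket_price"))     # get() returns None for a missing key
print("changes" in journey)            # `in` tests the keys: True

# items() yields (key, value) pairs
for key, value in journey.items():
    print(key, "->", value)
```

Accessing a missing key with square brackets (e.g. `journey["ticket_price"]`) raises a `KeyError`, which is why `get()` is handy when a key may be absent.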


In [19]:
# we can use constructors to cast one collection type into another
repeated_list = [1,2,2,3,3,3,4,4,4,4]
non_repeated_set = set(repeated_list)
print(non_repeated_set)
print(list(non_repeated_set))

{1, 2, 3, 4}
[1, 2, 3, 4]


### The `in` keyword
It has two uses:
* it is used to test if an element is present in a collection
* when used in a `for` loop, it lets you iterate over the elements of a collection

In [20]:
# prints whether or not element 1 is in 'repeated_list'
print(1 in repeated_list)
# loops over the elements of `repeated_list`
for x in repeated_list:
    print(x)

True
1
2
2
3
3
3
4
4
4
4


## Control flow statements

* `if` and `while` work as in other programming languages (i.e., branch on a condition, and loop as long as a condition is true)
* The `for` loop is slightly different: it doesn't use a Boolean condition, but iterates over the elements of a collection. More specifically, it uses an _iterator_ object
    * a useful function is `range(n)`, which produces the sequence `0, 1, ..., n-1` and thus can be used in place of a counter variable in the `for` loop.
    * Variants are: `range(m,n)`, which produces `m, m+1, ..., n-1`, and `range(m,n,k)`, which produces `m, m+k, m+2k, ...` (stopping before `n`)

Let's look at their syntax with some concrete examples.

In [21]:
# if-elif-else example
a = 200
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")
    
# equivalent, compact one-line syntax
print("A") if a > b else print("=") if a == b else print("B")

a is greater than b
A


In [22]:
# while-continue-break example
i = 0
while i < 6:
    i += 1
    # continue to next iteration if i is 2
    if i == 2:
        continue
    print(i)
    # exit the loop if i is 4
    if i == 4:
        break

1
3
4


In [23]:
# for loop equivalent to above while loop
for i in range(1,7):
    # continue to next iteration if i is 2
    if i == 2:
        continue
    print(i)
    # exit the loop if i is 4
    if i == 4:
        break


1
3
4
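
To see what the `range` variants produce, here is a short sketch. `range` returns a lazy range object rather than a list, so it is wrapped in `list()` purely for display:

```python
print(list(range(5)))         # [0, 1, 2, 3, 4]
print(list(range(2, 6)))      # [2, 3, 4, 5]
print(list(range(1, 10, 3)))  # [1, 4, 7]
print(list(range(5, 0, -1)))  # a negative step counts down: [5, 4, 3, 2, 1]

# typical use as a loop counter
total = 0
for i in range(1, 4):
    total += i
print(total)  # 1 + 2 + 3 = 6
```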


## Functions

* A function is a block of code with a _name_, that can take _parameters_, and can return some _values_ (or none)
* We can set default values for parameters, so that if we call the function without the corresponding argument, the default value is used
    * Just beware of default mutable arguments (e.g., lists) &mdash; see https://docs.python-guide.org/writing/gotchas/ for a tricky example
* Python also supports _lambda functions_, i.e., anonymous functions that can be used as any other Python expression
    * a lambda function is an expression (not a statement) and thus, for instance, can be passed as an argument or returned, or be an element of a list
    * the body of a lambda function is a single expression, the syntax is
    ```python
    lambda arguments : expression
    ```
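    * a regular function defined with `def` instead has a block of statements as its body; `def` is itself a statement, so it cannot appear inside an expression (the resulting function object, however, can still be passed around by name)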


In [24]:
# define a function with two arguments, one of which has a default
# introduction is the function name, fname and country are the parameters
# UK is the default value for parameter country
def introduction(fname, country = "UK"):
    return "Hi, my name is " + fname + " and I'm from " + country

print(introduction("Nicola", "Italy"))
print(introduction("Michael"))    

Hi, my name is Nicola and I'm from Italy
Hi, my name is Michael and I'm from UK
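
The warning about mutable default arguments can be illustrated with a small sketch. The function names `append_bad` and `append_good` are made up for this example. A default value is evaluated only once, when the function is defined, so a default list is shared between calls:

```python
def append_bad(item, bucket=[]):
    # the same list object is reused on every call that omits `bucket`
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # use None as a sentinel and create a fresh list inside the function
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(append_bad(1))   # [1]
print(append_bad(2))   # [1, 2]  <- the default list remembered the first call
print(append_good(1))  # [1]
print(append_good(2))  # [2]
```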


In [25]:
# equivalent lambda encoding of introduction function
intro_lambda = lambda fname, country="UK" : "Hi, my name is " + fname + \
    " and I'm from " + country
print(intro_lambda("Michael"))

Hi, my name is Michael and I'm from UK


In [26]:
# another example to see the power of lambda functions
# they are especially useful for functional-like programming (using e.g. map, filter, reduce)
num_list = [0,1,2,3,4]
# map(fun,it) applies function fun to every item of iterable object it
# we define a function using lambda that returns the square of a number
squared_list = map(lambda x: x**2, num_list)
# map returns an iterator object. we need to cast it to a list
print(list(squared_list))

[0, 1, 4, 9, 16]
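
The other two functional helpers mentioned in the comment above, `filter` and `reduce`, follow the same pattern as `map`. A minimal sketch (note that `reduce` has to be imported from the standard `functools` module):

```python
from functools import reduce  # reduce lives in the functools module

num_list = [0, 1, 2, 3, 4]

# filter(fun, it) keeps only the items for which fun returns True
even_list = list(filter(lambda x: x % 2 == 0, num_list))
print(even_list)  # [0, 2, 4]

# reduce(fun, it, init) combines the items one by one, left to right
total = reduce(lambda acc, x: acc + x, num_list, 0)
print(total)  # 10
```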


## List comprehension

* _List comprehensions_ provide a concise way to create lists. 
* They are especially useful when you have to make new lists where each element is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of those elements that satisfy a certain condition.
* the syntax is 
```python
[expression(var) for var in iterable]
```
and the result is a list resulting from evaluating `expression` for every item of `iterable`
or 
```python
[expression(var) for var in iterable if condition(var)] 
```
which is same as above, but retains only elements such that `condition` is satisfied
* `expression` can be any expression, including another list comprehension $\to$ list comprehensions can be nested
* __Let's see some examples!__

In [27]:
# the code below achieves the same results as the map function above
#     by using list comprehension. The result is more compact and readable
squared_list = [x**2 for x in num_list] 
print(squared_list) # this time we don't need to cast to list

# now include only the squares of the even numbers
even_squared_list = [x**2 for x in num_list if x%2==0] 
print(even_squared_list)

[0, 1, 4, 9, 16]
[0, 4, 16]


In [28]:
# You can use multiple for expressions, as if they were nested loops
#     in this example, we generate all possible pairs (a,b), 
#     where a and b are in different lists and such that a and b are distinct
combs = [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
# which is a very compact way to express the following code
# combs = []
# for x in [1,2,3]:
#     for y in [3,1,4]:
#         if x != y:
#             combs.append((x, y))
print(combs)

[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]


In [29]:
# example of nested list comprehensions
# this example generates an identity matrix of size n (as a multi-dimensional list)
n = 5
I = [ [(1 if i==j else 0) for i in range(n)] for j in range(n)]
print(I)

[[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 0], [0, 0, 0, 0, 1]]
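
Another use of nested list comprehensions, sketched here under the assumption that the matrix is stored as a list of equally long rows, is transposition:

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

# the outer comprehension runs over column indices, the inner one over rows
transposed = [[row[c] for row in matrix] for c in range(len(matrix[0]))]
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```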


## Deleting elements

* In Python, we can remove elements (e.g., variables, functions, list items, etc.) using the syntax
```python
del target
```
where `target` is what you want to delete

In [30]:
soon_to_be_deleted = "it doesn't really matter"
del soon_to_be_deleted
print(soon_to_be_deleted)

NameError: name 'soon_to_be_deleted' is not defined

In [31]:
# we can delete elements of collections as well
a = [-1, 1, 66.25, 333, 333, 1234.5]
# deletes first element
del a[0]
print(a)
# note the difference with the remove() method, which
#     removes elements by value and not by position

# deletes 3rd and 4th elements
del a[2:4]
print(a)
# deletes all elements
del a[:]
print(a)

[1, 66.25, 333, 333, 1234.5]
[1, 66.25, 1234.5]
[]
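
`del` also works on dictionary items, where the target is selected by key. A minimal sketch with made-up data:

```python
ages = {"Johnny": 25, "Mary": 30}  # hypothetical data

del ages["Johnny"]   # removes the item with key "Johnny"
print(ages)          # {'Mary': 30}

try:
    del ages["Johnny"]   # the key is gone, so deleting it again fails
except KeyError as err:
    print("KeyError:", err)
```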


## Modules

* Modules are code libraries that provide further functionalities beyond those built into Python
* The Python standard library contains many useful modules
* We are going to see additional libraries in this course
* To use a library, we need to add an `import` statement

In [32]:
# math is a Python module that provides access to the mathematical functions defined by the C standard
# see also https://docs.python.org/3/library/math.html

# basic import statement 
import math
# use syntax module_name.identifier to use 'identifier' defined in 'module_name'
print(math.pi) # print pi 
print(math.sin(math.pi/2)) # print sin(pi/2)

3.141592653589793
1.0


For convenience, modules are often renamed into shorter aliases using the syntax
```python 
import module_name as alias
```
Popular libraries have standard aliases/abbreviations (e.g., `numpy` $\to$ `np`)

In [33]:
# import numpy using alias np
import numpy as np
# call function array defined in numpy
a = np.array([[1,0],[0,1]])
print(a)

[[1 0]
 [0 1]]


We might not need everything inside a module, but only a few functions and constants. In such cases, we can use a different syntax, as in the example below

In [44]:
# we import only pi and sin from math module
from math import pi, sin
# we don't need to use the module name any longer
print(pi) 
print(sin(pi/2))
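# note: math.cos still works here only because 'import math' was run in an earlier cell;
# cos was not imported by name, so a bare cos(pi/2) would raise a NameError
print(math.cos(pi/2))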

3.141592653589793
1.0
6.123233995736766e-17
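
To make the point about selective imports concrete, here is a sketch that assumes a fresh session in which `cos` has not been defined. Only the names listed after `import` become available:

```python
from math import pi, sin

print(sin(pi / 2))  # sin was imported by name, so this works

try:
    cos(pi / 2)     # cos was not imported by name
except NameError as err:
    message = str(err)
    print("NameError:", message)
```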


## Classes and objects

Python is an object-oriented programming language. Everything in Python is an object (even basic, built-in types and constants). We won't cover these topics here, but see, for instance, https://docs.python.org/3/tutorial/classes.html and https://www.w3schools.com/python/python_classes.asp to learn how to use classes, objects, and inheritance.

## How to get help

If you don't know what a function does or what its arguments are, you can get its documentation directly from Python. To look up `something`, the syntax is
```python
help(something)
```
or, in IPython/Jupyter only, the more compact syntax
```python
?something
```

In [None]:
# it shows help for help function
?help

In [None]:
import math
?math.sin 
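
Outside of a notebook the `?` shortcut is not available, but the same information can be reached with plain Python. A small sketch:

```python
import math

# help(math.sin) prints the text stored in the __doc__ attribute
print(math.sin.__doc__)

# dir() lists the names defined in a module or object
print("sin" in dir(math))  # True
```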

## numpy arrays

* NumPy is a library of the [SciPy open-source project](https://www.scipy.org/index.html), which also includes other relevant libraries like __Pandas__, __SciPy__, __Matplotlib__ (which we will see later in the course), and **IPython** (the kernel behind Jupyter).
* We will see the main features of numpy arrays: creation, indexing, operations on arrays, and differences with Python's native lists
* They are very similar to MATLAB arrays (if you're familiar with them)
* See also [numpy's cheatsheet](https://www.datacamp.com/community/blog/python-numpy-cheat-sheet) 

### Basics

In [35]:
# np is the standard alias for numpy
import numpy as np

In [36]:
# Create a 1-D numpy array with 3 elements (its shape is (3,), not 3x1)
a = np.array([1,2,3])

# Print object type
print(type(a))

# Print shape
print(a.shape)

# Print some values in a
print(a[0], a[1], a[2])

# change 2nd value and print
a[1] = 4
print(a)

<class 'numpy.ndarray'>
(3,)
1 2 3
[1 4 3]


In [37]:
# Create a 2x2 numpy array
b = np.array([[1,2],[3,4]])

# Print shape
print(b.shape)

# change value in first row and 2nd column and print
b[0,1] = 9
print(b)

(2, 2)
[[1 9]
 [3 4]]


In [38]:
# 2x3 zero array 
d = np.zeros((2,3))
print(d)

# 4x2 array of ones
e = np.ones((4,2))
print(e)

# 2x2 constant array
f = np.full((2,2), 9)
print(f)

# 3x3 random array
# it calls function 'random' of np's 'random' module
#     which draws uniformly distributed values in the half-open interval [0,1)
g = np.random.random((3,3))
print(g)

[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]
[[9 9]
 [9 9]]
[[0.87891411 0.75086239 0.45110004]
 [0.80503875 0.47568514 0.59402417]
 [0.11416102 0.12105366 0.16646944]]
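
A few other creation functions are often useful. The sketch below uses `np.eye`, `np.arange`, `np.linspace` and `reshape`; the variable names are chosen only for this example:

```python
import numpy as np

identity = np.eye(3)                 # 3x3 identity matrix
print(identity)

evens = np.arange(0, 10, 2)          # like range(): start, stop (excluded), step
print(evens)

grid = np.linspace(0, 1, 5)          # 5 evenly spaced values from 0 to 1, both included
print(grid)

table = np.arange(6).reshape((2, 3)) # reshape a 1-D array of 6 elements into 2x3
print(table)
```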


### Indexing

We can select slices of the array using the symbol `:`. For instance:
* `[3:7]` selects the indices from 3 (included) to 7 (excluded)
* `[:,4]` selects all indices along the first dimension, and only index 4 along the second dimension

In [39]:
# Create 3x4 array
h = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(h)

# Slice array to make a 2x2 sub-array
# :2 is equivalent to 0:2, i.e., it selects indices 0 and 1
i = h[:2, 1:3]
print(i)

# print fourth column of h
print(h[:,3])

# change 2nd row of i (i is a view of h, so this also changes h)
i[1,:] = [8,9]
print(i)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[2 3]
 [6 7]]
[ 4  8 12]
[[2 3]
 [8 9]]
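
One detail worth knowing about slicing: a slice of a numpy array is a _view_ on the same data, not a copy, so writing to the slice also changes the original array. A minimal sketch, using new variable names (`view`, `independent`) to keep it separate from the cell above:

```python
import numpy as np

h = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = h[:2, 1:3]        # a view: shares memory with h
view[0, 0] = 100
print(h[0, 1])           # the original array changed too: 100

independent = h[:2, 1:3].copy()   # copy() makes an independent array
independent[0, 0] = -1
print(h[0, 1])           # h is unaffected this time: still 100
```

This differs from Python lists, where slicing always creates a new list.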


### Datatypes and math operations

Basic mathematical functions operate element-wise on arrays, and are available both as operator overloads and as functions in the numpy module

In [40]:
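# Integer (the default integer dtype is platform-dependent: int32 in the output below, int64 on many systems)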
j = np.array([1, 2])
print(j.dtype)  

# Float
k = np.array([1.0, 2.0])
print(k.dtype)         

# Force Data Type
l = np.array([1.0, 2.0], dtype=np.int64)
print(l.dtype)
print(l)

int32
float64
int64
[1 2]
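
An existing array can also be converted to another type with the `astype` method. A small sketch, with arbitrary values:

```python
import numpy as np

k = np.array([1.7, -2.2, 3.0])
as_int = k.astype(np.int64)   # returns a new array; k itself is unchanged
print(as_int)                 # fractional parts are dropped (1, -2, 3)
print(k.dtype, as_int.dtype)  # float64 int64

# mixing an integer array and a float in an operation gives a float array
print((as_int + 0.5).dtype)   # float64
```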


In [41]:
# by writing 1. np knows we want a float
x = np.array([[1.,2.],[3.,4.]])
y = np.array([[5.,6.],[7.,8.]])

# Element-wise sum
print(x + y)
# equivalent to 
# print(np.add(x, y))

# Element-wise difference
print(x - y)
# equivalent to 
# print(np.subtract(x, y))

# Element-wise product
print(x * y)
# equivalent to 
# print(np.multiply(x, y))

# Element-wise division
print(x / y)
# equivalent to 
# print(np.divide(x, y))

# Element-wise square root
print(np.sqrt(x))

[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[ 5. 12.]
 [21. 32.]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[1.         1.41421356]
 [1.73205081 2.        ]]
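
Note that `*` is the element-wise product, not the matrix product. For the latter, numpy provides the `@` operator (and `np.dot`). A brief sketch with the same `x` and `y` as above:

```python
import numpy as np

x = np.array([[1., 2.], [3., 4.]])
y = np.array([[5., 6.], [7., 8.]])

elementwise = x * y      # multiplies entries position by position
matrix_prod = x @ y      # rows of x times columns of y
print(elementwise)
print(matrix_prod)
print(np.dot(x, y))      # equivalent to x @ y for 2-D arrays
```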


In [42]:
x = np.array([[1,2],[3,4]])

# Sum of all elements
print(np.sum(x))

# Column-wise sum
print(np.sum(x, axis=0)) 

# Row-wise sum
print(np.sum(x, axis=1))

# Mean of all elements
print(np.mean(x))

# Column-wise mean
print(np.mean(x, axis=0)) 

# Row-wise mean
print(np.mean(x, axis=1))

10
[4 6]
[3 7]
2.5
[2. 3.]
[1.5 3.5]


### Select elements by condition
Instead of specifying the index for selecting, one can use a Boolean array describing the elements to select

In [43]:
# generates a numpy array from 1 to 20
first_twenty = np.arange(1, 21, 1)
print(first_twenty)
# generates a Boolean array indicating whether or not
#     each element of first_twenty is odd 
odds = (first_twenty % 2)==1
# do the same for elements multiple of 3
mul3 = (first_twenty % 3)==0

# now select from the array only the multiples of 3
print(first_twenty[mul3])

# select the elements that are multiples of 3 and odd
print(first_twenty[mul3 & odds])

# select the elements that are multiples of 3 but not odd
print(first_twenty[mul3 & ~odds])

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
[ 3  6  9 12 15 18]
[ 3  9 15]
[ 6 12 18]
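
Boolean conditions can also be written inline and used on the left-hand side of an assignment. A minimal sketch (the array `values` is introduced only for this example):

```python
import numpy as np

values = np.arange(1, 11)          # 1, 2, ..., 10

large = values[values > 7]         # condition written inline
print(large)                       # [ 8  9 10]

# a Boolean mask can also select the elements to overwrite
values[values % 2 == 0] = 0        # set all even elements to 0
print(values)                      # [1 0 3 0 5 0 7 0 9 0]

# | is element-wise "or"; each comparison needs its own parentheses
print(values[(values == 1) | (values == 9)])  # [1 9]
```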


### Native lists vs numpy arrays
Main differences:
* Lists can hold elements of multiple datatypes. numpy arrays are homogeneous: all elements share one `dtype`.
    * For this reason, element-wise mathematical operations are not defined on lists (e.g., `+` concatenates two lists), whereas on arrays they are.
* Memory for lists is allocated dynamically. numpy arrays instead get a contiguous block of memory upon creation
    * which is why lists are less efficient in general for numerical work
    * but more efficient on operations that change the size of the data structure (like append)
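
The difference in how operators behave can be seen in a short sketch (variable names are chosen only for this example):

```python
import numpy as np

a_list = [1, 2, 3]
an_array = np.array([1, 2, 3])

print(a_list + a_list)      # + concatenates lists: [1, 2, 3, 1, 2, 3]
print(an_array + an_array)  # + adds arrays element-wise: [2 4 6]

print(a_list * 2)           # * repeats a list: [1, 2, 3, 1, 2, 3]
print(an_array * 2)         # * multiplies each element: [2 4 6]

# mixed numeric input is converted to a single common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)          # float64
```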


See more at official docs pages:  
[Lists](https://docs.python.org/3/tutorial/datastructures.html), [arrays](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.array.html), [more on arrays](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.creation.html)