<a href="https://colab.research.google.com/github/stevengiacalone/Python-workshop/blob/main/Session_1_Python_basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome

Welcome to the first session of the Python workshop! In this practical, we will learn about the programming language Python as well as NumPy, a library that contains tons of useful tools and pre-defined functions.

Credit: Much of this notebook was written by Mathieu Blondel. You can read more about the material in the official Python tutorial (https://docs.python.org/3/tutorial/) or on the W3Schools website (https://www.w3schools.com/python/default.asp).

# Notebooks

Throughout this workshop, we'll be working in Jupyter notebooks through the Google Colab platform. Notebooks are composed of "cells" that can contain chunks of code, text, or images. Through Colab, we use CPUs and GPUs on Google's servers to run our code, so you should be able to execute all of the cells in the workshop even if your computer is not very powerful. However, access to Colab requires an internet connection.

Let's run our first cell of code. Below, we define a string called hello_world. To run the cell, click on the "play" button on the left-hand side. Alternatively, you can press ctrl+enter, cmd+enter, or shift+enter when the cell is selected (my preferred method).

In [1]:
hello_world = "hello world"

Variables that you define in one cell can later be used in other cells, so you don't need to re-define things every time. Let's print the string we defined in the previous cell.

In [2]:
print(hello_world)

hello world


To add a new cell, click on this cell, then click on the "+ Code" button in the top left, or hover above or below an existing cell and click the "+ Code" button that appears. You can delete cells by left-clicking on them and selecting the "Delete cell" option.

# Python

Python is one of the most popular programming languages out there, both in academia and in industry. It is essential to learn this language for anyone interested in research and data science. In this session, we will review Python basics.

## Arithmetic operations

Python supports the usual arithmetic operators: + (addition), - (subtraction), * (multiplication), / (division), ** (power), // (integer division, also called floor division). Arithmetic operations follow the standard PEMDAS rules. Let's try a few below.

In [3]:
# side note: this is a comment - you can write comments (which will be
# ignored when you execute the code) by putting "#" at the start of the line

# addition (stored as "total" to avoid overwriting Python's built-in sum() function)
total = 1 + 1
print("sum =", total)

# subtraction
difference = 4 - 1
print("difference =", difference)

# multiplication
product = 5 * 2
print("product =", product)

# division
quotient = 20 / 5
print("quotient =", quotient)

# integer division
int_quotient = 20 // 5
print("integer quotient =", int_quotient)

# power
power = 2**5
print("power =", power)

# put it all together
x = ((10*5) - 30)**2 / 10
print("x =", x)

sum = 2
difference = 3
product = 10
quotient = 4.0
integer quotient = 4
power = 32
x = 40.0


You might have noticed that some of the outputs are integers, while others have decimals. This brings us to our next subject of data types.
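
To see where the decimals come from, the short sketch below compares `/` and `//` on the same operands. The variable names are illustrative, and the `%` (remainder) operator is an extra that is not covered above.

```python
# "/" is true division and always returns a float, even when the result is whole
true_quotient = 7 / 2      # 3.5
whole_quotient = 20 / 5    # 4.0 (still a float)

# "//" is floor division: it rounds the result down to the nearest integer
floor_quotient = 7 // 2    # 3
negative_floor = -7 // 2   # -4, because flooring rounds toward negative infinity

# "%" returns the remainder left over by floor division
remainder = 7 % 2          # 1

print(true_quotient, whole_quotient, floor_quotient, negative_floor, remainder)
```

Note that `//` only returns an int when both operands are ints; with a float operand (e.g. `7.0 // 2`) the result is the float `3.0`.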

## Data types

Python supports a number of different data types. Here we'll discuss a few of the more commonly used ones.

- int: A normal integer (i.e., no decimal places)
- float: A floating-point number, i.e., a number with a decimal point (e.g., 1.0 or 3.14)
- str: A string, which can be a sequence of any characters in quotes
- bool: A boolean, which can be True or False
- list: A sequence of elements that can be modified (more on these later)
- tuple: Like a list, but cannot be modified and takes up less memory
- range: An immutable sequence of integers, counting up from 0 by default - useful for loops (more on these later)
- dict: A dictionary, an ordered collection of key-value pairs (more on these later).

You can check the type of a variable using type(var). Let's explore below.

In [4]:
this_int = 1
print("this_int is a ", type(this_int))

this_float = 1.0
print("this_float is a ", type(this_float))

this_str = "abc123"
print("this_str is a ", type(this_str))

this_bool = True
print("this_bool is a ", type(this_bool))

# lists are defined with square brackets
this_list = [1, 3, 5, 7, 9]
print("this_list is a ", type(this_list))

# tuples are defined with round brackets
this_tuple = (1, 3, 5, 7, 9)
print("this_tuple is a ", type(this_tuple))

# ranges are defined with the syntax range(N)
this_range = range(5)
print("this_range is a ", type(this_range))

# dicts are defined with curly brackets and require both a key and a value for each entry
this_dict = {"name": "john", "age": 30, "height": 180}
print("this_dict is a ", type(this_dict))

this_int is a  <class 'int'>
this_float is a  <class 'float'>
this_str is a  <class 'str'>
this_bool is a  <class 'bool'>
this_list is a  <class 'list'>
this_tuple is a  <class 'tuple'>
this_range is a  <class 'range'>
this_dict is a  <class 'dict'>
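
As a rough illustration of how some of these types behave differently, the sketch below modifies a list, tries (and fails) to modify a tuple, lists the contents of a range, and looks up a dict value by key. The variable names are illustrative, and the `try`/`except` construct used to catch the error goes beyond what this notebook covers.

```python
# A list can be modified in place...
example_list = [1, 3, 5]
example_list[0] = 100

# ...but a tuple cannot: item assignment raises a TypeError
example_tuple = (1, 3, 5)
try:
    example_tuple[0] = 100
    tuple_was_modified = True
except TypeError:
    tuple_was_modified = False

# range(N) produces the integers 0, 1, ..., N-1; convert it to a list to see them
range_values = list(range(5))

# dict values are looked up by key rather than by position
example_dict = {"name": "john", "age": 30, "height": 180}
age = example_dict["age"]

print(example_list, tuple_was_modified, range_values, age)
```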


You can also change the type of a variable using pre-defined functions like int(), float(), and str(). Let's change the types of a few of the variables defined above.

In [5]:
this_int_to_str = str(this_int)
print(this_int, "->", this_int_to_str)
print("this_int_to_str is now a ", type(this_int_to_str))

this_int_to_float = float(this_int)
print(this_int, "->", this_int_to_float)
print("this_int_to_float is now a ", type(this_int_to_float))

this_float_to_int = int(this_float)
print(this_float, "->", this_float_to_int)
print("this_float_to_int is now a ", type(this_float_to_int))

1 -> 1
this_int_to_str is now a  <class 'str'>
1 -> 1.0
this_int_to_float is now a  <class 'float'>
1.0 -> 1
this_float_to_int is now a  <class 'int'>


Note that this will not work with everything. For example, a string can only be converted to a float or int if it represents a valid number (e.g., "123" works, but "abc123" does not). Similarly, converting a float to an int does not round the value: it simply cuts off everything after the decimal point.

In [6]:
# this will return an error
int(this_str)

ValueError: invalid literal for int() with base 10: 'abc123'

In [7]:
# this will work, but you lose some information
test_float = 1.234
test_float_to_int = int(test_float)
print(test_float_to_int)

1
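
A few more conversions, as a minimal sketch with illustrative values. It contrasts `int()`, which truncates, with the built-in `round()`, which is not covered above.

```python
# strings that represent valid numbers convert cleanly
int_from_str = int("42")            # 42
float_from_str = float("3.14")      # 3.14

# int() truncates toward zero, while round() rounds to the nearest integer
truncated_positive = int(2.9)       # 2
truncated_negative = int(-2.9)      # -2
rounded = round(2.9)                # 3

# int("3.99") would raise a ValueError, so go through float() first
int_via_float = int(float("3.99"))  # 3

print(int_from_str, float_from_str, truncated_positive,
      truncated_negative, rounded, int_via_float)
```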


## Lists

Lists are a container type for ordered sequences of elements. Lists can be initialized empty

In [8]:
my_list = []

or with some initial elements

In [9]:
my_list = [1, 2, 3]

Lists have a dynamic size and elements can be added (appended) to them

In [10]:
my_list.append(4)
my_list

[1, 2, 3, 4]

We can access individual elements of a list (indexing starts from 0)

In [11]:
my_list[2]

3

We can access "slices" of a list using `my_list[i:j]` where `i` is the start of the slice (again, indexing starts from 0) and `j` the end of the slice. This will return the elements between index `i` and `j-1`, inclusive. For instance:

In [12]:
my_list[1:3]

[2, 3]

Omitting the second index means that the slice should run until the end of the list

In [13]:
my_list[1:]

[2, 3, 4]
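
Slices have a few more variants worth knowing. The sketch below uses an illustrative list called `letters` to show an omitted start index, negative indices, and an optional third "step" value.

```python
letters = ["a", "b", "c", "d", "e"]  # illustrative list

first_two = letters[:2]      # omitting the start index begins at the first element
last_two = letters[-2:]      # negative indices count from the end of the list
every_other = letters[::2]   # a third value sets the step between elements
last_item = letters[-1]      # index -1 is the last element

print(first_two, last_two, every_other, last_item)
```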

We can check if an element is in the list using `in`

In [14]:
5 in my_list

False

The length of a list can be obtained using the `len` function

In [15]:
len(my_list)

4

Lastly, elements of a list can be removed using the remove() method or the del keyword.

In [16]:
# use my_list.remove(x) to remove an element of the list with a specific value
my_list.remove(4)
print(my_list)

[1, 2, 3]


In [17]:
# use del my_list[idx] to remove an element by its index
del my_list[-1] # the index "-1" indicates the last element of the list
print(my_list)

[1, 2]
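
The difference between removing by value and removing by index matters most when a list contains repeated values. Here is a small sketch using an illustrative list called `values`.

```python
values = [1, 2, 3, 2]  # illustrative list containing a repeated value

# remove(x) deletes only the first element equal to x
values.remove(2)
after_remove = list(values)   # [1, 3, 2]

# del deletes by position, whatever the value stored there
del values[0]
after_del = list(values)      # [3, 2]

# remove(x) raises a ValueError when x is not in the list,
# so checking with "in" first avoids the error
if 10 in values:
    values.remove(10)

print(after_remove, after_del, values)
```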


## Strings

Strings are used to store text. They can be defined using either single quotes or double quotes.

In [18]:
string1 = "some text"
string2 = 'some other text'

Strings behave similarly to lists. As such we can access individual elements in exactly the same way

In [19]:
string1[3]

'e'

and similarly for slices

In [20]:
string1[5:]

'text'

String concatenation is performed using the `+` operator

In [21]:
string1 + " " + string2

'some text some other text'
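
The similarity with lists extends to `len` and `in`, but with one important difference: strings cannot be modified in place. The following sketch illustrates this with an illustrative variable called `text`; the `try`/`except` construct is used only to catch the error and is not covered in this notebook.

```python
text = "some text"  # illustrative string

length = len(text)           # 9 (the space counts as a character)
has_word = "text" in text    # True: "in" checks for substrings
first_word = text[:4]        # 'some'

# Unlike lists, strings are immutable: item assignment raises a TypeError
try:
    text[0] = "S"
    assignment_worked = True
except TypeError:
    assignment_worked = False

# To "change" a string, build a new one instead
capitalized = "S" + text[1:]

print(length, has_word, first_word, assignment_worked, capitalized)
```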

## Conditionals

As their name indicates, conditionals are a way to execute code depending on whether a condition is True or False. As in other languages, Python supports `if` and `else` but `else if` is contracted into `elif`, as the example below demonstrates.

In [22]:
my_variable = 5
if my_variable < 0:
    print("negative")
elif my_variable == 0:
    print("zero")
else: # my_variable > 0
    print("positive")

positive


Here `<` and `>` are the strict `less than` and `greater than` operators, while `==` is the equality operator (not to be confused with `=`, the variable assignment operator). The operators `<=` and `>=` can be used for less than or equal and greater than or equal comparisons.

Unlike many other languages, Python delimits blocks of code using indentation. Here, we use 4-space indentation.
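
As a further sketch, the example below chains several `elif` branches and uses `<=` and `>=`. The variable `temperature`, its value, and the thresholds are purely illustrative.

```python
temperature = 20  # illustrative value, in degrees Celsius

# conditions are checked from top to bottom; only the first matching block runs
if temperature <= 0:
    label = "freezing"
elif temperature < 15:
    label = "cold"
elif temperature >= 30:
    label = "hot"
else:
    # reached only when 15 <= temperature < 30
    label = "mild"

print(label)
```

Each indented line belongs to the `if`, `elif`, or `else` directly above it; removing the indentation would raise an `IndentationError`.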

## Loops

Loops are a way to execute a block of code multiple times. There are two main types of loops: while loops and for loops.

While loop

In [23]:
i = 0
my_list = [1,2,3,4,5]
while i < len(my_list):
    print("i =", i, ", my_list[i] =", my_list[i])
    i += 1 # equivalent to i = i + 1

i = 0 , my_list[i] = 1
i = 1 , my_list[i] = 2
i = 2 , my_list[i] = 3
i = 3 , my_list[i] = 4
i = 4 , my_list[i] = 5


For loop

In [24]:
my_list = [1,2,3,4,5]
for i in range(len(my_list)):
    print("i =", i, ", my_list[i] =", my_list[i])

i = 0 , my_list[i] = 1
i = 1 , my_list[i] = 2
i = 2 , my_list[i] = 3
i = 3 , my_list[i] = 4
i = 4 , my_list[i] = 5


If the goal is simply to iterate over a list, we can do so directly as follows

In [25]:
my_list = [1,2,3,4,5]
for element in my_list:
    print("element =", element)

element = 1
element = 2
element = 3
element = 4
element = 5


The `break` keyword exits a loop immediately at the current line. The `continue` keyword stops the current iteration at the current line and moves on to the next iteration.

In [26]:
my_list = [1,2,3,4,5]
for i in range(len(my_list)):
    if i > 2:
        # breaks the loop after the third iteration
        break
    print("i =", i, ", my_list[i] =", my_list[i])

i = 0 , my_list[i] = 1
i = 1 , my_list[i] = 2
i = 2 , my_list[i] = 3


In [27]:
my_list = [1,2,3,4,5]
for i in range(len(my_list)):
    if i == 2:
        # skip this iteration and continue to the next if i = 2
        continue
    print("i =", i, ", my_list[i] =", my_list[i])

i = 0 , my_list[i] = 1
i = 1 , my_list[i] = 2
i = 3 , my_list[i] = 4
i = 4 , my_list[i] = 5
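
The two keywords can also be combined in a single loop. The sketch below, which uses illustrative data, adds up the even numbers in a list but stops entirely at the first zero. It relies on the `%` (remainder) operator, which is not covered above.

```python
numbers = [4, 7, -1, 10, 3, 0, 8]  # illustrative data

even_total = 0
for number in numbers:
    if number == 0:
        # stop the loop entirely at the first zero (8 is never reached)
        break
    if number % 2 != 0:
        # odd number: skip the rest of this iteration
        continue
    even_total += number

print("even_total =", even_total)
```

Here only 4 and 10 are added, so the loop prints `even_total = 14`.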


## Functions

To make code more readable and efficient, it is common to separate the code into different blocks, responsible for performing precise actions: functions. A function takes some inputs and processes them to return some outputs.

In [28]:
def square(x):
  return x ** 2

def multiply(a, b):
  return a * b

# Functions can be combined
this_product = multiply(a=2, b=3)
print(this_product)
this_square = square(x=this_product)
print(this_square)

6
36
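
Functions can also have default argument values and return more than one output. The sketch below shows both; the function names `power` and `min_and_max` are hypothetical and chosen only for illustration.

```python
def power(base, exponent=2):
    # exponent defaults to 2 when the caller does not provide it
    return base ** exponent

def min_and_max(values):
    # returning two values separated by a comma packs them into a tuple
    return min(values), max(values)

squared = power(3)              # uses the default exponent
cubed = power(3, exponent=3)    # overrides the default

# the returned tuple can be unpacked into two variables
lowest, highest = min_and_max([4, 1, 9])

print(squared, cubed, lowest, highest)
```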


## Exercises

Let's try coding a few things up ourselves.

**Exercise 1.** Using a conditional, write the [relu](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) function defined as follows

$\text{relu}(x) = \left\{
   \begin{array}{rl}
     x, & \text{if }  x \ge 0 \\
     0, & \text{otherwise }.
   \end{array}\right.$

In [29]:
def relu(x):
  # Write your function here
  return

relu(-3)

**Exercise 2.** Using a for loop, write a function that computes the [Euclidean norm](https://en.wikipedia.org/wiki/Norm_(mathematics)#Euclidean_norm) of a vector, represented as a list.

In [30]:
def euclidean_norm(vector):
  # Write your function here
  return

my_vector = [0.5, -1.2, 3.3, 4.5]
# The result should be roughly 5.729746940310715
euclidean_norm(my_vector)

**Exercise 3.** Using a for loop and a conditional, write a function that returns the maximum value in a vector.

In [31]:
def vector_maximum(vector):
  # Write your function here
  return

## Going further

Clearly, it is impossible to cover all the language features in this short introduction. To go further, we recommend the following resources:



* List of Python [tutorials](https://wiki.python.org/moin/BeginnersGuide/Programmers)
* Four-hour [course](https://www.youtube.com/watch?v=rfscVS0vtbw) on YouTube



# NumPy

NumPy is a popular library for storing arrays of numbers and performing computations on them. Not only does it often make code more succinct, it also makes code faster, since most NumPy routines are implemented in C.

To use NumPy in your program, you need to import it as follows

In [32]:
import numpy as np

## Array creation



NumPy arrays can be created from Python lists

In [33]:
my_array = np.array([1, 2, 3])
my_array

array([1, 2, 3])

NumPy supports arrays of arbitrary dimension. For example, we can create two-dimensional arrays (e.g. to store a matrix) as follows

In [34]:
my_2d_array = np.array([[1, 2, 3], [4, 5, 6]])
my_2d_array

array([[1, 2, 3],
       [4, 5, 6]])

We can access individual elements of a 2d-array using two indices

In [35]:
my_2d_array[1, 2]

6

We can also access rows

In [36]:
my_2d_array[1]

array([4, 5, 6])

and columns

In [37]:
my_2d_array[:, 2]

array([3, 6])
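
Row and column access can be combined with slices to pull out sub-blocks. A minimal sketch, using an illustrative array called `grid` that holds the same values as `my_2d_array`:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])  # illustrative array, same values as my_2d_array

first_row = grid[0, :]          # array([1, 2, 3]), equivalent to grid[0]
first_two_cols = grid[:, :2]    # array([[1, 2], [4, 5]])
last_element = grid[-1, -1]     # 6: negative indices count from the end

print(first_row)
print(first_two_cols)
print(last_element)
```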

Arrays have a `shape` attribute

In [38]:
print(my_array.shape)
print(my_2d_array.shape)

(3,)
(2, 3)


Unlike Python lists, NumPy arrays must have a type, and all elements of the array must have the same type.

In [39]:
my_array.dtype

dtype('int64')

The main types are `int32` (32-bit integers), `int64` (64-bit integers), `float32` (32-bit real values) and `float64` (64-bit real values).

The `dtype` can be specified when creating the array

In [40]:
my_array = np.array([1, 2, 3], dtype=np.float64)
my_array.dtype

dtype('float64')
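
One consequence of the single-type rule is that mixing integers and floats produces a float array. The sketch below shows this, along with the `astype` method for converting an existing array to another type; `astype` is not covered above, and the variable names are illustrative.

```python
import numpy as np

# mixing ints and floats: NumPy picks a type that can hold every element
mixed = np.array([1, 2.5, 3])
mixed_dtype = mixed.dtype          # float64

# astype returns a converted copy; float -> int truncates the decimals
as_ints = mixed.astype(np.int64)   # array([1, 2, 3])

print(mixed, mixed_dtype)
print(as_ints, as_ints.dtype)
```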

We can create arrays of all zeros using

In [41]:
zero_array = np.zeros((2, 3))
zero_array

array([[0., 0., 0.],
       [0., 0., 0.]])

and similarly for all ones using `ones` instead of `zeros`.
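
For completeness, here is a short sketch of `np.ones`, including the optional `dtype` argument. The variable names are illustrative.

```python
import numpy as np

# same call pattern as np.zeros: pass the desired shape as a tuple
one_array = np.ones((2, 3))

# like np.zeros, the result is float64 unless a dtype is given
int_ones = np.ones(3, dtype=np.int64)

print(one_array)
print(int_ones)
```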

We can create a range of values using

In [42]:
np.arange(5)

array([0, 1, 2, 3, 4])

or specifying the starting point

In [43]:
np.arange(3, 5)

array([3, 4])

Another useful routine is `linspace` for creating linearly spaced values in an interval. For instance, to create 10 values in `[0, 1]`, we can use

In [44]:
np.linspace(0, 1, 10)

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
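
The two routines treat their end point differently, which is a common source of off-by-one mistakes. A brief sketch with illustrative values:

```python
import numpy as np

# arange(start, stop, step): the stop value is excluded
stepped = np.arange(0, 10, 2)          # array([0, 2, 4, 6, 8])

# linspace(start, stop, num): both end points are included
evenly_spaced = np.linspace(0, 1, 5)   # array([0.  , 0.25, 0.5 , 0.75, 1.  ])

print(stepped)
print(evenly_spaced)
```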

Another important operation is `reshape`, for changing the shape of an array

In [45]:
my_array = np.array([1, 2, 3, 4, 5, 6])
my_array.reshape(3, 2)

array([[1, 2],
       [3, 4],
       [5, 6]])
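
Two details of `reshape` are worth trying out: one dimension can be given as `-1` so that NumPy infers it, and the new shape must contain exactly as many elements as the original array. The sketch below uses illustrative names, and the `try`/`except` construct is only there to catch the error.

```python
import numpy as np

flat = np.arange(1, 7)            # array([1, 2, 3, 4, 5, 6])

# -1 asks NumPy to work out that dimension: 6 elements / 2 rows = 3 columns
two_rows = flat.reshape(2, -1)    # shape (2, 3)

# a shape with the wrong number of elements (4 * 2 = 8) raises a ValueError
try:
    flat.reshape(4, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(two_rows)
print(two_rows.shape, reshape_failed)
```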

Play with these operations and make sure you understand them well.

## Basic operations

In NumPy, we express computations directly over arrays. This makes the code much more succinct.

Arithmetic operations can be performed directly over arrays. For instance, assuming two arrays have a compatible shape, we can add them as follows

In [46]:
array_a = np.array([1, 2, 3])
array_b = np.array([4, 5, 6])
array_a + array_b

array([5, 7, 9])

Compare this with the equivalent computation using a for loop

In [47]:
array_out = np.zeros_like(array_a)
for i in range(len(array_a)):
    array_out[i] = array_a[i] + array_b[i]
array_out

array([5, 7, 9])

Not only is this code more verbose, it also runs much more slowly.
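
One way to check the speed difference yourself is to time both versions with the standard-library `timeit` module. This is only a sketch: the array size and function names are illustrative, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

big_a = np.arange(100_000)  # illustrative array size
big_b = np.arange(100_000)

def add_with_loop():
    # element-by-element addition using a Python for loop
    out = np.zeros_like(big_a)
    for i in range(len(big_a)):
        out[i] = big_a[i] + big_b[i]
    return out

def add_vectorized():
    # the same computation expressed directly over arrays
    return big_a + big_b

# run each version 3 times and report the total time in seconds
loop_time = timeit.timeit(add_with_loop, number=3)
vectorized_time = timeit.timeit(add_vectorized, number=3)

print("loop:      ", loop_time, "seconds")
print("vectorized:", vectorized_time, "seconds")
```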

In NumPy, functions that operate on arrays in an element-wise fashion are called [universal functions](https://numpy.org/doc/stable/reference/ufuncs.html). One example is `np.sin`

In [48]:
np.sin(array_a)

array([0.84147098, 0.90929743, 0.14112001])

Some other useful functions include:
- np.max() and np.min(): return the largest or smallest element of an array
- np.argmax() and np.argmin(): return the index of the largest or smallest element of an array
- np.sum(): returns the sum of all elements in an array

There are tons of others. You can usually find a function that meets your needs via a Google search.
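
The functions listed above can be tried on a small 2d array, as in the sketch below. The array `scores` is illustrative. By default these functions work over the whole array; the optional `axis` argument restricts them to columns (`axis=0`) or rows (`axis=1`).

```python
import numpy as np

scores = np.array([[3, 9, 1],
                   [7, 2, 8]])  # illustrative data

largest = np.max(scores)                # 9
smallest = np.min(scores)               # 1
index_of_largest = np.argmax(scores)    # 1: position in the flattened array
total = np.sum(scores)                  # 30

column_max = np.max(scores, axis=0)     # array([7, 9, 8]): one value per column
row_sums = np.sum(scores, axis=1)       # array([13, 17]): one value per row

print(largest, smallest, index_of_largest, total)
print(column_max, row_sums)
```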

Matrix transpose can be done using `.transpose()` or `.T` for short

In [52]:
my_array = np.array([[1,2,3],
                     [4,5,6]]
                    )
my_array.T

array([[1, 4],
       [2, 5],
       [3, 6]])
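
Returning to the "compatible shape" condition for arithmetic mentioned earlier in this section: shapes do not have to be identical. Under NumPy's broadcasting rules, a scalar or a 1d array whose length matches the last dimension can be combined with a 2d array. The sketch below uses illustrative names, and the `try`/`except` construct only serves to catch the error from an incompatible shape.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)

scaled = matrix * 2        # the scalar is applied to every element
shifted = matrix + row     # row is added to each row of matrix

# shape (2,) does not match the last dimension (3), so this raises a ValueError
try:
    matrix + np.array([1, 2])
    shapes_compatible = True
except ValueError:
    shapes_compatible = False

print(scaled)
print(shifted)
print(shapes_compatible)
```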

## Slicing and masking

Like Python lists, NumPy arrays support slicing

In [53]:
np.arange(10)[5:]

array([5, 6, 7, 8, 9])

We can also select only certain elements from the array using a boolean mask, i.e., an array of True/False values with one entry per element

In [54]:
x = np.arange(10)
mask = (x >= 5)
x[mask]

array([5, 6, 7, 8, 9])
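
Masks can be combined and can also be used for assignment. The following sketch uses illustrative names; note that conditions are combined with `&` (and parentheses around each condition), not with the Python keyword `and`.

```python
import numpy as np

x = np.arange(10)

# combine two conditions element-wise with "&"
middle = x[(x >= 3) & (x < 7)]     # array([3, 4, 5, 6])

# True counts as 1, so summing a mask counts the matching elements
count_large = np.sum(x >= 5)       # 5

# assigning through a mask changes only the selected elements
clipped = x.copy()                 # copy so that x itself is left untouched
clipped[clipped > 6] = 6           # array([0, 1, 2, 3, 4, 5, 6, 6, 6, 6])

print(middle, count_large, clipped)
```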

## Exercises

**Exercise 1.** Create a 3d array of shape (2, 2, 2), containing 8 values. Access individual elements and slices.

**Exercise 2.** Rewrite the relu function (see Python section) using [np.maximum](https://numpy.org/doc/stable/reference/generated/numpy.maximum.html). Check that it works on both a single value and on an array of values.

In [55]:
def relu_numpy(x):
  return

relu_numpy(np.array([1, -3, 2.5]))

**Exercise 3.** Rewrite the Euclidean norm of a vector (1d array) using NumPy (without a for loop)

In [56]:
def euclidean_norm_numpy(x):
  return

my_vector = np.array([0.5, -1.2, 3.3, 4.5])
euclidean_norm_numpy(my_vector)

**Exercise 4.** Write a function that computes the Euclidean norms of a matrix (2d array) in a row-wise fashion. Hint: use the `axis` argument of [np.sum](https://numpy.org/doc/stable/reference/generated/numpy.sum.html).

In [57]:
def euclidean_norm_2d(X):
  return

my_matrix = np.array([[0.5, -1.2, 4.5],
                      [-3.2, 1.9, 2.7]])
# Should return an array of size 2.
euclidean_norm_2d(my_matrix)