# Lab 2 - Python Crash Course Pt.2

## Loops

### Looping over lists

Looping over a list is possible using a for-each style loop:

In [1]:
array = [1,2,3,4,5]

for element in array:
    print(element)

1
2
3
4
5


If you need the indices, instead of the traditional while loop
```python
i = 0
while i < len(array):
    array[i] *= 2
    i += 1
```
consider the more concise alternative using the built-in `range` function

In [2]:
array = [1,2,3,4,5]

for i in range(len(array)):
    array[i] *= 2
array

[2, 4, 6, 8, 10]
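When both the index and the element are needed, the built-in `enumerate` is a common alternative to `range(len(...))`. The following is a minimal sketch of the same doubling loop, not the only way to write it:

```python
array = [1, 2, 3, 4, 5]

# enumerate yields (index, element) pairs, so array[i] does not have to be read manually
for i, element in enumerate(array):
    array[i] = element * 2

print(array)  # [2, 4, 6, 8, 10]
```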

From the documentation, `range` returns an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step

In [3]:
print( range(1,10,2) )
print( list(range(1,10,2)) ) # you can cast it to a list if you want to use it like one

range(1, 10, 2)
[1, 3, 5, 7, 9]
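As a quick sketch of the other call forms, `range` can also be given only a stop value, or a negative step to count down. The variable names below are only illustrative:

```python
only_stop = list(range(5))            # start defaults to 0
start_stop = list(range(2, 6))        # step defaults to 1
counting_down = list(range(10, 0, -3))  # negative step; stop (0) is still exclusive

print(only_stop)      # [0, 1, 2, 3, 4]
print(start_stop)     # [2, 3, 4, 5]
print(counting_down)  # [10, 7, 4, 1]
```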


In Jupyter notebooks, you can quickly check a function's documentation by typing `?` after its name, in place of the call parentheses and arguments.

P.S. see what happens when you use `?` on a variable like `array`

In [4]:
range?

Init signature: range(self, /, *args, **kwargs)
Docstring:
range(stop) -> range object
range(start, stop[, step]) -> range object

Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
Type:           type
Subclasses:
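The `?` shortcut is specific to IPython and Jupyter. In a plain Python script or interpreter, a rough equivalent (sketched here) is to read the docstring directly or call the built-in `help`:

```python
# Every documented object stores its docstring in the __doc__ attribute
doc = range.__doc__
print(doc)

# help(range) would print the full, formatted documentation page instead
```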


### Looping over dictionaries

In [5]:
months = {
    "January": 1,
    "February": 2,
    "March": 3,
    "April": 4,
    "May": 5,
    "June": 6,
    "July": 7,
    "August": 8,
    "September": 9,
    "October": 10,
    "November": 11,
    "December": 12
}

for month_name in months: # looping over keys
    print(month_name)

January
February
March
April
May
June
July
August
September
October
November
December


In [6]:
for month_number in months.values(): # looping over values
    print(month_number)

1
2
3
4
5
6
7
8
9
10
11
12


In [7]:
for month_name, month_number in months.items(): # looping over key and value pairs
    print(month_name, month_number)

January 1
February 2
March 3
April 4
May 5
June 6
July 7
August 8
September 9
October 10
November 11
December 12
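As an illustration of what `.items()` makes possible, the sketch below builds a reverse lookup from month number to month name. It uses a shortened dictionary, and `month_names` is a name introduced here only for the example:

```python
months = {"January": 1, "February": 2, "March": 3}

# Swap each key/value pair to map number -> name
month_names = {}
for name, number in months.items():
    month_names[number] = name

print(month_names[2])  # February
```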


## List comprehensions

List comprehensions provide a concise way of creating lists

In [8]:
# Using a for loop
squares = []
for i in range(1,11):
    squares.append(i**2)
squares

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

In [9]:
# Using list comprehension
squares = [i**2 for i in range(1,11)]
squares

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

List comprehensions can include `if` conditions to filter elements

In [10]:
# Using a for loop
squares = []
for i in range(1,11):
    if i**2 % 2 == 0:
        squares.append(i**2)
squares

[4, 16, 36, 64, 100]

In [11]:
# Equivalent code using list comprehension
squares = [i**2 for i in range(1,11) if i**2 % 2 == 0] # even squares
squares

[4, 16, 36, 64, 100]

In [12]:
# Making it a bit clearer / more readable by breaking it over multiple lines
# P.S. a line of code in python can be split over multiple lines as long as it's between brackets
squares = [
    i**2
    for i in range(1,11)
    if i**2 % 2 == 0
]
squares

[4, 16, 36, 64, 100]
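The same pattern (transform, loop, filter) works for any iterable, not just numbers. A small sketch with a made-up list of words:

```python
words = ["loop", "list", "lambda", "map", "dict"]  # illustrative data

short_upper = [
    word.upper()          # what to put in the new list
    for word in words     # where the values come from
    if len(word) <= 4     # which values to keep
]

print(short_upper)  # ['LOOP', 'LIST', 'MAP', 'DICT']
```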

List comprehensions can also be nested to create nested lists

In [13]:
times_table = [[x*y for x in range(1,13)] for y in range(1,13)]
times_table

[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
 [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24],
 [3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36],
 [4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48],
 [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60],
 [6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72],
 [7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84],
 [8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96],
 [9, 18, 27, 36, 45, 54, 63, 72, 81, 90, 99, 108],
 [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120],
 [11, 22, 33, 44, 55, 66, 77, 88, 99, 110, 121, 132],
 [12, 24, 36, 48, 60, 72, 84, 96, 108, 120, 132, 144]]
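To make the nesting easier to follow, here is a sketch of the equivalent nested `for` loops. The outer loop builds one row per `y`, and the inner loop fills that row:

```python
times_table = []
for y in range(1, 13):        # outer comprehension: one row per y
    row = []
    for x in range(1, 13):    # inner comprehension: one entry per x
        row.append(x * y)
    times_table.append(row)

# Indices start at 0, so row 6 / column 7 holds 7 * 8
print(times_table[6][7])  # 56
```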

**Exercise Break**

Solve Exercise(s): 1 - 3

## Functions

Functions don't need to return anything. If no `return` statement is reached, the function returns `None` by default

In [14]:
def greet():
    print('Hello there')

greet()

Hello there
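A quick way to see the implicit `None` is to capture the result of the call, as in this small sketch:

```python
def greet():
    print('Hello there')

result = greet()        # still prints the greeting
print(result is None)   # True, because greet has no return statement
```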


You can specify parameters for the function to take in

In [15]:
def multiply(x, y):
    return x*y

multiply(10, 5)

50

You can add optional arguments by specifying a default value for the parameter

Here is a root function $\sqrt[n]{x^m}$ that calculates the square root $\sqrt{x}$ by default

In [16]:
def root(x, m=1, n=2):
    return x**(m/n)

root(9)

3.0

You can also pass arguments by keyword, explicitly naming the parameter each argument is assigned to

In [17]:
root(x=8, n=3)

2.0
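Positional and keyword arguments can be mixed, as long as the positional ones come first. The sketch below repeats the `root` function so it runs on its own:

```python
def root(x, m=1, n=2):
    return x**(m/n)

a = root(16, n=4)    # x by position, n by keyword: 16**(1/4)
b = root(n=2, x=9)   # keyword arguments can be given in any order: 9**(1/2)
c = root(4, 3)       # second positional argument fills m: 4**(3/2)

print(a, b, c)
```

Writing `root(x=16, 4)` would be a syntax error, since a positional argument cannot follow a keyword argument.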

## Lambdas

Lambdas are short anonymous (unnamed) functions designed to be created in-line with other code. They behave like regular functions whose body is a single `return` expression, and they are intended for single use. They are a concise way of applying data cleaning or processing steps over the values in a dataset.

In [18]:
# minimal example that takes a parameter x and returns x^2
# syntax is a bit similar to writing a mathematical function 
# f(x) = x^2
lambda x: x**2

<function __main__.<lambda>(x)>

In [19]:
# alternatively, storing the anonymous function in a variable for multiple calls
# P.S. not a good coding style, it'd be better to define a normal function instead
square = lambda x: x**2
print(square(8))
print(square(20))

64
400


Lambdas are especially useful when used in conjunction with a function like `map` which applies a function to each element of a given collection

In [20]:
list(map(lambda x: x**2, [1,2,3,4,5]))

[1, 4, 9, 16, 25]

In [21]:
# Which is equivalent to (if using a named function):
def square(x):
    return x**2

values = [1,2,3,4,5]

list(map(square, values))

[1, 4, 9, 16, 25]
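Other built-ins that accept a function work the same way. As a sketch, `filter` keeps the elements for which the lambda returns `True`, and `sorted` uses the lambda passed as `key` to decide the order. The data below is illustrative:

```python
evens = list(filter(lambda x: x % 2 == 0, [1, 2, 3, 4, 5]))
print(evens)  # [2, 4]

months = {"March": 3, "January": 1, "February": 2}
# Sort the (name, number) pairs by the number, i.e. the second item of each pair
ordered = sorted(months.items(), key=lambda pair: pair[1])
print(ordered)  # [('January', 1), ('February', 2), ('March', 3)]
```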

**Exercise Break**

Solve Exercise(s): 4 - 6 (7 is optional)

*Sneak peek of how lambdas can be used later on with datasets*

In [22]:
import pandas as pd

# let's say we have a dataframe with names 
dataset = pd.DataFrame(data = ['Braund, Mr. Owen Harris',
 'Cumings, Mrs. John Bradley (Florence Briggs Thayer)',
 'Heikkinen, Miss. Laina',
 'Futrelle, Mrs. Jacques Heath (Lily May Peel)',
 'Allen, Mr. William Henry',
 'Moran, Mr. James',
 'McCarthy, Mr. Timothy J',
 'Palsson, Master. Gosta Leonard',
 'Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)',
 'Nasser, Mrs. Nicholas (Adele Achem)'], columns = ['Name'] )

# Have a look at the data
dataset.head()

|   | Name |
|---|------|
| 0 | Braund, Mr. Owen Harris |
| 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... |
| 2 | Heikkinen, Miss. Laina |
| 3 | Futrelle, Mrs. Jacques Heath (Lily May Peel) |
| 4 | Allen, Mr. William Henry |


Let's say we'd like to extract the title from each name into a new column named `Title`.

*Note: Deriving and extracting values, i.e. features, is called feature engineering! An important step in the data science workflow that requires creative thinking and domain knowledge.*

In [23]:
dataset['Title'] = dataset['Name'].apply(lambda x: x.split(" ")[1].replace(".", ""))

# Take a look at what our dataset looks like now
dataset

|   | Name | Title |
|---|------|-------|
| 0 | Braund, Mr. Owen Harris | Mr |
| 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | Mrs |
| 2 | Heikkinen, Miss. Laina | Miss |
| 3 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | Mrs |
| 4 | Allen, Mr. William Henry | Mr |
| 5 | Moran, Mr. James | Mr |
| 6 | McCarthy, Mr. Timothy J | Mr |
| 7 | Palsson, Master. Gosta Leonard | Master |
| 8 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | Mrs |
| 9 | Nasser, Mrs. Nicholas (Adele Achem) | Mrs |
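Splitting on spaces assumes every surname is a single word. The sketch below shows one possible alternative in plain Python, assuming names follow the pattern "Surname, Title. Given names". The helper `extract_title` and the two-word surname are hypothetical and only there for illustration:

```python
def extract_title(name):
    # Keep the text between the first ", " and the following "."
    after_comma = name.split(", ", 1)[1]
    return after_comma.split(".", 1)[0].strip()

names = [
    'Braund, Mr. Owen Harris',
    'Heikkinen, Miss. Laina',
    'Palsson, Master. Gosta Leonard',
    'Van Der Berg, Mr. Jan',   # hypothetical name with a multi-word surname
]

titles = [extract_title(name) for name in names]
print(titles)  # ['Mr', 'Miss', 'Master', 'Mr']
```

A named function like this can be passed to `apply` in place of the lambda, e.g. `dataset['Name'].apply(extract_title)`.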
