# Lab 1 - Python

## Purpose
The purpose of this notebook is to make you more confident coding with Python and Jupyter Notebooks, which we will be using for the rest of the year (also in the second semester in Inteligência Artificial).

## Methodology
Review basic operations, data types and programming concepts in Python. Solve programming challenges.

## Results
Fast track to programming in Python.

---

## Setup

### Library import

In [1]:
import numpy as np

#### Numpy

The fundamental package for scientific computing with Python

<img src="https://numpy.org/images/logo.svg" width="100" height="100">

[NumPy](https://numpy.org/)

---

# Review

## Types

https://docs.python.org/3.6/library/stdtypes.html

### Boolean

In [2]:
True

True

In [3]:
False

False

In [4]:
type(True)

bool

#### Boolean Operations 

Suppose `x` and `y` are Boolean variables:

    
<table><thead>
<tr>
<th style="text-align: center">Operator</th>
<th>Description</th>
<th>Code</th>
</tr>
</thead><tbody>
<tr>
<td style="text-align: center">and</td>
<td>True if both are true</td>
<td><code>x and y</code></td>
</tr>
<tr>
<td style="text-align: center">or</td>
<td>True if at least one is true</td>
<td><code>x or y</code></td>
</tr>
<tr>
<td style="text-align: center">not</td>
<td>True if the operand is false</td>
<td><code>not x</code></td>
</tr>
</tbody></table>

In [5]:
x = True
y = False

In [6]:
x and y

False

In [7]:
x and not y

True
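To see every combination at once, the table above can be reproduced with a short loop. This is only a sketch; `truth_rows` is a name introduced here for illustration.

```python
from itertools import product

# One row per combination of x and y: (x, y, x and y, x or y, not x)
truth_rows = [(x, y, x and y, x or y, not x)
              for x, y in product([True, False], repeat=2)]

for row in truth_rows:
    print(row)
```

The loop variables live inside the comprehension, so the `x` and `y` defined earlier are not overwritten.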

---

###  Numeric 
- int
- float

In [8]:
a_integer = 2
type(a_integer)

int

In [9]:
a_float = 2.0
type(a_float)

float

Cast from Float to Integer

In [10]:
int(2.0)

2

Cast from Integer to Float

In [11]:
float(2)

2.0
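One detail worth checking when casting: `int()` drops the fractional part instead of rounding. The sketch below uses illustrative variable names that are not part of the lab.

```python
# int() truncates toward zero; it does not round
truncated_pos = int(2.9)     # 2
truncated_neg = int(-2.9)    # -2

# round() goes to the nearest integer instead
rounded = round(2.9)         # 3

# Mixing int and float in arithmetic yields a float
mixed = 2 + 2.0              # 4.0

print(truncated_pos, truncated_neg, rounded, mixed)
```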

---

### Sequence Types

#### Text Sequence or String

In [12]:
a_string = "Data"
type(a_string)

str

Concatenate Strings

In [13]:
a_string + " Science"

'Data Science'

#### List,  Mutable Dynamic Arrays


Empty List

In [14]:
list_array = []
list_array

[]

Multiple types 

In [15]:
list_array = ["one", "two", 3, 4.0]
list_array

['one', 'two', 3, 4.0]

Mutability

In [16]:
list_array[1] = "A"
list_array

['one', 'A', 3, 4.0]

Append at the end

In [17]:
list_array.append("Data")
list_array.append("Science")
list_array.append("Intro")
list_array.append(10)
list_array

['one', 'A', 3, 4.0, 'Data', 'Science', 'Intro', 10]

Remove at the end

In [18]:
list_array.pop()
list_array

['one', 'A', 3, 4.0, 'Data', 'Science', 'Intro']

Remove by index

In [19]:
del list_array[1]
list_array

['one', 3, 4.0, 'Data', 'Science', 'Intro']

Size of list

In [20]:
len(list_array)

6

Accessing list elements

In [21]:
list_array[0]

'one'

In [22]:
list_array[-1]

'Intro'

Indexing or **Slicing** List

```python
start:end(exclusive):step
```

The full list, for reference

In [23]:
list_array

['one', 3, 4.0, 'Data', 'Science', 'Intro']

From the first element up to, but not including, the last, taking every second value

In [24]:
list_array[0:-1:2]

['one', 4.0, 'Science']
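A few more slices may help fix the `start:end:step` pattern. This sketch uses a hypothetical list, `letters`, so that positions are easy to read off.

```python
letters = ["a", "b", "c", "d", "e", "f"]  # hypothetical example list

first_two = letters[0:2]        # ['a', 'b']  (index 2 is excluded)
from_third = letters[2:]        # ['c', 'd', 'e', 'f']
every_other = letters[::2]      # ['a', 'c', 'e']
reversed_copy = letters[::-1]   # ['f', 'e', 'd', 'c', 'b', 'a']

print(first_two, from_third, every_other, reversed_copy)
```

Omitting `start` or `end` means "from the beginning" or "to the end", and a negative `step` walks the list backwards.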

---

#### Range

In [25]:
range(4)

range(0, 4)

In [26]:
list(range(4))

[0, 1, 2, 3]

In [27]:
list(range(2, 4))

[2, 3]

---

#### Tuples

In [28]:
a_tuple = ("one", "two", "three")
a_tuple

('one', 'two', 'three')

Tuples are immutable! For example, trying to update or delete a value raises a `TypeError`.

In [29]:
a_tuple[1] = "hello"

TypeError: 'tuple' object does not support item assignment

In [30]:
del a_tuple[1]

TypeError: 'tuple' object doesn't support item deletion

Adding elements to a tuple creates a new one

In [None]:
a_second_tuple = a_tuple + (23,)
a_second_tuple

Identity of each object (in CPython, `id()` returns its memory address)

In [None]:
id(a_tuple)

In [None]:
id(a_tuple + (23,))
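The same point can be checked without comparing raw `id()` values, using the `is` operator. A small sketch, with `base` and `extended` as illustrative names:

```python
base = ("one", "two", "three")   # hypothetical names for illustration
extended = base + (23,)          # concatenation builds a new tuple

print(base)                # the original tuple is unchanged
print(extended)
print(extended is base)    # False: two distinct objects
```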

---

### Mapping Types

#### Dictionary

A dictionary is a mapping object composed of *keys* and *values*. The *keys* need to be hashable (lists and dictionaries are not).

For example `{'data': 0, 'science': 0}`
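The hashability requirement is easy to verify. In this sketch (the names `point_names` and `error_message` are illustrative), a tuple works as a key while a list raises a `TypeError`.

```python
point_names = {(0, 0): "origin"}   # a tuple is hashable, so it can be a key

try:
    bad = {[0, 0]: "origin"}       # a list is not hashable
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```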


In [None]:
a_dict = dict()
a_dict

In [None]:
a_dict = {}
a_dict

In [None]:
a_dict = {1: 2}
a_dict

In [None]:
type(a_dict)

In [None]:
a_dict = {'data': 2, 1: 2, 'science': 4}
a_dict

Access the value of a key

In [None]:
a_dict['data']

Delete entry with the key ```data```

In [None]:
del a_dict['data']

In [None]:
a_dict

---

## Membership Operators

In [None]:
5.0 in [ 1, 2, 3, 4]

In [31]:
5.0 not in [ 1, 2, 3, 4]

True
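Membership tests compare by equality and work on other containers too. The following sketch uses illustrative variables; note that on a dictionary, `in` looks at the keys only.

```python
numbers = [1, 2, 3, 4]
print(4.0 in numbers)      # True: 4.0 == 4

text = "Data Science"
print("Sci" in text)       # True: substring test

counts = {"data": 2, "science": 4}
print("data" in counts)    # True: "data" is a key
print(2 in counts)         # False: 2 is a value, not a key
```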

---

## Control Flow Tools

### For Statements

```python
for <var> in <iterable>:
    <statement(s)>
```

`iterable` is a collection of objects, for example a list or tuple. The `statement(s)` in the loop body are denoted by indentation, as with all Python control structures, and are executed once for each item in `iterable`. The loop variable `var` takes on the value of the next element in `iterable` each time through the loop.

In [32]:
for value in [1,2,3,4]:
    print(value)

1
2
3
4


#### range is often used as the iterable

```python
    for <var> in range(limit):
        <statement(s)>
```

In [33]:
for value in range(len(list_array)):
    print(value)

0
1
2
3
4
5
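When both the position and the element are needed, the built-in `enumerate` is a common alternative to `range(len(...))`. A minimal sketch with a hypothetical list, `items`:

```python
items = ["one", 3, 4.0]   # hypothetical list for illustration

# enumerate yields (index, element) pairs
for index, item in enumerate(items):
    print(index, item)

pairs = list(enumerate(items))
```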


### If Statements

```python
    if <expr>:
        <statement>
    elif <expr>:
        <statement>
    else:
        <statement>
```

`expr` is an expression evaluated in a Boolean context
    
`statement` is a valid Python statement, which must be indented

In [34]:
if 2 > 4:
    print("2 > 4")
else:
    print("2 <= 4")

2 <= 4


### List Comprehensions

Create a list from the elements of a collection in a single expression

```python
[ expression for a_value in a_collection ]

```

In [35]:
[ a_value + 10 for a_value in [1,2,3,4,5] ]

[11, 12, 13, 14, 15]

List comprehension with conditions

```python
[ expression for a_value in a_collection if condition ]

```

In [36]:
[ a_value + 10 for a_value in [1,2,3,4,5] if a_value >= 3 ]

[13, 14, 15]
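For comparison, the conditional comprehension above can be written as an explicit loop. The sketch below should produce the same list; `result` is an illustrative name.

```python
result = []
for a_value in [1, 2, 3, 4, 5]:
    if a_value >= 3:                  # the "if condition" part
        result.append(a_value + 10)   # the "expression" part

print(result)
```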

---

## Functions

In [37]:
def functionName(argument_one, argument_two):
    """
    Comment what it does
    """
    
    
    do_stuff
    
    return value


In [38]:
functionName?

### Call a function

```python
functionName(argument_one, argument_two)
```

In [40]:
def soma(x, y):
    """
    Return the sum of x and y
    """

    return x + y

In [41]:
soma(2, 4)

6

### Built-in Functions

We have already used some. But there are more:

https://docs.python.org/3/library/functions.html

## Code Standards

[PEP 8: Function and Variable Names](https://www.python.org/dev/peps/pep-0008/#function-and-variable-names)  

**Summary**
 - Use snake case for functions and variables: lowercase, with words separated by underscores as necessary to improve readability
     - data_science  


 - For constants use all capital letters with underscores separating words
     - DATA_SCIENCE  


 - Use CapWords convention for classes
     - DataScience
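Put together, the three conventions might look like this. All names below are made up for illustration.

```python
MAX_ITERATIONS = 100                # constant: capitals with underscores


class DataScience:                  # class: CapWords
    pass


def compute_mean(values):           # function: snake case
    running_total = sum(values)     # variable: snake case
    return running_total / len(values)


print(compute_mean([1, 2, 3]))
```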

# Challenges - Part I

**Cosine similarity from scratch**


## Question 1/6

The dot product between two vectors $v1$ and $v2$ of size $n$ is defined as:

$$\sum_{i=1}^n v1_i \cdot v2_i $$

Write the `for` loop needed to compute the dot product between any two vectors of the same size.



In [42]:
# SOLUTION 1

v1 = [1,2,3,4]
v2 = [3,2,5,4]

dot_prod = 0
for i in range(len(v1)):
    dot_prod += v1[i]*v2[i]
print(dot_prod)

38


In [43]:
# SOLUTION 2

v1 = [1,2,3,4]
v2 = [3,2,5,4]

res = []
if len(v1) == len(v2): # check that the vectors have the same size
    for i in range(len(v1)): # here we use a traditional for loop
        res.append(v1[i]*v2[i])
    res = sum(res)

print('the dot product of both vectors is', res)    


the dot product of both vectors is 38


## Question 2/6

Try using the built-in function `zip`. Pass vectors `v1` and `v2` as parameters.

**Note.** `zip` produces an object of class `zip`, but we can cast it as a `list` using the built-in function `list()` on that object

In [44]:
# SOLUTION

# define a variable x and assign it the result of zip(v1, v2)

x = zip(v1,v2)

# apply the list function to x to see the contents of the result

list(x)


[(1, 3), (2, 2), (3, 5), (4, 4)]

Based on the result obtained, explain what the `zip` function does.

In [45]:
# your answer here

In [46]:
# SOLUTION

# zip pairs up the elements of its arguments: the n-th tuple holds the n-th component of each vector.
# It returns a lazy iterator, which list() turns into a list of tuples.
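Two details of `zip` are easy to miss: it stops at the shortest input, and the object it returns can be consumed only once. The sketch below uses illustrative lists of different lengths.

```python
short = [1, 2, 3]
long = ["a", "b", "c", "d"]

pairs = zip(short, long)
first_pass = list(pairs)    # 'd' has no partner, so it is dropped
second_pass = list(pairs)   # the iterator is already exhausted

print(first_pass)
print(second_pass)
```

This is why the solutions in this lab compare `len(v1)` and `len(v2)` before zipping: a length mismatch would otherwise go unnoticed.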

---
## Question 3/6

Try using the built-in function `zip()` to compute the dot product in a single line of code.

- **Hint 1:** use a list comprehension
- **Hint 2:** use the built-in function `sum()` to obtain the sum of all values of a list

In [47]:
# SOLUTION

if len(v1) == len(v2):
    res = sum([item[0]*item[1] for item in zip(v1,v2)])
else:
    res = -1
res

38
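The solution above indexes each pair with `item[0]` and `item[1]`. An alternative sketch unpacks the pair directly in the loop and passes a generator expression to `sum()`, so no intermediate list is built. The names `a` and `b` are illustrative.

```python
v1 = [1, 2, 3, 4]
v2 = [3, 2, 5, 4]

# Unpack each (v1_i, v2_i) pair into a and b
res = sum(a * b for a, b in zip(v1, v2))
print(res)
```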

---
## Question 4/6

The norm of a vector (its magnitude) is defined as:

$$ \lVert x \rVert = \sqrt{\sum_{i=1}^n x_i^2}$$

Write a `for` loop that computes the norm of a given vector $v$.

In [48]:
# SOLUTION v1

total = 0
for item in range(len(v1)):
    total += v1[item]**2
total**0.5

5.477225575051661

In [49]:
# SOLUTION v2 - Pythonic way

import math

v1 = [1,2,3,4]

norm = math.sqrt(sum([item**2 for item in v1]))
norm

# list comprehension:
# One of the language’s most distinctive features is the list comprehension, which you can use to create powerful
# functionality within a single line of code.
# TO DO...

5.477225575051661

## Question 5/6

Recall how to create a function:

```python
def name_of_the_function(list of arguments):
    do something
    ...
    return the_result
```

1. Define a function that computes the dot product of two vectors $v1$ and $v2$

2. Define a function that computes the norm of a vector $v$


In [50]:
# dot product

def dot_product(v1, v2):
    if len(v1) == len(v2):
        res = sum([item[0]*item[1] for item in zip(v1,v2)])
    else:
        res = -1
    return res



# norm

def norm_l2(v):
    return math.sqrt(sum([item**2 for item in v])) 

In [51]:
# Now that both functions are defined, you can call them

dot_product(v1, v2)

38

In [52]:
norm_l2(v1)

5.477225575051661

In [53]:
norm_l2(v2)

7.3484692283495345
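One caveat about `dot_product`: it returns `-1` when the lengths differ, but `-1` is also a legitimate dot product (for example, of `[1]` and `[-1]`). A possible alternative, sketched here under the hypothetical name `dot_product_strict`, raises an exception instead.

```python
def dot_product_strict(v1, v2):
    """Hypothetical variant that raises instead of returning -1."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(v1, v2))


print(dot_product_strict([1, 2, 3, 4], [3, 2, 5, 4]))
print(dot_product_strict([1], [-1]))   # a genuine result of -1
```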

## Question 6/6

Cosine similarity is a widely used measure of similarity in Data Science. 

Given two vectors $v1$ and $v2$, it is defined as:

$$
  \cos(v1, v2) = \frac{v1 \cdot v2}{\lVert v1 \rVert \times \lVert v2 \rVert}
$$  

Using the above functions, define a new one that computes this similarity.


In [54]:
# SOLUTION

def sim_cos(v1, v2):
    return dot_product(v1,v2)/ (norm_l2(v1)*norm_l2(v2))

sim_cos(v1, v2)
  

0.9441175904999112
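A few special cases help build intuition for the measure. The sketch below restates the formula in a self-contained function (the name `cosine` is illustrative) and evaluates it on vectors whose angle is known.

```python
import math


def cosine(u, v):
    """Self-contained restatement of the cosine similarity formula."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


same_direction = cosine([1, 2], [2, 4])    # parallel: close to 1
orthogonal = cosine([1, 0], [0, 1])        # perpendicular: 0
opposite = cosine([1, 2], [-1, -2])        # opposite: close to -1

print(same_direction, orthogonal, opposite)
```

Because of floating-point rounding, the parallel and opposite cases may differ from 1 and -1 in the last digits. A zero vector has norm 0, so the division raises `ZeroDivisionError`; handling that case is left open here.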

# Calculator Challenge
Write a Python function that accepts three parameters.  
- The first parameter is a numeric value.
- The second is one of the following mathematical operators: `+`, `-`, `/`, or `.` (where `.` denotes multiplication).
- The third parameter will also be a numeric value

The function should perform the calculation and return the result.

For example, parameters `(2.5, '.', 3)` should return `7.5`.

In [55]:
# Solution

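def calculator(int1: float, operation: str, int2: float):
    """
    Apply `operation` to int1 and int2 and return the result.
    Any operator other than '+', '-' or '/' is treated as multiplication.
    """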
    if operation == '+':
        return int1+int2
    elif operation == '-':
        return int1-int2
    elif operation == '/':
        return int1/int2
    else:
        return int1*int2
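The `if`/`elif` chain can also be expressed as a dictionary lookup using the standard `operator` module. This is one possible sketch, not the expected solution; `OPERATIONS` and `calculator_lookup` are names introduced here, and `'.'` is mapped to multiplication to match the example in the problem statement.

```python
import operator

# Map each operator symbol to the function that implements it
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "/": operator.truediv,
    ".": operator.mul,   # '.' stands for multiplication
}


def calculator_lookup(left, operation, right):
    """Hypothetical dictionary-based variant of the calculator."""
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operator: {operation!r}")
    return OPERATIONS[operation](left, right)


print(calculator_lookup(2.5, ".", 3))
```

Unlike the `else` branch above, this version rejects unknown operators instead of silently multiplying.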
    

# Extra Challenges

Write list comprehensions that filter an existing list of strings, keeping:

1. first, the strings with more than four letters
2. second, the strings with at least two different vowels
3. finally, the strings that satisfy both conditions


In [56]:
# Solution 1

my_words = ['pyramid', 'car', 'quiet', 'data', 'read']

[item for item in my_words if len(item) > 4]

['pyramid', 'quiet']

In [57]:
# Solution 2 part 1

def count_distinct_vowels(word):
    vowels = ['a','e','i','o', 'u']
    cnt = set([letter for letter in word if letter in vowels])
    return len(cnt)

count_distinct_vowels('pyramid')

2

In [58]:
# Solution 2 part 2

[item for item in my_words if count_distinct_vowels(item) >= 2]

['pyramid', 'quiet', 'read']

In [59]:
# Solution 3

[item for item in my_words if len(item) > 4 and count_distinct_vowels(item) >= 2]

['pyramid', 'quiet']
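The vowel count can also be written with set intersection. In this sketch, `count_distinct_vowels_set` is a hypothetical variant; like the original, it assumes lowercase input.

```python
def count_distinct_vowels_set(word):
    """Hypothetical variant: distinct vowels via set intersection."""
    return len(set(word) & set("aeiou"))


my_words = ['pyramid', 'car', 'quiet', 'data', 'read']

selected = [word for word in my_words
            if len(word) > 4 and count_distinct_vowels_set(word) >= 2]
print(selected)
```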