# Practical session week 1 - Python foundation

## Purpose
The purpose of this practical session is to make you more confident coding in Python and using Jupyter Notebooks, which we will use for the rest of the year, both for Data Science and Artificial Intelligence.


## Goals
1. Work with Jupyter notebooks, cells, code and markdown
2. Review the basic Python data structures, operators and control flow
3. Review the use and importance of libraries to extend Python's power

## Results
Fast track to programming in Python.

---


# Python Review

## Basic data types and their methods

https://docs.python.org/3.13/library/stdtypes.html

### Boolean

In [3]:
import math

#create a Boolean variable 'a' set to True
a = True

#create a Boolean variable 'b' set to False
b = False

#### Boolean Operations

Suppose `x` and `y` are Boolean variables:

    
<table><thead>
<tr>
<th style="text-align: center">Operator</th>
<th>Description</th>
<th>Code</th>
</tr>
</thead><tbody>
<tr>
<td style="text-align: center">and</td>
<td>True if both are true</td>
<td><code>x and y</code></td>
</tr>
<tr>
<td style="text-align: center">or</td>
<td>True if at least one is true</td>
<td><code>x or y</code></td>
</tr>
<tr>
<td style="text-align: center">not</td>
<td>True only if the operand is false</td>
<td><code>not x</code></td>
</tr>
</tbody></table>
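
As a minimal sketch of the table above, the three operators can be applied to two Boolean values. The names `x` and `y` follow the table, and the values chosen are only illustrative.

```python
# Illustrative values; any pair of Booleans works
x = True
y = False

both = x and y      # True only if both operands are true
either = x or y     # True if at least one operand is true
negated = not x     # flips the truth value of x

print(both, either, negated)  # prints: False True False
```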

In [None]:
# test all three operators on a and b


---

###  Numeric
- int
- float

In [4]:
#Initialize an integer variable 'a' with the value 10
#and print the type
a = 10
print(type(a))

<class 'int'>

In [5]:
#Initialize a float variable 'a' with the value 10.1
#and print the type
a = 10.1
print(type(a))

<class 'float'>

Cast from Float to Integer

In [6]:
#integer part of a float can be extracted using int()
int(10.4)

10

Cast from Integer to Float

In [7]:
#convert an integer to a float using float()
float(10)

10.0
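
One detail worth checking for yourself: `int()` truncates toward zero rather than rounding to the nearest integer. A short sketch, with variable names and values chosen only for illustration:

```python
# int() drops the fractional part (truncation toward zero)
truncated_positive = int(10.9)
truncated_negative = int(-10.9)
print(truncated_positive, truncated_negative)  # prints: 10 -10

# round() picks the nearest integer instead
rounded = round(10.9)
print(rounded)                                 # prints: 11

# float() converts an integer to its floating-point equivalent
as_float = float(10)
print(as_float)                                # prints: 10.0
```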

---
### Sequence Types

#### Text Sequence or String

In [9]:
#strings can be assigned by passing data in text format, either in single or double quotes
#test both cases for 'Data'
a = 'Data'
b = 'BigData'

#test if the two strings are equal
a == b

#checking the type of the variable
type(a)

str

Concatenate Strings

In [11]:
#to concatenate strings, just use the "+" operator.
#create a new string by joining a and b with "+"
c = a + b
c

'DataBigData'
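
Note that `+` joins strings exactly as given and never inserts a space. A small sketch of building "Data Science" (the variable names here are illustrative, not part of the exercise):

```python
first = 'Data'
second = "Science"   # single and double quotes create the same kind of string

# add the space explicitly when concatenating
title = first + ' ' + second
print(title)             # prints: Data Science

# the quote style does not affect equality
print('Data' == "Data")  # prints: True
```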

#### List,  Mutable Dynamic Arrays


Empty List

In [12]:
#create a list using square brackets
list1=[]

#check the type
type(list1)

list

Multiple types

In [13]:
#in Python, a list can hold values of different types
#try this code "list_array = ["one", "two", 3, 4.0]"
#check the type of the list (each element keeps its own type)

list_array = ["one", "two", 3, 4.0]
type(list_array)

list

Get list elements

In [14]:
#access the second element on the list and print the value.
list_array[1]

'two'

In [15]:
#You can also use negative numbers to access elements in reverse order
#test by printing the -1 element
list_array[-1]

4.0

Mutability

In [16]:
#items in a list can be accessed by passing the index inside square brackets
#lists can also be modified using the '=' operator
#try changing a value in the list
list_array[-1] = 'ultimo'
list_array[-1]

'ultimo'

Append at the end

In [17]:
#to add a new item to the end of the list, just use the append() method
#add an item to the list using append()
list_array.append('ultimo_mesmo')

#print the contents of the list
list_array

['one', 'two', 3, 'ultimo', 'ultimo_mesmo']

Remove the last element

In [18]:
#to remove an item from a list, use pop
#if no parameter is passed inside the parentheses, it removes the last item
#pop always returns the item that was removed from the list
#try it
list_array.pop()
list_array

['one', 'two', 3, 'ultimo']

Remove by index

In [21]:
#It is possible to pass an index within the parentheses to remove a specific item
#pop always returns the item that was removed from the list
#remove the item at index 2
list_array.pop(2)
list_array

['one', 'two', 'ultimo']

In [23]:
#we can also use the 'del' statement
#try to remove an item from the list
del list_array[0]
list_array

['two', 'ultimo']
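
The difference between `pop()` and `del` is easier to see when the return value is captured. A minimal sketch using a fresh list (`letters` is a name made up for this example):

```python
letters = ['a', 'b', 'c', 'd']

last = letters.pop()       # removes and returns the last item
second = letters.pop(1)    # removes and returns the item at index 1
print(last, second)        # prints: d b

del letters[0]             # del removes the item but returns nothing
print(letters)             # prints: ['c']
```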

Size of list

In [24]:
#We can quickly get the size of a list using 'len'
#What is the final size of the list after the various removals?
len(list_array)

2

Indexing or **Slicing** a List

```python
start:end(exclusive):step
```

From the first to the third element, all values

In [28]:
#Initially, create a list with 5 elements
#Then, print the first 3 elements using slice
a = [1,2,3,4,5]
a[0:3]

[1, 2, 3]

From the first to the third element, every second value

In [29]:
#now slice the first 3 elements again, taking every second one (step 2)
a[0:3:2]

[1, 3]
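
Any of the three slice parts can be omitted, and the step can be negative. A few more cases on the same kind of list, as a sketch (`nums` is an illustrative name):

```python
nums = [1, 2, 3, 4, 5]

print(nums[:3])     # start omitted, first three: [1, 2, 3]
print(nums[2:])     # end omitted, from index 2 on: [3, 4, 5]
print(nums[::2])    # every second value: [1, 3, 5]
print(nums[::-1])   # negative step reverses: [5, 4, 3, 2, 1]
print(nums[-2:])    # last two values: [4, 5]
```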

---
#### Range

In [32]:
#range produces a sequence of consecutive integers (it is not a list itself)
#try range(1, 5)
#how can you print a range? convert it to a list first
b = list(range(1, 5))
b

[1, 2, 3, 4]

In [33]:
#Also create a range from 2 up to 7 (7 not included)
c = list(range(2, 7))
c

[2, 3, 4, 5, 6]
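
`range` also accepts a step as a third argument, and it produces its numbers on demand rather than storing them, which is why it has to be converted before its values can be displayed. A short sketch (names are illustrative):

```python
r = range(0, 10, 3)      # start, stop (exclusive), step
print(r)                 # prints: range(0, 10, 3)

stepped = list(r)
print(stepped)           # prints: [0, 3, 6, 9]

countdown = list(range(5, 0, -1))   # a negative step counts down
print(countdown)         # prints: [5, 4, 3, 2, 1]

print(len(range(4)))     # prints: 4
```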

---
#### Tuples

In [34]:
#A tuple stores a fixed group of related values together
#Tuples are created using regular parentheses
#create a tuple ("one", "two", "three")
d = ("one", "two", "three")
d

('one', 'two', 'three')

In [35]:
#it is possible to access a tuple value by passing its index inside []
#Try to access the second element of the tuple.
d[1]

'two'

Tuples are immutable! For example, trying to update or delete one of the values raises an error.

In [36]:
#It is not possible to update a value in a tuple
#try to change the second element of the tuple using "="
d[1] = 2
d

TypeError: 'tuple' object does not support item assignment

Adding elements to a tuple creates a new one

In [39]:
#one solution would be to create a new tuple from an existing one
#add the one-element tuple (23,) to the tuple you have just created and store the result in a new tuple called a_second_tuple
a_second_tuple = d + (23,)
a_second_tuple
#However, it is important to emphasize that the first tuple continues to exist and remains immutable.


('one', 'two', 'three', 23)

Memory address of each variable

In [40]:
#to prove this, we can get the identity of each tuple using the id() function
#test both created tuples to see if the addresses are different
id(a_second_tuple)

2189330017680
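
Since the numbers returned by `id()` change from run to run, comparing them is more informative than reading them. A sketch that recreates the two tuples from above:

```python
d = ("one", "two", "three")
a_second_tuple = d + (23,)

# the concatenation built a new object, so the identities differ
same_id = id(d) == id(a_second_tuple)
print(same_id)               # prints: False
print(d is a_second_tuple)   # the same check with the 'is' operator: False

# the original tuple is unchanged
print(d)                     # prints: ('one', 'two', 'three')
```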

---

### Mapping Types

#### Dictionary

A dictionary is a mapping object composed of *keys* and *values*. The *keys* need to be hashable (lists and dictionaries are not).

For example `{'data': 0, 'science': 0}`
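
To see what "hashable" means in practice, here is a sketch that tries a tuple and then a list as a key. The dictionary name `coords` and the flag `list_key_failed` are invented for the example.

```python
coords = {}

# a tuple of immutable values is hashable, so it can be a key
coords[(0, 1)] = 'north'
print(coords)   # prints: {(0, 1): 'north'}

# a list is mutable and therefore not hashable
list_key_failed = False
try:
    coords[[0, 1]] = 'north'
except TypeError as error:
    list_key_failed = True
    print('TypeError:', error)   # the message mentions an unhashable type
```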


In [41]:
#create a dictionary
#use dict() or {}
#test both ways
a_dict = dict()

#test using type(), the dictionary you just created
print(type(a_dict))

<class 'dict'>


Access the value of a key

In [42]:
#dictionaries accept various types of values
#keys need to be hashable
#create a dictionary a_dict = {'data': 2, 1: 2, 'science': 4}
a_dict = {'data': 2, 1: 2, 'science': 4}
a_dict

#Note that the keys are not of the same type.

{'data': 2, 1: 2, 'science': 4}

In [43]:
#access the value stored under the key 'data'
a_dict['data']

2

Delete entry with the key ```data```

In [44]:
#It is possible to delete an entry from a dict using the del statement
#delete the entry with the key 'data' and print the dictionary
del a_dict['data']
print (a_dict)

{1: 2, 'science': 4}
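
Looking up a missing key with square brackets raises `KeyError`. Two common ways around that are the `in` operator and `dict.get()`, sketched here on a dictionary like the one above:

```python
a_dict = {1: 2, 'science': 4}

print('data' in a_dict)        # 'in' checks the keys: False
print(a_dict.get('data'))      # get() returns None for a missing key
print(a_dict.get('data', 0))   # or a default you supply: 0

# items() yields (key, value) pairs
for key, value in a_dict.items():
    print(key, value)
```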


---

## Operator "in"

In [46]:
#test if the float 0.5 is in list a=[ 1, 2, 3, 4] using "in"
a=[ 1, 2, 3, 4]
0.5 in a

False
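
`in` compares by equality, not by type, and it works on other containers as well. A brief sketch:

```python
a = [1, 2, 3, 4]

print(1.0 in a)                 # 1.0 == 1, so this prints: True
print('Sci' in 'Data Science')  # substring test on strings: True
print(5 not in a)               # 'not in' is the negated form: True
```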

---

## Control Flow Tools

### For Statements

```python
for <var> in <iterable>:
    <statement(s)>
```

In [47]:
#a for loop iterates over an object and yields each of its elements in turn
#using the structure above, create a for loop that iterates over the list a = [1, 2, 3, 4] and prints the values
for x in a:
    print(x)


1
2
3
4


#### range is often used as the iterable

```python
    for <var> in range(limit):
        <statement(s)>
```

In [48]:
#We can also use range to walk through the indices of a list
#using the len() function, create a for loop with range that prints the indices of the list a = [1, 2, 3, 4]
for i in range(len(a)):
    print(i)


0
1
2
3
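
When both the position and the value are needed, `enumerate` is a common alternative to `range(len(...))`. A sketch on a small list (`values` and `pairs` are illustrative names):

```python
values = [1, 2, 3, 5]

# index-based access through range and len
for i in range(len(values)):
    print(i, values[i])

# enumerate yields (index, value) pairs directly
pairs = []
for i, value in enumerate(values):
    pairs.append((i, value))
print(pairs)   # prints: [(0, 1), (1, 2), (2, 3), (3, 5)]
```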


### If Statements

```python
    if <expr>:
        <statement>
    elif <expr>:
        <statement>
    else:
        <statement>
```

`expr` is an expression evaluated in a Boolean context
    
`statement` is a valid Python statement, which must be indented

In [50]:
#If in Python works just like in other languages, but without the need for '()' or '{}'
#The body of each branch must be indented
#Create an if statement that tests if 2 > 4. If true, print 'yes', otherwise print 'no'
if 2 > 4:
    print('yes')
else:
    print('no')

no
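
The `elif` branch from the template is not exercised by a two-way test. A sketch with all three branches, where the threshold 4 and the labels are arbitrary choices:

```python
labels = []
for n in [2, 4, 6]:
    if n > 4:
        label = 'greater than 4'
    elif n == 4:
        label = 'exactly 4'
    else:
        label = 'less than 4'
    labels.append(label)

print(labels)  # prints: ['less than 4', 'exactly 4', 'greater than 4']
```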


### List Comprehensions

Create a list from the elements of a collection in a single statement

```python
[ expression for a_value in a_collection ]

```

In [53]:
#create a loop structure that iterates over the list [1,2,3,4,5] and prints the values +10 Ex: [11,12,13,14,15]
a = [1,2,3,4,5]
for x in a:
    print(x + 10)


#now using a list comprehension, build the same values as a new list
#(keep print() outside: a comprehension is meant to build a list, not to run side effects)

b = [x + 10 for x in a]
print(b)


11
12
13
14
15
[11, 12, 13, 14, 15]


List comprehension with conditions

```python
[ expression for a_value in a_collection if condition ]

```

In [59]:
#Following the same structure above, now build the number plus 10 for only the values greater than 3 in the list [1, 2, 3, 4, 5].
b = [i + 10 for i in [1, 2, 3, 4, 5] if i > 3]
print(b)


[14, 15]


---

## Functions

In [60]:
def functionName(argument_one, argument_two):
    """
    Comment what it does
    """


    do_stuff

    return value


In [None]:
functionName?

### Call a function

```python
functionName(argument_one, argument_two)
```

In [62]:
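#using the structure above, create a function that takes two numbers as parameters and returns their sum
#avoid naming it 'sum': that would shadow the built-in sum() used later in this notebook

def add_numbers(a, b):
    return a + b

add_numbers(1, 2)

3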
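
The docstring from the template is what `functionName?` (or the built-in `help()`) displays. Here is a sketch of a documented function with a default argument; `scale` and its parameters are invented for illustration.

```python
def scale(value, factor=2):
    """Return value multiplied by factor (factor defaults to 2)."""
    return value * factor

doubled = scale(5)             # uses the default factor
tripled = scale(5, factor=3)   # a keyword argument overrides it
print(doubled, tripled)        # prints: 10 15

print(scale.__doc__)           # the docstring is stored on the function
```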

### Built-in Functions

We have already used some. But there are more:

https://docs.python.org/3/library/functions.html

## Code Standards

[PEP 8: Function and Variable Names](https://www.python.org/dev/peps/pep-0008/#function-and-variable-names)  

**Summary**
 - Use snake_case for functions and variables: lowercase, with words separated by underscores as necessary to improve readability
     - data_science  


 - For constants use all capital letters with underscores separating words
     - DATA_SCIENCE  


 - Use CapWords convention for classes
     - DataScience
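
Put together, the three conventions look like this. All names below are made up for the example.

```python
MAX_ITERATIONS = 100                     # constant: capitals with underscores


class DataScience:                       # class: CapWords
    def __init__(self, course_name):
        self.course_name = course_name   # attribute: snake_case


def count_words(text):                   # function: snake_case
    word_count = len(text.split())       # variable: snake_case
    return word_count


course = DataScience('week one')
print(count_words(course.course_name))   # prints: 2
```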

# Challenges - Part I

**Cosine similarity from scratch**


## Question 1/6

The dot product between two vectors $v1$ and $v2$ of size $n$ is defined as:

$$\sum_{i=1}^n v1_i \cdot v2_i $$

Write the `for` loop needed to compute the dot product between any two vectors of the same size.



In [3]:
# SOLUTION 1

# creating the two vectors
v1 = [1,2,3,4]
v2 = [3,2,5,4]

result = 0
for index in range(len(v1)):
    result += v1[index] * v2[index]
result

38

## Question 2/6

Try using the built-in function `zip`. Pass vectors `v1` and `v2` as parameters.

**Note.** `zip` produces an object of class `zip`, but we can convert it to a `list` by calling the built-in function `list()` on that object.

In [2]:
list_2=list(zip(v1,v2))
list_2


[(1, 3), (2, 2), (3, 5), (4, 4)]

Based on the result obtained, explain what the `zip` function does.

In [None]:
It joins the elements at matching positions into tuples
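
One behaviour worth knowing: `zip` stops at the end of the shortest input, so extra values in a longer list are silently dropped. A sketch with lists of different lengths (names and values are arbitrary):

```python
shorter = [1, 2]
longer = [10, 20, 30]

pairs = list(zip(shorter, longer))
print(pairs)      # prints: [(1, 10), (2, 20)]

# each tuple can be unpacked directly in a for loop
products = []
for x, y in zip(shorter, longer):
    products.append(x * y)
print(products)   # prints: [10, 40]
```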


---
## Question 3/6

Try using the built-in function `zip()` to compute the dot product in a single line of code.

- **Hint 1:** use a list comprehension
- **Hint 2:** use the built-in function `sum()` to obtain the sum of all values of a list

In [12]:
# zip pairs up the two lists element by element (ignoring excess values)
# each pair is unpacked into x and y, multiplied, and the products are summed
v1 = [1,2,3,4]
v2 = [3,2,5,4]
sum([x * y for x, y in zip(v1, v2)])



38

---
## Question 4/6

The norm of a vector (its magnitude) is defined as:

$$ \lVert x \rVert = \sqrt{\sum_{i=1}^n x_i^2}$$

Write a `for` loop that computes the norm of a given vector $v$.

In [15]:
import math
v1 = [2,2,2,2]
result = 0
for v in v1:
    result += v * v

# take the square root once, after the loop has summed all the squares
new_result = math.sqrt(result)

new_result


4.0

## Question 5/6

Recall how to create a function:

```python
def name_of_the_function(list of arguments):
    do something
    ...
    return the_result
```

1. Define a function that computes the dot product of two vectors $v1$ and $v2$

2. Define a function that computes the norm of a vector $v$

3. Call both functions

In [17]:
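def dot_product(v1, v2):
    """Return the dot product of two vectors of the same size."""
    result = 0
    for index in range(len(v1)):
        result += v1[index] * v2[index]
    return result

def norm(v):
    """Return the norm (magnitude) of a vector."""
    result = 0
    for value in v:
        result += value * value
    # take the square root once, after the loop
    return math.sqrt(result)

print(dot_product([1,2],[1,2]))
print(norm([2,2,2,2]))


5
4.0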

## Question 6/6

A well-known similarity measure in Data Science is cosine similarity.

Given two vectors $v1$ and $v2$, it is defined as:

$$
  \cos(v1, v2) = \frac{v1 \cdot v2}{\lVert v1 \rVert \times \lVert v2 \rVert}
$$

Using the above functions, define a new one that computes this similarity.


In [None]:

# assumes neither vector is all zeros (the norms must be non-zero)
def cosine(v1,v2):
    return (dot_product(v1,v2))/(norm(v1)*norm(v2))

In [19]:
print(cosine([1,2,3,4],[3,2,5,4]))




0.9441175904999112
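
A quick sanity check, as a sketch: cosine similarity should be 1 for vectors pointing in the same direction and 0 for perpendicular ones. The block below is self-contained and uses a compact `zip`-based variant, named `cosine_compact` here purely for illustration, rather than the functions defined above.

```python
import math

def cosine_compact(v1, v2):
    """Cosine of the angle between two non-zero vectors of equal size."""
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(y * y for y in v2))
    return dot / (norm1 * norm2)

same_direction = cosine_compact([1, 2], [2, 4])   # parallel vectors
perpendicular = cosine_compact([1, 0], [0, 1])    # orthogonal vectors

# floating-point rounding means comparing with isclose, not ==
print(math.isclose(same_direction, 1.0))   # prints: True
print(perpendicular)                       # prints: 0.0
```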