# 1. Introduction to Numpy

<br>
<center>
<img src="https://files.realpython.com/media/The-Python-Print-Function_Watermarked.26066d64ad82.jpg">
</center>



## 1.1. What is Numpy?
--- 

- Numpy is a Python **package**
- It offers several mathematical tools, such as calculating the **mean**, **maximum**, etc.
- It is most important for **vector operations** (linear algebra)
- When analyzing data, we normally want to operate over an entire **collection of values** (e.g. `lists`)
- Last but not least, Numpy is fast!





## 1.2. Installing Numpy
---

- First and foremost, we need to **install** the **package**
- To do this, run the following in your notebook

In [158]:
!pip install numpy





## 1.3. Importing Numpy
---

In [45]:
import numpy as np


## 1.4. Motivating Numpy
---
- Let's say we want to calculate the body mass index (BMI) for several people

$$\text{BMI} = \frac{\text{weight}}{\text{height}^2}$$


In [1]:
height = [1.50, 1.60, 1.70]
weight = [60, 70, 80]

---
- It would be cool if we could do the following operation:

In [2]:
weight / height**2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

---
- Python `lists` do not support **vector** operations
- We wanted each `weight` to be divided by the squared `height` at the **same index** (an element-wise operation)
- But `lists` do not know how to do that
- **Solution**: Numpy!
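
Without Numpy, the same calculation needs an explicit pairing of the two lists. The sketch below uses `zip` and a list comprehension; the name `bmi` is introduced here only for illustration.

```python
height = [1.50, 1.60, 1.70]
weight = [60, 70, 80]

# Pair each weight with the height at the same index and compute BMI one element at a time
bmi = [w / h**2 for w, h in zip(weight, height)]
print(bmi)
```

This works, but the loop has to be written by hand every time, which is what Numpy arrays avoid.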

In [4]:
import numpy as np

np_height = np.array(height)
np_weight = np.array(weight)
np_height

array([1.5, 1.6, 1.7])

---
- Trying the same calculation as before...

In [5]:
np_weight / np_height**2

array([26.66666667, 27.34375   , 27.6816609 ])

---
- Comparing the different behavior between `np.array` and `list`

In [9]:
# PASTES THE LISTS TOGETHER
weight + height

[60, 70, 80, 1.5, 1.6, 1.7]

In [10]:
# ELEMENT-WISE SUM
np_weight + np_height

array([61.5, 71.6, 81.7])



## 1.5. First Numpy tool - the `array`
---

- As we saw in the previous example, a `np.array` is similar to a Python `list`
- Like an `int` or `float`, a `np.array` is just another **type** - a more complex one, built by other people


---
- `np.array` is **indexed** just like a Python `list`
- So, we can use `[ ]` to access an **element**

In [12]:
weight[0], np_weight[0]

(60, 60)

---
- And we can also **slice** the `np.array`

In [42]:
weight[1:], np_weight[1:]

([70, 80], array([70, 80]))



## 1.6. Numpy **types** interacting with Python **types**
---

- **Numpy** numbers are different from Python numbers, because they have different **types**!

In [14]:
type(np_weight[0])

numpy.int64

---
- We can **operate** on **Numpy** numbers and **Python** numbers together

In [16]:
result = np_weight[0] * 2
print(result)
type(result)

120


numpy.int64

- **NOTE**: **Type conversion** occurs! The result is always converted to a **Numpy** number
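
A small sketch of this conversion, and of how to get a plain Python number back when one is needed. The array is built with an explicit `dtype=np.int64`, an assumption made here so the result does not depend on the platform's default integer size.

```python
import numpy as np

np_weight = np.array([60, 70, 80], dtype=np.int64)

result = np_weight[0] * 2   # Numpy number * Python int
print(type(result))         # still a Numpy integer type

as_python = int(result)     # explicit conversion back to a Python int
print(type(as_python))
```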

---
- **NOTE**: `np.array` can only contain one **type** though
- Watch what happens

In [8]:
np.array([1, '1', True])

array(['1', '1', 'True'], dtype='<U21')

- Everything got **converted** to a `string`! Be careful not to mix **types** with `np.array`
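
The type an array settled on can be checked with its `dtype` attribute. The sketch below mixes a few types to show which common type Numpy picks; the variable names are illustrative.

```python
import numpy as np

ints_and_floats = np.array([1, 2.5])        # integers are promoted to floats
mixed_with_text = np.array([1, '1', True])  # everything becomes a string

print(ints_and_floats, ints_and_floats.dtype)
print(mixed_with_text, mixed_with_text.dtype)
```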


## 1.7. Exercise - Celsius to Fahrenheit, `np.array` style
---

$$F = \frac{9}{5} \cdot C + 32$$

---

In [23]:
temperatures = np.array([30, 50, 70])
(9/5) * temperatures + 32

array([ 86., 122., 158.])


## 1.8. Exercise - Manual Linear Regression
---

$$y = \sum_{i=1}^{N}{a_i \cdot x_i}$$

In [24]:
coefs = np.array([0.2, 0.5, 0.75])
# one value per variable: variable 1, variable 2, variable 3
data = np.array([1, 3, 5])

---

In [25]:
result = sum(coefs * data)
result

5.45
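
The multiply-then-sum pattern above is a dot product, which Numpy provides directly as `np.dot`. A minimal sketch using the same values (the names `manual` and `with_dot` are illustrative):

```python
import numpy as np

coefs = np.array([0.2, 0.5, 0.75])
data = np.array([1, 3, 5])

manual = sum(coefs * data)      # element-wise product, then Python's sum
with_dot = np.dot(coefs, data)  # the same calculation in one call
print(manual, with_dot)
```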


## 1.9. `np.array` subsetting
---

- We can **subset** (take a few elements from) a `np.array`
- This is done by indexing it with another **array** (or `list`)

- That **array** can hold **indexes**


In [31]:
sample = np.array([1, 2, 3, 4, 5])
sample[[0, 3, -1]]

array([1, 4, 5])

---
- Or an **array** of `booleans` (yes/no for **each** position)

In [34]:
sample[[True, False, True, False, True]]

array([1, 3, 5])

---
- If a `boolean` array (or `list`) is not of the same size...

In [33]:
sample[[True, False]]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 5 but corresponding boolean dimension is 2

- **NOTE**: An **array** of `booleans` has to be of the **same size**


## 1.10. Exercise - subsetting `np.array` with `booleans`
---

Get the first and last elements of a `range` of numbers from 1 to 5 using `np.array`.


1. Create a `np.array` with a `list` of numbers, that you get from a `range`
2. Print the **first** and **last** elements of the `np.array`, using:
    - A `list` of numbered indexes
    - A `list` of `booleans`

---

In [73]:
array = np.array(range(1, 6))
print(array[[0, -1]])
print(array[[True, False, False, False, True]])

[1 5]
[1 5]



## 1.11. Further notes on `np.array` subsetting with `booleans`
---

- As you might recall, we can get a `boolean` as the result of a **comparison**

In [35]:
print(1==1)
print(1!=1)

True
False


---
- With `np.array`, as we expect by now, a comparison gives back a `np.array` of `booleans` of the **same size**
- Because it is **element-wise**!

In [37]:
print(np_weight)
np_weight == 60

[60 70 80]


array([ True, False, False])

---
- As we saw, we can **subset** an **array** with another **array of booleans** 

In [44]:
people_with_60_kilos = np_weight == 60
np_height[people_with_60_kilos]

array([1.5])
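
Conditions can also be combined. With arrays this is done with `&` (and) and `|` (or), wrapping each condition in parentheses; Python's `and` / `or` keywords do not work element-wise. A sketch with the same family data (the name `in_range` is illustrative):

```python
import numpy as np

np_height = np.array([1.5, 1.6, 1.7])
np_weight = np.array([60, 70, 80])

# People weighing at least 60 kg AND less than 80 kg
in_range = (np_weight >= 60) & (np_weight < 80)
print(in_range)
print(np_height[in_range])
```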


## 1.12. Exercise - Get heights of people with less than 70 kilos

---

**HINT**: Remember that one of the **boolean operators** is the `<`

---

In [75]:
people_with_less_than_70_kilos = np_weight < 70
np_height[people_with_less_than_70_kilos]

array([1.5])


## 1.13. 2-D `np.array` - Welcome to the Matrix
---

In [46]:
type(np_weight)

numpy.ndarray

- `numpy.ndarray` - stands for $N$-dimensional **array**
    - 1-D
    - 2-D
    - 3-D
    - ...
    - 56789-D

---
- Let's build a 2D `np.array` from `np_weight` and `np_height`

In [47]:
data = np.array([np_weight, np_height])
data

array([[60. , 70. , 80. ],
       [ 1.5,  1.6,  1.7]])

---
- We can use `shape` to find out the **dimensions** of our `np.array`
- **NOTE**: `shape` is not a **method** like `append()`. Methods are called with parentheses `( )`. `shape` is an **attribute**: a value stored on the array, which we read without parentheses

In [49]:
# ROWS X COLUMNS
data.shape

(2, 3)
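
`shape` is not the only attribute of an array. As a short sketch, `ndim` gives the number of dimensions and `size` the total number of elements:

```python
import numpy as np

data = np.array([[60.0, 70.0, 80.0],
                 [1.5, 1.6, 1.7]])

print(data.shape)  # (rows, columns)
print(data.ndim)   # number of dimensions
print(data.size)   # total number of elements
```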


## 1.14. Subsetting 2-D `np.array`
---

- We can also **index** the 2-D `np.array`, just like we would in a `list` of `lists`

```
         0     1     2
array([[60. , 70. , 80. ],   0
       [ 1.5,  1.6,  1.7]])  1
```

- **REMEMBER**: Indexes start at 0!

In [50]:
data[0][-1]

80.0

---
- We get the first list `[60. , 70. , 80. ]` with **index** 0
- Then, we get the last element `80.0` with **index** -1

---
- `np.arrays` allow us to be more flexible with our indexing.
- We can write the **sequential** indexes inside a single pair of brackets, separating them with a **comma** `,`

In [51]:
data[0, -1]

80.0

---
- Suppose that we want to take both `height` and `weight` from the first person
- So, we want both first and second `lists`, but only with the first **elements**
- We can use `:` to get all the `lists`

In [68]:
data[:]

array([[60. , 70. , 80. ],
       [ 1.5,  1.6,  1.7]])

- **REMEMBER**: We can think of `:` as a `list` **slice** from beginning to end. Everything!

---
- We can now **index** both lists, at the same time
- We use the same logic as if it were one list only

In [63]:
data[:, 0]

array([60. ,  1.5])

---
- Watch what happens if we use separate square brackets `[ ]`

In [64]:
data[:][0]

array([60., 70., 80.])

- This is because `data[:]` returns both `lists` and then `[0]` just **indexes** the first one
- So we have to use Numpy's comma-separated **sequential** indexing, `data[:, 0]`

---
- We can also **slice** the list using this notation

In [66]:
result = data[:, 1:]
print(result)
result.shape

[[70.  80. ]
 [ 1.6  1.7]]


(2, 2)

- We retain our **matrix** structure

---
- We can also do the inverse
- Select the first `list` and take **all** values using `:`

In [67]:
data[0, :]

array([60., 70., 80.])

- **MOTIVATION**: This **matrix** shape of our data is what is used when training models (Machine Learning)


## 1.15. Summary of Numpy Indexes
---

<br>
<center>
    <img src="https://www.oreilly.com/library/view/python-for-data/9781449323592/httpatomoreillycomsourceoreillyimages2172114.png">
    </center>




## 1.16. Exercise - subsetting 2-D `np.array`
---

- `[1, :]`
- `[:, 1:-1]`
- `[:, mask]`



---

In [159]:
print()
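
One possible way to work through these, assuming the second item is meant to read `[:, 1:-1]` and that `mask` is a boolean array with one entry per column. Both are assumptions, since the exercise does not define them.

```python
import numpy as np

data = np.array([[60.0, 70.0, 80.0],
                 [1.5, 1.6, 1.7]])

second_row = data[1, :]         # the whole second row (heights)
middle_columns = data[:, 1:-1]  # every row, skipping the first and last column

mask = data[0, :] > 60          # hypothetical mask: people heavier than 60 kg
selected = data[:, mask]        # every row, only the columns where mask is True

print(second_row)
print(middle_columns)
print(selected)
```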





## 1.17. Numpy statistics
---

- If your data is small, you can just look at it (family data)

- But what if we have hundreds of data points (a city-wide survey)?
- You can get to know your data by looking at its statistics
    - What is the **mean** weight of the whole family?
    - What are the **minimum** and **maximum** weights?
    - For people between 1.70 m and 2 m, what is the **mean** weight?
- Statistics give us an overall picture of our data


In [80]:
np_city = np.array(
    [
        1.0 + (np.random.rand(5000) * (2.1 - 1.0)),
        20 + (np.random.rand(5000) * (120 - 20))
    ]
)

- We now have data for a city-wide survey of weight and height

In [89]:
np_city

array([[  1.4857498 ,   1.73759679,   1.21730934, ...,   1.95636957,
          1.57995815,   1.24585698],
       [ 76.57469787, 101.35517438,  36.44883043, ...,  60.71934684,
        101.11100416,  63.72072813]])

In [90]:
np_city.shape

(2, 5000)
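
With a survey like this, the question about people between 1.70 m and 2 m can be answered with a boolean mask. This is only a sketch: it regenerates similar random data with `np.random.default_rng` and a fixed seed (an assumption, so the run is repeatable), which means its numbers will differ from the output shown above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, chosen here for repeatability
np_city = np.array([
    1.0 + rng.random(5000) * (2.1 - 1.0),  # heights in metres
    20 + rng.random(5000) * (120 - 20),    # weights in kilograms
])

heights = np_city[0, :]
weights = np_city[1, :]

# True for every person whose height is between 1.70 m and 2 m
tall = (heights >= 1.70) & (heights <= 2.0)
print(weights[tall].mean())
```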

---
- Numpy offers several statistical tools. One of them is the **mean**
- They are functions of the Numpy package, so, just as we wrote `np.array`, we write `np.mean`

In [91]:
# AVERAGE HEIGHT
np.mean(np_city[0, :])

1.5509670730617482

---
- But, our `np.array` also has **methods**
- And it includes `mean`

In [92]:
np_city[0, :].mean()

1.5509670730617482
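
Both forms also accept an `axis` argument, which computes the statistic per row or per column of a 2-D array instead of over a single slice. A small sketch with the family data (the name `row_means` is illustrative):

```python
import numpy as np

data = np.array([[60.0, 70.0, 80.0],  # weights
                 [1.5, 1.6, 1.7]])    # heights

row_means = data.mean(axis=1)  # one mean per row: mean weight, mean height
print(row_means)
```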


## 1.18. Mean, Standard Deviation, Median, Minimum, Maximum
---

- **Mean** with `np.mean()` or `np.array.mean()`

In [122]:
print(np.mean(np_city[0, :]))
print(np_city[0, :].mean())

1.5509670730617484
1.5509670730617484


---
- **Standard Deviation** with `np.std()` or `np.array.std()`

In [123]:
print(np.std(np_city[1, :]))
print(np_city[1, :].std())

28.84662240339476
28.84662240339476


---
- **Median** is only accessible through the Numpy package, with `np.median()`

In [124]:
np.median(np_city[1, :])

68.40169991721066

In [125]:
np_city[1, :].median()

AttributeError: 'numpy.ndarray' object has no attribute 'median'

---
- **Minimum** with `np.min()`, `np.array.min()` or Python `min()`

In [126]:
print(np.min(np_city[1, :]))
print(np_city[1, :].min())
print(min(np_city[1, :]))

20.002781674879248
20.002781674879248
20.002781674879248


---
- **Maximum** with `np.max()`, `np.array.max()` or Python `max()`

In [127]:
print(np.max(np_city[1, :]))
print(np_city[1, :].max())
print(max(np_city[1, :]))

119.99685135177963
119.99685135177963
119.99685135177963



## 1.19. Calculating statistics
---

Calculate the following:
1. `np_weight` **mean** using `np.array.mean()` - the array **method**
2. `np_weight` **maximum** using `np.max()` - the Numpy package **function**
3. `np_weight` **total sum** using `sum()` - Python's **built-in function**

In [128]:
print(np_weight.mean())
print(np.max(np_weight))
print(sum(np_weight))

70.0
80
210



## 1.20. Other Functions
---



- **Sum** with `np.sum()`, `np.array.sum()` or Python `sum()`

In [113]:
print(sum(np_city[0, :]))
print(np.sum(np_city[0, :]))
print(np_city[0, :].sum())

7754.835365308759
7754.835365308741
7754.835365308741
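
The last digits differ because floating-point addition depends on the order in which values are added: Python's `sum` adds left to right, while Numpy groups the additions differently (pairwise summation). The results agree to within rounding error, which can be checked with `np.isclose`. A sketch on freshly generated data (the seed and variable names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
values = 1.0 + rng.random(5000) * (2.1 - 1.0)

python_total = sum(values)    # adds one element at a time, left to right
numpy_total = np.sum(values)  # Numpy's own summation

print(python_total, numpy_total)
print(np.isclose(python_total, numpy_total))
```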


---
- **Sort** with `np.sort()`, `np.array.sort()` or Python `sorted()`

In [121]:
print(sorted(np_city[0, :])[:10])
print(np.sort(np_city[0, :]))
# INPLACE - CHANGES VARIABLE
np_city[0, :].sort()
print(np_city[0, :])

[1.0000145004799257, 1.0001051477574863, 1.000143329489774, 1.0006611150888316, 1.000997089398832, 1.0013739505753136, 1.0015204943658784, 1.0017365801472062, 1.0018867052026383, 1.0021427820136912]
[1.0000145  1.00010515 1.00014333 ... 2.09896207 2.09900546 2.09902032]
[1.0000145  1.00010515 1.00014333 ... 2.09896207 2.09900546 2.09902032]
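
Note that sorting one row in place breaks the pairing between each person's height and weight. If the pairing matters, one option is `np.argsort`, which returns the indexes that would sort an array, so both rows can be reordered together. A sketch with small illustrative data:

```python
import numpy as np

data = np.array([[80.0, 60.0, 70.0],  # weights
                 [1.7, 1.5, 1.6]])    # heights

sorted_copy = np.sort(data[0, :])  # returns a sorted copy, data is unchanged
order = np.argsort(data[0, :])     # indexes that would sort the weights
data_by_weight = data[:, order]    # reorder both rows with the same indexes

print(sorted_copy)
print(data_by_weight)
```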



## 1.21. Summary
---



# 2. New Python `type` - Dictionaries

<br>
<center>
    <img src="https://files.realpython.com/media/Cool-New-features-in-Python-3.7_Watermarked.cfa5288c143d.jpg">
</center>

## 2.1. Introduction
---

- Python **dictionaries** or `dict` are like a real-life dictionary

<br>
<center>
            <img src='img/dictionary.png' width=10%>
    </center>
<br>

- We **look up** **key** words in a dictionary to find their corresponding **definition** or **value**
- In Python it is the same thing: we have **keys** and **values**
- Each **key** corresponds to **one value**

## 2.2. Motivation
---

- Suppose we have a `list` of phone contacts (done in Class 1)

In [129]:
name = "Pedro"
phone_number = 917040672

friend_name = "Xzibit"
friend_phone_number = 100200300

contact1 = [name, phone_number]
contact2 = [friend_name, friend_phone_number]
contacts = [contact1, contact2]
print(contacts)

[['Pedro', 917040672], ['Xzibit', 100200300]]


---
- If we want to get Xzibit's phone contact, we would have to **iterate** the `list` until we find Xzibit
- What if we could look up Xzibit directly by the name `'Xzibit'`?
- **SOLUTION**: Python dictionaries!

- We can make `dict` using **curly brackets** `{ }`
- We use `:` to **assign** a **value** to a **key**

In [132]:
contacts = {
    'Pedro': 917040672,
    'Xzibit': 100200300
}
print(contacts)
print(contacts['Xzibit'])

{'Pedro': 917040672, 'Xzibit': 100200300}
100200300


- `'Pedro'` and `'Xzibit'` are **keys**
- Both numbers are **values**
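
Looking up a key that does not exist raises a `KeyError`. Two common ways to guard against that are the `in` operator and the `get` method, sketched below (the contact `'Snoop'` and its number are made up for the example):

```python
contacts = {
    'Pedro': 917040672,
    'Xzibit': 100200300
}

print('Xzibit' in contacts)      # membership test on the keys
print(contacts.get('Snoop'))     # missing key: get returns None instead of failing
print(contacts.get('Snoop', 0))  # or a default value of our choosing

contacts['Snoop'] = 400500600    # assigning to a new key adds an entry
print(contacts)
```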

## 2.3. Exercise - Building a dictionary
---

1. Build a `dict` with the following data:
    - **Key** `'weight'` and **value** `np_weight`
    - **Key** `'height'` and **value** `np_height`
2. Finally, **sum** both `np_weight` and `np_height`, but **using the dictionary to access the values** (e.g. `dict['x'] + dict['y']`)

---

In [136]:
data = {
    'weight': np_weight,
    'height': np_height
}
data['weight'] + data['height']

array([61.5, 71.6, 81.7])

## 2.4. Dictionaries of dictionaries
---

- We can build complex **data structures** with dictionaries
- We can have `lists` of `lists`, but we can also have `dict` of `dict` of `dict` of ...


In [134]:
contacts = {
    'Pedro': {
        'number': 917040672,
        'address': {
            'city': 'Lisbon',
            'zip_code': '1600-007'
        }
    },
    'Xzibit': {
        'number': 100200300,
        'address': {
            'city': 'USA',
            'zip_code': '001-230'
        }
    },
}

---
- We can chain **sequential indexes**, given that we know the structure of our `dict`

In [135]:
print(contacts['Xzibit']['address']['zip_code'])

001-230


## 2.5. Iterating dictionaries
---

- This is how we iterate over a sequence of values, such as a `list` or a `range`

In [137]:
numbers = range(5)
for element in numbers:
    print(element)

0
1
2
3
4


---
- Iterating a `dict` is the same thing
- The element we get though, is the `dict` **key**

In [138]:
for key in contacts:
    print(key)
    print(contacts[key])

Pedro
{'number': 917040672, 'address': {'city': 'Lisbon', 'zip_code': '1600-007'}}
Xzibit
{'number': 100200300, 'address': {'city': 'USA', 'zip_code': '001-230'}}


---
- There is a way to iterate both **keys** and **values** at the same time
- But we have to use the `items()` **method** of dictionaries to achieve this
- Since we are getting both a **key** and a **value** at the same time, we have to use **two iteration variables**.
- We'll call our **iteration variables** `key` and `value`

In [140]:
for key, value in contacts.items():
    print(key)
    print(value)

Pedro
{'number': 917040672, 'address': {'city': 'Lisbon', 'zip_code': '1600-007'}}
Xzibit
{'number': 100200300, 'address': {'city': 'USA', 'zip_code': '001-230'}}
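
Because each value here is itself a `dict`, the loop body can index further into it. A sketch that collects and prints each contact's city (the names `details` and `cities` are illustrative):

```python
contacts = {
    'Pedro': {'number': 917040672, 'address': {'city': 'Lisbon', 'zip_code': '1600-007'}},
    'Xzibit': {'number': 100200300, 'address': {'city': 'USA', 'zip_code': '001-230'}},
}

cities = {}
for name, details in contacts.items():
    city = details['address']['city']  # chain indexes into the nested dict
    cities[name] = city
    print(name, city)
```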


## 2.6. Summary
---


# 3. Functions with **named** arguments

<br>
<center>
    <img src="https://files.realpython.com/media/Newbie_Watermarked.a9319218252a.jpg">
</center>

## 3.1. Reviewing functions with an example - Min Max Scaling
---

- In Data Science, we sometimes need to **scale** data
- To **scale** is to map the **range** of the data onto a new interval (e.g. `[0, 100] -> [0, 1]`)
- This is helpful when developing predictive models

**Min Max Scaling** between 0 and 1:
$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$$

**Min Max Scaling** between $a$ and $b$:
$$x' = a + \frac{(x - \min(x))(b-a)}{\max(x) - \min(x)}$$
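
As a worked example (values chosen for illustration), take the weights $60, 70, 80$ and scale them to $a = 10$, $b = 567$. For the middle value $x = 70$:

$$x' = 10 + \frac{(70 - 60)(567 - 10)}{80 - 60} = 10 + \frac{10 \cdot 557}{20} = 288.5$$

The smallest value maps to $a$ and the largest to $b$, since the fraction is 0 and 1 respectively.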

---
- The overall structure of a **function** is as follows

In [150]:
def min_max_scaling(data, a, b):
    minimum = np.min(data)
    maximum = np.max(data)
    
    numerator = (data - minimum) * (b - a)
    denominator = maximum - minimum
    return a + (numerator / denominator)

In [151]:
min_max_scaling(np_weight, 0, 1)

array([0. , 0.5, 1. ])

In [152]:
min_max_scaling(np_weight, 10, 567)

array([ 10. , 288.5, 567. ])

---
- What if we want to add **default** behavior to our **scaling function**?
- We want it **by default** to scale between 0 and 1
- We can make use of **named arguments** in the **function definition**, written as `name=value`
- **Named arguments** defined this way have **default values**, so they can be left out of the call

In [153]:
def min_max_scaling(data, a=0, b=1):
    minimum = np.min(data)
    maximum = np.max(data)
    
    numerator = (data - minimum) * (b - a)
    denominator = maximum - minimum
    return a + (numerator / denominator)
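
One case the function does not cover: if every value in `data` is the same, `maximum - minimum` is zero and the division produces `nan` values with a warning. Below is a hedged sketch of one way to guard against that. The name `min_max_scaling_safe` and the choice to return `a` everywhere are assumptions made here, not part of the original function.

```python
import numpy as np

def min_max_scaling_safe(data, a=0, b=1):
    data = np.asarray(data, dtype=float)
    minimum = np.min(data)
    maximum = np.max(data)
    denominator = maximum - minimum

    # Constant data has no spread to scale; return the lower bound everywhere
    if denominator == 0:
        return np.full(data.shape, float(a))

    return a + (data - minimum) * (b - a) / denominator

print(min_max_scaling_safe([60, 70, 80]))
print(min_max_scaling_safe([70, 70, 70]))
```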

---
- Now we can call the **function** without specifying the lower bound `a` and the upper bound `b`

In [154]:
min_max_scaling(np_weight)

array([0. , 0.5, 1. ])

---
- When calling the **function**, we can **write** the names of the arguments. Like so:

In [156]:
min_max_scaling(np_weight, b=60, a=30)

array([30., 45., 60.])

- **NOTE**: You can still call the function without writing the argument names
- **NOTE**: You can alter the order of the **named arguments**, as long as you write their name!
- **THOUGHT**: Imagine that a **function** had 100 arguments. You would have to know the **exact order** of each argument. What a headache! We can make our lives easier by using **named arguments** 

---
- **NOTE**: But you can't put a **named argument** before a **non-named** (positional) **argument**

In [157]:
min_max_scaling(b=60, np_weight, a=30)

SyntaxError: positional argument follows keyword argument (<ipython-input-157-7e30830ec238>, line 1)

## 3.2. Summary
---

# 4. Introduction to Pandas

<br>
<center>
        <img src="https://files.realpython.com/media/List-Comprehensions-in-Python_Watermarked.39cf85bdd5d0.jpg">
</center>

# 4. Python Lists
## 4.1. Python Lists

- What if we are gathering family height data? Will we need a **variable** for each of the numbers?

In [43]:
height1 = 1.50
height2 = 1.60
height3 = 1.70

- The answer to this problem is **`lists`**
- A `list` is just an ordered group of items - a **collection of values**


- You can make a `list` with square brackets `[ ]`. Like this: 

In [44]:
heights = [1.50, 1.60, 1.70]
print(heights)

[1.5, 1.6, 1.7]


- A `list` is like a box with boxes, and each box can have a different **type**


- Let's add different **typed** **variables** to our `list`

In [45]:
heights = [
    'ze', 1.50,
    'manel', 1.60,
    'josefina', 1.70
]
print(heights)

['ze', 1.5, 'manel', 1.6, 'josefina', 1.7]


- I heard you like lists, so I put lists on your lists!

In [46]:
heights = [
    ['ze', 1.50],
    ['manel', 1.60],
    ['josefina', 1.70]
]
print(heights)

[['ze', 1.5], ['manel', 1.6], ['josefina', 1.7]]



<center>
    <img src="https://www.meme-arsenal.com/memes/0ca36be035e2c8c79d6af4543c7a6990.jpg" width=20%>
    </center>

## 4.2. Exercise - Variables and Creating Lists

Let's create a contact list, containing **names** and **phone numbers**.\
The objective is to combine **variable assignment**, building `lists`, and finally a **list of lists**.

1. **Assign** your name to a **variable** named `name` and your phone number to a **variable** named `phone_number`
2. **Assign** a friend's name to a **variable** named `friend_name` and the corresponding phone number to `friend_phone_number`
3. Create a `list` named `contact1` containing both `name` and `phone_number`
4. Create a `list` named `contact2` containing both `friend_name` and `friend_phone_number`
5. Finally, create a `list` named `contacts` containing both **variables** `contact1` and `contact2`
6. Print the resulting list `contacts`

In [47]:
name = "Pedro"
phone_number = 917040672

friend_name = "Xzibit"
friend_phone_number = 100200300

contact1 = [name, phone_number]
contact2 = [friend_name, friend_phone_number]
contacts = [contact1, contact2]
print(contacts)

[['Pedro', 917040672], ['Xzibit', 100200300]]


## 4.3. Actually using lists

- Great, we have a bunch of numbers in a `list`. What now?
- We can access each **element** of the `list`
- Lists have **indexes**

<center>
    <img src="https://cdn.dribbble.com/users/201599/screenshots/1545461/book.jpg" width=20%>
</center>


- To use the **index** to get an element of the `list` we also use square brackets `[ ]`
- Unlike a book **index**, where chapters start at 1, in Python **indexes start at `0`**


In [48]:
print(contacts[1])

['Xzibit', 100200300]


- The result is still a `list`, so we can chain these **indexes**

In [49]:
print(contacts[0][1])

917040672


- What happens if we try to access a non-existing **index**? For instance `100`

In [50]:
print(contacts[100])

IndexError: list index out of range

- **Indexes** can also count from the **last to the first** element. Counting from the end begins with `-1`

In [51]:
print(contacts[-1])

['Xzibit', 100200300]


## 4.4. "Slicing" lists

- What if we want multiple **elements** of a `list`? This is called **slicing**



- Uses the notation `[i:j]` for **from** `i` **to** `j`
- `j` is not inclusive!


In [52]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[1:3])

[1, 2]


- We can also **slice** **from** `i` (to the end)

In [53]:
print(numbers[3:])

[3, 4, 5, 6, 7, 8, 9]


- And we can **slice** **to** `j` (from the start)!

In [54]:
print(numbers[:5])

[0, 1, 2, 3, 4]
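
Slices also accept an optional third value, the step, written `[i:j:k]`. A short sketch:

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(numbers[::2])    # every second element, from start to end
print(numbers[1:8:3])  # from index 1 up to (not including) 8, in steps of 3
print(numbers[::-1])   # a negative step walks the list backwards
```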


## 4.5. Exercise - Slicing and operating with lists

The objective of this exercise is to **slice** the same `list` in two different parts and join them together into **one** `list` (not a list of lists).

1. Create a `list` of numbers from 0 to 9. Name it `numbers`;
2. **Slice** `numbers` to get the first three numbers and name it `slice1`;
3. **Slice** `numbers` to get the last three numbers and name it `slice2`;
4. Create a `list` from lists `slice1` and `slice2`.

**HINT**: `Lists` can be used with **operators**, just like `int`s and `str`s
<br>
**HINT**: Try adding the two lists with `+`

In [55]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
slice1 = numbers[:3]
slice2 = numbers[-3:]
result = slice1 + slice2
print(result)

[0, 1, 2, 7, 8, 9]


## 4.6. Changing elements in a list

- The same way that we access an element of the list we can also change it
- Like it was a variable, we use the `=` sign

In [56]:
numbers[0] = 7000
print(numbers)

[7000, 1, 2, 3, 4, 5, 6, 7, 8, 9]


**Curious** note for future work:
- What happens if a variable that has a list is assigned to another variable?
- Does changing one of the variables, change the other?

In [57]:
x = ['a', 'b', 'c']
y = x
y[1] = 'z'
print(y)
print(x)

['a', 'z', 'c']
['a', 'z', 'c']


- `x` also changed! Beware in the future!
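
This happens because `y = x` does not copy the list; both names refer to the same list. To get an independent list, make a copy, for example with the `copy` method. A sketch:

```python
x = ['a', 'b', 'c']
y = x.copy()  # a new list with the same elements
y[1] = 'z'

print(y)
print(x)      # x is unchanged this time
```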

<center>
<div style="content:''; clear:both; display:table">
  <div style="float:left;width:50%;padding:5px">
    <img src="scrot3.png" width=200>
  </div>
  <div style="float:left; width:50%; padding:5px">
    <img src="scrot2.png" width=200>
  </div>
</div> 
</center>

## 4.7. Adding elements to a list

- You can add elements to the end of the list using `append`

In [58]:
x.append('D')
print(x)

['a', 'z', 'c', 'D']


- Notice that we didn't put the result in a variable
- That is because `append` already altered variable `x`. 

- How would we add elements to the end of the list just using list operations?

**HINT**: we summed lists earlier

In [59]:
x = x + ['E']
print(x)

['a', 'z', 'c', 'D', 'E']


## 4.8. Summary

- `Lists` are a Python **type** and a **collection of values**
- `Lists` are made with square brackets `["value1", "value2"]`
- `Lists` can have different **types** inside
- `Lists` have **indexes**, which start at `0`, or at `-1` if counting from the last element
- We access a `list` value also using square brackets `list[0]`
- `Lists` can be **sliced**, to take several values
- We can alter single values of `lists` by doing `list[0] = 'a'`
- We can add elements to the end of the `list` with `append`
- `Lists` can be summed together to **join** the lists

# 5. Python Functions
<br>
<center>
<img src="https://files.realpython.com/media/building_with_python_watermark.2ebe5beb5b1e.jpg">
    </center>


## 5.1. Functions

- FYI, we have already used **functions** like `type` and `print` (see that green text on your notebook?)
- Like in Math $f(x)=y$ the function `f` takes in **argument** `x` and **returns** the result `y`



- In short, a function is reusable code! We have used `print` a **lot** by now
- Functions are how IT got so big: people share their functions, so you don't have to write everything yourself



- A function **returns** a value, so you can also put the result in a variable
- Let's look at a few examples, that come with Python

`min` and `max`

In [60]:
numbers = [0, 1, 2, 3, 4]
max_value = max(numbers)
min_value = min(numbers)
print(max_value, min_value)

4 0


- So if there are a lot of already made functions, how do we find them?
    - With time...
    - With practice...
    

<center>
<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcSuBOhO_O1WbFWe6Iit3tN5Mv8hL7bhKjZduG7lVHa7WzXCsqRl" width=700 height=700>
</center>

## 5.2. Exercise - Private Investigator I

We want to find how to **round** numbers in Python. We don't want the decimal part.

1. The Internet is your friend (maybe). Especially Google (maybe).
2. Search the internet for `PYTHON HOW TO ROUND NUMBERS`
3. Look for results from `stackoverflow.com`; these are usually the best.
4. Round the number `7.1`

Keep in mind that if the answer is too long, it is probably not what you are looking for!

In [61]:
# ANSWER
print(round(7.1))

7


## 5.3. Exercise - Private Investigator II

This time around, we want to know the **length** of a `list`.

Happy finding!

In [62]:
numbers = [1, 2, 3]
print(len(numbers))

3


- This is why it is useful to know the **keywords** like
    - **Type**
    - **List**
    - etc...

## 5.4. Methods

- Until now we have seen one **method**: `append`, for `lists`
- You can identify methods because they come after a dot `.` (e.g. `list.append('E')`)
- **Method** is just a fancy word for a **function** that belongs to a **type**

### String Methods
#### `split`
- allows you to **separate** a string by a **character**.

In [63]:
string = "Hello world!"
print(string)
splitted_string = string.split(' ')
print(splitted_string)

Hello world!
['Hello', 'world!']


#### `join`
- inverse of split. Glues a list of words (`string`s) together **with** a character

In [64]:
sentence = " ".join(splitted_string)
print(sentence)

Hello world!


### List Methods
#### `sort`
- sorts all the values inside your list

In [65]:
numbers = [5, 2, 3]
sorted_list = numbers.sort()
print(numbers, sorted_list)

[2, 3, 5] None


####  `reverse`
- reverses the values of the list. First goes last, etc.

In [66]:
numbers = [5, 2, 3]
reversed_list = numbers.reverse()
print(numbers, reversed_list)

[3, 2, 5] None


- **WARNING**: methods like `reverse`, `sort` and `append` modify the variable, but **don't return anything! (`None`)** 
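
When the original list should stay untouched, the built-in `sorted` function returns a new sorted list, and slicing with `[::-1]` returns a reversed copy. A sketch:

```python
numbers = [5, 2, 3]

sorted_list = sorted(numbers)  # new list, numbers is not modified
reversed_list = numbers[::-1]  # reversed copy via slicing

print(numbers, sorted_list, reversed_list)
```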


## 5.5. Building our own functions

- Functions are a convenient way to divide your code into useful blocks
- It allows us to:
    1. Order and organize our code 
    2. Make the code more readable, especially for **other** people
    3. Reuse code and share with others 
    
- Function code blocks have to be organized in a strict way 



Example:

In [67]:
def celsius_to_farh(temp):
    print("Executing function. Beep-boop.")
    return 9/5 * temp + 32
print(celsius_to_farh(20))

Executing function. Beep-boop.
68.0



### 5.5.1. Anatomy of a function

<center>
<img src="https://geo-python.github.io/site/develop/_images/Function_anatomy-400.png">
</center>

1. **Mandatory**: `def` keyword
2. **Mandatory**: function name
3. **Mandatory**: function parentheses `( )`
4. **Optional**: function argument or parameter, inside the parentheses
5. **Mandatory**: colon `:` after the parentheses
6. **Optional**: `return` statement. The function can just `print`, for instance
7. **Optional**: return value

- This is how we **call** a function. 
    1. Write its name 
    2. Open parentheses `( )`
    3. Put input arguments, if applicable

In [68]:
def celsius_to_farh(temp):
    print("Executing function. Beep-boop.")
    return 9/5 * temp + 32
print(celsius_to_farh(20))

Executing function. Beep-boop.
68.0



## 5.6. Indentation

- Python uses **indentation** to mark which lines belong to a block, such as the body of a function
- It means the code inside the block has to start further to the right than the line that opens it

In [69]:
def celsius_to_farh(temp):
    print("Executing function. Beep-boop.")
    return 9/5 * temp + 32

- If you don't follow **indentation**...

In [70]:
def celsius_to_farh(temp):
print("Executing function. Beep-boop.")
return 9/5 * temp + 32

IndentationError: expected an indented block (<ipython-input-70-bbeb791d6066>, line 2)

- You can indent your line of code by pressing the `<TAB>` key. Right next to your `Q` key.

## 5.7. Quick Response Exercise

Is this a valid function?

In [71]:
def useless_function():
    return

<p style="color:#789922"> > YES! </p>

## 5.8. Exercise - Writing your first function

Remember inflation rate?

$$IR = \frac{CPI_2 - CPI_1}{CPI_1}$$
<br>
$$CPI_{2000} = 0.0386$$
$$CPI_{2019} = 0.0042$$

1. Write a **function** called `inflation_rate` that takes two `CPI` arguments (`cpi_1` and `cpi_2`) and calculates the inflation rate.
2. Call the **function** you created and **assign** the result in a **variable** named `ir`.
3. `Print` the result

In [72]:
def inflation_rate(cpi_1, cpi_2):
    return (cpi_2 - cpi_1) / cpi_1
ir = inflation_rate(0.0386, 0.0042)
print(ir)

-0.8911917098445595


## 5.9. Summary

- Python already has **built-in** functions, like `print`, `min`, `max`, `len` and `round` (green text!)
- **Google** is the programmer's best friend - hone those PI skills
- **Methods** are **functions** that belong to a **type** like `append` for `lists`
- **Functions** are composed of:
    - `def` 
    - function name
    - arguments
    - return
- Python uses **indentation** to define code blocks; it is required, not just a way to make your code look pretty
- **Function** results can be assigned to variables


# 6. Loops and iterations
<br>
<center>
    <img src="https://files.realpython.com/media/Pythons-range-function_Watermark.5e8ea929167e.jpg">
</center> 

## 6.1. The `for` loop


- What if we want to do something to **every** element of the `list`?
- **For** every index of the list, do something. 

In [73]:
ez_list = [1, 2, 3]
print(ez_list[0])
print(ez_list[1])
print(ez_list[2])

1
2
3



- No one wants to write `list[i]` 100 times if the list has a length of 100
- The answer to this problem is **loops** or **iterations**
- This is called a `for` **loop**

In [74]:
ez_list = [1, 2, 3]
for i in ez_list:
    print(i)

1
2
3


# 6. Loops and iterations

## 6.2. The anatomy of a `for` loop

<center>
<img src="https://www.dataquest.io/wp-content/uploads/2018/06/91NoaP0.jpg">
</center>

1. **Mandatory**: `for` keyword
2. **Mandatory**: temporary variable name
3. **Mandatory**: `in` keyword
4. **Mandatory**: name of the list we want to **iterate**
5. **Mandatory**: colon `:` after list name
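The five parts above can be sketched on one small loop. The list `fruits` and the temporary variable `fruit` are made-up names for illustration.

```python
fruits = ["apple", "banana", "cherry"]

# for <temporary variable> in <list>:
for fruit in fruits:
    # The indented lines run once for every element of the list
    print(fruit)
```

After the loop finishes, `fruit` still holds the last element it visited.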

# 6. Loops and iterations

## 6.3. Compound Exercise - Summing the squares of a list of numbers

We want to sum up the entirety of a list of numbers from 1 to 10, but we want to sum the **square** of each element.

There are two ways of doing this exercise.

1. Create a list of numbers from 1 through 10 (**EXTRA**: you can search google for `python how to create a list of numbers`)
2. Iterate through the list of numbers, using a **for** loop

### Magnum PI Route
3. Create an additional empty list (just `[]` and call it `helper`) outside of the for loop (this is really important)
4. As you iterate, `append` the square of the number to the empty list
5. Search google for `python how to sum a list of numbers` (**HINT**: it is a function like `max` or `min`)
6. Put it in a function named `sum_list` that takes in an argument named `number_list`
7. Print the result when calling the function

### Back to Basics Route
3. Create an additional variable called `sum_total` outside of the **for** loop, with the value of 0. This is where we will store the result (**HINT**:  We can do `x = x + 1`)
4. Iterate the list and add the square of the number to `sum_total`
5. Put it in a function named `sum_list` that takes in an argument named `number_list`
6. Print the result when calling the function

In [75]:
# BACK TO BASICS ROUTE

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def sum_list(number_list):
    sum_total = 0
    for num in number_list:
        sum_total = sum_total + num**2
    return sum_total
print(sum_list(numbers))

385


- We can create lists of numbers using the `range` function. Like slices, the last element is **not** included. Beware!
- To **sum** a list of numbers we have the `sum` function
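A small sketch of both functions on their own, before using them in the exercise. The names `one_to_ten` and `total` are only for illustration.

```python
# range(1, 11) starts at 1 and stops BEFORE 11
one_to_ten = list(range(1, 11))
print(one_to_ten)

# sum adds up every element of the list
total = sum(one_to_ten)
print(total)
```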

In [76]:
# FULL MAGNUM PI ROUTE
numbers = range(1, 11)

def sum_list(number_list):
    helper = []
    for num in number_list:
        helper.append(num**2)
    return sum(helper)
print(sum_list(numbers))

385


# 6. Loops and iterations

## 6.4. Summary

- We can **iterate** a list with a `for` loop
- The `for` loop is composed of:
    - `for` keyword
    - temporary variable name
    - `in` keyword
    - the `list` that we want to iterate
- `range` allows us to create sequences of consecutive numbers
- `sum` allows us to add every **element** of a `list` 

# 7. What if? What else? Conditionals
<br>

<center>
<img src="https://files.realpython.com/media/The-Python-OR-Keyword_Watermarked.18e16a4721d8.jpg">
</center>

# 7. What if? What else? Conditionals


- What if we want to give **different behavior** according to the value of a variable?
- For instance, `if` a member of our company belongs to the `content` team, **then** `print` something.
- **Conditionals** are a very powerful tool that allow us to **alter the flow** of how our code works.



Example:

In [77]:
team = 'content'
if team == 'content':
    print("This person comes from the content team.")

This person comes from the content team.


# 7. What if? What else? Conditionals

## 7.1. Anatomy of an `if` statement

<center>
    
<img src="https://dq-blog-files.s3.amazonaws.com/if-else/if_syntax.svg" width=120%>
</center>

1. **Mandatory**: `if` keyword
2. **Mandatory**: **condition**
3. **Mandatory**: colon `:` after the **condition**
4. **Optional**: `else` keyword
5. **Mandatory** (when `else` is used): colon `:` after the `else` keyword

- New things that are notable here:
    - The **condition**: `team == 'content'`
    - The `==` sign. What does it mean?
    
- A condition is like a thing that happens or doesn't happen. A **yes** or a **no**
- So if _this_ (condition) happens **then** do this, **else** do something different

# 7. What if? What else? Conditionals

## 7.1. Intermission. A new type appears - Booleans

- Conditions answer a **Yes** or **No** question
- This **Yes** or **No** is a **type** in Python - **Booleans**
- They can be either **True** (yes) or **False** (no)


In [78]:
boolean = True
type(boolean)

bool

- What happens in an **if statement**:
 - `If` the **condition** is **True** **then** do something
 - `Else` (it is **False**) do something different

In [79]:
if True:
    print("THE TRUTH!")
else:
    print("This is False")

THE TRUTH!


- In the earlier example, our **condition evaluates to True** when `team` is equal to `'content'`

In [80]:
team = 'content'
if team == 'content':
    print("Content Team")
else:
    print("Not content team")

print(team == 'content')
condition_value = (team == 'content')
print(condition_value)

Content Team
True
True


- We still don't know a lot about `==`. It just _tests_ equality

# 7. What if? What else? Conditionals

## 7.2. Comparison Operators

- Comparison operators are like `+` or `*` - they operate on two values
- But they give a **boolean** result
- They ask a question, e.g. _"Is the first number **equal** to the second one?"_
- **Yes** or **No** / `True` or `False` -> `Boolean`


| Symbol &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp; | Task Performed &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;|
|--------|----------------|
| <p style="font-size:18px;">==</p>      | <p style="font-size:18px;">True, if it is equal</p>         |
| <p style="font-size:18px;">!=</p>      | <p style="font-size:18px;">True, if not equal to</p>    |
| <p style="font-size:18px;"> < </p>      | <p style="font-size:18px;">Less than</p>       |
| <p style="font-size:18px;">></p>      | <p style="font-size:18px;">Greater than</p>            |
| <p style="font-size:18px;"><=</p>      | <p style="font-size:18px;">Less than or equal to</p> |
| <p style="font-size:18px;">>=</p>     | <p style="font-size:18px;">Greater than or equal to</p> |



In [81]:
print(1==1)

True


- `==` and `!=` can also be used on strings. They can be used on anything!

In [104]:
print('a' == 'c')

False


In [82]:
print(1!=1)

False


In [83]:
print(1<2)

True
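The remaining operators from the table (`>`, `<=`, `>=`) follow the same pattern. A short sketch, with `a` and `b` as made-up example values:

```python
a = 3
b = 5

is_greater = a > b            # 3 is not greater than 5
is_less_or_equal = a <= b     # 3 is less than 5
is_greater_or_equal = a >= 3  # 3 is equal to 3, so this counts

print(is_greater)
print(is_less_or_equal)
print(is_greater_or_equal)
```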


# 7. What if? What else? Conditionals

## 7.3. Boolean operators - for booleans

- There are also **boolean** operators that operate on **boolean** values
- They are like math, but for **True** and **False**, **Yes** or **No**


| Symbol &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp;&nbsp; &nbsp; &nbsp; | Task Performed &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; &nbsp;&nbsp; &nbsp;|
|--------|----------------|
| <p style="font-size:18px;">and</p>      | <p style="font-size:18px;">Logical <b>and</b> </p>         |
| <p style="font-size:18px;">or</p>      | <p style="font-size:18px;">Logical <b>or</b></p>    |
| <p style="font-size:18px;">not</p>      | <p style="font-size:18px;">Logical <b>not</b></p>       |
| <p style="font-size:18px;">in</p>      | <p style="font-size:18px;">If an element is <b>in</b> a list</p>            |


- A mind trick that worked for me is to think the following
    - **True** = 1
    - **False** = 0
- And actually...

In [84]:
print(int(True))

1


- So, the **and** works like **multiplication**

In [85]:
# 1*1 = 1
print(True and True)

True


In [86]:
# 1*0 = 0
print(True and False)

False


- The **or** works like **addition**

In [87]:
# 1 + 0 = 1
print(True or False)

True


In [88]:
# 1 + 1 = 2, and any non-zero number counts as True
print(True or True)

True


- The **not** is just the opposite

In [89]:
print(not True)

False


In [90]:
print(not False)

True
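To check the multiplication/addition trick for every combination at once, here is a minimal sketch. It uses `bool(...)` to turn a number back into `True` or `False` (any non-zero number becomes `True`). The variable names are only for illustration.

```python
all_match = True

for a in [True, False]:
    for b in [True, False]:
        # `and` should agree with multiplication, `or` with addition
        and_like_times = (a and b) == bool(a * b)
        or_like_plus = (a or b) == bool(a + b)
        all_match = all_match and and_like_times and or_like_plus
        print(a, b, "| and:", a and b, "| or:", a or b)

print(all_match)
```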


- The **in** deserves a special note, it doesn't follow the same logic as the ones above
- It tests if an element is **inside** or **belongs to** a **list**!

In [91]:
print(1 in [1,2,3])

True


In [92]:
print("7" in [1,2,3])

False


- On a curious note, it also works for **strings**

In [93]:
print("S" in "USA")

True


- Because `strings` behave a lot like `lists` of characters!
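A short sketch comparing a string with a list of its characters. Indexing works the same way on both, but `in` on a string also finds longer pieces of text, which a list does not do. The names `word` and `letters` are made up for this example.

```python
word = "USA"
letters = ["U", "S", "A"]

# Indexing works the same way on both
print(word[0], letters[0])

# On a string, `in` also finds pieces longer than one character
print("US" in word)

# On a list, `in` only matches whole elements
print("US" in letters)
```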

# 7. What if? What else? Conditionals

## 7.5. An example

In [94]:
numbers = range(1, 50)
for num in numbers:
    if (num % 2 == 0) and (num % 3 == 0):
        print("Found a number divisible by 2 and 3: ", num)

Found a number divisible by 2 and 3:  6
Found a number divisible by 2 and 3:  12
Found a number divisible by 2 and 3:  18
Found a number divisible by 2 and 3:  24
Found a number divisible by 2 and 3:  30
Found a number divisible by 2 and 3:  36
Found a number divisible by 2 and 3:  42
Found a number divisible by 2 and 3:  48
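The example relies on the `%` (modulo) operator, which gives the remainder of a division. A remainder of 0 means the number is divisible. A minimal sketch, with `is_divisible` as an illustrative name:

```python
# Remainder of 7 divided by 2
print(7 % 2)

# Remainder of 12 divided by 3
print(12 % 3)

# 12 leaves no remainder for 2 or for 3, so both conditions hold
is_divisible = (12 % 2 == 0) and (12 % 3 == 0)
print(is_divisible)
```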


# 7. What if? What else? Conditionals

## 7.6. Exercise - All temperatures in Fahrenheit

Our client gave us a list of temperatures both in Celsius and Fahrenheit.
<br>
We want to clean the data, and have a **list** of just numbers.

Here's the list the client gave us.

```
temperatures = ["32C", "7F", "15C", "8C"]
```

We notice, the client gave us a list of `strings` and that they have the indicator of
- C for Celsius
- F for Fahrenheit

**NOTE**: We still have our function for converting to Fahrenheit, called `celsius_to_farh`.

Exercise:
0. Copy the list of temperatures to your notebook
1. Create a function named `clean_data` that receives a `list` of `string` temperatures and will:
    0. Have a `list` called `clean_temperatures`, that we will use to put our results
    1. Iterate all the elements in the list
    2. **If** the **last character** of the string is `C` then, convert to Fahrenheit (use our previous function), and add the **number** to our list.
    3. **Else**, just add the number to our list
    4. Return the temperatures
    
**HINT**: You can access the last character using **reverse indexing** (`string[-1]`) 
<br>
**HINT**: You can **slice** the string up to (but not including) the last character using `string[:-1]`, because the end of a slice is **exclusive**
<br>
**HINT**: You can convert a string to a number by doing (`int(string)`)
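The three hints can be tried out on a single value first. A minimal sketch, where `reading`, `unit`, `digits` and `value` are made-up names:

```python
reading = "32C"

unit = reading[-1]      # last character of the string
digits = reading[:-1]   # everything except the last character
value = int(digits)     # text converted to a number

print(unit, digits, value)
```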

In [95]:
temperatures = ["32C", "7F", "15C", "8C"]

def clean_data(temperatures):
    clean_temperatures = []
    for t in temperatures:
        if t[-1] == 'C':
            int_temp = int(t[:-1])
            fahr = celsius_to_farh(int_temp)
            clean_temperatures.append(fahr)
        else:
            int_temp = int(t[:-1])
            clean_temperatures.append(int_temp)
    return clean_temperatures

print(clean_data(temperatures))

Executing function. Beep-boop.
Executing function. Beep-boop.
Executing function. Beep-boop.
[89.6, 7, 59.0, 46.4]


# 7. What if? What else? Conditionals

## 7.7. Summary

- **Conditions** allow us to alter the flow of our code
- The `if` statement contains:
    - `if` keyword
    - **condition**
    - `else` keyword if applicable
- `Booleans` are `True` or `False`
- We can compare two **variables** with **comparison operators** to have a `True` or `False`
- An `if` statement "activates" if its **condition** **evaluates** to `True`
- **Boolean operators** allow us to combine several **conditions**
    - "All conditions must be `True`" -> `and`
    - "At least one condition must be `True`" -> `or`
    - "This condition must be `False`" -> `not` (`if not False:`)
    - "This element must belong to this list" -> `in`

# 8 Giving code a home

<br>
<center>
    <img src="https://pbs.twimg.com/media/DjHjJ6MWwAAgxeb.jpg">
    </center>


# 8 Giving code a home


- Where does all of this code go?
- Code is stored inside files or **scripts** (`file.py`)
- This is the "original" way to use Python (i.e you _run_ a script/file)
- A bunch of files are what is called a **package**

```
pkg/
    finances.py
    accounting.py
```

- You can make your own **packages**
- Or you can use **packages** developed by other people
- For instance, **Numpy** and **Pandas**, as we'll see in the next class

# 8 Giving code a home

## 8.1. Importing a package

- To use a package, you have to **import** it

In [96]:
import numpy

- We know that **Numpy** has a **function** named `array`

In [97]:
arr = array([1,2,3])
print(arr)

NameError: name 'array' is not defined

- Even though we know `array` exists, Python doesn't know that it comes from **Numpy**
- So, we add the name of the **package** before, and add a dot `.`

In [98]:
arr = numpy.array([1,2,3])
print(arr)

[1 2 3]


- We could also use a `from` **import** statement to only **import** what we need

In [99]:
from numpy import array

In [100]:
arr = array([1,2,3])
print(arr)

[1 2 3]


- Usually when using **Numpy** we want several of its functions, so importing each one with `from` is not practical
- It also makes it harder for someone reading your code to tell which package a function comes from
- We can **import** a package with an **alias** using `as`

In [101]:
import numpy as np

In [102]:
arr = np.array([1,2,3])
print(arr)

[1 2 3]
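The same three import styles work for any package. Here is a sketch using `math`, which comes with Python, so nothing needs to be installed. The alias `m` is an arbitrary choice for illustration.

```python
# 1. Plain import: use the full package name
import math
print(math.sqrt(16))

# 2. Import with an alias: use the short name
import math as m
print(m.sqrt(16))

# 3. Import one function only: no prefix needed
from math import sqrt
print(sqrt(16))
```

All three lines call the same function, so they print the same value.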


# 8 Giving code a home

## 8.2. Exercise - calculating area of a circle

We want to calculate the area of a circle.
Remember the formula:

$$A = \pi R^2$$

But we don't know the value of $\pi$ and 3.14 is just not good enough.
Don't worry, we have the `math` package from Python.

1. Import the `math` package that comes with Python
2. Assign `math.pi` to a variable named `pi`
3. Create a function named `circle_area` that returns the area of a circle, when given an argument `radius`
4. Print the area for a circle with radius 10

In [103]:
import math
pi = math.pi

def circle_area(radius):
    return pi * radius**2
print(circle_area(10))

314.1592653589793


# 8 Giving code a home

## 8.3. Summary

- Code lives in **packages**. This is how code is distributed across the world
- Python already comes with some **packages** like `math`
- **Numpy** and **Pandas** are packages
- We can **import** a package in multiple ways:
    - `import numpy`
    - `import numpy as np`
    - `from numpy import array`

# 9. Homework Assignment

## 9.1. Creating your first package

1. Where you created your notebook, create a file called `script.py`
2. Put all the **functions** done in class into that file
3. Open a new notebook, and import your script (`import script`)
4. Try using **your functions** through **your package**
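A hypothetical sketch of what `script.py` might look like, assuming you keep two of the functions written in the exercises above (add the others from class in the same way). The commented lines at the bottom show how a new notebook in the same folder could use them.

```python
# --- script.py ---
import math


def inflation_rate(cpi_1, cpi_2):
    # Relative change between two CPI values
    return (cpi_2 - cpi_1) / cpi_1


def circle_area(radius):
    # Area of a circle with the given radius
    return math.pi * radius**2


# --- in a new notebook, in the same folder ---
# import script
# print(script.circle_area(10))
```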

## 9.2. Slides hosted on Github

1. Go to `https://pedroallenrevez.github.io/PythonClasses/1-Python/index.slides.html#/` and access the class slides
