# Introduction to Python - Part 2: Containers

In this notebook we will briefly look at data containers and simple programming structures:
1. We will introduce the concept of **lists**
1. We will use **for loops** to automate repetitive tasks

___
#### Acknowledgement
This notebook loosely follows the content of [Chapter 3](https://learning.oreilly.com/library/view/python-crash-course/9781098156664/c03.xhtml) and [Chapter 4](https://learning.oreilly.com/library/view/python-crash-course/9781098156664/c04.xhtml) of _Python Crash Course, 3rd Edition_ by Eric Matthes. Code from the book can be downloaded from the author's [GitHub repository](https://github.com/ehmatthes/pcc_3e).
___
A **container** is a data structure in Python that **contains other objects** or data. Containers are useful for grouping data together and allowing for efficient access and modification of that data. Some common containers are:

- **`list`** (list: mutable; indexed by integers; items are stored in the order they were added)
  - `[3, 5, 6, 3, 'dog', 'cat', False]`
- **`tuple`** (tuple: immutable; indexed by integers; items are stored in the order they were added)
  - `(3, 5, 6, 3, 'dog', 'cat', False)`
- **`set`** (set: mutable; not indexed at all; items are NOT stored in the order they were added; can only contain hashable objects, which in practice means immutable ones; does NOT contain duplicate objects)
  - `{3, 5, 6, 3, 'dog', 'cat', False}`
- **`dict`** (dictionary: mutable; values are looked up by immutable keys rather than by position; since Python 3.7, key-value pairs are kept in the order they were added)
  - `{'name': 'Jane', 'age': 23, 'fav_foods': ['pizza', 'fruit', 'fish']}`

When defining lists, tuples, or sets, use commas (,) to separate the individual items. When defining dicts, use a colon (:) to separate keys from values and commas (,) to separate the key-value pairs.
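
As a minimal sketch of the differences listed above, the cell below builds one container of each type from similar items. The variable names (`items_list`, `items_tuple`, `items_set`, `person`) are made up for illustration and are not used elsewhere in this notebook.

```python
# Illustrative names only; none of these variables appear elsewhere in the notebook
items_list = [3, 5, 6, 3, 'dog', 'cat']
items_tuple = (3, 5, 6, 3, 'dog', 'cat')
items_set = {3, 5, 6, 3, 'dog', 'cat'}
person = {'name': 'Jane', 'age': 23}

# Lists and tuples keep duplicates, so both have 6 items
print(len(items_list), len(items_tuple))

# A set silently drops the duplicate 3, leaving 5 items
print(len(items_set))

# Lists are mutable: we can replace an item in place
items_list[0] = 99
print(items_list)

# Tuples are immutable: the same assignment raises a TypeError
try:
    items_tuple[0] = 99
except TypeError as error:
    print(f'Tuples cannot be changed: {error}')

# Dict values are looked up by key rather than by position
print(person['name'])
```

The `try`/`except` construct is only used here so that the cell keeps running after the expected error.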

___
## Part 1 - Lists

See [Chapter 3](https://learning.oreilly.com/library/view/python-crash-course/9781098156664/c03.xhtml) of _Python Crash Course, 3rd Edition_ by Eric Matthes.

**Lists** are by far the most used and most useful type of container in Python. A list is an **ordered collection of elements**. A list can hold items of any type, such as strings or numbers, and the items keep their order but do not need to be related to one another.

Since lists typically contain multiple elements, it is conventional to give them plural names. In Python, lists are **denoted by square brackets** `[]`, with individual elements **separated by commas**.

In [1]:
tech_firms = ['Amazon', 'Apple', 'Google', 'Microsoft']
stock_prices = [12, 23, 31, 44]

print(tech_firms)
print(stock_prices)

['Amazon', 'Apple', 'Google', 'Microsoft']
[12, 23, 31, 44]


### Accessing Elements in a List

Individual elements within a list can be **accessed using their position** (or more correctly their _index_) inside square brackets:


In [2]:
tech_firms[0]

'Amazon'

Please notice that in Python the **first element** of a list has position `0` and not `1`. This zero-based indexing is important to keep in mind. Similarly, if we want to access the **last element** of a list we can indicate its position as `-1`. This is very convenient because it allows us to access the _last_ element even if we do not know how many objects are contained in the list.

In [3]:
tech_firms[-1]

'Microsoft'
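
To make the link between positive and negative positions concrete, here is a small sketch. It relies on the `len()` function, which is introduced later in this notebook, and the variable `last_position` is a name chosen only for this example.

```python
tech_firms = ['Amazon', 'Apple', 'Google', 'Microsoft']

# The last element sits at position len(list) - 1; -1 is a shorthand for it
last_position = len(tech_firms) - 1
print(tech_firms[last_position] == tech_firms[-1])

# Asking for a position that does not exist raises an IndexError
try:
    tech_firms[4]
except IndexError as error:
    print(f'Index 4 is out of range: {error}')
```

With four elements the valid positive positions are `0` to `3`, so position `4` fails even though the list has a "fourth" element.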

An individual element of a list, accessed by its index, can be used like any **ordinary variable**:

In [4]:
print(f'The most valuable tech firm is {tech_firms[1]}')

The most valuable tech firm is Apple


___
### Exercise 1b.01
Create a list called `cities` with the name of **five cities** that you would like to visit. Then print the second-to-last city name in ALL CAPS.

In [5]:
cities = ['Buenos Aires', 'Nairobi', 'Montreal', 'Dublin', 'Kyoto']
print(cities[-2].upper())

DUBLIN


___
### Exercise 1b.02
Using the two lists created above (`tech_firms` and `stock_prices`), use the f-string notation to **print the string** _"The price of Google is 23 dollars"_ by accessing the content of the two lists **using the position** of the individual elements.

In [6]:
print(f'The price of {tech_firms[2]} is {stock_prices[1]} dollars')

The price of Google is 23 dollars


___
### Modifying Elements of a List
Besides accessing elements in a list, we may want to **modify the list** itself by changing, adding or removing items from it. Here we will look at the most common methods and tools to work with lists.

First of all, we can modify an element of a list by assigning a new value to a specific position. For example, let's create a new list with cool AI firms:

In [7]:
ai_firms = ['Open AI', 'Deepmind', 'Mindverse']
print(ai_firms)

['Open AI', 'Deepmind', 'Mindverse']


Since Google bought 10% of **Anthropic** for $300m in early 2023, we think that this company should now be in our top 3 list.

In [8]:
ai_firms[-1] = 'Anthropic'
print(ai_firms)

['Open AI', 'Deepmind', 'Anthropic']


As we can see we have modified the last element of the list. 

#### Adding new elements
If we want to add new elements to a list we can use the **[`list.append()`](https://www.w3schools.com/python/ref_list_append.asp)** method to add the new element **at the end** of the list, or we can use **[`list.insert()`](https://www.w3schools.com/python/ref_list_insert.asp)** to add the new element in a **specific position**. 

In [9]:
ai_firms.append('Mindverse')
print(ai_firms)

['Open AI', 'Deepmind', 'Anthropic', 'Mindverse']


In [10]:
ai_firms.insert(3, 'TruEra')
print(ai_firms)

['Open AI', 'Deepmind', 'Anthropic', 'TruEra', 'Mindverse']


#### Removing elements from a list
We can remove elements from a list by using the **[`list.pop()`](https://www.w3schools.com/python/ref_list_pop.asp)** method.

In [11]:
ai_firms.pop(3)
print(ai_firms)

['Open AI', 'Deepmind', 'Anthropic', 'Mindverse']


The `list.pop()` method has two interesting features. First, we can use the method without the positional indicator. In this case we will **pop the last element** of the list. Second, we can assign the "popped element" to a variable for further use.

In [12]:
old_firm = ai_firms.pop()
print(ai_firms)

print(f'{old_firm} used to be in our list of cool AI companies')

['Open AI', 'Deepmind', 'Anthropic']
Mindverse used to be in our list of cool AI companies
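
`list.pop()` removes an element by position. When we know the value but not its position, the standard `list.remove()` method is an alternative; it is not covered in the cells above, so the following is only a short sketch of how the two compare. The variable `popped` is a name chosen for this example.

```python
ai_firms = ['Open AI', 'Deepmind', 'Anthropic', 'Mindverse']

# pop() removes by position and returns the removed element
popped = ai_firms.pop(1)

# remove() deletes the first element equal to the given value and returns None
ai_firms.remove('Mindverse')

print(popped)
print(ai_firms)
```

Note that `remove()` raises a `ValueError` if the value is not in the list.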


#### Other Stuff
Other useful methods and commands can be used to **sort a list** in alphabetical order (**[`list.sort()`](https://www.w3schools.com/python/ref_list_sort.asp)**), to reverse a list (**[`list.reverse()`](https://www.w3schools.com/python/ref_list_reverse.asp)**) and to measure the length of a list (**[`len(list)`](https://www.w3schools.com/python/ref_func_len.asp)**).

In [13]:
ai_firms.sort()
print(ai_firms)

['Anthropic', 'Deepmind', 'Open AI']


In [14]:
ai_firms.reverse()
print(ai_firms)

['Open AI', 'Deepmind', 'Anthropic']


In [15]:
len(ai_firms)

3

Please notice that `len(list)` **is a function and not a method**, hence the different syntax. A function is a standalone command that takes the object as an argument, as in `len(ai_firms)`, while a method belongs to the object itself and is called with dot notation, as in `ai_firms.sort()`.
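
The difference in syntax can be seen side by side in the sketch below. The variable names `number_of_firms`, `position` and `result` are chosen only for this illustration, and `list.index()` is a standard list method that is not otherwise used in this notebook.

```python
ai_firms = ['Open AI', 'Deepmind', 'Anthropic']

# Function: the list is passed in as an argument
number_of_firms = len(ai_firms)

# Method: called on the list itself with dot notation
position = ai_firms.index('Deepmind')

# In-place methods such as sort() change the list and return None,
# so assigning their result to a variable is a common mistake
result = ai_firms.sort()

print(number_of_firms, position)
print(result)
print(ai_firms)
```

The last two lines show that the list itself has been sorted while `result` holds `None`.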

___
### Exercise 1b.03
Many times we may want to show a list in alphabetical order **without losing the original sequence**. Have a look at the **[`sorted(list)`](https://www.w3schools.com/python/ref_func_sorted.asp)** function and apply it to print a sorted version of the following list of cities. Afterwards print the original list to make sure that the original sequence has not been lost.

In [16]:
itinerary = ['Los Angeles', 'Las Vegas', 'Salt Lake City', 'Kansas City', 'Saint Louis', 'Philadelphia']

In [17]:
print(sorted(itinerary))
print(itinerary)

['Kansas City', 'Las Vegas', 'Los Angeles', 'Philadelphia', 'Saint Louis', 'Salt Lake City']
['Los Angeles', 'Las Vegas', 'Salt Lake City', 'Kansas City', 'Saint Louis', 'Philadelphia']


___
### Exercise 1b.04
In the `itinerary` list replace Kansas City with Topeka. Print the modified itinerary. Try to achieve this result in two different ways: using a single command, and using two different methods.

In [18]:
itinerary = ['Los Angeles', 'Las Vegas', 'Salt Lake City', 'Kansas City', 'Saint Louis', 'Philadelphia']

In [19]:
itinerary[3] = 'Topeka'
print(itinerary)

['Los Angeles', 'Las Vegas', 'Salt Lake City', 'Topeka', 'Saint Louis', 'Philadelphia']


Alternatively, using two methods:

In [20]:
itinerary = ['Los Angeles', 'Las Vegas', 'Salt Lake City', 'Kansas City', 'Saint Louis', 'Philadelphia']

itinerary.pop(3)
itinerary.insert(3, 'Topeka')
print(itinerary)

['Los Angeles', 'Las Vegas', 'Salt Lake City', 'Topeka', 'Saint Louis', 'Philadelphia']


___
## Part 2 - Loops

See [Chapter 4](https://learning.oreilly.com/library/view/python-crash-course/9781098156664/c04.xhtml) of _Python Crash Course, 3rd Edition_ by Eric Matthes.

Very often we will need to **perform the same operation** on each element of a list. For example, let's assume that we need to print the name of each company in the Dow Jones Industrial Average Index (DJIA).

In [21]:
djia_companies = ['3M', 'American Express', 'Amgen', 'Apple', 'Boeing', 'Caterpillar', 
                  'Chevron', 'Cisco', 'Coca-Cola', 'Disney', 'Dow', 'Goldman Sachs', 
                  'Home Depot', 'Honeywell', 'IBM', 'Intel', 'Johnson & Johnson', 'JPMorgan Chase', 
                  'McDonalds', 'Merck', 'Microsoft', 'Nike', 'Procter & Gamble', 'Salesforce', 
                  'Travelers', 'UnitedHealth Group', 'Verizon', 'Visa', 'Walgreens Boots Alliance', 'Walmart']


Surely we **do not want to repeat** 30 `print()` commands... that would be tedious and error-prone.

In [22]:
print(djia_companies[0])
print(djia_companies[1])
print(djia_companies[2])

3M
American Express
Amgen


Thankfully most programming languages allow us to "loop" over the elements of a container. **Looping** enables you to **execute the same operation** or sequence of operations on every element in a list. Consequently, you can handle lists of any size, even those containing millions of items, with ease and efficiency. Let's see an example of a `for` loop to print the name of all the companies.

In [23]:
for company in djia_companies:
    print(company)

3M
American Express
Amgen
Apple
Boeing
Caterpillar
Chevron
Cisco
Coca-Cola
Disney
Dow
Goldman Sachs
Home Depot
Honeywell
IBM
Intel
Johnson & Johnson
JPMorgan Chase
McDonalds
Merck
Microsoft
Nike
Procter & Gamble
Salesforce
Travelers
UnitedHealth Group
Verizon
Visa
Walgreens Boots Alliance
Walmart


One of the main advantages of Python over other languages is that the code is "easy to read in English". In this case **the first line** of the looping structure is quite self-explanatory: _For every company contained in the list djia_companies..._ 

We also notice that the **second line** of the structure is indented. The indentation indicates that all the indented actions have to be **performed on each element** of the list. We could have multiple operations. For example, we can perform two different operations on each company:

In [24]:
for company in djia_companies:
    capitalized_name = company.upper()
    print(capitalized_name)

3M
AMERICAN EXPRESS
AMGEN
APPLE
BOEING
CATERPILLAR
CHEVRON
CISCO
COCA-COLA
DISNEY
DOW
GOLDMAN SACHS
HOME DEPOT
HONEYWELL
IBM
INTEL
JOHNSON & JOHNSON
JPMORGAN CHASE
MCDONALDS
MERCK
MICROSOFT
NIKE
PROCTER & GAMBLE
SALESFORCE
TRAVELERS
UNITEDHEALTH GROUP
VERIZON
VISA
WALGREENS BOOTS ALLIANCE
WALMART


All the indented commands will be repeated. Most Python editors indent automatically after we end the `for` line with a colon and press Enter. We need to **manually de-indent** at the end of the loop:

In [25]:
for company in djia_companies:
    capitalized_name = company.upper()
    print(capitalized_name)
    
print('These are the constituents of the DJIA Index')

3M
AMERICAN EXPRESS
AMGEN
APPLE
BOEING
CATERPILLAR
CHEVRON
CISCO
COCA-COLA
DISNEY
DOW
GOLDMAN SACHS
HOME DEPOT
HONEYWELL
IBM
INTEL
JOHNSON & JOHNSON
JPMORGAN CHASE
MCDONALDS
MERCK
MICROSOFT
NIKE
PROCTER & GAMBLE
SALESFORCE
TRAVELERS
UNITEDHEALTH GROUP
VERIZON
VISA
WALGREENS BOOTS ALLIANCE
WALMART
These are the constituents of the DJIA Index


The empty line is just there to make the code **easier to read**. What really indicates the end of the loop is the change in indentation.

Another common device to make the code easier to read is to **choose a reasonable name** for the looping variable. In this case we chose `company` because it makes a clear reference to the name of the list. If we want to loop over a list called `books`, we can make the code easier to read by using:
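
The effect of indentation can be sketched with a shorter list. Here the loop body uses `list.append()` from Part 1 to collect results, and the `print()` call sits outside the loop so it runs only once. The names `firms` and `capitalized_firms` are chosen for this example.

```python
firms = ['3M', 'Amgen', 'Apple']
capitalized_firms = []  # start from an empty list

for firm in firms:
    # Indented: runs once per element
    capitalized_firms.append(firm.upper())

# Not indented: runs once, after the loop has finished
print(capitalized_firms)
```

If the `print()` line were indented as well, the growing list would be printed three times, once per iteration.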

```python
for book in books:
```

rather than something generic such as

```python
for element in books:
```
___
### Looping over a List of Numbers
So far we have used lists of strings. The same features apply to **lists of numbers**. For example:

In [26]:
prices = [12, 23, 53, 12, 3, 33, 21]

for price in prices:
    double = price * 2
    print(f'The double of {price} is {double}')

The double of 12 is 24
The double of 23 is 46
The double of 53 is 106
The double of 12 is 24
The double of 3 is 6
The double of 33 is 66
The double of 21 is 42


Sometimes we may want to **iterate over all numbers** between a given lower and upper limit, without having to create the list manually. In this case we can use the function **[`range()`](https://www.w3schools.com/python/ref_func_range.asp)** to create a sequence of numbers. Note that the sequence starts at 0 by default and stops one number before the upper limit.

In [27]:
for number in range(5):
    print(number)

0
1
2
3
4


In [28]:
for number in range(3,7):
    print(number)

3
4
5
6


We can also use the **[`list()`](https://www.programiz.com/python-programming/methods/built-in/list)** constructor together with `range()` to create a numerical list:

In [29]:
numbers = list(range(4))
print(numbers)

[0, 1, 2, 3]
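
One possible use of `range()` is to loop over positions rather than elements, which lets us read from two lists at once. The sketch below reuses the `tech_firms` and `stock_prices` lists from Part 1 and assumes both lists have the same length; `position`, `line` and `lines` are names chosen for this example.

```python
tech_firms = ['Amazon', 'Apple', 'Google', 'Microsoft']
stock_prices = [12, 23, 31, 44]

lines = []
# range(len(tech_firms)) produces the positions 0, 1, 2, 3
for position in range(len(tech_firms)):
    line = f'The price of {tech_firms[position]} is {stock_prices[position]} dollars'
    lines.append(line)
    print(line)
```

If `stock_prices` were shorter than `tech_firms`, the loop would stop with an `IndexError`, so this pattern only suits lists that are known to line up.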


___
### Exercise 1b.05
The **[`range()`](https://www.w3schools.com/python/ref_func_range.asp)** function accepts a third argument, the size of the "step" between consecutive numbers. Use this to print all the **odd numbers** between 1 and 10.

In [30]:
for number in range(1,10,2):
    print(number)

1
3
5
7
9


___
### Exercise 1b.06 (Black Belt)
Use a loop to **sum all the numbers** between 1 and 20. You should only print the final answer (it should be 210).

In [31]:
total = 0  # We initialize the result variable

for number in range(21):
    # In each iteration we add a number to the running total
    total = total + number

# Outside the loop we print the result
print(total)

210
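
As a sanity check on the answer, the sketch below computes the same total in three ways: with a loop, with the built-in `sum()` function, and with the closed-form formula $n(n+1)/2$ for $n = 20$. The names `builtin_total`, `formula_total` and `n` are chosen for this example.

```python
# Loop version, starting at 1 since adding 0 does not change the total
total = 0
for number in range(1, 21):
    total += number

# Cross-check 1: the built-in sum() function
builtin_total = sum(range(1, 21))

# Cross-check 2: the closed-form formula n * (n + 1) / 2 with n = 20
n = 20
formula_total = n * (n + 1) // 2

print(total, builtin_total, formula_total)
```

All three values agree at 210, since $20 \times 21 / 2 = 210$.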


___
### Slicing a List
Sometimes we do not want to use, or to loop through, an entire list. In this case we can **"slice"** a list using the **positional indicators**:

In [32]:
prices = [12, 23, 53, 7, 3, 33, 21]

print(prices[2:4])

[53, 7]


While the technique is easy to understand, the **indexing can be confusing** at first...

First of all, we need to remember that **Python counts from 0**, so the element with index equal to 2 is actually the 3rd element of the list. Second, as with the `range()` function, **Python stops one item before** the second index you specify. So in this case, the last element of our slice is the element with index equal to 3, the 4th element in the original list.
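
The two rules can be checked with a short sketch. The names `start`, `stop`, `price_slice` and `included` are chosen for this example only.

```python
prices = [12, 23, 53, 7, 3, 33, 21]
start = 2
stop = 4

price_slice = prices[start:stop]

# The slice contains stop - start elements
print(len(price_slice) == stop - start)

# The positions included are the ones range(start, stop) produces: 2 and 3
included = []
for position in range(start, stop):
    included.append(prices[position])

print(included == price_slice)
```

Both comparisons print `True`, which shows that a slice `prices[start:stop]` covers the same positions as `range(start, stop)`.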

There are other ways to use the slicing notation. For example if we omit the first index in a slice, Python automatically **starts at the beginning** of the list:

In [33]:
print(prices[:4])

[12, 23, 53, 7]


Conversely, if we omit the second index, the slice will **go to the end** of the original list:

In [34]:
print(prices[2:])

[53, 7, 3, 33, 21]


We can also use negative numbers to start **counting from the end** of the list:

In [35]:
print(prices[:-3])

[12, 23, 53, 7]


In [36]:
print(prices[-2:])

[33, 21]


We can also use slices in a `for` loop:

In [37]:
for price in prices[2:5]:
    print(price*2)

106
14
6
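
One property worth knowing, sketched below, is that a slice is a new list: changing it leaves the original list untouched. The name `first_three` is chosen for this example.

```python
prices = [12, 23, 53, 7, 3, 33, 21]

first_three = prices[:3]
first_three[0] = 1000  # change the slice only

print(first_three)
print(prices)
```

The first line printed starts with `1000`, while `prices` still starts with `12`.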


___
### Exercise 1b.07
Use the `range()` function to create a list called `first_five` containing the numbers between 1 and 5 (no zero!). Use a loop to print each number in the list excluding the first and the last. In each iteration the loop should print the string _"The number X is between 1 and 5"_.

In [38]:
first_five = list(range(1,6))
print(first_five)

[1, 2, 3, 4, 5]


In [39]:
for number in first_five[1:-1]:
    print(f'The number {number} is between 1 and 5')

The number 2 is between 1 and 5
The number 3 is between 1 and 5
The number 4 is between 1 and 5


___