# Python for everyone -- 05 Lists

## Prelude: using Jupyter notebooks

The document that you are reading right now is a Jupyter notebook. In this course we write our code in such notebooks.

Jupyter notebooks allow you to combine formatted text and executable code side by side, so you can keep your code, explanations and notes, the output, and its interpretation all in one place. They are often used for data science applications, for publishing code together with a paper, and for classes.

The notebooks are made up of cells, which can be either markdown or code.

**Important thing to get used to:**

The value of whatever expression is on the last line of a code cell gets printed out automatically.

In [None]:
2 + 3

You can do other things before it, only the last line gets printed out automatically:

In [None]:
s = "hello"
new_s = s.capitalize() + "!"
new_s

You can print out things before the last line using the `print` function.

In [None]:
x = 10
y = 12
print(f"{x} + {y} = {x+y}")
x-y

And only expressions that evaluate to a value get printed out.

In [None]:
x = 3.14 

The above assignment changed the variable `x`, but did not return anything.

Rule of thumb: whatever you can print out with `print()` gets printed out by the last line too.

In [None]:
print(x)
x

In [None]:
print(x=10)

Another important thing:

Cells are not guaranteed to be executed in the order that they appear. You run them in whatever order and as many times as you decide. This can have strange effects:

In [None]:
print(y)

In [None]:
y = 10

You might even end up deleting the line of code that creates the variable `y`, but that does not delete `y` from the internal state of Python.

You can restart the kernel to start from a clean slate. This deletes all variables and results.

Two good ideas:
* If you are uncertain what a variable contains, check it (for example, by printing it out)!
* When you are done with a project/homework: restart the kernel and test the code again.

## Lists

So far we have looked at data types that contain only a single value (number, text, true/false). However, we often have to work with a sample or collection of data points.

Python also has container types that can contain multiple objects. The first one we look at is the list.

A list of numbers:

In [None]:
l_num = [1, 3, 12, -1]

To define a list, enclose the elements in square brackets `[...]` and separate them with commas.

List of strings:

In [None]:
l_str = ['apple', 'barn', 'hopscotch']

Lists can also contain mixed types:

In [None]:
l_mix = [1, 0.23, "wow!"]

## Accessing elements

We can retrieve elements of a list the same way as we accessed single characters of a string.

Accessing a single element of a list:

In [None]:
l_mix[0]

Same way as for strings: indexing starts from **0**!

In [None]:
l_mix[1]

You can access the last element with `-1`:

In [None]:
l_mix[-1]

Second to last:

In [None]:
l_mix[-2]

#### 🔴 Exercise -- Access elements

Find two ways to access the `"me!"` element of the following list.

In [None]:
l = [1, 2, "me!", 4, 5]


<details><summary><u>Solution.</u></summary>
<p>
    
```python

print(l[2],l[-3])
    
```
    
</p>
</details>

Furthermore, you can select a sublist using slicing:

In [None]:
l_mix[1:3]

- The first number in the slice `[1:3]` (`1` in this case) is the index of the first element **included**
- The second number in the slice `[1:3]` (`3` in this case) is the index of the first element **excluded**
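One way to remember this rule: the number of elements in the slice equals the second number minus the first. A minimal sketch, using a made-up list `letters` that is not part of the notebook:

```python
# `letters` is a hypothetical list used only for illustration
letters = ["a", "b", "c", "d", "e"]

middle = letters[1:3]   # indexes 1 and 2 are included, index 3 is excluded
print(middle)           # ['b', 'c']
print(len(middle))      # 2, which is 3 - 1
```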

Leaving one of the ends of the slice blank extends the sublist to the beginning or end of the list:

In [None]:
l_mix[:3]

In [None]:
l_mix[2:]

#### 🔴 Exercise -- Slicing

Using slicing, select a sublist containing only the strings from the following list.

How many ways can you do it?

In [None]:
r = [1, 2, 3, 4, "chewin'", "out", "a", "rhythm"]


<details><summary><u>Solution.</u></summary>
<p>
    
```python

r[4:], r[-4:], r[4:8], r[-4:8]
    
```
    
</p>
</details>

A list can even contain another list:

In [None]:
l_mix = [1, 0.23, 'wow!', [1, 3, 12, -1]]

In [None]:
l_mix

You can access the inner list:

In [None]:
l_mix[3]

Or:

In [None]:
l_mix[-1]

Access an element of the inner list:

In [None]:
l_mix[3][0]
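The double index is evaluated from left to right: `l_mix[3]` first retrieves the inner list, and `[0]` then indexes into it. The sketch below spells this out in two steps; the variable names `inner` and `first` are introduced only for illustration:

```python
l_mix = [1, 0.23, 'wow!', [1, 3, 12, -1]]

inner = l_mix[3]     # step 1: retrieve the inner list
first = inner[0]     # step 2: retrieve its first element

print(first == l_mix[3][0])   # True: both routes give the same element
print(l_mix[3][1:3])          # slicing the inner list gives [3, 12]
```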

Remember: this syntax also works for accessing single characters or substrings from a string:

In [None]:
name = 'Ghengis Khan'
name[8]

In [None]:
name[8:]

#### 🔴 Exercise -- Slicing

Selecting elements by their index and using slicing, select a substring containing "digital".

In [None]:
l = [1, 2, 3, ["a","b","digital humanities"]]


<details><summary><u>Solution.</u></summary>
<p>
    
```python

l[3][2][:7]
    
```
    
</p>
</details>

## List methods

Similarly to strings, lists also come with a number of methods:

In [None]:
l = []

`.append()`: adds a new element to the end of a list

In [None]:
l.append("hello") # adds a new element to the end of a list

In [None]:
l

In [None]:
l_num

`.count()`: counts the number of occurrences of a value.

In [None]:
l_num = [1, 3, 12, -1]
l_num.count(3)

In [None]:
l_num.append(3)

In [None]:
l_num

In [None]:
l_num.count(3)

`.remove()`: removes the first occurrence of a value.

In [None]:
l_num.remove(-1)

In [None]:
l_num

In [None]:
l_num.remove(3)

In [None]:
l_num
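Two details of `.remove()` are easy to miss: it removes only the first matching element, and it raises a `ValueError` if the value is not in the list at all. A small sketch of both, using a made-up list `colors`; the `in` check shown here is one possible way to avoid the error:

```python
colors = ["red", "blue", "red", "green"]   # hypothetical example list

colors.remove("red")   # removes only the first "red"
print(colors)          # ['blue', 'red', 'green']

# Removing a value that is not in the list raises a ValueError,
# so it can help to check whether the value is present first
if "purple" in colors:
    colors.remove("purple")
print(colors)          # unchanged: ['blue', 'red', 'green']
```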

`.sort()`: sorts the elements in increasing order.

In [None]:
l_num.sort()

In [None]:
l_num

In [None]:
l_num.sort(reverse=True)

In [None]:
l_num

Note an important difference between strings and lists:

String methods never modify the string; they always return a new one:

In [None]:
s = "heirloom tomatoes"
new_s = s.upper()
s, new_s

List methods, on the other hand, often change the list that they are called on:

In [None]:
l = [1,2,3]
l.append("wow!") # does not return anything, changes l
l

* A list is a **mutable** object: it can be modified in place.
* Strings are **immutable**: they cannot be modified without creating a new copy.

This also causes different behavior if you try to change an element of a list or a character of a string:

In [None]:
l = [1 ,2, 3]
l[1] = "wow!"
l

In [None]:
s = "lore"
s[2] = 'v' # raises a TypeError: strings do not support item assignment
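The last cell raises a `TypeError`, because a string cannot be changed in place. One possible workaround, sketched below, is to build a new string from slices of the old one; the name `new_s` is used only for illustration:

```python
s = "lore"

# Build a new string: everything before index 2, the new character,
# and everything after index 2
new_s = s[:2] + "v" + s[3:]

print(s)       # lore  (the original string is unchanged)
print(new_s)   # love
```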

## Sorting in-place vs new list

Above we saw a way to sort a list: `.sort()` modifies the list by re-ordering the elements it contains. This is called sorting **in-place**:

In [None]:
baby_weights = [3100, 4210, 3670, 4830, 3050]
baby_weights.sort() # does not return anything, but changes baby_weights
baby_weights

Python has another function called `sorted()`, which also sorts the values of a list, but instead of modifying the list, it creates a new one and leaves the original untouched.

In [None]:
baby_weights = [3100, 4210, 3670, 4830, 3050]
sorted_baby_weights = sorted(baby_weights) # returns new list, leaves original unchanged
baby_weights, sorted_baby_weights
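A common slip is to assign the result of `.sort()` to a variable: since `.sort()` works in place and returns `None`, the variable ends up holding `None` rather than a list. The sketch below contrasts the two approaches and also shows that `sorted()` accepts the same `reverse=True` argument as `.sort()`. The variable names `heaviest_first` and `result` are illustrative:

```python
baby_weights = [3100, 4210, 3670, 4830, 3050]

heaviest_first = sorted(baby_weights, reverse=True)   # new list, decreasing order
print(heaviest_first)    # [4830, 4210, 3670, 3100, 3050]
print(baby_weights)      # the original order is kept

result = baby_weights.sort()   # sorts in place...
print(result)                  # ...and returns None
print(baby_weights)            # [3050, 3100, 3670, 4210, 4830]
```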

#### 🔴 Exercise -- sorting

Write code that asks the user for a number and appends it to the list below. Then print out the sorted list and the list in its original order.

In [None]:
nums = [1,45,-3, 8]



<details><summary><u>Solution.</u></summary>
<p>
    
```python
new_num = int(input('Enter a new number: '))

nums.append(new_num)
sorted_nums = sorted(nums)

print("Original list:", nums)
print("Sorted list:", sorted_nums)
```
    
</p>
</details>

## Checking contents

You can check if a list contains an element with the `in` operator:

In [None]:
my_pets = ["parrot", "gold fish", "slime mold", "frog"]

pet_type = "cat"
if pet_type in my_pets:
    print(f"I have a {pet_type}!")
else:
    print(f"I do not have a {pet_type}!")
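The `in` operator evaluates to `True` or `False`, so its result can also be inspected directly or negated with `not in`. A short sketch reusing the same list of pets:

```python
my_pets = ["parrot", "gold fish", "slime mold", "frog"]

print("frog" in my_pets)      # True
print("cat" in my_pets)       # False
print("cat" not in my_pets)   # True

# The match must be exact: "gold" is not an element, "gold fish" is
print("gold" in my_pets)      # False
```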

## Useful functions

Functions can take lists as arguments. Commonly used examples are `len`, `max`, `min` and `sum`.

Length of a list = number of elements contained inside

In [None]:
l_num = [3, -1, 12, 5]
len(l_num)

You can find the maximum element of a list using the built-in `max` function:

In [None]:
max(l_num)

Likewise:

In [None]:
min(l_num)

It should come as no surprise that the `sum()` function calculates the sum of the values in a list:

In [None]:
sum(l_num)
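These functions can be combined freely. As a small illustration, the sketch below computes the range of the values (the difference between the largest and the smallest); the variable name `value_range` is made up for this example. `max` and `min` also work on lists of strings, where they compare the strings alphabetically:

```python
l_num = [3, -1, 12, 5]

value_range = max(l_num) - min(l_num)   # 12 - (-1)
print(value_range)                      # 13

l_str = ['apple', 'barn', 'hopscotch']
print(min(l_str), max(l_str))           # apple hopscotch
```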

#### 🔴 Exercise -- Average

Students receive a grade in the class on a scale of 0 to 100. The following list contains the grades of the students. Calculate the average grade.

Hint: you need to use two of the functions above.

In [None]:
grades = [84, 97, 87, 86, 85, 82, 68, 78, 80, 71, 72, 93, 96, 93, 93, 95, 96, 68, 71, 71]



<details><summary><u>Solution.</u></summary>
<p>
    
```python
avg = sum(grades)/len(grades)
avg 
```
    
</p>
</details>

### 🔴 Exercise in two parts

If you were browsing the internet in the early 2000s, you might have encountered so-called corporate bullshit generators. Something like [this](https://posfaim.github.io/dnds5027/bs_gen.html).

We will use Python to generate similar random sentences. When we encounter a problem that we would like to solve with programming, the first step is to make a plan and break it down into smaller tasks. We can then tackle these tasks one by one, always testing the code along the way.

#### 🔴 Exercise -- random word

Define a new list `l_words` that contains a bunch of words. Then write code that randomly picks a word from this list.

Hint: We already briefly used the `random` module which contains functions related to randomness. Check out the documentation of the `random.choice` function.

<details><summary><u>Solution.</u></summary>
<p>
    
```python

import random

l_words = ['apple', 'disgust', 'digital', 'hello']

print( random.choice(l_words) )
    
```
    
</p>
</details>

#### 🔴 Exercise -- random sentence

Concatenate the random words to create a random sentence.

<details><summary><u>Solution.</u></summary>
<p>
    
```python

import random

l_subject = ['Marton', 'Robyn', 'A dog', 'A police officer']
l_verb = ['ate', 'looked at', 'sat on']
l_object = ['my homework', 'my lunch', 'my car keys']
    
random.choice(l_subject) + ' ' + random.choice(l_verb) + ' ' + random.choice(l_object) + '.' 

    
```
    
</p>
</details>

## Iterating over a list

A common reason for writing a program is to automate: the computer can rapidly perform a task many times that would take a human very long.



In [None]:
books = [
    'Moby Dick (1851)',
    'The world according to Garp (1978)',
    'Networks: an Introduction (2018)',
    'Portraits of Empires (2023)'
]

Let's say you would like to print out the titles without the year they were published:

In [None]:
print(books[0][:-7])

In [None]:
print(books[0][:-7])
print(books[1][:-7])
print(books[2][:-7])
print(books[3][:-7])

This runs in a jiffy. But writing the code is cumbersome:
* Lots of copy-pasting
* Hard to change
* Not very readable

Also, what if instead of 4 books, I have 30? Or 3 million?

The solution is to use a **for loop** to iterate over the elements of a list and execute the same code for each element:

In [None]:
for book in books:
    print(book[:-7])

We can also collect the titles in a new list for later use:

In [None]:
titles = []
for book in books:
    titles.append(book[:-7])
titles
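The same pattern works for any operation on the elements. As a sketch, the loop below collects the publication years instead of the titles. It assumes that every entry ends with a four-digit year in parentheses, as in the `books` list; the names `years` and `year` are introduced only for this example:

```python
books = [
    'Moby Dick (1851)',
    'The world according to Garp (1978)',
    'Networks: an Introduction (2018)',
    'Portraits of Empires (2023)'
]

years = []
for book in books:
    year = int(book[-5:-1])   # the four characters before the closing ")"
    years.append(year)

print(years)        # [1851, 1978, 2018, 2023]
print(min(years))   # 1851, the oldest book in the list
```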

#### 🔴 Exercise -- punctuation in Moby Dick

What is the most common punctuation in Moby Dick? Use a `for` loop to print out the number of occurrences of the following punctuation marks: ".", ",", "!", and "?"

In [None]:
import requests # import package

# download resource
response = requests.get("https://posfaim.github.io/dnds5027/data/moby_dick-english.txt") # this returns a special object, not a string
moby_dick = response.text # extract the contents of the webpage into a string

<details><summary><u>Solution.</u></summary>
<p>
    
```python
for punc in [".", ",", "!", "?"]:
    count = moby_dick.count(punc)
    print(f"There are {count} {punc}-s in Moby Dick.")
```
    
</p>
</details>

Loops greatly expand what we can do with programming, and next week will be devoted to them.