## 1-minute introduction to Jupyter ##

A Jupyter notebook consists of cells. Each cell contains either text or code.

A text cell will not have any text to the left of the cell. A code cell has `In [ ]:` to the left of the cell.

If the cell contains code, you can edit it. Press <kbd>Enter</kbd> to edit the selected cell. While editing the code, press <kbd>Enter</kbd> to create a new line, or <kbd>Shift</kbd>+<kbd>Enter</kbd> to run the code. If you are not editing the code, select a cell and press <kbd>Ctrl</kbd>+<kbd>Enter</kbd> to run the code.

# Lesson 6a: Data structures - list and tuple

So far we have been dealing with variables that represent individual values: an `int`, `float`, `bool`, or `str`. This soon becomes unmanageable when we are working with larger amounts of data.

For example, if we are working with student records, we know each student might have the following information to manage:

- name: str
- class: str
- date_of_graduation: date
- age: int
- height: float
- weight: float
- is_sg_citizen: bool

We will need a way to iterate through these item labels and their values. Furthermore, we need a way to group students e.g. by class.

Most programming languages provide ways to group multiple pieces of data under a single object or variable. These ways of grouping data are called **data structures**.

## Python lists

In Python, a list is a simple collection of objects.

You can initialise a list from a collection of objects by enclosing them in `[`square brackets`]` and separating them with commas (`,`).

Try the following lines of code one by one in the cell below:

1. `numbers = [1, 2, 3, 4, 5]`  
   (_A list of numbers. We generally name lists by the kind of data they contain, and in plural._)
2. `fruits = ['apple', 'blueberry', 'carrot',]`  
   (_A list can contain strings too. And it will ignore any trailing commas._)
3. `items = [1, '2', 3.0, '4.0', 5]`  
   (_In fact, a list can consist of a mixture of integers, floats, strings, and other objects._)
4. `data = []`  
   (_This is how you initialise an empty list._)
5. `data = list()`  
   (_Another way to initialise an empty list._)

Remember that an assignment statement (using the assignment (`=`) operator) will not produce any output. You will have to invoke the variable again on a separate line.
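As a small sketch of that pattern (reusing the name `numbers` from the list above), the first line below shows nothing when run, while the bare variable name on the last line of a notebook cell displays the value:

```python
numbers = [1, 2, 3, 4, 5]  # assignment: produces no output
numbers                    # a bare variable name on the last line of a cell displays its value
```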

In [None]:
# Write your code here



### List indexing

List indexing works very similarly to indexing and slicing strings, which we just covered in lesson 5. Strings and lists are both sequences in Python, so they share the same indexing syntax.

The elements of a list can be retrieved using their **index**.

The first element of a list has index `0`. The subsequent elements have an index that is 1 greater than the previous element. List indexes always go in running order.

`list[0]` returns the first element of the list. `list[1]` returns the next element, and so on.
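The short sketch below illustrates these rules. The list `colours` and the variable names are made up for illustration only:

```python
# Hypothetical example list, used only for illustration
colours = ['red', 'green', 'blue']

first = colours[0]   # index 0 is the first element
second = colours[1]  # index 1 is the next element
last = colours[len(colours) - 1]  # the last index is always len(list) - 1

print(first, second, last)  # prints: red green blue
```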

### List slicing

List slicing allows you to return multiple elements from a list, just as string slicing allows you to return substrings of characters from a string.

Try the following lines of code one by one in the cell below:

`yourlist = ["Alice", "Bob", "Charlie", "Dawn", "Ernest"]`
1. `yourlist[0:0]`  
   (_This should give you an empty list._)
2. `yourlist[0:1]`  
    (_How does this differ from `yourlist[0]`_?)
3. `yourlist[0:4]`  
    (_Why doesn't this return the whole list_?)
4. `yourlist[0:6]`  
    (_If the slice exceeds the largest index, you won't get an error but those indexes will be ignored._)
5. `yourlist[0:5:2]`  
   (_Can you figure out what the third number in the slice does_?)
6. `yourlist[0:4:2]`  
    (_Why is `yourlist[4]` not in the result_?)
7. `yourlist[0::2]`  
   (_What happens when you remove any of the numbers in the slice? Can you figure out what the default values are_?)
8. `yourlist[::-1]`  
   (_This is a common pattern to get the reverse of a list._)
9. `yourlist[0:4:-1]`  
   (_Why doesn't this work_?)
10. `yourlist[4:0:-1]`  
    (_Is this the same as expression 8_?)
11. `type(yourlist[0:0])`  
    (_A list indexed with a slice always returns a list, even if the result is only an empty list._)

In [None]:
yourlist = ["Alice", "Bob", "Charlie", "Dawn", "Ernest"]  #Do not remove this line.

# Type your code below this line.



While list slicing can be very powerful, it can also be confusing to read and difficult to track correctly. Use more readable programming features where possible.
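One way to keep a slice readable is to name its three parts. The sketch below uses a made-up list `letters` and the names `start`, `stop`, and `step` (all chosen for illustration) to show that a slice selects the same elements as a loop over `range(start, stop, step)`:

```python
# Hypothetical example list, used only for illustration
letters = ['a', 'b', 'c', 'd', 'e', 'f']

start, stop, step = 1, 5, 2

# The slice picks the elements at indexes start, start + step, ... up to (not including) stop
sliced = letters[start:stop:step]

# The same selection written as an explicit loop over range(start, stop, step)
picked = []
for i in range(start, stop, step):
    picked.append(letters[i])

print(sliced)  # ['b', 'd']
print(picked)  # ['b', 'd']
```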

### List-mutating methods ##

A list has built-in methods that allow you to add or remove elements from it; such operations that change the contents of a data structure are called **mutation operations**. Recall that methods use the `object.method()` syntax and must be called from an object; `method()` will not work.

Try the expressions in the cell below, one by one:  
(Remember that the assignment (`=`) operator does not produce any output.)

`countries = ["America", "Brazil", "Cambodia", "Dominican Republic", "Ethiopia", "France", "Germany", "Hungary"]`  

1. `countries.append("India")`  
   (_Adds a single item at the end of the list._)
2. `countries.append(["India", "Japan"])`  
   (_Did this produce the effect you expected_?)
3. `countries.insert(4, "England")`  
   (_The `.insert(n, item)` method lets you insert `item` at the `n`th index._)
4. `del countries[0]`  
    (_To delete an element from a list, use the `del` keyword on the list element specified by index._)
5. `del countries[0:3]`  
    (_You can use the `del` keyword to delete multiple elements by list slicing._)

In [None]:
#Do not remove this line.
countries = ["America", "Brazil", "Cambodia", "Dominican Republic", "Ethiopia", "France", "Germany", "Hungary"]

# Type your code below this line.
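# Note: .extend() adds each item of its argument separately,
# so passing a string adds it one character at a time.
countries.extend("India")
countries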

**Explore:**

1. Do these methods work for strings?
2. Which of these methods return a value? Which ones do not return a value (i.e. return `None`)?

### List operators

You previously explored the use of the concatenation operator `+` and repetition operator `*` with strings (in lesson 1). These operators work with lists too.

Try the following expressions in the cell below:
(Notice that the list methods above modify the original list. The `+` and `*` operators below do not; they return a new list and leave the original unchanged.)

`countries = ["America", "Brazil", "Cambodia", "Dominican Republic", "Ethiopia", "France", "Germany", "Hungary"]`
1. `countries + "Ireland"`  
   (_Doesn't work. The error provides a clue: lists can only be "added" to other lists, not to strings._)
2. `countries + ["Ireland"]`  
   (_This works. You have to convert the string into a list element by putting it in a list first. This is called **list concatenation**. A **new list is returned** containing the result, without mutating the original `countries`._)
3. `countries + ["Ireland", "Japan", "Kenya"]`  
   (_You can add lists with multiple elements together this way._)
4. `countries[0] = "Australia"`  
   (_To reassign a list element to a different value, address the element using its index._)
5. `countries[0] = Australia`  
   (_Remember that strings need the quote marks `''` or `""` otherwise they get interpreted as variables._)
6. `countries * 3`  
   (_`*` operator with an integer works on lists too._)
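The difference between returning a new list and changing the existing one can be seen in the sketch below. The list `cities` and the other names are made up for illustration:

```python
# Hypothetical example list, used only for illustration
cities = ['Singapore', 'Tokyo']

longer = cities + ['Paris']   # concatenation returns a new list
repeated = cities * 2         # repetition also returns a new list

print(longer)    # ['Singapore', 'Tokyo', 'Paris']
print(repeated)  # ['Singapore', 'Tokyo', 'Singapore', 'Tokyo']
print(cities)    # ['Singapore', 'Tokyo'] (the original is unchanged)

cities[0] = 'Seoul'           # index assignment, in contrast, changes the list itself
print(cities)    # ['Seoul', 'Tokyo']
```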

In [None]:
countries = ["America", "Brazil", "Cambodia", "Dominican Republic", "Ethiopia", "France", "Germany", "Hungary"]

# Type your code below this line.


### Useful functions for lists

We often need to know various properties of the list. Python has built-in functions to give us this information. You have already learnt the `type()` function, which tells us what type of object it is. 

Try the expressions in the cell below:

`numbers = [8, 7, 6, 5, 4, 3, 3, 2, 1]`

1. `len(numbers)`  
   (_`len()` tells you how many elements a list has._)
2. `sum(numbers)`  
   (_`sum()` returns the sum of all elements. Works on integers and floats only._)
3. `min(numbers)`  
   (_`min()` returns the smallest element. Also works on lists of strings._)
4. `max(numbers)`  
   (_`max()` returns the largest element. Also works on lists of strings._)

In [None]:
#Do not remove this line.
numbers = [8, 7, 6, 5, 4, 3, 3, 2, 1]

# Type your code below this line.


### Converting other types to lists

Just as we can convert values to other types using `int()`, `float()`, `bool()`, and `str()`, we can also convert sequences to a list using `list()`.

Try the following lines of code one by one and observe the result:

1. `list('apple')`  
   (_Strings are sequences of characters and can be converted to a list._)
2. `list(1)`  
   (_Integers are not a collection and cannot be converted to a list._) 

In [None]:
# Type your code below this line



## Iterating over lists

Iteration over lists works the same way as iteration over strings: we can use `while` loops and `for` loops.

In [None]:
ordinals = ['first', 'second', 'third', 'fourth', 'fifth']

# Conditional iteration
i = 0
while i < len(ordinals):
    ordinal = ordinals[i]  # look up the element at the current index
    print(f'Ordinal: {ordinal}')
    i = i + 1

# Indexed iteration
for i in range(len(ordinals)):
    ordinal = ordinals[i]
    print(f'Ordinal: {ordinal}')
    
# Direct iteration
for ordinal in ordinals:
    print(f'Ordinal: {ordinal}')

### Exercise 1: Membership check

A common requirement is to check if an item is inside a sequence (`str`, `list`, etc).

Write a Python function, `has(sequence, target)` that takes in a sequence of items, and a `target`, and returns a `bool` representing whether `target` is found in `sequence`.

**Example**

- `has([1, 2, 3, 4, 5], 4)` should return `True`
- `has([1, 2, 3, 4, 5], 6)` should return `False`

Also write a docstring for this function—you will need the practice! When using variables, give them appropriate names for readability: clear code reflects clear thinking.

**Testing:** Call your function with the above two examples to verify that they work correctly. The output should be clearly shown below the cell.

<details>
    <summary><b>Hint</b> (click to open)</summary>
    <p>Iterate over the items in the sequence. For each item, check if it matches the given argument. If a match is found, the result is <code>True</code>. If you have finished iterating over all items and did not find a match, the result is <code>False</code>.</p>
</details>

In [None]:
def has(sequence, target) -> bool:
    """Write an appropriate docstring"""
    # Write your code here
    


The algorithm you just implemented above is called a **linear search**.
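As a rough sketch of the same idea applied to a slightly different task, the function below reports where a match is found instead of whether one exists. The name `index_of` and the choice of `-1` for "not found" are assumptions made for illustration; they are not part of the exercise:

```python
def index_of(sequence, target):
    """Return the index of the first item equal to target, or -1 if there is none.

    Hypothetical helper for illustration; it is not part of the exercise.
    """
    for i in range(len(sequence)):
        if sequence[i] == target:
            return i  # stop at the first match
    return -1  # every item was checked and none matched


print(index_of([1, 2, 3, 4, 5], 4))  # 3
print(index_of([1, 2, 3, 4, 5], 6))  # -1
```

The search is called linear because, in the worst case, it checks every item once, so the work grows in proportion to the length of the sequence.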

### Membership operator: `in`

In Python, you can check for membership in a sequence using the membership operator `in`. For example:

In [None]:
# Run this code cell
print('a' in 'abcde')
print('fifth' in ['first', 'second', 'third', 'fourth', 'fifth'])

### Exercise 2: Filtering lists with `for` loop

Write a Python function, `only_even(numbers)` that takes in a sequence of numbers, and returns a `list` containing only those numbers in `numbers` that are even. **Do not mutate the original list.**

**Example**

- `only_even([1, 2, 3, 4, 5])` should return `[2, 4]`
- `only_even([1, 3, 5])` should return `[]` (empty list)

Also write a docstring for this function—you will need the practice! When using variables, give them appropriate names for readability: clear code reflects clear thinking.

**Testing:** Call your function with the above two examples to verify that they work correctly. The output should be clearly shown below the cell.

<details>
    <summary><b>Hint</b> (click to open)</summary>
    <p>You will need to create a new list to hold the contents to be returned. Iterate over the numbers in the sequence. For each number, check if it is even. If yes, append it into the new list.</p>
</details>

In [None]:
def only_even(numbers: list[int]) -> list[int]:
    """Write an appropriate docstring"""
    # Write your code here
    


### Iterating over two sequences with `for` loop

Suppose I have two lists:

```python
positions = ['first', 'second', 'third', 'fourth', 'fifth']
fruits = ['apple', 'banana', 'cherry', 'durian', 'elderberry']
```

How would I generate the following output?

  ```
  The first value is apple.
  The second value is banana.
  The third value is cherry.
  The fourth value is durian.
  ...
  ```

Can I do that with a `for` loop? Absolutely. But a `for` loop as we have used it so far iterates over only one sequence at a time. Instead, we need to recognise that in the first iteration we want the first element from each list, in the second iteration the second element, and so on.

We need to have a way to generate indexes for each iteration. Python makes it easy to do that with the `range()` function.
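To see which indexes `range()` generates, you can convert its result to a list. This is a minimal sketch; the name `indexes` is chosen for illustration:

```python
fruits = ['apple', 'banana', 'cherry', 'durian', 'elderberry']

# range(n) produces the integers 0, 1, ..., n - 1; list() makes them visible
indexes = list(range(len(fruits)))
print(indexes)  # [0, 1, 2, 3, 4]

# range(start, stop, step) works like the three numbers in a slice
print(list(range(1, 5, 2)))  # [1, 3]
```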

Run the code cell below and observe the output:

In [None]:
positions = ['first', 'second', 'third', 'fourth', 'fifth']
fruits = ['apple', 'banana', 'cherry', 'durian', 'elderberry']

for i in range(len(fruits)):
    ith = positions[i]  # square brackets index into the list
    name = fruits[i]
    print(f'The {ith} value is {name}.')

### Exercise 3: Predict the output

What will the output look like with the following code?

  ```python
  for i in range(1, len(positions), 2):
      ith = positions[i]
      name = fruits[i]
      print(f'The {ith} value is {name}.')
  ```

What will the output look like with the following code? Why?

  ```python
  for i in [0, 1, 2, 3, 4, 5]:
      ith = positions[i]
      name = fruits[i]
      print(f'The {ith} value is {name}.')
  ```

Will the following code raise an error? What will the output look like? Why?

  ```python
  for i in [0, 1, 2, 3, 4, 5]:
      ith = positions
      name = fruits
      print(f'The {ith} value is {name}.')
  ```

What will the output look like with the following code? Why?

  ```python
  for i in range(len(fruits)):
      ith = positions(i)
      name = fruits(i)
      print(f'The {i} value is {name}.')
  ```

What will the output look like with the following code? Why?

  ```python
  for i in range(0, len(positions)):
      print(f'The {positions[i]} value is {fruits[i]}.')
  ```

## Mutability

Notice that we can modify items in lists directly:

```python
fruits = ['apple', 'banana', 'cherry', 'durian', 'elderberry']
del fruits[0]  ## Deletes the first item in the fruits list
```

But we cannot do the same for strings:

```python
fruit = 'elderberry'
del fruit[0]  ## This will raise a TypeError
```

We say that lists are **mutable**, while strings are **immutable**. That means the elements of a string cannot be changed, while the elements of a list can be changed.
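The same contrast applies to assigning to an element by index, as the sketch below shows. The `try`/`except` block is used here only so that the cell keeps running after the error; it may not have been covered yet, and the name `capitalised` is made up for illustration:

```python
fruits = ['apple', 'banana', 'cherry']
fruits[0] = 'avocado'   # allowed: lists are mutable
print(fruits)           # ['avocado', 'banana', 'cherry']

fruit = 'elderberry'
try:
    fruit[0] = 'E'      # not allowed: strings are immutable
except TypeError as error:
    print(f'TypeError: {error}')

# To "change" a string, build a new one instead
capitalised = 'E' + fruit[1:]
print(capitalised)      # Elderberry
```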

In [None]:
# Test your code here


## Tuples: immutable lists

[Tuples](https://en.wikipedia.org/wiki/Tuple) are the "immutable version" of lists.

### Creating a tuple

Creating an empty tuple: `tuple()`  
Creating a tuple of 1 item: `(item,)`. The comma is necessary because `(item)` is interpreted by Python as a bracketed expression; `(item)` is equivalent to `item`.  
Creating a tuple of multiple items: `(item1, item2, item3)`  
Converting a list of items to a tuple: `tuple(list_of_items)`
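The four forms above can be tried out as follows. This is a small sketch with variable names chosen for illustration:

```python
empty = tuple()
single = ('apple',)             # the trailing comma makes this a tuple
not_a_tuple = ('apple')         # without the comma this is just the string 'apple'
several = ('apple', 'banana', 'cherry')
from_list = tuple([1, 2, 3])

print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'str'>
print(len(empty), len(several), from_list)  # 0 3 (1, 2, 3)
```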

### Similarities with list

Tuples also support indexing and slicing; slicing a tuple produces another tuple.
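A brief sketch of indexing and slicing on a tuple, using a made-up tuple `point`:

```python
point = (3, 4, 5)  # hypothetical example tuple

print(point[0])         # 3
print(point[1:])        # (4, 5)
print(type(point[1:]))  # <class 'tuple'>
print(len(point))       # 3
```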

### Dissimilarities

List mutation methods don't work on tuples, because tuples are immutable.

----------

**Your turn**

What happens when you try to mutate a tuple? Try it and see.

----------

### Why do we need to know tuples?

You likely will not need to use them in the course of the H2 Computing syllabus. However, you are likely to have to use Python again in university, especially if you take on a course requiring data analysis. Tuples are a common data type in Python, returned by many functions. Knowing what they are, how they differ from lists, and how to work with them will make programming in Python much less stressful.
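One example of a built-in function that returns a tuple is `divmod()`, which gives the quotient and remainder of a division together. The sketch below is only an illustration; the variable names are chosen here:

```python
result = divmod(17, 5)   # built-in function returning (quotient, remainder)
print(result)            # (3, 2)
print(type(result))      # <class 'tuple'>

# A tuple can be unpacked into separate variables in one assignment
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2
```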

## Python Self-Help: the `dir()` function

The `dir()` function lets you examine a Python object and find out what methods and attributes it has. Attributes will be covered in the lesson on Object-Oriented Programming.

Run the code cell below for an example of how to use the `dir()` function.

In [None]:
string = 'a string'
dir(string)

You will see that strings have many methods associated with them.

Try this on a list or tuple. What methods do lists have? What methods do tuples have?

## Special methods: dunder methods

You'll notice some methods that begin and end with **d**ouble-**under**scores (`__`). These are special methods in Python, called **dunder methods**.

Dunder methods enable you to change the way some Python features work, for example changing what operators do. They are powerful and easy to misuse, but fortunately are not part of the syllabus, and we will skip them in our practical lessons (except for one, which will be required for Object-Oriented Programming).
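For the curious, the sketch below hints at how built-in features relate to dunder methods on a list. Calling dunder methods directly like this is for illustration only; in normal code you would use `len()` and `+`:

```python
numbers = [1, 2, 3]

print(len(numbers))       # 3
print(numbers.__len__())  # 3 (what len() uses behind the scenes)

print(numbers + [4])          # [1, 2, 3, 4]
print(numbers.__add__([4]))   # [1, 2, 3, 4] (what the + operator uses)
```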

# Summary

Research shows that **active recall**, the mental effort of attempting to remember, helps strengthen neuron connections. For each of the questions below, try to recall what you learnt from this lesson before you click to reveal.

<ol>

<li><details>
    <summary>What does it mean if an item is immutable? (click to reveal)</summary>
    <p>We cannot change the contents of an immutable item.</p>
</details></li>
    
<li><details>
    <summary>How do we use the membership operator? (click to reveal)</summary>
    <code>item in items</code>
    <p>This expression evaluates to a <code>bool</code>.</p>
</details></li>
    
</ol>