# Lists Review 

## What are lists and why should we care about them?

A list is one kind of _collection_ data structure.

So far we’ve stored our data in _variables_. 

Variables have one value in them &ndash; when we put a new value in the variable, the old value is overwritten.

In [None]:
# variables are containers that hold data
employee = "Giovanni Luca Ciampaglia"
role = "Assistant Professor"

print("Name:", employee, "| Role:", role)

In [None]:
# variables can be overwritten with new data
employee = "Yogesh Boricha"
role = "Teaching Assistant"

print("Name:", employee, "| Role:", role)

But what if instead of individual pieces of data I need to _simultaneously_ keep track of multiple employees?

In [None]:
# Let's use variables to store all employees in the teaching staff of INST126
employee1 = "Giovanni Luca Ciampaglia"
employee2 = "Yogesh Boricha"
employee3 = "Nishad Kulkarni"

In [None]:
print("Teaching staff:", employee1 + ",", employee2 + ",", employee3 + ".")

However, this becomes pretty cumbersome if we have many variables to keep track of.

Furthermore, it is difficult to handle situations where the number of employees is not fixed. 

In [None]:
# same code as before but now wrapped in a function
def printstaff(emp1, emp2, emp3):
    print("Teaching staff:", emp1 + ",", emp2 + ",", emp3 + ".")

In [None]:
# INST 126 section 2
printstaff(employee1, employee2, employee3)

In [None]:
# Nuclear Physics (this prof does not need TAs ... but I still need to pass empty strings for the function to work!)
printstaff("Dr. Sheldon Cooper", "", "")

---

A list is a kind of **collection** data structure. 

A collection allows us to put many values into a single _&ldquo;variable&rdquo;_.

With a collection we can carry many values around in one convenient package.

In [None]:
friends = ['Joseph', 'Glenn', 'Sally']
carryon = ['socks', 'shirt', 'perfume']
scores = [1, 50, 32]
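Returning to the teaching-staff example, here is a minimal sketch of how a single list could replace the three separate parameters. The function name `print_staff_list` is hypothetical (it is not part of the original notes), and the sketch relies on the string method `join`, which glues the items of a list of strings together.

```python
# Hypothetical list-based version of printstaff (name chosen for illustration)
def print_staff_list(staff):
    # ", ".join(staff) glues the names together with a comma and a space
    line = "Teaching staff: " + ", ".join(staff) + "."
    print(line)
    return line

# Works with three people ...
inst126 = print_staff_list(["Giovanni Luca Ciampaglia", "Yogesh Boricha", "Nishad Kulkarni"])

# ... and with just one: no empty strings needed
physics = print_staff_list(["Dr. Sheldon Cooper"])
```

Because the function receives one list, the number of employees no longer has to be fixed in advance.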

## Why do we need more data structures than strings, numbers and Boolean values?

Recall that computational thinking is a key component of programming skill. 

**Algorithms** --- sets of rules or steps used to solve a problem --- are an important way to model and instruct computers to solve problems. 

In computer science, it is well known that some algorithms need special **data structures** --- particular ways of organizing data in a computer.

Let's look at some examples.

Can you think of a structured set of rules (algorithms) for solving these problems *without* using a collection / list?

- Find the smallest number in some numbers;

- Take a collection and add an item to it;

- Sort / filter some numbers;

__Problem__: find the smallest number amongst five numbers

In [None]:
a = 12
b = 5
c = 7
d = 10
e = 2

In [None]:
# With variables
# Each comparison needs its own `if`: with `elif`, Python stops at the first match
smallest = a
if b < smallest:
    smallest = b
if c < smallest:
    smallest = c
if d < smallest:
    smallest = d
if e < smallest:
    smallest = e
print("The smallest number is", smallest)

In [None]:
# With lists
nums = [
    12,
    5,
    7,
    10,
    2
]

# Sort it from smallest to largest using the sorted() function
nums = sorted(nums)

# Take the first number
print("The smallest number is", nums[0])

__Problem__: filter out odd numbers so we only see the even numbers.

In [None]:
# With variables
a = 1
b = 5
c = 7
d = 10
e = 2

if a % 2 == 0:
    print(a)
if b % 2 == 0:
    print(b)
if c % 2 == 0:
    print(c)
if d % 2 == 0:
    print(d)
if e % 2 == 0:
    print(e)

In [None]:
# With lists
nums = [10, 5, 7, 1, 2, 6, 10, 15, 20, 200]

# Use a for loop
for num in nums:
    if num % 2 == 0:
        print(num)
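If we want to keep the even numbers rather than just print them, one possible variation is to collect them in a second list. This is only a sketch: the variable name `evens` is an assumption made for illustration, and it uses the `append` method that is covered later in these notes.

```python
nums = [10, 5, 7, 1, 2, 6, 10, 15, 20, 200]

evens = []  # start from an empty list
for num in nums:
    if num % 2 == 0:
        evens.append(num)  # keep the even number for later use

print(evens)
```

Doing the same with separate variables would require one new variable per even number, which we cannot know in advance.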

We could, in principle, solve these problems with a separate variable for each item. But that is very clunky and error-prone, and it is basically impossible to generalize. This runs against a core goal of computational thinking: developing **abstractions over classes of problems**.

The point to note here is that your ability to model (and therefore solve) problems with programming is dependent on your knowledge of data structures (since they constrain the set of algorithms you can recognize and apply to problems). So as you expand your knowledge of data structures, try to note down also the common situations in which they apply, and what algorithms they tend to "work well with".

You will learn a few more data structures this semester (dive more into strings this module, then files and dictionaries next module, and dataframes for data analysis in the final module). And of course many more as you advance in your career.

## Anatomy of a list in Python

### How to define a list

A list value (also called a list _literal_):

1. Is surrounded by square brackets;

2. Contains multiple elements separated by commas.

In [None]:
"1, 2, 3" # this is a string literal

In [None]:
1 # this is an int literal

In [None]:
1.0 # float literal

In [None]:
[1, 2, 3] # list literal

### Style: multi-line list definition

If you want to see the list contents more easily, you can also write a list literal over multiple lines and use indentation.

```python
a = [
  1, # position 0 
  2, # position 1
  3  # position 2
]
```

Python knows where the list starts and stops based on the brackets.

In [None]:
basic_list_2 = [
    4,
    5,
    6
]

This style is especially handy when there are &ldquo;large&rdquo; elements:

In [None]:
sentences = [
    "she sells sea shells by the sea shore",
    "she sells sea shells by the sea here",
    "she sells sea shells by the sea there",
]

# Difficult to read 
sentences = ["she sells sea shells by the sea shore", "she sells sea shells by the sea here", "she sells sea shells by the sea there"]

In [None]:
sentences

### Lists and variables

You can store lists into variables, just like any other values, with an assignment statement:

In [None]:
a = [1, 2, 3]
b = [1, "2", 3.0]
print(a)
print(b)

You can even create a list of lists!

In [None]:
c = [a, b]
print(c)

## Some properties of lists

- They can hold **more than one value**
    - What can go in a list?
        - Any Python object: even another list!
        - Mixed objects: the elements don't all have to be the same type of object
    - But you can also have lists with just one item, or empty ones! (Useful for initialization.)

- They are **accessed** positionally, and thus have a notion of position / order
    - This allows you to do things like sorting and finding by position (e.g., "first" or "last")
    - Other data structures, like _dictionaries_ or _sets_, don't have this property
    - NOTE: the index starts at 0, not 1! 
        - So the first item is at index 0, the second at index 1, and so on...
        - Very important to remember this as you work with getting things in and out of lists

- They are **mutable**: 
    - You can change the individual elements in the list directly:
        - Adding a new element
        - Changing an existing element
        - Deleting an existing element

- _Quick aside_: Python also has another data structure, called a _tuple_, that is instead **immutable**:
    - You cannot change the elements stored in a tuple
    - You can only create a new tuple and then assign it back
    - Interestingly, strings are also an _immutable_ data structure
        - Hold this thought to compare / contrast when we discuss strings in a couple of weeks.
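To make the mutable / immutable contrast concrete, here is a small sketch. The variable names are made up for illustration, and the `try` / `except` block is used only so that the cell keeps running after Python refuses to change the tuple (without it, the assignment would stop the cell with a `TypeError`).

```python
point_list = [1, 2, 3]
point_list[0] = 99        # fine: lists are mutable

point_tuple = (1, 2, 3)   # tuples are written with round brackets
try:
    point_tuple[0] = 99   # not allowed: tuples are immutable
    tuple_changed = True
except TypeError as err:
    tuple_changed = False
    print("Could not change the tuple:", err)

print(point_list)   # the list was changed
print(point_tuple)  # the tuple is unchanged
```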

Let's demonstrate these properties by &ldquo;dissecting&rdquo; a few lists together.

### Lists can hold multiple types of data, including other lists

In [None]:
basic_list = [1, 2, 3]
x = [1, "1", 1.0]  # mixed
y = [basic_list, basic_list, basic_list]  # list of lists
z = [] # empty list
print(x)
print(y)
print(z)

### Accessing elements

To access individual elements in a list we specify the position (or _index_) inside square brackets `[]`:

_Remember: indexing starts at 0!_

In [None]:
colors = [
    "red",    # index 0
    "green",  # index 1
    "blue",   # index 2
    "yellow", # index 3
    "white",  # index 4
    "black",  # index 5
]
i = 0
print("The element at position", i, "is", colors[i])

i = 1
print("The element at position", i, "is", colors[i])

# get the 5th item
colors[4]

Here it is in pictures.

<img src="https://terpconnect.umd.edu/~gciampag/INST126/images/positive-indexes.png"/>

You can also index in reverse! Handy for getting the last item in a list.

<img src="https://terpconnect.umd.edu/~gciampag/INST126/images/negative-indexes.png" />

In [None]:
print(colors)

i = -1
print("The element at position", i, "is", colors[i])

i = -2
print("The element at position", i, "is", colors[i])
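One way to see how negative indexes work is to compare them with the equivalent positive ones. The sketch below assumes the same `colors` list as above and uses the built-in `len` function (covered later in these notes) to get the number of items.

```python
colors = ["red", "green", "blue", "yellow", "white", "black"]
n = len(colors)  # number of items in the list

# Index -1 refers to the same element as index n - 1
last_negative = colors[-1]
last_positive = colors[n - 1]
print(last_negative, last_positive)

# Index -2 refers to the same element as index n - 2
second_last_negative = colors[-2]
second_last_positive = colors[n - 2]
print(second_last_negative, second_last_positive)
```

The negative form is handy because it works without knowing the length of the list.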

## Mutating (changing) list elements

To modify an individual element we use an assignment statement, like we would with a regular variable.

However, we also specify the position of the element to modify, again using square brackets `[]`.

In [None]:
numbers = [4, 6, 7, 10, 5]
print(numbers)

In [None]:
numbers[1] = 7  # can mutate the list (i.e., modify it directly)
print(numbers)

## Deleting list elements

We use the `del` statement followed by the list element (again using square brackets to specify the position).

In [None]:
letters = ['a', 'b', 'c', 'd', 'e']
print(letters)

i = 1
print("Removing the element at index", i, "...")
del letters[i]
print(letters)

## Adding list elements

There are multiple ways to do this.

We can use **concatenation** (adding two lists together using the `+` operator)

In [None]:
x = [1, 2, 3]
y = ["a", "b", "c"]
z = x + y
print(z)

Another way is to use list methods, like `append` (adds one element at a time to the end) or `extend` (adds the elements of another list to the end).

In [None]:
x = []
print("Initially x is", x)

elem = 1
x.append(elem)
print("After appending", elem, "now x is", x)

elem = 10
x.append(elem)
print("After appending", elem, "now x is", x)

In [None]:
y = []
print("Initially y is", y)

z = [1, 2]
y.extend(z)
print("After extending with", z, "now y is", y)

z = ["a", "b"]
y.extend(z)
print("After extending with", z, "now y is", y)
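A common point of confusion is what happens when `append` is given a list. The following sketch (with made-up variable names `a` and `b`) contrasts the two methods on the same input.

```python
a = [1, 2]
a.append([3, 4])   # the whole list [3, 4] becomes ONE new element
print(a)           # [1, 2, [3, 4]]

b = [1, 2]
b.extend([3, 4])   # each item of [3, 4] is added separately
print(b)           # [1, 2, 3, 4]
```

So `append` always grows the list by exactly one element, while `extend` grows it by as many elements as the other list contains.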

In [None]:
# Another way that some people use to create lists:
some_list = list()
some_list

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>


## Coding Challenge

_Remember that indices start at 0._

### Task 1
Print the first, second, and third item of the list below, each on a different line.

In [None]:
triplet = [3, 4, 5]

# Your solution here
...

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>

### Task 2

First, run the cell below to create a list of words. Then, in the same cell, print **the last** and **next-next-to-last** (that is, third-to-last) elements in the list of words. Try a few different times with different lists of words to see if your code works no matter the input.

__Hint__ _A helpful trick for getting the last or nth-to-last item is to use negative indexes:_
- `-1` for the last element,
- `-2` for the next-to-last element, etc.

<div class="alert alert-warning"><strong>Attention:</strong> The following cell uses the <tt>input()</tt> function</div>

In [None]:
words = input("Please write at least four words (then press ENTER): ").strip().split(" ")
print("You entered the following words:", words)

# Your solution here
...

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>

### Task 3

Change the second element of the list of words from Task 2 to `"INST126"`, then print the full list.

(Make sure the list has at least three words!)

In [None]:
# Your solution here
...

## Membership test: Check what's in a list

Python provides the `in` operator that lets you check if an item is in a list.

It's a membership operator that evaluates to `True` or `False`, so you can use it in conditional blocks and so on.

In [None]:
inst126staff = ["Giovanni", "Nishad", "Yogesh"]

person = "Giovanni"
print("Is", person, "in the INST126 staff?", person in inst126staff)

person = "Jimmy"
print("Is", person, "in the INST126 staff?", person in inst126staff)
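Since `in` produces `True` or `False`, it can go directly in an `if` statement. Below is a minimal sketch; the helper function `describe` is hypothetical and exists only for this illustration.

```python
inst126staff = ["Giovanni", "Nishad", "Yogesh"]

# Hypothetical helper that builds a message based on a membership test
def describe(person):
    if person in inst126staff:
        return person + " is on the INST126 staff"
    else:
        return person + " is not on the INST126 staff"

print(describe("Giovanni"))
print(describe("Jimmy"))
```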

## More operations with lists

There are a great many other built-in operations in Python that let you do things with lists. 

They fall into two groups: list *methods*, and built-in *functions* that take a list as an argument.

The main difference is whether the list appears as an argument or not.

### List methods
- The list is _not_ passed as an argument &ndash; the method operates on it directly
- They typically operate on the list &ldquo;in-place&rdquo;, meaning they change the list itself and return `None` instead of a new list
- Mechanics: name of variable holding the list, then `.`, then method name, then `()` (with arguments, if any)
- Example: `a.append(1)`, `b.extend([1, 2, 3])`
- Full list here: https://docs.python.org/3/tutorial/datastructures.html

### List functions
- The list to operate on must be passed as an argument
- Usual function calling syntax
- Example: `sorted(a)`
- Can either operate in-place or return a new list (`sorted`, for example, returns a new list)

In [None]:
a_list = [1, 2, 3, 1, 4, 1, 5]
print("Original list:", a_list)

# add something to the end of a list (WE'LL USE THIS A LOT)
a_list.append(4)
print("After appending:", a_list)

# sort the list (WE'LL ALSO USE THIS A LOT)
a_list.sort()
print("After sorting (small to large):", a_list)

# control how you sort
a_list.sort(reverse=True)
print("After sorting (large to small):", a_list)

# count how many times an item is in a list (handy for searching)
elem = 1
print("Element", elem, "appears", a_list.count(elem), "times")

# insert an item at a specific position
elem = 22
idx = 0
a_list.insert(idx, elem)
print("After inserting", elem, "at index", idx, ":", a_list)

In [None]:
a_list = [5, 2, 7, 10, 3]
result = sorted(a_list)
print("Original list:", a_list)
print("Sorted list:", result)

In [None]:
# Want the index, not just True / False
names = ["joe", "harry", "rachel", "kelly"]
print(names.index("joe"))
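Note that `index` raises a `ValueError` when the item is not in the list. One possible way to guard against this is to combine it with the `in` operator, as in the sketch below. The name `"monica"` and the choice of `None` as the "not found" value are assumptions made for illustration.

```python
names = ["joe", "harry", "rachel", "kelly"]

target = "monica"  # hypothetical name that is NOT in the list
if target in names:
    position = names.index(target)
else:
    position = None  # our own convention for "not found"
print(position)

# For a name that IS in the list, index gives its position
print(names.index("rachel"))
```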

## Collection functions

- Built-in functions from Python that operate on lists as arguments
    - `max`, `min`, `sum`
    - `sorted`
    - `all`, `any`
    - Look for the ones that have an "iterable" as a parameter type

- These are functions, so usual function calling syntax
- Full list here: https://docs.python.org/3/library/functions.html

Let's play with a few!

### List length

In [None]:
x = [1, 2, 3, 1, 4, 1, 5, 10, 12]
print("x is ", x)
print("The length of x is ", len(x))

### Sorting

The `sorted` function returns _a new_ list that is a sorted version of its argument.

Use this instead of the `.sort()` method if you want to keep the original list around.

In [None]:
x_sorted = sorted(x) 
print("Sorted version of x", x_sorted)
print("x is still unsorted:", x)

### Minimum, Maximum, Sum, etc.

Useful for basic arithmetic. Note that `sum` works only with numerical values.

In [None]:
print("The smallest value is", min(x))
print("The largest value is", max(x))
print("The average value is", sum(x) / len(x))

The `sum` function expects numbers, so this will raise a `TypeError`:

<div class="alert alert-warning">This cell will generate an error!</div>

In [None]:
list_of_strings = ["a", "b", "c"]
sum(list_of_strings)

However `min` and `max` work with strings too

In [None]:
words = ["abandonware", "abapical", "abaptiston", "abaco", "abacterial", "abactinal", "abaculus"]
print("The first word is", min(words))
print("The last word is", max(words))

In [None]:
rev_words = sorted(words, reverse=True)
print(words)
print(rev_words)

### Listwise logical operations

`all` returns True if all the elements in the list are True &ndash; listwise logical `and` 

`any` returns True if at least one element is True &ndash; listwise logical `or`

In [None]:
all([True, False, True])

In [None]:
any([True, False, True])
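In practice the `True` / `False` values usually come from a test applied to each element. The sketch below builds such a list with a loop; the threshold of 10 and the variable names are assumptions chosen for illustration. It also shows the behavior on an empty list, which can be surprising.

```python
scores = [1, 50, 32]

# Build a list of True / False values, one per score
passed = []
for s in scores:
    passed.append(s >= 10)
print(passed)

all_passed = all(passed)  # were ALL scores at least 10?
any_passed = any(passed)  # was AT LEAST ONE score at least 10?
print(all_passed, any_passed)

# On an empty list, all() is True and any() is False
all_empty = all([])
any_empty = any([])
print(all_empty, any_empty)
```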

# Common errors

## Error: forgetting that indices start at 0

As noted earlier, indices start at 0. A common error, if you forget this, is to get the item at the wrong position!

In [None]:
a = [1, 2, 3]
a[1]  # this is the 2nd item (2), not the 1st!

## Error: position not in the list (IndexError)

Another common error is to try to get something from an index position that doesn't yet exist in a list.

For example, the list `x = [1, 4, 5]` has 3 items (has length 3). 

But! If I want to get the 3rd item with `x[3]`, I will get an IndexError, because the list only has indices that go up to 2!

Sometimes this happens if you forget 0-indexing, and try to get the "3rd item" with index 3 (instead of the correct index 2).

We'll return to this error next week, because it often shows up with iteration.

<div class="alert alert-warning">This cell will generate an error!</div>

In [None]:
x = [1, 4, 5]
x[3]
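One possible way to avoid the error is to compare the index with the length of the list before using it. This is a minimal sketch; the variable `outcome` and the `"out of range"` message are made up for illustration.

```python
x = [1, 4, 5]
i = 3

# Valid (non-negative) indices go from 0 up to len(x) - 1
if 0 <= i < len(x):
    outcome = x[i]
else:
    outcome = "out of range"

print("Index", i, "gives:", outcome)
print("The last valid index is", len(x) - 1)
```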

## Error: Mixing up in-place methods / functions

This happens most with operations that can be done with methods and functions, such as sorting.

Consider this situation, where we use the `sorted` function to get a new, sorted list, which we store in a new variable:

In [None]:
# Correct -- using fruitful function (returns a new list)
x = [1, 7, 4, 2]
xsort = sorted(x) # don't change x, just make a new list that is a sorted version of x
print(x)
print(xsort)

If instead of the function we use the in-place `.sort()` method, `xsort` will have `None` as its value, because `.sort()` sorts `x` itself and does not return a list:

In [None]:
# ERROR -- using in-place method
x = [1, 7, 4, 2]
xsort = x.sort()
print(x)
print(xsort)
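A sketch of the two correct patterns side by side: call `.sort()` on its own line when you want to change the list itself, or use `sorted` when you want a new list. The variable names `y` and `ysort` are made up for illustration.

```python
# Pattern 1: in-place method, no assignment
x = [1, 7, 4, 2]
x.sort()           # x itself is now sorted
print(x)

# Pattern 2: function that returns a new list
y = [1, 7, 4, 2]
ysort = sorted(y)  # y is unchanged, ysort is the sorted copy
print(y)
print(ysort)
```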

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>

---

# Solutions

## Coding Challenge

_Remember that indices start at 0._

### Task 1
Print the first, second, and third item of the list below, each on a different line.

In [None]:
# Your solution here

triplet = [3, 4, 5]

### BEGIN SOLUTION
print(triplet[0])
print(triplet[1])
print(triplet[2])
### END SOLUTION

### Task 2

First, run the cell below to create a list of words. Then, in the same cell, print **the last** and **next-next-to-last** (that is, third-to-last) elements in the list of words. Try a few different times with different lists of words to see if your code works no matter the input.

__Hint__ _A helpful trick for getting the last or nth-to-last item is to use negative indexes:_
- `-1` for the last element,
- `-2` for the next-to-last element, etc.

<div class="alert alert-warning"><strong>Attention:</strong> The following cell uses the <tt>input()</tt> function</div>

In [None]:
words = input("Please write at least four words (then press ENTER): ").strip().split(" ")
print("You entered the following words:", words)

# Your solution here
### BEGIN SOLUTION
print(words[-1]) # get the last item
print(words[-3]) # get the next-next-to-last item
### END SOLUTION

### Task 3 

Change the second element of the list of words from Task 2 to `"INST126"`, then print the full list.

(Make sure the list has at least three words!)

In [None]:
# Your solution here
### BEGIN SOLUTION
words[1] = "INST126"
print(words)
### END SOLUTION