### Storing Collections of Data Using Lists<br>
* Up to this point, we have seen numbers, Boolean values, strings, functions, and a few other types. Once one of these objects has been created, it cannot be modified
* In this lecture, we will learn how to use a Python type named $\rm\color{orange}{list}$
* Lists contain zero or more objects and are used to keep track of collections of data
* Unlike the other types we have learned about, lists $\rm\color{cyan}{can}$ be modified

* The following table shows the number of gray whales counted near the Coal Oil Point Natural Reserve in a two-week period starting on February $24$, $2008$
---
| Days | Number of Whales | Days | Number of Whales |
| :--: | :--: | :--: | :--: |
| 1 | 5 | 8 | 6 |
| 2 | 4 | 9 | 4 |
| 3 | 7 | 10 | 2 |
| 4 | 3 | 11 | 1 |
| 5 | 2 | 12 | 7 |
| 6 | 3 | 13 | 1 |
| 7 | 2 | 14 | 3 |

* Using what we have seen so far, we would have to create fourteen variables to keep track of the number of whales counted each day
  
![Variables](lec07-01.jpg)
  
* To track an entire year’s worth of observations, we would need $365$ variables
* Rather than dealing with this programming nightmare, we can use a $\rm\color{orange}{list}$ to keep track of the $14$ days of whale counts
* That is, we can use a list to keep track of the $14$ int objects that contain the counts

In [None]:
whales = [5, 4, 7, 3, 2, 3, 2, 6, 4, 2, 1, 7, 1, 3]
print(whales)

* A list is an $\rm\color{magenta}{object}$; like any other object, it can be assigned to a variable
* Here is what happens in the memory model
  
![List](lec07-02.jpg)

* The general form of a list expression is as follows
  
        [«expression1», «expression2», ... , «expressionN»]
  
* The empty list is expressed as $\rm\color{magenta}{[]}$
* In our whale count example, variable whales refers to a list with fourteen items, also known as $\rm\color{magenta}{elements}$
* The list itself is an object, but it also contains the memory addresses of fourteen other objects
* The memory model above shows whales after this assignment statement has been executed

* The items in a list are $\rm\color{magenta}{ordered}$, and each item has an $\rm\color{magenta}{index}$ indicating its position in the list
* The first item in a list is at index $\rm\color{magenta}{0}$, the second at index $\rm\color{magenta}{1}$, and so on
* It would be more natural to use $1$ as the first index, as human languages do. Python, however, uses the same convention as languages like C and Java and starts counting at zero
* To refer to a particular list item, we put the index in brackets after a reference to the list (such as the name of a variable)

In [None]:
whales = [5, 4, 7, 3, 2, 3, 2, 6, 4, 2, 1, 7, 1, 3]
print(whales[0])
print(whales[1])
print(whales[12])
print(whales[13])

* We can use only those indices that are in the range from zero up to one less than the length of the list, because the list index starts at $0$, not at $1$
* In a fourteen-item list, the legal indices are $0$, $1$, $2$, and so on, up to $13$
* Trying to use an out-of-range index results in an error

In [None]:
whales = [5, 4, 7, 3, 2, 3, 2, 6, 4, 2, 1, 7, 1, 3]
print(whales[1001])
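
* One way to guard against an out-of-range index is sketched below. The helper name get_count and the choice of None as a fallback value are assumptions made for illustration; they are not part of the lecture

```python
whales = [5, 4, 7, 3, 2, 3, 2, 6, 4, 2, 1, 7, 1, 3]

def get_count(counts, index):
    """Return counts[index], or None if index is out of range (hypothetical helper)."""
    try:
        return counts[index]
    except IndexError:
        # Raised when index lies outside -len(counts) .. len(counts) - 1
        return None

print(len(whales))              # the legal non-negative indices are 0 .. len(whales) - 1
print(get_count(whales, 13))    # last legal index
print(get_count(whales, 1001))  # out of range, so the fallback is returned
```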

* Unlike most programming languages, Python also lets us index backward from the end of a list
* The last item is at index $-1$, the one before it at index $-2$, and so on
* Negative indices provide a way to access the last item, second-to-last item and so on, without having to figure out the size of the list

In [None]:
whales = [5, 4, 7, 3, 2, 3, 2, 6, 4, 2, 1, 7, 1, 3]
print(whales[-1])
print(whales[-2])
print(whales[-14])
print(whales[-15])  # IndexError: the legal negative indices stop at -14
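
* The relationship between negative and non-negative indices can be checked directly. This is a small sketch; the variable names n, last, same_last, and first are chosen here for illustration

```python
whales = [5, 4, 7, 3, 2, 3, 2, 6, 4, 2, 1, 7, 1, 3]
n = len(whales)

last = whales[-1]
same_last = whales[n - 1]   # index -1 refers to the same item as index n - 1
first = whales[-n]          # index -n refers to the same item as index 0

print(last == same_last)
print(first == whales[0])
```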

* Since each item in a list is an object, the items can be $\rm\color{magenta}{assigned}$ to other variables

In [None]:
whales = [5, 4, 7, 3, 2, 3, 2, 6, 4, 2, 1, 7, 1, 3]
third = whales[2]
print('Third day:', third)

* Later in this lecture, we will see that an entire list, such as the one that whales refers to, can also be assigned to other variables, and we will look at the effect this has

### The Empty List<br>
* In previous lectures, we saw the empty string, which does not contain any characters. There is also an empty list
* An empty list is a list with no items in it. As with all lists, an empty list is represented using brackets

In [None]:
whales = []

* Since an empty list has no items, trying to index an empty list results in an error

In [None]:
print(whales[0])
print(whales[-1])

### Lists Are Heterogeneous<br>
* Lists can contain any type of data, including integers, strings, and even other lists
* Here is a list of information about the element krypton, including its name, symbol, melting point, and boiling point (both in degrees Celsius)

In [None]:
krypton = ['Krypton', 'Kr', -157.2, -153.4]
print(krypton[1])
print(krypton[2])

* A list is usually used to contain items of the same kind, like temperatures or dates or grades in a course
* A list can be used to aggregate related information of different kinds, as we did with krypton, but this is prone to error
    * We need to remember which temperature comes first and whether the name or the symbol starts the list
* Another common source of bugs is when we forget to include a piece of data in our list (or perhaps it was missing in our source of information)
    * How, for example, would we keep track of similar information for iridium if we do not know the melting point?
    * What information would we put at index 2?
    * A better, but more advanced, way to do this is to use object-oriented programming

### Modifying Lists<br>
* Suppose we are typing in a list of the noble gases and our fingers slip

In [None]:
nobles = ['helium', 'none', 'argon', 'krypton', 'xenon', 'radon']

* The error here is that we typed 'none' instead of 'neon'
* Here’s the memory model that was created by that assignment statement
  
![Modification01](lec07-03.jpg)
  
* Rather than retyping the whole list, we can assign a new value to a specific element of the list

In [None]:
nobles[1] = 'neon'
print(nobles)

* Here is the result after the assignment to nobles[1]:
  
![Modification02](lec07-04.jpg)
  
* That memory model also shows that list objects are $\rm\color{magenta}{mutable}$. That is, the contents of a list can be mutated

* In the code above, nobles[1] was used on the left side of the assignment operator
* It can also be used on the right side. In general, an expression of the form L[i] (list L at index i) behaves just like a simple variable
* If L[i] is used in an expression (such as on the right of an assignment statement), it means "Get the value referred to by the memory address at index i of list L"
* On the other hand, if L[i] is on the left of an assignment statement (as in nobles[1] = 'neon'), it means "Look up the memory address at index i of list L so it can be overwritten"

* In contrast to lists, numbers and strings are $\rm\color{magenta}{immutable}$
* We cannot, for example, change a letter in a string
* Methods that appear to do that, like upper, actually create new strings
* Because strings are immutable, it is only possible to use an expression of the form s[i] (string s at index i) on the right side of the assignment operator

In [None]:
name = 'Darwin'
capitalized = name.upper()
print(capitalized)
print(name)
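
* The following sketch shows what happens when s[i] is placed on the left side of an assignment for a string, and one way to build a changed string instead. The variable name lowered is illustrative

```python
name = 'Darwin'

try:
    name[0] = 'd'            # strings do not support item assignment
except TypeError as error:
    print('TypeError:', error)

# Build a new string rather than changing the existing one
lowered = 'd' + name[1:]
print(lowered)
print(name)                  # the original string is unchanged
```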

### Operations on Lists<br>
* We have introduced a few of Python's built-in functions
* Some of these, such as len, can be applied to lists, as can others that we have not seen before
---
| Function | Description |
| :--: | :--: |
| len(L) | Returns the number of items in list L |
| max(L) | Returns the maximum value in list L |
| min(L) | Returns the minimum value in list L |
| sum(L) | Returns the sum of the values in list L |
| sorted(L) | Returns a copy of list L where the items are in order from smallest to largest (This does not mutate L) |

* The half-life of a radioactive substance is the time taken for half of it to decay
* After twice this time has gone by, three quarters of the material will have decayed; after three times, seven eighths, and so on
* An isotope is a form of a chemical element. Plutonium has several isotopes, and each has a different half-life
* Here are some of the built-in functions in action working on a list of the half-lives of plutonium isotopes Pu-$238$, Pu-$239$, Pu-$240$, Pu-$241$, and Pu-$242$

In [None]:
half_lives = [87.7, 24100.0, 6563.0, 14, 373300.0]  # in years; Pu-238's half-life is 87.7 years
print(len(half_lives))
print(max(half_lives))
print(min(half_lives))
print(sum(half_lives))
print(sorted(half_lives))
print(half_lives)
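
* The decay arithmetic described above (one half, three quarters, seven eighths) can be sketched as follows. The function name fraction_decayed is an assumption made for this example

```python
def fraction_decayed(num_half_lives):
    """Return the fraction of the material that has decayed after num_half_lives half-lives."""
    # Half of what remains decays during each half-life
    return 1 - 0.5 ** num_half_lives

fractions = [fraction_decayed(1), fraction_decayed(2), fraction_decayed(3)]
print(fractions)
```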

* In addition to built-in functions, some of the operators that we have seen can also be applied to lists
* Like strings, lists can be combined using the concatenation ($+$) operator

In [None]:
original = ['H', 'He', 'Li']
final = original + ['Be']
print(final)

* This code does not mutate either of the original list objects
* Instead, it creates a new list whose entries refer to the items in the original lists
  
![Operation](lec07-05.jpg)

* A list has a type, and Python complains if we use a value of some type in an inappropriate way
* For example, an error occurs when the concatenation operator is applied to a list and a string

In [None]:
print(['H', 'He', 'Li'] + 'Be')
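
* A minimal sketch of the error and one way around it: wrapping the string in a list makes both operands lists. The variable names symbols and combined are illustrative

```python
symbols = ['H', 'He', 'Li']

try:
    combined = symbols + 'Be'    # list + str is not allowed
except TypeError as error:
    print('TypeError:', error)

combined = symbols + ['Be']      # list + list creates a new list
print(combined)
print(symbols)                   # the original list is unchanged
```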

* We can also multiply a list by an integer to get a new list containing the elements from the original list repeated that number of times

In [None]:
metals = ['Fe', 'Ni']
print(metals * 3)

* As with concatenation, the original list is not modified; instead, a new list is created
* One statement that does modify a list is $\rm\color{orange}{del}$, which stands for delete
* It can be used to remove an item from a list

In [None]:
metals = ['Fe', 'Ni']
del metals[0]
print(metals)

### The In Operator on Lists<br>
* The in operator can be applied to lists to check whether an object is in a list

In [None]:
nobles = ['helium', 'neon', 'argon', 'krypton', 'xenon', 'radon']
gas = input('Enter a gas: ') # argon
if gas in nobles:
    print(f'{gas} is noble')

gas = input('Enter a gas: ') # nitrogen
if gas in nobles:
    print(f'{gas} is noble')

* Unlike with strings, when used with lists, the in operator checks only for a single $\rm\color{magenta}{item}$; it does not check for sublists
* This code checks whether the list $[1,\space 2]$ is an item in the list $[0,\space 1,\space 2,\space 3]$

In [None]:
print([1, 2] in [0, 1, 2, 3])
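
* For comparison, the sketch below shows that the same check succeeds when the list $[1,\space 2]$ really is one of the items. The variable names are chosen for illustration

```python
# [1, 2] is not an item of the flat list; its items are the ints 0, 1, 2, 3
is_item_flat = [1, 2] in [0, 1, 2, 3]

# Here the second item of the outer list is itself the list [1, 2]
is_item_nested = [1, 2] in [0, [1, 2], 3]

print(is_item_flat)
print(is_item_nested)
```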

### Slicing Lists<br>
* Geneticists describe the phenotypes of C. elegans (a nematode, a type of microscopic worm) using three-letter short-form markers
* Examples include Emb (embryonic lethality), Him (high incidence of males), Unc (uncoordinated), Dpy (dumpy: short and fat), Sma (small), and Lon (long). We can keep these markers in a list

In [None]:
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Sma']
print(celegans_markers)

* It turns out that Dpy worms and Sma worms are difficult to distinguish from each other, so they are not as easily differentiated in complex strains
* We can produce a new list based on celegans_markers but without Dpy or Sma by taking a slice of the list

In [None]:
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Sma']
useful_markers = celegans_markers[0:4]

* This creates a new list consisting of only the four distinguishable markers, which are the first four items from the list that celegans_markers refers to
  
![Slicing01](lec07-06.jpg)
  
* The first index in the slice is the starting point. The second index is one more than the index of the last item we want to include
    * For example, the last item we wanted to include, Lon, had an index of $3$, so we use $4$ for the second index
* More rigorously, list$[i$:$j]$ is a slice of the original list from index i (inclusive) up to, but not including, index j
* Python uses this convention to be consistent with the rule that the legal indices for a list go from 0 up to one less than the list's length

* The first index can be omitted if we want to slice from the beginning of the list, and the last index can be omitted if we want to slice to the end

In [None]:
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Sma']
print(celegans_markers[:4])
print(celegans_markers[4:])
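
* Two consequences of the "up to, but not including" rule are sketched below: a slice from i to j has j - i items, and the slices [:k] and [k:] together rebuild the original list. The names i, j, middle, and rebuilt are illustrative

```python
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Sma']

i = 1
j = 4
middle = celegans_markers[i:j]   # items at indices 1, 2, and 3
print(middle)
print(len(middle) == j - i)

# Splitting at index 4 and concatenating the pieces gives an equal list
rebuilt = celegans_markers[:4] + celegans_markers[4:]
print(rebuilt == celegans_markers)
```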

* To create a copy of the entire list, omit both indices so that the "slice" runs from the start of the list to its end

In [None]:
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Sma']
celegans_copy = celegans_markers[:]
celegans_markers[5] = 'Lvl'
print(celegans_markers)
print(celegans_copy)

* The list referred to by celegans_copy is a clone of the list referred to by celegans_markers
* The lists have the same items, but the lists themselves are different objects at different memory addresses
  
![Slicing02](lec07-07.jpg)
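
* The difference between "same items" and "same object" can be checked with the == and is operators. This is a brief sketch; the is operator has not been introduced in this lecture, and the names same_items and same_object are illustrative

```python
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Sma']
celegans_copy = celegans_markers[:]

same_items = celegans_copy == celegans_markers   # compares the contents
same_object = celegans_copy is celegans_markers  # compares the memory addresses

print(same_items)
print(same_object)
```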

### Aliasing: What's in a Name?<br>
* An alias is an alternative name for something
* In Python, two variables are said to be aliases when they contain the same memory address
* For example, the following code creates two variables, both of which refer to a single list
  
![Aliasing01](lec07-08.jpg)

In [None]:
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Sma']
celegans_alias = celegans_markers
celegans_markers[5] = 'Lvl'
print(celegans_markers)
print(celegans_alias)

* When we modify the list using one of the variables, references through the other variable show the change as well
* Aliasing is one of the reasons why the notion of mutability is important
    * For example, if x and y refer to the same list, then any changes you make to the list through x will be "seen" by y, and vice versa
* This can lead to all sorts of hard-to-find errors in which a list's value changes as if by magic, even though our program does not appear to assign anything to it
* This cannot happen with immutable values like strings. Since a string cannot be changed after it has been created, it is safe to have aliases for it
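* The contrast between aliasing a list and aliasing a string can be sketched as follows. The variable names markers, alias, and other are assumptions made for this example

```python
markers = ['Emb', 'Him', 'Unc']
alias = markers
markers[0] = 'Lon'       # mutates the single list that both names refer to
print(alias)             # the change is visible through the alias

name = 'Darwin'
other = name
name = name.upper()      # creates a new string; the string 'Darwin' is untouched
print(other)             # the alias still refers to the original string
```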

### Mutable Parameters<br>
* Aliasing occurs when we use list parameters as well, since parameters are variables
* Here is a simple function that takes a list, removes its last item, and returns the list

In [None]:
def remove_last_item(L):
    """ (list) -> list
    
    Return list L with the last item removed.
    
    Precondition: len(L) >= 1

    >>> remove_last_item([1, 3, 2, 4])
    [1, 3, 2]
    """
    
    del L[-1]
    return L

* In the code that follows, a list is created and stored in a variable; then that variable is passed as an argument to $\rm\color{magenta}{remove\_last\_item}$

In [None]:
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Lvl']
remove_last_item(celegans_markers)
print(celegans_markers)

* When the call on function $\rm\color{magenta}{remove\_last\_item}$ is executed, parameter L is assigned the memory address that celegans_markers contains
* That makes celegans_markers and L $\rm\color{magenta}{aliases}$
* When the last item of the list that L refers to is removed, that change is "seen" by celegans_markers as well
* Since $\rm\color{magenta}{remove\_last\_item}$ modifies the list parameter, the modified list does not actually need to be returned. We can remove the return statement

In [None]:
def remove_last_item(L):
    """ (list) -> NoneType
    
    Remove the last item from list L.
    
    Precondition: len(L) >= 1

    >>> markers = [1, 3, 2, 4]
    >>> remove_last_item(markers)
    >>> markers
    [1, 3, 2]
    """
    
    del L[-1]
    
celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Lvl']
remove_last_item(celegans_markers)
print(celegans_markers)
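
* If the caller's list should stay intact, one alternative is to return a new list built with a slice. This is a sketch under that assumption; the function name without_last_item is hypothetical and not part of the lecture

```python
def without_last_item(L):
    """ (list) -> list

    Return a new list containing all items of L except the last one.
    L itself is not modified.

    >>> without_last_item([1, 3, 2, 4])
    [1, 3, 2]
    """

    return L[:-1]   # slicing creates a new list object

celegans_markers = ['Emb', 'Him', 'Unc', 'Lon', 'Dpy', 'Lvl']
shorter = without_last_item(celegans_markers)
print(shorter)
print(celegans_markers)   # still has all six items
```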

### List Methods<br>
* Lists are objects and thus have methods
* The table below gives some of the most commonly used list methods

In [None]:
colors = ['red', 'orange', 'green']
colors.extend(['black', 'blue'])
print(colors)
colors.append('purple')
print(colors)
colors.insert(2, 'yellow')
print(colors)
colors.remove('black')
print(colors)

| Method | Description |
| :--: | :--: |
| L.append(v) | Appends value v to list L |
| L.clear() | Removes all items from list L |
| L.count(v) | Returns the number of occurrences of v in list L |
| L.extend(v) | Appends the items in v to L |
| L.index(v) | Returns the index of the first occurrence of v in L — an error is raised if v does not occur in L |
| L.index(v, beg) | Returns the index of the first occurrence of v at or after index beg in L — an error is raised if v does not occur in that part of L |
| L.index(v, beg, end) | Returns the index of the first occurrence of v between indices beg (inclusive) and end (exclusive) in L; an error is raised if v does not occur in that part of L |
| L.insert(i, v) | Inserts value v at index i in list L, shifting subsequent items to make room |
| L.pop() | Removes and returns the last item of L (which must be nonempty) |
| L.remove(v) | Removes the first occurrence of value v from list L |
| L.reverse() | Reverses the order of the values in list L |
| L.sort() | Sorts the values in list L in ascending order (for strings with the same letter case, it sorts in alphabetical order) |
| L.sort(reverse=True) | Sorts the values in list L in descending order (for strings with the same letter case, it sorts in reverse alphabetical order) |

* All the methods used in the code above (extend, append, insert, and remove) modify the list instead of creating a new list
* The same is true for the methods clear, reverse, sort, and pop
* Of the methods that modify the list, only pop returns a value other than None (pop returns the item that was removed from the list)
* In fact, the only method that returns a list is copy, which is equivalent to L$[:]$
* Finally, a call to append is not the same as using $+$
    * First, append appends a single value, while $+$ expects two lists as operands
    * Second, append modifies the list rather than creating a new one

### Where Did the List Go?<br>
* Programmers occasionally forget that many list methods return $\rm\color{orange}{None}$ rather than creating and returning a new list
* As a result, lists sometimes seem to disappear

In [None]:
colors = 'red orange yellow green blue purple'.split()
print(colors)
sorted_colors = colors.sort()
print(sorted_colors)

* In this example, colors.sort() did two things
    * It sorted the items in the list
    * It returned the value None
* That is why variable sorted_colors refers to None
* Variable colors, on the other hand, refers to the sorted list

In [None]:
print(colors)

* Methods that mutate a collection, such as append and sort, return None
* It is a common error to expect that they will return the resulting list

### Working with a List of Lists<br>
* We said that lists can contain any type of data. That means that they can contain other lists
* A list whose items are lists is called a nested list
  
![NestedList01](lec07-09.jpg)

In [None]:
life = [['Canada', 76.5], ['United States', 75.5], ['Mexico', 72.0]]

* Notice that each item in the outer list is itself a list of two items
* We use the standard indexing notation to access the items in the outer list

In [None]:
life = [['Canada', 76.5], ['United States', 75.5], ['Mexico', 72.0]]
print(life[0])
print(life[1])
print(life[2])

* Since each of these items is also a list, we can index it again, just as we can chain together method calls or nest function calls

In [None]:
life = [['Canada', 76.5], ['United States', 75.5], ['Mexico', 72.0]]
print(life[1])
print(life[1][0])
print(life[1][1])

* We can also assign sublists to variables

In [None]:
life = [['Canada', 76.5], ['United States', 75.5], ['Mexico', 72.0]]
canada = life[0]
print(canada)
print(canada[0])
print(canada[1])

* Assigning a sublist to a variable creates an alias for that sublist
  
![NestedList02](lec07-10.jpg)
  
* As before, any change we make through the sublist reference will be seen when we access the main list, and vice versa

In [None]:
life = [['Canada', 76.5], ['United States', 75.5], ['Mexico', 72.0]]
canada = life[0]
canada[1] = 80.0
print(canada)
print(life)
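
* The "vice versa" direction can be sketched the same way: a change made through the main list is seen through the sublist variable. The value 81.0 is arbitrary and used only for illustration

```python
life = [['Canada', 76.5], ['United States', 75.5], ['Mexico', 72.0]]
canada = life[0]

life[0][1] = 81.0          # change made through the main list
print(canada)              # the alias sees the new value
print(canada is life[0])   # both refer to the same inner list
```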