# Lists

## Basic Idea
The primary variables we have seen so far have all been single values
(regardless of their type),
e.g. a single string, single number, or single boolean.
Very often, it is helpful to do more than just a single computation, but
instead perform a computation across (or on) a large amount of data,
requiring *data structures* - objects specifically designed to store and
organize multiple pieces of data.  

One data structure available in Python
is a *list*: an ordered, mutable collection of objects.  *Mutable* refers
to the fact that the contents of the list can be changed.  

Lists in Python
can contain any type of object including strings, numerical types, booleans,
and other lists (along with other objects we have not yet talked about).
They have no set size (in contrast to many other languages) and it is
possible to add and remove things as desired.  

### List Creation
We've already seen one example of a list when discussing for loops.  Recall
that lists are defined in Python by listing the elements inside `[]`.
For example, we could create a list of strings of color names:

In [None]:
colors = ['orange', 'red', 'green', 'blue', 'yellow']
print(colors)
print(type(colors))

Lists are not limited to strings, however; we could also have a list with
numerical values:

In [None]:
heights = [67, 62.5, 69, 71, 72.5, 75]
print(heights)

Notice that the previous list contains both `float` and `int` values.  We could
also have a list with an even wider range of types.  For example, the following
list has an `int` for its first element, a `str` for its second element, another
list as its 3rd element, and a `bool` as its fourth element.

In [None]:
mixed_list = [10, "lays bbq", [3.99, 4.49, 2.99], True]
print(mixed_list)

Note
that `list()` is a function to convert other types to a list, so you should
not use `list` as a variable name (a very common mistake).

#### Creating and Adding to Lists
The previous lists were all initially created with all of their elements.
It's also possible (and common) to create an empty list and add elements
as desired.  To create an empty list, we simply set the desired variable
name to empty brackets, e.g.,

In [None]:
countries = []

Then, we can add items to the list via the `append()` method.  Note, this is a *method*,
not a function.  Recall that a function is a named block of code.  A *method*, while
similar, is tied to a specific object, so in order to call it we precede the method
name with the name of the variable holding the object, e.g. `variablename.methodname()`.

In [None]:
countries.append('Italy')
countries.append('Mexico')
countries.append('Canada')
countries.append('USA')
countries.append('South Africa')
countries.append('Greece')
countries.append('Spain')
print(countries)

Using the `append()` method to add to a list isn't limited to lists
that were initially created as empty; we can append to any list regardless
of how it was created.

#### List from Range

Often, we wish to create a list populated with a range of numbers.
Recall the `range(start, stop, step=1)` function used
to loop through a preset range of values.  To create a list with the
values in this range, we can typecast the result of range to a list with
the function `list()` (like we could convert from a string to a float).
For instance, suppose we want a list with the numbers 1 through 10:

In [None]:
numbers = list(range(1,11))
print(numbers)
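
The same approach works with the optional `step` argument of `range`. The following is a small sketch with made-up variable names, showing a list that counts up by 5 and one that counts down by 2:

```python
# Every 5th number from 0 up to (and including) 20
by_fives = list(range(0, 21, 5))
print(by_fives)

# A negative step counts down; the stop value (0) is still excluded
countdown = list(range(10, 0, -2))
print(countdown)
```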


### List Operations

#### Checking if Element in List
One common operation we need to perform is to see if an item exists in
a list.  In Python, we do that via the `in` reserved word, e.g.
`val in listname`, which evaluates to `True` or `False`.

In [None]:
print("Spain" in countries)
print("Australia" in countries)

More often than not, we would not choose to `print` the result of
this check, but instead utilize this check as the condition for an `if`
statement to control what occurs when the element is present in the list.
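
As a minimal sketch of that pattern, the following uses `in` as the condition of an `if` statement. The list `visited` and the variable `destination` are made up for illustration.

```python
# Hypothetical list of places already visited (illustrative values only)
visited = ['Italy', 'Mexico', 'Canada']
destination = 'Mexico'

if destination in visited:
    message = destination + " has already been visited"
else:
    message = destination + " is a new destination"
print(message)

# `not in` tests for the absence of an element
print('Peru' not in visited)
```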

#### Concatenation

Another very common operation is *concatenation*, where the elements
of two lists are combined, one after another.  In Python, the `+`
operator performs concatenation when applied to lists.

In [None]:
list1 = [-1,0,1]
list2 = [2,3,4]
list3 = list1 + list2
print(list1)
print(list2)
print(list3)

#### Repetition

In addition to the `+` operator, the `*` operator is also defined
for lists.  When we multiply a list by an integer, the list is
repeated that number of times.  This follows from
the use of `+` as concatenation: multiplying by an integer
amounts to adding (concatenating)
the list to itself that many times.  For example:

In [None]:
repeated = list1*3
print(repeated)
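
One common use of repetition is creating a list of a known length that is pre-filled with a starting value. The sketch below uses made-up names (`tallies`, `pattern`, `doubled`) and also shows that `*` produces a new list rather than changing the original:

```python
# Create a list of five zeros, e.g. to use as counters
tallies = [0] * 5
print(tallies)

# The original list is left unchanged; a new, longer list is produced
pattern = ['a', 'b']
doubled = pattern * 2
print(pattern)
print(doubled)
print(len(doubled) == 2 * len(pattern))
```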

## Accessing List Items

### Indexing

To access individual elements in a list, we use square brackets, `[]`, with the index
number of the desired element inside.  Python lists are indexed starting at 0, so the 1st
element has index 0, the 2nd element has index 1, and so on.  Note that because indexing starts
at 0, the index of the last item is one less than the length of the list.

For example, consider the list of countries above:

In [None]:
print(countries)
print(countries[0])
print(countries[1])

We can also use the index notation to set a specific element (rather than just access it),
which changes the underlying list.

In [None]:
countries[2] = 'Argentina'
print(countries)

In addition to positive indices (which start at 0 with the beginning of the list),
Python also allows you to use negative indices, which work from the back of the list
where `[-1]` is the last item, `[-2]` is the 2nd to last item, etc.  This
is a particularly useful feature of Python that can be very handy depending on what
you are attempting to accomplish.

In [None]:
print(countries[-1])
print(countries[-2])
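
To see how negative and positive indices relate, note that for a list of length `n`, index `-k` refers to the same item as index `n - k`. A short sketch with a made-up list `letters`:

```python
letters = ['a', 'b', 'c', 'd']
n = len(letters)

# Each pair prints the same item twice
print(letters[n - 1], letters[-1])   # last item
print(letters[n - 2], letters[-2])   # 2nd to last item
print(letters[0], letters[-n])       # first item
```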

#### Lists of Lists
This indexing only applies to the outermost list level.  Suppose we have 
a list where each element is itself another list.  Then, this indexing notation
will give another list, which we would then need to index into with another index.
For example,

In [None]:
cities = [["Grand Rapids", "Detroit", "Traverse City"],
          ["Chicago", "Urbana-Champaign", "Peoria"],
          ["San Diego", "San Francisco", "Los Angeles", "Santa Barbara"]]
print(cities[2])
ca = cities[2]
print(ca[1])

In the above, we explicitly stored the inner list before indexing into it,
but that's not necessary.  We can provide multiple indices by using multiple
sets of square brackets, e.g., `[][]`.  For example, the above could have
simply been:

In [None]:
print(cities[2][1])

The first index applies to the outermost list, the 2nd index into the inner list.  Note
in this example that not all inner lists are of the same length.
It's also possible to have more than 2 levels of nested lists,
and there are times where 3 or 4 levels of nested lists are the most natural choice.
Regardless, from left to right the square brackets index from the outermost to the innermost
list.
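
The double-bracket notation can also be used to set an element of an inner list, and a third set of brackets reaches a third level. The following sketch uses hypothetical data (`grid` and `schedule`) purely for illustration:

```python
# Two levels: grid[row][column]
grid = [[1, 2, 3],
        [4, 5, 6]]
grid[1][0] = 40          # change the first element of the second inner list
print(grid)

# Three levels: schedule[week][day][slot]
schedule = [[["math", "art"], ["gym"]],
            [["music"], ["science", "history"]]]
print(schedule[1][1][0])
```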

### Slicing

Often, we wish to do more than access a single list element, and instead
want to extract some sublist of the items in the list.  This is
known as *slicing*.  In Python, slicing creates a copy of the list
with a specified set of the elements from the original list.  We still
use the bracket notation, but instead of using a single index, we give
it multiple indices to denote the start and stop index for the slice,
separated by a colon:
```
listname[start:stop]
```
The slice starts at index `start` (inclusive)
and goes up to, but not including `stop` (exclusive).  In addition
to `start` and `stop`, we can also optionally provide a `stride`
(also separated by a colon):
```
listname[start:stop:stride]
```
which indicates how much the index should increase between
consecutive selected elements.  The default value for `stride` is 1,
which selects every index from `start` up to `stop`.

In [None]:
print(countries[1:4])
print(countries[1:4:2])

All three values (`start`, `stop`, and `stride`) can be negative.
* negative `start` or `stop` -> just a negative index
* negative `stride` -> go in reverse from end to front

In [None]:
print(countries)
print(countries[-2:0:-1])

Additionally, either (or both) of `start` and `stop` can be left empty.  With a positive `stride`:
* empty `start` -> start at the beginning
* empty `stop` -> go until the very end

In [None]:
print("all but first =", countries[1:])
print("all but last =", countries[:-1])
print("reversed =", countries[::-1])
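
Because a slice is a copy, changing the slice does not change the list it came from. The sketch below, using a made-up list `original`, illustrates this, along with the fact that a `stop` past the end of the list simply stops at the end:

```python
original = [10, 20, 30, 40, 50]
middle = original[1:4]

middle[0] = -1
print(middle)     # the slice was changed
print(original)   # the original list was not

# A stop index beyond the end of the list does not cause an error
tail = original[3:100]
print(tail)
```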

## Lists and Loops

We've previously seen one example of how a for loop can be
used to loop through the items in a list:
```
for val in lst:
    # some code using val
```



In [None]:
tv_shows = ["the big bang theory", "the good place", "mandalorian", "great british baking show"]
for show in tv_shows:
    print(show)

If all we need are the values in the list, this
style of loop is a great choice.  However, there
are times where we need the list indices in addition to the
list entries.  For example, we may need to update the element
of the list or need to get a corresponding entry in another list.
We can do this by using a for loop with `range`, combined with
the built-in function `len`.  Recall that the function `len` can be used
to get the length of a string; it can also be used to get the number
of elements in a list.

For instance, suppose we want to edit an existing list of numbers to contain
the squares of the numbers:

In [None]:
nums = [-2, 1, 2, 4]
for i in range(len(nums)):
    nums[i] = nums[i]**2

print(nums)
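
The other case mentioned above, getting the corresponding entry in another list, follows the same pattern. In this sketch the lists `items` and `prices` are made up, and are assumed to have the same length so that index `i` is valid for both:

```python
# Hypothetical parallel lists: item i in one list matches item i in the other
items = ["chips", "salsa", "soda"]
prices = [3.99, 4.49, 1.25]

lines = []
for i in range(len(items)):
    # Use the same index i to look up the matching entry in each list
    lines.append(items[i] + " costs " + str(prices[i]))

for line in lines:
    print(line)
```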

### List Comprehension

We can also utilize `for` loops to create or fill lists.
Suppose we wish to fill a list with the squares of all of the numbers
up to 6.  We could use a for loop to loop through the values $0, 1, \ldots, 6$
and append the square of the loop variable to a list, which is completely valid.

In [None]:
lstA = []
for i in range(7):
    lstA.append(i**2)
print(lstA)

However, there is also a short-hand way of accomplishing this same task known
as *list comprehension*.  It does not fundamentally provide any additional
capability, but it is considered a "pythonic" way of doing things and you'll
likely come across it at some point in code.

In [None]:
lstB = [x**2 for x in range(7)]
print(lstB)

This list comprehension can also be combined with conditionals.  For instance, maybe we want all of the squares that are odd:

In [None]:
lstC = [x**2 for x in range(7) if x%2 == 1]
print(lstC)
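
To make the correspondence explicit, a comprehension with a condition can be read as `[expression for variable in sequence if condition]`, which is equivalent to a loop containing an `if`. A sketch with made-up names:

```python
# Loop version
evens_loop = []
for x in range(10):
    if x % 2 == 0:
        evens_loop.append(x * 10)

# Equivalent list comprehension
evens_comp = [x * 10 for x in range(10) if x % 2 == 0]

print(evens_loop)
print(evens_comp)
print(evens_loop == evens_comp)
```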

## Functions and Methods with Lists

### Functions
We've seen already how one of the built-in functions, `len`, can be
applied to lists.  A few other commonly used built-in functions can
be applied to lists as well.  Note, as these are all functions, the list
is passed in as an argument to the function.

* `len(lstname)` -> returns the number of elements in the list
* `sum(lstname)` -> returns the sum of the elements in the list
* `max(lstname)` -> returns the maximum of the elements in the list
* `min(lstname)` -> returns the minimum of the elements in the list

In [None]:
counts = [30, 25, 10, 14, 6]
print("number of elements =", len(counts))
print("sum =", sum(counts))
print("min =", min(counts))
print("max =", max(counts))

### List Methods

While the functions are general built-in functions that can be applied in
multiple ways (for instance to get the length of a string or to compute the
max of 2 values), there are also operations specific to lists, often an individual
list.  These are *methods* associated with the specific object and are called by 
`lstname.methodname(args)`.  The full list
of methods for python lists is available in the
[documentation](https://docs.python.org/3/tutorial/datastructures.html), but
some of the most commonly used methods are:


#### Adding to Lists

* `append(x)` -> add item `x` to the end of the list
* `insert(i, x)` -> insert item `x` at index `i`

In [None]:
names = ["Matt","Luke", "Julia", "Miles", "Alice", "Paul", "Matt", "Mia"]
names.insert(1, "Michelle")
print(names)

#### Removing from Lists

* `pop(i)` -> remove the item at index `i` in the list and return it (`i` is optional, default
   is to remove and return the last item in the list)
* `remove(x)` -> remove the first item with value equal to `x`.  Raises a `ValueError` if no
  item equal to `x` exists in the list.

In [None]:
removed = names.pop(3)
names.remove("Luke")
print(names)
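
The sketch below, using a made-up list `queue`, shows `pop` with and without an index, that `remove` only deletes the first matching item, and one way to avoid the `ValueError` by checking with `in` first:

```python
queue = ["Ann", "Bo", "Cy", "Bo", "Di"]

last = queue.pop()      # no index: removes and returns the last item
first = queue.pop(0)    # removes and returns the item at index 0
print(last, first, queue)

queue.remove("Bo")      # removes only the first "Bo"
print(queue)

# Checking with `in` first avoids the ValueError from remove()
if "Zed" in queue:
    queue.remove("Zed")
print(queue)
```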

#### Miscellaneous

* `index(x)` -> returns the index of the first item with value equal to `x`.  Raises a `ValueError`
  if no item equal to `x` exists in the list.  Can optionally pass a `start` and `end`
  to specify the subsequence of the list in which to search for `x`.
* `sort()` -> sort the list *in place*.  This means that it does not return a new list, but modifies
  the original list to be sorted.  
* `count(x)` -> return the number of times `x` occurs in the list
* `copy()` -> return a copy of the list

In [None]:
location = names.index("Mia")
print(location)

names.sort()
print(names)

print(names.count("Matt"))

names2 = names.copy()
names.append("Jill")
print(names)
print(names2)
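
Two details from the list above are easy to trip over: `sort()` returns `None` (so its result should not be assigned back to the list), and `index` accepts an optional `start`. A small sketch with a made-up list `scores`:

```python
scores = [70, 95, 82, 95, 60]

result = scores.sort()   # sorts scores itself
print(result)            # None, so avoid writing scores = scores.sort()
print(scores)

first_95 = scores.index(95)
# Search again, starting just after the first match
second_95 = scores.index(95, first_95 + 1)
print(first_95, second_95)
```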

## More About Lists

### References

We previously talked about lists in Python, including
how to create them, how to traverse them, and the methods
lists have available.  Technically everything in Python is
an *object*.  Objects can have both data and methods associated
with them.  When we have objects, the variable name is not
the object itself, but instead a *reference* to the underlying spot
in memory where the computer stores the object.

We can check in Python if 2 variables refer to the same object
with `is`.  This test is far stronger than equality: it tests if
they actually point to the same underlying object.  We can use the
`id()` function to get an identifier for an object, which is unique
among the objects that exist at the same time.  These allow us to see when 2 variable names actually
point to the same underlying object.

In [None]:
a = 1
b = a
print(id(a), id(b), a is b)
b = 2
print(id(a), id(b), a is b)

#### Who Cares?

You are likely wondering to yourself why any of this matters.
Who actually cares if 2 variables reference the same
underlying object or not?  For basic types, such as `float`,
`int`, `bool`, and `str`, you are right to wonder -- there's really
no purpose because the underlying object never changes.
That's what we saw in the previous code example:  originally
both `a` and `b` referred to the same underlying object.  However,
when we set `b = 2`, it actually created a new underlying object
with the value of 2 and then set the variable `b` to refer
to that new underlying object.  This happened because `int` types
are *immutable* -- they cannot be changed.  

Recall, however, that lists are *mutable*, meaning the underlying
object can actually be changed.  Let's consider what this
means by looking at the following code:

In [None]:
lsta = [10,12,14]
lstb = lsta
print(lsta)
print(lstb)
print(id(lsta), id(lstb), lsta is lstb)
lstb[0] = 8
print(lsta)
print(lstb)

The line `lstb = lsta` assigns `lstb` to refer to the
same spot in memory as `lsta`.
Because of this, when we modified the first element of `lstb`,
the first element of `lsta` changed as well.  This occurred
because they both reference the same spot in memory,
and this underlying spot in memory is actually what is changed.

This means that you have to be more careful with lists (and other
mutable types that we haven't yet talked about) because modifying
the object can have wider impacts than you may expect at first.
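
When an independent list is wanted, one option is the `copy()` method described earlier. The sketch below uses made-up names to contrast a second reference with a copy. Note that `copy()` only copies the outer list, so any inner lists would still be shared.

```python
source_list = [10, 12, 14]
alias = source_list             # a second name for the same object
duplicate = source_list.copy()  # a new list with the same contents

alias[0] = 8
print(source_list)   # changed, because alias refers to the same list
print(duplicate)     # unchanged, because it is a separate object
print(alias is source_list, duplicate is source_list)
```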

### References with Functions

In addition to being careful in general with mutable types,
we also need to be very careful when passing mutable objects
as arguments to functions.  Why?  When passing a variable
as an argument to a function in Python, only the reference
is copied, not the underlying memory.  As a result, any edits
made to the list inside the function also modify the list outside
of the function.

To see this effect, consider the following 2 functions:

In [None]:
def foo(val1, val2):
    val1 = val1*2
    val2 = val2*2

c = 3
d = 4
foo(c,d)
print(c, d)

In [None]:
def bar(lst1, lst2):
    lst1[0] = "it changed"
    lst2 = [7,8,9]

alist = [1,2,3]
blist = [4,5,6]
bar(alist, blist)
print(alist)
print(blist)

Notice that even though `val1` and `val2` were modified
inside `foo`, `c` and `d` outside the function
remained unchanged.  However, in the second code block,
when `lst1` inside `bar` was modified, it actually
modified `alist` which was passed in for that parameter.
However, because `lst2` was set to a new list in `bar`,
it just modified the reference (which was copied), not
the underlying list, so `blist` remained unchanged.
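
If a function should not alter the list it is given, one approach is to have it work on a copy and return the result. This is only a sketch; `doubled_copy` is a hypothetical helper, not something defined elsewhere in these notes.

```python
def doubled_copy(values):
    # Hypothetical helper: work on a copy so the caller's list is untouched
    result = values.copy()
    for i in range(len(result)):
        result[i] = result[i] * 2
    return result

data = [1, 2, 3]
new_data = doubled_copy(data)
print(data)       # the original list is unchanged
print(new_data)   # the returned list holds the doubled values
```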

## Special Functions with Lists - Revisiting Random Numbers

We looked previously at how we could use the
[`random` module](https://docs.python.org/3/library/random.html)
to generate "random" (actually pseudorandom) numbers.
The functions we considered previously all generated a single
random number at a time.  To generate multiple random numbers,
we would need to make multiple calls to the function (or
nest the function in a loop that runs the appropriate number
of times).

The `random` module also has some extremely convenient functions
dealing with sequences that allow for the generation of
random orderings or lists of random numbers.

As before, our first step is to import the `random` module.

In [None]:
import random

#### Random Sample - Without Replacement

To select `n` items randomly from a list without replacement,
we can use the `sample(population, n)` function.  This function
returns a list of length `n` containing the samples.

For instance, we could use this to simulate picking 3 marbles
from a bag of marbles.

In [None]:
bag = ['r','r','b','g','o','o']
marbles = random.sample(bag, 3)
print(marbles)

The list above could have held anything (numerical types, strings,
a mix, etc.).  You can also call `sample` with the result of `range()` instead
of with a list if you want samples from a consecutive range of integers.

For example, we can randomly sample without replacement 20 integers from 1 to 100
(inclusive) with:

In [None]:
nums = random.sample(range(1,101), 20)
print(nums)
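
One way to convince yourself that sampling is without replacement is to sample as many items as the population contains: every value must then appear exactly once. The sketch below does this with a made-up variable `picks` and uses the built-in `sorted` function to compare against the full range. Asking for more items than the population holds raises a `ValueError`.

```python
import random

# Sample all 10 values from 1 to 10: the result is a random ordering
picks = random.sample(range(1, 11), 10)
print(picks)

# Every value appears exactly once, so sorting recovers 1 through 10
print(sorted(picks) == list(range(1, 11)))
```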

#### Random Choices with Replacement

The previous example sampled from a list without replacement.  There
is also a function to sample `n` items from a list
with replacement, `choices(population, k=n)`.  As with `sample`,
the first argument can be a list or the result of a range function.
With `choices`, you must explicitly write `k=`; otherwise the second argument
is interpreted as a different optional parameter (`weights`).

For example, we would often use
choices to simulate rolling multiple dice.
We can simulate rolling 8 10-sided dice with:

In [None]:
rolls = random.choices(range(1,11), k=8)
print(rolls)

The `choices` function also has the ability to accept a `weights` argument.
This can be used to specify the relative weight each choice should have.

For instance, suppose we were flipping an unfair coin that lands heads 80%
of the time and tails 20% of the time.  We could simulate 10 flips of this
coin with:

In [None]:
flips = random.choices(['H','T'], weights=[0.8, 0.2], k=10)
print(flips)

The weights in the above could have been integers as well (so we could have
used `[8,2]` to simulate the same process).
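
To get a feel for the effect of the weights, we can simulate many flips and use the `count` method to see what fraction came up heads. This is a sketch with made-up names; the fraction will vary from run to run but should land near 0.8.

```python
import random

# Simulate 1000 flips of the unfair coin using integer weights
many_flips = random.choices(['H', 'T'], weights=[8, 2], k=1000)

heads = many_flips.count('H')
tails = many_flips.count('T')
print("fraction of heads =", heads / len(many_flips))
```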

#### Randomly Permuting Order of Items

The final useful function in the `random` module
that deals with sequences is the `shuffle(seq)`
function, which takes `seq` and randomly shuffles
the order of the elements *in-place* (meaning the
original sequence is changed).

For instance, suppose we have a deck of cards (ignoring
suit). We could create this deck as a list:

In [None]:
deck = ['A', 'K', 'Q', 'J', '10', '9', '8', '7', '6', '5', '4', '3', '2']*4

If we wished to simulate a game, we could virtually shuffle
the deck with `shuffle`:

In [None]:
random.shuffle(deck)
print(deck)

Note that since `shuffle()` works in place, we didn't assign the result of `random.shuffle(deck)`
to a variable.
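
Once the deck is shuffled, slicing gives a simple way to deal cards. The following sketch rebuilds the deck so it can be run on its own; the name `hand` and the choice of 5 cards are made up for illustration.

```python
import random

deck = ['A', 'K', 'Q', 'J', '10', '9', '8', '7', '6', '5', '4', '3', '2'] * 4
random.shuffle(deck)

hand = deck[:5]   # the top five cards
deck = deck[5:]   # the cards remaining in the deck
print(hand)
print(len(deck))
```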