# Lists

While we've learned a lot about Python's *primitive* values,
including `int`, `float`, `str`, and `bool`,
there's more to programming than writing lines of code 
to individually manipulate these indivisible 
(more formally called *atomic*) beads of information.

Rather than writing code to manipulate each item a la carte,
we will instead want to organize our data in various 
structures, sometimes called *collections*,
in part to facilitate acting upon the whole collection subsequently.

We often want to collect a bunch of *related* values together,
so we can manipulate them all together.
Imagine, for example, that you are scraping the web.
Then you might have a list of 10,000 addresses that you need to scrape.
On the other hand, if we were keeping track of public health trends,
then we might want some way to represent a collection of test results,
where each result represents one member of the population.

The most simple and natural collections 
to work with in Python are `list` objects.
They represent a collection of values, in some specified **order**.
The most simple way to work with a list 
is to write one down directly in code as a *list literal*.
We do this by writing a pair of square brackets (`[`, `]`)
with some comma-separated values in between:

In [None]:
my_list = [13, 2, 3, 1, 8, 5, 1]

## Accessing from a list 

Now that we've defined this list, we can access its values in the following ways. 

### Printing lists

In [None]:
print(my_list)

### Indexing into a list

We can access a list's elements by their indices using bracket notation, `my_list[<index>]`:

In [None]:
print(my_list[1])
print(my_list[2])

Notice that in bracket notation, we supply an *index* inside the brackets. 
The index indicates which element of the list we are retrieving.
Each element is associated with an index based on its location, 
i.e., where it appears in the list.

***Note also that all lists start with index 0.***

In [None]:
print(my_list[0])

***Side note:*** For reasons we don't need to get into,
indexing at 0 is a sensible convention for indexing items in arrays.
This owes to historical reasons that are more readily seen
in programming languages like C that are a bit closer to the metal.
There, an array is basically a block of consecutive memory locations.
In C, our reference to an array is a reference to the first item in the array.
Each index then indicates the offset of that item relative to the first item.
An index of 0 means go to the head of the list and then do not advance any further.
An index of 1 means go to the head of the list and then advance one item down the list.

While indexing at 0 is a widespread standard that most languages adopt,
the reasons for it do not matter so much in modern interpreted languages.
Indeed, not all languages follow the convention.
Julia, a wonderful language developed at MIT
specifically for mathematical computing, indexes at 1
to better match how we talk about collections of items in math notation,
and so does MATLAB (a proprietary language that is almost affordable to academics
but costs one kidney per license upon graduation).

Given a list of unknown length, we often need to find out its length
in order to know *which indices are valid*.
We can do that using Python's built-in `len` function,
the same one we used to calculate the length of strings:

In [None]:
len(my_list)

Note that because list indices start at `0`,
the last element has index `len(my_list) - 1`:

In [None]:
print(my_list[len(my_list) - 1])

If we try to access any index larger than that, we'll get an error that looks like this:

```
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-15-bf67fb38c3b1> in <module>()
----> 1 print(my_list[len(my_list)])

IndexError: list index out of range
```

### Negative Indexing

We can also use negative indices, where `my_list[-i]`
is shorthand for `my_list[len(my_list) - i]`.
To access the last element of the list we can use index `[-1]`,
to access the second-to-last element we can use `[-2]`, and so on.
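
As a minimal sketch of the shorthand described above, the cell below compares negative indices with their `len`-based equivalents. The list `scores` is a made-up example, kept separate from `my_list` so that running it does not disturb the rest of the notebook.

```python
# A small example list (hypothetical values, independent of my_list)
scores = [13, 2, 3, 1, 8, 5, 1]

last = scores[-1]            # the last element
second_to_last = scores[-2]  # the element just before the last one

print(last)
print(second_to_last)

# Each negative index refers to the same element as its len-based form
print(scores[-1] == scores[len(scores) - 1])
print(scores[-2] == scores[len(scores) - 2])
```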

### Slicing Lists

Sometimes we want to access a contiguous stretch of values from within our lists.
To indicate that we want to extract a whole range of indices,
we insert a colon into our index notation, as in `my_list[start:end]`:

In [None]:
print(my_list[4:6])

Note that slices are inclusive on the left, but not on the right,
e.g., `[4:6]` gives us a list that includes the elements
with index 4 and index 5, but not index 6.

This is useful because it makes it easy to split lists into two halves:

In [None]:
split_point = 4
print(my_list[:split_point])
print(my_list[split_point:])

Notice above that we used one nice additional Python slicing feature.
When we use `[:end]`, Python will assume
that we start at the beginning of the list,
and thus this is equivalent to `[0:end]`.
Similarly, when we use `[start:]`, Python will assume
that we are going all the way to the end of the list,
and thus this is equivalent to `[start:len(my_list)]`.
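
To make these defaults concrete, here is a small sketch that checks both equivalences directly. The names `values`, `left`, and `right` are hypothetical and chosen only for this illustration.

```python
# Hypothetical example list, independent of my_list
values = [13, 2, 3, 1, 8, 5, 1]
split_point = 4

left = values[:split_point]   # start omitted: begins at index 0
right = values[split_point:]  # end omitted: runs to the end of the list

# The short forms match the fully spelled-out slices
print(left == values[0:split_point])
print(right == values[split_point:len(values)])

# Because the left slice excludes split_point and the right slice includes it,
# the two halves fit back together with nothing missing or repeated
print(left + right == values)
```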

## Lists and Strings

You might notice that lists and strings have some things in common.
Unlike integers or floats, strings and lists can be of various lengths.
Moreover, both support the `len()` function.

The similarities don't end there. 
In many programming languages, text strings
are explicitly defined as arrays of characters.
In Python, while there is no dedicated, separate character type,
the list-like nature of strings remains. 

For starters, we can access indices or slices of strings:

In [None]:
"This is a string, I wrote it myself!"[-18:]

Similarly, just as we can concatenate two strings together, 
we can concatenate two lists together:

In [None]:
[1, 1, 2, 3] + [5, 8, 13, 21]

## Updating elements

Sometimes we'll want to update a single element in a list
while keeping all others the same.
This is easy: using `[index]` notation,
we just assign a new value to that index:

In [None]:
my_list2 = my_list
my_list2[1] = 1985
print("my_list2: ", my_list2)

We might have thought that we were very clever and that, by creating the variable `my_list2`, we avoided messing up the original `my_list`. Let's check to see how successful we were:

In [None]:
print("my_list:  ", my_list)
print("my_list2: ", my_list2)

Not very! See, when we assign `var = some_list`, 
we're not actually making a copy of the list. 
Instead, we are assigning a *reference* to **the same list**! 
To really make a copy, we can use the `copy` module
and call its `deepcopy` function:

In [None]:
import copy
my_list2 = copy.deepcopy(my_list)
my_list[1] = 2
print("my_list:  ", my_list)
print("my_list2: ", my_list2)
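
One way to see the difference between a reference and a copy is Python's `is` operator, which checks whether two names refer to the very same object. The sketch below uses hypothetical names (`original`, `alias`, `independent`) rather than the notebook's own lists.

```python
import copy

original = [13, 2, 3]
alias = original                       # a second name for the same list
independent = copy.deepcopy(original)  # a separate list with equal contents

alias[0] = 99  # this changes the one list that both names share

print(original)     # the change made through alias shows up here
print(independent)  # the copy keeps the old values

print(alias is original)        # True: same underlying list
print(independent is original)  # False: a different list object
```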

## Appending items to a List

Often, we don't want to just update an item in place;
we'll actually want to grow the list to accommodate new data.
The easiest way to do this is with the `.append()` method:

In [None]:
my_list.append(21)
print("my_list", my_list)

## Removing items from a list

There are a few options for how to remove items from lists.
One way is to call the list's `.pop()` method,
which takes an index as input and simultaneously ***removes and returns*** that element.
If we call `.pop()` without providing any index as an argument,
it will default to removing the last element of the list.
Appending and popping correspond to using our list as a *stack*.
Stacks are useful when we want to access data 
in a last in, first out format. 

I tend to access my emails in a stack-like fashion, 
but unfortunately that means that email that gets buried
for over a week may never see the light of day.

In [None]:
print(my_list2)
x = my_list2.pop(1)
print(my_list2)
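
The stack behavior described above can be sketched with a fresh list. The name `inbox` and its contents are made up for illustration; the point is only that `.pop()` with no argument returns the most recently appended item, while `.pop(0)` removes from the front.

```python
# Start with an empty list and push three items onto it
inbox = []
inbox.append("monday email")
inbox.append("tuesday email")
inbox.append("wednesday email")

# pop() with no index removes and returns the last item (last in, first out)
newest = inbox.pop()
print(newest)
print(inbox)

# pop(0) removes and returns the item at index 0 instead
oldest = inbox.pop(0)
print(oldest)
print(inbox)
```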

## Sorting and reversing
Finally, you might want to take a list 
and sort it according to some ordering.
To sort according to whatever ordering is default
(alphabetical for strings, numeric for numbers),
just call a list's `.sort()` method. 

In [None]:
my_list.sort()
my_list

Note that `.sort()` actually changes the order of the items in the list.
After calling `.sort()`, `my_list` is now permanently sorted
and we have lost all information about the original
order in which the items appeared.

If you would only like to temporarily access the items in sorted order
but do not want to disturb the original list, 
you can call the `sorted()` function which will 
return a sorted version of the list while leaving the original untouched.
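
As a short sketch of the difference, the cell below calls `sorted()` on a hypothetical list named `readings` and then confirms that the original order is still intact.

```python
# Hypothetical example list, independent of my_list
readings = [13, 2, 3, 1, 8, 5, 1]

in_order = sorted(readings)  # builds and returns a new, sorted list

print(in_order)  # the sorted copy
print(readings)  # the original, still in its original order
```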

We can also reverse the order of elements in a list by using the `.reverse()` method.

In [None]:
my_list.reverse()
my_list

## Checking membership 

Sometimes we want to check to see if a given value is represented in a list. 
We can do this using the built-in syntax `<element> in <list>`.
This is a Boolean expression that evaluates to `True` if the element
is one of the values contained in the list and `False` otherwise.

In [None]:
print(13 in my_list)
print("rhinoceros" in my_list)

## Other list methods

Lists support a number of other methods. These include `.extend(other_list)`
which works similarly to `.append()` but instead of appending a single element,
it takes a number of elements in another list and adds them all to our list.
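
The difference between the two methods is easiest to see side by side. In this sketch the names `extended` and `appended` are hypothetical; both start from the same values and receive the same argument.

```python
extended = [1, 2]
extended.extend([3, 4])  # adds each element of the other list individually
print(extended)
print(len(extended))

appended = [1, 2]
appended.append([3, 4])  # adds the other list itself as one single element
print(appended)
print(len(appended))
```

If you pass a list to `.append()`, you end up with a list nested inside your list, which is usually not what you want when merging two collections.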

In general, you will not remember every method or attribute of every object,
and even when you do remember one, you may forget precisely what it is called
or precisely how to use it. 

Thus now is a good time to introduce two handy utilities that we can call in Jupyter.
First, we can call `dir()` on any object to access its full listing of attributes and methods.
Below you can run `dir("Hello")` to see the methods associated with a string,
including the familiar `lower`, `upper`, and `title`. 
You can also call `dir([1,2,3])` to see the full set of methods associated with a list.

In [None]:
# dir("Hello")
# dir([1,2,3])