# Lists

This chapter presents one of Python's most useful built-in types, lists.
You will also learn more about objects and what can happen when multiple variables refer to the same object. 



## A list is a sequence

Like a string, a **list** is a sequence of values. In a string, the
values are characters; in a list, they can be any type.
The values in a list are called **elements**.

There are several ways to create a new list; the simplest is to enclose the elements in square brackets (`[` and `]`).
For example, here is a list of two integers. 

In [3]:
numbers = [42, 123]

And here's a list of three strings.

In [2]:
cheeses = ['Cheddar', 'Edam', 'Gouda']

The elements of a list don't have to be the same type.
The following list contains a string, a float, an integer, and even another list.

In [3]:
t = ['spam', 2.0, 5, [10, 20]]

A list within another list is **nested**.

A list that contains no elements is called an empty list; you can create
one with empty brackets, `[]`.

In [4]:
empty = []

The `len` function returns the length of a list (the number of elements).

In [5]:
len(cheeses)

3

The length of an empty list is `0`.

In [6]:
len(empty)

0

## Lists are mutable

To read an element of a list, we can use the bracket operator.
The index of the first element is `0`.

In [7]:
cheeses[0]

'Cheddar'

Unlike strings, lists are mutable. When the bracket operator appears on
the left side of an assignment, it identifies the element of the list
that will be assigned.

In [12]:
numbers[1] = 17
numbers

[42, 17]

The second element of `numbers`, which used to be `123`, is now `17`.

List indices work the same way as string indices:

-   Any integer expression can be used as an index.

-   If you try to read or write an element that does not exist, you get
    an `IndexError`.

-   If an index has a negative value, it counts backward from the end of
    the list.
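
As a quick sketch of these three rules, the following cell redefines `cheeses` so that it is self-contained. The `try` statement, which this chapter has not covered, is used here only to capture the error message without stopping the program; the names `middle`, `last`, and `error_message` are made up for this example.

```python
cheeses = ['Cheddar', 'Edam', 'Gouda']

middle = cheeses[len(cheeses) - 2]   # any integer expression can be an index
last = cheeses[-1]                   # negative indices count backward from the end

try:
    cheeses[3]                       # there is no element with index 3
except IndexError as e:
    error_message = str(e)

print(middle)
print(last)
print(error_message)
```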

The `in` operator works on lists -- it checks whether a given element appears anywhere in the list.

In [13]:
'Edam' in cheeses

True

In [14]:
'Wensleydale' in cheeses

False

Although a list can contain another list, the nested list still counts as a single element -- so in the following list, there are only four elements.

In [15]:
t = ['spam', 2.0, 5, [10, 20]]
len(t)

4

And `10` is not considered to be an element of `t` because it is an element of a nested list, not `t`.

In [16]:
10 in t

False
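
If you do want to look inside the nested list, one option is to select it with the bracket operator first and then use `in`. This is only a small sketch; the name `nested` is made up for the example.

```python
t = ['spam', 2.0, 5, [10, 20]]

nested = t[3]          # the fourth element of t is itself a list
print(10 in nested)    # True
print(t[3][1])         # 20, the second element of the nested list
```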

## List slices

The slice operator works on lists the same way it works on strings.
The following example selects the second and third elements from a list of four letters.

In [17]:
letters = ['a', 'b', 'c', 'd']
letters[1:3]

['b', 'c']

If you omit the first index, the slice starts at the beginning. 

In [18]:
letters[:2]

['a', 'b']

If you omit the second, the slice goes to the end. 

In [19]:
letters[2:]

['c', 'd']

So if you omit both, the slice is a copy of the whole list.

In [20]:
letters[:]

['a', 'b', 'c', 'd']

Another way to copy a list is to use the `list` function.

In [21]:
list(letters)

['a', 'b', 'c', 'd']

Because `list` is the name of a built-in function, you should avoid using it as a variable name.
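
To see that both approaches really produce a separate list, here is a short sketch that modifies one copy and leaves the original alone. The names `copy1` and `copy2` are made up for this example.

```python
letters = ['a', 'b', 'c', 'd']
copy1 = letters[:]       # copy made with a slice
copy2 = list(letters)    # copy made with the list function

copy1[0] = 'z'           # change only the first copy

print(letters)           # ['a', 'b', 'c', 'd']
print(copy1)             # ['z', 'b', 'c', 'd']
print(copy2)             # ['a', 'b', 'c', 'd']
```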


## List operations

The `+` operator concatenates lists.

In [22]:
t1 = [1, 2]
t2 = [3, 4]
t1 + t2

[1, 2, 3, 4]

The `*` operator repeats a list a given number of times.

In [23]:
['spam'] * 4

['spam', 'spam', 'spam', 'spam']

No other mathematical operators work with lists, but the built-in function `sum` adds up the elements.

In [24]:
sum(t1)

3

And `min` and `max` find the smallest and largest elements.

In [25]:
min(t1)

1

In [26]:
max(t2)

4

## List methods

Python provides methods that operate on lists. For example, `append`
adds a new element to the end of a list:

In [27]:
letters.append('e')
letters

['a', 'b', 'c', 'd', 'e']

`extend` takes a list as an argument and appends all of the elements:

In [28]:
letters.extend(['f', 'g'])
letters

['a', 'b', 'c', 'd', 'e', 'f', 'g']
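
The difference between `append` and `extend` is easiest to see when the argument is a list. In this sketch, the names `nested` and `flat` are made up for the example.

```python
nested = ['a', 'b']
nested.append(['c', 'd'])   # adds the whole list as a single element

flat = ['a', 'b']
flat.extend(['c', 'd'])     # adds each element of the argument

print(nested)               # ['a', 'b', ['c', 'd']]
print(flat)                 # ['a', 'b', 'c', 'd']
```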

There are two methods that remove elements from a list.
If you know the index of the element you want, you can use `pop`.

In [29]:
t = ['a', 'b', 'c']
t.pop(1)

'b'

The return value is the element that was removed.
And we can confirm that the list has been modified.

In [30]:
t

['a', 'c']

If you know the element you want to remove (but not the index), you can use `remove`:

In [31]:
t = ['a', 'b', 'c']
t.remove('b')

The return value from `remove` is `None`.
But we can confirm that the list has been modified.

In [32]:
t

['a', 'c']

If the element you ask for is not in the list, `remove` raises a `ValueError`.

In [33]:

t.remove('d')

ValueError: list.remove(x): x not in list
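
One way to avoid this error, sketched below, is to check with the `in` operator before calling `remove`. The name `target` is made up for the example.

```python
t = ['a', 'c']
target = 'd'

if target in t:
    t.remove(target)
else:
    print(target, 'is not in the list')

print(t)   # the list is unchanged: ['a', 'c']
```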

## Lists and strings

A string is a sequence of characters and a list is a sequence of values,
but a list of characters is not the same as a string. 
To convert from a string to a list of characters, you can use the `list` function.

In [34]:
s = 'spam'
t = list(s)
t

['s', 'p', 'a', 'm']

The `list` function breaks a string into individual letters.
If you want to break a string into words, you can use the `split` method:

In [35]:
s = 'pining for the fjords'
t = s.split()
t

['pining', 'for', 'the', 'fjords']

An optional argument called a **delimiter** specifies which characters to use as word boundaries. The following example uses a hyphen as a delimiter.

In [88]:
s = 'ex-parrot'
t = s.split('-')
t

['ex', 'parrot']

If you have a list of strings, you can concatenate them into a single string using `join`.
`join` is a string method, so you have to invoke it on the delimiter and pass the list as an argument.

In [37]:
delimiter = ' '
t = ['pining', 'for', 'the', 'fjords']
s = delimiter.join(t)
s

'pining for the fjords'

In this case the delimiter is a space character, so `join` puts a space
between words.
To join strings without spaces, you can use the empty string, `''`, as a delimiter.
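
For instance, here is a short sketch that joins the same list of letters with two different delimiters. The names `word` and `dashed` are made up for the example.

```python
t = ['s', 'p', 'a', 'm']

word = ''.join(t)      # empty string: nothing between the elements
dashed = '-'.join(t)   # hyphen between the elements

print(word)            # spam
print(dashed)          # s-p-a-m
```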

## Looping through a list

You can use a `for` statement to loop through the elements of a list.

In [38]:
for cheese in cheeses:
    print(cheese)

Cheddar
Edam
Gouda


For example, after using `split` to make a list of words, we can use `for` to loop through them.

In [39]:
s = 'pining for the fjords'

for word in s.split():
    print(word)

pining
for
the
fjords


A `for` loop over an empty list never runs the indented statements.

In [40]:
for x in []:
    print('This never happens.')

## Sorting lists

Python provides a built-in function called `sorted` that sorts the elements of a list.

In [41]:
scramble = ['c', 'a', 'b']
sorted(scramble)

['a', 'b', 'c']

The original list is unchanged.

In [42]:
scramble

['c', 'a', 'b']

`sorted` works with any kind of sequence, not just lists. So we can sort the letters in a string like this.

In [43]:
sorted('letters')

['e', 'e', 'l', 'r', 's', 't', 't']

The result is a list.
To convert the list to a string, we can use `join`.

In [44]:
''.join(sorted('letters'))

'eelrstt'

With an empty string as the delimiter, the elements of the list are joined with nothing between them.

## Objects and values

If we run these assignment statements:

In [45]:
a = 'banana'
b = 'banana'

We know that `a` and `b` both refer to a string, but we don't know whether they refer to the *same* string. 
To find out, we can use the `is` operator, which checks whether two variables refer to the same object.

In [8]:
a = 'banana'
b = 'banana'
a is b

True

In this example, Python only created one string object, and both `a`
and `b` refer to it.
But when you create two lists, you get two objects.

In [9]:
a = [1, 2, 3]
b = [1, 2, 3]
a is b

False

In this case we would say that the two lists are **equivalent**, because they have the same elements, but not **identical**, because they are not the same object. 
If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
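
The two ideas correspond to two different operators: `==` checks whether two objects are equivalent, and `is` checks whether they are identical. A minimal sketch:

```python
a = [1, 2, 3]
b = [1, 2, 3]

print(a == b)   # True: same elements, so the lists are equivalent
print(a is b)   # False: two separate objects, so they are not identical
```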

## Aliasing

If `a` refers to an object and you assign `b = a`, then both variables refer to the same object.

In [10]:
a = [1, 2, 3]
b = a
b is a

True

The association of a variable with an object is called a **reference**.
In this example, there are two references to the same object.

An object with more than one reference has more than one name, so we say the object is **aliased**.
If the aliased object is mutable, changes made with one name affect the other.
In this example, if we change the object `b` refers to, we are also changing the object `a` refers to.

In [55]:
b[0] = 5
a

[5, 2, 3]

So we would say that `a` "sees" this change.
Although this behavior can be useful, it is error-prone.
In general, it is safer to avoid aliasing when you are working with mutable objects.
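
One way to avoid aliasing, at least for a list of simple values like this one, is to assign a copy instead of the list itself. The following sketch uses the slice operator to make the copy.

```python
a = [1, 2, 3]
b = a[:]      # b refers to a copy of the list, not to the same object

b[0] = 5      # changes only the copy

print(a)      # [1, 2, 3]
print(b)      # [5, 2, 3]
```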

For immutable objects like strings, aliasing is not as much of a problem.
In this example:

In [56]:
a = 'banana'
b = 'banana'

It almost never makes a difference whether `a` and `b` refer to the same
string or not.

## List arguments

When you pass a list to a function, the function gets a reference to the
list. If the function modifies the list, the caller sees the change. For
example, `pop_first` uses the list method `pop` to remove the first element from a list.

In [57]:
def pop_first(lst):
    return lst.pop(0)

We can use it like this.

In [58]:
letters = ['a', 'b', 'c']
pop_first(letters)

'a'

The return value is the first element, which has been removed from the list -- as we can see by displaying the modified list.

In [59]:
letters

['b', 'c']

Passing a reference to an object as an argument to a function creates a form of aliasing.
If the function modifies the object, those changes persist after the function is done.
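
If you would rather leave the caller's list alone, a function can build and return a new list instead. The function `rest` below is a hypothetical helper written for this sketch; it uses a slice, which creates a new list.

```python
def rest(lst):
    """Return a new list with every element except the first."""
    return lst[1:]

letters = ['a', 'b', 'c']
tail = rest(letters)

print(tail)      # ['b', 'c']
print(letters)   # ['a', 'b', 'c'], the original is unchanged
```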

## Making a list of records

In the previous chapter, we read the file `all_day.csv` and did a few things with the data. We can download the same data again and read through the file, this time storing each line as an element of a list. Notice that the code below only downloads the file if it is not already in the current directory, so unless you delete the previously downloaded file, it won't be downloaded again.

In [12]:
import os
import urllib.request  # Built-in Python library

filename = 'all_day.csv'
url = 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.csv'

if not os.path.exists(filename):
    print("Downloading...")
    # Download the file with Python's standard library (similar to what wget does)
    urllib.request.urlretrieve(url, filename)
    print("Done!")
else:
    print("It is already downloaded in this directory so just use that one.")

It is already downloaded in this directory so just use that one.


In [34]:
eq_list = []

for line in open('all_day.csv', encoding='utf-8'):
    eqrecord = line.strip()
    eq_list.append(eqrecord)
    
len(eq_list)

255

Before the loop, `eq_list` is initialized with an empty list.
Each time through the loop, the `append` method adds one line of the file (one record) to the end.

Another way to do something similar is to use `read` to read the entire file into a string.

In [21]:
string = open('all_day.csv', encoding='utf-8').read()
len(string)

52061

The result is a single string with lots of characters.
We can use the `split` method to split it into a list of lines.

In [22]:
eq_list = string.strip().split("\n")
len(eq_list)

255
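
The call to `strip` matters here. The following sketch uses a short made-up string in place of the file contents (the values in `sample` are hypothetical) to show what happens with and without it.

```python
# Hypothetical three-line sample standing in for the contents of the file
sample = 'time,latitude\nt1,60.333\nt2,31.632\n'

without_strip = sample.split('\n')
with_strip = sample.strip().split('\n')

print(without_strip)   # the trailing newline leaves an empty string at the end
print(with_strip)      # three lines, no empty string
```

Without `strip`, the newline at the end of the string produces an extra empty element, which would later look like an empty record.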

Now we can look at any record (row) by index.

In [23]:
print(eq_list[3])

2026-02-13T12:47:09.099Z,60.333,-141.167,17.8,1.5,ml,11,132,0.1,0.4,ak,ak2026dbxqww,2026-02-13T12:48:46.222Z,"118 km NW of Yakutat, Alaska",earthquake,6.3,2.9285,0.3,4,automatic,ak,ak

Since each row is really a comma-separated record of one earthquake, we can create a list of all the attributes of any earthquake like this:

In [28]:
an_eq_info = eq_list[3].split(",")
print(an_eq_info)

['2026-02-13T12:47:09.099Z', '60.333', '-141.167', '17.8', '1.5', 'ml', '11', '132', '0.1', '0.4', 'ak', 'ak2026dbxqww', '2026-02-13T12:48:46.222Z', '"118 km NW of Yakutat', ' Alaska"', 'earthquake', '6.3', '2.9285', '0.3', '4', 'automatic', 'ak', 'ak']


Then we can access individual values using the bracket operator. For example, if we wanted to get just the description (the element at index 13), we could do that like this:

In [29]:
an_eq_info[13]

'"118 km NW of Yakutat'

Do you spot an issue with this? If you look, the city and the state in this record have been split into two different elements of the list because the delimiter `,` appeared within the quoted text string. There are a number of ways to deal with this, which we will talk about next week. For now we will just ignore it.
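
As a preview only, here is a sketch of one possible approach: Python's standard library includes a `csv` module whose `reader` function understands quoted fields. The record below is copied from the output above; the names `record` and `fields` are made up for this example.

```python
import csv

record = ('2026-02-13T12:47:09.099Z,60.333,-141.167,17.8,1.5,ml,11,132,0.1,0.4,'
          'ak,ak2026dbxqww,2026-02-13T12:48:46.222Z,'
          '"118 km NW of Yakutat, Alaska",earthquake,6.3,2.9285,0.3,4,automatic,ak,ak')

# csv.reader accepts any iterable of lines, so a one-element list works
fields = next(csv.reader([record]))

print(len(record.split(',')))   # 23: the quoted comma causes an extra split
print(len(fields))              # 22: the quoted text stays in one piece
print(fields[13])               # 118 km NW of Yakutat, Alaska
```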

## Debugging

Note that most list methods modify the list they are called on and return `None`.
This is the opposite of the string methods, which return a new string and leave the original alone.

If you are used to writing string code like this:

In [30]:
word = 'plumage!'
word = word.strip('!')
word

'plumage'

It is tempting to write list code like this:

In [31]:
t = [1, 2, 3]
t = t.remove(3)           # WRONG!

`remove` modifies the list and returns `None`, so the next operation you perform with `t` is likely to fail.

In [32]:

t.remove(2)

AttributeError: 'NoneType' object has no attribute 'remove'

This error message takes some explaining.
An **attribute** of an object is a variable or method associated with it.
In this case, the value of `t` is `None`, which is a `NoneType` object. A `NoneType` object does not have an attribute named `remove`, so the result is an `AttributeError`.

If you see an error message like this, you should look backward through the program and see if you might have called a list method incorrectly.
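
For comparison, here is a sketch of the working pattern: call the method for its effect on the list and do not assign the result back to the variable. The name `result` is made up to show what `remove` returns.

```python
t = [1, 2, 3]

result = t.remove(3)   # modifies t in place
print(result)          # None
print(t)               # [1, 2]

t.remove(2)            # t still refers to the list, so this works
print(t)               # [1]
```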

## Glossary

**list:**
 An object that contains a sequence of values.

**element:**
 One of the values in a list or other sequence.

**nested list:**
A list that is an element of another list.

**delimiter:**
 A character or string used to indicate where a string should be split.

**equivalent:**
 Having the same value.

**identical:**
 Being the same object (which implies equivalence).

**reference:**
 The association between a variable and its value.

**aliased:**
If there is more than one variable that refers to an object, the object is aliased.

**attribute:**
 One of the named values associated with an object.

## Exercises



In [33]:
# This cell tells Jupyter to provide detailed debugging information
# when a runtime error occurs. Run it before working on the exercises.

%xmode Verbose

Exception reporting mode: Verbose


### Exercise

Two words are anagrams if you can rearrange the letters from one to spell the other.
For example, `tops` is an anagram of `stop`.

One way to check whether two words are anagrams is to sort the letters in both words.
If the lists of sorted letters are the same, the words are anagrams.

Write a function called `is_anagram` that takes two strings and returns `True` if they are anagrams.

To get you started, here's an outline of the function.

In [71]:
def is_anagram(word1, word2):
    """Checks whether two words are anagrams.
    
    >>> is_anagram('tops', 'stop')
    True
    >>> is_anagram('skate', 'takes')
    True
    >>> is_anagram('tops', 'takes')
    False
    >>> is_anagram('skate', 'stop')
    False
    """
    return None

### Exercise

Python provides a built-in function called `reversed` that takes as an argument a sequence of elements -- like a list or string -- and returns a `reversed` object that contains the elements in reverse order.

In [75]:
reversed('parrot')

<reversed at 0x7fe3de636b60>

If you want the reversed elements in a list, you can use the `list` function.

In [76]:
list(reversed('parrot'))

['t', 'o', 'r', 'r', 'a', 'p']

Or if you want them in a string, you can use the `join` method.

In [77]:
''.join(reversed('parrot'))

'torrap'

So we can write a function that reverses a word like this.

In [78]:
def reverse_word(word):
    return ''.join(reversed(word))

A palindrome is a word that is spelled the same backward and forward, like "noon" and "rotator".
Write a function called `is_palindrome` that takes a string argument and returns `True` if it is a palindrome and `False` otherwise.

Here's an outline of the function with doctests you can use to check your function.

In [79]:
def is_palindrome(word):
    """Check if a word is a palindrome.
    
    >>> is_palindrome('bob')
    True
    >>> is_palindrome('alice')
    False
    >>> is_palindrome('a')
    True
    >>> is_palindrome('')
    True
    """
    return False

Write a function called `reverse_sentence` that takes as an argument a string that contains any number of words separated by spaces. It should return a new string that contains the same words in reverse order. For example, if the argument is “Reverse this sentence”, the result should be “Sentence this reverse”.

Hint: You can use the `capitalize` method to capitalize the first word and convert the other words to lowercase.

To get you started, here's an outline of the function with doctests.

In [83]:
def reverse_sentence(input_string):
    '''Reverse the words in a string and capitalize the first.
    
    >>> reverse_sentence('Reverse this sentence')
    'Sentence this reverse'

    >>> reverse_sentence('Python')
    'Python'

    >>> reverse_sentence('')
    ''

    >>> reverse_sentence('One for all and all for one')
    'One for all and all for one'
    '''
    return None

### Exercise

Write a script that calculates averages for the last day of earthquakes. It should read the earthquake file using only what you have learned up to this point. First, read in the lines of the file as we did above (the code is also pasted below to get you started). Then slice the list so that it does not include the first line (the header). Next, loop through the list and calculate the average (mean) of the latitude (index 1), longitude (index 2), depth (index 3), and magnitude (index 4) for all the earthquakes in the file. Finally, report these values with a print statement, using a formatted string so that the reader understands which number is which.

In [39]:
eq_list = []
# set up initial values


for line in open('all_day.csv', encoding='utf-8'):
    eqrecord = line.strip()
    eq_list.append(eqrecord.split(','))

# slice the list so that it no longer includes the first element (the header), or remove it a different way

for eqrec in eq_list:
    # get all necessary values: lat [1], lon [2], depth [3] and mag [4]
    #example of getting current latitude value
    lat = eqrec[1]

    #print for debugging, remove this in final version
    print(lat)
    
    #calculate incremental totals for each value
    
#calculate means (sum of var / count)

#report means for lat, lon, depth and mag

latitude
31.632
31.611
60.333
61.725
59.902
19.050500869751
60.059
4.8023
59.831
34.031
61.564
62.987
37.594165802002
37.596000671387
27.4176
38.842498779297
38.809501647949
31.702
32.3955
32.391333333333
38.783332824707
64.31
33.246833333333
31.504
27.5427
33.029666666667
18.133
40.276668548584
31.903
63.267
60.099
61.839
33.882833333333
32.105
32.099
19.118333816528
33.181333333333
32.099
33.1805
33.179
35.63305283
59.912
19.045499801636
33.179666666667
38.831165313721
31.817
33.181833333333
33.1815
33.181333333333
33.178166666667
33.566333333333
35.661833333333
31.47
31.904
33.781166666667
17.958666666667
19.291666030884
38.825668334961
19.070999145508
28.3204
33.781166666667
38.826667785645
33.779666666667
38.786167144775
32.099
33.7785
15.8333
18.404
60.119
40.4421
63.478
20.012666702271
62.097
33.985166666667
38.824001312256
62.813
38.824001312256
33.8845
33.985
37.7974
44.0666
38.820499420166
33.058333333333
40.275833129883
40.143165588379
61.226
35.63267899
33.1145
61.268
61.69