# Lists

We've learned how to assign names to variables and store things in them, like numbers or strings. But sometimes it's advantageous to use the same name to refer to a whole group of numbers or strings. For example, let's say you want to do calculations with a set of student test scores. Here's one way you could store those as variables:

In [1]:
score1 = 89
score2 = 77
score3 = 100
score4 = 95
score5 = 88

This is only 5 test scores, and it took a lot of work to store them in variables. Can you imagine what it would be like with a class of 30 students? And what if you wanted to calculate the average of these scores? Typing out every variable name would take longer than not using variables at all. Fortunately, there is a way around this that is also very useful. We can use a single list to store multiple numbers under the same variable name, like this:

In [2]:
scores = [89,77,100,95,88]

Now, when we want to look at the scores, we can print all of them at once!

In [3]:
print scores

[89, 77, 100, 95, 88]


You can put any kind of value in a list, such as strings or Booleans:

In [4]:
names = ["Billy","Matthew","Shannon","Kristen","Taylor"]
print names

['Billy', 'Matthew', 'Shannon', 'Kristen', 'Taylor']


In [5]:
key = [True, False, False, True, False]
print key

[True, False, False, True, False]



Lists wouldn't be very useful if we couldn't access the individual components. To do this, we write the position of the element we want inside square brackets [ ] after the name of the list.

In [6]:
print scores[0]
print scores[1]
print scores[2]
print scores[3]
print scores[4]

89
77
100
95
88


Once again, note that Python starts counting at 0! The number used to access an element of a list (i.e., the number inside the square brackets) is called an index. So the first element of the list has index 0. What happens if we try to access scores with the index 5?

In [7]:
print scores[5]

IndexError: list index out of range

The answer is: we get an error. This is because we're trying to access an element of a list when that element doesn't exist.

While we can't access scores[5], we *can* access scores[-1]. When Python encounters a negative index, it counts backward from the *end* of the list and returns the value it finds there. So the last element in the list can be accessed with an index of -1.

In [8]:
scores[-1]

88

Note that this only works up until you reach the beginning of the list, so the most negative index that won't return an error is -(length of list). In our example, scores[-5] would return 89, the first entry of the list. Trying scores[-6] would give an error.
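
The limits described above can be checked with a short sketch. It uses a separate list, `demo_scores` (a name introduced here so that `scores` is left untouched), and it catches the out-of-range error with `try`/`except`, which is not covered in this notebook. The parenthesized form of `print` is used so the cell runs under both Python 2 and Python 3.

```python
# demo_scores is a hypothetical copy of the original scores
demo_scores = [89, 77, 100, 95, 88]

first_from_end = demo_scores[-5]  # same element as demo_scores[0]
last_from_end = demo_scores[-1]   # same element as demo_scores[4]
print(first_from_end)             # 89
print(last_from_end)              # 88

# One step past the beginning of the list is out of range
try:
    demo_scores[-6]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)               # True
```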

When we access an element of a list, it behaves just like the type that is stored in the list. For example, we can add two elements of a list (when that addition makes sense).

In [9]:
print scores[0]+scores[1]

166


In [10]:
print names[0],names[1]

Billy Matthew


And we can make changes to a single element of a list. Let's say we made an error and need to change the third student's score to 79. We can do that like this:

In [11]:
scores[2] = 79 #An index of 2 will give us the 3rd element of the list
print scores[2]

79


Though it's less common, the elements of a list don't even have to be the same type! It's possible for us to do something like this:

In [12]:
randomlist = ["h",67,True,9,"masonry",False,True]
print randomlist

['h', 67, True, 9, 'masonry', False, True]
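
With a mixed list, whether an operation "makes sense" depends on the types of the particular elements involved. The sketch below is only an illustration: `mixed` is a name introduced here, and `try`/`except` (not covered in this notebook) is used to catch the error so the cell keeps running.

```python
# mixed is a hypothetical list holding values of different types
mixed = ["h", 67, True, 9]

total = mixed[1] + mixed[3]   # two integers: ordinary addition
print(total)                  # 76

try:
    mixed[0] + mixed[1]       # a string plus an integer is not defined
    addition_failed = False
except TypeError:
    addition_failed = True
print(addition_failed)        # True
```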


## Some functions we can call on lists

There are some useful functions we can call on lists to help us manipulate them. A function in Python is already-written code with a specific purpose. It can be used by invoking the name of the function in your code, and parentheses are used to give the function input.

The len function (which stands for length) tells us how many items are in a list.

In [13]:
len(scores)

5

This is useful if you want to use a while loop to perform the same action for a bunch of items in a list.

Say we decided to add a curve of 5 points to the score of every student. We can do so like this:

In [14]:
print scores

[89, 77, 79, 95, 88]


In [15]:
i = 0
while i < len(scores):
    scores[i] = scores[i] + 5
    i += 1

In [16]:
print scores

[94, 82, 84, 100, 93]


A keyword is a word that Python has reserved for a special meaning and purpose. You've seen keywords before, like "print," "and," "if," and so on. All keywords in an IPython notebook are colored green. We can use the "in" keyword to find out whether a particular item is a member of a list.

In [17]:
82 in scores

True

In [18]:
56 in scores

False
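
Because `in` evaluates to True or False, it can be used directly as the condition of an `if` statement. A minimal sketch, assuming a separate list `demo_scores` and a made-up value `target` (both names are introduced here for illustration):

```python
# demo_scores and target are hypothetical names for this example
demo_scores = [94, 82, 84, 100, 93]
target = 100

if target in demo_scores:
    message = "At least one student scored " + str(target)
else:
    message = "Nobody scored " + str(target)
print(message)   # At least one student scored 100
```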

We can use the append function to add another student's score to the list.

In [19]:
print scores
scores.append(85) #This is a special function that uses a period to act on something. Note the syntax.
print scores

[94, 82, 84, 100, 93]
[94, 82, 84, 100, 93, 85]


The append function is very useful for building up a list from scratch (e.g. inside a loop). One very useful technique is starting an empty list and appending to it later.

In [20]:
mylist = []

The code below will now put the first 8 even numbers (starting from 0) into mylist.

In [21]:
i = 0
while i < 8:
    mylist.append(2*i)
    i = i + 1
print mylist

[0, 2, 4, 6, 8, 10, 12, 14]


The del keyword removes the element of the list at the index specified. For example, this code deletes the last element of scores.

In [22]:
print scores
del scores[-1]
print scores

[94, 82, 84, 100, 93, 85]
[94, 82, 84, 100, 93]
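
The del keyword is not limited to the last element. As a sketch, deleting from the middle of a list shifts every later element down by one index. The list `demo_scores` is a stand-in introduced here so that `scores` keeps the values shown above.

```python
# demo_scores is a hypothetical copy of scores
demo_scores = [94, 82, 84, 100, 93]

del demo_scores[1]        # remove the element at index 1 (the value 82)
print(demo_scores)        # [94, 84, 100, 93]
print(demo_scores[1])     # 84, which used to be at index 2
print(len(demo_scores))   # 4
```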


## Slicing

Let's recall what's inside scores right now.

In [23]:
print scores

[94, 82, 84, 100, 93]


The last technique we'll talk about is slicing. If you want more than one element of a list, but not the entire list, you can use this. Slicing will return only the section of the list that you specify. For example:

In [24]:
print scores[2:4]

[84, 100]


The colon inside the square brackets [ ] is how we indicate that we are slicing. The first number is the index of the first element that we want to include, and the second number is the index of the first element that is NOT included. Therefore the code above printed the values with index 2 and 3. If we leave the left-hand side of the colon blank, the slice will start at the beginning of the list. If we leave the right-hand side of the colon blank, the slice will end with the final element of the list. When slicing, what you get out will be another list.

In [25]:
print scores[:4] #a list of the first 4 elements of scores

[94, 82, 84, 100]


In [26]:
print scores[2:] #a list omitting the first 2 elements of scores

[84, 100, 93]
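
Two further properties of slices are worth seeing in action: negative indices work inside a slice, and the list you get back is a new list, so changing it does not change the original. The sketch below uses `demo_scores`, a name introduced here, so that `scores` is not modified.

```python
# demo_scores is a hypothetical copy of scores
demo_scores = [94, 82, 84, 100, 93]

last_two = demo_scores[-2:]   # start two from the end, run to the end
print(last_two)               # [100, 93]

middle = demo_scores[1:4]     # elements with index 1, 2 and 3
middle[0] = 0                 # change the slice...
print(middle)                 # [0, 84, 100]
print(demo_scores)            # ...the original is unchanged: [94, 82, 84, 100, 93]
```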


## Practice Problems

Write a cell that creates a list that contains 5 separate entries with value 1. (That is, the list has length 5 and every entry has value 1.)

In [1]:
my_list = [1,1,1,1,1]

Write a cell that creates a list containing the first 17 odd numbers.

In [3]:
# Create an empty list
odd_list = []

# Dummy variable
i = 1

# Loop 17 times
while i <= 17:
    # Do what we need to do
    # by appending to the list
    odd_list.append(2*i-1)
    
    # Increment i
    i = i + 1
    
# Print our answer
print odd_list

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33]


Write a cell that creates a list of all multiples of 5 less than 100.

In [1]:
# Create an empty list
multiples = []

# Dummy variable (for indexing)
i = 5

# I want to loop while i < 100
while i < 100:
    # add multiple of 5 to the list
    multiples.append(i)
    # need to increment by 5
    i = i + 5

print multiples

[5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]


Write a cell that sums all the entries of a list.

In [3]:
# create a list of numbers for testing
alist = [1,2,3,4,5,6,7,8,9,10]

# Create a variable that holds the sum, initialize to zero
mysum = 0

# We are going to need the length of the list
size = len(alist)

# Dummy variable for indexing the list
i = 0

while i < size:
    # add current value to mysum
    mysum = mysum + alist[i]
    # increment i
    i = i + 1

# Now print ... should be 55
print mysum

55


Write a cell that calculates the average of a list of numbers.

In [5]:
# create a list of numbers for testing
alist = [1,2,3,4,5,6,7,8,9,10]

# Create a variable that holds the sum, initialize to zero
mysum = 0

# We are going to need the length of the list
size = len(alist)

# Dummy variable for indexing the list
i = 0

while i < size:
    # add current value to mysum
    mysum = mysum + alist[i]
    # increment i
    i = i + 1

# I need to convert size from an integer to a float
# so that the result can be a fractional number
avg = mysum / float(size)

# Now print ... should be 5.5
print avg

5.5


Write a cell that copies all numbers from the given list below that are not multiples of 2 and puts them in a new list. (hint: This can be done using a loop with a conditional).

In [8]:
first_list = [ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# Get size of list
size = len(first_list)

# Dummy variable 
i = 0

# Create empty second list
second_list = []

while i < size:
    # if odd (not a multiple of 2), add to the list
    if first_list[i] % 2 == 1:
        second_list.append(first_list[i])
    # Increment index
    i = i + 1
        
# Print result
print second_list

[1, 9, 25, 49, 81]


## Advanced Problems

*Standard Deviation*: Write a cell that calculates the standard deviation of a list of numbers. The formula for the standard deviation is  
  
$$ \sigma = \sqrt{\frac{1}{N-1} \sum_{i = 1}^N(x_i - \bar{x})^2}$$  
  
where $N$ is the number of elements in the list, $x_i$ is a particular element of the list, and $\bar{x}$ is the average of the list of numbers. (hint: you will need to calculate the average first).

Test your code on the given list. The answer is approximately 31.23.

In [11]:
import math
test_list = [7, 39, 2, 56, 98, 74, 34, 17, 56, 88, 66, 0, 56, 34]

# We need to compute the average first (I just copied the code from above 
# and made the appropriate changes)

# Create a variable that holds the sum, initialize to zero
mysum = 0

# We are going to need the length of the list
size = len(test_list)

# Dummy variable for indexing the list
i = 0

while i < size:
    # add current value to mysum
    mysum = mysum + test_list[i]
    # increment i
    i = i + 1

# I need to convert size from an integer to a float
# so that the result can be a fractional number
avg = mysum / float(size)

# Set my dummy variable back to zero
i = 0

# Create a variable that holds the sum for the standard deviation
mysum2 = 0

while i < size:
    # add the squared deviation from the average to mysum2
    mysum2 = mysum2 + (test_list[i]-avg)**2
    # increment i
    i = i + 1

stddev = math.sqrt((1./(size-1))*mysum2)

print stddev

31.2340508751
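
As a sanity check on the formula, it can help to try a list small enough to work out by hand. For the values 1 through 5 the average is 3, the squared deviations are 4, 1, 0, 1, 4 (sum 10), and dividing by N-1 = 4 gives 2.5, so the standard deviation should be the square root of 2.5, about 1.58. The sketch below repeats the same steps as the solution above; `small_list` and the other variable names are introduced here for illustration.

```python
import math

# small_list is a hypothetical test case that is easy to verify by hand
small_list = [1, 2, 3, 4, 5]
n = len(small_list)

# Average of the list
total = 0
i = 0
while i < n:
    total = total + small_list[i]
    i = i + 1
mean = total / float(n)

# Sum of squared deviations from the average
squares = 0
i = 0
while i < n:
    squares = squares + (small_list[i] - mean) ** 2
    i = i + 1

small_stddev = math.sqrt(squares / (n - 1))
print(small_stddev)   # about 1.58
```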


*Median*: Write a cell that calculates the median of a list of numbers. The answer for test_list is 47.5.

In [None]:
test_list = [7, 39, 2, 56, 98, 74, 34, 17, 56, 88, 66, 0, 56, 34]

# First we need to use the function sorted to sort the list
test_sorted = sorted(test_list)
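
The cell above stops after sorting. One possible way to finish it, offered as a sketch rather than the intended solution, is to take the middle element of the sorted list when its length is odd and the average of the two middle elements when its length is even. The names `middle` and `median` are introduced here, and the parenthesized `print` is used so the cell runs under both Python 2 and Python 3.

```python
test_list = [7, 39, 2, 56, 98, 74, 34, 17, 56, 88, 66, 0, 56, 34]
test_sorted = sorted(test_list)
size = len(test_sorted)

# Integer division gives the index of the (upper) middle element
middle = size // 2

if size % 2 == 1:
    # Odd length: a single middle element
    median = test_sorted[middle]
else:
    # Even length: average the two middle elements
    median = (test_sorted[middle - 1] + test_sorted[middle]) / 2.0

print(median)   # 47.5
```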

*Mode*: Write a cell that calculates the mode of a list of numbers. The answer for test_list is 56.

In [21]:
test_list = [7, 39, 2, 56, 98, 74, 34, 17, 56, 88, 66, 0, 56, 34]

# Get size of list
size = len(test_list)

# Create a dictionary to hold the number of times
# an integer appears in the list
d = {}

# Index variable
i = 0
while i < size:
    # Get current value in list
    value = test_list[i]
    # Check to see if the current value is in the dictionary
    result = d.get(value)
    # If the result is None, then this means the value is a new entry 
    # in the dictionary
    if result is None:
        d[value] = 1
    else:
        d[value] = result+1
    i = i + 1

# Print the counts so we can check them by eye
print d

# Find the value with the largest count
mode = -1
max_value = -1
for k, v in d.items():
    if v > max_value:
        max_value = v
        mode = k

print mode

{0: 1, 2: 1, 34: 2, 66: 1, 39: 1, 74: 1, 7: 1, 98: 1, 17: 1, 56: 3, 88: 1}
56
