This notebook is the third in a series of Jupyter notebooks intended to introduce students to programming in Python. I designed it for the Vaughan house Living Learning Community at UC Merced. It focuses on data structures and loops in Python 3.

Let's first discuss data structures. What is a data structure? A data structure is a way for computers to organize and store various pieces of data. For example, if you consider someone's name and phone number a piece of data, then the contact list on your phone would be a data structure. Or, you could think of your grocery list as a data structure, with each item on the list as a piece of data.

Programmers have a number of data structures that they call upon, such as arrays, linked lists, binary trees, and hash tables, each with different costs and benefits. One of the most important skills for programmers is recognizing which data structure is most appropriate for the task they are trying to accomplish. For the purposes of this notebook we are going to take a look at two of the most commonly used data structures in Python: lists and tuples.

Let's start with lists. A list in Python is a dynamically sized array that can contain any type of data (even within the same list). In addition, a list can be changed after it has been created, either by adding new items, removing current items, changing current items, or even reordering the list. The easiest way to create a list is with square brackets:

In [1]:
#create a list of the Vaughan house instructors
instructorList = ["Mnia", "Hrant", "Erik"]
print(instructorList)


['Mnia', 'Hrant', 'Erik']
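As a quick sketch of the point that one list can hold different types of data, here is a made-up list (the name mixedList and its contents are just for illustration, they aren't part of the Vaughan house data):

```python
# A hypothetical list mixing a string, an integer, a decimal number, and another list
mixedList = ["Mina", 3, 2.5, ["a", "b"]]

# Python prints every item, whatever its type
print(mixedList)
```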


We can get individual items from the list by referencing their index (their position in the list). Note that in Python, indexes start at zero, so the first item is at index 0!

In [2]:
#what's the first item in our list of instructors?
print(instructorList[0])

Mnia


The items in a list can also be lists! These are what we call multi-dimensional lists. They are created the same way as regular lists:

In [3]:
#create a list of instructors with emails
emailList = [["Mina", "mina@ucm.edu"],["Hrant", "hrant@ucm.edu"],["Erik", "erik@ucm.edu"]]
print(emailList)

[['Mina', 'mina@ucm.edu'], ['Hrant', 'hrant@ucm.edu'], ['Erik', 'erik@ucm.edu']]


We again use brackets to get items from lists, but now we can use a series of brackets: the first bracket picks out one of the inner lists, then the second bracket picks out an item from that inner list.

In [4]:
#what's the first list in our email list?
print(emailList[0])

#What's the second item in the third list in our email list?
print(emailList[2][1])

#Reference order is important!
print("Compare " + emailList[1][0] + " with " + emailList[0][1])


['Mina', 'mina@ucm.edu']
erik@ucm.edu
Compare Hrant with mina@ucm.edu


We can also use the reference to change an item in the list:

In [5]:
#What's the first item in the instructor list again?
print(instructorList[0])

#Oops, typo! Better fix it
instructorList[0] = "Mina"
print(instructorList[0])

Mnia
Mina


Something really nice about Python is that there are a few ways to reference items in a list. You can use negative numbers to count from the back of the list forward, and you can use a colon to get a range of items from a list. For example:

In [6]:
#what's the last item in the instructor list?
print(instructorList[-1])

#what are the second and third items in the instructor list? 
#Note that it goes up to the final reference, but doesn't include it!
print(instructorList[1:3])


Erik
['Hrant', 'Erik']
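Here are a few more variations on the colon, as a small sketch. To keep the example self-contained I'm using a separate list called instructors (a name I made up for this cell) rather than the list from above:

```python
# A hypothetical copy of the instructor list, defined here so this cell runs on its own
instructors = ["Mina", "Hrant", "Erik"]

# Leaving out the start means "from the beginning"
firstTwo = instructors[:2]
print(firstTwo)     # ['Mina', 'Hrant']

# Leaving out the end means "through the last item"
lastTwo = instructors[1:]
print(lastTwo)      # ['Hrant', 'Erik']

# Negative numbers work with the colon too: everything except the last item
allButLast = instructors[:-1]
print(allButLast)   # ['Mina', 'Hrant']
```

Note that none of these change the original list; each one gives back a new, shorter list.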


There's more to discuss with lists, but before we continue I want to take a quick detour to talk about the second data structure common to Python, which is called a "tuple". In Python, a tuple is very similar to a list, being an ordered array that can contain any type of data. Tuples are created with parentheses, similar to how brackets are used to create lists:

In [7]:
#let's create a tuple for instructors
instructorTuple = ("Mnia", "Hrant", "Erik")
print(instructorTuple)

('Mnia', 'Hrant', 'Erik')


In [8]:
#what's the first entry in this tuple?
print(instructorTuple[0])

#what's the final entry in this tuple?
print(instructorTuple[-1])

Mnia
Erik


The primary difference between lists and tuples is that tuples are immutable, which means that they **cannot** be changed after they've been created.

In [9]:
#there's a typo again in the tuple!
print(instructorTuple[0])

#what if I try to fix the typo?
instructorTuple[0] = "Mina"

Mnia


TypeError: 'tuple' object does not support item assignment
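So how would I fix the typo? One possible approach, sketched below, is to build a new tuple rather than change the old one. The variable names here (typoTuple, tempList, fixedTuple) are made up for this example:

```python
# A hypothetical tuple with the same typo, defined here so this cell runs on its own
typoTuple = ("Mnia", "Hrant", "Erik")

# list() makes a changeable copy of the tuple's items
tempList = list(typoTuple)
tempList[0] = "Mina"

# tuple() turns the corrected list into a brand new tuple
fixedTuple = tuple(tempList)
print(fixedTuple)   # ('Mina', 'Hrant', 'Erik')

# The original tuple is untouched
print(typoTuple)    # ('Mnia', 'Hrant', 'Erik')
```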

If lists and tuples are so similar, why do we have both? Why not just get rid of tuples? There are various reasons to have both. Because a tuple can't change, Python can store it a little more compactly than a list, and using one signals that the data isn't meant to be modified. But for you, the biggest reason to know about both is that there are some functions in Python that require a tuple and won't work on lists, so it's important for you to be aware of both.
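As one illustration of a function that wants a tuple, here is a sketch using the built-in string method endswith(), which accepts either a single string or a tuple of strings to check against, but not a list. The second domain, "ucmerced.edu", is just a made-up value for the example:

```python
email = "mina@ucm.edu"

# A tuple of possible endings works fine
domainsTuple = ("ucm.edu", "ucmerced.edu")
print(email.endswith(domainsTuple))   # True

# The same endings in a list cause a TypeError
domainsList = ["ucm.edu", "ucmerced.edu"]
try:
    email.endswith(domainsList)
    listWorked = True
except TypeError as err:
    listWorked = False
    print("TypeError:", err)
```

The try/except lines are only there so the cell keeps running and shows the error message instead of stopping.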

Now that we've taken that detour, let's get back to lists. We've seen how to get items in a list and change/replace items in a list. How do we add items to a list? There are two ways to do this. The first is to use the append() function, which adds an item to the end of a list.

In [10]:
#Add Jose as an instructor
instructorList.append("Jose")
print(instructorList)

['Mina', 'Hrant', 'Erik', 'Jose']


If I don't want to add an item to the end of the list, but somewhere else, I need to use the insert() function.

In [11]:
#Insert Jose and his email to the email list at the second position
emailList.insert(1,["Jose", "jose@ucm.edu"])
print(emailList)


[['Mina', 'mina@ucm.edu'], ['Jose', 'jose@ucm.edu'], ['Hrant', 'hrant@ucm.edu'], ['Erik', 'erik@ucm.edu']]


Finally, there are two ways to remove an element from a list. I can use the remove() function, which goes through the list until it finds the first item equal to the value I give it and removes that item:

In [12]:
#Move Jose to the second position in the instructor list
#First, take Jose out of the instructor list
instructorList.remove("Jose")
print(instructorList)

#Now, insert Jose into the second position
instructorList.insert(1,"Jose")
print(instructorList)


['Mina', 'Hrant', 'Erik']
['Mina', 'Jose', 'Hrant', 'Erik']
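Two details of remove() are worth seeing in action: it only removes the first match, and it raises a ValueError if the item isn't in the list at all. Here's a small sketch, using a made-up list called names that contains a duplicate:

```python
# A hypothetical list where "Jose" appears twice
names = ["Jose", "Mina", "Jose", "Erik"]

# Only the first "Jose" is removed
names.remove("Jose")
print(names)   # ['Mina', 'Jose', 'Erik']

# Checking with "in" first avoids a ValueError when the item is missing
if "Hrant" in names:
    names.remove("Hrant")
else:
    print("Hrant is not in the list")
```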


Alternatively, I can use the pop() function to remove the element at the referenced position. What's nice about the pop function is that it also returns the removed element, so we can use a variable to store it if we want.

In [13]:
#Move the third element of the email list to the second position
#First, use the pop() function to get the third element and store it in a variable
tempEmail = emailList.pop(2)
print(tempEmail)
print(emailList)

#Now, put the removed item into the second position
emailList.insert(1, tempEmail)
print(emailList)


['Hrant', 'hrant@ucm.edu']
[['Mina', 'mina@ucm.edu'], ['Jose', 'jose@ucm.edu'], ['Erik', 'erik@ucm.edu']]
[['Mina', 'mina@ucm.edu'], ['Hrant', 'hrant@ucm.edu'], ['Jose', 'jose@ucm.edu'], ['Erik', 'erik@ucm.edu']]
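If I call pop() without giving it a position, it removes and returns the last element of the list. A minimal sketch, using a made-up list called lineup:

```python
# A hypothetical list, defined here so this cell runs on its own
lineup = ["Mina", "Hrant", "Jose", "Erik"]

# With no argument, pop() takes the item off the end of the list
lastPerson = lineup.pop()
print(lastPerson)   # Erik
print(lineup)       # ['Mina', 'Hrant', 'Jose']
```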


All of these functions for adding, modifying, and removing items from a list operate on only one element at a time. So, how do we make multiple changes to a list? Say, add 10 items to a list? For that, we need to use loops. Loops are used in computer programs to do the same (or, at least, similar) things multiple times.

There is a wide variety of loop structures and syntax in different languages, but the most common loops fall into two categories: for loops, which run the same block of code a set number of times (add an element to a list 17 times, for example), and while loops, which run the same block of code until a boolean statement is false (add an element to a list until you have 100 elements, for example). Of these, the for loop is probably the most common, and it's what we'll focus on for this notebook.

In Python, the syntax for a for loop is similar to the syntax for an if statement: you use the "for" command, followed by a loop variable and what to loop over (which determines how many times the loop runs), and then all the indented lines below that command are run once each time through the loop. For example, here's how to print out all the individual elements in the instructor list:

In [14]:
#First, print out the whole list to see what all the elements are
print(instructorList)

#Now, let's print each element individually. First, I need to know how many elements 
#are in the list, so I can tell the for loop how many times to loop
#I know that my list has 4 elements, but since I won't always know how big my list
#is I'll use a built-in function, called len(), to tell me how many elements are in
#my list
listLength = len(instructorList)

#Now I can create a for loop with a variable x, which starts at
#zero and increases by one each time through the loop, and the
#range() function plus the listLength variable to tell the for
#loop how many times to run
for x in range(listLength):
    print(instructorList[x])
    

['Mina', 'Jose', 'Hrant', 'Erik']
Mina
Jose
Hrant
Erik


What's nice about Python, compared with lower-level languages such as C, is that it's been built to be smart about figuring out the number of times to loop, if possible. In the example above, I first had to figure out how long my list is, then tell the for loop how many times it needed to loop. This is similar to what I would have to do in a language like C. However, because Python is more user friendly, it's been built to make this easier. Instead of using the range() function to tell it how many times to go through the list, I can give the for loop the list itself, and it will hand me each item in turn:

In [15]:
#Here's an easier way to iterate through my list
for x in instructorList:
    print(x)

Mina
Jose
Hrant
Erik


In [16]:
#Here's the email for each instructor
for x in emailList:
    print(x[0] + " can be emailed at " + x[1])

Mina can be emailed at mina@ucm.edu
Hrant can be emailed at hrant@ucm.edu
Jose can be emailed at jose@ucm.edu
Erik can be emailed at erik@ucm.edu
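When every item in the list is itself a short list, the for loop can also split each item into separately named variables, which saves writing x[0] and x[1]. Here's a sketch of that idea; the names emails, name, address, and messages are ones I picked for this example:

```python
# A hypothetical, shorter email list, defined here so this cell runs on its own
emails = [["Mina", "mina@ucm.edu"], ["Hrant", "hrant@ucm.edu"]]

messages = []
# Each inner list has two items, so Python can hand them to two variables
for name, address in emails:
    message = name + " can be emailed at " + address
    messages.append(message)
    print(message)
```

This only works when every inner list has exactly as many items as there are variables after "for".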


The range function, which I used above in the first example, is useful when I don't have a list to start with, but instead want to use a for loop to create a list. For example, here's a for loop to get the first 15 powers of two (i.e. 1, 2, 4, 8, 16, etc.), the numbers that each binary digit stands for.

In [17]:
#first start with an empty list
binaryNums = []

#now, do a for loop 15 times, adding the next power of two to the list
for x in range(15):
    binaryNums.append(2 ** x)

print(binaryNums)

[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384]
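For comparison, here is a sketch of how the other kind of loop mentioned earlier, the while loop, could build the same kind of list (1, 2, 4, 8, ...). Instead of looping a set number of times, it keeps going until a condition stops being true. The cutoff of 1000 and the variable names are just choices I made for this example:

```python
# Start with an empty list and the first number in the sequence
doubledNums = []
nextNum = 1

# Keep looping as long as the next number is no bigger than 1000
while nextNum <= 1000:
    doubledNums.append(nextNum)
    nextNum = nextNum * 2

print(doubledNums)   # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
```

Here I didn't need to know ahead of time how many numbers would fit under the cutoff, which is the main reason to reach for a while loop over a for loop.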


I can also use the range function to change the iteration behavior on my loop. For example, instead of getting every item, I can instead count by 2 to get every other item.

In [18]:
#go through the binaryNums list and print every other entry
#the range function needs three arguments in this case (start, stop, and stepsize)
#also, remember python starts counting at zero, not one
for x in range(0,15,2):
    print(binaryNums[x])

1
4
16
64
256
1024
4096
16384
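To see exactly which numbers range() hands to the loop, I can wrap it in list() and print the result. This is only a quick sketch for inspecting range(); the variable names are made up for the example:

```python
# With one argument, range() counts from zero up to (but not including) that number
countUp = list(range(5))
print(countUp)       # [0, 1, 2, 3, 4]

# With three arguments: start, stop, and step size
everyOther = list(range(0, 15, 2))
print(everyOther)    # [0, 2, 4, 6, 8, 10, 12, 14]

# A negative step size counts backwards
countDown = list(range(4, 0, -1))
print(countDown)     # [4, 3, 2, 1]
```

Just like with the colon in list references, the stop value itself is never included.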


Now that you've seen how to create and modify lists, and how to use loops, move on to notebook 3b for some practice.