# Notebook 5.2: Sequences (tuples, lists, and range)

### Container types
Here we will cover several objects that can be used as containers to store other objects. In particular, we'll cover `tuples` and `lists`. We will then cover `range`, a built-in sequence type that represents a range of numbers to iterate over, which is especially useful for indexing elements in a container. 

### Tuples
A tuple consists of a number of values separated by commas, and is displayed as a container whose elements are enclosed in parentheses. Although a tuple can be created as in the cell below by assigning comma-separated values, it is clearer to surround the elements with parentheses to show that they form a tuple. 

In [1]:
# create a tuple variable 
a = 1, 2, 3
a

(1, 2, 3)

In [2]:
# another way to create a tuple variable
b = (1, 2, 3)
b

(1, 2, 3)

In [4]:
# another way to create a tuple
c = tuple((1, 2, 3))
c

(1, 2, 3)
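
As noted above, it is the commas rather than the parentheses that define a tuple. A minimal sketch of the two edge cases where this matters, one-element and empty tuples (the variable names here are only for illustration):

```python
# The comma, not the parentheses, is what makes a tuple.
single = (1,)        # one-element tuple: note the trailing comma
not_a_tuple = (1)    # just the integer 1 wrapped in parentheses
empty = ()           # the empty tuple is the one case with no comma

print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # int
print(len(empty))                  # 0
```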

### What can tuples store?
Tuples can store a variety of data objects, such as strings, integers, floats, even other tuples and lists. 

In [5]:
tup1 = (1, 2, 3)
tup2 = ('a', 'b', 'c')
tup3 = (4, 5, 'd', 'e')
tup4 = (tup1, tup2)

In [6]:
tup1

(1, 2, 3)

In [7]:
tup4

((1, 2, 3), ('a', 'b', 'c'))

### Indexing and slicing
Tuples can be indexed and sliced just like strings or lists. 

In [8]:
tup4[0]

(1, 2, 3)
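
A few more indexing patterns, sketched with the same tuples defined earlier (they are redefined here so the cell runs on its own; the names `last_item`, `first_two`, and `nested` are only illustrative):

```python
tup1 = (1, 2, 3)
tup2 = ('a', 'b', 'c')
tup4 = (tup1, tup2)

last_item = tup2[-1]     # negative indices count from the end
first_two = tup1[:2]     # slicing a tuple returns a new tuple
nested = tup4[1][0]      # index the outer tuple, then the inner one

print(last_item)   # c
print(first_two)   # (1, 2)
print(nested)      # a
```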

### What is special about tuples?
For the most part tuples are very similar to lists, but offer a bit less functionality, so for that reason they are typically used less often than lists. However, when used correctly they offer a number of advantages. The main difference between tuples and lists is that tuples are *immutable*, just like strings. This means that although we can index elements in the tuple, we cannot modify individual elements of it. This is shown below. 

Immutability may seem like an obscure property for now, but it turns out to be important, as we will learn in a later session about dicts and sets.

In [10]:
# error: *try* to modify a tuple element in place
tup3[0] = 'a'

TypeError: 'tuple' object does not support item assignment
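
Since a tuple cannot be changed in place, one common workaround is to build a new tuple from pieces of the old one. The sketch below catches the error with `try`/`except` (a construct not otherwise covered in this notebook) so that the cell runs to completion; `message` and `new_tup` are illustrative names.

```python
tup3 = (4, 5, 'd', 'e')

# item assignment on a tuple raises TypeError, which we catch here
try:
    tup3[0] = 'a'
except TypeError as err:
    message = str(err)
    print("caught:", message)

# build a new tuple by concatenation instead of modifying the old one
new_tup = ('a',) + tup3[1:]
print(new_tup)   # ('a', 5, 'd', 'e')
print(tup3)      # (4, 5, 'd', 'e'), the original is unchanged
```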

### Lists
Lists are one of the most commonly used data types in Python. They are a sort of general-purpose container. Lists can be indexed and sliced like strings and tuples, but in addition they can also be mutated. Finally, a *mutable* object! This means that you can change elements of the list and it will still be the same object in memory. We'll show many examples of this below. But first, let's create a list. Lists are written as elements enclosed in square brackets and can store any other type of data object. 

In [13]:
# store a list variable
mylist = ['a', 'b', 'c', 'd']

In [14]:
# index or slice to view elements in the list
mylist[:2]

['a', 'b']

In [15]:
# assign (mutate) new values to indexed positions in the list
mylist[0] = "apple"
mylist[-1] = "dandelion"

In [16]:
# see the new mutated list
mylist

['apple', 'b', 'c', 'dandelion']
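
One way to check the claim that a mutated list is still the same object in memory is the built-in `id()` function, which returns an identifier for an object. This is a small sketch, and the `alias` name is only for illustration:

```python
mylist = ['a', 'b', 'c', 'd']
id_before = id(mylist)

mylist[0] = "apple"            # mutate in place
id_after = id(mylist)

print(id_before == id_after)   # True: still the same object

# a second name bound to the same list sees the change too
alias = mylist
alias[-1] = "dandelion"
print(mylist)                  # ['apple', 'b', 'c', 'dandelion']
```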

### Another way to create a list

In [17]:
list('apple')

['a', 'p', 'p', 'l', 'e']

### List built-ins
Lists also have many built-in methods that can be used to operate on them, such as adding objects to the list, counting them, sorting them, etc. Let's try out some of these below. As we've done several times previously, type a list variable name below with a period at the end of it, and press the `<tab>` key to see the many available options. You'll notice as you execute the commands below that most of them modify the list in place, meaning that they don't return a new list but instead modify your existing list. 

In [None]:
mylist.

In [18]:
# add an element to the end of the list
mylist.append("dog")
mylist

['apple', 'b', 'c', 'dandelion', 'dog']

In [19]:
# sorts the list (alphanumerically, by default)
mylist.sort()
mylist

['apple', 'b', 'c', 'dandelion', 'dog']

In [20]:
# get index of the element 'c'
mylist.index("c")

2

In [21]:
# reverse order of the list
mylist.reverse()
mylist

['dog', 'dandelion', 'c', 'b', 'apple']

In [22]:
# empties the list
mylist.clear()
mylist

[]
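
A consequence of in-place modification is that methods like `.sort()` return `None`, so assigning their result to a variable is a common mistake. When a new list is wanted instead, the built-in `sorted()` function can be used. A short sketch with made-up example data:

```python
words = ["dog", "apple", "cat"]

result = words.sort()      # sorts in place and returns None
print(result)              # None
print(words)               # ['apple', 'cat', 'dog']

# sorted() leaves the original untouched and returns a new list
original = ["dog", "apple", "cat"]
new_list = sorted(original)
print(original)            # ['dog', 'apple', 'cat']
print(new_list)            # ['apple', 'cat', 'dog']
```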

### Iteration
Both tuples and lists can be iterated over. Unlike string objects, where the elements iterated over were individual characters, in tuples or lists the iteration is over the objects stored in the container. For example, below the three string objects in the list are the elements. 

In [23]:
mylist = ["species_A", "species_B", "species_C"]
for item in mylist:
    print(item)

species_A
species_B
species_C


### Lists as stacks
A common use of lists is to store objects that have been passed through some type of filter process. Lists are nice for this because you can start with an empty list and sequentially add objects to it to build it up. Example below. 

In [24]:
vowels = []
for item in "abcdefghijklmnopqrstuvwxyz":
    if item in "aeiou":
        vowels.append(item)

In [25]:
vowels

['a', 'e', 'i', 'o', 'u']
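
The term "stack" usually refers to a last-in, first-out container: items are added to the end and removed from the end. Assuming that is the sense intended by the heading, a minimal sketch pairs `.append()` with the list method `.pop()`:

```python
stack = []
stack.append("a")
stack.append("b")
stack.append("c")

top = stack.pop()   # removes and returns the last item added
print(top)          # c
print(stack)        # ['a', 'b']
```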

### List comprehension
A more compact way to build a list while iterating with a for-loop, optionally filtered by a conditional statement, is to use a syntax called *list comprehension*. This is essentially a way of rewriting a multi-line for-loop statement into a single line. The point of list comprehension is to make your code more compact and easier to read. 

In [26]:
vowels = [i for i in "abcdefghi" if i in "aeiou"]
vowels

['a', 'e', 'i']
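
The general pattern is `[expression for item in iterable if condition]`, where the expression can also transform each item. As a sketch, here is the vowel example written both ways, with each vowel converted to uppercase (the variable names are illustrative):

```python
# the multi-line loop version
upper_loop = []
for letter in "abcdefghi":
    if letter in "aeiou":
        upper_loop.append(letter.upper())

# the same logic as a single list comprehension
upper_comp = [letter.upper() for letter in "abcdefghi" if letter in "aeiou"]

print(upper_loop)                 # ['A', 'E', 'I']
print(upper_comp == upper_loop)   # True
```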

### The `range` sequence
The sequence object `range` is a special, highly efficient type for iterating over numeric values. It has the form `range(start, stop, step)`, where the `stop` value itself is excluded, and returns an object that computes numbers on the fly as they are requested. This makes it highly efficient: if you tell it to span a billion numbers it doesn't need to generate them ahead of time but instead produces them only as they are needed. We will discuss Python generators, which are similarly lazy, in the future, but for now `range` is important since it is often used in conjunction with sequence type objects to sample their index. 

In [27]:
## sample 10 values
for idx in range(0, 10):
    print(idx)

0
1
2
3
4
5
6
7
8
9


In [28]:
## sample values from 0 up to (not including) 20 in steps of 2
for idx in range(0, 20, 2):
    print(idx)

0
2
4
6
8
10
12
14
16
18


In [29]:
## examine a range object
ra = range(0, int(1e10), 100)
ra

range(0, 10000000000, 100)

In [30]:
## query the range object.
## It doesn't need to generate every value in the range to know 500 is in it
500 in ra

True
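
Because `range` is a sequence, it also supports `len()`, indexing, and slicing, and it can be converted to a list when the actual values are needed. A short sketch using a smaller range than the one above:

```python
ra = range(0, 20, 2)

print(len(ra))                 # 10
print(ra[3])                   # 6
print(list(ra[:3]))            # [0, 2, 4]

# a negative step counts downward; the stop value is still excluded
print(list(range(5, 0, -1)))   # [5, 4, 3, 2, 1]
```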

### Index a list using range
It is often useful to use range to index the elements of an object. This can be done by accessing the length of the object, if it is a sequence, to know how many items are in it. Let's use the example from earlier when we were learning about strings to count the differences in DNA between two sequences. 

In [32]:
dna1 = "ACAGAGTTGCCAGGAGATGACAGAAAGGTGTGGGTTACAACTCTCTCTAATTTAAGGGCCAATTAACATT"
dna2 = "ACAGAGTCGCCAGGAGATGACAGAAAGGTCTGGGTTACAACTCTCTCTAAAATAAGGGCCAATTAACGTT"

In [33]:
## get length of dna1
len(dna1)

70

In [34]:
## a range object over this length
range(len(dna1))

range(0, 70)

In [35]:
## a for-loop to iterate over the range and compare dna strings at each index
for idx in range(len(dna1)):
    if dna1[idx] != dna2[idx]:
        print(idx)

7
29
50
51
67
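
The same comparison can be written without explicit indexing by using the built-ins `enumerate()` and `zip()`, which are not covered in this notebook. This is an alternative sketch on two short made-up sequences, not a replacement for the approach above. Note that `zip()` stops at the end of the shorter sequence.

```python
# short hypothetical sequences, used only for illustration
seq1 = "ACAGT"
seq2 = "ACCGA"

# zip pairs up the characters; enumerate supplies the index of each pair
diffs = []
for idx, (base1, base2) in enumerate(zip(seq1, seq2)):
    if base1 != base2:
        diffs.append(idx)

print(diffs)   # [2, 4]
```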


### Challenges

A. Create a list that contains the three objects in the cell below. 

In [1]:
a = (1, 10, 100)
b = ['dog', 'cat', 'rat']
c = "Columbia University"

In [2]:
nestList = [a, b, c]

B. Find all values between 0 and 200 that are divisible by 17 and store them in a list.

In [3]:
newList = []
for i in range(0, 201):
    if i % 17 == 0:
        newList.append(i)

C. Rewrite the for-loop below using list-comprehension

In [4]:
store = []
for idx in range(100):
    if idx > 90:
        store.append(idx)

In [5]:
store = [idx for idx in range(100) if idx > 90]

D. Write a for-loop nested within a for-loop to iterate over each element in the list, and then over each character in the element. As you iterate over the characters, use an if statement to call `print()` on the character only if it is uppercase (capitalized). 

In [6]:
mylist = ["Camel", "hOrse", "DonkEy", "elephant"]

In [7]:
for word in mylist:
    for char in word:
        if char.isupper():
            print(char)

C
O
D
E


### Optional extended reading on Python objects
https://docs.python.org/3/tutorial/datastructures.html?highlight=tuples