## Lists

Like strings, Python lists are sequence types. A list is an ordered collection of objects. The elements of a list can be any type of object, including other lists.

Lists use square brackets [] and are mutable (unlike strings). We can modify a list in place.

By the way, make sure you don't overwrite 'list'. It is a built-in name rather than a reserved word, so Python will let you do this:

```
list = [1,2,3] # just don't
```

and then the built-in list() is hidden behind your variable (at least for the rest of your program's execution).
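
A minimal sketch of the shadowing problem. The rebinding is kept inside a function (the hypothetical `shadow_demo`) so that the built-in stays usable afterwards. At the top level of a notebook, `del list` removes your variable and makes the built-in visible again.

```python
def shadow_demo():
    """Rebind the name 'list' locally and try to call it like the built-in."""
    list = [1, 2, 3]          # the local name now hides the built-in list
    try:
        list('abc')           # this tries to call the list object [1, 2, 3]
    except TypeError as err:
        return str(err)

# shadow_demo and message are illustrative names, not part of the notebook
message = shadow_demo()
print(message)                # the TypeError message
print(list('abc'))            # outside the function the built-in is intact
```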

### Creating lists 

Lists can be created by enclosing objects in square brackets, or by using the list() constructor. You can also print a list.

In [45]:
lista = [1, 2, 3]     # create a list with a literal
listb = list(range(4,7))  # create a list with list()
print('listb:', listb)

listb: [4, 5, 6]


### List operators 

The + operator joins two lists into a new list, while the += operator extends an existing list in place with the elements of another list (or of any iterable).

In [53]:
listab = lista + listb
print('listab length = ', len(listab))
listab += [7]   # add one element
listab += [8, 9] # add many elements
print('listab length = ', len(listab))

listab length =  6
listab length =  9


In [54]:
listx = listab + 5  # can't do this

TypeError: can only concatenate list (not "int") to list
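
To add a single value with + or +=, wrap it in a list first. The sketch below also shows the difference between the two operators: += modifies the existing list, whereas + builds a new one. The names `nums`, `alias` and `longer` are only for illustration.

```python
nums = [4, 5, 6]
alias = nums               # a second name for the same list object

nums += [7]                # extends the existing list in place
print(alias)               # [4, 5, 6, 7]: the alias sees the change

longer = nums + [8]        # builds a new list; nums is unchanged
print(nums)                # [4, 5, 6, 7]
print(longer)              # [4, 5, 6, 7, 8]
print(longer is nums)      # False
```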

The **for-in** and **in** constructs work well with lists. The **in** construct returns the boolean True or False. The **for-in** loop is typically favored over counting for loops or while loops.

The **for-in** below iterates through each element in the list. 

In [48]:
friends = ['Jim', 'Hamed', 'Charlotte']

for friend in friends:
    print('Hello', friend)

Hello Jim
Hello Hamed
Hello Charlotte
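
When the position of each element is also needed, the built-in enumerate() keeps the for-in style without a manual counter. A short sketch with the same friends list (the name `greetings` is only for illustration):

```python
friends = ['Jim', 'Hamed', 'Charlotte']

greetings = []   # illustrative name, not part of the notebook
for position, friend in enumerate(friends, start=1):
    # enumerate yields (position, element) pairs, counting from 1 here
    greetings.append(f'{position}. Hello {friend}')

print('\n'.join(greetings))
```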


By the way, these **for-in** and **in** constructs also work on strings as we discussed in the last notebook, and they work on other container objects as we will show in the next notebook.

In [50]:
# use for-in with any iterable object
for letter in 'aeiou':
    if letter in 'friend':
        print(letter, ' is in word friend')

e  is in word friend
i  is in word friend


## List methods

Lists have many built-in methods, and several built-in functions also work on lists.

### index()

The index() method returns the index of the first element equal to its argument. It raises a ValueError if there is no such element.

In [12]:
if 'Charlotte' in friends:
    print(friends[friends.index('Charlotte')], ' is my friend.')

Charlotte  is my friend.
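
Instead of testing with **in** first, the ValueError from index() can be caught. This is a sketch only: `find_position` is a hypothetical helper, and returning -1 for a missing element is an assumption made for the example.

```python
friends = ['Jim', 'Hamed', 'Charlotte']

def find_position(items, target):
    """Hypothetical helper: index of target in items, or -1 if it is absent."""
    try:
        return items.index(target)
    except ValueError:        # raised by index() when target is not in items
        return -1

print(find_position(friends, 'Hamed'))   # 1
print(find_position(friends, 'Zoe'))     # -1
```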


### len()

The len() function works on lists just as it did for strings.

In [20]:
len(friends)

3

### sort(), sorted() and reverse()

The sort() method sorts a list in place, returning None, whereas the built-in sorted() function returns a new sorted list and leaves the original unchanged. The reverse() method reverses the list in place, returning None.

In [21]:
for friend in sorted(friends):
    print(friend)
print('after sorted():',' '.join(friends))
friends.sort()
print('after sort():', ' '.join(friends))
friends.reverse()
print('after reverse():', ' '.join(friends))

Charlotte
Hamed
Jim
after sorted(): Jim Hamed Charlotte
after sort(): Charlotte Hamed Jim
after reverse(): Jim Hamed Charlotte
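
Both sort() and sorted() accept the optional `key` and `reverse` arguments. A brief sketch with an illustrative list named `words`:

```python
words = ['pear', 'Fig', 'banana', 'apple']   # illustrative data

print(sorted(words))                  # ['Fig', 'apple', 'banana', 'pear'] (uppercase sorts first)
print(sorted(words, key=str.lower))   # ['apple', 'banana', 'Fig', 'pear'] (case-insensitive)
print(sorted(words, key=len))         # ['Fig', 'pear', 'apple', 'banana'] (shortest first)

result = words.sort(reverse=True)     # in place, descending order
print(words)                          # ['pear', 'banana', 'apple', 'Fig']
print(result)                         # None, because sort() works in place
```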


### insert(), append(), remove(), and pop()

We can modify a list by inserting elements at a given position, appending elements to the end, removing elements by value, or popping elements off by index (by default from the end of the list).

In [17]:
fruits = ['apple', 'banana', 'orange']
print('point 1:', ' '.join(fruits))

fruits.insert(0, 'peach')        # insert at position 0
print('point 2:', ' '.join(fruits))

fruits.append('pear')
print('point 3:', ' '.join(fruits))

if 'apple' in fruits:
    fruits.remove('apple')    # raises ValueError if not in list
print('point 4:', ' '.join(fruits))

fr = fruits.pop()
print('point 5:', ' '.join(fruits), 'after popping', fr)

fr = fruits.pop(0)
print('point 6:', ' '.join(fruits), 'after popping', fr)

point 1: apple banana orange
point 2: peach apple banana orange
point 3: peach apple banana orange pear
point 4: peach banana orange pear
point 5: peach banana orange after popping pear
point 6: banana orange after popping peach
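
Two details are worth a quick check: remove() deletes only the first matching element, and pop() accepts any valid index, not just the front or the end. A small sketch with an illustrative list named `basket`:

```python
basket = ['apple', 'banana', 'apple']   # illustrative data

basket.remove('apple')        # only the first 'apple' is removed
print(basket)                 # ['banana', 'apple']

basket.insert(1, 'cherry')    # insert before the element at index 1
print(basket)                 # ['banana', 'cherry', 'apple']

middle = basket.pop(1)        # pop from the middle by index
print(middle, basket)         # cherry ['banana', 'apple']
```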


### Efficiency in lists and timeit()

Lists are efficient in Python, but the cost of an operation depends on where in the list it happens.

The timeit module is helpful for quantifying efficiency comparisons. Its default_timer() function returns the time in seconds.

First we make a long list and pop elements off the front of the list until it is empty.
Then we make the list again and pop elements off the end of the list.

Notice that popping off the end of the list is *much* more efficient than popping off the front. Popping off or inserting into the front of the list is an O(n) operation and should be avoided in loops, because every remaining element has to be shifted by one position.

Lists have very fast random access, and accessing or popping at the end is O(1). See the [list time complexity page](https://wiki.python.org/moin/TimeComplexity) for costs of list operations.

In [18]:
import timeit

long_list = list(range(1,100000))  # about 100K elements
start_time = timeit.default_timer()
while long_list:
    long_list.pop(0)
print("time = ", timeit.default_timer() - start_time)

long_list = list(range(1,100000))
start_time = timeit.default_timer()
while long_list:
    long_list.pop()
print("time2 = ", timeit.default_timer() - start_time)

time =  0.918456026000058
time2 =  0.007165963000261399
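
If a program really does need to remove elements from the front, collections.deque from the standard library offers popleft(), which is O(1). The sketch below is one possible comparison, not a benchmark: the function names and the size `n` are assumptions made for the example, and the timings will vary by machine.

```python
import timeit
from collections import deque

def drain_list_front(n):
    """Pop every element off the front of a list (O(n) per pop)."""
    items = list(range(n))
    while items:
        items.pop(0)

def drain_deque_front(n):
    """Pop every element off the front of a deque (O(1) per pop)."""
    items = deque(range(n))
    while items:
        items.popleft()

n = 20000   # illustrative size, smaller than the example above
list_time = timeit.timeit(lambda: drain_list_front(n), number=1)
deque_time = timeit.timeit(lambda: drain_deque_front(n), number=1)
print('list.pop(0):    ', list_time)
print('deque.popleft():', deque_time)
```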


### slices

List slices work just like string slices.


In [28]:
list1 = list(range(1,101))
print(list1[:5])   # first elements
print(list1[95:])  # last elements
print(list1[2:5])  # middle elements

[1, 2, 3, 4, 5]
[96, 97, 98, 99, 100]
[3, 4, 5]
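
Slices also accept negative indices and an optional step, and an out-of-range end is simply clipped. A short sketch using the same list1:

```python
list1 = list(range(1, 101))

print(list1[-3:])       # [98, 99, 100]: the last three elements
print(list1[:10:2])     # [1, 3, 5, 7, 9]: every second element of the first ten
print(list1[4::-1])     # [5, 4, 3, 2, 1]: backwards from index 4
print(list1[98:200])    # [99, 100]: the end index is clipped to the list length
```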


### using slice notation to copy or clear a list

* a[:] creates a (shallow) copy of the list a
* del a[:] clears the list a


In [29]:
list2 = list1[:]   # same as list2 = list1.copy()
print(len(list2))

del list2[:]
print('length of list 1 is still = ', len(list1))
print('length of list 2 is:', len(list2))

100
length of list 1 is still =  100
length of list 2 is: 0
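
The following sketch checks that a slice copy is a separate object, and shows the clear() method as an alternative to del with a slice. The names `numbers`, `copy_of_numbers` and `alias` are only for illustration.

```python
numbers = [1, 2, 3]
copy_of_numbers = numbers[:]          # new list with the same elements
alias = numbers                       # second name for the same list

print(copy_of_numbers == numbers)     # True: equal contents
print(copy_of_numbers is numbers)     # False: different objects

copy_of_numbers.clear()               # same effect as del copy_of_numbers[:]
print(numbers)                        # [1, 2, 3]: the original is untouched

del alias[:]                          # clears the list that both names refer to
print(numbers)                        # []
```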


### list comprehensions

List comprehensions provide a concise way to build lists. A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more additional for or if clauses.

The following code shows two ways to square each integer in the range 1 to 5.


In [1]:
# method 1: use a for loop
squares = []
for i in range(1,6):   # 1, 2, 3, 4, 5
    squares.append(i**2)
print(squares[-1])
    
# method 2: use list comprehension
squares = [i**2 for i in range(1,6)]
print(squares[-1])

25
25


The following list comprehension creates a list of lists.

In [11]:
int_sq = [[x, x**2] for x in [1, 2, 3] ]
int_sq

[[1, 1], [2, 4], [3, 9]]

Having more than one **for** works just like nested for loops.

In [12]:
listxy = [[x, y] for x in [1, 2] for y in ['a', 'b', 'c']]
listxy

[[1, 'a'], [1, 'b'], [1, 'c'], [2, 'a'], [2, 'b'], [2, 'c']]
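
For comparison, here is a sketch of the equivalent nested for loops. The leftmost **for** of the comprehension corresponds to the outer loop. The name `listxy_loops` is only for illustration.

```python
listxy_loops = []
for x in [1, 2]:                  # outer loop: first for in the comprehension
    for y in ['a', 'b', 'c']:     # inner loop: second for in the comprehension
        listxy_loops.append([x, y])

listxy = [[x, y] for x in [1, 2] for y in ['a', 'b', 'c']]
print(listxy_loops == listxy)     # True
```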

The following shows an example using the **if** clause. 

In [13]:
list_small_squares = [[x, x**2] for x in range(1, 12) if x**2 < 100]
list_small_squares

[[1, 1], [2, 4], [3, 9], [4, 16], [5, 25], [6, 36], [7, 49], [8, 64], [9, 81]]

### Practice

* Make a list from the individual words in a sentence using the str.split() method
* Print the tokens in order
* Print the tokens in reverse order by iterating backwards
* Use a list comprehension to create a new list based on words that begin with 's'

In [30]:
text = 'This is a sample sentence'

print("Tokens in order:")
tokens = text.split()
for token in tokens:
    print(token)

print("\nTokens in reverse order:")
i = len(tokens) - 1
while i >= 0:
    print(tokens[i])
    i -= 1
    
print("\nTokens that start with s:")
s_tokens = [token for token in tokens if token.startswith('s')]
print(s_tokens)

Tokens in order:
This
is
a
sample
sentence

Tokens in reverse order:
sentence
sample
a
is
This

Tokens that start with s:
['sample', 'sentence']
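
The while loop above practices index arithmetic. As an alternative sketch, the built-in reversed() or a slice with a step of -1 iterates backwards without a counter, and neither changes the original list. The name `backwards` is only for illustration.

```python
text = 'This is a sample sentence'
tokens = text.split()

for token in reversed(tokens):    # iterates from the last token to the first
    print(token)

backwards = tokens[::-1]          # a new list in reverse order
print(backwards)                  # ['sentence', 'sample', 'a', 'is', 'This']
print(tokens)                     # ['This', 'is', 'a', 'sample', 'sentence']
```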


### deepcopy() a list

When we copy a list that contains other lists, some surprising results may occur.

In [31]:
# example 1
list1 = [1, 2, [3, 4], 5]
list2 = list1
list2[2][0] = 9
list1  # the sublist in list 1 was changed when list 2 was changed


[1, 2, [9, 4], 5]

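How can we explain the behavior in the code above? The answer is in the way Python handles references. The assignment list2 = list1 does not copy anything: both names refer to the same list object, so a change made through one name is visible through the other. Even a shallow copy such as list1[:] or list1.copy() copies only the outer list. The inner list is stored as a reference, so the original and the copy would still share it.

To make a completely independent copy, use the deepcopy function from the copy module.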

In [33]:
# example 2
from copy import deepcopy
list1 = [1, 2, [3, 4], 5]
list2 = deepcopy(list1)
list2[2][0] = 9
list1 # list1 remains unchanged


[1, 2, [3, 4], 5]
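
The sketch below puts a shallow copy and a deep copy side by side. The names `original`, `shallow` and `deep` are only for illustration.

```python
from copy import deepcopy

original = [1, 2, [3, 4], 5]
shallow = original[:]             # copies the outer list only
deep = deepcopy(original)         # copies the nested list as well

shallow[0] = 100                  # the outer lists are independent
shallow[2][0] = 9                 # the inner list is shared with original

print(original)                   # [1, 2, [9, 4], 5]
print(shallow)                    # [100, 2, [9, 4], 5]
print(deep)                       # [1, 2, [3, 4], 5]
print(shallow[2] is original[2])  # True: same inner list object
print(deep[2] is original[2])     # False: separate inner list object
```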

### Practice

Write a Python program in the IDE of your choice. The program operates on two lists. Write three functions, all with the same purpose: to return a list of the same length as list1 that contains 1 at each position where the element of list1 appears in list2 and 0 where it does not. Implement the functions using:

* a loop within a loop comparison 
* a one-loop function using the 'in' operator
* a list comprehension function

Start with two small lists until you get the logic right. Then test it with larger lists of random numbers. The code block below shows how to generate random small integers.

In [2]:
from random import randrange

len1 = 50
len2 = 75

list1 = [randrange(100) for _ in range(len1)]
list2 = [randrange(100) for _ in range(len2)]

In [3]:
# check the range of generated numbers
print('max and min of list 2:', max(list2), min(list2))

max and min of list 2: 98 1


Compare the timing of the functions on your data. 
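
One possible way to time a function is sketched below. It does not solve the exercise: `time_function` is a hypothetical helper, and the built-in sorted() only stands in for the functions you write.

```python
import timeit
from random import randrange

def time_function(func, *args, repeats=100):
    """Hypothetical helper: average time in seconds of one call to func(*args)."""
    total = timeit.timeit(lambda: func(*args), number=repeats)
    return total / repeats

sample = [randrange(100) for _ in range(50)]

# sorted() is a stand-in here; pass your own function and both lists instead
average = time_function(sorted, sample)
print('average seconds per call:', average)
```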