# Lists in Python

## List Creation
Use square brackets. Lists can contain any mix of data types. You can nest lists inside other lists.

In [2]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

In [3]:
fam2 = [["liz", 1.73],
["emma", 1.68],
["mom", 1.71],
["dad", 1.89]]

In [4]:
fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [5]:
fam2

[['liz', 1.73], ['emma', 1.68], ['mom', 1.71], ['dad', 1.89]]

## Subsetting lists
- indexing starts at 0 (often the hardest adjustment for R users, where indexing starts at 1)
- use a series of square brackets to reach into nested lists
- use negative numbers to count from the end

In [5]:
fam[0]

'liz'

In [6]:
fam2[0]

['liz', 1.73]

In [7]:
fam2[0][0]

'liz'

In [8]:
fam[-1]

1.89

In [9]:
fam2[-1]

['dad', 1.89]

In [10]:
fam2[-1][-1]

1.89

## List Slicing
A slice `fam[start:stop]` includes the item at index `start` but not the item at index `stop`.
You can think of the cuts as happening at the commas, counting commas from 1.
So `fam[1:3]` cuts the list at the first and third commas and extracts `[1.73, 'emma']`.

In [11]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam[1:3]

[1.73, 'emma']

In [12]:
fam[1:2]

[1.73]

In [13]:
fam[1:1]

[]

In [14]:
fam2[0:2]

[['liz', 1.73], ['emma', 1.68]]

In [15]:
fam[2:]

['emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [16]:
fam[:4]

['liz', 1.73, 'emma', 1.68]

In [17]:
fam[:]  # slice with no indices will create a copy of the list.

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [18]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
print(fam)
print(fam[-5:-2])

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
[1.68, 'mom', 1.71]
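
As a quick check of the slicing rules above, here is a minimal sketch (the names `start`, `stop`, `piece`, and `same` are only for illustration). It shows that a slice holds `stop - start` items, and that a negative index is shorthand for counting back from `len(fam)`.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

start, stop = 1, 3
piece = fam[start:stop]   # items at indices start, ..., stop - 1
print(piece)                        # [1.73, 'emma']
print(len(piece) == stop - start)   # True

# -5 means len(fam) - 5 = 3, and -2 means len(fam) - 2 = 6
same = fam[-5:-2] == fam[3:6]
print(same)                         # True
```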


There is no built-in syntax for subsetting disjoint items in a list. Python has no equivalent to R's `list[c(1, 3, 7)]`.

A workaround (from Stack Exchange): if the list is

`a = ['0', 'a', 'b', 3, 4, 'e', 6, 7, 8]`

and the list of indexes is stored in

`b = [1,3,5]`

then a simple one-line solution is

`c = [a[i] for i in b]`

In [7]:
a = ['0', 'a', 'b', 3, 4, 'e', 6, 7, 8]
b = [1,3,5]
c = [a[i] for i in b]  # this is technically called a list comprehension
print(c)

['a', 3, 'e']
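
Another possible workaround, offered here only as an alternative sketch, is `operator.itemgetter` from the standard library. When given more than one index it returns a tuple, so wrap the result in `list()` if you need a list. The name `picked` is only for illustration.

```python
from operator import itemgetter

a = ['0', 'a', 'b', 3, 4, 'e', 6, 7, 8]
b = [1, 3, 5]

# itemgetter(1, 3, 5) builds a function that fetches those positions
picked = itemgetter(*b)(a)
print(picked)         # ('a', 3, 'e')
print(list(picked))   # ['a', 3, 'e']
```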


## Lists are mutable
This means that list methods and item assignment modify the list in place rather than returning a new list.
If the list is assigned to another name, both names refer to the exact same object.

In [8]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
print(fam)
second = fam    # second references fam. second is not a copy of fam.
second[0] = "sister"  # we make a change to the list 'second'
print(second)
print(fam) # changing the list 'second' has changed the list 'fam'

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['sister', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['sister', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]


In [21]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
print(fam)
second = fam[:]  # creates a copy of the list
# second = fam.copy() # you can also create a copy using the copy() method
second[0] = "sister"
print(second)
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['sister', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]


In [22]:
third = fam.copy()
print(third)
third[1] = 1.65
print(third)
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.65, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
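
One way to check whether two names refer to the same object is the `is` operator. The sketch below uses the hypothetical names `alias` and `clone` to contrast plain assignment with a slice copy.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

alias = fam      # plain assignment: a second name for the same list
clone = fam[:]   # slice copy: a new list with the same items

print(alias is fam)   # True, same object
print(clone is fam)   # False, different object
print(clone == fam)   # True, equal contents
```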


In [23]:
print(fam)
fam[1:3] = [1.8, "jenny"]
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
['liz', 1.8, 'jenny', 1.68, 'mom', 1.71, 'dad', 1.89]


# List Methods

- `list.copy()`
    - Return a shallow copy of the list. Equivalent to a[:]
- `list.append(x)`
    - Add an item to the end of the list. Equivalent to a[len(a):] = [x].

In [24]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.append("me")   # unlike R, you don't have to "capture" the result of the function.
# the list itself is modified. append() adds exactly one item per call.
print(fam)

fam = fam + [1.8]  # you can also append to a list with the addition `+` operator
# note that this output needs to be 'captured' and assigned back to fam
print(fam)

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me']
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me', 1.8]
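
The difference between `append` and `+` shows up when a second name refers to the same list. In this small sketch (the name `other` is only for illustration), `append` changes the shared object, while `+` builds a new list and rebinds `fam` to it.

```python
fam = ["liz", 1.73]
other = fam             # other and fam are the same object

fam.append("emma")      # modifies the shared list in place
print(other)            # ['liz', 1.73, 'emma']

fam = fam + [1.68]      # builds a new list and rebinds the name fam
print(fam)              # ['liz', 1.73, 'emma', 1.68]
print(other)            # ['liz', 1.73, 'emma']
```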


- `list.insert(i, x)`
    - Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).

- `list.extend(iterable)`
    - Extend the list by appending all the items from the iterable. Equivalent to a[len(a):] = iterable.

In [25]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.insert(4, "joe")
print(fam)

fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.insert(4, ["joe", 2.0])  # trying to insert a list inserts a list
print(fam)

fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.insert(4, "joe", 2.0)  # raises TypeError: like append, insert takes only one item
print(fam)

['liz', 1.73, 'emma', 1.68, 'joe', 'mom', 1.71, 'dad', 1.89]
['liz', 1.73, 'emma', 1.68, ['joe', 2.0], 'mom', 1.71, 'dad', 1.89]


TypeError: insert() takes exactly 2 arguments (3 given)
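
If you want the rest of a cell to keep running after an error like this, one option is to catch the exception. This is a minimal sketch; the exact wording of the error message depends on the Python version.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

try:
    fam.insert(4, "joe", 2.0)   # too many arguments
except TypeError as err:
    print("TypeError:", err)    # message text varies by Python version

print(fam)   # the failed call left the list unchanged
```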

In [26]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.extend(["joe", 2.0]) # lets you add multiple items, but at the end
print(fam)

fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam[4:4] = ["joe", 2.0] # Use slice and assignment to insert multiple items in a specific position
print(fam)


['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'joe', 2.0]
['liz', 1.73, 'emma', 1.68, 'joe', 2.0, 'mom', 1.71, 'dad', 1.89]
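
The equivalences stated in the method descriptions above can be checked directly. In this sketch, `a` is changed with the methods and `b` with slice assignment, and the two lists stay equal.

```python
a = ["liz", 1.73]
b = ["liz", 1.73]

a.append("emma")
b[len(b):] = ["emma"]        # same effect as append
print(a == b)                # True

a.extend(["mom", 1.71])
b[len(b):] = ["mom", 1.71]   # same effect as extend
print(a == b)                # True
print(b)                     # ['liz', 1.73, 'emma', 'mom', 1.71]
```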


### shallow versus deep copy

(You will not be quizzed on this concept.)

There are actually two ways to make a copy:
- `list.copy()`, which makes a shallow copy
- `copy.deepcopy(list)` (after `import copy`), which makes a deep copy

The difference is noticeable when you have other objects (e.g. other lists) nested in lists.

A shallow copy makes a new list that holds references to the same nested objects.
A deep copy also makes copies of the nested objects.

In [27]:
a = ["a", 1, 2]
b = ["b", 3, 4]
c = [a, b]

d = c  # I am not making a copy. Both d and c refer to the exact same object.
print(c)
print(d)
c[1] = "x"  # this change affects both
print(c)
print(d)

[['a', 1, 2], ['b', 3, 4]]
[['a', 1, 2], ['b', 3, 4]]
[['a', 1, 2], 'x']
[['a', 1, 2], 'x']


In [28]:
a = ["a", 1, 2]
b = ["b", 3, 4]
c = [a, b]

d = c.copy()
c[1] = "x"  # this change affects only c. it does not affect d because d is a copy.
print(c)
print(d)

a.append(100) # We update list a. Lists c and d refer to list a. So this change affects c and d
print(c)
print(d)

[['a', 1, 2], 'x']
[['a', 1, 2], ['b', 3, 4]]
[['a', 1, 2, 100], 'x']
[['a', 1, 2, 100], ['b', 3, 4]]


In [29]:
a = ["a", 1, 2]
b = ["b", 3, 4]
c = [a, b]

import copy
e = copy.deepcopy(c)

c[1] = "x"  # this change affects only c. it does not affect e because e is a copy
print(c)
print(e)

a.append(100) # list c refers to list a, but e holds its own copy of list a. So this change affects c but not e
print(c)
print(e)

[['a', 1, 2], 'x']
[['a', 1, 2], ['b', 3, 4]]
[['a', 1, 2, 100], 'x']
[['a', 1, 2], ['b', 3, 4]]
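
The `is` operator makes the difference visible without modifying anything. In this sketch (the names `shallow` and `deep` are only for illustration), the shallow copy still points at the original nested list, while the deep copy holds an equal but separate one.

```python
import copy

a = ["a", 1, 2]
b = ["b", 3, 4]
c = [a, b]

shallow = c.copy()
deep = copy.deepcopy(c)

print(shallow[0] is a)   # True, the nested list is shared
print(deep[0] is a)      # False, the nested list was copied
print(deep[0] == a)      # True, the copy has equal contents
```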


- `list.remove(x)`
    - Remove the first item from the list whose value is x. It is an error if there is no such item.

- `list.pop([i])`
    - Remove the item at the given position in the list, and return it. If no index is specified, a.pop() removes and returns the last item in the list.

- `list.clear()`
    - Remove all items from the list. Equivalent to del a[:].


In [30]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.remove("liz")
print(fam)

[1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]


In [31]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
j = fam.pop()  # if you don't specify an index, it pops the last item in the list
print(j)
print(fam)

1.89
['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad']


In [32]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
j = fam.pop(2)  # you can also specify an index
print(j)
print(fam)

fam.clear()
print(fam)

emma
['liz', 1.73, 1.68, 'mom', 1.71, 'dad', 1.89]
[]
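
Both `remove` and `pop` raise an exception when there is nothing to remove. The sketch below shows one way to handle that: `remove` raises a `ValueError` for a missing value, and `pop` raises an `IndexError` on an empty list. The name `empty` is only for illustration.

```python
fam = ["liz", 1.73, "emma", 1.68]

try:
    fam.remove("joe")            # "joe" is not in the list
except ValueError as err:
    print("ValueError:", err)

# alternatively, check membership before removing
if "joe" in fam:
    fam.remove("joe")
print(fam)                       # ['liz', 1.73, 'emma', 1.68]

empty = []
try:
    empty.pop()                  # nothing to pop
except IndexError as err:
    print("IndexError:", err)
```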


- `list.index(x)`
    - Return zero-based index in the list of the first item whose value is x. Raises a ValueError if there is no such item.

- `list.count(x)`
    - Return the number of times x appears in the list.

In [33]:
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.index("emma")

2

In [34]:
letters = ["a", "b", "c", "a", "a"]
print(letters.count("a"))

fam2 = [["liz", 1.73],
["emma", 1.68],
["mom", 1.71],
["dad", 1.89]]
print(fam2.count("liz"))  # the string by itself does not exist
print(fam2.count(["liz", 1.73]))

3
0
1
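
Because `index` raises a `ValueError` for a missing value, it can help to test membership with `in` first. The helper `find_index` below is a hypothetical function written for this example, not a built-in.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
letters = ["a", "b", "c", "a", "a"]

def find_index(items, value):
    """Return the index of the first match, or None if value is absent."""
    if value in items:
        return items.index(value)
    return None

print(find_index(fam, "emma"))    # 2
print(find_index(fam, "joe"))     # None
print(find_index(letters, "a"))   # 0, only the first match is reported
```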


- `list.sort(key=None, reverse=False)`
    - Sort the items of the list in place (the arguments can be used for sort customization, see sorted() for their explanation).

- `list.reverse()`
    - Reverse the elements of the list in place.

In [35]:
fam.reverse()
print(fam)

[1.89, 'dad', 1.71, 'mom', 1.68, 'emma', 1.73, 'liz']


In [36]:
fam.sort()  # raises TypeError: floats and strings can't be compared with each other

TypeError: '<' not supported between instances of 'str' and 'float'

In [37]:
some_digits = [4,2,7,9,2,5,3]
some_digits.sort()
print(some_digits)

some_digits.sort(reverse = True)
print(some_digits)

[2, 2, 3, 4, 5, 7, 9]
[9, 7, 5, 4, 3, 2, 2]
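
The `key` argument mentioned above can work around the mixed-type error. It names a function that is applied to each item before comparing. This is only a sketch: sorting by `str` compares the items as text, which may or may not be the order you want. Note that `sorted()` returns a new list and leaves the original alone.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

by_text = sorted(fam, key=str)   # compare every item by its string form
print(by_text)   # [1.68, 1.71, 1.73, 1.89, 'dad', 'emma', 'liz', 'mom']
print(fam[0])    # 'liz', the original list is unchanged

names = ["emma", "liz", "mom", "dad"]
names.sort(key=len)              # in place, shortest first, ties keep their order
print(names)     # ['liz', 'mom', 'dad', 'emma']
```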


# String Methods

Strings are immutable. This means that when you use a method on a string, it does not modify the string itself; it returns a new string.

In [71]:
name = "miles chen"
print(name.upper())
print(name.capitalize())
print(name.title())
print(name.lower())
print(name) # string itself is not modified

MILES CHEN
Miles chen
Miles Chen
miles chen
miles chen
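
Immutability also means you cannot assign to a position in a string the way you can with a list. A minimal sketch: the assignment fails with a `TypeError`, and to "change" the string you rebind the name to the new string a method returns.

```python
name = "miles chen"

try:
    name[0] = "M"               # item assignment is not allowed on strings
except TypeError as err:
    print("TypeError:", err)

name = name.title()             # rebind the name to the new string
print(name)                     # Miles Chen
```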


In [72]:
name.count("e")

2

In [73]:
name.endswith("k")

False

In [74]:
name.endswith("n")

True

In [75]:
name.startswith("m")

True

In [76]:
name2 = "   miles chen  "
name2.strip()

'miles chen'

In [77]:
name2.split()

['miles', 'chen']

In [78]:
num_string = "2,3,4,7,8"
print(num_string.split())
print(num_string.split(','))

['2,3,4,7,8']
['2', '3', '4', '7', '8']
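
The pieces returned by `split` are still strings. As a small sketch, a list comprehension can convert them to integers, and `join` reverses the split. The names `parts` and `numbers` are only for illustration.

```python
num_string = "2,3,4,7,8"

parts = num_string.split(",")        # list of strings
numbers = [int(p) for p in parts]    # convert each piece to an int
print(numbers)                       # [2, 3, 4, 7, 8]
print(sum(numbers))                  # 24

# join is the reverse of split
print(",".join(parts) == num_string) # True
```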


In [79]:
print(name)
print(name.isalpha()) # has a space, so it is not strictly alpha
name3 = "abba"
print(name3.isalpha())

miles chen
False
True


In [85]:
# strings can span multiple lines with triple quotes 
long_string = """Lyrics to the song Hallelujah
Well I've heard there was a secret chord
That David played and it pleased the Lord
But you don't really care for music, do you?"""
shout = long_string.upper()
print(shout)
word_list = long_string.split() # separates at any whitespace (spaces and line breaks)
print(word_list)

LYRICS TO THE SONG HALLELUJAH
WELL I'VE HEARD THERE WAS A SECRET CHORD
THAT DAVID PLAYED AND IT PLEASED THE LORD
BUT YOU DON'T REALLY CARE FOR MUSIC, DO YOU?
['Lyrics', 'to', 'the', 'song', 'Hallelujah', 'Well', "I've", 'heard', 'there', 'was', 'a', 'secret', 'chord', 'That', 'David', 'played', 'and', 'it', 'pleased', 'the', 'Lord', 'But', 'you', "don't", 'really', 'care', 'for', 'music,', 'do', 'you?']


In [86]:
long_string.splitlines() # separates at line ends

['Lyrics to the song Hallelujah',
 "Well I've heard there was a secret chord",
 'That David played and it pleased the Lord',
 "But you don't really care for music, do you?"]

In [87]:
long_string.count("e")

15

In [89]:
long_string.find("t")

7
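
Unlike the list method `index`, the string method `find` does not raise an error when nothing matches; it returns -1. Strings also have an `index` method that behaves like the list version. A short sketch (the names `missing` and `first_e` are only for illustration):

```python
name = "miles chen"

missing = name.find("z")    # no match
first_e = name.find("e")    # position of the first "e"
print(missing)              # -1
print(first_e)              # 3

try:
    name.index("z")         # index raises instead of returning -1
except ValueError as err:
    print("ValueError:", err)
```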