## Lesson 2 - Tuples, lists, dicts, and sets

A data structure is a format for organizing and storing data that enables efficient access and modification. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

### Using lists

In [2]:
a_list = ['chewbacca', 42, 'harrison ford', 3.1415]

A list is an ordered collection of other objects. It can be sliced like a string, giving a list with only the elements included in the slice:

In [3]:
a_list[0:2]

['chewbacca', 42]

We can also slice a list with a skip value, for example getting every third element.

In [4]:
a_list[::3]

['chewbacca', 3.1415]
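The general slice form is `start:stop:step`, and any of the three parts can be left out. The sketch below illustrates a few combinations; the list `letters` is a hypothetical example, not part of the lesson's data.

```python
# 'letters' is a hypothetical list used only for this illustration.
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

first_three = letters[:3]      # start defaults to 0
from_third = letters[2:]       # stop defaults to the end of the list
every_other = letters[1:6:2]   # indices 1, 3, and 5
reversed_copy = letters[::-1]  # a negative step walks backwards

print(first_three)    # ['a', 'b', 'c']
print(from_third)     # ['c', 'd', 'e', 'f', 'g']
print(every_other)    # ['b', 'd', 'f']
print(reversed_copy)  # ['g', 'f', 'e', 'd', 'c', 'b', 'a']
```

Each slice returns a new list; `letters` itself is left unchanged.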

Lists are mutable data structures, meaning that we can modify a given list in place. For example, we can call the append method on a list and pass in an object (say, a string).

In [5]:
a_list.append('chicken')

In [6]:
a_list

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken']

This behavior has some interesting consequences. For example, suppose we now set the name a_list2 to point to the same list as a_list does:

In [7]:
a_list2 = a_list

In [8]:
a_list2

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken']

Now let's append an element to a_list:

In [9]:
a_list.append(420)

In [10]:
a_list2

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken', 420]

This is because both names point to the same list instance; they are currently two names for the same object. Thinking in terms of names and objects in Python is key to not being confused by this behavior. For example, what happens when we append a list to itself?

In [12]:
a_list.append(a_list)

In [11]:
a_list

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken', 420, [...]]

This list now includes itself as the last element. Meaning, if we select that last element with indexing:

In [13]:
a_list[-1]

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken', 420, [...]]

We get the same list. Each element in a list is effectively a name pointing to a particular object in memory. In this case, a name like a_list[-1] happens to point to the same data structure as a_list itself. So selecting the last element of each successive result gives the same result:

In [14]:
a_list[-1][-1][-1]

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken', 420, [...]]
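One way to confirm that the last element really is the list itself, rather than a copy, is the `is` operator. This is a small sketch using a hypothetical list `nested` in place of a_list:

```python
# 'nested' is a hypothetical stand-in for a_list.
nested = [1, 2]
nested.append(nested)        # the list now holds a reference to itself

same_object = nested[-1] is nested
print(same_object)           # True
print(len(nested))           # 3
print(nested)                # [1, 2, [...]]
```

Python prints `[...]` for the self-reference so that displaying the list does not recurse forever.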

Another list method is .pop(), which removes the last element of the list and returns it.

In [15]:
a_list.pop()

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken', 420]

In [16]:
a_list

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken', 420]

In [17]:
a_list.pop()

420

In [18]:
a_list

['chewbacca', 42, 'harrison ford', 3.1415, 'chicken']

There is also an .insert() method, which allows us to insert an element at a given index.

In [19]:
a_list.insert(1, 'fuzzy')

In [20]:
a_list

['chewbacca', 'fuzzy', 42, 'harrison ford', 3.1415, 'chicken']
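Both methods work with positions: `pop` optionally takes the index of the element to remove, and `insert` shifts later elements to the right. A minimal sketch, using a hypothetical list `items`:

```python
# 'items' is a hypothetical list used only for this illustration.
items = ['a', 'b', 'c', 'd']

removed = items.pop(1)      # remove and return the element at index 1
print(removed)              # b
print(items)                # ['a', 'c', 'd']

items.insert(0, 'start')    # insert before the current first element
items.insert(100, 'end')    # an index past the end simply appends
print(items)                # ['start', 'a', 'c', 'd', 'end']
```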

In addition to using [], we can create a list using the list constructor. This takes an iterable, such as a string, and makes each element of the iterable an element of the resulting list:

In [21]:
list("a string")

['a', ' ', 's', 't', 'r', 'i', 'n', 'g']

This can be used to make a "shallow" copy of an existing list:

In [23]:
newlist = list(a_list)

In [24]:
newlist.append(2)

In [25]:
newlist

['chewbacca', 'fuzzy', 42, 'harrison ford', 3.1415, 'chicken', 2]

There is no change to our original list.

In [26]:
a_list

['chewbacca', 'fuzzy', 42, 'harrison ford', 3.1415, 'chicken']

We can check whether two names refer to the same list object with the is operator:

In [27]:
a_list is newlist

False

In [28]:
a_list2 is a_list

True
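The copy made by the list constructor is "shallow" because only the outer list is duplicated; the elements themselves are shared between the two lists. The sketch below, with hypothetical names `inner`, `original`, `shallow`, and `deep`, shows what that means when an element is itself mutable, and how `copy.deepcopy` from the standard library differs.

```python
import copy

# All names here are hypothetical and chosen for illustration.
inner = [1, 2]
original = ['x', inner]
shallow = list(original)          # new outer list, same inner objects

shallow.append('y')               # affects only the copy
shallow[1].append(3)              # mutates the inner list shared by both

print(original)                   # ['x', [1, 2, 3]]
print(shallow)                    # ['x', [1, 2, 3], 'y']
print(original[1] is shallow[1])  # True

deep = copy.deepcopy(original)    # copies nested objects as well
deep[1].append(4)
print(original)                   # ['x', [1, 2, 3]]
print(deep)                       # ['x', [1, 2, 3, 4]]
```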

Because a list is a sized container, we can use the len function on it to get the number of elements it contains:

In [29]:
len(a_list)

6

There is a built-in function called range that will give back numbers starting from 0 up to, but not including, the number we specify:

In [30]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9
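range also accepts an explicit start and an optional step. The following sketch wraps each range in `list` so the generated numbers are visible; the variable names are chosen only for this illustration.

```python
# range(stop), range(start, stop), and range(start, stop, step)
up_to_five = list(range(5))
two_to_five = list(range(2, 6))
by_threes = list(range(0, 10, 3))
countdown = list(range(5, 0, -1))   # a negative step counts down

print(up_to_five)    # [0, 1, 2, 3, 4]
print(two_to_five)   # [2, 3, 4, 5]
print(by_threes)     # [0, 3, 6, 9]
print(countdown)     # [5, 4, 3, 2, 1]
```

In every form the stop value itself is excluded.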


We could use this to iterate through the elements of a list:

In [33]:
for i in range(len(a_list)):
    print(a_list[i])

chewbacca
fuzzy
42
harrison ford
3.1415
chicken


We can iterate directly through the elements of a list as we would the characters of a string.

In [34]:
for i in a_list:
    print(i)

chewbacca
fuzzy
42
harrison ford
3.1415
chicken
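When both the position and the element are needed, the built-in `enumerate` gives the convenience of direct iteration without indexing by hand. A short sketch, using a hypothetical list `names`:

```python
# 'names' is a hypothetical list used only for this illustration.
names = ['chewbacca', 'fuzzy', 42]

pairs = []
for index, element in enumerate(names):
    pairs.append((index, element))
    print(index, element)

# Printed output:
# 0 chewbacca
# 1 fuzzy
# 2 42
```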


### Challenge: write a for-loop that creates a list of the first 100 perfect squares

In [37]:
squares = list()
for i in range(1, 101):
    squares.append(i**2)
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521, 1600, 1681, 1764, 1849, 1936, 2025, 2116, 2209, 2304, 2401, 2500, 2601, 2704, 2809, 2916, 3025, 3136, 3249, 3364, 3481, 3600, 3721, 3844, 3969, 4096, 4225, 4356, 4489, 4624, 4761, 4900, 5041, 5184, 5329, 5476, 5625, 5776, 5929, 6084, 6241, 6400, 6561, 6724, 6889, 7056, 7225, 7396, 7569, 7744, 7921, 8100, 8281, 8464, 8649, 8836, 9025, 9216, 9409, 9604, 9801, 10000]


If we end the cell with the bare name squares instead of print(squares), the notebook displays the list with one element per line:

In [38]:
squares = list()
for i in range(1, 101):
    squares.append(i**2)
squares

[1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361,
 400,
 441,
 484,
 529,
 576,
 625,
 676,
 729,
 784,
 841,
 900,
 961,
 1024,
 1089,
 1156,
 1225,
 1296,
 1369,
 1444,
 1521,
 1600,
 1681,
 1764,
 1849,
 1936,
 2025,
 2116,
 2209,
 2304,
 2401,
 2500,
 2601,
 2704,
 2809,
 2916,
 3025,
 3136,
 3249,
 3364,
 3481,
 3600,
 3721,
 3844,
 3969,
 4096,
 4225,
 4356,
 4489,
 4624,
 4761,
 4900,
 5041,
 5184,
 5329,
 5476,
 5625,
 5776,
 5929,
 6084,
 6241,
 6400,
 6561,
 6724,
 6889,
 7056,
 7225,
 7396,
 7569,
 7744,
 7921,
 8100,
 8281,
 8464,
 8649,
 8836,
 9025,
 9216,
 9409,
 9604,
 9801,
 10000]

### Generating lists with list comprehensions

The example above can be expressed more compactly with what is called a list comprehension.

In [39]:
squares = [i**2 for i in range(1, 101)]

In [40]:
squares

[1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361,
 400,
 441,
 484,
 529,
 576,
 625,
 676,
 729,
 784,
 841,
 900,
 961,
 1024,
 1089,
 1156,
 1225,
 1296,
 1369,
 1444,
 1521,
 1600,
 1681,
 1764,
 1849,
 1936,
 2025,
 2116,
 2209,
 2304,
 2401,
 2500,
 2601,
 2704,
 2809,
 2916,
 3025,
 3136,
 3249,
 3364,
 3481,
 3600,
 3721,
 3844,
 3969,
 4096,
 4225,
 4356,
 4489,
 4624,
 4761,
 4900,
 5041,
 5184,
 5329,
 5476,
 5625,
 5776,
 5929,
 6084,
 6241,
 6400,
 6561,
 6724,
 6889,
 7056,
 7225,
 7396,
 7569,
 7744,
 7921,
 8100,
 8281,
 8464,
 8649,
 8836,
 9025,
 9216,
 9409,
 9604,
 9801,
 10000]

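List comprehensions can also include conditions. An if/else conditional expression placed before the for transforms every element, while an if clause placed after the for filters elements out: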

In [41]:
squares = [i**2 if i % 2 == 0 else 0 for i in range(1, 101)]

In [42]:
squares[:10]

[0, 4, 0, 16, 0, 36, 0, 64, 0, 100]

In [43]:
squares = [i**2 for i in range(1, 101) if i%2 == 0]

In [44]:
squares[:10]

[4, 16, 36, 64, 100, 144, 196, 256, 324, 400]
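To make the difference between the two forms concrete, here is a sketch of the equivalent for-loops, shortened to the numbers 1 through 10. The names `mapped` and `filtered` are chosen only for this illustration.

```python
# Conditional expression before the 'for': every input yields one output.
mapped = []
for i in range(1, 11):
    if i % 2 == 0:
        mapped.append(i**2)
    else:
        mapped.append(0)

# 'if' clause after the 'for': inputs that fail the test are skipped.
filtered = []
for i in range(1, 11):
    if i % 2 == 0:
        filtered.append(i**2)

print(mapped)    # [0, 4, 0, 16, 0, 36, 0, 64, 0, 100]
print(filtered)  # [4, 16, 36, 64, 100]
```

The first result always has as many elements as the input, while the second can be shorter.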

## Using Tuples

A tuple is another data structure. Tuples are immutable: once a tuple is created, its elements cannot be replaced, added, or removed.

In [45]:
a_tuple = (42, 'dankstank')

In [46]:
type(a_tuple)

tuple

In [47]:
tuple

tuple

We cannot replace an element of a tuple with some other object. If we try, we get an error.

In [48]:
a_tuple[1] = 'food'

TypeError: 'tuple' object does not support item assignment

Tuples, like lists, are sequences, so they have a length and can be indexed and sliced.

In [49]:
len(a_tuple)

2

In [50]:
a_tuple[1]

'dankstank'
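Slicing a tuple returns a new tuple, and a tuple can also be unpacked into separate names. The sketch below uses a hypothetical tuple `point`; it also shows that a one-element tuple needs a trailing comma.

```python
# 'point' is a hypothetical tuple used only for this illustration.
point = (3, 4, 5)

first_two = point[:2]        # slicing gives another tuple
x, y, z = point              # unpacking assigns each element to a name

single = (42,)               # the trailing comma makes this a tuple
not_a_tuple = (42)           # parentheses alone just group an expression

print(first_two)                   # (3, 4)
print(x, y, z)                     # 3 4 5
print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # int
```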

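### Tuple Comprehensions

Python has no dedicated tuple comprehension syntax. Instead, we pass a generator expression to the tuple constructor: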

In [1]:
t_squares = tuple(i**2 for i in range(1, 100))

In [2]:
t_squares[:10]

(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
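Parentheses around a comprehension-style expression do not build a tuple by themselves; they create a generator, which produces its values once and is then exhausted. A brief sketch, with hypothetical names `gen`, `as_tuple`, and `second_pass`:

```python
# Names are hypothetical and used only for this illustration.
gen = (i**2 for i in range(1, 6))
print(type(gen).__name__)    # generator

as_tuple = tuple(gen)        # consumes the generator
print(as_tuple)              # (1, 4, 9, 16, 25)

second_pass = tuple(gen)     # nothing is left to consume
print(second_pass)           # ()
```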

## Using dictionaries for key-value relationships

A dictionary is a data structure that stores key-value pairs.

We can make a dictionary giving states as keys and their capitals as values, like so:

In [3]:
capitals = {'Georgia': 'Athens',
            'Missouri': 'Jefferson City',
            'Arizona': 'Phoenix',
            'Colorado': 'Denver'}

We can then access the capital of each state in the dictionary by giving the state name:

In [4]:
capitals['Missouri']

'Jefferson City'

This is an example of a mapping: we use elements of the set of states as our keys, mapping onto elements of the set of capital cities as our values. Each key in the dictionary maps directly onto a value. A dictionary excels at storing a mapping like this. It's much preferable to, say, a list of tuples:

In [7]:
capitals_list = [('Georgia', 'Athens'), ('Missouri', 'Jefferson City')] 

With a list of tuples, we would have to know the position of the pair to grab the same result:

In [8]:
capitals_list[1][1]

'Jefferson City'
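If the position is not known, a list of tuples has to be searched pair by pair, whereas a dictionary goes straight to the key. The sketch below illustrates this with a hypothetical helper `find_capital` and a hypothetical list `pairs_list`; neither is part of the lesson's code.

```python
# 'pairs_list' and 'find_capital' are hypothetical, for illustration only.
pairs_list = [('Missouri', 'Jefferson City'), ('Colorado', 'Denver')]

def find_capital(pairs, state):
    """Scan (state, capital) pairs and return the matching capital, or None."""
    for name, capital in pairs:
        if name == state:
            return capital
    return None

from_list = find_capital(pairs_list, 'Colorado')

# The dict constructor accepts an iterable of (key, value) pairs.
pairs_dict = dict(pairs_list)
from_dict = pairs_dict['Colorado']

print(from_list)   # Denver
print(from_dict)   # Denver
```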

We can add new key-value pairs directly with assignment:

In [9]:
capitals['Hawaii'] = 'Honolulu' 

In [10]:
capitals

{'Georgia': 'Athens',
 'Missouri': 'Jefferson City',
 'Arizona': 'Phoenix',
 'Colorado': 'Denver',
 'Hawaii': 'Honolulu'}

Note that not every object can be used as a key. Keys must be hashable. Built-in mutable containers such as lists, dicts, and sets are not hashable, and a tuple is hashable only if all of its elements are. For example, we can't use a list as a key; we get an error:

In [11]:
capitals[[1, 2]] = "Nowhere"

TypeError: unhashable type: 'list'

Immutable objects such as booleans, numbers, and strings do work as keys:

In [12]:
capitals[True] = 'Nowhere'

In [13]:
capitals

{'Georgia': 'Athens',
 'Missouri': 'Jefferson City',
 'Arizona': 'Phoenix',
 'Colorado': 'Denver',
 'Hawaii': 'Honolulu',
 True: 'Nowhere'}
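The sketch below probes a few candidate keys, using hypothetical dictionaries `lookup` and `flags`. It also shows a subtlety of using True as a key: because `True == 1` and both hash to the same value, they count as the same key.

```python
# 'lookup' and 'flags' are hypothetical dictionaries for illustration.
lookup = {}
lookup[(0, 1)] = 'tuple of ints'      # a tuple of hashable items is fine

list_key_failed = False
try:
    lookup[[0, 1]] = 'list'           # lists are not hashable
except TypeError:
    list_key_failed = True

nested_key_failed = False
try:
    lookup[(0, [1])] = 'tuple holding a list'   # the inner list is the problem
except TypeError:
    nested_key_failed = True

print(list_key_failed, nested_key_failed)   # True True

# True and 1 compare equal and share a hash, so this updates one entry.
flags = {True: 'yes'}
flags[1] = 'one'
print(flags)                                # {True: 'one'}
```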

### Creating a dictionary with a dict comprehension

In [14]:
squares_dict = {i: i**2 for i in range(1, 10)}

In [15]:
squares_dict

{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
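A dict comprehension can also build a mapping from two parallel sequences, or swap the keys and values of an existing dict. This is a minimal sketch with hypothetical names (`states`, `cities`, `by_state`, `by_city`); swapping only works cleanly when the values are unique and hashable.

```python
# All names are hypothetical and used only for this illustration.
states = ['Missouri', 'Colorado']
cities = ['Jefferson City', 'Denver']

# zip pairs up elements from the two lists position by position.
by_state = {state: city for state, city in zip(states, cities)}

# Swap keys and values (assumes the values are unique and hashable).
by_city = {city: state for state, city in by_state.items()}

print(by_state)   # {'Missouri': 'Jefferson City', 'Colorado': 'Denver'}
print(by_city)    # {'Jefferson City': 'Missouri', 'Denver': 'Colorado'}
```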

### Final notes on dicts

We can directly access the keys in a dict with the keys method:

In [16]:
list(capitals.keys())

['Georgia', 'Missouri', 'Arizona', 'Colorado', 'Hawaii', True]

Likewise, the values:

In [17]:
list(capitals.values())

['Athens', 'Jefferson City', 'Phoenix', 'Denver', 'Honolulu', 'Nowhere']

If we want both keys and values, calling items will get us (key, value) tuples:

In [21]:
list(capitals.items())

[('Georgia', 'Athens'),
 ('Missouri', 'Jefferson City'),
 ('Arizona', 'Phoenix'),
 ('Colorado', 'Denver'),
 ('Hawaii', 'Honolulu'),
 (True, 'Nowhere')]
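In practice these views are most often used in a for-loop rather than converted to lists. The sketch below, built on a small hypothetical dict `capitals_demo`, also shows the `in` operator and the `get` method for keys that may be missing.

```python
# 'capitals_demo' is a hypothetical dict used only for this illustration.
capitals_demo = {'Missouri': 'Jefferson City', 'Colorado': 'Denver'}

lines = []
for state, capital in capitals_demo.items():   # unpack each (key, value) pair
    lines.append(f"{capital} is the capital of {state}")

has_texas = 'Texas' in capitals_demo           # membership tests check keys
missing = capitals_demo.get('Texas')           # None instead of a KeyError
fallback = capitals_demo.get('Texas', 'unknown')

print(lines[0])    # Jefferson City is the capital of Missouri
print(has_texas)   # False
print(missing)     # None
print(fallback)    # unknown
```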

Note that dicts preserve insertion order (guaranteed since Python 3.7), but they do not support access by position; values are always looked up by key.

### Using sets for membership operations

In [22]:
a_set = {'Robert', 3.14, 42}

In [23]:
a_set

{3.14, 42, 'Robert'}

Since a set has no order, its elements cannot be accessed with indexing:

In [25]:
a_set[1]           # gives an error

TypeError: 'set' object does not support indexing

But we can check for membership:

In [26]:
42 in a_set

True

We can add elements to a set with the add method:

In [28]:
a_set.add(33)

In [29]:
a_set

{3.14, 33, 42, 'Robert'}

A key feature of a set is that it has no duplicate elements, so adding 33 again results in no change:

In [30]:
a_set.add(33)

In [31]:
a_set

{3.14, 33, 42, 'Robert'}
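This property makes a set a convenient way to remove duplicates from a list. A short sketch with a hypothetical list `votes`; since a set has no order, `sorted` is used to get a predictable result, and `dict.fromkeys` is shown as one way to keep first-seen order instead.

```python
# 'votes' is a hypothetical list used only for this illustration.
votes = ['a', 'b', 'a', 'c', 'b', 'a']

unique = set(votes)                        # duplicates collapse to one entry
unique_sorted = sorted(unique)             # sets are unordered, so sort for display
in_first_seen_order = list(dict.fromkeys(votes))  # dict keys keep insertion order

print(len(unique))           # 3
print(unique_sorted)         # ['a', 'b', 'c']
print(in_first_seen_order)   # ['a', 'b', 'c']
```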

Sets are well suited to checking which members of two groupings belong to both (intersection), to either (union), to the first but not the second (difference), or to exactly one of the two (symmetric difference). For example, we can make a set of the numbers from 0 to 9 and a second set of the numbers from 7 to 19:

In [32]:
a_newset = set(range(10))

In [33]:
a_newset

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

In [34]:
another_set = set(range(7,20)) 

In [35]:
another_set

{7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}

Then taking the intersection of these gets us a new set featuring the members that are present in both:

In [36]:
another_set & a_newset

{7, 8, 9}

Likewise, taking the union gets us a set with members present in either set:

In [37]:
another_set | a_newset

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}

We could take the difference to get only members present in the first set alone:

In [38]:
another_set - a_newset

{10, 11, 12, 13, 14, 15, 16, 17, 18, 19}

Or the symmetric difference to get only members present in exactly one of the two sets:

In [39]:
another_set.symmetric_difference(a_newset)

{0, 1, 2, 3, 4, 5, 6, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}

Sets have methods for each of these operations as well, such as union:

In [40]:
another_set.union(a_newset)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}
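The operators and the methods give the same results on sets; one practical difference is that the methods accept any iterable, while the operators require both operands to be sets. A sketch using hypothetical names `low` and `high` for the two sets:

```python
# 'low' and 'high' are hypothetical names for the two sets.
low = set(range(10))
high = set(range(7, 20))

# Each operator has an equivalent method.
print((high & low) == high.intersection(low))           # True
print((high | low) == high.union(low))                  # True
print((high - low) == high.difference(low))             # True
print((high ^ low) == high.symmetric_difference(low))   # True

# Methods accept any iterable, such as a list.
extended = low.union([10, 11])
print(sorted(extended))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# Operators do not: set | list raises a TypeError.
operator_failed = False
try:
    low | [10, 11]
except TypeError:
    operator_failed = True
print(operator_failed)    # True
```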

Sets can be constructed with a comprehension:

In [41]:
{i**2 for i in range(10)}

{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}