# Collection Data Types

We're moving on to the next four data types. These types aren't necessarily more complicated; they just have a bit more to them, since they can hold multiple elements, unlike the integers, floats, booleans, and strings that we went over in the first lesson. This ability to hold multiple elements is why I've grouped them into one lesson and informally labelled them as `Collection` data types. Python doesn't actually make this distinction, but some other programming languages do.

So the four data types we'll cover now are:

* Lists
* Tuples
* Sets
* Dictionaries

But first, let's ease into this by working with our good old friend the `string` data type.

In [None]:
introduction = "Well, I thought that, maybe, you know, like, possibly?"

Now this introduction has a little lyrical flourish, if you will: it has a large number of pauses, as shown by the commas. What if we wanted to get rid of them?

In the last lesson we went over the `strip()` function to remove characters from the beginning or end of a string. Unfortunately, the commas are in the middle of the introduction, so we can't use it. There is, however, another function that we can use to split apart a `string`. It is aptly named the `split()` function.

In [None]:
introduction.split(',')

**Whoa!** That isn't a string. 

Unlike the `strip()` function, the `split()` function returns a list. The list holds the individual pieces of the string, split apart wherever the character we gave as input appears.

We can tell that it's a list because it has multiple elements and it starts and ends with the brackets. The bracket symbol (`[]`) is what denotes a list.
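To actually finish the job of getting rid of the commas, one option is to glue the pieces back together again. This is only a sketch: it assumes the string function `join()`, which we haven't covered in these lessons, and it is just one of several ways to do it.

```python
introduction = "Well, I thought that, maybe, you know, like, possibly?"

# split() breaks the string apart at every comma and returns a list
pieces = introduction.split(',')
print(pieces)

# join() glues the list elements back together; the empty string ''
# means that nothing is placed between the pieces
no_commas = ''.join(pieces)
print(no_commas)  # Well I thought that maybe you know like possibly?
```

Notice that the spaces that followed each comma are still there, which is why the words don't run together.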

# Lists

A list is an ordered sequence of elements, with that order being specified by the order that the elements are in when the list is created or as elements are added to the list. We create a list by using the `[]` syntax.

Let's go back to our imaginary pet store and list the types of pets that we carry.

In [None]:
pets = ['dogs', 'cats', 'fish']
pets

Notice that the list printed the elements in the same order that they were in when we created the list.

If we want to add an element to the list we can simply `append()` the element to the list.

In [None]:
pets.append('hedgehog')

pets

Notice that the `append()` function changes the value of the variable itself.

If we had multiple additional values that we wanted to add, we could append them individually. Or we could just `extend()` our list `pets` with the list of additional pets that our pet shop carries.

In [None]:
pets.extend(['parrot', 'iguana'])

pets
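The difference between `append()` and `extend()` is easiest to see side by side. Here is a small sketch using two made-up lists (`appended` and `extended` are names I'm inventing for the example, so our `pets` list stays untouched).

```python
# Two hypothetical lists that start out identical
appended = ['dogs', 'cats']
extended = ['dogs', 'cats']

appended.append(['parrot', 'iguana'])  # adds the whole list as ONE element
extended.extend(['parrot', 'iguana'])  # adds each element individually

print(appended)  # ['dogs', 'cats', ['parrot', 'iguana']]
print(extended)  # ['dogs', 'cats', 'parrot', 'iguana']
```

So `append()` always adds exactly one element, even when that element is itself a list.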

We can also add two lists together.

In [None]:
pets + ['chameleon', 'hamster']

But notice that the addition symbol doesn't change the value of `pets`.

In [None]:
pets

If we want to add to the list variable, we just set it equal to itself plus the new list.

In [None]:
pets = pets + ['chameleon', 'hamster']
pets

And just like with strings, we can use both the addition and the multiplication operators. I'll use a small number here to not flood the screen, but we can multiply the list too.

In [None]:
print( pets * 2 )

But just like before, subtraction and division don't work, since it's not clear what should happen.

In [None]:
pets - ['hamster']

## Indexing and Slicing

Since a list is made up of multiple elements, it makes sense that we will be able to get a single element by its index or a slice of elements. Remember, to access an element by its index we do `variable[index]`.

In [None]:
pets[1]

And then slicing is just `variable[start_index : stop_index : step]`

In [None]:
pets[1 : 3]

Both of these operations work exactly like they did when we used strings.
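As a quick refresher, here is a short sketch of the different slicing forms on a made-up list (`letters` is just a name for this example).

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']  # hypothetical example list

print(letters[1:3])   # ['b', 'c']  (the stop index is not included)
print(letters[::2])   # ['a', 'c', 'e']  (every second element)
print(letters[-2:])   # ['e', 'f']  (the last two elements)
print(letters[::-1])  # ['f', 'e', 'd', 'c', 'b', 'a']  (a reversed copy)
```

None of these slices changes `letters`; each one gives back a new list.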

Now let's get to a big difference: `list` variables are **mutable**. That means that we can change one element of the list, and this is the first type we've encountered that will let us do that. Let's say that we stopped selling cats because everyone just looks at them on the internet, and now we carry sea monkeys. We could delete `cats` and append `sea monkeys`, or we could just replace the value at the position that `cats` is in with `sea monkeys`.

In [None]:
pets[1] = 'sea monkeys'
pets

If we want to find out the index of a value, we can use the built-in function `index()`. We use indices to access specific parts of a list a lot, so this is helpful.

In [None]:
pets

In [None]:
pets.index('fish')

Here are two more ways we can change the contents of a list:

* `insert()` - inserts a new element at the specified index
* `pop()` - removes and returns the element at the specified index

In [None]:
pets.insert(5, 'burger')
pets

Wait, that's not a type of pet!

In [None]:
pets.pop(5)

In [None]:
pets

There is also the `remove()` function. With this we actually get to give the value that we want to be removed.

In [None]:
pets.append('hamster')
pets

In [None]:
pets.remove('hamster')
pets

However, you should be careful, because if you try to remove a value that doesn't exist then Python will raise an error (a `ValueError`).

In [None]:
pets.remove('hamster')

Also, the `remove()` function will only remove the **first** instance of the value in a list. If there is more than one entry then you would need to use `remove()` again.

In [None]:
pets.insert(2, 'hamster')
pets.append('hamster')
pets

In [None]:
pets.remove('hamster')
pets

We can also use `del` to remove an item from a `list`. If we don't specify an index, though, it will delete the **entire** variable! So be careful when you use `del`.

In [None]:
del pets[-1]
pets
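To see what "deleting the entire variable" looks like, here is a small sketch on a throwaway list (`inventory` is a made-up name). The `try` / `except` part is something we haven't covered; I only use it here so that the example can run to the end instead of stopping at the error.

```python
inventory = ['dogs', 'fish', 'hedgehog', 'parrot']  # hypothetical list

del inventory[0]        # removes only the first element
print(inventory)        # ['fish', 'hedgehog', 'parrot']

del inventory[0:2]      # del also accepts a slice
print(inventory)        # ['parrot']
remaining = list(inventory)  # keep a copy of the contents for later

del inventory           # no index: the name itself is deleted

name_is_gone = False
try:
    print(inventory)    # the name no longer exists, so this raises NameError
except NameError:
    name_is_gone = True
print(name_is_gone)     # True
```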

So there we go, we can add elements to a list with:

* `append()`
* `extend()`
* `insert()`
* `+`

We can remove elements from a list with:

* `pop()`
* `remove()`
* `del`

Lists have other built-in methods too, besides the maintenance functions of adding and deleting elements. For example, we can reverse a list.

In [None]:
print('Initial pets', pets)
pets.reverse()
print('Reversed pets', pets)

And we can also sort the list.

In [None]:
print('Initial pets', pets)
pets.sort()
print('Sorted pets', pets)

Note that both of those operations worked directly on the variable itself and changed it. If we want to perform one of these operations without changing the variable's value, we can use the `sorted()` and `reversed()` functions on the variable instead of the built-in `sort()` and `reverse()`. (Note that `sorted()` returns a new list, while `reversed()` returns an iterator rather than a list, so in order to print its contents I have to cast it as a list. It isn't that important why this distinction exists right now.)

In [None]:
print( "Reversing the list", list( reversed(pets) ) )

print( "The original list", pets )
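The same comparison works for sorting. Here is a short sketch with a made-up list (`animals` is just an example name) that shows `sorted()` leaving the original alone while `sort()` changes it.

```python
animals = ['parrot', 'dogs', 'fish']  # hypothetical list

alphabetical = sorted(animals)        # returns a NEW sorted list
print(alphabetical)                   # ['dogs', 'fish', 'parrot']
print(animals)                        # ['parrot', 'dogs', 'fish'] (unchanged)

result = animals.sort()               # sorts in place and returns None
print(result)                         # None
print(animals)                        # ['dogs', 'fish', 'parrot']

print(sorted(animals, reverse=True))  # ['parrot', 'fish', 'dogs']
```

A common slip is writing `animals = animals.sort()`, which replaces the list with `None`.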

Another very useful function that works on the `list` type like `sorted()` and `reversed()` is `len()`.

`len()` tells us the **len**gth of the list (or, said another way, the number of items in it).

In [None]:
len(pets)

## Nesting

So far the elements that I've shown in a `list` have all been strings. Does that mean that a `list` can only hold the basic data types, like `string`, `float`, or `integer`?

Nope! Lists can contain other `list` variables, just like this:
<img width='400' style="float:center" src='../images/nesting_dolls.jpg'></img>

Now just to prove it, I'll change our `pets` list to have the different breeds of dogs as an element.

In [None]:
dog_breeds = ['bulldog', 'terrier', 'greyhound']
pets.append(dog_breeds)
pets

But wait, what if we want to access just one of the dog breeds and print it to the screen? How do we do that?

Well I appended the `dog_breeds` variable to `pets`, so why don't we just try to access the last element of `pets` and see what happens.

In [None]:
pets[-1]

A-ha! So the last element of `pets` is the entire `dog_breeds` `list`. 

I bet we can access a single element by its index in that list too. Let's give a `0` index so that we print `bulldog`. Since these `list` variables are nested though, we need to give nested directions to the value. We do that just by putting one set of `[]` next to another `[]`. When Python navigates through your list variable it will always read the indices from left to right, with the leftmost `[]` being the topmost `list`.

In [None]:
pets[-1][0]

Visually I like to think of nested lists like this:

<img src='../images/nesting.png'></img>

Also, we can change the values inside our nested lists just like at our topmost `list` and use all of our normal `list` functions that we used before.

In [None]:
pets[-1][0] = 'golden retriever'
pets

In [None]:
pets[-1].append('pug')
pets

# Tuples

Tuples are very similar to `list` variables at the outset.

* Both Tuples and Lists contain a sequence of individual elements
* Both Tuples and Lists are stored in the same order as they are created
* Both Tuples and Lists can store mixed data types
* We access individual elements with the syntax `variable[index]`

So what are the differences? Well the first one is that the syntax for a `list` is `[]` while the syntax for a `tuple` is `()`. Let's create a tuple that stores the attributes for our golden retriever who is named `penny`.

In [None]:
penny = (60, 75, 'yellow') #Length (in), Weight (lbs), Color 
print( type(penny) )
penny

Now we have a `tuple` that has all of the attributes for our dog `penny`. Since it wasn't necessarily clear what each of the individual elements stood for, I made a comment next to the variable stating what each field was. We can make a comment anytime we want in Python by using the `#` symbol. Whatever we type after the `#` on the same line will be ignored by Python.

The second difference is that tuples are **immutable**. This means that we can **not** change one of the elements or delete elements from it. Tuples also do not have any built-in functions to add elements to themselves.

In [None]:
penny[-1] = 'amber'

In [None]:
del penny[1]

We can still perform the additive operators with tuples though.

In [None]:
penny + ('amber', 'stinky') #Eye color, breath smell

We just need to be careful if there is only one element in the tuple that we want to add. To specify a tuple with only one element we have to use the syntax `(value, )`, because without the comma Python will treat the parentheses as ordinary grouping, like in a mathematical expression, and we won't get a tuple at all.

In [None]:
penny + ('amber', )
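We can check the role of that comma with `type()`. This is a small sketch; `dog` is a made-up copy of the `penny` tuple so that the original isn't touched.

```python
not_a_tuple = ('amber')   # parentheses only group here: this is a string
one_tuple = ('amber',)    # the trailing comma is what makes a tuple

print(type(not_a_tuple))  # <class 'str'>
print(type(one_tuple))    # <class 'tuple'>

dog = (60, 75, 'yellow')  # hypothetical copy of the penny tuple
longer_dog = dog + one_tuple
print(longer_dog)         # (60, 75, 'yellow', 'amber')
print(dog)                # (60, 75, 'yellow')  (the original is unchanged)
```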

Otherwise, we can put any data type inside a `tuple`, nest `lists` and `tuples` inside a `tuple`, and index nested `lists` and `tuples` exactly like we do with a `list`.

#### So wait, why are there both lists and tuples? This doesn't make any sense!! I demand for things to make sense!!

I know, right now it's pretty hard to see why you would use one data type over the other. To be honest, it isn't even always that clear after programming for a little bit and typically most beginners just prefer to use `lists` for their ability to be manipulated.

But this is why we have both types of variables in theory.

A `list` will typically hold homogeneous elements. This means that, in theory, every element within a `list` should be of the same category. That's because when we have a `list` we may want to perform an action on every element within it.

A `tuple` will typically hold heterogeneous elements. This means that, in theory, every element will not be the same. 

To make this concrete, with our `pets` list every single element was a pet species. They were different values, but they were all within the same category.

With our `penny` tuple, each element was different. We stored her length, weight, and color. If we changed the color value, the new variable likely wouldn't be `Penny` the dog at all unless some mischievous kids broke into the store and dyed her fur! 

It's this line of thinking that causes some people to refer to a `tuple` as a `record`. The `penny` variable is a record for a specific animal and if we were to change one of its values then it would no longer be the record for `penny`.

In practice though, Python doesn't enforce your usage of these data types and you can choose to use either one. It's important to be aware of the distinction though, because many functions and programs written by others that we will use later will return a `tuple` back. Just remember to not try and change the contents of a `tuple`!
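As one small example of a function that hands back a `tuple`, Python's built-in `divmod()` returns the quotient and the remainder of a division together. This is just a sketch to show the idea; the numbers are made up.

```python
# divmod() is a built-in function that returns a tuple: (quotient, remainder)
result = divmod(17, 5)

print(result)        # (3, 2)
print(type(result))  # <class 'tuple'>
print(result[0])     # 3

# If we really need to change the contents, we can cast the tuple to a list
editable = list(result)
editable[1] = 0
print(editable)      # [3, 0]
```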

# Sets

The concept of a `set` comes quite directly from math. It appears to be similar to a `list` and a `tuple` but it is quite different (thank goodness after all those similarities between `lists` and `tuples` right?).

A `set` is an unordered collection of unique elements. We use the `{}` notation to create a `set` variable. Let's make a short `set` of the unique pet types that we carry.

In [None]:
unique_pets = {'bulldog', 'hamster', 'parrot'}

unique_pets

If we try to make a set in which the same element appears more than once, the duplicate occurrences will be discarded.

In [None]:
not_so_unique_pets = {'bulldog', 'hamster', 'parrot', 'hamster'}
not_so_unique_pets

We can cast a `list` or `tuple` into a `set` and when we do it will discard all duplicate elements also.

In [None]:
not_so_unique_pets_list = ['bulldog', 'hamster', 'parrot', 'hamster']
print(not_so_unique_pets_list)
set(not_so_unique_pets_list)

Sets are mutable like a `list`, but they don't use the same built-in functions. Instead of `append()` to add an item we just have the function `add()`.

In [None]:
unique_pets.add('gerbil')
unique_pets

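To remove an element from a `set` we can use the `discard()` function. Sets also have a `remove()` function like a list, but `remove()` gives an error if the element isn't in the set, while `discard()` quietly does nothing.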

In [None]:
unique_pets.discard('parrot')
unique_pets
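Sets do also have a `remove()` function, and the difference from `discard()` only shows up when the element is missing. Here is a sketch with a made-up set (`small_pets`); the `try` / `except` is only there so the example keeps running after the error.

```python
small_pets = {'hamster', 'gerbil'}  # hypothetical set

small_pets.discard('parrot')  # 'parrot' isn't in the set: nothing happens
print(small_pets == {'hamster', 'gerbil'})  # True

got_error = False
try:
    small_pets.remove('parrot')  # remove() raises a KeyError instead
except KeyError:
    got_error = True
print(got_error)  # True
```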

However, we cannot add two sets together or multiply them.

In [None]:
{'parrot', 'bulldog'} + {'gerbil'}

In [None]:
{'parrot', 'bulldog'} * 2

Instead we use specific mathematical operations with `set` variables. These are the `intersection`, `union`, or `difference` functions. This can be extremely useful in some cases.

For example, let's say that we have two pet stores and we don't have all of the same pets at each store.

In [None]:
store_one = {'bulldog', 'parrot', 'hamster', 'fish'}
store_two = {'fish', 'parrot', 'terrier', 'cat'}

If we wanted to find out what pets we carried at both stores we could just see what the `intersection` is between the store variables.

In [None]:
store_one.intersection(store_two)

If we wanted to know all the unique pets that we carried across both stores we would use the `union` function.

In [None]:
store_one.union(store_two)

And if we wanted to know which pets we carried at store one and not at store two we would use the `difference`.

In [None]:
store_one.difference(store_two)

The same is done if we wanted the unique pets from store two that weren't at store one. We just need to ask for the difference starting from `store_two`.

In [None]:
store_two.difference(store_one)
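Each of these set functions also has a shorthand operator: `&` for intersection, `|` for union, and `-` for difference. Here is a quick sketch with made-up set names (`shop_a` and `shop_b`); I use `sorted()` when printing so the output order is predictable.

```python
shop_a = {'bulldog', 'parrot', 'hamster', 'fish'}  # hypothetical names
shop_b = {'fish', 'parrot', 'terrier', 'cat'}

print(sorted(shop_a & shop_b))  # ['fish', 'parrot']
print(sorted(shop_a | shop_b))  # ['bulldog', 'cat', 'fish', 'hamster', 'parrot', 'terrier']
print(sorted(shop_a - shop_b))  # ['bulldog', 'hamster']

# The operators give the same results as the named functions
print((shop_a & shop_b) == shop_a.intersection(shop_b))  # True
print((shop_a | shop_b) == shop_a.union(shop_b))         # True
print((shop_a - shop_b) == shop_a.difference(shop_b))    # True
```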

# Transformations

Since a `set`, `list`, and `tuple` are all a collection of elements, we can easily convert a variable from one type to the other as needed. We just need to cast the variable.

In [None]:
list_unique_pets = list( unique_pets )
print(list_unique_pets)
type(list_unique_pets)

In [None]:
tuple_unique_pets = tuple( list(unique_pets) )
print(tuple_unique_pets)
type(tuple_unique_pets)

In [None]:
reset_unique_pets = set(tuple_unique_pets)
print(reset_unique_pets)
type(reset_unique_pets)

# Dictionaries

A dictionary (which I'll also refer to as `dict`) is a bit of a twist in terms of the variable types that we've seen so far. A dictionary is a collection of elements, but it isn't as flat as a `list`, `tuple` or `set`.

Each element in a dictionary is an `item`, and every `item` has both a `key` and a `value`. We use the `key` to "look up" the `value`. This concept is just like if we wanted to look up the meaning of a word in a real dictionary. Also, just like in a real dictionary, it means that all of the `keys` **must** be unique. If we had a `key` multiple times, then we wouldn't know where to go look up its `value`.

To create a dictionary we use the `{}` syntax like a `set` (which makes sense, since all of the keys must be unique like in a `set`). However, since a dictionary is made of `key-value` pairs, the `key` and `value` in each pair are separated by a `:`. This looks like this:

`dict = {key : value, key : value}`

Notice that each item is separated by a `,` (just like in the other collection variables) and the `:` is the new additional syntax to separate the `key` and `value`.

Let's make a dictionary that lists the weights of the pets we keep.

In [None]:
weights = {'penny': 75, 'jenny': 0.5, 'benny': 10}
weights

Now something important is that a dictionary, just like a `set`, has no indices: we can't ask for its first or third element. Since Python 3.7 a dictionary does remember the order in which its `keys` were added, but older versions of Python made no promise about the order. That means that if you work on this lesson next to a friend who uses an old version of Python, your variable may not print out in the same order as theirs.

So how can we access the value of an element if it doesn't have an index then?

Well, we actually use the same syntax as before: `dict[ ]`, but instead of putting in the index of an element we give the `key` and get its corresponding value.

In [None]:
weights['jenny']

If we want to change the value of a `key`, we just set its value again. Let's say that Jenny the fish ate all the food in the fish tank and has now doubled in size.

In [None]:
weights['jenny'] = 1.0

weights

If we want to add a `key-value` pair to the dictionary we just access a new `key` and assign a value.

In [None]:
weights['wendy'] = 5
weights

Just like a `set`, we cannot add two dictionaries together or multiply a dictionary.

In [None]:
weights + {'jeff': 10, 'bobby': 8}

However, if we want to add multiple `key-value` pairs we can use the `update()` function on a dictionary variable.

In [None]:
weights.update({'jeff': 10, 'bobby': 8})
weights

If we want to remove a `key-value` pair from a `dict` variable we can use `del` and give the `key` (make sure that the `key` exists or else Python will raise a `KeyError`!).

In [None]:
del weights['jenny']
weights

Dictionaries also have the `pop()` function built-in that will work on a `key`. Using this will delete the `key-value` pair and return the `value`.

In [None]:
weights.pop('benny')

In [None]:
weights
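If we aren't sure that a `key` exists, there are a few ways to avoid the error. This is a sketch on a made-up dictionary (`tank`); it uses the `in` keyword, an `if` statement, and the `get()` function, none of which we have gone over in detail yet.

```python
tank = {'jenny': 1.0, 'benny': 10}  # hypothetical dictionary

# `in` checks whether a key exists before we try to delete it
if 'wendy' in tank:
    del tank['wendy']

# pop() accepts a default that is returned when the key is missing
removed = tank.pop('wendy', None)
print(removed)  # None

# get() looks up a value without raising an error for a missing key
print(tank.get('jenny'))     # 1.0
print(tank.get('wendy', 0))  # 0
```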

Since a `dict` contains two distinct types of elements (`keys` and `values`), it has built-in functions to return one or the other. (Note that in Python 3 these functions don't return a real `list`; in examples written for Python 2 you'll see that they used to. Now if we want to manipulate the result as a `list` we must cast it as one.)

In [None]:
weights.keys()

In [None]:
weights.values()

We can also just get all of the `key-value` pairs listed together using the `items()` function.

In [None]:
weights.items()

As you can see here it returns a list (well, a dict_items data type truly) of our `key-value` pairs, with each item having the `tuple` data type.
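Here is a short sketch of turning those results into something we can index. The dictionary `sizes` is made up for the example, and I use `sorted()` (which always returns a real list) so the order of the output doesn't depend on the Python version.

```python
sizes = {'penny': 75, 'wendy': 5}  # hypothetical dictionary

names = sorted(sizes.keys())   # a real, sorted list of the keys
pairs = sorted(sizes.items())  # a real list of (key, value) tuples

print(names)        # ['penny', 'wendy']
print(pairs)        # [('penny', 75), ('wendy', 5)]
print(pairs[0][1])  # 75

print(type(list(sizes.values())))  # <class 'list'>
```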

As stand-alone functions it's hard to see their utility, but when we start making our own programs they will be extremely useful.

## Nesting

Just like a `list` or `tuple`, we can nest the collection data types inside a `dict` variable. In many cases this can be extremely useful.

Let's go back to our example where we wanted to store the attributes of our pets. Before we needed to give a comment stating what "fact" each element in the `tuple` was. In a dictionary, we can just name them explicitly.

In [None]:
attributes = {
    'penny' : {'weight' : 75,
               'length' : 60,
               'color' : 'yellow'
              },
    'bobby' : {'weight' : 10,
               'length' : 30,
               'color' : 'brown',
              }
}

So then if we were curious about `penny`, we give her as the `key` to `attributes`.

In [None]:
attributes['penny']

If we just wanted to know her `weight` we would do:

In [None]:
attributes['penny']['weight']

And we can do the same for `bobby`.

In [None]:
attributes['bobby']['length']

# A quick discussion on variables being variables

This last section doesn't actually cover a new data type. It just covers the concept of copying a variable.

Let's go back to our multiple pet stores example. We're the shop owner and we start off with one shop.

In [None]:
store_one = ['beagle', 'terrier', 'bulldog', 'fish']
store_one

We go and open up our second store. We start that store by taking one of each type of pet that we had at the first store. The easiest way to do that would be to say that the inventory of `store_two` is equal to `store_one`.

In [None]:
store_two = store_one

Now that we've been operating `store_two` for a bit, we sold our only `beagle`, and with that money we bought a `parrot` and a `gerbil` to sell.

In [None]:
store_two.remove('beagle')
store_two.extend(['parrot', 'gerbil'])
store_two

Perfect! Now let's go back and check on the stock at `store_one`.

In [None]:
store_one

*????* That's not the inventory for `store_one`! That's the inventory for `store_two`!

Here is what happened in a rough explanation.

When we create a variable in a language it has some `value` associated with it. That `value` goes into the memory of our computer, and the language remembers that the `variable` is linked to that particular `value` in memory.

So when I set `second_variable` equal to `variable` the computer thinks:

`A-ha! I am a smart computer, this second_variable just needs to be linked to the same value at the same place in memory. That is very easy, I'll just remember that second_variable goes to value`

So now when we change the value through `second_variable` we also change it for `variable`, since they are sharing the same `value` in the computer's memory. This might seem silly, but way back when, memory was an extremely precious resource in computers, so languages would try to minimize using it and relied on you, the programmer, to be both careful and explicit.
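We can ask Python directly whether two variable names are linked to the same `value` in memory by using the `is` keyword. This is a small sketch with made-up names (`first`, `second`, `third`); `is` is something we haven't covered before.

```python
first = ['beagle', 'terrier']
second = first            # second is another name for the SAME list

print(second is first)    # True: both names point at one object
second.append('parrot')
print(first)              # ['beagle', 'terrier', 'parrot']

third = list(first)       # casting with list() builds a new, separate list
print(third == first)     # True: same contents
print(third is first)     # False: different objects in memory
```

So `==` compares the contents, while `is` tells us whether it is literally the same object.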

So if we want to copy a variable's contents from one variable to another, placing an identical but separate `value` in memory, we have to use the `copy()` function. Every distribution of Python comes with `copy()`, but we have to import it from the `copy` module of the standard library to use it.

In [None]:
from copy import copy

store_one = ['beagle', 'terrier', 'bulldog', 'fish']
store_two = copy(store_one)
store_two.remove('beagle')
store_two.extend(['parrot', 'gerbil'])

print('Store One', store_one)
print('Store Two', store_two)

Right now it's not important to understand libraries or anything like that. I just wanted to show you that **it is important** to understand how copying variables works. Accidentally setting one variable name equal to another and not remembering it is an easy way for errors to creep into your programs. Make sure to be careful!
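One more thing to be careful about: `copy()` only copies the outermost collection, so nested lists inside it are still shared. The same `copy` module has a `deepcopy()` function that copies everything. Here is a sketch with a made-up nested inventory (`store`, `shallow`, and `deep` are names invented for the example).

```python
from copy import copy, deepcopy

store = ['fish', ['bulldog', 'terrier']]  # hypothetical nested inventory

shallow = copy(store)    # new outer list, but the inner list is shared
deep = deepcopy(store)   # new outer list AND new inner list

store[1].append('pug')   # change the nested list in the original

print(shallow[1])  # ['bulldog', 'terrier', 'pug']  (shared, so it changed)
print(deep[1])     # ['bulldog', 'terrier']         (independent)
```

For a flat list like our store inventories, `copy()` is enough.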

# Exercises

You are now the manager of the pet store! I'm going to need you to manage the inventory of pets that we have at the store.

In [None]:
pet_store = ['beagle', 'parrot', 'iguana', 'gerbil', 'chameleon', 'fish']

We've sold out of `chameleon` and `iguana`, so you need to remove them from the inventory. Use both `remove()` and `del` to do so.

We just got in a `terrier`, `chameleon`, `bulldog`, and `terrier`. Please add all of those animals to the `pet_store`.

Now, how many **unique** animal types do we currently have in the store?

How many `terrier` dogs do we have in the store? (Hint: the `list` data type has a built-in function called `count()`)

As an owner, I actually have a very bad obsessive-compulsive disorder. Could you please reverse-alphabetically sort the pet types at the store? Thanks so much! Also going to need you to work this entire weekend. For free.

Could you print from `beagle` to `fish` but display it in alphabetical order?

What is the index of `parrot`?

Now let's go back to our pet attribute dictionary.

In [None]:
attributes = {
    'penny' : {'weight' : 75,
               'length' : 60,
               'color' : 'yellow',
               'animal' : 'golden retriever',
              },
    'bobby' : {'weight' : 10,
               'length' : 20,
               'color' : 'brown',
               'animal' : 'cat',
              }
}

Please add `peter` the `parrot` to the `attributes` dictionary.

What are all the names of the animals that we currently have in `attributes`?

Does `bobby` weigh more than `peter`? (Remember equivalencies from the basic data types?)

`bobby` actually got into the catnip while you were on break, and then ripped open a box of treats and ate all of them. lol, he's just so cute!

Please update his weight to be twice as much as before.

Actually `bobby` is so cute I'm just going to take him home with me. Please take him out of the `attributes` dictionary so no one else will take him!

Great job today! Keep this up and I just might promote you!

Exercises completed!
