## Advanced Data Types

Python has four collection data types:

* Lists
* Tuples
* Sets
* Dictionaries

These data types allow for a greater number of operations because they can hold multiple elements. To ease into dealing with collections, we will first recall some properties of another collection data type: strings.


### Lists
A list is an ordered sequence of elements. The order is the one in which the elements are written when the list is created, or in which they are later added to the list. We create a list by using the `[]` syntax.

Let's go back to our obsession with pets, and imagine we own a pet store that carries a number of different species of pets.

In [1]:
pets = ['dogs', 'cats', 'fish']

In [2]:
print( pets )

['dogs', 'cats', 'fish']


In [3]:
print(pets[0])

dogs


In [4]:
print(pets[1])

cats


In [5]:
pets.append('birds')
pets

['dogs', 'cats', 'fish', 'birds']

Notice that the list printed the elements in the same order in which they were created. Unlike strings, **lists are mutable**. Thus, we can change them. 

If we want to add an element to the list, we `append()` the new element to the list.

In [6]:
pets.append(['parrot', 'dolphin'])
pets

['dogs', 'cats', 'fish', 'birds', ['parrot', 'dolphin']]

This is not what I had in mind. I wanted to add 'parrot' and 'dolphin', not the list `['parrot', 'dolphin']`. If we want to add several values, we can either append them one by one, or we can `extend()` our list `pets` with the list of additional pets that our pet shop carries.

In [56]:
pets.remove(['parrot', 'dolphin'])
pets.extend(['parrot', 'dolphin'])

pets

['dogs', 'cats', 'fish', 'birds', 'parrot', 'dolphin']

In [8]:
pets + ['chameleon', 'hamster']

pets

['dogs', 'cats', 'fish', 'birds', 'parrot', 'dolphin']
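To put the three ways of growing a list side by side, here is a minimal sketch. The list `inventory` and its contents are made up for this illustration and are independent of `pets`.

```python
# Hypothetical inventory, independent of the `pets` list used above.
inventory = ['dogs', 'cats']

# append() adds its argument as ONE element, even if that argument is a list.
inventory.append(['fish', 'birds'])
print(inventory)        # ['dogs', 'cats', ['fish', 'birds']]

# Undo the nested append, then use extend(), which adds each element in turn.
inventory.pop()
inventory.extend(['fish', 'birds'])
print(inventory)        # ['dogs', 'cats', 'fish', 'birds']

# The + operator builds a NEW list; we keep it only if we assign it to a name.
bigger = inventory + ['hamster']
print(bigger)           # ['dogs', 'cats', 'fish', 'birds', 'hamster']
print(inventory)        # ['dogs', 'cats', 'fish', 'birds']
```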

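Just like strings, lists support the `+` and `*` operators. Note that `pets + ['chameleon', 'hamster']` did not change `pets`: the `+` operator builds a new list, which is lost unless we assign it to a variable. Lists can also be multiplied by an integer; I'll use a small number here so as not to flood the screen.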

In [9]:
print( pets * 2 )

['dogs', 'cats', 'fish', 'birds', 'parrot', 'dolphin', 'dogs', 'cats', 'fish', 'birds', 'parrot', 'dolphin']


As with strings, subtraction and division are not implemented for lists. The reason is that it is not clear what subtracting or dividing two lists should produce.

In [11]:
pets - ["fish"]

TypeError: unsupported operand type(s) for -: 'list' and 'list'
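If what we want is "every element of one list that does not appear in another", one common way to spell that out is a list comprehension. This is only one possible meaning of subtraction, and the names `stock` and `sold_out` are invented for this sketch.

```python
# Hypothetical lists for illustration.
stock = ['dogs', 'cats', 'fish', 'cats']
sold_out = ['cats']

# Keep every element of `stock` that is not listed in `sold_out`.
# This drops ALL matching elements and builds a new list;
# `stock` itself is left unchanged.
remaining = [animal for animal in stock if animal not in sold_out]
print(remaining)   # ['dogs', 'fish']
print(stock)       # ['dogs', 'cats', 'fish', 'cats']
```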

In [12]:
len(pets)

6

In [13]:
pets[1 : 3]

['cats', 'fish']

In [14]:
pets[1] = 'sea monkeys'
pets

['dogs', 'sea monkeys', 'fish', 'birds', 'parrot', 'dolphin']

In [15]:
pets.index('fish')

2

### Adding and removing elements from a list

Python provides several methods for changing the contents of a list.  We will first look at `insert()` and `pop()`.


In [16]:
pets.insert(5, 'leopard')
pets

['dogs', 'sea monkeys', 'fish', 'birds', 'parrot', 'leopard', 'dolphin']

Wait, it is illegal to have leopards as pets. Better take it out from our list.

In [17]:
illegal_pet = pets.pop(5)
print( illegal_pet )
print( pets )

leopard
['dogs', 'sea monkeys', 'fish', 'birds', 'parrot', 'dolphin']


Notice how popping an element returns its value. A limitation of `pop()` is that we must know the index of the value we want to remove. If we do not have that knowledge, then we can use the method `remove()`, which takes the value itself:

In [18]:
pets.insert(2, 'hamster')
print( pets )

['dogs', 'sea monkeys', 'hamster', 'fish', 'birds', 'parrot', 'dolphin']


In [19]:
pets.remove('hamster')
pets

['dogs', 'sea monkeys', 'fish', 'birds', 'parrot', 'dolphin']
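Two details of `remove()` are easy to miss: it deletes only the first matching element, and it raises a `ValueError` if the value is not in the list. The sketch below uses a made-up list called `animals` to show both.

```python
# Hypothetical list with a repeated value.
animals = ['hamster', 'fish', 'hamster']

# remove() deletes only the FIRST matching element.
animals.remove('hamster')
print(animals)      # ['fish', 'hamster']

# Removing a value that is not present raises ValueError,
# so we can check membership with `in` first.
if 'leopard' in animals:
    animals.remove('leopard')
else:
    print('No leopard in the list')
```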

Another way to remove elements from a list is with the `del` statement. However, `del` is a sort of nuclear option: if we do not provide an index, it will delete the **entire** variable! **Be careful when you use `del`**.

In [20]:
del pets[-1]
pets

['dogs', 'sea monkeys', 'fish', 'birds', 'parrot']
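The following sketch shows what the warning about `del` means in practice. It uses a throwaway list named `scratch` (a hypothetical name) so that `pets` is not affected.

```python
# Hypothetical list used only for this demonstration.
scratch = ['a', 'b', 'c']

del scratch[0]        # with an index: removes one element
print(scratch)        # ['b', 'c']

del scratch[0:2]      # with a slice: removes several elements at once
print(scratch)        # []

del scratch           # without an index: the name itself is deleted

# Using the name now raises NameError.
try:
    scratch
    name_still_exists = True
except NameError:
    name_still_exists = False
print(name_still_exists)   # False
```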

### Ordering lists

Lists have other built-in methods too besides the maintenance functions of adding and deleting elements. For example, you can reverse a list:

In [21]:
pets

['dogs', 'sea monkeys', 'fish', 'birds', 'parrot']

In [22]:
pets.reverse()


In [23]:
pets

['parrot', 'birds', 'fish', 'sea monkeys', 'dogs']

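The elements in a list can also be sorted. If the elements are numbers, then the standard ordering of numbers is used. If the elements are strings, then alphabetical ordering is used (more precisely, strings are compared character by character by Unicode code point, so uppercase letters come before lowercase ones).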

In [24]:
pets.sort()


In [25]:
pets

['birds', 'dogs', 'fish', 'parrot', 'sea monkeys']
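As a small illustration of these ordering rules, the sketch below sorts a list of numbers and a list of strings with mixed capitalization. The lists `weights` and `names` are made up for this example.

```python
# Hypothetical lists for illustration.
weights = [75, 8, 20]
weights.sort()
print(weights)      # [8, 20, 75]

# Strings are compared character by character using their Unicode code
# points, so every uppercase letter sorts before every lowercase letter.
names = ['parrot', 'Zebra', 'ant']
names.sort()
print(names)        # ['Zebra', 'ant', 'parrot']

# The optional `key` argument changes what is compared; str.lower gives a
# case-insensitive ordering.
names.sort(key=str.lower)
print(names)        # ['ant', 'parrot', 'Zebra']
```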

Notice that the methods `sort()` and `reverse()` modify the list in place. If you want to keep the original list unchanged, use the built-in functions `sorted()` and `reversed()` instead, which leave it alone and return a new object.

In [28]:
reversed(pets)

<list_reverseiterator at 0x27f62bbd6d0>

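The output of the function `reversed()` is not a list. It is what Python calls an `iterator`: an object that produces its values one at a time, on request, instead of storing them all at once. The advantage of iterators is that they avoid building a second copy of the data in memory, which can matter for large collections. (`sorted()`, by contrast, returns a new list.)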

However, if we want to have a human readable printout, we need to cast the iterator as a list. 

In [29]:
print( "Reversing the list", list( reversed(pets) ) )

print( "The original list", pets )

Reversing the list ['sea monkeys', 'parrot', 'fish', 'dogs', 'birds']
The original list ['birds', 'dogs', 'fish', 'parrot', 'sea monkeys']
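The difference between the in-place methods and the built-in functions can be summarized in a short sketch. The list `animals` is a hypothetical example.

```python
# Hypothetical list for illustration.
animals = ['fish', 'dogs', 'birds']

# sorted() returns a NEW list and leaves the original alone.
in_order = sorted(animals)
print(in_order)   # ['birds', 'dogs', 'fish']
print(animals)    # ['fish', 'dogs', 'birds']

# sort() changes the list in place and returns None,
# so assigning its result is a common mistake.
result = animals.sort()
print(result)     # None
print(animals)    # ['birds', 'dogs', 'fish']

# A reversed() iterator can be consumed only once.
backwards = reversed(animals)
print(list(backwards))   # ['fish', 'dogs', 'birds']
print(list(backwards))   # []
```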


## Hands On Exercise 

<mark> Create your shopping list and print it </mark> 

In [57]:
shop_list=["Milk","Chips","Coffee"]

In [58]:
shop_list

['Milk', 'Chips', 'Coffee']

<mark> You are now the manager of the pet store! I'm going to need you to manage the inventory of pets that we have at the store.</mark> 

In [59]:
pet_store = ['beagle', 'parrot', 'iguana', 'gerbil', 'chameleon', 'fish']

<mark> We've sold out of `chameleon` and `iguana`, you need to remove them from the inventory. Use both `remove()` and `del` to do so. </mark> 

In [60]:
pet_store.remove('chameleon')
del pet_store[2]
pet_store

['beagle', 'parrot', 'gerbil', 'fish']

<mark> We just got in a `terrier`, `chameleon`, `bulldog`, and another `terrier`. Please add all of those animals to the `pet_store`.</mark> 

In [61]:
pet_store.extend(['terrier', 'chameleon', 'bulldog', 'terrier'])
pet_store

['beagle',
 'parrot',
 'gerbil',
 'fish',
 'terrier',
 'chameleon',
 'bulldog',
 'terrier']

<mark> Now, how many animals do we currently have in the store? </mark> 

In [63]:
len(pet_store)

8

<mark> How many `terrier` dogs do we have in the store? (Hint: the `list` data type has a built-in method called `count()`) </mark> 

In [64]:
pet_store.count('terrier')

2

<mark> As an owner, I actually have a very bad obsessive-compulsive disorder. Could you please reverse-alphabetically sort the pet types at the store? Thanks so much! Also going to need you to work this entire weekend. For free.</mark> 

In [65]:
sorted(pet_store, reverse=True)

['terrier',
 'terrier',
 'parrot',
 'gerbil',
 'fish',
 'chameleon',
 'bulldog',
 'beagle']

<mark> What is the index of `parrot`? </mark> 

In [67]:
pet_store

['beagle',
 'parrot',
 'gerbil',
 'fish',
 'terrier',
 'chameleon',
 'bulldog',
 'terrier']

In [68]:
pet_store.index('parrot')

1

## Tuples

On the surface, tuples appear similar to lists:

* Both tuples and lists contain a sequence of individual elements
* Both tuples and lists store their elements in the order in which they were added
* Both tuples and lists can store mixed data types
* In both, we access individual elements with the syntax `variable[index]`

So what is the difference? 

The big difference is that tuples are an **immutable** data type (like strings). This means that none of the methods that modify a list in place, such as `append()`, `remove()`, or `sort()`, are available for tuples.

In order to indicate this **crucial** difference between lists and tuples, Python uses a different syntax to create a tuple.
The syntax for creating a `list` uses `[]`. The syntax to create a `tuple` uses `()`. 


In [30]:
# Create a tuple that stores the attributes of our golden retriever, `penny`.

penny = (60, 75, 'yellow') #Length (in), Weight (lbs), Color 
print( type(penny) )
penny

<class 'tuple'>


(60, 75, 'yellow')

Notice that since it isn't necessarily clear what each of the individual elements stands for in the tuple, I added a comment listing the meaning of each field. As we discussed earlier, comments are signaled in Python by the symbol `#`. Whatever you type in a line after the `#` will be ignored by the Python interpreter.

Because tuples are immutable, element assignment does not work.

In [32]:
penny[-1] = 'brown'

TypeError: 'tuple' object does not support item assignment

In [34]:
del penny[1]

TypeError: 'tuple' object doesn't support item deletion

Element removal does not work either. However, we can still use the `+` and `*` operators with tuples.

In [36]:
penny + ('brown', 'stinky') #Eye color, breath smell
print( penny )

penny2 = penny + ('brown', 'stinky')
print( penny2 )

(60, 75, 'yellow')
(60, 75, 'yellow', 'brown', 'stinky')


The reason is that when we add something to a tuple we are not changing the tuple; we are creating a new tuple. Notice that if we do not assign the new tuple to a variable, then it is lost to us.
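A minimal sketch of this point, using a hypothetical tuple `dog` that mirrors `penny`: assigning the result back to the same name keeps the new tuple, while the original tuple object stays exactly as it was.

```python
# Hypothetical tuple mirroring the `penny` example.
dog = (60, 75, 'yellow')
original = dog              # a second name for the same tuple

# Reassigning the name keeps the result of the addition...
dog = dog + ('brown',)
print(dog)                  # (60, 75, 'yellow', 'brown')

# ...but the original tuple object was never modified.
print(original)             # (60, 75, 'yellow')
print(dog is original)      # False
```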

Another subtlety with tuples is that they require a non-obvious syntax when they have a single element. Specifically, a tuple with a single element must be written using the syntax `(value, )`. Without the comma, the Python interpreter treats the `()` as ordinary grouping parentheses, as in a mathematical expression, and the result is just `value`.

In [37]:
penny + ('brown', )

(60, 75, 'yellow', 'brown')

In [38]:
penny

(60, 75, 'yellow')
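To see the role of the trailing comma directly, we can ask Python for the type of each expression. The variable names below are chosen only for this illustration.

```python
# Parentheses alone only group an expression.
not_a_tuple = ('brown')
print(type(not_a_tuple))    # <class 'str'>

# The trailing comma is what makes a one-element tuple.
one_tuple = ('brown',)
print(type(one_tuple))      # <class 'tuple'>
print(len(one_tuple))       # 1
```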

### So wait, why are there both lists and tuples? 

I know, right now it's pretty hard to see why you would use one data type over the other. 

Remember that Python was developed with the ideal of being readable and understandable. That means that when you are writing code, you want to make choices that make it easy for others to understand what is going on. For example, if you have a variable that contains values that are never going to change, then why would you use a mutable data type such as a list to store it?

Think of the days of the week or the months of the year: Does it make more sense to define

`days_of_the_week = ("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")`

or 

`days_of_the_week = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]`?

By using a tuple to store some data, you are signaling to a reader of your code that the values you stored are not going to change during the execution of your code.

## Sets

A Python set is an unordered collection that does not allow repeated values. To create a `set` we list its elements inside `{}`. An empty set must be created with `set()`, because `{}` on its own creates an empty dictionary.

Why do we need sets? Imagine, for example, that you want to store the names of your friends; you will want to use a list since you might have several friends named Mary. However, if you are storing the cities in which your friends live, you do not want the same city repeated several times. If you want to find out which cities to visit in order to see friends, you only need each city name once.

Let us see this in play with our pet store.  We want to keep track of what pet species we have.

In [70]:
pet_species = {'bulldog', 'hamster', 'parrot','parrot'}
pet_species


{'bulldog', 'hamster', 'parrot'}
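The sketch below illustrates two points about creating sets: how to make an empty one, and how converting a list to a set drops the repeats. The list `friend_cities` and the city names are made up for this example.

```python
# {} on its own creates an empty dictionary, not an empty set.
print(type({}))       # <class 'dict'>
print(type(set()))    # <class 'set'>

# Hypothetical list of the cities where friends live, with repeats.
friend_cities = ['Chicago', 'Boston', 'Chicago', 'Austin', 'Boston']

# Converting to a set keeps each city once.
cities_to_visit = set(friend_cities)
print(len(friend_cities))     # 5
print(len(cities_to_visit))   # 3
```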

In [40]:
pet_species.add('crocodile')
print( pet_species )
pet_species.add('crocodile')
print( pet_species )

{'crocodile', 'hamster', 'parrot', 'bulldog'}
{'crocodile', 'hamster', 'parrot', 'bulldog'}


Sets are also **mutable**, like a `list`. However, they don't use the same methods. To add an item, you use the method `add()`. To remove an element, you use the method `discard()`. Notice, however, that **sets cannot be added or multiplied**: the `+` and `*` operators are not defined for them.

In [41]:
pet_species.add('gerbil')
print( pet_species )
pet_species.discard('crocodile')
print( pet_species )

{'gerbil', 'parrot', 'bulldog', 'hamster', 'crocodile'}
{'gerbil', 'parrot', 'bulldog', 'hamster'}
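Sets also have a `remove()` method, which differs from `discard()` in how it handles a missing element. The sketch below, using a hypothetical set `species`, shows that difference and what happens when we try to add two sets.

```python
# Hypothetical set for illustration.
species = {'bulldog', 'hamster'}

# discard() silently does nothing if the element is absent.
species.discard('crocodile')
print(species == {'bulldog', 'hamster'})   # True

# remove() raises KeyError for a missing element.
try:
    species.remove('crocodile')
    removed = True
except KeyError:
    removed = False
print(removed)    # False

# The + operator is not defined for sets; union() combines two sets instead.
try:
    combined = species + {'parrot'}
except TypeError:
    combined = species.union({'parrot'})
print(combined == {'bulldog', 'hamster', 'parrot'})   # True
```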


### Other set operations

Another class of operations that can be performed on sets consists of `intersection`, `union`, and `difference`. These operations behave exactly like their namesakes in the mathematics of sets.

In [42]:
store_one = {'bulldog', 'parrot', 'hamster', 'fish'}
store_two = {'fish', 'parrot', 'terrier', 'cat'}

print( store_one.intersection(store_two) )

print( store_one.union(store_two) )

print( store_one.difference(store_two) )  # A - B = A - (A intersection B)

print( store_two.difference(store_one) )

{'fish', 'parrot'}
{'terrier', 'cat', 'parrot', 'bulldog', 'fish', 'hamster'}
{'hamster', 'bulldog'}
{'terrier', 'cat'}


Note that while `intersection` and `union` are symmetrical (for example, `A.union(B) == B.union(A)`), the same is not true for `difference`.
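Python also offers operator spellings for these three methods: `&`, `|`, and `-`. The sketch below checks that they agree with the methods and tests the symmetry claim; the sets `A` and `B` reuse the store contents from above under new, hypothetical names.

```python
# Hypothetical sets named after the A and B used in the text.
A = {'bulldog', 'parrot', 'hamster', 'fish'}
B = {'fish', 'parrot', 'terrier', 'cat'}

# Operator spellings of the three methods.
print((A & B) == A.intersection(B))   # True
print((A | B) == A.union(B))          # True
print((A - B) == A.difference(B))     # True

# Union and intersection do not depend on the order of the operands...
print((A | B) == (B | A))   # True
print((A & B) == (B & A))   # True

# ...but difference does.
print((A - B) == (B - A))   # False
```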

###  Changing types across sets, lists, and tuples

Since `sets`, `lists`, and `tuples` are all collections of elements, Python allows us to easily convert variables from one type to another as needed. You just need to cast the variable.

In [43]:
list_store_one = list( store_one )
print( list_store_one )
print( type(list_store_one) )

['fish', 'hamster', 'parrot', 'bulldog']
<class 'list'>


In [45]:
tuple_store_one = tuple( store_one )
print( tuple_store_one )
print( type(tuple_store_one) )

('fish', 'hamster', 'parrot', 'bulldog')
<class 'tuple'>


In [46]:
reset_store_one = set(tuple_store_one)
print( reset_store_one )
print( type(reset_store_one) )

{'fish', 'hamster', 'parrot', 'bulldog'}
<class 'set'>
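One caveat when converting a set to a list or tuple: a set has no defined order, so the order of the resulting elements should not be relied on. If a predictable order matters, one option is `sorted()`, as in this sketch (the names `store`, `ordered`, `with_repeats`, and `unique` are hypothetical).

```python
# Hypothetical set; the order in which a set hands out its elements is
# an implementation detail and should not be relied on.
store = {'bulldog', 'parrot', 'hamster', 'fish'}

# sorted() accepts any collection and always returns a list,
# giving a predictable order.
ordered = sorted(store)
print(ordered)            # ['bulldog', 'fish', 'hamster', 'parrot']

# Converting a list with repeats to a set and back removes the repeats.
with_repeats = ['fish', 'parrot', 'fish']
unique = sorted(set(with_repeats))
print(unique)             # ['fish', 'parrot']
```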


## Nesting

So far, most of the elements that we have added to our collections have been `string`, `float`, or `integer` data types. However, the elements of a collection can be collections themselves. Indeed, we can include any data type inside a `tuple` or a `list`. A `set` is more restrictive: its elements must be hashable, so it can contain tuples of strings or numbers but not lists or other sets. This process is called **nesting** and it is a bit like Russian dolls.
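As a quick check of which collections can be nested inside which, consider the sketch below. All names and contents are invented for illustration.

```python
# Lists and tuples can hold any kind of element, including other collections.
nested_list = ['fish', ['bulldog', 'terrier'], ('parrot', 'canary')]
print(len(nested_list))      # 3

# A set can hold tuples of strings, because such tuples are hashable...
breed_pairs = {('bulldog', 'terrier'), ('parrot', 'canary')}
print(len(breed_pairs))      # 2

# ...but not a list, which is mutable and therefore unhashable.
try:
    bad_set = {'fish', ['bulldog', 'terrier']}
    list_allowed = True
except TypeError:
    list_allowed = False
print(list_allowed)          # False
```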



Let's see nesting at play. Going back to our pet store, assume that our customers are picky and care about the breed of the dog they want to buy.

In [47]:
print( pets )
pets.pop(1)   # Get rid of dogs
dog_breeds = ['bulldog', 'terrier', 'greyhound']
pets.append(dog_breeds)  # append list of dog breeds available
pets

['birds', 'dogs', 'fish', 'parrot', 'sea monkeys']


['birds', 'fish', 'parrot', 'sea monkeys', ['bulldog', 'terrier', 'greyhound']]

In [48]:
pets[-1]

['bulldog', 'terrier', 'greyhound']

How can we now access a specific dog breed? 

The dog breeds are in a list, so we are able to access any element in that list using an index. We just need to know the name of the list. The name is, of course, `pets[-1]`.


In [49]:
pets[-1][0]

'bulldog'

Because `pets[-1]` is itself a list, we can index it directly. We could write `(pets[-1])[0]`, but the parentheses are unnecessary; `pets[-1][0]` is enough. Thus, to access elements in nested lists, we just keep adding `[]` in order to navigate deeper and deeper into the nested collection of lists. When Python navigates through your list variable it always reads the indices from left to right, with the leftmost `[]` referring to the topmost `list`.


We can also change the values inside our nested lists, just as we do in the topmost `list`, and use all of the `list` methods that we used before.

In [50]:
pets[-1][0] = 'golden retriever'
pets

['birds',
 'fish',
 'parrot',
 'sea monkeys',
 ['golden retriever', 'terrier', 'greyhound']]

In [51]:
pets[-1].append('pug')
pets

['birds',
 'fish',
 'parrot',
 'sea monkeys',
 ['golden retriever', 'terrier', 'greyhound', 'pug']]

### The subtleties of copying nested structures

Imagine that we open a second pet store, and that initially it will have the same inventory as the first store. Like all programmers worth their salt, we want to minimize the amount of work we do. Thus, we would never add the names to the list one by one. We just want to copy the list with the inventory from the first store.

In [52]:
store_one = pets
store_two = store_one
print( store_one )
print( store_two )

['birds', 'fish', 'parrot', 'sea monkeys', ['golden retriever', 'terrier', 'greyhound', 'pug']]
['birds', 'fish', 'parrot', 'sea monkeys', ['golden retriever', 'terrier', 'greyhound', 'pug']]


Business is going well at store one. We sold the `pug`, and acquired a `cat` and a `gerbil`:

In [53]:
store_one[-1].remove('pug')
store_one.extend(['cat', 'gerbil'])


In [54]:
print( store_one )
print( store_two )

['birds', 'fish', 'parrot', 'sea monkeys', ['golden retriever', 'terrier', 'greyhound'], 'cat', 'gerbil']
['birds', 'fish', 'parrot', 'sea monkeys', ['golden retriever', 'terrier', 'greyhound'], 'cat', 'gerbil']


**Both inventories were changed!** 

The reason is that the statement

`store_two = store_one`

did not copy the values in the collection labelled by `store_one` to a new place in memory labelled by `store_two`. It just made the name `store_two` point to the same position in memory as `store_one`. 
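One way to check whether two names point to the same place in memory is the `is` operator. The sketch below uses hypothetical inventories and also shows that `list()` builds a separate list.

```python
# Hypothetical inventories for illustration.
first_store = ['birds', 'fish']
second_store = first_store          # a second name, not a second list

# `is` tests whether two names refer to the very same object.
print(second_store is first_store)  # True

# A change made through one name is visible through the other.
second_store.append('cat')
print(first_store)                  # ['birds', 'fish', 'cat']

# Calling list() on an existing list builds a separate list
# with the same elements.
third_store = list(first_store)
print(third_store is first_store)   # False
print(third_store == first_store)   # True
```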

If you actually want to copy the values in the collection labelled by `store_one` to a new place in memory, then you have to use one of the functions `copy()` and `deepcopy()`.

Notice that there are important differences between the two functions. `copy()` does not make copies of nested structures. That is, if you are using `copy()` on a list that contains other lists, the inner lists are not copied to a different location in memory; the new list and the original share them. `deepcopy()` copies the nested structures as well.

**Every distribution of Python includes these two functions in the standard library module `copy`, but unlike built-in functions, we have to import them before we can use them!**

In [55]:
from copy import copy

store_one = ['beagle', 'terrier', 'bulldog', 'fish']
store_two = copy(store_one)
store_one.remove('beagle')
store_one.extend(['parrot', 'gerbil'])

print('Store One', store_one)
print('Store Two', store_two)

Store One ['terrier', 'bulldog', 'fish', 'parrot', 'gerbil']
Store Two ['beagle', 'terrier', 'bulldog', 'fish']


For now, do not worry about the meaning of the `from ... import ...` command. Just be aware of the subtleties when copying collections, especially nested collections, since you may not be doing what you intended to do. Accidentally setting one variable name equal to another without noticing it is a way for logical errors to creep into your programs. **Always be careful! Always test your code!**