<a href="https://colab.research.google.com/github/coding-dojo-data-science/python-basics-notebooks/blob/main/Python_Collections.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python Collections

## Learning Objectives:
When you complete this lesson you will be able to:
1. Explain the difference between lists, tuples, sets, and dictionaries in Python.
2. Choose the best type of collection for a given task.

## Covered in this Notebook:
1. Lists
2. Tuples
3. Sets
4. Dictionaries

# Lists

![image of to-do list](https://github.com/ninja-josh/image-storage/raw/main/thomas-bormans-pcpsVsyFp_s-unsplash.jpg)
Photo by Thomas Bormans on Unsplash


Lists are one of the most common types of collections in Python.  They are useful because they are ordered and keep their order unless changed explicitly.  On the other hand, they can easily be altered in many ways.

Lists are:
1. Ordered
2. Mutable (changeable)
3. Able to hold anything

## Pros:
1. Lists are ordered, and the order of a list can hold important information in addition to the data stored in it.  Imagine a list of customer names kept in the order they asked for seating in a packed restaurant.  If seating were ‘first come, first served’, then the names of the customers represent one kind of data, and the order they will be seated is another kind.

2. Lists are mutable, that is to say, they can be altered.  Data can be added, removed, changed, or reordered within the list.  Python provides many built-in list methods to perform these operations.  In the example of our list of customers to be seated, we could remove each customer from the beginning of the list when they are seated and add new ones to the end as they arrive.

## Cons
1. Lists are not computationally efficient for every task.  Reading an item by its index is fast, but searching a list for a value, or inserting and removing items near the beginning, takes longer as the list grows.  This won’t make much difference when you are working with a small amount of data, like lists of customers, business contacts, or competition race times.  However, when we start working with billions of data points, these operations can slow a program to a crawl.

2. Lists are not memory efficient.  They take up more space in computer memory than many other kinds of collections, as Python must reserve some space to add more elements as necessary.

Lists are created and denoted in Python with square brackets `[ ]` and data stored in them are separated by commas. Other collections can be converted to lists by using the `list()` function.

Since lists are ordered, you can retrieve an item by its position.  This is called 'indexing': put square brackets after the variable name, containing an integer for the position you want.  (Remember that Python starts counting at 0, so index 1 is the 2nd element.)


In [None]:
word_list = ['this', 'is', 'a', 'list', 'of', 'strings']
random_list = [12, 'frog', ['this','is','a','list','in','a','list'], True, 3.14]

print(word_list)
print(random_list)
print(random_list[2])

['this', 'is', 'a', 'list', 'of', 'strings']
[12, 'frog', ['this', 'is', 'a', 'list', 'in', 'a', 'list'], True, 3.14]
['this', 'is', 'a', 'list', 'in', 'a', 'list']
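
The cell above creates lists with square brackets.  As a minimal sketch of the other two ideas mentioned, converting with `list()` and zero-based indexing, the example below uses made-up variable names (`letters`, `numbers`).  Negative indices are not covered elsewhere in this lesson; they are included only to show that Python can also count from the end.

```python
# Convert other collections to lists with list()
letters = list('abc')          # a string becomes a list of its characters
numbers = list((10, 20, 30))   # a tuple becomes a list

print(letters)
print(numbers)

# Python counts from 0, so index 0 is the 1st element and index 1 is the 2nd
print(numbers[0])
print(numbers[1])

# A negative index counts back from the end of the list
print(numbers[-1])
```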


You can add items to a list with `list_name.append()`, which always adds the item to the end of the list.  You can remove items using `list_name.remove()`, which deletes the first item that matches the value you give it.

In [None]:
word_list.append('longer')
print(word_list)

word_list.remove('a')
print(word_list)

['this', 'is', 'a', 'list', 'of', 'strings', 'longer']
['this', 'is', 'list', 'of', 'strings', 'longer']


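You can also join two lists into a new list with the `+` operator.  The original lists are not changed.

In [None]: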
word_list + random_list

['this',
 'is',
 'list',
 'of',
 'strings',
 'longer',
 12,
 'frog',
 ['this', 'is', 'a', 'list', 'in', 'a', 'list'],
 True,
 3.14]
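
Returning to the restaurant example from the Pros section, here is a rough sketch of a 'first come, first served' waiting list.  The customer names are invented, and `pop(0)` is one possible way to remove and return the first item; it is not used elsewhere in this lesson.

```python
# Hypothetical waiting list, in the order customers asked for seating
waiting_list = ['Ana', 'Bo', 'Chen']

# A new customer arrives and joins the end of the list
waiting_list.append('Dee')

# The customer who has waited longest is seated first.
# pop(0) removes the item at index 0 and returns it.
seated = waiting_list.pop(0)

print('Now seating:', seated)
print('Still waiting:', waiting_list)
```

Removing from the front of a very long list is one of the slower list operations, which ties back to the first item under Cons.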

# Tuples

![image of safe](https://github.com/ninja-josh/image-storage/raw/main/jason-dent-3wPJxh-piRw-unsplash.jpg)
Photo by Jason Dent on Unsplash

Tuples are another kind of collection you will run into in Python.  Tuples are similar to lists in some ways and different in others.  Like lists, they are ordered and can store any combination of other objects.  The difference is that tuples cannot be changed.  They are immutable.

Tuples are:
1. Ordered
2. Immutable
3. Able to hold anything

## Pros:
1. Tuples are ordered.  You can access the first, last, or any other element and they keep their order.  They could hold that list of customers for seating but…

2. Tuples cannot be changed once created.  This may seem at first like a bad thing, but there are times you want to make sure that certain data is not being changed.  Tuples are a way to protect a collection of data from being altered.  Once you create your seating list, you would not be able to add new customers or remove them once they were seated.  

3. Tuples are slightly more computationally efficient than lists when accessing the data in them, but the difference is small.

## Cons
1. Tuples are immutable.  Yes, this is both a pro and a con.  You can copy a tuple, create a new tuple, or delete a tuple, but you can’t change one once created.

2. Tuples are still less computationally efficient for accessing data than some other data types.

Tuples can be created using parentheses `( )` and other collections can be converted to a tuple using the `tuple()` function.

Since tuples are ordered, they can be accessed like lists, with indices.


In [None]:
integer_tuple = (1,2,3,4,5)
random_tuple = (1, 'horse', integer_tuple, word_list, False)

print(integer_tuple)
print(random_tuple)
print(tuple(word_list))
print(random_tuple[1])

(1, 2, 3, 4, 5)
(1, 'horse', (1, 2, 3, 4, 5), ['this', 'is', 'list', 'of', 'strings', 'longer'], False)
('this', 'is', 'list', 'of', 'strings', 'longer')
horse


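Converting with `tuple()` returns a new tuple and leaves the original untouched, so `word_list` is still a list.

In [None]: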
type(word_list)

list
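
The earlier point that lists take up more memory than some other collections can be checked roughly with `sys.getsizeof`.  This is only a sketch: it assumes the standard CPython interpreter (the one Colab uses), the variable names are made up, and `sys.getsizeof` measures the container itself, not the objects stored inside it.

```python
import sys

# The same 1000 integers stored in a list and in a tuple
numbers_list = list(range(1000))
numbers_tuple = tuple(numbers_list)

# Size in bytes of each container (exact numbers vary by Python version)
list_bytes = sys.getsizeof(numbers_list)
tuple_bytes = sys.getsizeof(numbers_tuple)

print('list bytes: ', list_bytes)
print('tuple bytes:', tuple_bytes)
print('list is larger:', list_bytes > tuple_bytes)
```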

Remember, you cannot add, remove, or change items in a tuple!  However, you can combine tuples into a NEW tuple.

In [None]:
tuple1 = (1,2)
tuple2 = (3,4)
combined_tuple = tuple1 + tuple2

print(tuple1)
print(tuple2)
print(combined_tuple)

(1, 2)
(3, 4)
(1, 2, 3, 4)
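
To see what "immutable" means in practice, the sketch below tries to change an item in a tuple.  Python raises a `TypeError`, which is caught here so the cell keeps running.  The name `locked_tuple` is just an illustration.

```python
locked_tuple = (1, 2, 3)

# Trying to assign to a position in a tuple raises a TypeError
try:
    locked_tuple[0] = 99
    change_worked = True
except TypeError as error:
    change_worked = False
    print('Cannot change a tuple:', error)

# The tuple is exactly as it was created
print(locked_tuple)
```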


# Sets

![image of jar of colored pencils](https://github.com/ninja-josh/image-storage/raw/main/pierre-bamin-BFvNJXf2rpg-unsplash.jpg)

Photo by Pierre Bamin on Unsplash

Sets are mutable, unordered collections of data.  They cannot contain duplicate data: a set holds only one instance of any value, and any duplicates are dropped.  Sets are also very efficient when you want to know whether or not something is in them.

Sets are:
1. Unordered
2. Mutable
3. Able to hold anything that is "hashable" (no lists!)

In our restaurant seating example, a set would not retain the order of the customers.  However, you could quickly check whether a customer was in the set, remove a customer, or add one.  A set would also ensure that no customer was in the collection more than once!  Or, at least, no two customers with the same name.

## Pros:
1. Sets are useful for removing duplicates from other collections.  If you have a list and you convert it to a set, the result will have no duplicates.


2. Sets are mutable.  You can add or remove members of a set, though any duplicates will be dropped.

3. Sets are computationally efficient to access and change.  It is efficient to check if something is in a set or to add or remove an item from a set.

## Cons
1. Sets are unordered.  You cannot trust them to protect the order of items inside of them.  

2. Sets cannot contain duplicates.  This can be a pro or a con.

3. **Sets cannot contain lists!** Sets are built on hash tables, so every item stored in a set must be hashable.  This is what makes sets so efficient to access.  Hash tables are outside the scope of this lesson, but lists are not hashable and so cannot be stored in a set.  A tuple can be stored in a set…as long as it does not contain a list!

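Sets are denoted and created by using curly braces `{ }` around comma-separated items, and you can convert other collections to sets using the function `set()`.  Note that converting drops all duplicate values from the collection.  An empty pair of braces `{}` creates a dictionary, so use `set()` to create an empty set.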



In [None]:
# create some sets
float_set = {3.14, 12.0, .0001}
random_set = {'pelican', 5, False, (1, 3.14, 42)}

print(float_set)
print(random_set)

{0.0001, 3.14, 12.0}
{False, (1, 3.14, 42), 'pelican', 5}
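
The rule that sets cannot contain lists can be tried out directly.  In this sketch (variable names are illustrative), each failed attempt raises a `TypeError`, which is caught and recorded so the cell keeps running.

```python
# Hashable items, such as strings and tuples of hashable values, are fine
ok_set = {'pelican', (1, 2)}
print(ok_set)

rejected = []  # descriptions of the attempts that Python refused

# A list is not hashable, so it cannot be a member of a set
try:
    bad_set = {'pelican', [1, 2]}
except TypeError as error:
    rejected.append('list')
    print('List rejected:', error)

# A tuple that contains a list is not hashable either
try:
    also_bad_set = {(1, [2, 3])}
except TypeError as error:
    rejected.append('tuple containing a list')
    print('Tuple containing a list rejected:', error)
```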


Sets can be used to remove duplicates from a collection.

In [None]:
# list with duplicates
num_list = [1,1,1,3,3]
print('List:', num_list)

# convert to set
set_list = set(num_list)
print('Set:', set_list)

List: [1, 1, 1, 3, 3]
Set: {1, 3}


You can add or remove items from a set using `set_name.add()` and `set_name.remove()`.  Remember that you cannot trust a set to keep items in order!

In [None]:
# adding and removing items from a set

# add an item to a set
set_list.add(2)

print(set_list)

# remove an item from a set
set_list.remove(1)

print(set_list)

{1, 2, 3}
{2, 3}
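
Checking whether something is in a set uses the `in` keyword.  Here is a small sketch based on the restaurant example, with invented customer names.  It also shows `discard()`, a set method not covered above: unlike `remove()`, it does not raise an error when the item is missing.

```python
# Hypothetical set of customers who have been seated
seated_customers = {'Ana', 'Bo', 'Chen'}

# Membership checks return True or False
print('Bo' in seated_customers)
print('Zed' in seated_customers)

# Adding a name that is already present changes nothing
seated_customers.add('Bo')
print(len(seated_customers))

# discard() removes an item if present and does nothing otherwise
seated_customers.discard('Zed')
seated_customers.discard('Ana')
print(len(seated_customers))
```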


# Dictionaries

![Image of card catalogue](https://github.com/ninja-josh/image-storage/raw/main/erol-ahmed-Y3KEBQlB1Zk-unsplash.jpg)
Photo by Erol Ahmed on Unsplash

Dictionaries are one of the most common and powerful types of Python collections, but they are a little more complicated than lists, tuples, or sets.  Once you’ve mastered them, they are a Python superpower!

Dictionaries are:
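1. Accessed by key, not by position (since Python 3.7 they keep insertion order, but you cannot index them by number)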
2. Keys are immutable, but values are mutable

## Key/Value Pairs

Dictionaries store data as key/value pairs.  You can think of a dictionary as a book with a table of contents.  The entries in the table of contents are the ‘keys’ and the chapters they refer to are the ‘values’.  Every entry in the table of contents tells the location of a chapter, and every chapter is listed in the table of contents.

Every key in a dictionary is paired with a value.  You can add and remove data from a dictionary but must do so for both the key and the value. 

If we wanted to use a dictionary for our restaurant, we could use it to keep track of which table a customer has been seated at.  The keys could be customer names and the values could be the table number they are at.  A server could very quickly see where to find each customer.

## Immutable Keys and Mutable Values
 
Keys must be immutable, but values can be mutable.  A value can be anything, even another dictionary, but keys can only be immutable (more precisely, hashable) types.  An integer, a string, or a tuple that contains no lists could be a key, but a list cannot be a key.


So remember: 
### In a dictionary, you use a ‘key’ to access a ‘value’.

## Pros
Dictionaries are fast.  Looking up a value by its key is very computationally efficient.

## Cons
Dictionaries cannot be indexed by position.  Since Python 3.7 they do keep key/value pairs in the order they were inserted, but you retrieve a value by its key, not by where it sits in the dictionary.

Dictionaries are denoted and created with curly braces containing key/value pairs: `{'key': value}`.  The keys and values are separated by colons and the pairs are separated by commas.  In some situations you can convert a collection to a dictionary with `dict()`, but only if key/value pairs are apparent, such as in a list of tuples.

In [None]:
# create some dictionaries
boolean_dict = {'George': True, 
                'Ahmed': False, 
                'Juanita': True}

random_dict = {1: 'horse', 
              (3, 'hello'): [4, 'goodbye'], 
              'Nested Dictionary': {'Pi': 3.14, 
                                    'More than pi': 3.15}}

print(boolean_dict)
print(random_dict)

{'George': True, 'Ahmed': False, 'Juanita': True}
{1: 'horse', (3, 'hello'): [4, 'goodbye'], 'Nested Dictionary': {'Pi': 3.14, 'More than pi': 3.15}}
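
The rule about immutable keys can be tested the same way as with sets.  This is a minimal sketch with made-up names: a tuple works as a key, while a list raises a `TypeError`, which is caught so the cell keeps running.

```python
# A tuple of hashable items is a valid key
valid_dict = {('row', 1): 'tuple key works'}
print(valid_dict[('row', 1)])

# A list is mutable and not hashable, so it cannot be a key
key_error_message = None
try:
    invalid_dict = {['row', 1]: 'list key fails'}
except TypeError as error:
    key_error_message = str(error)
    print('List key rejected:', key_error_message)

# A list is perfectly fine as a value
valid_dict['numbers'] = [1, 2, 3]
print(valid_dict)
```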


In [None]:
# convert a list of tuples to a dictionary

tuple1 = ('first key', 1)
tuple2 = ('second key', 2)

list_of_tuples = [tuple1, tuple2]
tuple_dict = dict(list_of_tuples)

print(tuple_dict)

{'first key': 1, 'second key': 2}


To retrieve a value from a dictionary, put its key in square brackets after the dictionary name.  You will learn more about how to index into collections in another lesson.

In [None]:
# retrieve values from the dictionaries by key

print(boolean_dict['Ahmed'])
print(random_dict['Nested Dictionary'])
print(random_dict['Nested Dictionary']['Pi'])
print(tuple_dict['second key'])

False
{'Pi': 3.14, 'More than pi': 3.15}
3.14
2


You can add to a dictionary by defining a new key and setting it to a value.  
If the key does not exist, it will be added.

If the key already exists, the value associated with it will be changed!

In [None]:
# add a new key/value pair to a dictionary
tuple_dict['third key'] = 3
print(tuple_dict)

# change the value of a dictionary entry
tuple_dict['third key'] = 4
print(tuple_dict)

{'first key': 1, 'second key': 2, 'third key': 3}
{'first key': 1, 'second key': 2, 'third key': 4}
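
Putting this together for the restaurant example, here is a sketch of a dictionary that maps customer names to table numbers.  The names and table numbers are invented.  It also shows `del`, one way to remove a key and its value together, and `in`, which checks whether a key exists; neither is demonstrated in the cells above.

```python
# Hypothetical seating chart: customer name -> table number
table_for = {'Ana': 4, 'Bo': 7}

# Seat a new customer by adding a key/value pair
table_for['Chen'] = 2

# A server looks up where a customer is sitting
print(table_for['Bo'])

# Check whether a customer has been seated before looking them up
print('Dee' in table_for)

# When a customer leaves, remove the key and its value together
del table_for['Ana']
print(table_for)
```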


# Summary
Collections in Python can hold Python objects.  Each kind of collection has some rules about what it can and can’t hold and how it holds it.  Lists, tuples, sets, and dictionaries are some of the basic types of Python collections.  Like other Python object types, they each have their own rules and ways of interacting with them.  Lists are ordered and mutable.  Tuples are ordered and immutable.  Sets are unordered but mutable, and cannot contain duplicate values.  Dictionaries store data as pairs of keys and values, where each key points to a value like a table of contents points to a chapter in a book.  They are accessed by key rather than by position.
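
As a quick recap, the sketch below puts the same made-up customer names into each kind of collection so the differences can be seen side by side.  The names and table numbers are illustrative only.

```python
names = ['Ana', 'Bo', 'Ana']       # a list with one duplicate

as_list = list(names)              # ordered, mutable
as_tuple = tuple(names)            # ordered, immutable
as_set = set(names)                # unordered, duplicates dropped
as_dict = {'Ana': 4, 'Bo': 7}      # keys point to values

as_list.append('Chen')             # lists can grow
as_set.add('Chen')                 # sets can grow too
as_dict['Chen'] = 2                # dictionaries grow by adding a key

print(as_list)
print(as_tuple)
print(len(as_set))
print(as_dict['Chen'])
```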


# Optional Challenges

1. Create a dictionary of top search trends for 3 different countries.

The keys should be the country names and the values should be a list of 3 strings, each a trending search term.

You can find this data [HERE](https://trends.google.com/trends/trendingsearches/daily?geo=US).  You can choose countries in the top right under the search bar.

2. Add 2 more countries to the same dictionary without remaking it from scratch.

3. Index into your dictionary to retrieve the search terms for one of the countries you chose.