# Dictionaries and Sets

<hr style="border:1px solid gray">

## Dictionaries

One of the most useful built-in data types in Python is the dictionary, or the `dict` type. A dictionary is a *mutable* collection whose elements are **key-value** pairs. Instead of being indexed by a range of numbers (like a `list` or `tuple`), a dictionary is indexed by *keys*, which can be any hashable type. In practice this means immutable values: strings and numbers can always be keys, and so can tuples that contain only immutable items. You **cannot** use `list`s as keys since they can be modified in place. The values in the dictionary can be any valid data type.

It is best to think of dictionaries as *key:value* pairs with the requirement that keys are unique. To create a dictionary, you place comma-separated key:value pairs inside of curly braces, `{}`. You will use the **key** to look up or find the **value**.

Beginning in Python version 3.7, dictionaries are guaranteed to preserve **insertion order**. Older versions of Python do not make this guarantee, so if you print a dictionary in one of those versions, the entries may not appear in the order you added them. The compact dictionary implementation that made this ordering possible (introduced in CPython 3.6) also uses less memory.
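
As a minimal sketch of the two points above, the cell below uses a made-up dictionary called `quarterly_sales` (the name and the numbers are illustrative only). It shows that a tuple can serve as a key, that a list cannot, and that the keys come back in insertion order on Python 3.7 or later. It uses `try`/`except`, which may be covered in more detail elsewhere.

```python
# Hypothetical data, used only for illustration
quarterly_sales = {}
quarterly_sales[('2022', 'Q2')] = 120   # a tuple of strings is hashable, so it can be a key
quarterly_sales[('2022', 'Q1')] = 95
quarterly_sales[('2021', 'Q4')] = 110

# Keys come back in the order they were inserted (Python 3.7+), not sorted
key_order = list(quarterly_sales)
print(key_order)

# A list is mutable and therefore unhashable, so using one as a key fails
try:
    quarterly_sales[['2022', 'Q3']] = 130
    list_key_error = None
except TypeError as err:
    list_key_error = err
    print(f'TypeError: {err}')
```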

Let's create an income statement as a dictionary for our company.

In [None]:
# Create an income statement dictionary
income_stmt = {'Revenue': 100,
               'COGS': 55,
               'Gross Margin': 45,
               'SG&A': 40,
               'Net Income': 5}

print(income_stmt)

### Retrieving Elements

We use the **key** to get the **value** out of a dictionary. We pass the key between square brackets after the name of the dictionary. This is similar to retrieving elements of a list, except the keys can be something other than integers. 

Let's get our cost of goods sold value from the income statement.

In [None]:
# Retrieve an element using the key
income_stmt['COGS']
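
One thing worth knowing about square-bracket lookup is that it raises a `KeyError` when the key is not in the dictionary. The sketch below illustrates this with a small stand-in dictionary, `sample_stmt`, which is a hypothetical name and not part of the income statement above.

```python
# Hypothetical stand-in dictionary, used only for illustration
sample_stmt = {'Revenue': 100, 'SG&A': 40}

# An existing key returns its value
revenue = sample_stmt['Revenue']
print(revenue)

# A missing key raises KeyError; keys are case-sensitive,
# so 'revenue' is not the same key as 'Revenue'
try:
    sample_stmt['revenue']
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    print(f'KeyError: {err}')
```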

### Adding to a Dictionary

We often want to add a new key-value pair to the dictionary. In our case, we forgot to include the fiscal year for our income statement. We can add a new key-value pair with an assignment of the form `dictionary_name[new_key] = new_value`. Let's add the fiscal year to our income statement.

In [None]:
# To add a key-value to a dictionary, assign the value to a new key
# Add "Fiscal Year": 2022
income_stmt['Fiscal Year'] = 2022
print(income_stmt)

### Changing a Value

We can overwrite (i.e., change) the value of a key-value pair with an assignment of the form `dictionary_name[existing_key] = new_value`. Suppose we wanted to change our fiscal year to 1998. Let's try it.

In [None]:
# To change a value, access it using the key and reassign
# Change the fiscal year to 1998
income_stmt['Fiscal Year'] = 1998
print(income_stmt)

#### CAUTION

When you try to add a new key-value pair and the key already exists, the assignment silently overwrites the existing value. Therefore, you may want to check whether a key is already in the dictionary before adding the data. One way is to look at all the keys as a list. An easier approach is to use the `in` keyword to see if the key exists in the dictionary. Note that keys are case-sensitive.

In [None]:
# Get the keys of the dictionary as a list
list(income_stmt)

In [None]:
# You can use the `in` operator to determine if the key exists
'COGS' in income_stmt

In [None]:
'cogs' in income_stmt
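
Putting the caution into practice, one possible pattern is to test with `in` (or `not in`) before assigning. The built-in `dict` method `setdefault` performs the same check-then-add in a single call. The dictionary `sample_stmt` and its contents below are hypothetical and used only for this sketch.

```python
# Hypothetical stand-in dictionary, used only for illustration
sample_stmt = {'Revenue': 100, 'Fiscal Year': 2022}

# Only add the key if it is not already present
if 'Fiscal Year' not in sample_stmt:
    sample_stmt['Fiscal Year'] = 1998

# setdefault() adds the key only when it is missing,
# and returns whatever value is stored afterwards
stored_year = sample_stmt.setdefault('Fiscal Year', 1998)
stored_currency = sample_stmt.setdefault('Currency', 'USD')

print(sample_stmt)
```

In this sketch the existing fiscal year is left alone, while the missing `'Currency'` key is added.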

### How Many Elements

You can see how many key-value pairs are in the dictionary by using the `len` function.

In [None]:
# How many key-value pairs in income_stmt?
print(f'income_stmt has {len(income_stmt)} key-value pairs')

### Deleting Elements

If you want to delete an element from the dictionary, you can use the statement `del dictionary_name[key_to_delete]`. 

In [None]:
# Delete 'Fiscal Year'
del income_stmt['Fiscal Year']
print(income_stmt)
print(f'income_stmt has {len(income_stmt)} key-value pairs')

### Creating an Empty Dictionary

Sometimes you will create an empty dictionary and then add elements to it. To create an empty dictionary, simply use curly braces on the right-hand side of the equal sign with nothing between them. Let's try it.

In [None]:
# Create empty dictionary
my_dictionary = {}
print(my_dictionary)
print(f'my_dictionary contains {len(my_dictionary)} elements')

In [None]:
# Add something to my_dictionary for fun
my_dictionary['odd_nums'] = [1,3,5,7,9]
print(my_dictionary)

<hr style="border:1px solid gray">

<font color='red' size = '5'> Student Exercise </font>

Complete the following tasks in the empty **Code** cells below.

1. Create a new empty dictionary called `fun_dictionary`. Be sure to print its type and how many elements are in it to make sure you have done this task correctly.
2. Add the following key-value pairs to `fun_dictionary` and then print it:
    - `ice cream flavors` : `['chocolate', 'vanilla', 'strawberry', 'rocky road']`
    - `colors` : `['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']`
    - `version` : 0.1
    - `date` : today's date
3. Use the `in` operator to see if the key `version` is in `fun_dictionary`.
4. How many entries are in `fun_dictionary`?

<hr style="border:1px solid gray">

In [None]:
# 1. Create a new empty dictionary called `fun_dictionary`.


In [None]:
# 2. Add the following key-value pairs to `fun_dictionary`:
#    - `ice cream flavors` : `['chocolate', 'vanilla', 'strawberry', 'rocky road']`
#    - `colors` : `['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']`
#    - `version` : 0.1
#    - `date` : today's date


In [None]:
# 3. Use the `in` operator to see if the key `version` is in `fun_dictionary`.


In [None]:
# 4. How many entries are in `fun_dictionary`?


<hr style="border:1px solid gray">

### Some Dictionary Methods

Dictionary objects have several methods. Here we will look at a few of the most useful ones: `clear`, `get`, `items`, `keys`, `pop`, `popitem`, and `values`.

Let's start with `get`. It is an alternative to using square brackets for getting a value from a dictionary. Unlike square brackets, it does not raise an exception if the key is not present. The syntax is `dictionary_name.get(key, default)`, where the optional argument `default` is what will be returned if the key is not in the dictionary. If you omit `default`, `get` returns `None` for a missing key.

In [None]:
# Get value for the key COGS
income_stmt.get('COGS')

In [None]:
# What happens if the key does not exist
income_stmt.get('cogs')

In [None]:
# Try again using the parameter `default`
income_stmt.get('cogs', 'Entry not found')
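
A common use of the `default` argument is counting, where a key may not exist the first time it is seen. The sketch below assumes a made-up list called `entries` and uses a `for` loop, which is covered in more detail in another module.

```python
# Hypothetical list of ledger entry names, used only for illustration
entries = ['COGS', 'SG&A', 'COGS', 'Revenue', 'COGS']

entry_counts = {}
for name in entries:
    # get() returns 0 the first time a name is seen, so no KeyError is raised
    entry_counts[name] = entry_counts.get(name, 0) + 1

print(entry_counts)

# Without a default, get() returns None for a missing key
missing = entry_counts.get('Taxes')
print(missing)
```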

If you want to delete all the elements in a dictionary, you can use the `clear` method. This will leave the dictionary empty. Let's try it.

In [None]:
# First see what is in it
print(my_dictionary)

In [None]:
# Empty out my_dictionary
my_dictionary.clear()

In [None]:
# Print my_dictionary out again to see if it is empty
print(my_dictionary)

To get all the keys from a dictionary, you use the `keys` method. To get all the values from a dictionary, you use the `values` method. If you want both the keys and values as tuples, then use the `items` method. Let's look at each of them.

In [None]:
print(income_stmt.keys())

In [None]:
print(income_stmt.values())

In [None]:
# To get all the items as an iterable object, you call .items()
print(income_stmt.items())

In [None]:
# As a preview let's loop through the dictionary
# Call .items() and unpack each tuple and print
for k, v in income_stmt.items():
    print(f'key: {k:15}value: {v:>5}')
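
The objects returned by `keys`, `values`, and `items` are *views*: they stay connected to the dictionary and reflect later changes. If you need a fixed copy, wrap the view (or the dictionary itself) in `list()`. Here is a small sketch with a hypothetical dictionary named `sample_stmt`.

```python
# Hypothetical stand-in dictionary, used only for illustration
sample_stmt = {'Revenue': 100, 'SG&A': 40}

keys_view = sample_stmt.keys()      # a live view of the keys
keys_snapshot = list(sample_stmt)   # an independent list copy of the keys

# Add a key after the view and the snapshot were created
sample_stmt['Net Income'] = 5

print(keys_view)       # the view includes the new key
print(keys_snapshot)   # the list copy does not
```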

The `pop` method returns the **value** for the given key, if it exists, and then **removes** the entry from the dictionary. When you call `dictionary_name.pop(key, default)`, it returns the value for the specified key or, if the key does not exist, the value passed to the `default` argument. If the key does not exist and no `default` is given, `pop` raises a `KeyError`.

In [None]:
# Print out income_stmt first to see it
print(income_stmt)

In [None]:
# Let's pop 'Net Income'
print(income_stmt.pop('Net Income', 'Entry not found'))

In [None]:
# Print income_stmt to see if 'Net Income' has been deleted
print('income_stmt is now:')
print(income_stmt)

In [None]:
# Try something that is not in dictionary
income_stmt.pop('junk', 'Entry not found')
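
To see why the `default` argument matters, the sketch below calls `pop` on the same missing key twice, once without a default and once with one. The dictionary `sample_stmt` is a hypothetical stand-in used only for this illustration.

```python
# Hypothetical stand-in dictionary, used only for illustration
sample_stmt = {'Revenue': 100, 'Net Income': 5}

# pop() returns the value and removes the entry
net_income = sample_stmt.pop('Net Income')
print(net_income, sample_stmt)

# Without a default, popping a key that is no longer there raises KeyError
try:
    sample_stmt.pop('Net Income')
    pop_failed = False
except KeyError:
    pop_failed = True
print(f'pop without a default failed: {pop_failed}')

# With a default, the same call quietly returns the default instead
fallback = sample_stmt.pop('Net Income', 0)
print(fallback)
```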

The `popitem` method removes and returns the last inserted element in the dictionary as a `(key, value)` tuple. This LIFO (last in, first out) order is guaranteed in Python versions >=3.7. In older versions of Python, `popitem` would remove and return an arbitrary element in the dictionary. If you call `dictionary_name.popitem()` on an empty dictionary, it will raise a `KeyError`.

In [None]:
# This should return the last entry in the dictionary
last_item = income_stmt.popitem()
print(f'last_item is {last_item} and has type {type(last_item)}')

In [None]:
# Unpack the last entry
last_key, last_value = last_item

print('After calling income_stmt.popitem() we have:')
print(f'\tlast_key   = {last_key}')
print(f'\tlast_value = {last_value}')
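
The LIFO behavior and the `KeyError` on an empty dictionary can both be seen in one short sketch. It assumes Python 3.7 or later and a hypothetical dictionary named `sample_stmt`, and it uses a `while` loop that runs until the dictionary is empty.

```python
# Hypothetical stand-in dictionary, used only for illustration
sample_stmt = {'Revenue': 100, 'SG&A': 40, 'Net Income': 5}

removed = []
while sample_stmt:                          # an empty dictionary counts as False
    removed.append(sample_stmt.popitem())   # removes the most recently added pair

print(removed)   # pairs appear in reverse insertion order

# Calling popitem() on the now-empty dictionary raises KeyError
try:
    sample_stmt.popitem()
    popitem_failed = False
except KeyError:
    popitem_failed = True
print(f'popitem on an empty dictionary failed: {popitem_failed}')
```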

<hr style="border:1px solid gray">

<font color='red' size = '5'> Student Exercise </font>

Complete the following tasks in the empty **Code** cells below using the `fun_dictionary` that you created in the previous student exercise.

1. Retrieve and print the value associated with the key `version`.
2. Update the `version` to '0.1.1'. Print `fun_dictionary` to verify your update worked.
3. Print all of the keys in `fun_dictionary`.
4. Print all of the values in `fun_dictionary`.
5. Print all of the key-value pairs in `fun_dictionary`.
6. Retrieve and remove the last item that you added to `fun_dictionary`. Unpack the tuple as `fun_key` and `fun_value` and print them out.

<hr style="border:1px solid gray">

In [None]:
# 1. Retrieve and print the value associated with the key `version`.


In [None]:
# 2. Update the `version` to '0.1.1'


In [None]:
# 3. Print all of the keys in `fun_dictionary`.


In [None]:
# 4. Print all of the values in `fun_dictionary`.


In [None]:
# 5. Print all of the key-value pairs in `fun_dictionary`.


In [None]:
# 6. Retrieve and remove the last item that you added to `fun_dictionary`.
#    Unpack the tuple as `fun_key` and `fun_value` and print them out.


<hr style="border:1px solid gray">

## Sets

A **set** contains a collection of unique values. You can think of it like a mathematical set. Characteristics of sets include:

- All the elements must be unique.
- Sets are unordered, i.e., elements in a set are not stored in any particular order.
- The elements in a set can be of different data types.

Being an unordered collection, sets do not record element position or order of insertion. What this means is that sets do not support indexing, slicing, or other sequence-like behavior. You can, however, still use the `in` operator, the `len()` function, and loop over a set (looping is in another module).
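
As a quick preview of what this means in practice, the sketch below builds a small set (the name `sample_set` and its values are made up) and shows that `in` and `len()` work while indexing raises a `TypeError`. How sets are created is explained in the next section.

```python
# Hypothetical set, used only for illustration
sample_set = set(['red', 'green', 'blue'])

# Membership tests and len() work on sets
has_red = 'red' in sample_set
print(has_red)
print(len(sample_set))

# Indexing does not: sets have no positions
try:
    sample_set[0]
    indexing_failed = False
except TypeError as err:
    indexing_failed = True
    print(f'TypeError: {err}')
```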

### Creating a Set

To create an **empty** set, assign `set()` to a name. If you want to create a set with entries, you can pass `set()` a list or any other iterable object. Let's try creating a few different sets.

In [None]:
# Create an empty set
my_empty_set = set()

# Print it
print(my_empty_set)

# Print how many elements
print(f'my_empty_set has {len(my_empty_set)} elements in it')

#### Creating a Set with a `list`

Now, let's create a set by passing in a list.

In [None]:
# Create a set with odd numbers
my_set_of_odd_nums = set([1,3,5,7,9])

# Print it
print(my_set_of_odd_nums)

# How many elements?
print(f'my_set_of_odd_nums has {len(my_set_of_odd_nums)} elements')

Notice that when a set is printed out it has curly braces around it. Do **NOT** confuse this with a dictionary. Remember, a dictionary has key-value pairs. You can see from the print statement that the set only has values.

Now, let's try something crazy. Let's try creating a set with a string. Recall that strings are just a text sequence and sequences are iterable. Let's see what happens.

In [None]:
# Create the string
short_string = 'This is a string'

# Create a set by passing in a string
str_set = set(short_string)

print(str_set)
print(f'short_string has {len(short_string)} characters')
print(f'str_set has {len(str_set)} elements')

Notice that the set is **unordered** and the values are **unique**. For example, even though the lowercase letter `s` is in the string 3 times, it is only in the set once. 

What if, instead of making the set contain individual characters from a string, we wanted it to contain the words from the sentence? There is a string method called `split()` that returns a list of "words" by splitting the string on whitespace. (We will see this concept again.) Let's try splitting `short_string` and passing the resulting list to `set()`.

In [None]:
# Split short_string and send it to set()
str_set2 = set(short_string.split())

print(str_set2)
print(f'str_set2 has {len(str_set2)} elements')

*Thought Exercise:* In this case our string `short_string` does not have any repeating words. If it did, what would the resulting `set` look like?

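#### Creating a Set with `{}`

You can also create a `set` by placing comma-separated elements inside curly braces, `{}`. Let's try it. NOTE: You **cannot** include a `list` as one of the elements of the set, because set elements (like dictionary keys) must be hashable. Also note that empty braces, `{}`, create an empty dictionary, not an empty set.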

In [None]:
# Create a set using {}
another_set = {'Wow', 76, 3.14159}
print(another_set)
print(f'another_set has {len(another_set)} elements')
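
Two details of the curly-brace syntax are easy to trip over: empty braces create a dictionary rather than a set, and a list is not allowed as a set element. The sketch below illustrates both; all variable names are made up for this example.

```python
# Empty braces create a dictionary; use set() for an empty set
empty_braces = {}
empty_set = set()
print(type(empty_braces))   # <class 'dict'>
print(type(empty_set))      # <class 'set'>

# A tuple is hashable, so it can be a set element
coords = {(1, 2), (3, 4)}
print(len(coords))

# A list is not hashable, so this raises TypeError
try:
    bad_set = {[1, 2], [3, 4]}
    list_element_failed = False
except TypeError as err:
    list_element_failed = True
    print(f'TypeError: {err}')
```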

### Adding and Removing Set Elements

Adding elements is accomplished by using the `add` method. There is also an `update` method that allows you to pass a list (or any other iterable) of elements to be added to the set.

To remove elements from a set, you can use the `remove` method or the `discard` method. The `remove` method will raise a `KeyError` exception if the element does not exist in the set; `discard` does not.

Now, let's create an empty set and then add and remove elements from it.

In [None]:
# Create an empty set and then add elements to it
play_set = set()

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

In [None]:
# Add a few elements
play_set.add('A String')
play_set.add(42)
play_set.add(9.87)

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

In [None]:
# Try adding 42 to play_set again
play_set.add(42)

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

In [None]:
# Try using .update() to add elements to a set
play_set.update(['another', 'bunch', 45])

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

In [None]:
# Try .update() with a string by itself
play_set.update('xyz')

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

In [None]:
# Try removing 'x' from play_set
play_set.remove('x')

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

In [None]:
# Try removing a non-existing element with `discard`
play_set.discard('Not there')

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

In [None]:
# Try removing a non-existing element with `remove`
# (this raises a KeyError, so the print statements below will not run)
play_set.remove('Not there')

print(play_set)
print(f'play_set has {len(play_set)} elements in it')

#### Mathematical Set Functions

Commonly used mathematical set operations are also available to use on Python sets. You can find the *union* of sets using the `union` method or the shortcut operator `|`. You can find the *intersection* of sets using the `intersection` method or the shortcut operator `&`. You can also find the *difference* and *symmetric difference* of sets using the `difference` and `symmetric_difference` methods. There are also methods that let you determine if a set is a *subset* or a *superset*. We will not cover these topics here, but I encourage you to explore the ancillary information below for details.
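
Although these operations are not covered in depth here, a brief sketch may help you see what each one returns. The two sets below, `evens` and `small`, are made up for illustration, and `sorted()` is used only to print the results in a predictable order.

```python
# Hypothetical sets, used only for illustration
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

union_set = evens | small                             # same as evens.union(small)
common = evens & small                                # same as evens.intersection(small)
only_evens = evens.difference(small)                  # in evens but not in small
either_not_both = evens.symmetric_difference(small)   # in exactly one of the two sets

print(sorted(union_set))
print(sorted(common))
print(sorted(only_evens))
print(sorted(either_not_both))

# Subset and superset checks return True or False
is_subset = {2, 4}.issubset(evens)
is_superset = evens.issuperset({2, 4})
print(is_subset, is_superset)
```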

<hr style="border:1px solid gray">

<font color='red' size = '5'> Student Exercise </font>

Complete the following tasks in the empty **Code** cells below.

1. Create a new empty set called `fun_set`. Be sure to print its type and how many elements are in it to make sure you have done this task correctly.
2. Try adding the following values to `fun_set` and then print it and its length:
    - add a list : `['chocolate', 'vanilla', 'strawberry', 'rocky road']`
    - add a number : 42
    - add a string : `'indigo'`
    - add a number : 42.0
    - add a number : 42
3. Use the `in` operator to see if the value 42 is in `fun_set`.
4. Use the `in` operator to see if the value 42.0 is in `fun_set`.

What interesting detail did you notice from adding and checking for values in the set?

<hr style="border:1px solid gray">

In [None]:
# 1. Create a new empty set called `fun_set`.
# Be sure to print its type and how many elements are in it 
# to make sure you have done this task correctly.


In [None]:
# 2. Try adding the following values to `fun_set` and then print it and its length:
#    - add a list : `['chocolate', 'vanilla', 'strawberry', 'rocky road']`
#    - add a number : 42
#    - add a string : `'indigo'`
#    - add a number : 42.0
#    - add a number : 42


In [None]:
# 3. Use the `in` operator to see if the value 42 is in `fun_set`.


In [None]:
# 4. Use the `in` operator to see if the value 42.0 is in `fun_set`.


## Ancillary Information

The following links point you to additional resources that you might find helpful in learning this material.

- The official Python tutorial about [dictionaries][1].
- The official Python documentation for the [`dict` class][2].
- The official Python documentation for [sets][3].

-----

[1]: https://docs.python.org/3/tutorial/datastructures.html#dictionaries
[2]: https://docs.python.org/3/library/stdtypes.html?#mapping-types-dict
[3]: https://docs.python.org/3/library/stdtypes.html?#set-types-set-frozenset

**&copy; 2022 - Present: Matthew D. Dean, Ph.D.   
Clinical Associate Professor of Business Analytics at William \& Mary.**