# Data structures - Part II

Data structures are fundamental components in programming that allow you to efficiently **store, organize, and manipulate** data. They provide a way to manage and **work with collections of values** or entities. Python offers several commonly used data structures.

In the previous lesson we talked about *lists*. **Lists** are ordered collections of items enclosed in square brackets [ ]. They can store elements of different data types and allow for indexing, appending, removing, and modifying elements.

In this lesson, we will talk about *dictionaries*. **Dictionaries** are collections of key-value pairs enclosed in curly braces { }. Each element in a dictionary consists of a key and its associated value. Dictionaries provide fast lookup and retrieval of values based on their keys and are useful for organizing and accessing data using meaningful labels. (MUTABLE)

## Dictionaries 

In Python, a dictionary is a powerful data structure that allows you to store and organize data in key-value pairs. It is similar to a real-life dictionary where you can look up a word (key) and find its corresponding definition (value). Dictionaries are mutable, meaning you can modify them after they are created.

They are enclosed in **curly braces {}** and consist of multiple **key-value pairs separated by commas**. Dictionaries allow you to access elements by using their respective keys.

### Creating a Dictionary

You can create a dictionary by enclosing comma-separated key-value pairs within curly braces {}. For example:

In [None]:
contacts = {'John': '312-555-1234', 'Paul': '312-555-3123', 'George': '312-555-3333', 'Ringo': '312-555-2222'}

In [None]:
type(contacts)

Here we have defined a dictionary and we saved it in a variable named `contacts`.  

A key-value pair is defined using a colon **`:`** - `key : value`.

The key-value pairs here are:
- "John" : "312-555-1234"
- "Paul" : "312-555-3123" 
- "George" : "312-555-3333" 
- "Ringo" : "312-555-2222"

All the **keys** in a dictionary **must be unique**, while the *values can be repeated*.

If you try to add a key that already exists in the dictionary, the new value will overwrite the existing value associated with that key. This means that the second occurrence of the key will replace the first occurrence, effectively updating the value for that key.

In [None]:
# Let's define the dictionary with two 'John' keys that have different values
contacts = {'John': '312-555-1234', 'John': '111-111-1234', 'Paul': '312-555-3123', 'George': '312-555-3333', 'Ringo': '312-555-2222'}

In [None]:
contacts # We can see that it only kept one 'John' with the latest value

### Accessing values using the keys

You can access the values in a dictionary by using the corresponding key in square brackets []. 

In [None]:
contacts['John']  # This is very similar to lists except that here we are using keys as indexes

In [None]:
contacts['Paul']

**Note:** Remember that Python is case sensitive, so keys are case sensitive as well. 

If you use the following code, it will produce an error as shown:

In [None]:
contacts['PAUL'] # We get a KeyError since the key PAUL is not in the dictionary

### Adding and Modifying Values

To add a new key-value pair to a dictionary, you can simply assign a value to a new key. If the key already exists, assigning a new value to it will modify the existing value.

`<Name of dictionary>[<key>] = value`

In [None]:
contacts['Mel'] = '480-111-2222'

In [None]:
contacts # Mel was added to the dictionary

In [None]:
contacts['Mel'] = '480-999-8888'
contacts # Mel value (number) was modified

In the previous example, the keys and values stored in the dictionary were all strings. However, values can be of any data type, and keys can be of any immutable (hashable) type, such as strings, numbers, or tuples.

In [None]:
student_names = {1: 'John', 2: 'Smith', 3: 'Matt', 4: 'Jimmy', 5: 'Sue'}
student_names[1]

In [None]:
student_names[2]

The main idea here is that the values are identified in a dictionary using the keys. 
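One caveat about key types is worth a quick sketch. Keys need to be *hashable*, which in practice means immutable types such as strings, numbers, and tuples. The dictionary `mixed_keys` and its contents below are made up for illustration only.

```python
# Hypothetical example data: keys of several immutable types in one dictionary
mixed_keys = {
    'name': 'John',                  # string key
    42: 'an integer key',            # integer key
    (40.7, -74.0): 'a tuple key',    # tuple key (tuples are immutable)
}

print(mixed_keys[42])
print(mixed_keys[(40.7, -74.0)])

# A list is mutable, so Python refuses to use it as a key
try:
    bad = {['a', 'b']: 'a list key'}
except TypeError as error:
    list_key_error = str(error)
    print('TypeError:', list_key_error)
```

The `try`/`except` block is only there so the cell keeps running after the error. Without it, the cell would stop with a `TypeError`.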

Values inside a dictionary can be any data type, including lists or other dictionaries.

In [None]:
student_data = {
    'Name': 'John', 
    'E-mail': 'john@gmail.com', 
    'Age': 28, 
    'subjects': ['math', 'science', 'history', 'geography'] 
}

In [None]:
student_data

In [None]:
# Remember that the keys are case sensitive 
student_data['Name']

In [None]:
student_data['E-mail']

In [None]:
student_data['Age']

In [None]:
student_data['subjects']

**Note**: The value associated with the key 'subjects' is a list.

So, to access 'science', we first access the `subjects` key in the dictionary. Then, since that `value` is a list, we access the corresponding element of the list by its index.

In [None]:
student_data['subjects'][1] 

In the student_data dictionary, the value associated with the key 'subjects' is the list ['math', 'science', 'history', 'geography']. Since lists are ordered and indexed starting from 0 in Python, we can access specific elements by their index.

For example, to access the element 'science' in the list, we use its index, which is 1. Remember that the index starts counting from 0, so the first element has an index of 0, the second element has an index of 1, and so on.

### Removing Values

You can remove a key-value pair from a dictionary using the del keyword:

In [None]:
del contacts["Mel"]

In [None]:
contacts

### Checking Key Existence

To check if a key exists in a dictionary, you can use the `in` keyword:

In [None]:
if "John" in contacts:
    print("Yes, 'John' is in the dictionary.")

### Dictionary Methods

Python provides several built-in methods to perform operations on dictionaries. Some commonly used methods include keys(), values(), items(), get(), and pop(). You can explore these methods in the Python documentation.

**`.keys()`** method: This method returns a view object that contains all the keys in the dictionary.

In [None]:
contacts.keys()

**`.values()`** method: This method returns a view object that contains all the values in the dictionary.

In [None]:
contacts.values()  # But this directly does not tell us to which keys these values are associated 

**`.items()`** method: This method returns a view object that contains all the key-value pairs in the dictionary as tuples.


In [None]:
contacts.items()
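Because each item is a `(key, value)` tuple, `.items()` is often combined with a `for` loop to walk through a dictionary pair by pair. The following is a minimal sketch that assumes a small `contacts` dictionary like the one used above. The list `lines` is a helper name introduced only for this example.

```python
# A contacts dictionary like the one above, repeated so the cell runs on its own
contacts = {'John': '312-555-1234', 'Paul': '312-555-3123',
            'George': '312-555-3333', 'Ringo': '312-555-2222'}

# Each item is a (key, value) tuple, so it can be unpacked into two variables
lines = []
for name, number in contacts.items():
    lines.append(f"{name}: {number}")

# Print one "name: number" line per contact
for line in lines:
    print(line)
```

Dictionaries keep their insertion order in current Python versions (3.7 and later), so the lines appear in the order in which the contacts were defined.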

**`.get(key)`** method: This method returns the value associated with the specified key. If the key is not found, it returns a default value (None by default) instead of raising an error.

The main difference between using the `get()` method and accessing a dictionary with `[]` is how they handle missing keys. When you access a dictionary using `[]` and provide a key that does not exist in the dictionary, it will raise a `KeyError` exception. On the other hand, the `get()` method provides a way to retrieve a value from a dictionary without raising an error if the key is not found. Instead of raising a `KeyError`, it returns a `default` value that you specify or `None` (indicating the key wasn't found). If the key is found, it returns the corresponding value. 

In [None]:
contacts.get('John')

In [None]:
print(contacts.get('Anna', 'Not found'))

Instead of returning an error, as accessing with contacts['Anna'] would do since the key does not exist, the `get()` method returns the default value we specified.

If we don't specify what to return if the key is not found, it returns as default the special value called `None`. `None` is a built-in constant in Python that represents the absence of a value.

In [None]:
print(contacts.get('Anna'))

When using the `print()` function in Python, the function displays the value that is passed to it and outputs it to the console. If the value is `None`, the `print()` function will explicitly display the word `None` as the output.

In contrast, when you write a variable in a Jupyter Notebook cell without explicitly using the print() function, the notebook environment automatically displays the value of the variable as the output of the cell. However, if the value of the variable is None, the notebook environment behaves differently. Instead of displaying `None` as the output, it simply shows nothing.

In [None]:
contacts.get('Anna')
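The `pop()` method mentioned at the start of this section is not demonstrated above, so here is a minimal sketch. It assumes a shortened, made-up version of the `contacts` dictionary. The variable names `mel_number` and `missing` are introduced only for this example.

```python
# A shortened contacts dictionary, defined here so the cell runs on its own
contacts = {'John': '312-555-1234', 'Paul': '312-555-3123', 'Mel': '480-999-8888'}

# pop() removes the key and hands back the value that was stored under it
mel_number = contacts.pop('Mel')
print(mel_number)          # 480-999-8888
print('Mel' in contacts)   # False

# With a second argument, pop() returns that default when the key is missing
missing = contacts.pop('Anna', 'Not found')
print(missing)             # Not found
```

Compared with `del`, `pop()` is handy when you still need the removed value. Like `[]`, calling `pop()` with a missing key and no default raises a `KeyError`.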

## Exercises

1. Use the dictionary given below. This is a dictionary that shows the frequency of the words in a text paragraph. The keys are the words, and the values are their frequencies.


In [None]:
word_freq = {'love': 25, 'conversation': 1, 'every': 6, "we're": 1, 'plate': 1, 'sour': 1, 'jukebox': 1, 'now': 11, 'taxi': 1, 'fast': 1, 'bag': 1, 'man': 1, 'push': 3, 'baby': 14, 'going': 1, 'you': 16, "don't": 2, 'one': 1, 'mind': 2, 'backseat': 1, 'friends': 1, 'then': 3, 'know': 2}

Answer the following questions:   

- How many *key-value* pairs are in this dictionary?
- What *keys* are present in this dictionary? 
- What is the frequency of following words in the dictionary: 
    - 'friends'
    - 'taxi' 
    - 'jukebox'
- Is the word `begin` present in the dictionary?
- Add the following words and their frequencies to the dictionary:
    - 'begin': 1
    - 'start': 2
    - 'over': 1
    - 'body': 17
- Use the built-in function `list()` to convert the result from `word_freq.keys()` to a list: `list(word_freq.keys())`.
  - Store the results of the above code in a variable called `word_list`.
  - What is the first word in `word_list`? What is the frequency of that word?
  - What is the last word in `word_list`? What is the frequency of that word?

In [2]:
word_freq = {'love': 25, 'conversation': 1, 'every': 6, "we're": 1, 'plate': 1, 'sour': 1, 'jukebox': 1, 'now': 11, 'taxi': 1, 'fast': 1, 'bag': 1, 'man': 1, 'push': 3, 'baby': 14, 'going': 1, 'you': 16, "don't": 2, 'one': 1, 'mind': 2, 'backseat': 1, 'friends': 1, 'then': 3, 'know': 2}

# Calculate the number of key-value pairs in the dictionary
print(f"there are {len(word_freq)} key-value pairs.")

# Get the keys present in the dictionary
print(f"the keys are: {list(word_freq.keys())}")

# Check the frequency of the words 'friends', 'taxi', and 'jukebox'
print(f"the frequency of 'friends' is: {word_freq.get('friends', 0)}")
print(f"the frequency of 'taxi' is: {word_freq.get('taxi', 0)}")
print(f"the frequency of 'jukebox' is: {word_freq.get('jukebox', 0)}")

# Check if the begin is present in the dictionary
print(f"is begin present in the dictionary? {'begin' in word_freq}")

# Add begin, start, over, and body
word_freq['begin'] = 1
word_freq['start'] = 2
word_freq['over'] = 1
word_freq['body'] = 17

# Convert the keys of the dictionary to a list and store the result in a variable
word_list = list(word_freq.keys())

# Get the first word in word_list and its frequency
first_word = word_list[0]
print(f"The first word in word_list is '{first_word}' and its frequency is {word_freq[first_word]}.")

# Get the last word in word_list and its frequency
last_word = word_list[-1]
print(f"The last word in word_list is '{last_word}' and its frequency is {word_freq[last_word]}.")

there are 23 key-value pairs.
the keys are: ['love', 'conversation', 'every', "we're", 'plate', 'sour', 'jukebox', 'now', 'taxi', 'fast', 'bag', 'man', 'push', 'baby', 'going', 'you', "don't", 'one', 'mind', 'backseat', 'friends', 'then', 'know']
the frequency of 'friends' is: 1
the frequency of 'taxi' is: 1
the frequency of 'jukebox' is: 1
is begin present in the dictionary? False
The first word in word_list is 'love' and its frequency is 25.
The last word in word_list is 'body' and its frequency is 17.


2. Can a dictionary have two key-value pairs with the same key?

In [None]:
# No. Dictionary keys must be unique; assigning to an existing key overwrites its value.

3. Can a dictionary have two key-value pairs with the same value but different keys?

In [None]:
# Yes. A dictionary can have multiple key-value pairs with the same value but different keys.

## Summary

It is always crucial to know what kind of variable you have in your hands because this determines how to retrieve the data. 

- lists -> accessed by **index**
- dictionaries -> accessed by **key** (a key is usually a string, but it can also be a number, which should not be confused with a list index)

Lists are easily identified because they start and end with square brackets: `[ ]`

In contrast, dictionaries start and end with curly brackets: `{ }`.
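As a quick side-by-side sketch of the two access styles, the cell below uses made-up data. The names `subjects` and `phone_book` are hypothetical and exist only for this comparison.

```python
# Hypothetical data used only for this comparison
subjects = ['math', 'science', 'history']          # a list
phone_book = {'John': '312-555-1234', 1: 'one'}    # a dictionary

print(type(subjects))      # <class 'list'>
print(type(phone_book))    # <class 'dict'>

first_subject = subjects[0]         # position 0 in the list
johns_number = phone_book['John']   # value stored under the key 'John'
number_key = phone_book[1]          # here 1 is a key, not a position

print(first_subject, johns_number, number_key)
```

Both use square brackets, so checking `type()` first is the safest way to know whether the brackets expect a position or a key.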

# Additional Content: sets and tuples

In our previous lesson, we briefly mentioned that there are other data structures we will cover in more detail later in the bootcamp. However, as an additional topic, let's briefly touch upon them in case you feel comfortable with lists and dictionaries and want to get a head start in understanding more about data structures before the bootcamp begins. This will give you a deeper understanding of Python right from the start.

- **Sets**: Sets are unordered collections of unique elements enclosed in curly braces { }. They do not allow duplicate values and provide operations like union, intersection, and difference. Sets are useful for membership testing and eliminating duplicates from a sequence. (MUTABLE)

- **Tuples**: Tuples are similar to lists but are immutable, meaning their elements cannot be modified once created. They are enclosed in parentheses ( ) and commonly used to represent fixed collections of related values. (IMMUTABLE)

### Sets 

Sets are an unordered collection of unique elements in Python. Here are some key points about sets:

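- Sets are defined using curly braces {} or the set() function. Note that empty braces {} create an empty dictionary, so an empty set must be created with set().
- Sets can contain elements of different data types, such as integers, strings, or tuples, as long as the elements are immutable (hashable). A set cannot contain a list or another regular set.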
- Sets do not allow duplicate elements. If you try to add a duplicate element to a set, it will be ignored.
- Sets are mutable, meaning you can add or remove elements from them.
- You can perform various operations on sets, such as union, intersection, difference, and more.

In [None]:
# Creating a set
fruits = {'apple', 'banana', 'orange'}
print(fruits)  # Possible output: {'apple', 'banana', 'orange'} (sets are unordered, so the order may differ)

# Adding elements to a set
fruits.add('grape')
print(fruits)  # Now contains 'apple', 'banana', 'orange', and 'grape' (in any order)

# Removing elements from a set
fruits.remove('banana')
print(fruits)  # Now contains 'apple', 'orange', and 'grape' (in any order)
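The union, intersection, and difference operations mentioned in the key points can be sketched as follows. The sets `citrus` and `breakfast` and the list `votes` are made-up example data.

```python
# Hypothetical example sets
citrus = {'orange', 'lemon', 'lime'}
breakfast = {'orange', 'banana', 'apple'}

in_either = citrus | breakfast      # union: elements in at least one set
in_both = citrus & breakfast        # intersection: elements in both sets
only_citrus = citrus - breakfast    # difference: in citrus but not in breakfast

print(in_both)                      # {'orange'}
print(sorted(in_either))            # sorted() returns a list in alphabetical order
print(sorted(only_citrus))          # ['lemon', 'lime']

# Duplicates disappear when a list is converted to a set
votes = ['apple', 'banana', 'apple', 'apple']
unique_votes = set(votes)
print(len(unique_votes))            # 2
print('banana' in unique_votes)     # True (membership test)
```

`sorted()` is used for printing because a set has no guaranteed order. The same operations are also available as the methods `.union()`, `.intersection()`, and `.difference()`.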

### Tuples

Tuples are similar to lists but are immutable, meaning they cannot be modified once created. Here are some key points about tuples:

- Tuples are defined using parentheses () or the tuple() function.
- Tuples can contain elements of different data types, just like lists.
- Once a tuple is created, you cannot add, remove, or modify its elements.
- Tuples are commonly used to store related pieces of information together.

In [None]:
# Creating a tuple
person = ('John', 25, 'USA')
print(person)  # Output: ('John', 25, 'USA')

In [None]:
# Accessing elements of a tuple
name = person[0]
age = person[1]
print(name, age)  # Output: John 25

In [None]:
# Trying to modify a tuple (this will raise a TypeError)
person[0] = 'Jane'
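The following sketch shows one way to observe the error without stopping the cell, and how to build a new tuple when a changed value is needed. The names `updated_person`, `name`, `age`, and `country` are introduced only for this example.

```python
person = ('John', 25, 'USA')

# Item assignment is not supported by tuples, so this raises a TypeError
try:
    person[0] = 'Jane'
except TypeError as error:
    print('TypeError:', error)

print(person)  # ('John', 25, 'USA'), the tuple is unchanged

# To "change" a value, build a new tuple instead of modifying the old one
updated_person = ('Jane',) + person[1:]
print(updated_person)  # ('Jane', 25, 'USA')

# Tuples can be unpacked into separate variables in one step
name, age, country = person
print(name, age, country)  # John 25 USA
```

Note the trailing comma in `('Jane',)`: it is what makes a one-element tuple rather than a string in parentheses.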

# Additional Content: nested dictionaries 

In Python, a nested dictionary is a dictionary where the values are themselves dictionaries. This can be useful when you need to organize data into multiple levels or categories. Here are some key points about nested dictionaries:

- A nested dictionary can have multiple levels of nesting, with each level representing a specific category or subcategory.
- You can access the values in a nested dictionary by specifying the keys at each level.
- You can add, modify, or remove elements from a nested dictionary, just like with regular dictionaries.
- Each level of a nested dictionary can have different keys and values, providing flexibility in structuring your data.


In [None]:
# Creating a nested dictionary
student_data = {
    'John': {
        'age': 20,
        'major': 'Computer Science',
        'grades': [85, 90, 78]
    },
    'Jane': {
        'age': 22,
        'major': 'Biology',
        'grades': [92, 88, 95]
    }
}

In [None]:
# Accessing values in a nested dictionary
john_age = student_data['John']['age']
jane_major = student_data['Jane']['major']
john_grades = student_data['John']['grades']

print(john_age)  # Output: 20
print(jane_major)  # Output: Biology
print(john_grades)  # Output: [85, 90, 78]

In [None]:
# Modifying values in a nested dictionary
student_data['John']['major'] = 'Electrical Engineering'
student_data['Jane']['grades'].append(97)

In [None]:
# Adding a new student to the nested dictionary
student_data['Sarah'] = {
    'age': 19,
    'major': 'Physics',
    'grades': [90, 91, 88]
}

In [None]:
# Removing a student from the nested dictionary
del student_data['John']

print(student_data)

In [None]:
student_data.keys()

In [None]:
student_data.values()

As we have seen in this lesson, the `keys` of a dictionary are most commonly:

- strings
- numbers

In the example above, the dictionary `keys` are strings. In this particular case, the `values` are also dictionaries.

In [None]:
student_data["Jane"].keys()

In [None]:
student_data["Jane"].values()

💡 Check for understanding: how would you access Jane's first grade?

In [None]:
# Try it yourself here

We mentioned nested dictionaries (dictionaries inside dictionaries) but as you can see in the example, we can also have lists inside dictionaries. 

You can also have lists in which each element is a dictionary. Therefore, the number of possible combinations and levels of nesting is endless.

In [None]:
my_data = [{'Mike': [25,23000],'Jane': [38, 40000],'Bill': [45,35000]},{'Developers': ['Mike','Bill']},{'HR': ['Jane']}]
my_data

This is an example of a list with three elements, where each element is a dictionary. When working with nested data structures like this, it is important to understand the type of variable you are dealing with.

Let's check the type of the variable `my_data`:

In [None]:
type(my_data)

Let's access the first element of the list. Since the variable is a **list**, we use **indexing**. Indexing allows us to retrieve specific elements from a list by their position. In this case, to access the first element, we use the index 0 since Python starts counting from 0.

In [None]:
my_data[0]

This element is a dictionary, which can be determined by its structure and the presence of key-value pairs. However, if you're unsure about the data type, you can use the `type()` function to confirm it.

In [None]:
type(my_data[0])

Now let's access the data of 'Jane'. Since this element is a **dictionary**, we need to access its values using the corresponding **keys**.

In [None]:
my_data[0]['Jane']

As we can see, we get another **list**. To access the last element of this list, we can use its **index**.

In [None]:
type(my_data[0]['Jane']) # Let's make sure it's a list

In [None]:
my_data[0]['Jane'][1]
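The same chain of lookups can also be written one step at a time, which may make it easier to see what type you hold at each level. This is only a sketch. The intermediate names `people`, `jane`, and `last_value` are introduced here for illustration.

```python
# The same data as above, repeated so the cell runs on its own
my_data = [
    {'Mike': [25, 23000], 'Jane': [38, 40000], 'Bill': [45, 35000]},
    {'Developers': ['Mike', 'Bill']},
    {'HR': ['Jane']},
]

people = my_data[0]       # list -> index -> dictionary
jane = people['Jane']     # dictionary -> key -> list
last_value = jane[-1]     # list -> index -> number (-1 means the last element)

print(type(people), type(jane), type(last_value))
print(last_value)                             # 40000
print(last_value == my_data[0]['Jane'][1])    # True
```

Splitting the chain like this is a convenient habit while exploring unfamiliar data. Once the structure is clear, the one-line form `my_data[0]['Jane'][1]` does the same thing.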