# Data Structures: Hashmaps, Sets, Hash Tables, Hashing and Collisions
> Observing hashmaps with Python dictionaries
- toc: true
- image: /images/python.png
- categories: []
- type: pbl
- week: 28

## What is a Hashtable/Hashmap?

> A hashtable is a data structure that stores a collection of key-value pairs, where each key maps to a value, and the keys must be unique and hashable.

- In Python, the built-in hashtable is the **dictionary** (`dict`).

> The primary purpose of a hashtable is to provide efficient lookup, insertion, and deletion operations. When an element is to be inserted into the hashtable, a hash function is used to map the key to a specific index in the underlying array that is used to store the key-value pairs. The value is then stored at that index. When searching for a value, the hash function is used again to find the index where the value is stored.
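
To make the insert and lookup steps above concrete, here is a minimal sketch of the idea. The names `CAPACITY`, `bucket_index`, `put`, and `get` are made up for illustration, and the sketch deliberately ignores collisions; it is not how Python's `dict` is implemented internally.

```python
# Minimal sketch: map keys to slots in a fixed-size array (no collision handling).
CAPACITY = 8  # hypothetical table size, chosen only for illustration

def bucket_index(key, capacity=CAPACITY):
    """Reduce the key's hash to a valid index in the underlying array."""
    return hash(key) % capacity

slots = [None] * CAPACITY  # the underlying array

def put(key, value):
    # Insert: hash the key, then store the pair at that index.
    slots[bucket_index(key)] = (key, value)

def get(key):
    # Lookup: hash the key again and jump straight to the same index.
    entry = slots[bucket_index(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None

put(42, "answer")
put(7, "lucky")
print(bucket_index(42))  # 2, because hash(42) is 42 in CPython and 42 % 8 == 2
print(get(42))           # answer
print(get(5))            # None
```

Integer keys are used here because their hashes are predictable; string hashes change between runs of the interpreter, so the index for a string key would differ from run to run.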

> The key advantage of a hashtable over other data structures like arrays and linked lists is its average-case time complexity for lookup, insertion, and deletion operations.

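- The typical (average-case) time complexity of lookup, insertion, and deletion in a hashtable is **constant, O(1)**. In the worst case, when many keys collide, it degrades to O(n).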
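
One way to see the difference from a list is to count equality checks instead of timing anything. The sketch below assumes a hypothetical `CountingKey` class that increments a counter every time `==` is evaluated; the exact dictionary count depends on the interpreter, so treat the numbers as indicative.

```python
class CountingKey:
    """Hypothetical key type that counts how many equality checks are made."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return isinstance(other, CountingKey) and self.value == other.value

N = 1000
as_list = [CountingKey(i) for i in range(N)]
as_dict = {CountingKey(i): i for i in range(N)}

CountingKey.comparisons = 0
found_in_list = CountingKey(N - 1) in as_list  # scans the list element by element
list_cost = CountingKey.comparisons

CountingKey.comparisons = 0
found_in_dict = CountingKey(N - 1) in as_dict  # hashes the key, then checks its slot
dict_cost = CountingKey.comparisons

print(list_cost, dict_cost)
```

Searching for the last element makes the list perform one comparison per element (1000 here), while the dictionary needs only a very small number of comparisons no matter how large `N` is.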


## What is Hashing and Collision?

> Hashing is the process of mapping a given key to a position in a hash table or hashmap, using a hash function. The hash function takes the key as input and produces a hash value or hash code, which is then used to determine the index in the underlying array where the value is stored. The purpose of hashing is to provide a quick and efficient way to access data, by eliminating the need to search through an entire data structure to find a value.

> However, it is possible for two different keys to map to the same hash value, resulting in a collision. When a collision occurs, there are different ways to resolve it, depending on the collision resolution strategy used.
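
One common resolution strategy is separate chaining, where every slot holds a small list of the pairs that landed there. The `ChainedHashMap` class below is a hypothetical sketch of that strategy only; CPython's own `dict` uses a different approach (open addressing).

```python
class ChainedHashMap:
    """Illustrative hash map that resolves collisions by separate chaining."""

    def __init__(self, capacity=4):
        # Each bucket is a list of (key, value) pairs that share an index.
        self.buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))       # new key (possibly a collision): chain it

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

table = ChainedHashMap(capacity=4)
table.put(1, "one")
table.put(5, "five")  # 1 % 4 == 5 % 4 == 1, so both keys land in bucket 1
table.put(1, "uno")   # same key again: replaces the value, no duplicate entry

print(table.buckets[1])  # [(1, 'uno'), (5, 'five')]
print(table.get(5))      # five
print(table.get(9))      # None (9 also maps to bucket 1, but was never stored)
```

The longer a chain gets, the more pairs have to be compared on each lookup, which is why many collisions slow the table down.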

> Python's dictionary implementation is optimized to handle collisions efficiently, and the performance of the dictionary is generally very good, even in the presence of collisions. However, if the number of collisions is very high, the performance of the dictionary can degrade. Built-in types such as strings and integers already come with good hash functions, so this mainly matters when you define `__hash__` for your own classes and use their instances as dictionary keys.
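
As a sketch of what a reasonable hash for a custom key could look like, the hypothetical `Track` class below hashes exactly the fields that `__eq__` compares, so equal objects always get equal hashes.

```python
class Track:
    """Hypothetical key type: equal tracks must produce equal hashes."""

    def __init__(self, album, number):
        self.album = album
        self.number = number

    def __eq__(self, other):
        return (isinstance(other, Track)
                and (self.album, self.number) == (other.album, other.number))

    def __hash__(self):
        # Hash the same fields that __eq__ compares.
        return hash((self.album, self.number))

titles = {Track("Lover", 2): "Cruel Summer"}
print(titles[Track("Lover", 2)])    # Cruel Summer
print(Track("Lover", 3) in titles)  # False
```

A `__hash__` that returned the same number for every object would still give correct results, but every key would collide and lookups would slow down to a linear scan.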

## What is a Set?

In [1]:
# Creating a set using set() function
my_set = set([1, 2, 3, 2, 1])
print(my_set)  

# What do you notice in the output?
# The duplicates are gone: a set keeps only one copy of each element,
# so [1, 2, 3, 2, 1] becomes {1, 2, 3}.

# Why do you think Sets are in the same tech talk as Hashmaps/Hashtables?
# Both store unique, hashable items: set elements behave like dictionary keys without values.
# Both are collections used to store elements in computer programs.
# Both rely on the same strategy, a hash function plus an underlying array, for fast lookups.

{1, 2, 3}
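
Because sets are hash-based, membership tests and set algebra are fast, and only hashable elements are allowed. The short sketch below uses two made-up groupings of producer names purely as sample data.

```python
# Sample data for illustration only; the groupings are not meaningful.
producers_a = {"Taylor Swift", "Jack Antonoff", "Joel Little"}
producers_b = {"Jack Antonoff", "Louis Bell", "Frank Dukes"}

print("Jack Antonoff" in producers_a)     # True
print(sorted(producers_a & producers_b))  # ['Jack Antonoff']
print(len(producers_a | producers_b))     # 5

list_is_hashable = True
try:
    {["Pop", "Synth-pop"]}  # lists are mutable, so they cannot be hashed
except TypeError:
    list_is_hashable = False
print(list_is_hashable)                   # False
```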


## Dictionary Example

Below are just some basic features of a dictionary. As always, the documentation is the main source for the full capabilities.

In [8]:
# Creating a dictionary with information about the album "Lover"
lover_album = {
    "title": "Lover",
    "artist": "Taylor Swift",
    "year": 2019,
    "genre": ["Pop", "Synth-pop"],
    "tracks": {
        1: "I Forgot That You Existed",
        2: "Cruel Summer",
        3: "Lover",
        4: "The Man",
        5: "The Archer",
        6: "I Think He Knows",
        7: "Miss Americana & The Heartbreak Prince",
        8: "Paper Rings",
        9: "Cornelia Street",
        10: "Death By A Thousand Cuts",
        11: "London Boy",
        12: "Soon You'll Get Better (feat. Dixie Chicks)",
        13: "False God",
        14: "You Need To Calm Down",
        15: "Afterglow",
        16: "Me! (feat. Brendon Urie of Panic! At The Disco)",
        17: "It's Nice To Have A Friend",
        18: "Daylight"
    }
}

# What data structures do you see?
# dictionary (the album itself and the nested "tracks" dictionary)
# list ("genre"), plus string and integer values

# Printing the dictionary
print(lover_album)

{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}}


In [3]:
# Retrieve value from dictionary with key
print(lover_album.get('tracks'))
# or
print(lover_album['tracks'])

{1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}
{1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}
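
The two lookups print the same thing when the key exists, but they behave differently when it does not. A small sketch, using a cut-down stand-in dictionary named `album` and a key `"label"` that is assumed to be missing:

```python
album = {"title": "Lover", "year": 2019}  # small stand-in for lover_album

print(album.get("label"))             # None
print(album.get("label", "unknown"))  # unknown

try:
    album["label"]
except KeyError as error:
    print("KeyError:", error)         # KeyError: 'label'
```

So `get()` is the safer choice when a key might be absent, while square brackets are useful when a missing key should be treated as an error.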


In [9]:
# Retrieve value from a dictionary inside a dictionary
print(lover_album.get('tracks')[4])
# or
print(lover_album['tracks'][4])

The Man
The Man


In [10]:
# adding a value with a new key
lover_album["producer"] = set(['Taylor Swift', 'Jack Antonoff', 'Joel Little', 'Taylor Swift', 'Louis Bell', 'Frank Dukes'])

# What can you change to make sure there are no duplicate producers?
# Wrapping the list in set(), as above, drops the repeated 'Taylor Swift' entry


# Printing the dictionary
print(lover_album)




{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}, 'producer': {'Frank Dukes', 'Jack Antonoff', 'Taylor Swift', 'Louis Bell', 'Joel Little'}}


In [11]:
# Adding a key-value pair to the dictionary stored under an existing key
lover_album["tracks"].update({19: "All Of The Girls You Loved Before"})

# How would you add an additional genre to the dictionary, like electropop?
# "genre" holds a list, so use the list's append method: lover_album["genre"].append("Electropop")


# Printing the dictionary
print(lover_album)

{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight', 19: 'All Of The Girls You Loved Before'}, 'producer': {'Frank Dukes', 'Jack Antonoff', 'Taylor Swift', 'Louis Bell', 'Joel Little'}}
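
For the genre question, a possible approach is sketched below on a cut-down stand-in dictionary named `album`. The value under `"genre"` is a list, so the list method `append` applies; the membership check is an optional extra to avoid adding the same genre twice.

```python
album = {"genre": ["Pop", "Synth-pop"]}  # small stand-in for lover_album

new_genre = "Electropop"
if new_genre not in album["genre"]:  # skip the append if it is already listed
    album["genre"].append(new_genre)

print(album["genre"])  # ['Pop', 'Synth-pop', 'Electropop']
```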


In [12]:
# Print lover_album in more readable format
for k,v in lover_album.items(): # iterate using a for loop for key and value
    print(str(k) + ": " + str(v))

# Write your own code to print tracks in readable format
#
#

title: Lover
artist: Taylor Swift
year: 2019
genre: ['Pop', 'Synth-pop']
tracks: {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight', 19: 'All Of The Girls You Loved Before'}
producer: {'Frank Dukes', 'Jack Antonoff', 'Taylor Swift', 'Louis Bell', 'Joel Little'}


In [18]:
for track_number, track_title in lover_album["tracks"].items():
    print(f"Track {track_number}: {track_title}")

Track 1: I Forgot That You Existed
Track 2: Cruel Summer
Track 3: Lover
Track 4: The Man
Track 5: The Archer
Track 6: I Think He Knows
Track 7: Miss Americana & The Heartbreak Prince
Track 8: Paper Rings
Track 9: Cornelia Street
Track 10: Death By A Thousand Cuts
Track 11: London Boy
Track 12: Soon You'll Get Better (feat. Dixie Chicks)
Track 13: False God
Track 14: You Need To Calm Down
Track 15: Afterglow
Track 16: Me! (feat. Brendon Urie of Panic! At The Disco)
Track 17: It's Nice To Have A Friend
Track 18: Daylight
Track 19: All Of The Girls You Loved Before


In [24]:
# Using a conditional to look up a value by a user-supplied key
def search():
    query = input("What would you like to know about the album? ")
    result = lover_album.get(query.lower())  # None if the key is missing
    if result is None:
        print("Invalid Search")
    else:
        print(result)

search()




2019


In [26]:
# This is a very basic code segment, how can you improve upon this code?
def search_album():
    search_key = input("What would you like to know about the album?")
    search_result = lover_album.get(search_key.lower(), "Invalid Search")
    print(search_result)

search_album()


['Pop', 'Synth-pop']


In [32]:
def lookup_album(search_key):
    # Membership test first, then a direct lookup
    if search_key.lower() in lover_album:
        print(lover_album[search_key.lower()])
    else:
        print("Invalid search.")

lookup_album(input("What would you like to know about the album? "))


['Pop', 'Synth-pop']
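
One further improvement is to separate the lookup from the input and printing, so the function can be reused and tested. The sketch below assumes a hypothetical helper named `lookup` that takes the dictionary as a parameter and returns the result instead of printing it.

```python
def lookup(album, query, default="Invalid search."):
    """Return album[query], ignoring case and surrounding whitespace in the query."""
    return album.get(query.strip().lower(), default)

album = {"title": "Lover", "artist": "Taylor Swift", "year": 2019}  # stand-in data
print(lookup(album, "  Year "))  # 2019
print(lookup(album, "label"))    # Invalid search.
```

The caller can then decide whether to print the result, and where the query comes from (`input()`, a web form, a test).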


## Hacks

- Answer *ALL* questions in the code segments
- Create a diagram or comparison illustration (Canva).
    - What are the pros and cons of using this data structure?
    - Dictionary vs List
- Expand upon the code given to you, possible improvements in comments
- Build your own album showing features of a Python dictionary

- For Mr. Yeung's class: Justify your favorite Taylor Swift song, answer may affect seed

# My own album!

In [98]:
Map_of_the_Soul_album = {
    "title": "Map of the Soul",
    "artist": "BTS",
    "year": 2020,
    "genre": ["Pop", "R&B", "hip hop"],
    "tracks": {
        1: "ON",
        2: "Black Swan",
        3: "Intro: Persona",
        4: "Moon",
        5: "My Time",
        6: "00:00 (zero o’clock)",
        7: "Dionysus",
        8: "Filter",
        9: "We Are Bulletproof : The Eternal",
        10: "Louder Than Bombs",
        11: "UGH!",
        12: "Interlude : Shadow",
        13: "Outro : Ego",
        14: "Respect",
        15: "Inner Child",
        16: "Friends",
        17: "Make It Right",
        18: "Jamais Vu"
    }
}

print("Title:", Map_of_the_Soul_album["title"])
print("Artist:", Map_of_the_Soul_album["artist"])
print("Year:", Map_of_the_Soul_album["year"])
print("Genre:", ", ".join(Map_of_the_Soul_album["genre"]))
print("Tracks:")

for track_number, track_title in Map_of_the_Soul_album["tracks"].items():
    print(track_number, track_title)


Title: Map of the Soul
Artist: BTS
Year: 2020
Genre: Pop, R&B, hip hop
Tracks:
1 ON
2 Black Swan
3 Intro: Persona
4 Moon
5 My Time
6 00:00 (zero o’clock)
7 Dionysus
8 Filter
9 We Are Bulletproof : The Eternal
10 Louder Than Bombs
11 UGH!
12 Interlude : Shadow
13 Outro : Ego
14 Respect
15 Inner Child
16 Friends
17 Make It Right
18 Jamais Vu


In [105]:
import random

# Get the tracks dictionary from the album dictionary
tracks_dict = Map_of_the_Soul_album["tracks"]
print(tracks_dict)


{1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu'}


In [106]:
# Get a list of track titles
track_titles = list(tracks_dict.values())
print(track_titles)



['ON', 'Black Swan', 'Intro: Persona', 'Moon', 'My Time', '00:00 (zero o’clock)', 'Dionysus', 'Filter', 'We Are Bulletproof : The Eternal', 'Louder Than Bombs', 'UGH!', 'Interlude : Shadow', 'Outro : Ego', 'Respect', 'Inner Child', 'Friends', 'Make It Right', 'Jamais Vu']


In [109]:
# Randomly select a track title
selected_title = random.choice(track_titles)
print(selected_title)


Dionysus


In [110]:
# Print the selected title
print("Randomly selected BTS song: {}".format(selected_title))


Randomly selected BTS song: Dionysus


In [70]:
print(Map_of_the_Soul_album['tracks'])

{1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu'}


In [72]:
print(Map_of_the_Soul_album.get('tracks')[1])


ON


In [73]:
Map_of_the_Soul_album["tracks"].update({19: "Dynamite"})

# Printing the dictionary
print(Map_of_the_Soul_album)

{'title': 'Map of the Soul', 'artist': 'BTS', 'year': 2020, 'genre': ['Pop', 'R&B', 'hip hop'], 'tracks': {1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu', 19: 'Dynamite'}}


In [112]:
# Get the tracks dictionary from the album dictionary
tracks_dict = Map_of_the_Soul_album["tracks"]

# Get user input for a track number
track_number = input("Enter a track number (1-18): ")

# Check if the track number is valid
if track_number.isdigit() and int(track_number) in tracks_dict:
    # Get the track title for the given track number
    track_title = tracks_dict[int(track_number)]
    print("Track title: {}".format(track_title))
else:
    print("Invalid track number.")


Track title: We Are Bulletproof : The Eternal
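
A slightly more defensive variant of the check above is sketched here. It assumes a hypothetical helper `track_title` that converts the input with `int()` inside `try`/`except` instead of relying on `str.isdigit()`, which returns `True` for some characters (such as `"²"`) that `int()` cannot parse.

```python
def track_title(tracks, raw_number):
    """Return the title for raw_number, or None if the input is not a valid track."""
    try:
        number = int(raw_number.strip())
    except ValueError:
        return None
    return tracks.get(number)

tracks = {1: "ON", 2: "Black Swan", 3: "Intro: Persona"}  # shortened stand-in
print(track_title(tracks, "2"))    # Black Swan
print(track_title(tracks, "two"))  # None
print(track_title(tracks, "99"))   # None
print(track_title(tracks, "²"))    # None
```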


# Accessing the dictionary

### get(), keys(), values(), items()

In [56]:
x = Map_of_the_Soul_album.get("tracks")
print(x)


{1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu'}


In [57]:
x = Map_of_the_Soul_album.keys()
print(x)



dict_keys(['title', 'artist', 'year', 'genre', 'tracks'])


In [58]:
x = Map_of_the_Soul_album.values()
print(x)

dict_values(['Map of the Soul', 'BTS', 2020, ['Pop', 'R&B', 'hip hop'], {1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu'}])


In [60]:
x = Map_of_the_Soul_album.items()
print(x)

dict_items([('title', 'Map of the Soul'), ('artist', 'BTS'), ('year', 2020), ('genre', ['Pop', 'R&B', 'hip hop']), ('tracks', {1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu'})])


In [69]:
if "tracks" in Map_of_the_Soul_album :
  print("yes!")
else :
  print("no")

yes!
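
The `dict_keys`, `dict_values`, and `dict_items` objects printed above are views rather than copied lists, which means they follow later changes to the dictionary. A small sketch on a stand-in dictionary named `album`:

```python
album = {"title": "Map of the Soul", "artist": "BTS"}  # small stand-in
keys = album.keys()  # a view, not a copied list

album["year"] = 2020  # modify the dictionary afterwards
print(list(keys))                          # ['title', 'artist', 'year']
print("year" in keys)                      # True
print(("artist", "BTS") in album.items())  # True
```

Wrap a view in `list()` when a fixed snapshot is needed.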


# Removing Items

In [81]:
Map_of_the_Soul_album.pop("title")
print(Map_of_the_Soul_album)

{'artist': 'BTS', 'year': 2020, 'genre': ['Pop', 'R&B', 'hip hop'], 'tracks': {1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu', 19: 'Dynamite'}, '1': 'ON!', 1: 'ON!'}


In [82]:
Map_of_the_Soul_album.popitem()
print(Map_of_the_Soul_album)

{'artist': 'BTS', 'year': 2020, 'genre': ['Pop', 'R&B', 'hip hop'], 'tracks': {1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu', 19: 'Dynamite'}, '1': 'ON!'}


In [83]:
del Map_of_the_Soul_album["artist"]
print(Map_of_the_Soul_album)


{'year': 2020, 'genre': ['Pop', 'R&B', 'hip hop'], 'tracks': {1: 'ON', 2: 'Black Swan', 3: 'Intro: Persona', 4: 'Moon', 5: 'My Time', 6: '00:00 (zero o’clock)', 7: 'Dionysus', 8: 'Filter', 9: 'We Are Bulletproof : The Eternal', 10: 'Louder Than Bombs', 11: 'UGH!', 12: 'Interlude : Shadow', 13: 'Outro : Ego', 14: 'Respect', 15: 'Inner Child', 16: 'Friends', 17: 'Make It Right', 18: 'Jamais Vu', 19: 'Dynamite'}, '1': 'ON!'}
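
The three removal options differ in what they return and in how they treat a missing key. The sketch below runs them on a small stand-in dictionary named `album`; the key `"label"` is assumed to be absent.

```python
album = {"title": "Map of the Soul", "artist": "BTS", "year": 2020}

removed = album.pop("title")            # pop returns the removed value
print(removed)                          # Map of the Soul
print(album.pop("label", "no label"))   # no label (the default avoids a KeyError)

last_key, last_value = album.popitem()  # removes the most recently inserted pair
print(last_key, last_value)             # year 2020

try:
    del album["label"]                  # del has no default value
except KeyError:
    print("label was not present")

print(album)                            # {'artist': 'BTS'}
```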


# Dictionary Methods

In [92]:
# Define a dictionary of method descriptions
print("Methods for Dictionaries")
print("------------------------")
method_desc = {
    "clear()": "Removes all the elements from the dictionary",
    "copy()": "Returns a copy of the dictionary",
    "fromkeys()": "Returns a dictionary with the specified keys and value",
    "get()": "Returns the value of the specified key",
    "items()": "Returns a view of the dictionary's (key, value) tuples",
    "keys()": "Returns a view of the dictionary's keys",
    "pop()": "Removes the element with the specified key and returns its value",
    "popitem()": "Removes and returns the last inserted key-value pair",
    "setdefault()": "Returns the value of the specified key. If the key does not exist: insert the key, with the specified value",
    "update()": "Updates the dictionary with the specified key-value pairs",
    "values()": "Returns a view of the dictionary's values"
}

# Iterate over the method descriptions and print them in a readable way
for method, desc in method_desc.items():
    print(f"{method}: {desc}")


Methods for Dictionaries
------------------------
clear(): Removes all the elements from the dictionary
copy(): Returns a copy of the dictionary
fromkeys(): Returns a dictionary with the specified keys and value
get(): Returns the value of the specified key
items(): Returns a view of the dictionary's (key, value) tuples
keys(): Returns a view of the dictionary's keys
pop(): Removes the element with the specified key and returns its value
popitem(): Removes and returns the last inserted key-value pair
setdefault(): Returns the value of the specified key. If the key does not exist: insert the key, with the specified value
update(): Updates the dictionary with the specified key-value pairs
values(): Returns a view of the dictionary's values
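
A few of the less familiar methods in the list are easier to understand in action. The sketch below uses made-up rating values and a small stand-in dictionary named `album` to show `fromkeys()`, `setdefault()`, and the fact that `copy()` is shallow.

```python
ratings = dict.fromkeys(["ON", "Moon"], 0)  # every key starts with the same value
ratings.setdefault("Filter", 5)             # key missing: inserted with 5
ratings.setdefault("ON", 5)                 # key present: left unchanged
print(ratings)                              # {'ON': 0, 'Moon': 0, 'Filter': 5}

album = {"genre": ["Pop", "R&B"]}
shallow = album.copy()                      # new dict, but the same inner list object
shallow["genre"].append("hip hop")
print(album["genre"])                       # ['Pop', 'R&B', 'hip hop']
```

Because the copy is shallow, changing the nested list through `shallow` also changes what `album` sees; `copy.deepcopy()` from the standard library is the usual choice when fully independent nested data is needed.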


# Diagram

# Dictionary vs. List

![image.png](attachment:image.png)

# What are the pros and cons of using this data structure?

![image.png](attachment:image.png)