**Use specialized Python collections to create more efficient programs!**

### Sets

**Introduction to Python Sets**

In Python, a set is a group of elements that are unordered and do not contain duplicates.

For example, imagine two groups of items that share some elements and differ in others. Using set mathematics, we can find the items they have in common, find the items unique to each group, combine the groups in different ways, and more!

There is also an immutable version of a set called a frozenset. A frozenset behaves like a normal set, but it does not include any methods that modify it.

This lesson covers:

* How to create a set and a frozenset.  
* How to add to a set (we won't be able to mutate a frozenset).
* How to remove from a set (we won't be able to mutate a frozenset).
* How to find specific elements in a set and a frozenset.  
* How to perform set operations such as unions, intersections, and more.

**A set object can be created by passing an iterable object into its constructor, using curly braces, or using a set comprehension.**

In [1]:
music_genres = {'country', 'punk', 'rap', 'techno', 'pop','latin'}

In [2]:
music_genres

{'country', 'latin', 'pop', 'punk', 'rap', 'techno'}

In [3]:
music_genres1 = set(['country', 'punk', 'rap', 'techno', 'pop', 'latin'])

In [4]:
music_genres1

{'country', 'latin', 'pop', 'punk', 'rap', 'techno'}

In [5]:
music_genres_3 = set(['country', 'punk', 'rap', 'pop', 'pop', 'pop'])

In [6]:
music_genres_3

{'country', 'pop', 'punk', 'rap'}

In [7]:
# Elements can be of mixed data types as long as each one is hashable; duplicates are dropped
music_difference = {70, 'music times', 'categories', True, 'country', 45.7}

In [8]:
music_difference

{45.7, 70, True, 'categories', 'country', 'music times'}

In [9]:
#Creating an empty set
empty_genres = set()

In [10]:
empty_genres

set()

In [11]:
# Creating a set with a set comprehension over an iterable such as a list
items = ['country', 'punk', 'rap', 'techno', 'pop', 'latin']

In [12]:
music_genres = {category for category in items if category[0]=='p'}

In [13]:
music_genres

{'pop', 'punk'}

In [14]:
genre_results = ['rap', 'classical', 'rock', 'rock', 'country', 'rap', 'rock', 'latin', 'country', 'k-pop', 'pop', 'rap', 'rock', 'k-pop',  'rap', 'k-pop', 'rock', 'rap', 'latin', 'pop', 'pop', 'classical', 'pop', 'country', 'rock', 'classical', 'country', 'pop', 'rap', 'latin']

# Write your code below!
survey_genres = set(genre_results)
print(survey_genres)
survey_abbreviated = {genre[0:3] for genre in genre_results}
print(survey_abbreviated)

{'k-pop', 'pop', 'classical', 'country', 'rock', 'latin', 'rap'}
{'pop', 'lat', 'cla', 'k-p', 'roc', 'cou', 'rap'}


#### Creating a Frozenset

Unlike a normal set, we can only create a frozenset using its constructor. Using a frozenset means that we cannot modify the elements inside of it.

In [15]:
#Creating a frozenset from a list
frozen_music_genres = frozenset(['country', 'punk', 'rap', 'techno', 'pop', 'latin'])

In [16]:
frozen_music_genres

frozenset({'country', 'latin', 'pop', 'punk', 'rap', 'techno'})

In [17]:
#Create an empty frozenset
empty_frozen_music_genres = frozenset()

In [18]:
top_genres = ['rap', 'rock', 'pop']

In [19]:
#Creating a frozenset to preserve this data, and prevent
#it from being modified

In [20]:
frozen_top_genres = frozenset(top_genres)

In [21]:
frozen_top_genres

frozenset({'pop', 'rap', 'rock'})
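To make the immutability concrete, here is a minimal sketch of what happens when we try to change a frozenset. The flag name `add_failed` is only illustrative; the point is that a frozenset has no `.add()` method at all, so the attempt fails with an `AttributeError`.

```python
# Build a frozenset from a list of genres
frozen_top_genres = frozenset(['rap', 'rock', 'pop'])

# frozenset defines no mutating methods, so looking up .add() fails
try:
    frozen_top_genres.add('jazz')
    add_failed = False
except AttributeError:
    add_failed = True

print(add_failed)                    # True
print('jazz' in frozen_top_genres)   # False, the contents are unchanged
```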

#### Adding to a Set

There are two ways to add elements to a set: the .add() method and the .update() method.

In [22]:
#.add() method can add a single element to the set
# Create a set to hold the song tags

In [23]:
song_tags = {'country', 'folk', 'acoustic'}

In [24]:
song_tags

{'acoustic', 'country', 'folk'}

In [25]:
song_tags.add('guitar')
song_tags.add('country')

In [26]:
print(song_tags)

{'country', 'guitar', 'acoustic', 'folk'}


In [27]:
#.update() method can add multiple elements

In [28]:
song_tags = {'country', 'folk', 'acoustic'}
# Add more tags using an iterable object (such as a list of elements)
other_tags = ['live', 'blues', 'acoustic']
song_tags.update(other_tags)

In [29]:
song_tags

{'acoustic', 'blues', 'country', 'folk', 'live'}

* Neither of these methods will add a duplicate item to a set.
* A frozenset cannot have any items added to it.
* Elements are not necessarily printed in the order in which they were added, as set and frozenset containers are unordered.
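The points above can be checked with a small sketch. The variable names (`size_after_add`, `size_after_update`, `letters`) are made up for illustration. It also shows one thing to watch for: .update() iterates over its argument, so passing a bare string adds its individual characters rather than the whole word.

```python
song_tags = {'country', 'folk', 'acoustic'}

# 'country' is already present, so the set does not grow
song_tags.add('country')
size_after_add = len(song_tags)

# Only 'live' is new; 'folk' is ignored as a duplicate
song_tags.update(['live', 'folk'])
size_after_update = len(song_tags)

# A string is an iterable of characters, so update('pop') adds 'p' and 'o'
letters = set()
letters.update('pop')

print(size_after_add)          # 3
print(size_after_update)       # 4
print(letters == {'p', 'o'})   # True
```

To add a whole word with .update(), wrap it in a list, or use .add() instead.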

In [30]:
song_data = {'Retro Words': ['pop', 'warm', 'happy', 'electric']}


In [31]:
song_data

{'Retro Words': ['pop', 'warm', 'happy', 'electric']}

In [32]:
set(song_data['Retro Words'])

{'electric', 'happy', 'pop', 'warm'}

In [33]:
song_data = {'Retro Words': ['pop', 'warm', 'happy', 'electric']}

user_tag_1 = 'warm'
user_tag_2 = 'exciting'
user_tag_3 = 'electric'

# Write your code below!
tag_set = set(song_data['Retro Words'])
tag_set.update([user_tag_1, user_tag_2, user_tag_3])
song_data['Retro Words'] = tag_set

In [34]:
song_data

{'Retro Words': {'electric', 'exciting', 'happy', 'pop', 'warm'}}

#### Removing from a Set

In [35]:
# Two methods for removing specific elements from a set

In [36]:
# The .remove() method searches for an element within the set
# and removes it if it exists; otherwise, a KeyError is raised.

In [37]:
# Given a set of song tags
song_tags = {'guitar', 'acoustic', 'folk', 'country', 'live', 'blues'}

In [38]:
song_tags

{'acoustic', 'blues', 'country', 'folk', 'guitar', 'live'}

In [39]:
# Remove an existing element
song_tags.remove('folk')

In [40]:
song_tags

{'acoustic', 'blues', 'country', 'guitar', 'live'}

In [41]:
song_tags.remove('feeble')

KeyError: 'feeble'

In [None]:
# The .discard() method works the same way but does not raise
# an exception if the element is not present.

In [None]:
song_tags.discard('guitar')

In [None]:
song_tags

In [None]:
song_tags.discard('pop')

In [None]:
# Items cannot be removed from a frozenset
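As a rough side-by-side of the two removal methods, the sketch below guards .remove() with a try/except and then calls .discard() on both a missing and a present element. The flag names `removed` and `frozen_has_remove` are illustrative only.

```python
song_tags = {'guitar', 'acoustic', 'country', 'live', 'blues'}

# .remove() raises KeyError for a missing element, so guard it when unsure
try:
    song_tags.remove('feeble')
    removed = True
except KeyError:
    removed = False

# .discard() is silent whether or not the element is present
song_tags.discard('feeble')
song_tags.discard('guitar')

# A frozenset offers neither method
frozen_has_remove = hasattr(frozenset(song_tags), 'remove')

print(removed)              # False
print(sorted(song_tags))    # ['acoustic', 'blues', 'country', 'live']
print(frozen_has_remove)    # False
```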

In [None]:
song_data_users = {'Retro Words': ['pop', 'onion', 'warm', 'helloworld', 'happy', 'spam', 'electric']}

# Write your code below!
tag_set = set(song_data_users['Retro Words'])

In [None]:
tag_set

In [None]:
tag_set.remove('onion')
tag_set.remove('helloworld')
tag_set.remove('spam')

song_data_users['Retro Words'] = tag_set

In [None]:
song_data_users

#### Finding elements in a Set

Items in a set or frozenset cannot be accessed by index, as both containers are unordered and have no indices. But, like most Python containers, we can use the **in** keyword to test whether an element is in a set or frozenset.
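A short sketch of both halves of that statement: indexing a set fails with a `TypeError`, while **in** (and **not in**) work on sets and frozensets alike. The result variable names are chosen just for this example.

```python
song_tags = {'guitar', 'acoustic', 'folk'}
frozen_tags = frozenset(song_tags)

# Sets have no positions, so subscripting is not supported
try:
    song_tags[0]
    indexable = True
except TypeError:
    indexable = False

has_folk = 'folk' in song_tags          # membership test on a set
has_rock = 'rock' in frozen_tags        # membership test on a frozenset
lacks_rock = 'rock' not in song_tags    # negated membership test

print(indexable, has_folk, has_rock, lacks_rock)   # False True False True
```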

In [None]:
# Given a set of song tags
song_tags = {'guitar', 'acoustic', 'folk', 'country', 'live', 'blues'}

In [None]:
song_tags

In [None]:
print('country' in song_tags)

In [None]:
#Also works for frozenset
frozen_tags = frozenset(song_tags)

In [None]:
frozen_tags

In [None]:
type(song_tags)

In [None]:
print('rock' in frozen_tags)

In [None]:
allowed_tags = ['pop', 'hip-hop', 'rap', 'dance', 'electronic', 'latin', 'indie', 'alternative rock', 'classical', 'k-pop', 'country', 'rock', 'metal', 'jazz', 'exciting', 'sad', 'happy', 'upbeat', 'party', 'synth', 'rhythmic', 'emotional', 'relationship', 'warm', 'guitar', 'fiddle', 'romance', 'chill', 'swing']

song_data_users = {'Retro Words': ['pop', 'explosion', 'hammer', 'bomb', 'warm', 'due', 'writer', 'happy', 'horrible', 'electric', 'mushroom', 'shed']}

# Write your code below!
tag_set = set(song_data_users['Retro Words'])
bad_tags = []
for i in tag_set:
  if i not in allowed_tags:
    bad_tags.append(i)

for tag in bad_tags:
  tag_set.remove(tag)

song_data_users['Retro Words'] = tag_set

In [None]:
song_data_users

#### Introduction to Set Operations
* Unions
* Intersections (and Intersection Updates)
* Difference (and Difference Updates)
* Symmetric Differences (and Symmetric Difference Updates)
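As a quick preview of the operations listed above, here is a minimal sketch using the operator forms on two small, made-up tag sets `a` and `b`. Each operator also has a method form (.union(), .intersection(), .difference(), .symmetric_difference()), and the *_update methods change the set in place instead of returning a new one.

```python
a = {'rock', 'synth', 'pop'}
b = {'rock', 'jazz'}

union = a | b                   # every element found in either set
intersection = a & b            # elements found in both sets
difference = a - b              # elements in a that are not in b
symmetric_difference = a ^ b    # elements in exactly one of the two sets

# In-place variant: c itself is modified and nothing new is returned
c = {'rock', 'synth', 'pop'}
c.difference_update(b)

print(intersection)             # {'rock'}
print(c == difference)          # True
```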

#### Set Union

One of the most common set operations is a merge, performed with the .union() method or the | operator. Either one returns a new set or frozenset containing all elements from both operands, without duplicates.

In [None]:
# Given a set and a frozenset of song tags for two
# Python-related hits

In [None]:
prepare_to_py = {'rock', 'heavy metal', 'electric guitar', 'synth'}

In [None]:
prepare_to_py

In [None]:
py_and_dry = frozenset({'classic', 'rock', 'electric guitar', 'rock and roll'})

In [None]:
py_and_dry

In [None]:
#Get the union using the .union() method 
combined_tags = prepare_to_py.union(py_and_dry)

In [None]:
print(combined_tags)

In [None]:
# The result is a set, because .union() was called on a set (prepare_to_py)

#### Using |

In [None]:
# Get the union using the | operator
frozen_combined_tags = py_and_dry | prepare_to_py

In [None]:
print(frozen_combined_tags)

In [None]:
# The result is a frozenset, because the left operand (py_and_dry) is a frozenset
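When a set and a frozenset are combined, the type of the result follows the left operand. The sketch below reuses the two tag collections from this section; `set_first` and `frozen_first` are names introduced only for this example.

```python
prepare_to_py = {'rock', 'heavy metal', 'electric guitar', 'synth'}
py_and_dry = frozenset({'classic', 'rock', 'electric guitar', 'rock and roll'})

set_first = prepare_to_py | py_and_dry      # set on the left
frozen_first = py_and_dry | prepare_to_py   # frozenset on the left

print(type(set_first).__name__)      # set
print(type(frozen_first).__name__)   # frozenset
print(set_first == frozen_first)     # True, the elements are the same
```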

To improve the logic for adding user tags to songs in the app, we can use the union of tag sets! Our team has provided us two dictionaries.

The first dictionary (song_data) contains song data including tags from the original artists, while the second dictionary (user_tag_data) includes tags that have been added by users. Let’s attempt to merge the tag sets together so we have a full collection of tags.

First, create an empty dictionary called new_song_data which will hold the merged tag data.

In [None]:
song_data = {'Retro Words': ['pop', 'warm', 'happy', 'electronic'],
             'Wait For Limit': ['rap', 'upbeat', 'romance'],
             'Stomping Cue': ['country', 'fiddle', 'party'],
             'Lowkey Space': ['electronic', 'dance', 'synth']}

user_tag_data = {'Lowkey Space': ['party', 'synth', 'fast', 'upbeat'],
                 'Retro Words': ['happy', 'electronic', 'fun', 'exciting'],
                 'Wait For Limit': ['romance', 'chill', 'rap', 'rhythmic'], 
                 'Stomping Cue': ['country', 'swing', 'party', 'instrumental']}

# Write your code below!
new_song_data = {}
for song, artist_tags in song_data.items():
  song_tag_set = set(artist_tags)
  user_tag_set = set(user_tag_data[song])
  new_song_data[song] = song_tag_set.union(user_tag_set)

print(new_song_data)

#### Set Intersection

In [None]:
# .intersection() (or the & operator) returns a new set holding the elements common to both sets.
# .intersection_update() instead updates the original set in place to contain that result.
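The difference between the two methods is easiest to see side by side. This is a small sketch with made-up tag sets `recent` and `other`; the helper names `common` and `unchanged_before_update` exist only in this example.

```python
recent = {'pop', 'electronic', 'synth'}
other = {'electronic', 'dance', 'synth'}

# .intersection() returns a new set and leaves recent untouched
common = recent.intersection(other)
unchanged_before_update = recent == {'pop', 'electronic', 'synth'}

# The & operator gives the same result as .intersection()
same_as_operator = common == (recent & other)

# .intersection_update() modifies recent in place and returns None
recent.intersection_update(other)

print(sorted(common))    # ['electronic', 'synth']
print(sorted(recent))    # ['electronic', 'synth']
```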

In [None]:
song_data = {'Retro Words': ['pop', 'warm', 'happy', 'electronic', 'synth'],
             'Wait For Limit': ['rap', 'upbeat', 'romance'],
             'Stomping Cue': ['country', 'fiddle', 'party'],
             'Lowkey Space': ['electronic', 'dance', 'synth', 'upbeat'],
             'Back To Art': ['pop', 'sad', 'emotional', 'relationship'],
             'Blinding Era': ['rap', 'intense', 'moving', 'fast'],
             'Down To Green Hills': ['country', 'relaxing', 'vocal', 'emotional'],
             'Double Lights': ['electronic', 'chill', 'relaxing', 'piano', 'synth']}

user_recent_songs = {'Retro Words': ['pop', 'warm', 'happy', 'electronic', 'synth'],
                     'Lowkey Space': ['electronic', 'dance', 'synth', 'upbeat']}

# Checkpoint 1
tags_int = set(user_recent_songs['Retro Words']) & set(user_recent_songs['Lowkey Space'])

# Checkpoint 2
recommended_songs = {}
for key, val in song_data.items():
    for tag in val:
        if tag in tags_int:
            if key not in user_recent_songs:
                recommended_songs[key] = val

print(recommended_songs)

In [None]:
song_data = {'Retro Words': ['pop', 'warm', 'happy', 'electronic', 'synth'],
             'Wait For Limit': ['rap', 'upbeat', 'romance'],
             'Stomping Cue': ['country', 'fiddle', 'party'],
             'Lowkey Space': ['electronic', 'dance', 'synth', 'upbeat'],
             'Back To Art': ['pop', 'sad', 'emotional', 'relationship'],
             'Blinding Era': ['rap', 'intense', 'moving', 'fast'],
             'Down To Green Hills': ['country', 'relaxing', 'vocal', 'emotional'],
             'Double Lights': ['electronic', 'chill', 'relaxing', 'piano', 'synth']}

user_recent_songs = {'Retro Words': ['pop', 'warm', 'happy', 'electronic', 'synth'],
                     'Lowkey Space': ['electronic', 'dance', 'synth', 'upbeat']}

# Checkpoint 1
tags_int = set(user_recent_songs['Retro Words']) & set(user_recent_songs['Lowkey Space'])

In [None]:
recommended_songs = {}

In [None]:
for k, v in song_data.items():
    for tag in v:
        if tag in tags_int:
            if k not in user_recent_songs:
                recommended_songs[k] = v

In [None]:
recommended_songs
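The nested loop above can also be written with a set intersection per song. This is one possible alternative rather than the exercise's intended solution, shown on a trimmed copy of the data; the name `recommended` is an assumption made for this sketch. An empty intersection is falsy, so `tags_int & set(tags)` works directly as the filter condition.

```python
song_data = {'Retro Words': ['pop', 'warm', 'happy', 'electronic', 'synth'],
             'Wait For Limit': ['rap', 'upbeat', 'romance'],
             'Lowkey Space': ['electronic', 'dance', 'synth', 'upbeat'],
             'Double Lights': ['electronic', 'chill', 'relaxing', 'piano', 'synth']}

user_recent_songs = {'Retro Words': ['pop', 'warm', 'happy', 'electronic', 'synth'],
                     'Lowkey Space': ['electronic', 'dance', 'synth', 'upbeat']}

# Tags shared by both recently played songs
tags_int = set(user_recent_songs['Retro Words']) & set(user_recent_songs['Lowkey Space'])

# Keep songs the user has not played that share at least one tag with tags_int
recommended = {title: tags
               for title, tags in song_data.items()
               if title not in user_recent_songs and tags_int & set(tags)}

print(recommended)
# {'Double Lights': ['electronic', 'chill', 'relaxing', 'piano', 'synth']}
```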

In [42]:
import csv

def process_csv_supplies():
    data = []
    with open('supplies_data.csv', 'r') as csvfile:
        r = csv.reader(csvfile)
        for row in r:
            data.append(tuple(row))

    print(data)
process_csv_supplies()

[['item', 'num_pallets', 'importance'], ['nylon', '10', 'unimportant'], ['leather', '4', 'unimportant'], ['wool', '1', 'important'], ['leather', '1', 'unimportant'], ['nylon', '1', 'unimportant'], ['polyester', '2', 'important'], ['silk', '1', 'important'], ['cotton', '5', 'important'], ['cotton', '5', 'important'], ['leather', '3', 'unimportant'], ['silk thread', '5', 'important'], ['nylon', '6', 'unimportant'], ['cotton', '3', 'unimportant'], ['leather', '8', 'unimportant'], ['polyester', '2', 'unimportant'], ['cotton thread', '5', 'important'], ['denim', '8', 'important'], ['cotton thread', '10', 'unimportant'], ['silk thread', '9', 'important'], ['cotton', '5', 'important'], ['elastic thread', '7', 'unimportant'], ['polyester thread', '7', 'important'], ['polyester', '7', 'important'], ['polyester thread', '9', 'important'], ['polyester', '6', 'unimportant'], ['denim thread', '8', 'unimportant'], ['silk', '2', 'important'], ['nylon', '6', 'unimportant'], ['elastic thread', '1', 'un