### Sets
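A set stores a collection of unique values, which makes it an efficient way to remove duplicates.
- Similar to lists, but unordered and without duplicate values
- Usage: storing unique items
- Syntax: {item1, item2, ...} (use set() for an empty set, because {} creates a dictionary)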
- Type of data: Set
- Order: Unordered
- Indexing: No
- Duplicate Values: Not Allowed
- Mutability: Mutable 

In [1]:
job_skills = {'tableau', 'sql', 'python', 'statistics'}

In [None]:
job_skills  # sets are unordered: the display order can differ from the order of definition; it often looks alphabetical in the notebook, but don't rely on any order

{'python', 'sql', 'statistics', 'tableau'}

In [3]:
job_skills[1]  # indexing selects items by position in a list, but sets have no positions, so this raises a TypeError

TypeError: 'set' object is not subscriptable
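
Because a set has no positions, the usual alternatives to indexing are a membership test with `in`, iteration, or conversion to a sorted list when a stable order is needed. A minimal sketch, using an illustrative copy of the set (the names `skills`, `has_sql`, `ordered_skills` and `first_skill` are assumptions made for this example):

```python
# Illustrative copy of the set used in this section
skills = {'tableau', 'sql', 'python', 'statistics'}

# Membership test: the usual way to ask whether an item is in a set
has_sql = 'sql' in skills

# Sets are iterable, but the iteration order is not guaranteed
for skill in skills:
    print(skill)

# Convert to a sorted list when positions are needed
ordered_skills = sorted(skills)
first_skill = ordered_skills[0]
print(has_sql, first_skill)
```

Sorting gives a predictable order, so `ordered_skills[0]` is well defined even though `skills[0]` is not.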

In [4]:
job_skills.add('looker')

In [5]:
job_skills

{'looker', 'python', 'sql', 'statistics', 'tableau'}

In [6]:
job_skills.add('sql')

In [7]:
job_skills  # 'sql' was already in the set, so adding it again has no effect; it still appears only once

{'looker', 'python', 'sql', 'statistics', 'tableau'}
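
One way to confirm that a duplicate `add` changes nothing is to compare the set's length before and after. The sketch below also uses `update()`, which adds several items at once; the variable names and the extra skill `'excel'` are assumptions made for illustration only:

```python
# Illustrative copy of the set at this point in the section
skills = {'looker', 'python', 'sql', 'statistics', 'tableau'}

size_before = len(skills)
skills.add('sql')              # already present, so nothing changes
size_after_add = len(skills)

# update() adds every item from an iterable; duplicates are ignored
skills.update(['excel', 'python'])
size_after_update = len(skills)

print(size_before, size_after_add, size_after_update)
```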

In [8]:
job_skills.pop('tableau')  # unlike list.pop(), set.pop() does not accept an argument

TypeError: set.pop() takes no arguments (1 given)

In [None]:
# sets are unordered and not indexed, so pop() cannot target a specific value
job_skills.pop()  # removes and returns one arbitrary element

'python'

In [10]:
job_skills

{'looker', 'sql', 'statistics', 'tableau'}

In [11]:
# to remove a specific item, use remove() (it raises a KeyError if the item is not in the set)
job_skills.remove('tableau')

In [12]:
job_skills

{'looker', 'sql', 'statistics'}
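
`remove()` raises a `KeyError` when the item is missing, while the built-in `discard()` method removes the item only if it is present. A short sketch of the difference, on an illustrative copy of the set (the name `skills` and the flag `remove_failed` are assumptions for this example):

```python
# Illustrative copy of the set at this point in the section
skills = {'looker', 'sql', 'statistics'}

# remove() fails loudly when the item is not in the set
remove_failed = False
try:
    skills.remove('tableau')
except KeyError:
    remove_failed = True

# discard() does nothing when the item is not in the set
skills.discard('tableau')

# discard() removes the item when it is present
skills.discard('looker')

print(remove_failed, sorted(skills))
```

Which one to use depends on whether a missing item should be treated as an error.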

In [None]:
# List with duplicate values
skill_list = ['python', 'sql', 'statistics', 'tableau', 'python', 'sql', 'statistics', 'tableau']

In [14]:
# Converting a list to a set keeps only the unique values
set(skill_list)

{'python', 'sql', 'statistics', 'tableau'}

In [15]:
list(set(skill_list))  # convert back to a list; the order of the result is arbitrary

['python', 'sql', 'tableau', 'statistics']
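
Since `list(set(...))` returns the values in an arbitrary order, two common alternatives are sorting the set or removing duplicates with `dict.fromkeys`, which keeps the order of first appearance. This is a sketch on a hypothetical list, `raw_skills`, chosen so that the two results differ:

```python
# Hypothetical list with duplicates, not in alphabetical order
raw_skills = ['sql', 'python', 'sql', 'tableau', 'python']

# Option 1: unique values in sorted order
unique_sorted = sorted(set(raw_skills))

# Option 2: unique values in order of first appearance
# (dict keys are unique and keep insertion order in Python 3.7+)
unique_in_order = list(dict.fromkeys(raw_skills))

print(unique_sorted)
print(unique_in_order)
```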

#### Practice

Add a new job title 'AI Specialist' to the set of unique job titles and print the updated set. The initial set of unique_job_titles is {'Data Scientist', 'Data Analyst', 'Machine Learning Engineer'}.

In [16]:
unique_job_titles = {'Data Scientist', 'Data Analyst', 'Machine Learning Engineer'}

In [17]:
unique_job_titles

{'Data Analyst', 'Data Scientist', 'Machine Learning Engineer'}

In [18]:
unique_job_titles.add('AI Specialist')

In [19]:
unique_job_titles

{'AI Specialist',
 'Data Analyst',
 'Data Scientist',
 'Machine Learning Engineer'}

Remove the job title 'Data Analyst' from the set of unique job titles and print the updated set. The initial set of unique_job_titles is {'Data Scientist', 'Data Analyst', 'Machine Learning Engineer'}.

In [20]:
unique_job_titles = {'Data Scientist', 'Data Analyst', 'Machine Learning Engineer'}

In [21]:
unique_job_titles.remove('Data Analyst')
print(unique_job_titles)

{'Data Scientist', 'Machine Learning Engineer'}


Create a set named unique_job_locations from a list of job locations for data science roles and print the set. The list is ['New York', 'San Francisco', 'New York', 'Austin', 'San Francisco'].

In [22]:
job_locations = ['New York', 'San Francisco', 'New York', 'Austin', 'San Francisco']

In [24]:
unique_job_locations = set(job_locations)
print(unique_job_locations)

{'New York', 'Austin', 'San Francisco'}


Find the union of two sets of job skills and print the result. The first set is {'Python', 'SQL', 'Tableau'} and the second set is {'R', 'SQL', 'Machine Learning'}.

In [25]:
help(set)  # list the methods available on sets to find one that computes a union

Help on class set in module builtins:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |  
 |  Build an unordered collection of unique elements.
 |  
 |  Methods defined here:
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iand__(self, value, /)
 |      Return self&=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __ior__(self, value, /)
 |      Return self|=value.
 |  
 |  __isub__(self, value, /)
 |      Return self-=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __ixor__(self, value, /)
 |      Re

In [26]:
skills_set1 = {'Python', 'SQL', 'Tableau'}
skills_set2 = {'R', 'SQL', 'Machine Learning'}

In [28]:
all_skills = skills_set1.union(skills_set2)
print(all_skills)

{'Tableau', 'Python', 'R', 'Machine Learning', 'SQL'}
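
The same union can also be written with the `|` operator, and the related operators `&` and `-` give the intersection and the difference. A minimal sketch with the two sets from this exercise (the result names `shared_skills` and `only_in_first` are assumptions for illustration):

```python
skills_set1 = {'Python', 'SQL', 'Tableau'}
skills_set2 = {'R', 'SQL', 'Machine Learning'}

# Union: every skill that appears in either set
all_skills = skills_set1 | skills_set2

# Intersection: skills that appear in both sets
shared_skills = skills_set1 & skills_set2

# Difference: skills in the first set that are not in the second
only_in_first = skills_set1 - skills_set2

print(sorted(all_skills))
print(shared_skills)
print(sorted(only_in_first))
```

Neither operand is modified: each operator returns a new set.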
