# Sets

Here's a quick comparison of Python's four built-in container data types:

| Feature          | List                                  | Dictionary                           | Set                                | Tuple                             |
|------------------|---------------------------------------|--------------------------------------|------------------------------------|-----------------------------------|
| Syntax           | `[item1, item2, ...]`                 | `{'key1': value1, 'key2': value2}`   | `{item1, item2, ...}`              | `(item1, item2, ...)` or `item,`  |
| Order            | Ordered                               | Insertion-ordered (Python 3.7+)      | Unordered                          | Ordered                           |
| Indexing         | Yes (by index)                        | Yes (by key)                         | No                                 | Yes (by index)                    |
| Duplicate Values | Allowed                               | Values can be duplicated, keys cannot| Not allowed                        | Allowed                           |
| Mutability       | Mutable                               | Mutable                              | Mutable                            | Immutable                         |
| Usage            | For a collection of ordered items     | For key-value pairs                  | For unique items                   | For fixed data                    |
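
As a rough illustration of the rows above, the sketch below builds one container of each type from the same made-up values (the variable names are hypothetical and not part of the lesson) and checks how each one treats duplicates and indexing.

```python
# Hypothetical sample data, used only to illustrate the comparison table.
skills_list = ['python', 'sql', 'python']     # list: ordered, duplicates kept
skills_tuple = ('python', 'sql', 'python')    # tuple: ordered, immutable, duplicates kept
skills_set = {'python', 'sql', 'python'}      # set: duplicates collapse to one element
skills_dict = {'python': 3, 'sql': 5}         # dict: unique keys mapped to values

print(len(skills_list))    # 3
print(len(skills_tuple))   # 3
print(len(skills_set))     # 2
print(skills_list[0])      # python (indexing by position)
print(skills_dict['sql'])  # 5 (lookup by key)
```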

In [5]:
job_skills = {'tableau', 'sql', 'python', 'statistics'}

print(job_skills)
print(type(job_skills))

{'tableau', 'statistics', 'python', 'sql'}
<class 'set'>


In [4]:
# Sets have no index, so subscripting raises a TypeError
job_skills[1]

TypeError: 'set' object is not subscriptable
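
Since a set cannot be indexed, the usual alternatives are a membership test, iteration, or conversion to a sorted list when a position is really needed. A minimal sketch, reusing the `job_skills` values from above (`ordered_skills` is a name introduced here for illustration):

```python
job_skills = {'tableau', 'sql', 'python', 'statistics'}

# Membership test instead of indexing
print('sql' in job_skills)     # True
print('excel' in job_skills)   # False

# Iteration works, but the order is arbitrary
for skill in job_skills:
    print(skill)

# If a position is needed, convert to a sorted list first
ordered_skills = sorted(job_skills)
print(ordered_skills[0])       # python
```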

In [6]:
help(set)

Help on class set in module builtins:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |
 |  Build an unordered collection of unique elements.
 |
 |  Methods defined here:
 |
 |  __and__(self, value, /)
 |      Return self&value.
 |
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __ge__(self, value, /)
 |      Return self>=value.
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __gt__(self, value, /)
 |      Return self>value.
 |
 |  __iand__(self, value, /)
 |      Return self&=value.
 |
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |
 |  __ior__(self, value, /)
 |      Return self|=value.
 |
 |  __isub__(self, value, /)
 |      Return self-=value.
 |
 |  __iter__(self, /)
 |      Implement iter(self).
 |
 |  __ixor__(self, value, /)
 |      Return self^=value.
 |
 |  ...

In [7]:
print("Before: ", job_skills)

job_skills.add('looker')

print("After: ", job_skills)

Before:  {'tableau', 'statistics', 'python', 'sql'}
After:  {'statistics', 'tableau', 'python', 'sql', 'looker'}


In [9]:
# Sets hold only unique values, so adding a duplicate leaves the set unchanged

print("Before: ", job_skills)

job_skills.add('sql')

print("After: ", job_skills)

Before:  {'statistics', 'tableau', 'python', 'sql', 'looker'}
After:  {'statistics', 'tableau', 'python', 'sql', 'looker'}
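
To add several values in one call, `update()` accepts any iterable and applies the same uniqueness rule. A small sketch, assuming a fresh copy of the original four skills ('excel' is a made-up extra value):

```python
job_skills = {'tableau', 'sql', 'python', 'statistics'}

# update() adds every element of the iterable; 'sql' is already present and is ignored
job_skills.update(['looker', 'sql', 'excel'])

print(len(job_skills))         # 6
print('looker' in job_skills)  # True
```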


In [10]:
# pop() takes no arguments, so you can't choose which element to remove;
# called with no arguments, it removes and returns an arbitrary element

job_skills.pop('tableau')

TypeError: set.pop() takes no arguments (1 given)
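
For comparison, this is roughly what a valid `pop()` call looks like. Which element comes back is arbitrary, so the sketch only checks the size of the set afterwards (`removed` is a name introduced here):

```python
job_skills = {'tableau', 'sql', 'python', 'statistics', 'looker'}

# pop() with no arguments removes and returns an arbitrary element
removed = job_skills.pop()

print(removed)                 # one of the five skills; which one is not guaranteed
print(len(job_skills))         # 4
print(removed in job_skills)   # False
```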

In [11]:
# Use remove() to remove a specific item
print("Before: ", job_skills)

job_skills.remove('tableau')

print("After: ", job_skills)

Before:  {'statistics', 'tableau', 'python', 'sql', 'looker'}
After:  {'statistics', 'python', 'sql', 'looker'}
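
One thing to watch: `remove()` raises a `KeyError` if the item is not in the set, while `discard()` does nothing in that case. A short sketch, starting from the set as it stands after 'tableau' has been removed:

```python
job_skills = {'statistics', 'python', 'sql', 'looker'}

# remove() raises KeyError when the element is missing
try:
    job_skills.remove('tableau')
except KeyError as err:
    print("remove() raised KeyError:", err)

# discard() silently ignores a missing element
job_skills.discard('tableau')

# discard() removes the element when it is present
job_skills.discard('looker')

print(sorted(job_skills))   # ['python', 'sql', 'statistics']
```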


In [15]:
# Sets are an efficient way to remove duplicate values (the original order is not preserved)

skill_list = ['python', 'sql', 'statistics', 'tableau', 'python', 'sql', 'statistics', 'tableau']
print("Before: ", skill_list, " , Type: ", type(skill_list))

skill_set = set(skill_list)
print("Converted to set: ", skill_set, ", Type: ", type(skill_set))

# convert it back to list
skill_list = list(skill_set)

print("After: ", skill_list, ", Type: ", type(skill_list))


Before:  ['python', 'sql', 'statistics', 'tableau', 'python', 'sql', 'statistics', 'tableau']  , Type:  <class 'list'>
Converted to set:  {'tableau', 'statistics', 'python', 'sql'} , Type:  <class 'set'>
After:  ['tableau', 'statistics', 'python', 'sql'] , Type:  <class 'list'>
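
Note that the round trip through `set` loses the original order of the list. If order matters, one common alternative (not used in the cell above) is `dict.fromkeys`, which keeps the first occurrence of each item on Python 3.7+; `deduped` is a name introduced for this sketch:

```python
skill_list = ['python', 'sql', 'statistics', 'tableau', 'python', 'sql', 'statistics', 'tableau']

# dict keys are unique and keep insertion order (Python 3.7+)
deduped = list(dict.fromkeys(skill_list))

print(deduped)   # ['python', 'sql', 'statistics', 'tableau']
```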


# Problems

## Add Job Title to Set (1.10.1) - Problem

In [16]:
unique_job_titles = {'Data Scientist', 'Data Analyst', 'Machine Learning Engineer'}

unique_job_titles.add('AI Specialist')

print(unique_job_titles)

{'Data Scientist', 'Data Analyst', 'AI Specialist', 'Machine Learning Engineer'}


## Remove Job Title from Set (1.10.2) - Problem

In [17]:
unique_job_titles = {'Data Scientist', 'Data Analyst', 'Machine Learning Engineer'}

unique_job_titles.remove('Data Analyst')

print(unique_job_titles)

{'Data Scientist', 'Machine Learning Engineer'}


## Create Unique Job Locations Set (1.10.3) - Problem

In [19]:
job_locations = ['New York', 'San Francisco', 'New York', 'Austin', 'San Francisco']

unique_job_locations = set(job_locations)

print(unique_job_locations)

{'San Francisco', 'Austin', 'New York'}


## Union of Job Skills Sets (1.10.4) - Problem

In [21]:
skills_set1 = {'Python', 'SQL', 'Tableau'}
skills_set2 = {'R', 'SQL', 'Machine Learning'}

combined_set = skills_set1.union(skills_set2)

print(combined_set)

{'Tableau', 'R', 'Python', 'SQL', 'Machine Learning'}
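
The same union can be written with the `|` operator, and the related operators `&`, `-` and `^` follow the same pattern. A brief sketch using the two sets from this problem (the result names are introduced here for illustration):

```python
skills_set1 = {'Python', 'SQL', 'Tableau'}
skills_set2 = {'R', 'SQL', 'Machine Learning'}

combined = skills_set1 | skills_set2          # union, same result as .union()
shared = skills_set1 & skills_set2            # intersection: in both sets
only_first = skills_set1 - skills_set2        # difference: in the first set only
either_not_both = skills_set1 ^ skills_set2   # symmetric difference

print(combined == skills_set1.union(skills_set2))  # True
print(shared)                                      # {'SQL'}
print(sorted(only_first))                          # ['Python', 'Tableau']
print(sorted(either_not_both))                     # ['Machine Learning', 'Python', 'R', 'Tableau']
```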
