### [Python Sets and Set Theory](https://www.datacamp.com/community/tutorials/sets-in-python)

### Initialize a Set

A set is a mutable, **unordered** collection of distinct (unique) immutable values. More precisely, the values must be hashable.

In [1]:
emptySet = set()
emptySet

set()

Initialize a set by passing a list to `set()`.

In [2]:
dataScientist = set(['Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'])
dataScientist

{'Git', 'Python', 'R', 'SAS', 'SQL', 'Tableau'}

In [3]:
dataEngineer = set(['Python', 'Java', 'Scala', 'Git', 'SQL', 'Hadoop'])
dataEngineer

{'Git', 'Hadoop', 'Java', 'Python', 'SQL', 'Scala'}

Initialize by using curly braces.

In [4]:
dataScientist = {'Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'}
dataScientist

{'Git', 'Python', 'R', 'SAS', 'SQL', 'Tableau'}

In [5]:
dataEngineer = {'Python', 'Java', 'Scala', 'Git', 'SQL', 'Hadoop'}
dataEngineer

{'Git', 'Hadoop', 'Java', 'Python', 'SQL', 'Scala'}
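
One caveat with the curly-brace syntax is worth a quick sketch: empty braces create a dictionary, not a set, so an empty set still has to come from `set()`. The variable names below are illustrative only.

```python
# Empty curly braces create a dict, not a set.
emptyBraces = {}
emptySet = set()

print(type(emptyBraces))  # <class 'dict'>
print(type(emptySet))     # <class 'set'>

# Duplicate values are dropped when the set is built.
skills = {'Python', 'Python', 'SQL'}
print(len(skills))  # 2
```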

### Add and Remove Values from Sets

To add or remove values from a set, you first have to initialize a set.

In [6]:
graphicDesigner = {'InDesign', 'Photoshop', 'Acrobat', 'Premiere', 'Bridge'}
graphicDesigner

{'Acrobat', 'Bridge', 'InDesign', 'Photoshop', 'Premiere'}

#### Add Values to a Set

You can use the method `add` to add a value to a set.

In [7]:
graphicDesigner.add('Illustrator')
graphicDesigner

{'Acrobat', 'Bridge', 'Illustrator', 'InDesign', 'Photoshop', 'Premiere'}

#### Remove Values from a Set

**Option 1:** You can use the `remove` method to remove a value from a set.

In [8]:
graphicDesigner.remove('Illustrator')
graphicDesigner

{'Acrobat', 'Bridge', 'InDesign', 'Photoshop', 'Premiere'}

In [9]:
# Note: If you try to remove a value that is not in your set, you will get a KeyError.
graphicDesigner.remove('Muse')
graphicDesigner

KeyError: 'Muse'

**Option 2:** You can use the `discard` method to remove a value from a set.

In [10]:
graphicDesigner.discard('Premiere')
graphicDesigner

{'Acrobat', 'Bridge', 'InDesign', 'Photoshop'}

In [11]:
# Note: The benefit of this approach is if you try to remove a value that is not part of the set, you will not get a KeyError.
graphicDesigner.discard('Premiere')
graphicDesigner

{'Acrobat', 'Bridge', 'InDesign', 'Photoshop'}

**Option 3:** You can also use the `pop` method to remove and return an arbitrary value from a set.

In [12]:
graphicDesigner.pop()

'Acrobat'

In [13]:
graphicDesigner

{'Bridge', 'InDesign', 'Photoshop'}
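
The three removal options differ mainly in how they treat a missing value. The sketch below puts them side by side on a small, made-up `tools` set. It also shows that `pop` raises a `KeyError` on an empty set.

```python
tools = {'Acrobat', 'Bridge'}

# remove() raises KeyError when the value is not in the set.
try:
    tools.remove('Muse')
    removeRaised = False
except KeyError:
    removeRaised = True

# discard() does nothing when the value is not in the set.
tools.discard('Muse')

# pop() removes and returns one element; which one is arbitrary.
popped = tools.pop()

# pop() on an empty set raises KeyError as well.
nothingLeft = set()
try:
    nothingLeft.pop()
    popRaised = False
except KeyError:
    popRaised = True

print(removeRaised, len(tools), popRaised)  # True 1 True
```

If you are not sure whether a value is present, `discard` avoids the need for a `try`/`except` block.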

#### Remove All Values from a Set

You can use the `clear` method to remove all values from a set.

In [14]:
graphicDesigner.clear()
graphicDesigner

set()

### Iterate through a Set

You can loop over a set with a `for` loop. Notice that the values are not printed in the order in which they were added, because sets are unordered.

In [15]:
dataScientist = {'Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'}

for skill in dataScientist:
    print(skill)

Tableau
SQL
R
SAS
Python
Git


### Transform Set into Ordered Values

In [16]:
type(sorted(dataScientist))

list

In [17]:
dataScientist

{'Git', 'Python', 'R', 'SAS', 'SQL', 'Tableau'}

In [18]:
sorted(dataScientist, reverse=True)

['Tableau', 'SQL', 'SAS', 'R', 'Python', 'Git']
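
As the cells above show, `sorted` returns a new list and leaves the set itself untouched. It also accepts a `key` function. The sketch below uses `str.lower` for a case-insensitive ordering. The `skills` set is a made-up example, not part of the data used elsewhere in this tutorial.

```python
skills = {'Python', 'R', 'git', 'SQL'}

# Default string sorting puts uppercase letters before lowercase ones.
defaultOrder = sorted(skills)

# key=str.lower compares the values case-insensitively.
caseInsensitive = sorted(skills, key=str.lower)

print(defaultOrder)     # ['Python', 'R', 'SQL', 'git']
print(caseInsensitive)  # ['git', 'Python', 'R', 'SQL']
print(type(skills))     # <class 'set'>
```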

### Set Operation Methods

In [19]:
dataScientist = set(['Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'])
dataEngineer = set(['Python', 'Java', 'Scala', 'Git', 'SQL', 'Hadoop'])

#### Union

All elements that are in either set.

In [20]:
# Return the union of two sets as a new set.
dataScientist.union(dataEngineer)

{'Git', 'Hadoop', 'Java', 'Python', 'R', 'SAS', 'SQL', 'Scala', 'Tableau'}

In [21]:
# Equivalent result
dataScientist | dataEngineer

{'Git', 'Hadoop', 'Java', 'Python', 'R', 'SAS', 'SQL', 'Scala', 'Tableau'}
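
The method form and the operator form are not fully interchangeable. As a small sketch on shortened versions of the sets above: `union` accepts several arguments and any iterable, while `update` changes a set in place instead of returning a new one. The `teamSkills` name is hypothetical.

```python
dataScientist = {'Python', 'R', 'SQL'}
dataEngineer = {'Python', 'Java', 'SQL'}
graphicDesigner = {'Photoshop'}

# union() accepts several iterables at once and returns a new set.
allSkills = dataScientist.union(dataEngineer, graphicDesigner)

# The method accepts any iterable; the | operator requires sets on both sides.
withList = dataScientist.union(['Git'])

# update() modifies the set in place instead of returning a new one.
teamSkills = set(dataScientist)
teamSkills.update(dataEngineer)

print(sorted(allSkills))   # ['Java', 'Photoshop', 'Python', 'R', 'SQL']
print(sorted(teamSkills))  # ['Java', 'Python', 'R', 'SQL']
```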

#### Intersection

All elements that are in both sets.

In [22]:
# Return the intersection of two sets as a new set.
dataScientist.intersection(dataEngineer)

{'Git', 'Python', 'SQL'}

In [23]:
# Equivalent result
dataScientist & dataEngineer

{'Git', 'Python', 'SQL'}

#### Disjoint Sets

You can test for disjoint sets by using the `isdisjoint` method.

In [24]:
# Initialize a set
graphicDesigner = {'Illustrator', 'InDesign', 'Photoshop'}

These two sets have elements in common, so `isdisjoint` returns `False`.

In [25]:
# Return True if two sets have a null intersection.
dataScientist.isdisjoint(dataEngineer)

False

These two sets have no elements in common, so `isdisjoint` returns `True`.

In [26]:
# Return True if two sets have a null intersection.
dataScientist.isdisjoint(graphicDesigner)

True
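
As the comments in the cells above suggest, `isdisjoint` is equivalent to checking that the intersection is empty. The sketch below, which uses shortened versions of the sets from this section, makes that relationship explicit.

```python
dataScientist = {'Python', 'R', 'SQL'}
dataEngineer = {'Python', 'Java', 'SQL'}
graphicDesigner = {'Illustrator', 'InDesign', 'Photoshop'}

# isdisjoint() is True exactly when the intersection is empty.
overlapping = dataScientist.isdisjoint(dataEngineer)
separate = dataScientist.isdisjoint(graphicDesigner)

print(overlapping, len(dataScientist & dataEngineer))  # False 2
print(separate, len(dataScientist & graphicDesigner))  # True 0
```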

#### Difference

All elements that are in this set but not the others.

In [27]:
# Return the difference of two or more sets as a new set.
dataScientist.difference(dataEngineer)

{'R', 'SAS', 'Tableau'}

In [28]:
# Equivalent result
dataScientist - dataEngineer

{'R', 'SAS', 'Tableau'}
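
Unlike union and intersection, difference depends on the order of the operands. A minimal sketch with the same two sets, using the hypothetical names `onlyScientist` and `onlyEngineer`:

```python
dataScientist = {'Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'}
dataEngineer = {'Python', 'Java', 'Scala', 'Git', 'SQL', 'Hadoop'}

# Skills a data scientist has that a data engineer does not.
onlyScientist = dataScientist - dataEngineer

# Swapping the operands gives a different result.
onlyEngineer = dataEngineer - dataScientist

print(sorted(onlyScientist))  # ['R', 'SAS', 'Tableau']
print(sorted(onlyEngineer))   # ['Hadoop', 'Java', 'Scala']
```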

#### Symmetric Difference

All elements that are in exactly one of the sets.

In [29]:
# Return the symmetric difference of two sets as a new set.
dataScientist.symmetric_difference(dataEngineer)

{'Hadoop', 'Java', 'R', 'SAS', 'Scala', 'Tableau'}

In [30]:
# Equivalent result
dataScientist ^ dataEngineer

{'Hadoop', 'Java', 'R', 'SAS', 'Scala', 'Tableau'}
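
One way to read the symmetric difference is as the union minus the intersection, or equivalently as the two one-sided differences combined. The sketch below checks both identities on the sets used above.

```python
dataScientist = {'Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'}
dataEngineer = {'Python', 'Java', 'Scala', 'Git', 'SQL', 'Hadoop'}

symmetric = dataScientist ^ dataEngineer

# Union minus intersection.
unionMinusIntersection = (dataScientist | dataEngineer) - (dataScientist & dataEngineer)

# Both one-sided differences combined.
bothDifferences = (dataScientist - dataEngineer) | (dataEngineer - dataScientist)

print(symmetric == unionMinusIntersection)  # True
print(symmetric == bothDifferences)         # True
```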

### Set Comprehension

In [31]:
{skill for skill in ['SQL', 'SQL', 'Python', 'Python']}

{'Python', 'SQL'}

In [32]:
{skill for skill in ['Git', 'Python', 'SQL'] if skill not in {'Git', 'Python', 'Java'}}

{'SQL'}
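
A set comprehension can also transform each value before it is added, which is a convenient way to normalize and deduplicate in one step. A small sketch, assuming a made-up `rawSkills` list with inconsistent capitalization:

```python
rawSkills = ['SQL', 'sql', 'Python', 'PYTHON', 'Git']

# Lowercase every value; duplicates collapse automatically.
normalized = {skill.lower() for skill in rawSkills}

print(sorted(normalized))  # ['git', 'python', 'sql']
```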

### Membership Tests

In [33]:
# Initialize a set
possibleSet = {'Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS', 'Java', 'Spark', 'Scala'}

# Membership test
'Python' in possibleSet

True

In [34]:
# Membership test
'Fortran' in possibleSet

False

### Subset

In [35]:
possibleSkills = {'Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'}
mySkills = {'Python', 'R'}

In [36]:
# Report whether another set contains this set.
mySkills.issubset(possibleSkills)

True
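
Subset tests also have operator forms, and there is a matching `issuperset` method for the opposite direction. The sketch below reuses the two sets above. The result names (`isSubset` and so on) are illustrative.

```python
possibleSkills = {'Python', 'R', 'SQL', 'Git', 'Tableau', 'SAS'}
mySkills = {'Python', 'R'}

# <= is the operator form of issubset().
isSubset = mySkills <= possibleSkills

# issuperset() asks the same question from the other side.
isSuperset = possibleSkills.issuperset(mySkills)

# < tests for a proper subset: a subset that is not equal to the other set.
isProperSubset = mySkills < possibleSkills
selfIsProperSubset = possibleSkills < possibleSkills

print(isSubset, isSuperset, isProperSubset, selfIsProperSubset)  # True True True False
```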

### Frozensets

You may have encountered nested lists and tuples.

In [37]:
# Nested Lists
nestedLists = [['the', 12], ['to', 11], ['of', 9], ['and', 7], ['that', 6]]
nestedLists

[['the', 12], ['to', 11], ['of', 9], ['and', 7], ['that', 6]]

In [38]:
# Nested Tuples
nestedTuples = (('the', 12), ('to', 11), ('of', 9), ('and', 7), ('that', 6))
nestedTuples

(('the', 12), ('to', 11), ('of', 9), ('and', 7), ('that', 6))

Sets, however, cannot normally be nested: a set can only contain immutable (hashable) values, and sets themselves are mutable.

In [39]:
nestedSets = set([set()])

TypeError: unhashable type: 'set'

This is one situation where you may wish to use a frozenset. A frozenset is very similar to a set except that a frozenset is immutable.

You make a frozenset by using `frozenset()`.

In [40]:
# Build an immutable unordered collection of unique elements.
immutableSet = frozenset()
immutableSet

frozenset()

You can make a nested set by using a frozenset as the inner set, as in the code below.

Keep in mind that a major disadvantage of a frozenset is that, because it is immutable, you cannot add or remove values.

In [41]:
nestedSets = set([frozenset()])
nestedSets

{frozenset()}
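
To make the trade-off more concrete, here is a minimal sketch with non-empty frozensets. The names `pythonStack`, `javaStack`, and `headcount` are hypothetical. It shows that frozensets can be set elements and dictionary keys, that they have no `add` method, and that operations returning a new object still work.

```python
pythonStack = frozenset(['Python', 'SQL'])
javaStack = frozenset(['Java', 'Scala'])

# Frozensets are hashable, so they can be elements of a set ...
stacks = {pythonStack, javaStack}

# ... and keys of a dictionary.
headcount = {pythonStack: 3, javaStack: 2}

# Frozensets have no add() method, so this raises AttributeError.
try:
    pythonStack.add('Git')
    addFailed = False
except AttributeError:
    addFailed = True

# Operations that build a new object are still available.
extended = pythonStack | {'Git'}

print(addFailed)         # True
print(sorted(extended))  # ['Git', 'Python', 'SQL']
print(type(extended))    # <class 'frozenset'>
```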