# Sets

Perhaps you recall learning about sets and set theory at some point in your mathematical education. Maybe you even remember Venn diagrams.
##### A set can be thought of simply as a well-defined **collection** of distinct objects, typically called elements or members. #####
Grouping objects into a <b>set</b> can be useful in programming as well, and Python provides a built-in set type to do so.
##### Sets are distinguished from other object types by the unique operations that can be performed on them. #####
## Defining a Set ##
Python’s built-in set type has the following characteristics:<br>
1. Sets are **unordered**.
2. Set elements are **unique**. Duplicate elements are not allowed.
3. A **set** itself **may be modified**, but the elements contained in the set must be of an **immutable** type.
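
As a small illustration of the third point, the sketch below shows a set being modified after creation, a tuple being accepted as an element, and a list being rejected because it is mutable. The variable names are assumptions chosen for this example only.

```python
# Illustrative sketch; all names here are hypothetical.
numbers = {1, 2, 3}
numbers.add(4)               # the set itself can be modified

points = {(0, 0), (1, 2)}    # tuples of immutable values are valid elements

try:
    bad = {[1, 2], [3, 4]}   # lists are mutable, so they cannot be elements
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

The exact wording of the error message may vary between Python versions, but the exception type is TypeError.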

#### Creating a Set

The basic form for creating a set is:<br>
    **x = set(<iter>)**<br>
where the argument <iter> is an iterable, such as a list or a tuple.

In [19]:
x = set(['foo', 'bar', 'baz', 'foo', 'qux'])
y = set(('foo', 'bar', 'baz', 'foo', 'qux'))
print(x,'\n', y, '\n', "x and y are equivalent")

{'bar', 'baz', 'qux', 'foo'} 
 {'bar', 'baz', 'qux', 'foo'} 
 x and y are equivalent


**Strings** are also iterable, so a string can be passed to **set()** as well. You have already seen that **list(s)** generates a list of the characters in the string s. Similarly, **set(s)** generates a set of the characters in s:

In [26]:
s = 'quux'
s

'quux'

In [28]:
list(s)

['q', 'u', 'u', 'x']

In [30]:
set(s)

{'q', 'u', 'x'}

You can see that the resulting sets are **unordered**: the original order, as specified in the definition, is not necessarily preserved.<br>
Additionally, **duplicate values are only represented in the set once**, as with the string 'foo' in the first two examples and the letter 'u' in the third.

Observe the difference between these next two set definitions:

In [36]:
{'foo'}

{'foo'}

In [38]:
set('foo')

{'f', 'o'}

The curly-brace form creates a set whose single element is the string 'foo', whereas set('foo') iterates over the string and collects its distinct characters.

An empty set must be created with set(), because Python interprets empty curly braces as an empty dictionary:

In [40]:
x = set()
type(x)

set

In [42]:
x = {}
type(x)

dict

## Operators vs. Methods
Most, though not quite all, set operations in Python can be performed in two different ways: by operator or by method.<br>

#### Union

In [58]:
# Union of x1 and x2
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

# x1.union(x2)
x1 | x2

{'bar', 'baz', 'foo', 'quux', 'qux'}

The union of x1 and x2 is {'foo', 'bar', 'baz', 'qux', 'quux'}.

In the example above, notice that the element 'baz', which appears in both x1 and x2, appears only once in the union. <br>
**Sets never contain duplicate values.**<br>

In [61]:
x = set(['foo', 'bar', 'baz', 'foo', 'qux'])
y = set(('foo', 'bar', 'baz', 'foo', 'qux'))
x | y

{'bar', 'baz', 'foo', 'qux'}

The example above shows that:
1. Sets never contain duplicate values.
2. Sets do not preserve the order in which the elements were supplied.

The way they are used in the examples above, the operator and method behave identically.<br>
But there is a subtle difference between them.<br>
* When you use the **|** operator, both operands must be sets. <br>
* The **.union()** method, on the other hand, will take any iterable as an argument, convert it to a set, and then perform the union.<br>
**Observe the difference between these two statements:** the first raises an error, the second does not.

In [74]:
x1 | ('baz', 'qux', 'quux')

TypeError: unsupported operand type(s) for |: 'set' and 'tuple'

In [76]:
x1.union(('baz', 'qux', 'quux'))

{'bar', 'baz', 'foo', 'quux', 'qux'}
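
Two practical consequences of this difference can be sketched as follows. Converting the other operand with set() lets the operator form work, and .union() can take several iterables of different types in a single call. The names via_operator and via_method are illustrative only.

```python
x1 = {'foo', 'bar', 'baz'}

# Convert the tuple to a set first so that the | operator accepts it.
via_operator = x1 | set(('baz', 'qux', 'quux'))

# .union() accepts several iterables of different types in one call.
via_method = x1.union(('baz', 'qux'), ['quux'])

print(via_operator == via_method)
```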

#### Intersection

In [86]:
x1 = {'1', 'bar', 'baz'}
x2 = {'baz', 'qux', '1', '2'}

# x1.intersection(x2)
x1 & x2

{'1', 'baz'}

In [88]:
a = {1, 2, 3, 4}
b = {2, 3, 4, 5}
c = {3, 4, 5, 6}
d = {4, 5, 6, 7}

# a.intersection(b, c, d)
a & b & c & d

{4}

#### Difference

x1.difference(x2) and x1 - x2 return the set of all elements that are in x1 but not in x2:

In [100]:
x1 = {'1', 'bar', 'baz'}
x2 = {'baz', 'qux', '2'}

# x1.difference(x2)
x1 - x2
# the result is the set of elements in x1 that are not present in x2

{'1', 'bar'}

In [104]:
x2 - x1
# the result is the set of elements in x2 that are not present in x1

{'2', 'qux'}

When multiple sets are specified, the operation is performed from left to right.

In [106]:
a = {1, 2, 3, 4, 5}
b = {10, 2, 30, 40}
c = {1, 200, 300, 400}

# a.difference(b, c)
a - b - c
# the result is the set of elements in a that are present in neither b nor c

{3, 4, 5}

### Symmetric Difference

x1.symmetric_difference(x2) and x1 ^ x2 return the set of all elements in either x1 or x2, but not both:

In [117]:
x1 = { 'aa', 'bb', 'cc', 'dd', 'ee' }
x2 = { 'aa', 'ff', 'gg', 'hh', 'ee' }

# x1.symmetric_difference(x2)
x1 ^ x2

{'bb', 'cc', 'dd', 'ff', 'gg', 'hh'}
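
One way to check this definition is to rebuild the symmetric difference from the operations covered earlier. The sketch below is only a verification aid, and its variable names are illustrative.

```python
x1 = {'aa', 'bb', 'cc', 'dd', 'ee'}
x2 = {'aa', 'ff', 'gg', 'hh', 'ee'}

# Elements in either set, minus the elements common to both.
by_union_and_intersection = (x1 | x2) - (x1 & x2)

# Elements only in x1, together with elements only in x2.
by_differences = (x1 - x2) | (x2 - x1)

print(by_union_and_intersection == (x1 ^ x2))
print(by_differences == x1.symmetric_difference(x2))
```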

### Subsets

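x1 < x2 determines whether x1 is a proper subset of x2: every element of x1 is in x2, and the two sets are not equal. (The operator x1 <= x2 and the method x1.issubset(x2) also return True when the sets are equal.) It's the converse of a proper superset.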

In [145]:
A = { 'a', 'b', 'c', 'd' }
B = { 'a', 'b' }

print('A is included in B:', A < B, ' B is included in A:', B < A)

A is included in B: False  B is included in A: True
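
For comparison, the following sketch contrasts the proper-subset operator < with the plain subset test, which is available as the <= operator and as the .issubset() method. It reuses the sets A and B from the cell above.

```python
A = {'a', 'b', 'c', 'd'}
B = {'a', 'b'}

print(B <= A)   # every element of B is in A
print(B < A)    # B is a subset of A and B != A
print(A <= A)   # every set is a subset of itself
print(A < A)    # a set is never a proper subset of itself

# The method form accepts any iterable, not only a set.
print(B.issubset(['a', 'b', 'c']))
```

The first three lines and the last line print True; only A < A prints False.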


### Supersets

A set x1 is considered a superset of another set x2 if x1 contains every element of x2. This can be tested with x1 >= x2 or x1.issuperset(x2). It's the converse of the subset relation.

In [161]:
A = { 'a', 'b', 'c', 'd' }
B = { 'a', 'b' }

print('A includes B:', A >= B, 'B includes A:', B >= A)

A includes B: True B includes A: False


### Proper superset

x1 > x2 determines whether x1 is a proper superset of x2. A proper superset is the same as a superset, except that the sets can’t be identical: x1 is a proper superset of x2 if x1 contains every element of x2, and x1 and x2 are not equal.

In [163]:
A = { 'a', 'b', 'c', 'd' }
B = { 'a', 'b' }

print('A includes B:', A > B, 'B includes A:', B > A)

A includes B: True B includes A: False


In [165]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'bar'}
x1 > x2

True

In [167]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'bar', 'baz'}
x1 > x2

False

### Augmented Assignment Operators and Methods

Each of the union, intersection, difference, and symmetric difference operators has an augmented assignment form (|=, &=, -=, ^=) that modifies the set on the left-hand side in place:

In [170]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}

x1 |= x2
x1

{'bar', 'baz', 'foo', 'qux'}

In [172]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}

x1 &= x2
x1

{'baz', 'foo'}

In [174]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}

x1 -= x2
x1

{'bar'}

In [176]:
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}

x1 ^= x2
x1

{'bar', 'qux'}
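
The sketch below illustrates what modifying a set in place means, by checking object identity through a second name, and lists the method counterparts of the four augmented operators. The names alias, same_object, rebound, and y are assumptions made for this example.

```python
x1 = {'foo', 'bar', 'baz'}
x2 = {'foo', 'baz', 'qux'}

alias = x1                  # a second name bound to the same set object
x1 |= x2                    # in-place update: the existing object is modified
same_object = alias is x1
print(same_object, alias == {'foo', 'bar', 'baz', 'qux'})

x1 = x1 | x2                # plain operator: builds a new set and rebinds x1
rebound = alias is x1
print(rebound)

# Method counterparts of the augmented operators accept any iterable.
y = {'foo', 'bar', 'baz'}
y.update(['qux'])                        # like y |= {'qux'}
y.intersection_update(['foo', 'qux'])    # like y &= {'foo', 'qux'}
y.difference_update(['qux'])             # like y -= {'qux'}
y.symmetric_difference_update(['bar'])   # like y ^= {'bar'}
print(y)
```

After the augmented assignment, alias still refers to the updated set, whereas the plain x1 = x1 | x2 form leaves alias pointing at the old object.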

### Other Methods For Modifying Sets

x.add(<elem>) adds <elem> to x. x.remove(<elem>) removes <elem> from x and raises an exception if <elem> is not in x:

In [189]:
x = {'foo', 'bar', 'baz'}

x.add('qux')
x

{'bar', 'baz', 'foo', 'qux'}

In [191]:
x = {'foo', 'bar', 'baz'}

x.remove('baz')
x

{'bar', 'foo'}

x.discard(<elem>) also removes <elem> from x. However, if <elem> is not in x, this method quietly does nothing instead of raising an exception:

In [194]:
x = {'foo', 'bar', 'baz'}

x.discard('baz')
x

{'bar', 'foo'}
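
The difference between the two methods only shows up when the element is missing. A minimal sketch, using 'qux' as an element that is not in the set:

```python
x = {'foo', 'bar', 'baz'}

x.discard('qux')          # 'qux' is absent: nothing happens
print(x == {'foo', 'bar', 'baz'})

try:
    x.remove('qux')       # 'qux' is absent: KeyError is raised
except KeyError as exc:
    missing = exc.args[0]
    print("KeyError:", missing)
```

x.discard() may be the more convenient choice when a missing element is not an error for your program, and x.remove() when it should be reported.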

In [196]:
x = {'foo', 'bar', 'baz'}

x.pop()

x

{'baz', 'foo'}
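
x.pop() removes and returns an arbitrary element, so the element removed in the cell above may differ from run to run. The sketch below also shows that popping from an empty set raises KeyError. The names removed and empty_pop_failed are illustrative.

```python
x = {'foo', 'bar', 'baz'}

removed = x.pop()         # removes and returns an arbitrary element
print(removed in {'foo', 'bar', 'baz'}, removed not in x, len(x))

# Drain the set with pop(); the order of removal is not guaranteed.
while x:
    x.pop()

empty_pop_failed = False
try:
    x.pop()               # popping from an empty set raises KeyError
except KeyError:
    empty_pop_failed = True
print(empty_pop_failed)
```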

In [198]:
x = {'foo', 'bar', 'baz'}
x


x.clear()
x

set()

### Frozen Sets

Python provides another built-in type called a frozenset, which is in all respects exactly like a set, except that a frozenset is immutable. You can perform non-modifying operations on a frozenset:

In [206]:
x = frozenset(['foo', 'bar', 'baz'])
x


len(x)


x & {'baz', 'qux', 'quux'}

frozenset({'baz'})

But methods that attempt to modify a frozenset fail:

In [209]:
x = frozenset(['foo', 'bar', 'baz'])

x.add('qux')

AttributeError: 'frozenset' object has no attribute 'add'
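
Because a frozenset is immutable, it is also hashable, which means it can be used where a regular set cannot: as an element of another set or as a dictionary key. The following is a minimal sketch with illustrative names (inner, nested, labels).

```python
inner = frozenset(['foo', 'bar'])

# A frozenset is hashable, so it can be an element of another set...
nested = {inner, frozenset(['baz'])}
print(inner in nested)

# ...or a dictionary key. Equal frozensets hash alike.
labels = {inner: 'first pair'}
print(labels[frozenset(['bar', 'foo'])])

try:
    invalid = {{'foo', 'bar'}}   # a regular set is unhashable
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```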