A set is a collection of unique objects; a basic use is removing duplicates. Set elements must be hashable.

In [1]:
l = ['spam', 'spam', 'eggs', 'spam']
set(l)

{'eggs', 'spam'}
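The hashability requirement can be sketched as follows. This is a minimal illustration, and the variable names (`ok`, `error_message`, `fixed`) are made up for the example. Immutable built-ins such as strings, numbers, and tuples of hashable items are accepted, while a list is rejected with a `TypeError`.

```python
# Hashable elements (str, int, tuple of hashables) are accepted.
ok = {'spam', 42, (1, 2)}

# A list is mutable and unhashable, so using it as an element raises TypeError.
try:
    bad = {['spam', 'eggs']}
except TypeError as exc:
    error_message = str(exc)

# Converting the list to a tuple makes it usable as a set element.
fixed = {tuple(['spam', 'eggs'])}

print(error_message)
print(fixed)
```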

In [2]:
# Sets in Python implement all of the typical set operations
a, b = set(), set()
a | b  # Set union
a & b  # Set intersection
a - b  # Set difference

needles = {1, 3}
haystack = {3, 4, 5, 6, 8, 9, 11, 22, 33}

# How many needles occur in haystack
found = len(needles & haystack)

# Without set intersection it becomes
found = 0
for n in needles:
    if n in haystack:
        found += 1
        
# or
def find_needles(needles, haystack):
    for n in needles:
        if n in haystack:
            yield n

found = sum(1 for needle in find_needles(needles, haystack))
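The operators at the top of the cell above are applied to empty sets, so they produce nothing visible. A small sketch with made-up values (`a` and `b` here are illustrative, not the variables from the cell) shows what each operator returns. The results are passed through `sorted()` only to make the printed order predictable, since sets are unordered.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b          # elements in either set
intersection = a & b   # elements in both sets
difference = a - b     # elements in a but not in b

print(sorted(union))         # [1, 2, 3, 4]
print(sorted(intersection))  # [3]
print(sorted(difference))    # [1, 2]
```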

We can initialize sets in Python using either the literal notation `{1, 2}` or the `set()` constructor. However, `{}` creates an empty dict, so the only way to initialize an empty set is `set()`.
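A quick check of the empty-literal pitfall, as a minimal sketch (the names `empty_braces` and `empty_set` are illustrative):

```python
empty_braces = {}    # an empty dict, not a set
empty_set = set()    # the only way to spell an empty set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(repr(empty_set))              # set()
```

Note that the repr of an empty set is `set()` rather than `{}`, for the same reason.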

In [3]:
{1, 2, 3}  # The literal is faster and more readable than the constructor.
set([1, 2, 3])
# The constructor call is slower because Python first has to look up
# the name `set`, fetch the constructor, build a list, and finally
# pass the list to the constructor.

# For {1, 2, 3}, Python runs the specialized BUILD_SET bytecode.


{1, 2, 3}

The constructor is slower because Python first has to look up the name `set`, fetch its constructor, build a list, and finally pass the list to the constructor. The literal `{1, 2, 3}`, by contrast, runs a specialized BUILD_SET bytecode. This is illustrated by disassembling the Python bytecode below (the exact opcodes vary between Python versions):

In [4]:
import dis

dis.dis('{1, 2, 3}')

  1           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               1 (2)
              4 LOAD_CONST               2 (3)
              6 BUILD_SET                3
              8 RETURN_VALUE


In [5]:
dis.dis('set([1, 2, 3])')

  1           0 LOAD_NAME                0 (set)
              2 LOAD_CONST               0 (1)
              4 LOAD_CONST               1 (2)
              6 LOAD_CONST               2 (3)
              8 BUILD_LIST               3
             10 CALL_FUNCTION            1
             12 RETURN_VALUE
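The speed difference suggested by the bytecode can be measured with the standard `timeit` module. This is a rough sketch: the repetition count of 200,000 is an arbitrary choice, and the absolute numbers depend on the machine and the Python version, so no expected output is shown.

```python
import timeit

# Time each expression over the same number of repetitions.
repetitions = 200_000
literal_time = timeit.timeit('{1, 2, 3}', number=repetitions)
constructor_time = timeit.timeit('set([1, 2, 3])', number=repetitions)

print(f'literal:     {literal_time:.4f} s')
print(f'constructor: {constructor_time:.4f} s')
```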


Set has an immutable sibling, the built-in `frozenset`. It has no literal syntax, so you create one by calling its constructor.
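A brief sketch of how `frozenset` behaves, with illustrative names (`vowels`, `groups`, `nested`, `common`). Because a frozenset is immutable and hashable, it can be used as a dict key or as an element of another set, which a regular set cannot.

```python
vowels = frozenset('aeiou')

# A frozenset is hashable, so it can be a dict key or a set element.
groups = {vowels: 'vowels'}
nested = {vowels, frozenset('xyz')}

# It supports the same operators as set; with a frozenset on the left,
# the result is also a frozenset.
common = vowels & {'a', 'b'}

# Mutating methods such as add() do not exist on frozenset.
can_add = hasattr(vowels, 'add')

print(common)   # frozenset({'a'})
print(can_add)  # False
```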

Another way to build a set is a set comprehension.

In [6]:
# Build a set of Latin-1 characters that have the word "SIGN" in their Unicode name

from unicodedata import name
{chr(i) for i in range(32, 256) if 'SIGN' in name(chr(i), '')}

{'#',
 '$',
 '%',
 '+',
 '<',
 '=',
 '>',
 '¢',
 '£',
 '¤',
 '¥',
 '§',
 '©',
 '¬',
 '®',
 '°',
 '±',
 'µ',
 '¶',
 '×',
 '÷'}