Sets & Tuples

@Authors: Sridhar Nerur, Samuel Jayarajan, and Mahyar Vaghefi

This IPython Notebook introduces you to sets and tuples. The primary difference between sets and lists is that sets cannot contain duplicate values. Sets, like lists, are MUTABLE, which means they can be changed. They are useful for finding intersections, unions, differences, symmetric differences, subsets, and so on. Note that sets are unordered and therefore do not support indexing.

In [1]:
#assigning an empty set
s = set() #an empty set
s

set()

In [2]:
#you can assign values at the time you define your set
s = {1, 2, 3, 4} #note that we use {}
s

{1, 2, 3, 4}

In [3]:
#can we replace an element by position (say, put 121 at index 0), as we would with a list?
s[0] = 121 #you will get an error because sets do not support indexing or item assignment

TypeError: 'set' object does not support item assignment
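
Reading by position fails for the same reason as assigning by position. The sketch below shows one way to get the effect of "replacing" a value, and one way to get at a position when you really need one. The names `values` and `ordered` are only illustrative.

```python
values = {1, 2, 3, 4}

# Reading by position raises TypeError, just like assignment does
try:
    values[0]
except TypeError as err:
    print("Cannot index a set:", err)

# To "replace" 1 with 121, remove the old value and add the new one
values.remove(1)
values.add(121)
print(values == {2, 3, 4, 121})  # True

# If a position is really needed, build a sorted list first
ordered = sorted(values)
print(ordered[0])  # 2
```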

In [4]:
aList = [1, 1, 2, 3, 1, 1, 5, 2]
s = set(aList) #use the list to create a set; duplicates will be removed
s

{1, 2, 3, 5}
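
Because a set drops duplicates, converting a list to a set is a quick way to count or list its distinct values. A small sketch follows; since sets are unordered, it uses `sorted()` to get the values back as an ordered list. The names `unique_count` and `unique_sorted` are illustrative.

```python
aList = [1, 1, 2, 3, 1, 1, 5, 2]

unique_count = len(set(aList))      # number of distinct values
unique_sorted = sorted(set(aList))  # sorted() always returns a list

print(unique_count)   # 4
print(unique_sorted)  # [1, 2, 3, 5]
```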

In [5]:
#length of set
len(s)

4

In [6]:
#how do you add to a set?
s = set()
s.add(1)
print("Add a 1 to an empty set: ", s)
s.add(2)
print("Add a 2: ", s)
s.add(1)
print("Add another 1: ", s) #will be discarded because there is already a 1 in the set

Add a 1 to an empty set:  {1}
Add a 2:  {1, 2}
Add another 1:  {1, 2}


Note that "+" and "*" don't work with sets
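
A short sketch of what happens if you try, along with the operations that do the job instead. The names `a`, `b`, and `combined` are illustrative.

```python
a = {1, 2}
b = {2, 3}

# "+" is not defined for sets, so this raises TypeError
plus_failed = False
try:
    a + b
except TypeError:
    plus_failed = True
print('"+" failed:', plus_failed)  # True

# "*" is not defined for sets either
times_failed = False
try:
    a * 2
except TypeError:
    times_failed = True
print('"*" failed:', times_failed)  # True

# Use union to combine two sets into a new set
combined = a | b
print(combined == {1, 2, 3})  # True

# Use update() to add the elements of b to a in place
a.update(b)
print(a == {1, 2, 3})  # True
```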

In [9]:
#let us clear the set
s.clear()

In [10]:
s #should be empty

set()

Let us now examine some operations that are typically performed on sets.

In [11]:
set1 = {1,2,3,5,7,9,10}
set2 = {3,4,6,8,10}

#Let us look at union first
print("Union of two sets: ", set1 | set2 )
print("Another way to do union: ", set1.union(set2))

Union of two sets:  {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Another way to do union:  {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}


In [12]:
#intersection
print("Intersection of two sets: ", set1 & set2 )
print("Another way to do intersection: ", set1.intersection(set2))

Intersection of two sets:  {10, 3}
Another way to do intersection:  {10, 3}


In [14]:
#difference
print("Difference between two sets: ", set1 - set2 ) #elements in set1 that are not in set2

Difference between two sets:  {1, 2, 5, 7, 9}
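
Unlike union and intersection, difference depends on the order of the operands. The sketch below reuses the same two sets to show both directions, plus the equivalent `difference()` method. The names `only_in_set1`, `only_in_set2`, and `same_via_method` are illustrative.

```python
set1 = {1, 2, 3, 5, 7, 9, 10}
set2 = {3, 4, 6, 8, 10}

only_in_set1 = set1 - set2  # elements of set1 that are not in set2
only_in_set2 = set2 - set1  # elements of set2 that are not in set1
same_via_method = set1.difference(set2)

print(only_in_set1 == {1, 2, 5, 7, 9})   # True
print(only_in_set2 == {4, 6, 8})         # True
print(same_via_method == only_in_set1)   # True
```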


In [32]:
#symmetric difference -> elements that are in exactly one of the two sets
print("Symmetric Difference between two sets: ", set1 ^ set2 ) #common elements are eliminated
print("Symmetric Difference between two sets: ", set1.symmetric_difference(set2) )

Symmetric Difference between two sets:  {1, 2, 4, 5, 6, 7, 8, 9}
Symmetric Difference between two sets:  {1, 2, 4, 5, 6, 7, 8, 9}
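
One way to convince yourself of what symmetric difference means is to build it from the operations we have already seen. A small sketch, with `sym`, `by_hand`, and `two_differences` as illustrative names:

```python
set1 = {1, 2, 3, 5, 7, 9, 10}
set2 = {3, 4, 6, 8, 10}

sym = set1 ^ set2

# Everything in either set, minus what they have in common
by_hand = (set1 | set2) - (set1 & set2)

# Or: what is only in set1, together with what is only in set2
two_differences = (set1 - set2) | (set2 - set1)

print(sym == by_hand == two_differences)  # True
```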


In [19]:
#check to see if the two sets are disjoint
set1.isdisjoint(set2)

False
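
The result is False because set1 and set2 share the elements 3 and 10. For contrast, here is a small sketch with two sets that have nothing in common. The sets `evens` and `odds` are made up for illustration.

```python
evens = {2, 4, 6}
odds = {1, 3, 5}

# Two sets are disjoint when their intersection is empty
print(evens.isdisjoint(odds))  # True
print(evens & odds == set())   # True

# The sets from the cell above are not disjoint
set1 = {1, 2, 3, 5, 7, 9, 10}
set2 = {3, 4, 6, 8, 10}
shared = set1 & set2
print(shared == {3, 10})       # True
```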

In [20]:
#remove or discard elements from the set
#remove gives an error if object is not in set, discard doesn't
s = {1, 2, 3, 4, 5, 6, 7, 8, 9}
s.remove(2)
s

{1, 3, 4, 5, 6, 7, 8, 9}

In [21]:
s.remove(10) #should give an error

KeyError: 10

In [22]:
s = {1, 2, 3, 4, 5, 6, 7, 8, 9}
s.discard(2)

In [23]:
s

{1, 3, 4, 5, 6, 7, 8, 9}

In [24]:
s.discard(10) #no error raised
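
If you prefer remove() but do not want your program to stop on a missing element, you can catch the KeyError or test for membership first. A minimal sketch, where the names `items` and `missing` are illustrative:

```python
items = {1, 2, 3}

# Option 1: catch the KeyError raised by remove()
missing = False
try:
    items.remove(10)
except KeyError:
    missing = True
print("10 was missing:", missing)  # True

# Option 2: check membership before removing
if 2 in items:
    items.remove(2)

print(items == {1, 3})  # True
```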

In [29]:
A = {1,2,3,4,5,6,7,8}
B = {3,4,5,6}
C = {1,2,3,4,5,6,7,8}
print("Is A a superset of B?", A.issuperset(B))
print("Is B a subset of C?", B.issubset(C))
print("Is A a subset of C?", A.issubset(C))
print("Is C a subset of A?", C.issubset(A))


Is A a superset of B? True
Is B a subset of C? True
Is A a subset of C? True
Is C a subset of A? True


In [31]:
#One can also use comparison operators: > tests for a proper superset, < for a proper subset, and == for equality
print("Is set A greater than B?", A > B)
print("Is set B less than C?", B < C)
print("Is A equal to C?", A == C )

Is set A greater than B? True
Is set B less than C? True
Is A equal to C? True
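
Note that < and > are strict: a set is never a proper subset of a set equal to it. The sketch below reuses A, B, and C and adds a hypothetical set D that only partly overlaps with B, to show that two sets may be neither less than, greater than, nor equal to each other.

```python
A = {1, 2, 3, 4, 5, 6, 7, 8}
B = {3, 4, 5, 6}
C = {1, 2, 3, 4, 5, 6, 7, 8}

print(A < C)   # False, because A equals C
print(A <= C)  # True, <= allows equality (same as A.issubset(C))
print(B < A)   # True, B is a proper subset of A

# D is an illustrative set that overlaps B only partly
D = {6, 7, 8, 9}
print(B < D, B > D, B == D)  # False False False
```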


In [18]:
#let us look at all the methods associated with sets
help(set)

Help on class set in module builtins:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |  
 |  Build an unordered collection of unique elements.
 |  
 |  Methods defined here:
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iand__(self, value, /)
 |      Return self&=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __ior__(self, value, /)
 |      Return self|=value.
 |  
 |  __isub__(self, value, /)
 |      Return self-=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __ixor__(self, value, /)
 |      Re

In [34]:
#removing elements from a set
s = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
s.pop() #removes and returns an arbitrary element

1

In [35]:
s.pop()

2

In [36]:
#Let us try to pop another one
print(s)
s.pop()

{3, 4, 5, 6, 7, 8, 9, 10}


3

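The documentation says pop() removes and returns an arbitrary element, yet the elements above came out in ascending order. That ordering is an implementation detail (it reflects how small integers happen to be stored in the set) and is not guaranteed, so stay with the documentation and do not rely on it.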
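
A sketch of how to write code that does not depend on the order in which pop() returns elements. It uses a separate, illustrative set called `numbers`, and picks an element explicitly with min() when a predictable choice is needed.

```python
numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

# Pop until the set is empty; rely only on WHAT comes out, not the order
popped = []
while numbers:
    popped.append(numbers.pop())
print(sorted(popped) == list(range(1, 11)))  # True

# For a predictable choice, select the element yourself
numbers = {3, 1, 2}
smallest = min(numbers)
numbers.remove(smallest)
print(smallest)           # 1
print(numbers == {2, 3})  # True
```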

In [37]:
#let us remove 9 from the set
s.remove(9)
s

{4, 5, 6, 7, 8, 10}

In [38]:
#check to see if 9 is in the set
9 in s

False

We have now covered the key aspects of sets. Remember that sets are mutable and contain distinct values (i.e., they cannot have duplicates). Let us now look at another interesting data structure or container: the tuple. Tuples, like strings (and unlike sets and lists), are IMMUTABLE. They are widely used in the data science world.

In [45]:
t = (1, 2, 3, 4, 5, 6) #a tuple of integers
t[0] #will display the first element

1

In [54]:
type(t)

tuple

In [46]:
t[2:]

(3, 4, 5, 6)

In [47]:
t[-3:]

(4, 5, 6)

In [48]:
#you cannot change its values. For example, the following will not work
t[0] = 25

TypeError: 'tuple' object does not support item assignment
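
Since a tuple cannot be changed in place, the usual approach is to build a new tuple. Two possible ways are sketched below; the names `updated`, `as_list`, and `via_list` are illustrative.

```python
t = (1, 2, 3, 4, 5, 6)

# Option 1: combine a one-element tuple with a slice of the original
updated = (25,) + t[1:]
print(updated)  # (25, 2, 3, 4, 5, 6)
print(t)        # (1, 2, 3, 4, 5, 6), the original is unchanged

# Option 2: convert to a list, change it, and convert back
as_list = list(t)
as_list[0] = 25
via_list = tuple(as_list)
print(via_list == updated)  # True
```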

In [50]:
len(t)

6

In [51]:
t.count(5) #no different from the count method in list

1

In [52]:
#index returns the position of the first occurrence of a value
t.index(4)

3

In [49]:
#let us look at some methods associated with tuples
help(tuple)

Help on class tuple in module builtins:

class tuple(object)
 |  tuple() -> empty tuple
 |  tuple(iterable) -> tuple initialized from iterable's items
 |  
 |  If the argument is a tuple, the return value is the same object.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __getnewargs__(...)
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __hash__(self, /)
 |      Return hash(self).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __lt__(self, value, /)
 |      Return self<value.

In [53]:
#Be careful when you create a single-element tuple
#for example, the following will not be a tuple
s = (1) # I am trying to create a tuple with one element, 1
#check the type of s to see what I mean
type(s)

int

Note that parentheses around a single value are treated as ordinary grouping, so (1) is simply the integer 1 and not a tuple. If our goal is to have a tuple, we must include a comma after the single element.

In [55]:
s = (1,)
type(s)

tuple
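
It is the comma, not the parentheses, that makes a tuple. A small sketch with illustrative names shows a few cases side by side:

```python
single = 1,                # a tuple, even without parentheses
print(type(single))        # <class 'tuple'>

not_a_tuple = ("hello")    # just a string in parentheses
is_tuple = ("hello",)      # a one-element tuple
print(type(not_a_tuple))   # <class 'str'>
print(len(is_tuple))       # 1

empty = ()                 # the empty tuple is the one case that needs no comma
print(len(empty))          # 0
```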

In [57]:
s = (1, 2, 3, 1, 2, 2, 3, 4, 1, 5, 1)
print(s.count(1))

4


In [58]:
s.index(5)

9
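
When a value appears more than once, index() reports only the first position, and it raises a ValueError if the value is absent. A minimal sketch using the same values under the illustrative name `data`:

```python
data = (1, 2, 3, 1, 2, 2, 3, 4, 1, 5, 1)

first_one = data.index(1)    # first occurrence of 1
next_one = data.index(1, 1)  # start searching from position 1
print(first_one)  # 0
print(next_one)   # 3

# A missing value raises ValueError
not_found = False
try:
    data.index(42)
except ValueError:
    not_found = True
print(not_found)  # True

# Checking membership first avoids the exception
print(42 in data)  # False
```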

This concludes our lesson on sets and tuples. Remember that sets are mutable while tuples are not. We will see that tuples are widely used in data science.