<a href="https://colab.research.google.com/github/AtrCheema/python-courses/blob/master/basics/sets.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sets
A set is a collection of objects, much like a list, except that it is unordered, does not contain the same object more than once, and cannot contain mutable (unhashable) objects such as lists.

A set can be created from an existing sequence object such as a string, list or tuple.

In [1]:
urdu = set("National language of Pakistan")

print(type(urdu))

print(urdu)

<class 'set'>
{'l', ' ', 'a', 'u', 'P', 't', 'k', 's', 'n', 'e', 'N', 'g', 'o', 'f', 'i'}


In [2]:
pak_langs = set(["Balochi", "Barohi", "Sindhi", "Balti"])
pak_langs

{'Balochi', 'Balti', 'Barohi', 'Sindhi'}

If our sequence contains repeated objects, only one instance of each will be included in the set.

In [3]:
pak_langs = set(("Balochi", "Barohi", "Sindhi", "Balti", "Balochi"))
pak_langs

{'Balochi', 'Balti', 'Barohi', 'Sindhi'}
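A common use of this behaviour is removing duplicates from a list. The following is a minimal sketch; the list `langs` is illustrative data that extends the example above.

```python
# Illustrative data: a list with two repeated entries
langs = ["Balochi", "Barohi", "Sindhi", "Balti", "Balochi", "Sindhi"]

unique_langs = set(langs)   # duplicates collapse into one element each

print(len(langs), len(unique_langs))
print(sorted(unique_langs))  # sorted() gives a stable order for display
```

Because a set is unordered, `sorted` is used here only to make the printed result predictable.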

Although we can create a set from a list, a set cannot contain a list as one of its elements. Tuples are allowed because they are immutable.

In [4]:
pak_langs = set((("Balochi", "Barohi"), ("punjabi", "siraiki")))
pak_langs

{('Balochi', 'Barohi'), ('punjabi', 'siraiki')}

In [5]:
pak_langs = set((["Balochi", "Barohi"], ["punjabi", "siraiki"]))
pak_langs

TypeError: unhashable type: 'list'

In the second case above, we tried to build a set whose elements are two lists, so a `TypeError` was raised.

Sets are mutable, i.e. they can be changed. We can add new objects to a set as follows.

In [6]:
pak_langs = set(["Balochi", "Barohi", "Sindhi"])
pak_langs.add("Pashto")
pak_langs

{'Balochi', 'Barohi', 'Pashto', 'Sindhi'}

There is also an immutable variant of the set, called `frozenset`.

In [7]:
balochistan_langs = frozenset(["Balochi", "Barohi", "Pashto"])
balochistan_langs.add("punjabi")

AttributeError: 'frozenset' object has no attribute 'add'
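Because a `frozenset` is immutable, it is also hashable, so it can be stored inside another set or used as a dictionary key. The sketch below illustrates this; `sindh_langs`, `provinces` and `capitals` are hypothetical names introduced only for the example.

```python
balochistan_langs = frozenset(["Balochi", "Barohi", "Pashto"])
sindh_langs = frozenset(["Sindhi", "Urdu"])   # hypothetical example data

# frozensets can be elements of a regular set
provinces = {balochistan_langs, sindh_langs}
print(len(provinces))

# frozensets can also be dictionary keys; element order does not matter
capitals = {balochistan_langs: "Quetta", sindh_langs: "Karachi"}
print(capitals[frozenset(["Pashto", "Balochi", "Barohi"])])
```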

# Operations on sets



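## Adding elements
We saw how to add objects to a set with the `add` method. The rules mentioned earlier still apply: `add` cannot insert an unhashable object, and adding an object that is already present has no effect. To add several elements at once, use `update`, which takes an iterable.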

In [8]:
imperialists = {"bbc", "cnn"}

imperialists.add(["voa","dw"])

TypeError: unhashable type: 'list'

In [9]:
imperialists.add('bbc')
imperialists

{'bbc', 'cnn'}

In [10]:
imperialists.update(["voa","dw"])
imperialists

{'bbc', 'cnn', 'dw', 'voa'}

In [11]:
imperialists = {"bbc", "cnn"}
imperialists.update([["voa","dw"]])
imperialists

TypeError: unhashable type: 'list'

The `|` operator can also be used to combine two sets (their union), and `|=` does so in place.

In [12]:
imperialists = {"bbc", "cnn"}

imperialists | {"voa", "dw"}

{'bbc', 'cnn', 'dw', 'voa'}

In [13]:
imperialists = {"bbc", "cnn"}

imperialists |= {"voa", "dw"}

imperialists

{'bbc', 'cnn', 'dw', 'voa'}

## `clear`
We can clear the contents of a set by using the method `clear` on a set.

In [14]:
dakus = {"musharaf", "nawaz", "benazir"}

dakus.clear()  # after NRO (https://en.wikipedia.org/wiki/National_Reconciliation_Ordinance)
dakus

set()

## Copy

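The assignment operator `=` does not create a new set; it only binds another name to the same set object, so changes made through one name are visible through the other.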

In [15]:
more_dakus = {"pervaiz elahi", "altaf husain"}
dakus_backup = more_dakus
more_dakus.clear()
dakus_backup

set()

The `copy` method creates a shallow copy, which is a new set object that is independent of the original.

In [16]:
more_dakus = {"pervaiz elahi", "altaf husain"}
dakus_backup = more_dakus.copy()
more_dakus.clear()
dakus_backup

{'altaf husain', 'pervaiz elahi'}

In [17]:
imperialists = {"BBC", "CNN", "VOA"}

more_imperialists = imperialists.copy()

more_imperialists.add("DW")

print(imperialists)

print(more_imperialists)



{'VOA', 'BBC', 'CNN'}
{'VOA', 'BBC', 'DW', 'CNN'}
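The difference between assignment and `copy` can be checked directly with the `is` operator, which tests object identity. A small sketch, where `alias` and `clone` are names chosen for illustration:

```python
imperialists = {"BBC", "CNN", "VOA"}

alias = imperialists          # second name for the same set object
clone = imperialists.copy()   # new set object with the same elements

print(alias is imperialists)  # identity check on the alias
print(clone is imperialists)  # identity check on the copy
print(clone == imperialists)  # equality compares contents only

imperialists.add("DW")
print("DW" in alias)          # the alias sees the change
print("DW" in clone)          # the copy does not
```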


## `difference`



In [18]:
pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

pml_q.difference(pti)


{'pervaiz elahi', 'zafrullah jamali'}

In [19]:
lotas_2013 = pml_q.difference(pml_q.difference(pml_n))
print(lotas_2013)

{'umar ayyub'}
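The nested call above removes from `pml_q` everything that is not in `pml_n`, which leaves exactly the members common to both sets. As a quick check, assuming the same data, the result should equal the `intersection` described in a later section:

```python
pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_n = {"choi Nisar", "umar ayyub", "khawaja Asif"}

nested = pml_q.difference(pml_q.difference(pml_n))  # a - (a - b)
direct = pml_q.intersection(pml_n)                  # a & b

print(nested == direct)
print(direct)
```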


We can also use the `-` operator.

In [20]:
pml_q - pti

{'pervaiz elahi', 'zafrullah jamali'}

## `difference_update`

This method modifies the original set in place. It is similar to `x - y`, except that `x` itself is changed instead of a new set being returned.

In [21]:
pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

pml_q.difference_update(pml_n)

print(pml_q)

{'pervaiz elahi', 'fawad hussain', 'zafrullah jamali'}


In [22]:
pml_q.difference_update(pti)

print(pml_q)

{'pervaiz elahi', 'zafrullah jamali'}
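The in-place operator `-=` behaves like `difference_update` when both operands are sets. The sketch below uses an extra name, `same_object`, purely to show that no new set is created:

```python
pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

same_object = pml_q   # keep a second reference to the original set
pml_q -= pti          # in-place difference, like difference_update(pti)

print(sorted(pml_q))
print(same_object is pml_q)  # still the same object, now modified
```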


## `discard`

Removes an element from the set if it is present.

In [23]:
pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"} 
pml_q.discard("zafrullah jamali")
pml_q

{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}

In [24]:
pml_q.discard("choi nisar")
pml_q

{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}

`choi nisar` is not present in the set `pml_q`, but using `discard` did not raise an error.

## `remove`

Same as `discard`, except that a `KeyError` is raised if the object is not present in the set.

In [25]:
pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"} 
pml_q.remove("zafrullah jamali")
pml_q

{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}

In [26]:
pml_q.remove("choi nisar")
pml_q

KeyError: 'choi nisar'
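When it is not known in advance whether an element is present, one option is to use `discard`; another is to call `remove` and handle the `KeyError`. A minimal sketch of both, where the flag `removed` is a hypothetical name used only for illustration:

```python
pml_q = {"fawad hussain", "pervaiz elahi", "umar ayyub"}

pml_q.discard("choi nisar")        # absent element: silently does nothing

try:
    pml_q.remove("choi nisar")     # absent element: raises KeyError
    removed = True
except KeyError:
    removed = False                # handle the missing element gracefully

print(removed)
print(sorted(pml_q))
```

`remove` is the better choice when a missing element indicates a bug that should not pass unnoticed.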

## `pop`

In [27]:
pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"} 
pml_q.pop()
pml_q

{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}

In [31]:
pml_q.pop()
pml_q

KeyError: 'pop from an empty set'

`pop` removes and returns an arbitrary element of the set. Running the above cell multiple times will eventually raise a `KeyError` once the set becomes empty.
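To empty a set with `pop` without hitting the error, the set itself can serve as the loop condition, since an empty set is falsy. A small sketch, with `removed` as an illustrative name:

```python
pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"}

removed = []
while pml_q:                      # stops as soon as the set is empty
    removed.append(pml_q.pop())   # pop returns the element it removed

print(sorted(removed))  # removal order is arbitrary, so sort for display
print(pml_q)
```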

## `union`

In [32]:
pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"} 
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}

pml_q.union(pml_n)

{'choi Nisar',
 'fawad hussain',
 'firdows ashiq',
 'khawaja Asif',
 'pervaiz elahi',
 'umar ayyub'}

In [33]:
pml_q | pml_n

{'choi Nisar',
 'fawad hussain',
 'firdows ashiq',
 'khawaja Asif',
 'pervaiz elahi',
 'umar ayyub'}
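One difference between the method and the operator is that `union` accepts any number of iterables of any kind, while `|` requires both operands to be sets. The sketch below uses a list and a tuple as illustrative inputs; `everyone` and `operator_failed` are hypothetical names.

```python
pml_q = {"firdows ashiq", "fawad hussain"}
pml_n = ["choi Nisar", "umar ayyub"]     # a list, not a set
ppp = ("Amin Faheem", "umar ayyub")      # a tuple

everyone = pml_q.union(pml_n, ppp)       # several iterables at once
print(sorted(everyone))

try:
    pml_q | pml_n                        # set | list is not supported
    operator_failed = False
except TypeError:
    operator_failed = True

print(operator_failed)
```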

## `intersection`

In [34]:
pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"}  
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

pml_q.intersection(pti)

{'fawad hussain', 'firdows ashiq', 'umar ayyub'}

We can also use the `&` operator.

In [35]:
pml_q & pti

{'fawad hussain', 'firdows ashiq', 'umar ayyub'}

The original set `pml_q` remains unchanged.

In [36]:
pml_q

{'fawad hussain', 'firdows ashiq', 'pervaiz elahi', 'umar ayyub'}

However, if we use `intersection_update`, the original set is changed.

In [37]:
pml_q.intersection_update(pti)
pml_q

{'fawad hussain', 'firdows ashiq', 'umar ayyub'}

To find the intersection of more than two sets, we can unpack a list of sets into `set.intersection` as follows.

In [38]:
pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"} 
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}

sets = [pml_q, pml_n, pti]
set.intersection(*sets)

{'umar ayyub'}

or

In [39]:
sets = [pml_n, pti]
pml_q.intersection(*sets)

{'umar ayyub'}

So we can say that [`umar ayyub`](https://en.wikipedia.org/wiki/Omar_Ayub_Khan) is the most consistent lota.

## `isdisjoint`

Returns `True` if the two sets have no elements in common, i.e. if their intersection is the empty set.

In [40]:
ppp = {"firdows ashiq", "fawad hussain", "Amin Faheem", "umar ayyub"} 
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}
ji = {"liaquat baloch", "siraj ul haq", "munawar hasan"}

ppp.isdisjoint(ji)

True

In [41]:
ppp.isdisjoint(pti)

False
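`isdisjoint` returns `True` exactly when the two sets share no elements, so it should agree with checking whether the intersection is empty. A quick sketch comparing the two on the data above:

```python
ppp = {"firdows ashiq", "fawad hussain", "Amin Faheem", "umar ayyub"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}
ji = {"liaquat baloch", "siraj ul haq", "munawar hasan"}

for other in (ji, pti):
    via_method = ppp.isdisjoint(other)
    via_intersection = len(ppp & other) == 0   # empty intersection
    print(via_method, via_intersection)
```

Unlike `&`, `isdisjoint` does not build the intersection set, so it is the more direct way to ask the question.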

## `issubset`

`<=` checks whether a set is a subset of another, and `<` checks for a proper subset, i.e. a subset that is not equal to the other set.

In [42]:
pml_n = {"nawaz", "shahbaz", "pervaiz elahi", "mushahid husain"}
pml_q = {"pervaiz elahi", "mushahid husain"}

pml_q.issubset(pml_n)

True

In [43]:
pml_q <= pml_n

True

In [44]:
pml_q < pml_q

False
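The relationship between the two operators can be spelled out: `a < b` holds when `a <= b` and `a != b`. A short sketch using the same sets:

```python
pml_n = {"nawaz", "shahbaz", "pervaiz elahi", "mushahid husain"}
pml_q = {"pervaiz elahi", "mushahid husain"}

print(pml_q < pml_n)    # subset and not equal
print(pml_q <= pml_q)   # every set is a subset of itself
print(pml_q < pml_q)    # but never a proper subset of itself

# proper subset written out by hand
by_hand = pml_q <= pml_n and pml_q != pml_n
print(by_hand == (pml_q < pml_n))
```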

## `issuperset`

`>=` checks whether a set is a superset of another, and `>` checks for a proper superset.

In [45]:
pml_n = {"nawaz", "shahbaz", "pervaiz elahi", "mushahid husain"}
pml_q = {"pervaiz elahi", "mushahid husain"}

pml_n.issuperset(pml_q)

True

In [46]:
pml_n >= pml_q

True

In [47]:
pml_n > pml_n

False

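Because sets are hash-based, a membership test with `in` takes roughly constant time on average, whereas a list must be scanned element by element. This makes `in` considerably faster on large sets than on large lists.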

In [48]:
"nawaz" in pml_n

True

In [49]:
"nawaz" not in pml_q

True
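A rough way to compare the speed of `in` on a list and on a set is to time both with the standard `timeit` module. This is only a sketch: the size `n`, the repetition count and the variable names are arbitrary choices, and the measured times will vary between machines.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1   # last element: the slowest case for the list scan

# time 200 membership tests on each container
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.6f} s")
print(f"set:  {set_time:.6f} s")
```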