# Python: Everything
- 13) Sets in Python: Introduction
    - Sets are unordered collections of unique elements: duplicates are ignored
    - An example on **Jaccard distance** is also provided

## Sets in Python: Introduction
    - With an example on Jaccard distance
---
By Hamed Shah-Hosseini
<br>The whole code: https://github.com/ostad-ai/Python-Everything
 <br>Explanation in **English**: https://www.pinterest.com/HamedShahHosseini/programming-languages/python
<br>Explanation in **Persian**: https://www.instagram.com/words.persian

To create an empty set, there is only one way: call *set()*. Empty curly brackets *{ }* create a dict, not a set.

In [1]:
myset=set()
print('The empty set: ',myset)
print(f'Length of an empty set is: {len(myset)}')

The empty set:  set()
Length of an empty set is: 0
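
The difference between empty curly brackets and *set()* can be checked directly. This is a minimal sketch; the variable names `empty_braces` and `empty_set` are only illustrative.

```python
empty_braces = {}   # empty curly brackets build a dict, not a set
empty_set = set()   # the only way to build an empty set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
```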


To create and initialize a set, we have two ways: curly brackets *{ }* with the elements inside, or *set(iterable)*. The order in which the elements are printed is arbitrary.

In [3]:
myset={'potato','tomato','oil','egg'}
print(myset)
myset2=set(['meat','oil','onion'])
print(myset2)

{'egg', 'tomato', 'oil', 'potato'}
{'onion', 'meat', 'oil'}
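
Since *set(iterable)* accepts any iterable, it is also a quick way to drop duplicates. The sketch below assumes a small hypothetical list named `ingredients`; a string works too, because iterating over a string yields its characters.

```python
ingredients = ['oil', 'egg', 'oil', 'egg', 'potato']  # hypothetical data with repeats
unique_ingredients = set(ingredients)                 # duplicates are ignored
print(len(ingredients), len(unique_ingredients))      # 5 3

letters = set('hello')  # the characters h, e, l, o (the second 'l' is ignored)
print(len(letters))     # 4
```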


Sets do not support indexing, so to access the elements we iterate over the set with a *for* loop. The iteration order is arbitrary:

In [7]:
# using for-loop
myset={'potato','tomato','oil','egg'}
for elem in myset:
    print(f'It contains {elem},')

It contains egg,
It contains tomato,
It contains oil,
It contains potato,
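
Besides looping, a single element can be looked up with the *in* operator. If a predictable order is needed, one option (an addition here, not part of the original notebook) is to loop over *sorted(myset)*:

```python
myset = {'potato', 'tomato', 'oil', 'egg'}

# membership test: no loop is needed
print('oil' in myset)   # True
print('meat' in myset)  # False

# sorted() returns a list, so the order is predictable
for elem in sorted(myset):
    print(f'It contains {elem},')
```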


To add or remove elements, we use *add* and *remove* (or *discard*). If the element is not in the set, *remove* raises KeyError, whereas *discard* does nothing:

In [14]:
myset={'potato','tomato','oil','egg'}
print('The set before adding: ',myset)
elem='meat'
myset.add(elem)
print(f'The set after adding {elem}: ',myset)
elem='egg'
myset.remove(elem) # could use discard
print(f'The set after removing {elem}: ',myset)

The set before adding:  {'egg', 'tomato', 'oil', 'potato'}
The set after adding meat:  {'potato', 'egg', 'oil', 'meat', 'tomato'}
The set after removing egg:  {'potato', 'oil', 'meat', 'tomato'}
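
The difference between *remove* and *discard* only shows up when the element is missing. A minimal sketch, where the flag `raised` is only there to record what happened:

```python
myset = {'potato', 'tomato', 'oil'}

myset.discard('egg')  # 'egg' is not in the set: nothing happens

raised = False
try:
    myset.remove('egg')  # 'egg' is not in the set: KeyError
except KeyError as err:
    raised = True
    print('remove raised KeyError for', err)

print(len(myset))  # 3, the set is unchanged
```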


We can do the usual set operations on sets in Python, using the operators *|* (union), *&* (intersection), *-* (difference), and *^* (symmetric difference):

In [24]:
A={'potato','tomato','oil','egg'}
B={'meat','oil','onion'}
print('Union of the sets: ',A|B)
print('Intersection of the sets: ',A&B)
print('Set difference: ',A-B)
print('symmetric set difference: ',A^B)

Union of the sets:  {'onion', 'oil', 'potato', 'meat', 'egg', 'tomato'}
Intersection of the sets:  {'oil'}
Set difference:  {'egg', 'tomato', 'potato'}
symmetric set difference:  {'potato', 'egg', 'onion', 'meat', 'tomato'}


Instead of the operators, we may use the set methods *union*, *intersection*, *difference*, and *symmetric_difference*:

In [50]:
A={'potato','tomato','oil','egg'}
B={'meat','oil','onion'}
print('Union of the sets: ',A.union(B))
print('Intersection of the sets: ',A.intersection(B))
print('Set difference: ',A.difference(B))
print('symmetric set difference: ',A.symmetric_difference(B))

Union of the sets:  {'onion', 'oil', 'potato', 'meat', 'egg', 'tomato'}
Intersection of the sets:  {'oil'}
Set difference:  {'egg', 'tomato', 'potato'}
symmetric set difference:  {'potato', 'egg', 'onion', 'meat', 'tomato'}
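
One practical difference between the two styles: the methods accept any iterable as argument, while the operators require both operands to be sets. The sketch below assumes a list named `B_list` for illustration.

```python
A = {'potato', 'tomato', 'oil', 'egg'}
B_list = ['meat', 'oil', 'onion']  # a list, not a set

# the methods convert the iterable for us
print('Intersection: ', A.intersection(B_list))
print('Size of union: ', len(A.union(B_list)))

# the operators do not accept a list
operator_failed = False
try:
    A | B_list
except TypeError:
    operator_failed = True
    print('A | B_list raises TypeError')
```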


To copy a set or to clear it, we use the methods *copy* and *clear*. The copy is a new object with its own ID:

In [29]:
# using the set method copy(), which returns a shallow copy
myset={'one','two','three'}
myset2=myset.copy()
print(f'ID={id(myset)} for {myset}')
print(f'ID={id(myset2)} for {myset2}')
#--to clear a set
myset.clear()
print('The set after clearing becomes empty: ',myset)

ID=82184224 for {'one', 'three', 'two'}
ID=82184448 for {'one', 'three', 'two'}
The set after clearing becomes empty:  set()
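
To see why *copy* matters, compare it with plain assignment, which only gives a second name to the same set. A small sketch with the illustrative names `original`, `alias`, and `clone`:

```python
original = {'one', 'two', 'three'}
alias = original          # same object, no copy is made
clone = original.copy()   # new, independent set

original.clear()

print(alias is original)  # True
print(alias)              # set(), cleared together with original
print(sorted(clone))      # ['one', 'three', 'two'], unaffected
```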


## Jaccard distance
Let's compute the Jaccard distance between two sets:<br>
$distance(A,B)=1-\frac{|A \cap B|}{|A \cup B|}$<br>
The more similar the two sets are, the closer the distance is to zero. The maximum value of the Jaccard distance is one, which is reached when the two sets have no element in common. The formula is undefined when both sets are empty.

In [2]:
# Jaccard distance
def JaccardD(A,B): #A and B are sets
    return 1-len(A&B)/len(A|B)
# let's compute the distance between the following two sentences
sentence1='This is sentence with no punctuations to make it simple'
sentence2='The results on this topic is mentioned in the next sentence'
#---converting sentence to set
set1=set(sentence1.lower().split())
print('Set 1: ',set1)
print('------')
set2=set(sentence2.lower().split())
print('Set 2: ',set2)
print('---------------')
print("Jaccard distance between the two sets: ", JaccardD(set1,set2))

Set 1:  {'no', 'to', 'it', 'with', 'simple', 'is', 'make', 'punctuations', 'this', 'sentence'}
------
Set 2:  {'the', 'in', 'topic', 'results', 'next', 'mentioned', 'sentence', 'this', 'is', 'on'}
---------------
Jaccard distance between the two sets:  0.8235294117647058
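
The printed value can be checked by hand: the two sets share 3 words, and their union has 17 words, so the distance is 1 - 3/17 = 14/17. The sketch below repeats that arithmetic and adds a guard for two empty sets. The function name `jaccard_distance` and the choice to return 0.0 for two empty sets are assumptions made here, not part of the original code.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets a and b."""
    union = a | b
    if not union:
        # assumption: two empty sets are treated as identical
        return 0.0
    return 1 - len(a & b) / len(union)

set1 = set('this is sentence with no punctuations to make it simple'.split())
set2 = set('the results on this topic is mentioned in the next sentence'.split())

print(sorted(set1 & set2))                 # ['is', 'sentence', 'this']
print(len(set1 & set2), len(set1 | set2))  # 3 17
print(jaccard_distance(set1, set2))
print(jaccard_distance(set(), set()))      # 0.0
```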


## frozenset in Python
Let's also mention the *frozenset*, which is an immutable set in Python: once it is created, elements can be neither added to it nor removed from it:

In [59]:
fset=frozenset(['hello','world','you', 'cannot', 'change', 'me'])
# the following statement raises AttributeError, because frozenset has no add method
#fset.add('bye')
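
Because a frozenset cannot change, it is hashable, so it may be used as a dictionary key or as an element of another set, which an ordinary set cannot. A minimal sketch; the names `lookup`, `nested`, and `bigger` are only illustrative:

```python
fset = frozenset(['hello', 'world'])

lookup = {fset: 'greeting'}           # a frozenset as a dictionary key
nested = {fset, frozenset(['bye'])}   # a set whose elements are frozensets
print(lookup[frozenset(['world', 'hello'])])  # greeting

# set operations still work, but they return a new frozenset
bigger = fset | {'again'}
print(type(bigger).__name__, len(bigger))  # frozenset 3

# an ordinary set is not hashable
set_is_hashable = True
try:
    hash({'hello'})
except TypeError:
    set_is_hashable = False
print(set_is_hashable)  # False
```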

**Hint:** **True** and **1** are considered the same element in a set, because they compare equal and have the same hash. Likewise, **False** and **0** are the same element. Whichever of the two is inserted first is the one that is kept.

In [3]:
A={True,1,False,0} # 1 and 0 are ignored because True and False are already in the set
for elem in A:
    print(f'Set contains {elem}')

Set contains False
Set contains True
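
The reason can be checked directly: a set treats two values as the same element when they are equal and have equal hashes. In the sketch below the elements are added one by one, so it is clear which value arrives first; the float 0.0 is included as an extra illustration that is not in the original notebook.

```python
print(True == 1, hash(True) == hash(1))    # True True
print(False == 0, hash(False) == hash(0))  # True True

s = set()
s.add(1)     # 1 is inserted first
s.add(True)  # equal to 1, so it is ignored
print(s)     # {1}

# 0, False and 0.0 are all equal, so only one element remains
print(len({0, False, 0.0}))  # 1
```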
