<img src="https://www.mines.edu/webcentral/wp-content/uploads/sites/267/2019/02/horizontallightbackground.jpg" width="100%"> 
### CSCI250 Python Computing: Building a Sensor System
<hr style="height:5px" width="100%" align="left">

# Python data type: `set`

# Objective
* introduce the `set` data type
* discuss methods for interaction with sets

# Resources
* [Python introduction](https://docs.python.org/3/tutorial/datastructures.html#sets)
* [Programiz Python tutorial](https://www.programiz.com/python-programming/set)

# Definition 
A `set` is an unordered collection of **unique** items:
* duplicate elements are stored only once
* indexing has no meaning (elements have no position)

In [None]:
# 'a' is listed three times but stored only once
s = {'a', 2, 3.1, 2.1 + 0.2j, 'a', 'a'}

print(  id(s))
print(type(s))
print(     s )
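
Two consequences of the definition are easy to miss, so here is a minimal sketch of them. The variable names `dup` and `empty` are illustrative and do not appear elsewhere in this notebook.

```python
# Duplicates are dropped when the set is built.
dup = {'a', 'a', 'a', 2, 2}
print(len(dup))        # 2

# {} creates an empty dict, not an empty set; use set() instead.
empty = set()
print(type({}))        # <class 'dict'>
print(type(empty))     # <class 'set'>
```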

# `set` accessibility

1. indexing
2. slicing
3. mutability
4. unpacking
5. nesting

## 1. indexing

`set` types cannot be indexed (order does not matter).
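
As a quick check of the statement above, the sketch below tries to index a set and catches the resulting error. The names `letters` and `indexed` are illustrative.

```python
letters = {'a', 'b', 'c'}

try:
    letters[0]                 # a set has no positions to index
    indexed = True
except TypeError as err:
    indexed = False
    print("TypeError:", err)

# elements are reached by iteration or membership tests instead
for item in letters:
    print(item)
print('a' in letters)          # True
```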

## 2. slicing

`set` types cannot be sliced (order does not matter).
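
If a slice is needed anyway, one possible workaround is to build an ordered list from the set first. The sketch below assumes the elements can be compared with each other, which `sorted()` requires; the names `nums` and `ordered` are illustrative.

```python
nums = {3, 1, 2, 5, 4}

ordered = sorted(nums)     # a new list, in ascending order
print(ordered[1:3])        # [2, 3]
```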

## 3. mutability
Mutability is the ability to change the content of an object without changing its identity.

The `set` type is **mutable**.

In [None]:
print(s)
print( id(s) )

In [None]:
s.add('80401')

print(s)
print( id(s))
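
The same check can be written as a comparison instead of reading two printed numbers. This is a minimal sketch using an illustrative set `t` so that `s` is left untouched.

```python
t = {1, 2, 3}
before = id(t)

t.add(4)                   # modifies the set in place
t.discard(1)

print(t)
print(id(t) == before)     # True
```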

## 4. unpacking
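Unpacking assigns the elements of a `set` to several variables at once. The number of variables must match the number of elements, and because a `set` is unordered, which element lands in which variable is arbitrary.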

In [None]:
s = {'a','b','c'}
print(s)

In [None]:
x,y,z = s
print(x,y,z)
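
When the assignment order matters, one option is to sort the elements before unpacking. The sketch below assumes the elements are mutually comparable; `letters`, `first`, and `rest` are illustrative names.

```python
letters = {'a', 'b', 'c'}

# sorted() fixes the order before unpacking
x, y, z = sorted(letters)
print(x, y, z)             # a b c

# a starred target collects the remaining elements into a list
first, *rest = sorted(letters)
print(first, rest)         # a ['b', 'c']
```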

## 5. nesting
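A `set` **cannot** contain another `set`: elements must be hashable, and a mutable `set` is not. An immutable `frozenset` can be used as an element instead.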
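
The restriction comes from hashing: every element of a set must be hashable, and a mutable set is not. A minimal sketch of the failure, and of the built-in `frozenset` as one way around it (the name `nested` is illustrative):

```python
try:
    nested = {{1, 2}, {3, 4}}          # inner sets are unhashable
except TypeError as err:
    print("TypeError:", err)

# frozenset is immutable and hashable, so it can be an element
nested = {frozenset({1, 2}), frozenset({3, 4})}
print(frozenset({1, 2}) in nested)     # True
```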

# `set` specific methods
The methods of a `set` can be listed by typing the variable name, followed by `.` and **TAB**. 

The name of a method followed by `?` displays its docstring.

<img src="http://www.dropbox.com/s/fcucolyuzdjl80k/todo.jpg?raw=1" width="10%" align="right">

Explain the meaning of **methods** associated with type `set`.
* Add comments explaining their purpose. 
* Include examples demonstrating their usage.

In [None]:
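s.add('blah')                  # add a single element
print( s )

In [None]:
s.update( ['c',2.0+1.2j] )     # add every element of an iterable
print( s )

In [None]:
s.discard('blah')              # remove an element; no error if it is absent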
print( s )
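
`discard()` has a stricter sibling, `remove()`, which raises `KeyError` when the element is missing. A short sketch of the difference, using an illustrative set `colors`:

```python
colors = {'red', 'green'}

colors.discard('blue')         # element absent: nothing happens
try:
    colors.remove('blue')      # element absent: raises KeyError
    raised = False
except KeyError:
    raised = True
    print("remove() raised KeyError")

colors.remove('red')           # element present: removed
print(colors)                  # {'green'}
```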

In [None]:
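s.pop()                        # remove and return an arbitrary element

In [None]:
u = s.copy()                   # shallow copy: u is a new, independent set
print( u )
u.pop()
print( u )

In [None]:
u.clear()                      # remove all elements, leaving an empty set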

In [None]:
a = { 1,2,3,4,5}
b = { 3,4,5,6,7}

In [None]:
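# union: elements in a, in b, or in both
print(a | b)
print(a.union(b))

In [None]:
# intersection: elements in both a and b
print(a & b)
print(a.intersection(b))

In [None]:
# difference: elements of a that are not in b
print(a - b)
print(a.difference(b))

In [None]:
# difference: elements of b that are not in a
print(b - a)
print(b.difference(a))

In [None]:
# symmetric difference: elements in exactly one of a and b
print(a ^ b)
print(a.symmetric_difference(b))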

In [None]:
# membership test
print(2 in a)
print(9 in b)
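
Beyond testing single elements, sets can be compared with each other. The sketch below shows the standard `issubset()`, `issuperset()`, and `isdisjoint()` methods; the third set `c` is an illustrative addition.

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7}
c = {3, 4}

print(c.issubset(a))           # True
print(c <= a)                  # True, operator form of issubset
print(a.issuperset(c))         # True
print(a.isdisjoint(b))         # False, they share 3, 4 and 5
print(c.isdisjoint({8, 9}))    # True
```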

# `set` builtin functions

Functions and types available to the Python interpreter:

https://docs.python.org/3.3/library/functions.html

<img src="http://www.dropbox.com/s/fcucolyuzdjl80k/todo.jpg?raw=1" width="10%" align="right">

Explain the meaning and use of **builtins** usable on type `set`.
* Add comments explaining their purpose. 
* Include examples demonstrating their usage.

In [None]:
s = { 3,1,2,5,4}

In [None]:
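all(s)                 # True if every element is truthy

In [None]:
any(s)                 # True if at least one element is truthy

In [None]:
list(enumerate(s))     # (counter, element) pairs; the pairing order is arbitrary

In [None]:
len(s)                 # number of elements

In [None]:
min(s)                 # smallest element

In [None]:
max(s)                 # largest element

In [None]:
sorted(s)              # a new list with the elements in ascending order

In [None]:
sum(s)                 # sum of the elements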
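
A few edge cases of these builtins are worth seeing side by side. This is a minimal sketch on an illustrative set `vals`, not an exhaustive tour.

```python
vals = {3, 1, 2, 5, 4}

print(all(vals))                    # True
print(all(vals | {0}))              # False, because 0 is falsy
print(any({0, ''}))                 # False, no truthy element

# sorted() always returns a list, never a set
descending = sorted(vals, reverse=True)
print(descending)                   # [5, 4, 3, 2, 1]

mean = sum(vals) / len(vals)
print(mean)                         # 3.0
```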

# `set` addition

Addition with `+` is **undefined** for the `set` type; the expression below raises a `TypeError`.

In [None]:
a = { 1,2,3,4,5}
b = { 3,4,5,6,7}

a+b
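
Assuming the intent of "adding" two sets is to combine their elements, the union operator `|` is the closest set operation. A short sketch (the names `plus_failed` and `combined` are illustrative):

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7}

try:
    a + b
    plus_failed = False
except TypeError:
    plus_failed = True

combined = a | b               # union keeps each element once
print(sorted(combined))        # [1, 2, 3, 4, 5, 6, 7]
```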

# `set` multiplication

Multiplication with `*` is **undefined** for the `set` type; the expression below raises a `TypeError`.

In [None]:
a = { 1,2,3,4,5}

a*3
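
Repeating a set would have no effect, since duplicates are discarded anyway. If the goal is instead to scale every element (an assumption about the intent), a set comprehension is one way to do it; `times_failed` and `scaled` are illustrative names.

```python
a = {1, 2, 3, 4, 5}

try:
    a * 3
    times_failed = False
except TypeError:
    times_failed = True

scaled = {3 * x for x in a}    # multiply each element by 3
print(sorted(scaled))          # [3, 6, 9, 12, 15]
```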

<img src="https://www.dropbox.com/s/7vd3ezqkyhdxmap/demo.png?raw=1" width="10%" align="left">

# Demo
Consider the text in the file `pbd.txt`, extracted from [**Carl Sagan**](https://en.wikipedia.org/wiki/Carl_Sagan)'s book **Pale Blue Dot**.

*** 
Use Python `set` functions to find
* the number of distinct characters used in the text
* how many times each character appears in the text

In [None]:
import string

Load the PBD text from an external file into a string:

In [None]:
with open("pbd.txt", "r") as pbdFile:
    myPBD = pbdFile.read()

Make a set of the characters present in the text.

In [None]:
mySET = set(myPBD)

print(mySET)

Remove punctuation and the newline character from the set using `discard()`:

In [None]:
mySET.discard("\n")

pct = string.punctuation
for p in pct:
    mySET.discard(p)

print(mySET)

Define a `dict` with 
* keys for each character in the set and 
* values for the number of times the character is present. 

In [None]:
myDCT = dict()

for c in mySET:
    myDCT[c] = myPBD.count(c)

In [None]:
for k, v in myDCT.items():
    print(k, v)

Display the dictionary sorted by key. Strings sort by character code, so uppercase letters come before lowercase ones:

In [None]:
for k in sorted(myDCT.keys()):
    print(k,myDCT[k])
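
The demo calls `count()` once per distinct character, which rescans the text each time. As an alternative sketch, `collections.Counter` from the standard library tallies all characters in a single pass. The string `sample` is a placeholder standing in for the contents of `pbd.txt`, and the names `skip` and `counts` are illustrative.

```python
import string
from collections import Counter

# placeholder text standing in for the contents of pbd.txt
sample = "Pale blue dot.\nBlue dot!"

# characters to ignore: punctuation plus the newline
skip = set(string.punctuation) | {"\n"}

# Counter tallies every kept character in a single pass
counts = Counter(ch for ch in sample if ch not in skip)

print(len(counts))                 # number of distinct characters kept
for ch in sorted(counts):
    print(repr(ch), counts[ch])
```

The result matches the `set` plus `count()` approach: the keys of `counts` are the same as `set(sample) - skip`.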