# Sets and Tuples

In this notebook we will: 
- Learn how to work with `Sets` and `Tuples`

### Sets

`Sets` are unordered collections of unique elements.  Like `dictionaries`, `sets` are written with curly brackets `{...}`, but a set holds bare elements rather than `key: value` pairs, so the two should not be confused.
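
As a small sketch of the difference between the two curly-bracket forms (the variable names here are only illustrative), note in particular that empty brackets give a dictionary, so an empty set has to be created with `set()`:

```python
# A set literal uses curly brackets with bare elements; duplicates are dropped
set_literal = {1, 1, 2, 3}

# A dictionary literal uses curly brackets with key: value pairs
dict_literal = {"a": 1, "b": 2}

# Empty curly brackets create a dictionary, not a set
empty_dict = {}
empty_set = set()

print(set_literal)
print(type(dict_literal))
print(type(empty_dict))
print(type(empty_set))
```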

In [None]:
my_list = [1, 1, 1, 2, 2, 2]
my_set = set(my_list)
my_set

In [None]:
my_string = 'aaabbb'
my_set = set(my_string)
my_set

### _in_ and _not in_ in a Set

The `in` and `not in` operators work on a `set` the same way they do on `lists` and `strings`.

In [None]:
my_set = set([1, 2, 2, 3, 3, 3])
print(my_set)
2 in my_set

In [None]:
my_set = set([1, 2, 2, 3, 3, 3])
print(my_set)
6 in my_set
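
The cells above only use `in`. As a short sketch, `not in` simply gives the opposite answer (the variable names are illustrative):

```python
my_set = set([1, 2, 2, 3, 3, 3])

# 6 is not an element of the set, so `not in` is True
six_is_missing = 6 not in my_set

# 2 is an element of the set, so `not in` is False
two_is_missing = 2 not in my_set

print(six_is_missing)
print(two_is_missing)
```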

__Note__ that a `set` does not support indexing, either by position or by element.  Both attempts below raise a `TypeError`.

In [None]:
my_set = set(["car", "car", "airplane"])
print(my_set)
my_set[0]

In [None]:
my_set = set(["car", "car", "airplane"])
print(my_set)
my_set["car"]
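
If we do need positional access, one possible workaround (a sketch, not the only option) is to catch the error and build a list from the set. `sorted` is used here so that the order is predictable; `my_sorted_list` is just an illustrative name.

```python
my_set = set(["car", "car", "airplane"])

try:
    my_set[0]
except TypeError as error:
    # Sets have no positions, so indexing is not supported
    print(f"TypeError: {error}")

# sorted() returns a new list, and lists do support indexing
my_sorted_list = sorted(my_set)
print(my_sorted_list[0])
```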

We can also iterate through a `set`.

In [None]:
my_set = set(["car", "car", "airplane"])
print(my_set)

for val in my_set:
    print(val)

### Set Operations

Given two sets `A` and `B`, we can compute:

- `A.union(B)`: the elements that are in A, in B, or in both
- `A.intersection(B)`: the elements that are in both A and B
- `A.difference(B)`: the elements that are in A but not in B

In [None]:
A = set(["a","b","c"])
B = set(["c","d","e"])

In [None]:
dir(A)

In [None]:
A.union(B) # elements in A or B

In [None]:
A.intersection(B) # elements in both A and B

In [None]:
A.difference(B) # elements in A but not in B
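
Python sets also provide operator shorthands for these three methods. The sketch below checks that each operator gives the same result as the corresponding method call (the result names are illustrative):

```python
A = set(["a", "b", "c"])
B = set(["c", "d", "e"])

union_ab = A | B          # same as A.union(B)
intersection_ab = A & B   # same as A.intersection(B)
difference_ab = A - B     # same as A.difference(B)

print(union_ab == A.union(B))
print(intersection_ab == A.intersection(B))
print(difference_ab == A.difference(B))
```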

### Exercise

Write a function that determines whether every character in a 4-digit code is unique.  For example, the code '1234' should return `True`, but the code '8281' should return `False`. 

In [None]:
code1 = '1234' # should return True
code2 = '8281' # should return False

### Write code here


In [None]:
code1 = '1234' # should return True
code2 = '8281' # should return False

def is_code_valid(code):
    return len(set(code)) == 4

print(f"Is code {code1} valid? {is_code_valid(code1)}")
print(f"Is code {code2} valid? {is_code_valid(code2)}")
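
The answer above assumes the code is exactly 4 characters long. A possible variation, sketched here under the assumption that codes may have any length, compares against `len(code)` instead of a fixed `4`. The name `has_unique_characters` is hypothetical.

```python
def has_unique_characters(code):
    # A set drops duplicates, so the lengths match only when nothing repeats
    return len(set(code)) == len(code)

print(has_unique_characters('1234'))
print(has_unique_characters('8281'))
print(has_unique_characters('123456'))
```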

### Exercise

Write a function that receives a string and returns a dictionary whose `keys` are word counts and whose `values` are lists of the words that occur that many times.  

For example, the text `This is Super super cool` would return 
> {1: ['cool', 'is', 'this'], 2: ['super']}

__Note__ that the count must be case-insensitive, so `Super` and `super` count as the same word.


In [None]:
text = "This is Super super cool"

### Write code here

In [None]:
### ANSWER
text = "This is Super super cool"
def word_counter(text):
    # initialize the dictionary that maps a count to the words with that count
    count_2_word = {}
    
    # handle case insensitivity by making the text all lower case,
    # then split it on whitespace into a list of words
    # notice that we can chain the method calls
    word_list = text.lower().split()
    
    # a set keeps a single copy of each word
    unique_words = set(word_list)
    
    for word in unique_words:
        # count the word once and reuse the result
        count = word_list.count(word)
        if count not in count_2_word:
            count_2_word[count] = []
        count_2_word[count].append(word)
        
    return count_2_word

In [None]:
print(word_counter(text))
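
As an alternative sketch (not the approach used in the answer above), the standard library's `collections.Counter` can do the counting in a single pass over the words. The function name `word_counter_with_counter` is hypothetical.

```python
from collections import Counter

def word_counter_with_counter(text):
    # Counter maps each word to the number of times it appears
    word_counts = Counter(text.lower().split())

    # Invert the mapping: count -> list of words with that count
    count_2_word = {}
    for word, count in word_counts.items():
        count_2_word.setdefault(count, []).append(word)
    return count_2_word

result = word_counter_with_counter("This is Super super cool")
print(result)
```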

In [None]:
# In Dickens's day, authors were paid by the word
#
text = """
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, 
it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of 
Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing 
before us, we were all going direct to Heaven, we were all going direct the other way—in short, the period was 
so far like the present period, that some of its noisiest authorities insisted on its being received, for good 
or for evil, in the superlative degree of comparison only.
"""

res = word_counter(text)
res
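
Because `split()` only separates on whitespace, punctuation stays attached to the words, so `period` and `period,` are counted as different words. One possible way to handle this, sketched here with a hypothetical `tokenize` helper and the assumption that a word is a run of letters and apostrophes, is a regular expression:

```python
import re

def tokenize(text):
    # Keep runs of letters and apostrophes; everything else acts as a separator
    return re.findall(r"[a-z']+", text.lower())

sample = "the period was so far like the present period, that"

naive_words = sample.lower().split()
clean_words = tokenize(sample)

# The naive split sees "period" and "period," as two different words
print(naive_words.count("period"))
print(clean_words.count("period"))
```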

To find the most common word (or words), we can look up the largest count:

In [None]:
res[max(res.keys())]

Or, if we want the words with the 3 lowest counts:

In [None]:
lowest_keys = sorted(res.keys())[0:3]
for key in lowest_keys:
    print(f"{key} : {res[key]}")

### Tuples

We won't spend a lot of time on tuples, because they are _very_ similar to `lists`:
- they can hold any data type
- they can be of arbitrary length
- they are indexed (0, 1, 2, ..., length_of_tuple - 1)

Unlike lists, tuples are written with parentheses `(...)`, and the biggest difference is that tuples are immutable (they cannot be modified after creation).

In [None]:
list_of_people = ['Albert Einstein', 'Marie Curie', 'Ada Lovelace'] # square brackets for lists
tuple_of_people = ('Albert Einstein', 'Marie Curie', 'Ada Lovelace') # parentheses for tuples

In [None]:
list_of_people

In [None]:
tuple_of_people

In [None]:
list_of_people[0] = 'Emmy Noether'
list_of_people

In [None]:
tuple_of_people[0] = 'Emmy Noether' # raises a TypeError because tuples are immutable
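
Since a tuple cannot be changed in place, one way to get an "updated" version (a sketch; `updated_people` is an illustrative name) is to build a new tuple and leave the original untouched:

```python
tuple_of_people = ('Albert Einstein', 'Marie Curie', 'Ada Lovelace')

try:
    tuple_of_people[0] = 'Emmy Noether'
except TypeError as error:
    # Item assignment is not supported on tuples
    print(f"TypeError: {error}")

# Build a new tuple from a one-element tuple plus a slice of the original
updated_people = ('Emmy Noether',) + tuple_of_people[1:]

print(updated_people)
print(tuple_of_people)
```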

Operations that only read from a tuple, such as indexing, slicing, and `len`, work the same way as they do for `lists`.

In [None]:
tuple_of_people[1]

In [None]:
tuple_of_people[1:]

In [None]:
len(tuple_of_people)

`Tuples` hold data that you __do not__ wish to modify (e.g. a birthdate), while `lists` hold data that may be modified (e.g. an age).
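
As a small illustration of that guideline, the sketch below stores a birthdate in a tuple and ages in a list. All of the values and names are hypothetical and chosen only for the example.

```python
# A birthdate never changes, so a tuple is a natural fit
birthdate = (1990, 5, 17)  # hypothetical (year, month, day)

# Tuples can be unpacked into separate variables
year, month, day = birthdate
print(year, month, day)

# Ages change over time, so a list is a natural fit
ages = [34, 29]  # hypothetical ages
ages[0] += 1     # lists can be modified in place
print(ages)
```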