# Sets and Tuples

In this notebook we will: 
- Learn how to work with `Sets` and `Tuples`

### Sets

`Sets` are unordered collections of unique elements. Like `dictionaries`, `sets` are written with curly brackets `{...}`, but a set holds bare values rather than `key: value` pairs, so the two should not be confused.

In [2]:
my_list = [1, 1, 1, 2, 2, 2]
my_set = set(my_list)
my_set

{1, 2}

In [4]:
my_string = 'aaabbb'
my_set = set(my_string)
my_set

{'a', 'b'}
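
Because sets and dictionaries share curly brackets, the literal forms are easy to mix up. The short sketch below (all variable names are illustrative) shows which literal produces which type, and that an empty set has to be created with `set()`.

```python
# Curly brackets around bare values create a set; duplicates are dropped.
literal_set = {1, 1, 2}
print(type(literal_set).__name__, literal_set == {1, 2})   # set True

# Curly brackets around key: value pairs create a dictionary.
literal_dict = {"a": 1}
print(type(literal_dict).__name__)                          # dict

# Empty curly brackets are an empty dictionary, not an empty set.
empty_braces = {}
print(type(empty_braces).__name__)                          # dict

# Use set() to create an empty set.
empty_set = set()
print(type(empty_set).__name__, len(empty_set))             # set 0
```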

### _in_ and _not in_ in a Set

These operators work the same way as they do with `lists` and `strings`.

In [8]:
my_set = set([1, 2, 2, 3, 3, 3])
print(my_set)
2 in my_set

{1, 2, 3}


True

In [9]:
my_set = set([1, 2, 2, 3, 3, 3])
print(my_set)
6 in my_set

{1, 2, 3}


False
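
The heading also mentions `not in`, so here is a small sketch of both operators side by side. It additionally shows why comparing a single element to the whole set with `==` is not a membership test.

```python
my_set = set([1, 2, 2, 3, 3, 3])

print(2 in my_set)       # True: 2 is a member
print(6 in my_set)       # False: 6 is not a member
print(6 not in my_set)   # True: `not in` is the negation of `in`

# == compares the integer with the entire set, not with its elements,
# so it is False even though 2 is a member.
print(2 == my_set)       # False
```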

__Note__ that a set cannot be indexed, either by position or by element, because its elements have no order.

In [10]:
my_set = set(["car", "car", "airplane"])
print(my_set)
my_set[0]

{'airplane', 'car'}


TypeError: 'set' object is not subscriptable

In [11]:
my_set = set(["car", "car", "airplane"])
print(my_set)
my_set["car"]

{'airplane', 'car'}


TypeError: 'set' object is not subscriptable
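
When a position is really needed, one common workaround is to convert the set to a sorted list first. This is only a sketch of that idea; `ordered` is an illustrative name.

```python
my_set = set(["car", "car", "airplane"])

# Sets have no positions, so build a sorted list to get a stable order.
ordered = sorted(my_set)
print(ordered)       # ['airplane', 'car']
print(ordered[0])    # airplane

# To ask whether a value is present, use `in` instead of square brackets.
print("car" in my_set)   # True
```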

We can also iterate through a `set`, although the order in which the elements come out is not guaranteed.

In [12]:
my_set = set(["car", "car", "airplane"])
print(my_set)

for val in my_set:
    print(val)

{'airplane', 'car'}
airplane
car


### Set Operations

Given two sets `A` and `B`, we can compute:

- `A.union(B)`: elements that are in A, in B, or in both
- `A.intersection(B)`: elements that are in both A and B
- `A.difference(B)`: elements that are in A but not in B

In [13]:
A = set(["a","b","c"])
B = set(["c","d","e"])

In [14]:
dir(A)

['__and__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__iand__',
 '__init__',
 '__init_subclass__',
 '__ior__',
 '__isub__',
 '__iter__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__or__',
 '__rand__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__ror__',
 '__rsub__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__xor__',
 'add',
 'clear',
 'copy',
 'difference',
 'difference_update',
 'discard',
 'intersection',
 'intersection_update',
 'isdisjoint',
 'issubset',
 'issuperset',
 'pop',
 'remove',
 'symmetric_difference',
 'symmetric_difference_update',
 'union',
 'update']

In [15]:
A.union(B) # elements in A or B

{'a', 'b', 'c', 'd', 'e'}

In [16]:
A.intersection(B) # elements in both A and B

{'c'}

In [17]:
A.difference(B) # elements in A but not in B

{'a', 'b'}
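
The `__or__`, `__and__`, `__sub__`, and `__xor__` entries in the `dir(A)` listing are what make the operator forms of these methods work. The sketch below shows the operators on the same two sets; `sorted` is used only so the printed order is predictable.

```python
A = set(["a", "b", "c"])
B = set(["c", "d", "e"])

print(sorted(A | B))   # union:                ['a', 'b', 'c', 'd', 'e']
print(sorted(A & B))   # intersection:         ['c']
print(sorted(A - B))   # difference:           ['a', 'b']
print(sorted(A ^ B))   # symmetric difference: ['a', 'b', 'd', 'e']

# Each operator gives the same result as the corresponding method.
print((A | B) == A.union(B))   # True
```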

### Exercise

Write a function that determines whether all values in a 4-digit code are unique. For example, the code '1234' should return `True`, but the code '8281' should return `False`.

In [27]:
code1 = '1234' # should return True
code2 = '333' # should return False

def has_unique_values(text):
    # a set drops duplicates, so the lengths match only if every character is unique
    return len(set(text)) == len(text)

print(f"Is code {code1} valid? {has_unique_values(code1)}")
print(f"Is code {code2} valid? {has_unique_values(code2)}")

Is code 1234 valid? True
Is code 333 valid? False


In [28]:
code1 = '1234' # should return True
code2 = '8281' # should return False

def is_code_valid(code):
    return len(set(code)) == 4

print(f"Is code {code1} valid? {is_code_valid(code1)}")
print(f"Is code {code2} valid? {is_code_valid(code2)}")

Is code 1234 valid? True
Is code 8281 valid? False
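
The two solutions above differ on inputs that are not exactly four characters long: comparing against `len(text)` accepts `'12345'`, while comparing against `4` rejects it. A stricter variant, sketched here under the assumption that a valid code must be exactly four digits, checks all three conditions. The name `is_unique_4_digit_code` is hypothetical.

```python
def is_unique_4_digit_code(code):
    """Return True only if code is exactly 4 digits and no digit repeats."""
    return len(code) == 4 and code.isdigit() and len(set(code)) == 4

for candidate in ["1234", "8281", "333", "12345", "12a4"]:
    print(candidate, is_unique_4_digit_code(candidate))
# 1234 True
# 8281 False
# 333 False
# 12345 False
# 12a4 False
```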


### Exercise

Write a function that receives a string and returns a dictionary whose `keys` are word counts and whose `values` are the lists of words that occur that many times.

For example, the text `This is Super super cool` would return 
> {1: ['cool', 'is', 'this'], 2: ['super']}

__Note__ that the function must be case-insensitive, so `Super` and `super` count as the same word.


In [45]:
text = " is is is Super super"

### Write code here

In [46]:
### ANSWER
text = " is is is Super super"

def word_counter(text):
    # initialize the dictionary that maps a count to a list of words
    count_2_word = {}

    # handle case insensitivity by lower-casing the text,
    # then split it on whitespace into a list of words
    # (notice that we can chain the two method calls)
    word_list = text.lower().split()

    # a set keeps each distinct word exactly once
    unique_words = set(word_list)

    for word in unique_words:
        count = word_list.count(word)
        if count not in count_2_word:
            count_2_word[count] = []
        count_2_word[count].append(word)

    return count_2_word

In [49]:
def word_counter(text):
    count_2_word = {}
    word_list = text.lower().split()
    unique_words = set(word_list)
    for word in unique_words:
        if word_list.count(word) not in count_2_word:
            count_2_word[word_list.count(word)] = []
        count_2_word[word_list.count(word)].append(word)

    return count_2_word

In [51]:
print(word_counter(text))

{2: ['super'], 3: ['is']}
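
Calling `word_list.count(word)` rescans the whole list for every unique word. As an alternative sketch (not the notebook's answer), the standard library's `collections.Counter` tallies every word in a single pass; the name `word_counter_fast` is hypothetical.

```python
from collections import Counter, defaultdict

def word_counter_fast(text):
    """Group words by how often they occur, ignoring case."""
    # Counter maps each word to its number of occurrences in one pass.
    counts = Counter(text.lower().split())

    # Invert the mapping: count -> list of words with that count.
    count_2_word = defaultdict(list)
    for word, n in counts.items():
        count_2_word[n].append(word)
    return dict(count_2_word)

result = word_counter_fast("This is Super super cool")
# Sort keys and word lists only to make the printed output predictable.
print({n: sorted(words) for n, words in sorted(result.items())})
# {1: ['cool', 'is', 'this'], 2: ['super']}
```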


In [52]:
# In Dickens's day, authors were paid by the word
text = """
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, 
it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of 
Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing 
before us, we were all going direct to Heaven, we were all going direct the other way—in short, the period was 
so far like the present period, that some of its noisiest authorities insisted on its being received, for good 
or for evil, in the superlative degree of comparison only.
"""

res = word_counter(text)
res

{2: ['had',
  'its',
  'for',
  'epoch',
  'all',
  'age',
  'direct',
  'times,',
  'us,',
  'were',
  'before',
  'season',
  'going'],
 1: ['way—in',
  'being',
  'evil,',
  'wisdom,',
  'despair,',
  'insisted',
  'authorities',
  'in',
  'worst',
  'short,',
  'foolishness,',
  'everything',
  'winter',
  'to',
  'present',
  'incredulity,',
  'that',
  'hope,',
  'received,',
  'belief,',
  'degree',
  'darkness,',
  'spring',
  'period,',
  'period',
  'or',
  'noisiest',
  'like',
  'light,',
  'heaven,',
  'other',
  'nothing',
  'good',
  'superlative',
  'some',
  'so',
  'best',
  'far',
  'comparison',
  'on',
  'only.'],
 12: ['of'],
 4: ['we'],
 14: ['the'],
 11: ['was'],
 10: ['it']}
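
In the result above, `'period,'` and `'period'` are treated as different words because splitting on whitespace leaves punctuation attached. One possible fix, sketched here with a hypothetical helper `normalize_words`, strips ASCII punctuation from both ends of each word before counting. Non-ASCII characters, such as the long dash in the passage, are not in `string.punctuation` and are left untouched.

```python
import string

def normalize_words(text):
    """Lower-case the text, split on whitespace, and strip edge punctuation."""
    words = []
    for raw in text.lower().split():
        word = raw.strip(string.punctuation)
        if word:  # skip tokens that were punctuation only
            words.append(word)
    return words

words = normalize_words("the period was so far like the present period,")
print(words)
# ['the', 'period', 'was', 'so', 'far', 'like', 'the', 'present', 'period']
print(words.count("period"))   # 2
```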

The keys of `res` are counts, so `max(res.keys())` is the count of the most common word (14, for `'the'`). Using `min` instead returns the least common words:

In [57]:
res[min(res.keys())]

['way—in',
 'being',
 'evil,',
 'wisdom,',
 'despair,',
 'insisted',
 'authorities',
 'in',
 'worst',
 'short,',
 'foolishness,',
 'everything',
 'winter',
 'to',
 'present',
 'incredulity,',
 'that',
 'hope,',
 'received,',
 'belief,',
 'degree',
 'darkness,',
 'spring',
 'period,',
 'period',
 'or',
 'noisiest',
 'like',
 'light,',
 'heaven,',
 'other',
 'nothing',
 'good',
 'superlative',
 'some',
 'so',
 'best',
 'far',
 'comparison',
 'on',
 'only.']

Or, to list the words that belong to the three smallest counts:

In [58]:
lowest_counts = sorted(res.keys())[0:3]
for key in lowest_counts:
    print(f"{key} : {res[key]}")

1 : ['way—in', 'being', 'evil,', 'wisdom,', 'despair,', 'insisted', 'authorities', 'in', 'worst', 'short,', 'foolishness,', 'everything', 'winter', 'to', 'present', 'incredulity,', 'that', 'hope,', 'received,', 'belief,', 'degree', 'darkness,', 'spring', 'period,', 'period', 'or', 'noisiest', 'like', 'light,', 'heaven,', 'other', 'nothing', 'good', 'superlative', 'some', 'so', 'best', 'far', 'comparison', 'on', 'only.']
2 : ['had', 'its', 'for', 'epoch', 'all', 'age', 'direct', 'times,', 'us,', 'were', 'before', 'season', 'going']
4 : ['we']
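
The same pattern works in the other direction. The sketch below uses a small stand-in for `res` that copies only the five highest counts from the output above, and sorts the keys in descending order to list the most common words first.

```python
# Stand-in for `res`, copied from the higher counts shown in the output above.
res = {12: ['of'], 4: ['we'], 14: ['the'], 11: ['was'], 10: ['it']}

# The largest key is the highest count.
print(res[max(res)])   # ['the']

# reverse=True sorts the counts from largest to smallest.
top_counts = sorted(res, reverse=True)[:3]
for count in top_counts:
    print(f"{count} : {res[count]}")
# 14 : ['the']
# 12 : ['of']
# 11 : ['was']
```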


### Tuples

We won't spend a lot of time on tuples, because they are _very_ similar to `lists`:
- they can hold any data type
- they can be of arbitrary length
- they are indexed (0, 1, 2, ..., length_of_tuple - 1)
- they use parentheses `(...)`

The biggest difference is that tuples are immutable (they cannot be modified after they are created).

In [60]:
list_of_people = ['Albert Einstein', 'Marie Curie', 'Ada Lovelace'] # square brackets for lists
tuple_of_people = ('Albert Einstein', 'Marie Curie', 'Ada Lovelace') # parentheses for tuples

In [61]:
list_of_people

['Albert Einstein', 'Marie Curie', 'Ada Lovelace']

In [62]:
tuple_of_people

('Albert Einstein', 'Marie Curie', 'Ada Lovelace')

In [63]:
list_of_people[0] = 'Emmy Noether'
list_of_people

['Emmy Noether', 'Marie Curie', 'Ada Lovelace']

In [68]:
tuple_of_people[0] = 'Emmy Noether'   

TypeError: 'tuple' object does not support item assignment
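
Since a tuple cannot be changed in place, the usual workaround is to build a new tuple from pieces of the old one. The following is a minimal sketch of that idea; `updated_people` is an illustrative name.

```python
tuple_of_people = ('Albert Einstein', 'Marie Curie', 'Ada Lovelace')

# Item assignment raises TypeError because tuples are immutable.
try:
    tuple_of_people[0] = 'Emmy Noether'
except TypeError as err:
    print(f"TypeError: {err}")

# Build a new tuple instead: a one-element tuple plus a slice of the old one.
updated_people = ('Emmy Noether',) + tuple_of_people[1:]
print(updated_people)    # ('Emmy Noether', 'Marie Curie', 'Ada Lovelace')
print(tuple_of_people)   # ('Albert Einstein', 'Marie Curie', 'Ada Lovelace')
```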

Tuples support the same read-only operations as `lists`, such as indexing, slicing, and `len`:

In [69]:
tuple_of_people[1]

'Marie Curie'

In [70]:
tuple_of_people[1:]

('Marie Curie', 'Ada Lovelace')

In [67]:
len(tuple_of_people)

3

`Tuples` hold data that you __do not__ wish to modify (e.g. a birthdate), while `lists` hold data that may be modified (e.g. age).
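
One practical consequence of immutability ties the two halves of this notebook together: a tuple of immutable values can be stored in a `set`, while a list cannot. The sketch below uses made-up coordinate pairs purely for illustration.

```python
# Tuples of numbers are hashable, so they can be set elements.
points = {(0, 0), (1, 2), (0, 0)}
print(len(points))        # 2: the duplicate (0, 0) is stored once
print((1, 2) in points)   # True

# Lists are mutable and therefore unhashable, so this fails.
try:
    bad_points = {[0, 0]}
except TypeError as err:
    print(f"TypeError: {err}")   # TypeError: unhashable type: 'list'
```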