![pythonLogo.png](attachment:pythonLogo.png)
# Introduction to the Python language - Part 3 #  

### R. Mather May, 2020 ###

## This section is concerned with the manipulation of Sets and Dictionaries. It may contain interactive code with errors for you to repair. At the end there is also <font color="red">a logbook exercise for you to complete</font> ##

## Python Collections - Sets & Dictionaries ##

- Python compound types consist of Sequences (Lists, Tuples and Strings) and Collections (<font color="red">Sets</font> and <font color="red">Dictionaries</font>)
- A <font color="red">{set}</font> is a collection of unique, unordered and non-indexed items
- Therefore we can use the <font color="red">set()</font> function to remove duplicate items from lists … <font color="red">set(list)</font>
- To find the intersection of two sets … <font color="red">set1 & set2</font>
- To find items unique to set1 … <font color="red">set1 - set2</font>
- To find the union of both sets … <font color="red">set1 | set2</font>
- Sets are mutable, so items can be added and removed with <font color="red">set.add("item")</font> and <font color="red">set.remove("item")</font>
- A <font color="red"> {dictionary} </font> is an associative array of key-value pairs
- Keys and values are separated by a colon and pairs are separated by commas, for example ...
- <font color="red">table={"10A":"John Smith","10B":"Jack Smith","10C":"George Smith"}</font>
- To add an item to a dictionary ... <font color="red">table["10D"]="Gunnar Smith"</font>
- To delete an item ... <font color="red">del table["10D"]</font>
- To output keys ... <font color="red">table.keys()</font> 
- To output values … <font color="red">table.values()</font>

In [1]:
# SETS ... collections of unique, unordered and non-indexed items
# An example of a SET ... 
IDs = {"10A","10B","10C","10D"}
print(IDs)
# Duplicate items are automatically removed from sets ... 
IDs = {"10A","10B","10C","10D","10A"}
print(IDs)
# Sets are great for removing duplicates from lists ...
identities=["10A","10B","10C","10D","10A","10A","10B","10C","10D","10A","10A","10B","10C","10D","10A"]
print(identities)
identities=set(identities)
print(identities)

{'10C', '10B', '10D', '10A'}
{'10C', '10B', '10D', '10A'}
['10A', '10B', '10C', '10D', '10A', '10A', '10B', '10C', '10D', '10A', '10A', '10B', '10C', '10D', '10A']
{'10C', '10B', '10D', '10A'}

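Because sets are unordered, converting a list with set() can change the order of the items, as the output above shows. The following is a minimal sketch of two alternatives for when order matters. The shortened `identities` list is illustrative sample data, and the sketch assumes Python 3.7 or later, where dictionaries preserve insertion order.

```python
# Illustrative sample data, reusing the IDs from the cell above
identities = ["10A", "10B", "10C", "10D", "10A", "10B"]

# set() removes duplicates but does not promise any particular order
unique_unordered = set(identities)

# dict.fromkeys() keeps the first occurrence of each item in its original position
# (assumes Python 3.7+, where dictionaries preserve insertion order)
unique_ordered = list(dict.fromkeys(identities))

# sorted() is another option when sorted order is acceptable
unique_sorted = sorted(set(identities))

print(unique_ordered)
print(unique_sorted)
```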

In [2]:
# WORKING WITH SETS
id1={"10A","10B","10C"}
id2={"10C","10D","10E"}
print("To find the intersection of two sets do ... id1 & id2 ... which is ... ", id1 & id2)
print("To find items that are unique to id1 (not in id2) do ... id1 - id2 ... which is ... ", id1 - id2)
print("To find the union of both sets (in id1 or id2) do ... id1 | id2 ... which is ... ", id1 | id2)
id1.add('10X')
print("sets are mutable so can add ... id1.add('10X') ... ", id1 )
id1.remove('10X')
print("... and remove ... id1.remove('10X') ... ", id1 )

To find the intersection of two sets do ... id1 & id2 ... which is ...  {'10C'}
To find items that are unique to id1 (not in id2) do ... id1 - id2 ... which is ...  {'10A', '10B'}
To find the union of both sets (in id1 or id2) do ... id1 | id2 ... which is ...  {'10D', '10E', '10B', '10C', '10A'}
sets are mutable so can add ... id1.add('10X') ...  {'10X', '10C', '10B', '10A'}
... and remove ... id1.remove('10X') ...  {'10C', '10B', '10A'}

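One detail worth knowing about remove() is that it raises a KeyError if the item is not in the set. The sketch below, which reuses the `id1` set from the cell above, shows this alongside discard(), which silently does nothing when the item is missing. The variable name `has_10a` is only illustrative.

```python
id1 = {"10A", "10B", "10C"}

# remove() raises KeyError when the item is not in the set
try:
    id1.remove("10X")
except KeyError:
    print("remove() failed because '10X' is not in the set")

# discard() removes the item if present and does nothing otherwise
id1.discard("10X")

# Membership can also be checked first with the 'in' operator
has_10a = "10A" in id1
print(has_10a)
```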

In [3]:
# THE dir() FUNCTION - returns all properties and methods of an object - here demonstrated with a set
print("Use dir(set) for a full list of methods ... ", dir(set))

Use dir(set) for a full list of methods ...  ['__and__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__iand__', '__init__', '__init_subclass__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']

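Several of the method names listed by dir(set) are named equivalents of the operators used earlier. As a brief sketch, reusing the `id1` and `id2` sets from the previous cell (the result variable names are illustrative):

```python
id1 = {"10A", "10B", "10C"}
id2 = {"10C", "10D", "10E"}

common = id1.intersection(id2)              # same result as id1 & id2
only_id1 = id1.difference(id2)              # same result as id1 - id2
either = id1.union(id2)                     # same result as id1 | id2
one_only = id1.symmetric_difference(id2)    # same result as id1 ^ id2

# Each comparison prints True because the method and operator agree
print(common == (id1 & id2))
print(only_id1 == (id1 - id2))
print(either == (id1 | id2))
print(one_only == (id1 ^ id2))
```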

In [4]:
# WORKING WITH DICTIONARIES ... which are essentially associative arrays of KEY-VALUE pairs
table={"10A":"John Smith","10B":"Jack Smith","10C":"George Smith"}
print(table)
print("The value corresponding to key 10B is ...", table["10B"])
print("To add another key-value pair ... table[\"10D\"]=\"Gunnar Smith\" ")
table["10D"]="Gunnar Smith"
print(table)
print("To remove it ... del table[\"10D\"] ")
del table["10D"]
print(table)

# Dictionaries are very useful for looking up values ... e.g. a vocabulary program
print("To output keys use ... table.keys() ...", table.keys())
print("To output values use ... table.values() ...", table.values())

{'10A': 'John Smith', '10B': 'Jack Smith', '10C': 'George Smith'}
The value corresponding to key 10B is ... Jack Smith
To add another key-value pair ... table["10D"]="Gunnar Smith" 
{'10A': 'John Smith', '10B': 'Jack Smith', '10C': 'George Smith', '10D': 'Gunnar Smith'}
To remove it ... del table["10D"] 
{'10A': 'John Smith', '10B': 'Jack Smith', '10C': 'George Smith'}
To output keys use ... table.keys() ... dict_keys(['10A', '10B', '10C'])
To output values use ... table.values() ... dict_values(['John Smith', 'Jack Smith', 'George Smith'])

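Looking up a key that does not exist with square brackets raises a KeyError. The sketch below shows some common ways to handle lookups more safely and to loop over key-value pairs, reusing the `table` dictionary from the cell above. The key "10Z" and the default text "unknown" are made up for illustration.

```python
table = {"10A": "John Smith", "10B": "Jack Smith", "10C": "George Smith"}

# table["10Z"] would raise KeyError; get() returns a default value instead
missing = table.get("10Z", "unknown")

# The 'in' operator tests for a key, not a value
has_key = "10B" in table

# items() yields (key, value) pairs, which is convenient in a for loop
lines = []
for key, name in table.items():
    lines.append(f"{key}: {name}")

print(missing, has_key)
print(lines)
```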

In [5]:
# WORKED EXAMPLE 1 - accessing keys & values and storing dictionary keys & values as lists
departments={"Human Resources":3, "Sales":5, "R&D":4}
print("Here are keys directly from the dictionary", departments.keys())
print("Here are values directly from the dictionary", departments.values())
# Creating and using separate lists of keys & values
keys = list(departments.keys())
values = list(departments.values())
print("Here are keys from the list", keys)
print("Here are values from the list", values)

Here are keys directly from the dictionary dict_keys(['Human Resources', 'Sales', 'R&D'])
Here are values directly from the dictionary dict_values([3, 5, 4])
Here are keys from the list ['Human Resources', 'Sales', 'R&D']
Here are values from the list [3, 5, 4]

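Going the other way is also possible. As a small sketch, assuming the `keys` and `values` lists from Worked Example 1 are still in matching order, zip() can pair them up again and dict() can rebuild the dictionary. The names `rebuilt`, `first_key` and `first_value` are illustrative.

```python
departments = {"Human Resources": 3, "Sales": 5, "R&D": 4}
keys = list(departments.keys())
values = list(departments.values())

# zip() pairs items by position; dict() turns the pairs back into a dictionary
rebuilt = dict(zip(keys, values))
print(rebuilt == departments)

# Lists can be indexed by position, unlike the dict_keys / dict_values views
first_key = keys[0]
first_value = values[0]
print(first_key, first_value)
```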

In [6]:
# WORKED EXAMPLE 2 - sets, lists and removing duplicate values
# Create a set ... this should remove the duplicate 5 at the end of the collection
s={3,4,5,6,7,5}
print(s)
values = list(s)
print("Using count method to double-check only one occurrence of 5 ... number of occurrences =", values.count(5))

{3, 4, 5, 6, 7}
Using count method to double-check only one occurrence of 5 ... number of occurrences = 1

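The same idea can be used to test whether a list contains duplicates at all. This is a rough sketch using the numbers from Worked Example 2 stored as a list; the names `numbers`, `has_duplicates` and `duplicates` are illustrative.

```python
numbers = [3, 4, 5, 6, 7, 5]

# A set drops repeats, so a shorter set means the list had duplicates
has_duplicates = len(numbers) != len(set(numbers))

# Collect the values that occur more than once (a set comprehension)
duplicates = {n for n in numbers if numbers.count(n) > 1}

print(has_duplicates)
print(duplicates)
```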

In [7]:
# WORKED EXAMPLE 3 - working with dictionary values 
di={"Mon":20, "Tue":25, "Thu":30}
# Store all the dictionary values in a list
li=list(di.values())
print(li)
# Add another key "Wed" and assign the value 40
di["Wed"]=40
# Update the list with the new value
li=list(di.values())
print(li)
# Sum all values in both list and dictionary 
print("The sum of all li values is ...", li[0]+li[1]+li[2]+li[3])
print("The sum of all di values is ...", di["Mon"]+di["Tue"]+di["Wed"]+di["Thu"])
# Create another dictionary with numeric keys then add some values
d={0:10,1:20,2:20,3:30}
print("The sum of ... d[0]+d[3] ... is ... ", d[0]+d[3])

[20, 25, 30]
[20, 25, 30, 40]
The sum of all li values is ... 115
The sum of all di values is ... 115
The sum of ... d[0]+d[3] ... is ...  40

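Adding values one index at a time works for a handful of items but has to be rewritten whenever the dictionary grows. A shorter alternative, sketched here with the `di` dictionary from Worked Example 3, is the built-in sum() function. The names `total` and `largest_key` are illustrative.

```python
di = {"Mon": 20, "Tue": 25, "Thu": 30}
di["Wed"] = 40

# sum() adds up every value, however many items the dictionary holds
total = sum(di.values())
print("The sum of all di values is ...", total)

# max() with key=di.get returns the key that has the largest value
largest_key = max(di, key=di.get)
print("The key with the largest value is ...", largest_key)
```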

## <font color="red">Logbook Exercise 3</font> ##

Create a 'code' cell below. In this do the following:
- on the first line create the following list ... a=[0,1,2,3,4,5,6,7,8,9,10]
- on the second line create the following list ... b=[0,5,10,15,20,25]
- on the third line create the following dictionary ... topscores={"Jo":999, "Sue":987, "Tara":960, "Mike":870}
- use a combination of the print() and type() functions to produce the following output

```
list a is ...  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list b is ...  [0, 5, 10, 15, 20, 25]
The type of a is now ... <class 'list'>
```

- on the next 2 lines convert lists a and b to sets using set()
- on the following lines use a combination of print(), type() and set notation (e.g. 'a & b', 'a | b', 'b-a') to obtain the following output

```
set a is ...  {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
set b is ...  {0, 5, 10, 15, 20, 25}
The type of a is now ... <class 'set'>
Intersect of a and b is [0, 10, 5]
Union of a and b is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25]
Items unique to set b are {25, 20, 15}
```

- on the next 2 lines use print(), '.keys()' and '.values()' methods to obtain the following output 

```
topscores dictionary keys are dict_keys(['Jo', 'Sue', 'Tara', 'Mike'])
topscores dictionary values are dict_values([999, 987, 960, 870])
```

# References & Learning Resources #

 - W3Schools - there are many online resources for Python but the Python tutorial at https://www.w3schools.com/python/ is thorough, progressive, interactive and free. If you complete the main tutorial (skip the bits on installing Python as we will be using Anaconda/Jupyter) the later sections on **"File Handling"**, **"NumPy"** and **"Machine Learning"** are also relevant. The **"Exercises"** and **"Quiz"** sections are also worthwhile activities for consolidating knowledge.
 - **Phillips, D. (2015). Python 3 object-oriented programming. Packt Publishing Ltd.** Although a 3rd edition has been released, the 2nd edition is still pretty much up-to-date and seems to be widely available in PDF format. As an added bonus this covers Design Patterns in some detail.
 - **https://www.learnpython.org/** is another comprehensive and interactive resource
 - **https://docs.python.org/3.7/tutorial/** is Python's own text-based tutorial. Despite the seemingly daunting number of sub-sections, it can be consumed in a fairly short time and manages to be both concise and comprehensive.
 - **Think Python 2e** is an excellent in-depth and free version of the O'Reilly hardcopy by Allen B. Downey and is available here ... https://greenteapress.com/wp/think-python-2e/
 - I have also adapted examples from *Learn Python In A Day: The Ultimate Crash Course To Learning The Basics Of Python In No Time* by *Acodemy* but this is out of print and is only mentioned for completeness.