# More Collections - Tuples, Sequences, Named Tuples, Dictionaries & Sets
This introduces [Tuples & Sequences](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences), [NamedTuples](https://docs.python.org/3/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields), [Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) & [Sets](https://docs.python.org/3/tutorial/datastructures.html#sets).

## Introduction
Collections are data structures that group multiple values together. Python provides several of these, each suited to a different purpose. This notebook starts with a general overview and then works through the main collection types in turn.

In [None]:
# Import libraries and print the version number for troubleshooting
import sys
print(f'Python version is {sys.version}')

### Summary of Properties
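* **Tuples** are immutable (contents can be accessed, but not changed)
* **Sequences** are ordered collections that support indexing and slicing; tuples, lists, strings and ranges are all sequences. Some are immutable (tuples, strings) and some are mutable (lists)
* **NamedTuples** are tuples whose fields can also be accessed by pre-defined names
* **Dictionaries** are collections of key-value pairs (keys must be unique and hashable); since Python 3.7 they preserve insertion order
* **Sets** are unordered collections of unique, hashable values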

The main type that we have not introduced yet is the `class` object, and we will not cover it in these sessions. However, you can gain a good understanding of it by starting with the `namedtuple` (note that namedtuples can have functions and properties added to them by means of sub-classing, a form of 'inheritance').

### Instantiation (definition)
* **Tuples** are defined using the `()` format - `a_tuple = (3, 6)` - or with the `tuple()` function to convert from other collections - `a_tuple = tuple([3, 6])`. A single-element tuple needs a trailing comma - `(3,)`
* **NamedTuples** are first defined with the `namedtuple` factory function, e.g. `Point2D = namedtuple('Point2D', ['x', 'y'])`, then instances are created using positional arguments, keyword arguments or a mix of both - `p = Point2D(11, y=22)`
* **Dictionaries** are defined with keyword arguments - `dict(sape=4139, guido=4127, jack=4098)` - or from a list of (key, value) tuples - `tel = dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])` - or using the `{}` format - `tel = {'jack': 4098, 'sape': 4139}`
* **Sets** are defined with curly braces `{2, 6, 7, 8}` or with the `set()` function to convert from other collections (e.g. `set([3, 6, 4, 2, 9, 6, 2])`). Note that `{}` on its own creates an empty dictionary; use `set()` for an empty set
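
The definitions listed above can be tried side by side. The following is a minimal sketch; the variable names are chosen only for illustration, and it also shows two common pitfalls (the one-element tuple and the empty set).

```python
from collections import namedtuple

# Tuples: parentheses, or tuple() applied to any iterable
a_tuple = (3, 6)
same_tuple = tuple([3, 6])
single = (3,)          # the trailing comma makes this a one-element tuple
not_a_tuple = (3)      # just the integer 3 in parentheses

# Named tuples: build the class first, then create instances
Point2D = namedtuple('Point2D', ['x', 'y'])
p = Point2D(11, y=22)

# Dictionaries: three equivalent spellings
tel_kw = dict(sape=4139, guido=4127, jack=4098)
tel_pairs = dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
tel_braces = {'sape': 4139, 'guido': 4127, 'jack': 4098}

# Sets: braces with values, or set() applied to any iterable
a_set = {2, 6, 7, 8}
from_list = set([3, 6, 4, 2, 9, 6, 2])
empty_set = set()      # {} would create an empty dictionary instead

print(a_tuple == same_tuple, type(single).__name__, type(not_a_tuple).__name__)
print(p)
print(tel_kw == tel_pairs == tel_braces)
print(from_list == {2, 3, 4, 6, 9}, type({}).__name__)
```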

### Identifying members of the collection
* The expression `value in collection` returns `True` if the value is in the collection, else `False`. For dictionaries the test is applied to the keys, not the values.
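
As a brief sketch of how the `in` test behaves for each collection type (the variable names here are illustrative only): tuples and sets are searched by value, whereas dictionaries are searched by key.

```python
coords = {'x': 4.4, 'y': -2.7}

in_tuple = 4 in (5, 4, 7)              # tuple: compares against the values
key_found = 'x' in coords              # dictionary: compares against the keys
value_as_key = 4.4 in coords           # a value is not a key
value_found = 4.4 in coords.values()   # search the values explicitly
in_set = 3 in {3, 5, 7}                # set: compares against the members

print(in_tuple, key_found, value_as_key, value_found, in_set)
```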

### Iterating over the collections
* It is possible to iterate over these collections - `for item in collection: ...`.

## Tuples

Tuples are 'immutable' collections of data. 'Immutable' means that the contents cannot be changed once the tuple has been created. They are defined by separating values by commas and (typically) by enclosing them in parentheses, e.g. (2.5, 4.1).

Tuples are useful for carrying multiple pieces of data, which do not have to be of the same type.
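
To make 'immutable' concrete, here is a small sketch (with illustrative names) showing that assigning to an element raises a `TypeError`, and that the usual workaround is to build a new tuple.

```python
point = (2.5, 4.1)

try:
    point[0] = 9.9          # item assignment is not supported by tuples
except TypeError as err:
    print(f'TypeError: {err}')

# To "change" a tuple, build a new one from pieces of the old one
moved = (9.9,) + point[1:]
print(point, moved)
```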

In [None]:
# tuples can be defined using parentheses - '()'
tuple_1 = (3, 1)
tuple_1

In [None]:
# individual elements of a tuple can be addressed using an index [0]
tuple_2 = (3, 1)
tuple_2[0]

In [None]:
# tuples can carry different types of data
tuple_3 = (3, "apples")
tuple_3

In [None]:
# tuples can also be defined using the tuple() function to convert other collections
tuple_4 = tuple([2,4,-5])   # tuple created from list
tuple_4

In [None]:
# Check whether an item is in a tuple
4 in (5, 4, 7)

In [None]:
# Iterating over values in a tuple
for x in (2,4,5):
    print(3 + x, end = '\t')

In [None]:
# There is no such thing as a 'tuple comprehension'. However, using the corresponding notation will
# return a 'generator expression', which produces its items lazily, one at a time, when it is iterated over.
(n * 2 for n in (0, 1, 2, 3))
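
If a tuple is what is actually wanted, the generator expression can be passed to `tuple()`. A short sketch, which also shows that a generator can only be consumed once:

```python
doubles = (n * 2 for n in (0, 1, 2, 3))   # nothing is computed yet
as_tuple = tuple(doubles)                  # consuming the generator yields the values
print(as_tuple)

# A second pass over the same generator finds it already exhausted
second_pass = tuple(doubles)
print(second_pass)
```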

### Tuples as output from functions

In [None]:
# Returning comma-separated values packs them into a single tuple
def tuple_out_reverse(a, b):
    return b, a

tuple_out_reverse('pear', 6)
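
A returned tuple is often unpacked straight into separate names. The sketch below uses a hypothetical helper, `min_max`, which is not part of the original notebook.

```python
def min_max(values):
    """Return the smallest and largest value as a 2-tuple."""
    return min(values), max(values)

lowest, highest = min_max([4, -2, 9, 7])   # unpack the tuple into two names
print(lowest, highest)

# Unpacking also gives the classic one-line swap
a, b = 'pear', 6
a, b = b, a
print(a, b)
```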

## Named Tuples
> The `namedtuple` function returns a new tuple subclass with named fields. The new subclass is used to create tuple-like objects that have fields accessible by attribute lookup as well as being indexable and iterable. Instances of the subclass also have a helpful docstring (with typename and field_names) and a helpful __repr__() method which lists the tuple contents in a name=value format.

Named tuples have names that are defined for the data that they hold, and these can be used to reference the data. This means that they can act as efficient containers for data. Note that they are immutable - you cannot change the values once they have been created.
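
As a sketch of what immutability means here: assigning to a field raises an `AttributeError`, while the `_replace` method returns a new instance with selected fields changed and leaves the original untouched.

```python
from collections import namedtuple

Pt3d = namedtuple('Pt3d', ['x', 'y', 'z'])
pt = Pt3d(5.4, -1.8, 5.1)

try:
    pt.x = 0.0                 # fields are read-only
except AttributeError as err:
    print(f'AttributeError: {err}')

# _replace returns a new instance with the chosen fields changed
pt_new = pt._replace(x=0.0)
print(pt, pt_new)
```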

In [None]:
# Defining a named tuple
from collections import namedtuple
Pt3d = namedtuple('Pt3d', ['x', 'y', 'z'])
pt1 = Pt3d(5.4,-1.8,5.1)
pt1

In [None]:
# Extracting values
print('The first value can be extracted by index (pt1[0] = {}) or by name (pt1.x = {})'.format(pt1[0], pt1.x))

In [None]:
# Testing for membership (looks at values, not at names)
pt2 = Pt3d(3.4,8.5,1.2)
3.4 in pt2

In [None]:
# Iterating over values in a named tuple
pt3 = Pt3d(-9.1, 6.2, 4.8)
for i in pt3:
    print(i*i, end = '\t')

print()

# Iterating over name and value pairs
for name, value in pt3._asdict().items():   # turn it into a dictionary and extract key-value pairs
    print('{}-coordinate is {}'.format(name, value), end = '\t')
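
The summary at the start of this notebook mentioned that functions and properties can be added to a named tuple by sub-classing. The following is one possible sketch; `Point3D`, `length` and `scaled` are hypothetical names invented for this illustration.

```python
from collections import namedtuple
from math import sqrt

# Point3D is a hypothetical class used only for this illustration
class Point3D(namedtuple('Point3D', ['x', 'y', 'z'])):
    __slots__ = ()   # keep instances as lightweight as plain tuples

    @property
    def length(self):
        """Distance from the origin."""
        return sqrt(self.x**2 + self.y**2 + self.z**2)

    def scaled(self, factor):
        """Return a new point; the original is left untouched."""
        return Point3D(self.x * factor, self.y * factor, self.z * factor)

pt = Point3D(2.0, 3.0, 6.0)
print(pt.length)
print(pt.scaled(2))
```

The subclass still behaves like a tuple (it can be indexed, iterated over and unpacked), so it can be used wherever a plain tuple is expected.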

## Dictionaries
In their most basic form dictionaries are used to store (key, value) pairs. (Note that before Python 3.7 dictionaries were not required to retain the order of the items inserted.)

>A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects. There is currently only one standard mapping type, the [dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries). (For other containers see the built-in list, set, and tuple classes, and the collections module.)

Two standard ways for creating dictionaries are by means of the `dict(x = a, y = b, ...)` function, or the curly brackets notation - `{'key1': value1, 'key2': value2, ...}` (not to be confused with the set notation that also uses curly brackets).

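Additional flavours of dictionary (both found in the `collections` module) include:
* `OrderedDict` - a dictionary class that remembers insertion order; mainly needed for versions of Python before 3.7, where ordinary dictionaries did not guarantee order
* `defaultdict` - a dictionary class that supplies a default value if the key is not in the dictionary (rather than raising a `KeyError`)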
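
As an illustration of `defaultdict`, the sketch below groups words by their first letter. The word list and variable names are made up for this example.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Missing keys are created on first access using the factory (an empty list here)
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
print(by_letter['z'])    # no KeyError; note that this lookup also inserts the key 'z'
```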

In [None]:
dict(x = 2.4, y = 9.1, z = -7.1)

In [None]:
{'x':4.4, 'y':-2.7, 'z':1.2}

In [None]:
# Testing for membership
'x' in {'x':4.4, 'y':-2.7, 'z':1.2}

In [None]:
# Extracting values from a dictionary (note that this will raise a KeyError if the key is not in the dictionary)
dict_3 = {'x':4.4, 'y':-2.7, 'z':1.2}
print(dict_3['x'])

In [None]:
# The get() method returns the value if the key is present
dict_3 = {'x':4.4, 'y':-2.7, 'z':1.2}
print(dict_3.get('x'))

In [None]:
# The get() method returns None (rather than raising an exception) if the key is missing
dict_3 = {'x':4.4, 'y':-2.7, 'z':1.2}
print(dict_3.get('k'))
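
The two lookup styles can be compared directly. A brief sketch, which also shows the optional second argument of `get()` that supplies a fallback value:

```python
dict_3 = {'x': 4.4, 'y': -2.7, 'z': 1.2}

try:
    dict_3['k']                      # square brackets raise KeyError for a missing key
except KeyError as err:
    print(f'KeyError: {err}')

missing = dict_3.get('k')            # get() returns None instead
fallback = dict_3.get('k', 0.0)      # or the fallback given as the second argument
print(missing, fallback)
```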

It is possible to iterate over dictionaries and also to do 'dictionary comprehension' that generates a new dictionary from an original dictionary by carrying out operations on each key-value pair in turn.

In [None]:
# Iterating over the keys in a dictionary
for k in {'x':4, 'y':7, 'z':2}:
    print('coordinate {}'.format(k), end = '\t')
    
print()

# Iterating over keys and values in a dictionary
for k,v in {'x':4, 'y':7, 'z':2}.items():
    print('{}-coordinate is {}'.format(k, v), end = '\t')

In [None]:
# Dictionary comprehension
dict_1 = {'x':4.4, 'y':-2.7, 'z':1.2}
{k: 2*v for k, v in dict_1.items()}

In [None]:
# Creating a new dictionary using dictionary comprehension applied to a list of tuples
{k: 2*v for k, v in [('x',-6.1), ('y',5.1), ('z',-4.9), ('r',7.6)]}

## Sets
>A [set object](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset) is an unordered collection of distinct [hashable](https://docs.python.org/3/glossary.html#term-hashable) objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference. (For other containers see the built-in dict, list, and tuple classes, and the collections module.) 
>*\[from the [Python documentation](https://docs.python.org/3.7/index.html)\]*

1. If an element is added that already exists, then the set is not changed. 
2. Since a set is unordered, it is not possible to use indices to reference its members. The same applies to `frozenset`, the immutable variant of a set (i.e. you cannot change it once it has been created). To access members by position, convert the set to a list or tuple first, e.g. `sorted(my_set)`.
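
The mathematical operations mentioned in the quotation above are available as operators. The sketch below (with illustrative names) also shows a `frozenset` being used as a dictionary key, which an ordinary set cannot be, and confirms that neither kind of set supports indexing.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(a | b)    # union
print(a & b)    # intersection
print(a - b)    # difference
print(a ^ b)    # symmetric difference

# A frozenset is immutable and hashable, so it can serve as a dictionary key
fs = frozenset(a)
lookup = {fs: 'first four integers'}
print(lookup[frozenset({4, 3, 2, 1})])   # element order does not matter

try:
    fs[0]                                # indexing is not supported
except TypeError as err:
    print(f'TypeError: {err}')
```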

In [None]:
{3, 5, 7, 3, 5, 3, -5, 1, -5, 8}

In [None]:
my_set = set([3, 5, 7, 3, 5, 3, 5, 1, 8])
my_set

In [None]:
# Test for existence of a value in the set
3 in my_set

In [None]:
# Elements of the set may be extracted / removed one at a time using the `pop` method (which element is removed is arbitrary)
my_set = set([3, 5, 7, 3, 5, 6, 5, 8])
print(my_set)
print(my_set.pop())
print(my_set)

In [None]:
# Extracting another (arbitrary) element
print(my_set.pop())
print(my_set)

In [None]:
# adding a new value will change the set in-place
my_set.add(4)
my_set

In [None]:
# adding a value that is already present will not change the set
my_set.add(3)
my_set

It is possible to iterate over a set and also to carry out 'set comprehension'.

In [None]:
# Iterating over values in a set (note that while the values may appear sorted, sets are unordered and the iteration order should not be relied upon)
for x in {4,7,7,4,3,8,6,-5}:
    print(x, end = '\t')

In [None]:
# Set comprehension (using the curly brackets convention)
{s**2 for s in range(1,9)}

Note: Converting a list to a set is a convenient way of generating a sorted list of its unique values.

In [None]:
# Return a sorted list of unique values
my_list = [2.15,6.4,4.9,5.52,-5.0,8.6,-1.1]
sorted(set(my_list))
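
If the original order of first appearance matters more than sorted order, one alternative is `dict.fromkeys`, which relies on dictionaries preserving insertion order (Python 3.7 and later). A small sketch with made-up values:

```python
my_list = [6.4, 2.15, 6.4, -5.0, 2.15, 8.6]

unique_sorted = sorted(set(my_list))               # unique values, ascending
unique_in_order = list(dict.fromkeys(my_list))     # unique values, first-seen order

print(unique_sorted)
print(unique_in_order)
```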

## References
* https://pymotw.com/2/collections/namedtuple.html
* https://dbader.org/blog/writing-clean-python-with-namedtuples
* https://docs.python.org/2/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields
* https://docs.python.org/2/tutorial/datastructures.html#tuples-and-sequences