[![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PyGIS222/Fall2019/blob/master/LessonM35_TuplesSets.ipynb)

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/PyGIS222/Fall2019/master?filepath=LessonM35_TuplesSets.ipynb)

### Notebook Lesson 3.5

# Object Types: Tuples and Sets

This Jupyter Notebook is part of Module 3 of the course GIS222 (Fall2019).

This lesson discusses the Python object types **Tuples** and **Sets**. Study the content of this Notebook carefully and use the interactive examples to reflect on the material.

### Sources
Part A of this lesson was inspired by the page [Understanding Tuples in Python 3](https://www.digitalocean.com/community/tutorials/understanding-tuples-in-python-3) of the [Digital Ocean Community](https://www.digitalocean.com/community).

---

# Part A: Introduction

Table 1 summarizes the categories and literals of composite data containers in Python. It gives a first idea of how *Tuples* and *Sets* differ from *Lists* and *Dictionaries*.

Table 1. *Literals of Composite Complex Data Containers in Python*

| Container      | Category   |  Denotation  |  Feature    |  Examples  | 
| ---------:    | :---------:|  :---------: | :---------: | :--------: |
| **List**       | Sequence   | `[ ]`    | mutable     | `['a', 'b', 'c']` |
| **Dictionary** |  Mapping   | `{key: value, ...}` | mutable | `{'Alice': '38', 'Beth': '35'}` |
| **Tuple**      | Sequence   | `( )`    | immutable   | `('a', 'b', 'c')` |
| **Set**        | Set        | `set()`   |  mutable   | `set(['a', 'c', 'e'])` |

Tuples are immutable sequences that are used for grouping data. Sets are mutable, but they form a completely separate category that derives from mathematical set theory. Below, both object types are introduced separately.

# Part B: Tuples


A tuple is an immutable, or unchangeable, ordered sequence of elements. A tuple is similar to a list, except that it is immutable. Because tuples are immutable, their values cannot be modified, and they do not come with methods that would change them in place.

A tuple in Python looks like this:

In [2]:
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

Tuples have values between parentheses `( )` separated by commas `,`. Each element or value that is inside of a tuple is called an item.

Empty tuples are written as `coral = ()`, but a tuple with only one value must include a trailing comma, as in `coral = ('blue coral',)`. Without the comma, Python treats the parentheses as ordinary grouping, so a single number in parentheses could not be distinguished from a plain integer:

In [161]:
a=(40)
b=(40,)
type(a), type(b)

(int, tuple)

Now, if we `print()` the tuple `coral` defined above, we receive the following output, with the tuple still enclosed in parentheses:

In [162]:
print(coral)

('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')


When thinking about Python tuples and other data structures that are types of collections, it is useful to consider all the different collections you have on your computer: your assortment of files, your song playlists, your browser bookmarks, your emails, the collection of videos you can access on a streaming service, and more.

Tuples are similar to lists, but their values can’t be modified. Because of this, when you use tuples in your code, you are conveying to others that you don’t intend for there to be changes to that sequence of values. Additionally, because the values do not change, your code can be optimized through the use of tuples in Python, as the code will be slightly faster for tuples than for lists.

### Indexing Tuples

As an ordered sequence of elements, each item in a tuple can be called individually through indexing. Indexing is a sequence operation and works the same way as for lists.

Each item corresponds to an index number, which is an integer value, starting with the index number `0`. Because each item in a Python tuple has a corresponding index number, we can call a discrete item of the tuple by referring to its index number:

`coral[0]` calls the item `'blue coral'`       <br>
`coral[1]` calls the item  `'staghorn coral'`  <br>
`coral[2]` calls the item `'pillar coral'`     <br>
`coral[3]` calls the item `'elkhorn coral'`

If we call the tuple `coral` with any index number greater than `3`, the index is out of range and an `IndexError` is raised:

In [128]:
print(coral[22])

IndexError: tuple index out of range

In addition to positive index numbers, we can also access items from the tuple with a negative index number, by counting backwards from the end of the tuple, starting at `-1`. This is especially useful if we have a long tuple and we want to pinpoint an item towards the end of a tuple.

For the same tuple coral, the negative index breakdown looks like this:


`coral[-4]` calls the item `'blue coral'`       <br>
`coral[-3]` calls the item  `'staghorn coral'`  <br>
`coral[-2]` calls the item `'pillar coral'`     <br>
`coral[-1]` calls the item `'elkhorn coral'`
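
As a small sketch of how the two numbering schemes relate (the loop and the names `neg` and `pos` are only illustrative), a negative index `-k` refers to the same item as the positive index `len(coral) - k`:

```python
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

# Pair every negative index with its positive counterpart.
for neg in range(-len(coral), 0):
    pos = len(coral) + neg           # e.g. -1 -> 3 for a tuple of length 4
    print(neg, pos, coral[neg] == coral[pos])

last_item = coral[-1]                # same item as coral[3]
print(last_item)
```

Every line printed by the loop should end in `True`, since both indices address the same position.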


### Further Sequence Operations with Tuples

Since tuples are sequences, most sequence operations that are relevant for lists are also relevant for tuples, as long as they do not mutate the sequence.

#### Slicing

Slices allow us to call multiple values by creating a range of index numbers separated by a colon `[x:y]`. To print just the middle items of `coral`, we can create a slice. Remember, the first index number is where the slice starts (inclusive), and the second index number is where the slice ends (exclusive), which is why the following example prints the items at positions 1 and 2:

In [14]:
print(coral[1:3])

('staghorn coral', 'pillar coral')


If we want to include either end of the tuple, we can omit one of the numbers in the slicing syntax. We can also use negative index numbers when slicing tuples, just like positive index numbers:

In [17]:
print(coral[:-1])

('blue coral', 'staghorn coral', 'pillar coral')


One last parameter that we can use when slicing sequences, including tuples, is called stride. It refers to how many items to move forward after the first item is retrieved from the tuple. So far, we have omitted the stride parameter, and Python defaults to a stride of 1, so that every item between two index numbers is retrieved.

The syntax for this construction is `tuple[x:y:z]`, with `z` referring to the stride. Let’s make a larger tuple, then slice it, and give the stride a value of 2. The example below prints every second item from index `1` (inclusive) until index `11` (exclusive):

In [21]:
numbers = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
print(numbers[1:11:2])

(1, 3, 5, 7, 9)


We can omit the first two parameters and use stride alone as a parameter with the syntax `tuple[::z]`. Here is an example that prints every third tuple item:

In [22]:
print(numbers[::3])

(0, 3, 6, 9, 12)
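
The slicing rules above can be explored a little further. The following sketch (variable names such as `every_second` and `beyond_end` are chosen here for illustration) shows that a slice of a tuple is itself a tuple, that slice bounds past the end are tolerated, and that a negative stride walks through the tuple backwards:

```python
numbers = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

every_second = numbers[1:11:2]   # start 1, stop 11 (exclusive), stride 2
print(type(every_second))        # a slice of a tuple is again a tuple

beyond_end = numbers[10:100]     # slice bounds past the end are clipped, no IndexError
print(beyond_end)

backwards = numbers[::-1]        # a negative stride walks from the end to the start
print(backwards)
```

Note the contrast to single-item indexing, where an index past the end raises an `IndexError`.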


#### Concatenating and Multiplying Tuples

Operators can be used to concatenate or multiply tuples. Concatenation is done with the `+` operator, and multiplication is done with the `*` operator. 

We can concatenate string items in a tuple with other strings using the `+` operator:

In [24]:
print('This reef is made up of ' + coral[1])

This reef is made up of staghorn coral


We were able to concatenate the string item at index number `1` with the string `'This reef is made up of '`. We can also use the `+` operator to concatenate two or more tuples together.

In [73]:
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')
kelp = ('wakame', 'alaria', 'deep-sea tangle', 'macrocystis')
coral_kelp = (coral + kelp)
print(coral_kelp)

('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral', 'wakame', 'alaria', 'deep-sea tangle', 'macrocystis')


Because the `+` operator can concatenate, it can be used to combine tuples to form a new tuple, though it cannot modify an existing tuple.
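
A minimal sketch of this point, using shortened versions of the tuples from above and an extra name `original` introduced only for illustration: concatenation builds a new tuple object, while the tuple that existed before stays exactly as it was.

```python
coral = ('blue coral', 'staghorn coral')
kelp = ('wakame', 'alaria')

original = coral          # second name for the same tuple object
coral = coral + kelp      # builds a new tuple and rebinds the name `coral`

print(original)           # the first tuple is unchanged
print(coral)
print(coral is original)  # False: two different objects
```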

The `*` operator can be used to multiply tuples. Perhaps you need to make copies of all the files in a directory onto a server or share a playlist with friends — in these cases you would need to multiply collections of data.

Let’s multiply the `coral` tuple by `2` and the kelp tuple by `3`, and assign those to new tuples:

In [27]:
multiplied_coral = coral * 2
multiplied_kelp = kelp * 3

In [28]:
print(multiplied_coral)

('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral', 'blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')


In [29]:
print(multiplied_kelp)

('wakame', 'alaria', 'deep-sea tangle', 'macrocystis', 'wakame', 'alaria', 'deep-sea tangle', 'macrocystis', 'wakame', 'alaria', 'deep-sea tangle', 'macrocystis')


By using the `*` operator we can replicate our tuples by the number of times we specify, creating new tuples based on the original data sequence.

### How Tuples Differ from Lists

The primary way in which tuples are different from lists is that they cannot be modified. This means that items cannot be added to or removed from tuples, and items cannot be replaced within tuples. (You can, however, concatenate two or more tuples to form a new tuple, as we have seen in the examples above.)

Let’s consider our `coral` tuple. Say we want to replace the item `'blue coral'` with a different item called `'black coral'`. If we try to change that item the same way we do with a list, ...

In [64]:
coral[0] = 'black coral'

TypeError: 'tuple' object does not support item assignment

... we will receive a `TypeError`. This is because tuples cannot be modified.
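
If a changed sequence is needed anyway, one possible approach (a sketch, with the name `new_coral` chosen here for illustration) is to catch the error and build a new tuple from a one-item tuple and a slice of the old one:

```python
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

try:
    coral[0] = 'black coral'            # item assignment is not supported
except TypeError as err:
    print('TypeError:', err)

# Build a new tuple instead: one new item plus a slice of the old tuple.
new_coral = ('black coral',) + coral[1:]
print(new_coral)
print(coral)                            # the original tuple is untouched
```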

#### Literals & Type Conversion

If we create a tuple and decide that what we really need is a list, we can convert it with `list()`. Note that `list()` returns a new list object; the `coral` tuple itself remains unchanged unless we assign the result back to the name `coral`:

In [122]:
list(coral)

['blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral']

We can see that the tuple was converted to a list because the parentheses changed to square brackets.

Likewise, we can convert lists to tuples with `tuple()`.

In [123]:
a_list = [15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3]
list(tuple(a_list))  # converting a list into a tuple, then back into a list

[15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3]
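
The two conversions can be combined into a simple workflow for "editing" a tuple. This is only a sketch; the names `coral_list` and `coral_updated` and the added item `'brain coral'` are assumptions made for the example:

```python
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

coral_list = list(coral)            # new list; `coral` itself stays a tuple
coral_list.append('brain coral')    # lists support in-place changes
coral_updated = tuple(coral_list)   # freeze the result into a new tuple

print(type(coral), type(coral_list), type(coral_updated))
print(coral_updated)
```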

#### Nesting
Similarly to lists and dictionaries, tuples can be nested. For example a tuple can be nested in a tuple:

In [124]:
at=(1,2,3,(1,2,3))
print(at)

(1, 2, 3, (1, 2, 3))
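
Items of a nested tuple are reached by chaining index operations. A short sketch based on the tuple `at` from above (the name `inner` is introduced here for illustration):

```python
at = (1, 2, 3, (1, 2, 3))

inner = at[3]        # the fourth item is itself a tuple
print(inner)
print(at[3][1])      # second item of the nested tuple
print(len(at))       # the nested tuple counts as a single item
```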


#### Tuples as Keys in Dictionaries
In contrast to lists, tuples can also be used as keys in dictionaries. The following cell creates a dictionary `dict1` with three entries, where the last one uses a tuple as its key. Then the dictionary is printed to the screen. Study the example to see how tuples can be used as keys:

In [125]:
dict1 = {5:"number","to":"string",(1,"a"):"tuple"}
for x in dict1.keys(): print(x,' : ',dict1[x])

5  :  number
to  :  string
(1, 'a')  :  tuple
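
In a GIS context, coordinate pairs are a natural candidate for tuple keys. The sketch below uses a hypothetical dictionary `stations` with approximate, made-up sample coordinates, and also shows that a list is rejected as a key:

```python
# Hypothetical example data: approximate (latitude, longitude) pairs as keys.
stations = {
    (44.05, -123.09): 'Eugene',
    (45.52, -122.68): 'Portland',
}
print(stations[(45.52, -122.68)])

# A list is mutable and therefore unhashable, so it is rejected as a key.
try:
    stations[[46.0, -120.0]] = 'unnamed'
except TypeError as err:
    print('TypeError:', err)
```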


#### List comprehensions
Tuples can be processed by *list comprehensions*. Try to generate a new list that is based on the tuple `a_tuple` and contains the square of each item. What object type does the operation return, a tuple or a list? And why is that so?

In [126]:
a_tuple = (15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3)

In [127]:
             # add your list comprehension on a_tuple here

### Tuple Built-in Functions & Methods

There are a few built-in functions that work with tuples. In general, most built-in functions that work with sequences also work with tuples. Of the sequence methods, however, only a few are available, because most methods alter a sequence in place and tuples are not mutable. Some examples are given in Table 2:

Table 2: *Built-in Tuple Functions & Methods*

| Built-in Functions & Methods | Description |
| :-: | :-: |
| len()               | Gives the total length of the tuple. |
| min()               | Returns item from the tuple with min value. |
| max()               | Returns item from the tuple with max value. |
| sum()               | Adds up all items (numeric items only). |
| reversed()          | Returns a reversed iterator of a sequence. |
| tuple(seq)          | Converts a list into tuple. |
| .index()       | Returns the index of the first occurrence of a given value. |
| .count()       | Returns the number of times a given value occurs in the tuple. |
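
As a quick illustration of the functions and methods in Table 2, here is a sketch on a hypothetical tuple `depths` (the values are made up for this example and are not part of the lesson data):

```python
depths = (12, 7, 30, 7, 18)          # hypothetical sample values

print(len(depths))                   # number of items
print(min(depths), max(depths))      # smallest and largest item
print(sum(depths))                   # total of all items
print(tuple(reversed(depths)))       # reversed() returns an iterator, so wrap it in tuple()
print(depths.index(7))               # position of the first 7
print(depths.count(7))               # how often 7 occurs
```

None of these calls changes `depths`; each one only reads the tuple and returns a new value.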

Experiment with these functions using the tuples `numbers`, `coral` and `kelp` in the following cells.

In [163]:
a_list = [15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3]
a_tuple = tuple(a_list)      # converts the list into a tuple
print(a_tuple)

(15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3)


In [153]:
              # try the function len()

In [154]:
              # try the function min()

In [155]:
              # try the function max()

In [156]:
a_tuple.index(3.9) # try the method .index()

2

In [157]:
              # try the method .count()

### Tuple Summary

Tuples are ordered, immutable, fixed-length collections of arbitrary objects.
Because a tuple is a sequence that cannot be modified, Python can process it somewhat faster than a list. When others collaborate with you on your code, your use of tuples conveys to them that you don’t intend for those sequences of values to be modified. Due to their immutability, tuples can also take over the role of a *constant* declaration. In contrast to lists, they can be used as dictionary keys, because keys must be immutable, too. This is helpful when building larger dictionaries that use whole groups of values as keys.

Most sequence operations that work on strings and lists also work on tuples:
* Values/items are indexed by integers 
* Slicing, concatenating, repetition  
* List comprehensions are applicable
* Nesting
 
The following page provides a comprehensive overview of tuple operands and functions:
https://www.tutorialspoint.com/python3/python_tuples.htm    

# Part C: Sets

### Introduction

As one of the most fundamental concepts in mathematics, set theory is a branch of mathematical logic that studies sets. Therein, a set is an "*unordered collection of unique and immutable objects that supports operations corresponding to* ***mathematical set theory***". 

<img src="M35_Image_SetExample.png" alt="Illustrating Sets." title="A Set" width="300" />
Figure 1: *A Set of Geometric Objects*

For example, the graphic above shows a collection of geometric objects, each distinctly different from the others. Such a collection of distinct objects is considered an object in its own right. The purpose of sets is to construct and manipulate unordered collections of unique elements, in order to support mathematical applications or to analyze complex data structures. For the set in Figure 1, we might be interested in finding geometric objects that have four corners and are green. For such an analysis, the set theory concept *Intersection* helps us (Figure 2). Major concepts in set theory are explained in Table 3.

<img src="M35_Image_SetTheoryOperation.png" alt="Illustrating Set Theory." title="Set Theory" width="300" />

Figure 2: *A Venn Diagram Illustrating the Intersection of Two Sets.* [Source: Wikipedia](https://en.wikipedia.org/wiki/Set_theory) 



Table 3: *Set Theory Concepts* [Source: Wikipedia](https://en.wikipedia.org/wiki/Set_theory) 

| Set Theory Concept | Description |
| :-: | :- |
| Membership | Set theory begins with a fundamental binary relation between an object o and a set A. If o is a member (or element) of A, the notation o ∈ A is used. |
| Subset / Superset | A derived binary relation between two sets is the subset relation, also called set inclusion. If all the members of set A are also members of set B, then A is a subset of B, denoted A ⊆ B. For example, {1, 2} is a subset of {1, 2, 3} , and so is {2} but {1, 4} is not. B is also called superset of A.|
| Union | Union of the sets A and B, denoted A ∪ B, is the set of all objects that are a member of A, or B, or both. The union of {1, 2, 3} and {2, 3, 4} is the set {1, 2, 3, 4}. |
| Intersection | Intersection of the sets A and B, denoted A ∩ B, is the set of all objects that are members of both A and B. The intersection of {1, 2, 3} and {2, 3, 4} is the set {2, 3}. |
| Difference | Set difference of U and A, denoted U \ A, is the set of all members of U that are not members of A. The set difference {1, 2, 3} \ {2, 3, 4} is {1} , while, conversely, the set difference {2, 3, 4} \ {1, 2, 3} is {4}. |



Sets can be derived from data lists. For example, a list of numbers may contain repeating entries. However, we can generate a set from a list, if we select each item only once from the original list:

Table 4: *Example for a Set of a List of Numbers*

| Type / Member | 1 | 2 | 3 | 4 | 5 | 6 |
| -:            |:-:|:-:|:-:|:-:|:-:|:-:|
|List:          | 0 | 7 | 2 | 7 | 0 | 4 |
|Set:           | 7 | 0 | 4 | 2 |   |   |

Table 5: *Example for a Set of a List of Animals (Strings)*

| Type / Member | 1     | 2   |  3    |  4  |   5  |   6   |
| -:            | :-:   | :-: | :-:   | :-: |  :-: |  :-:  |
| List:         | dog   | cat | mouse | cat | duck | mouse |
| Set:          | mouse | dog | cat   | duck|      |       |

In geographical data analysis, the concepts of sets and set theory can help to find spatial objects that match two conditions, for example:
* Combining two regions' animal species to get a collection
* Receiving coordinates of certain buildings in a certain region
* Finding a French style restaurant near a park

### Sets in Python

As the examples illustrate, sets are unordered and contain no duplicates. Hence, they are neither mappings nor sequences, but form a completely separate category of objects. Consequently, Python provides a dedicated object type for *Sets*.

To generate a set from any sequence (or, more generally, any iterable), the constructor `set()` is used:


In [168]:
x = set('abcde')
print(x)

{'e', 'd', 'c', 'b', 'a'}


In [169]:
type(x)

set

Note that sets can only contain immutable (hashable) objects, although sets themselves are mutable. Therefore, sets can contain tuples and strings (which are immutable), but they cannot contain lists or dictionaries (because those are mutable). However, a list can be used to create a set, exactly as illustrated by the examples in Tables 4 and 5 above:

In [278]:
numberList = [0, 7, 2, 7, 0, 4]
numberSet = set(numberList)
print(numberSet)

{0, 2, 4, 7}


In [279]:
AnimalList = ['dog', 'cat', 'mouse', 'cat', 'duck', 'mouse']
AnimalSet = set(AnimalList)
print(AnimalSet)

{'mouse', 'dog', 'cat', 'duck'}
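
The rule about which objects a set may contain can be checked directly. In the following sketch (the names `points` and `as_set` are hypothetical), tuples are accepted as members, a list is rejected, and a list is still usable as the source of a new set:

```python
points = set([(0, 0), (1, 2)])   # tuples are immutable, so they can be set members
points.add((3, 4))               # the set itself is mutable
print(len(points))

try:
    points.add([5, 6])           # a list is mutable (unhashable) and is rejected
except TypeError as err:
    print('TypeError:', err)

as_set = set([5, 6, 5])          # a list may still serve as the source of a set
print(as_set)
```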


### Set Theory Concepts in Python

The following operators implement the set theory concepts in Python:

In [274]:
x = set('abcde')
y = set('bdxyz')

In [192]:
'e' in x        # Membership of 'e' in set x      

True

In [193]:
x < y           # Proper subset: is x a proper subset of y?

False

In [255]:
x > y           # Proper superset: is x a proper superset of y?

False

In [194]:
x | y           # Union of x and y

{'a', 'b', 'c', 'd', 'e', 'x', 'y', 'z'}

In [195]:
x & y           # Intersection of x and y

{'b', 'd'}

In [196]:
x - y           # Difference of x and y

{'a', 'c', 'e'}

Now let's make sense of these concepts and apply them to our geographical example of a restaurant search. We have a list of restaurants in our town; some larger restaurant chains have several branches:

In [266]:
restaurantsInOurTown = [
    'La Madeleine',
    'Pont Blanc',
    'Olive Garden',
    'Pommes de Terre',
    'Jean"s',
    'LaBoca', 
    'Pont Blanc', 
    'La Madeleine', 
    'Berliner Kueche', 
    'La Madeleine',
    'Olive Garden']

Additional information is given in the form of two sets. One contains all French-style restaurants:

In [267]:
setFrench = set(['La Madeleine','Pont Blanc','Jean"s'])

The other one contains all restaurants near a park:

In [239]:
setPark = set(['LaBoca', 'Pont Blanc', 'La Madeleine', 'Berliner Kueche', 'Olive Garden'])

Each of the set theory concepts can solve a different search question:

In [260]:
'La Madeleine' in setPark  # Is the restaurant 'Madeleine' located near a park?

True

In [261]:
setFrench < setPark        # Are French restaurants a category of park restaurants?

False

In [262]:
setFrench > setPark       # Are park restaurants a category of French restaurants?

False

In [263]:
setFrench | setPark        # We are just hungry, any restaurant is fine!

{'Berliner Kueche',
 'Jean"s',
 'La Madeleine',
 'LaBoca',
 'Olive Garden',
 'Pont Blanc'}

In [264]:
setFrench & setPark        # Find a French restaurant near a park!

{'La Madeleine', 'Pont Blanc'}

In [265]:
setPark - setFrench       # Find a restaurant that is near a park but not French style!

{'Berliner Kueche', 'LaBoca', 'Olive Garden'}
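
Set operations can also be chained. The sketch below uses a shortened, hypothetical version of the restaurant data (the names `restaurants`, `set_french` and `set_park` and their contents are assumptions for this example) to first remove duplicate branches and then combine several conditions in one expression:

```python
restaurants = ['La Madeleine', 'Pont Blanc', 'Olive Garden', 'Pommes de Terre',
               'Pont Blanc', 'La Madeleine', 'Berliner Kueche']
set_french = set(['La Madeleine', 'Pont Blanc'])
set_park = set(['Pont Blanc', 'Berliner Kueche', 'Olive Garden'])

unique_restaurants = set(restaurants)        # duplicate chain branches vanish

# French AND near a park AND actually present in our town list.
french_near_park = unique_restaurants & set_french & set_park

# In town, but neither French nor near a park.
neither = unique_restaurants - (set_french | set_park)

print(sorted(unique_restaurants))            # sorted() gives a stable order for printing
print(sorted(french_near_park))
print(sorted(neither))
```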

### Sets and List Comprehensions

List comprehensions can also iterate over sets. Because sets are unordered, the order of the resulting list may vary:

In [284]:
[e*2 for e in x]

['ee', 'dd', 'cc', 'bb', 'aa']
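
A brief sketch of two details worth noting here: the comprehension returns a list (not a set), and wrapping the result in `sorted()` gives a reproducible order. The name `doubled` is chosen only for this example:

```python
x = set('abcde')

doubled = [e * 2 for e in x]       # a list comprehension always returns a list
print(type(doubled))

# Sets are unordered, so the order of `doubled` can vary; sorted() makes it stable.
print(sorted(doubled))
```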

### Set Methods

Table 6 lists built-in set operations and methods. Study the list and experiment with the methods in the cells below.

Table 6: *Set Operations and Methods*

| Operation | Equivalent | Result |
| :- | :-: | :- |
|`len(s)`            |           | Number of elements in set s (cardinality) |
|`.issubset(t)`      |  `s <= t` | Test whether every element in s is in t   |
|`.issuperset(t)`    | `s >= t`  | Test whether every element in t is in s   |
|`.union(t)`         | `s \| t`  | New set with elements from both s and t   |
|`.intersection(t)`  | `s & t`   | New set with elements common to s and t   |
|`.difference(t)`    | `s - t`   | New set with elements in s but not in t   |
|`.symmetric_difference(t)` |`s ^ t` | New set with elements in either s or t but not both |
|`.copy()`    | | New set with a shallow copy of s   |
|`.add(x)`    | | Add element x to set s   |
|`.remove(x)` | | Remove x from set s (raises `KeyError` if x is not in s)   |
|`.pop()`    | | Remove and return an arbitrary element from s   |
|`.clear()`   | | Remove all elements from set s   |
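
As a sketch of how the entries in Table 6 behave, the following cell compares some methods with their operator equivalents and then applies the methods that change a set in place. The sets `s` and `t` reuse the letters from the earlier examples; the names `sym` and `taken` are only illustrative:

```python
s = set('abcde')
t = set('bdxyz')

# Each method has an operator twin that returns the same result.
print(s.union(t) == (s | t))
print(s.intersection(t) == (s & t))
print(s.difference(t) == (s - t))

sym = s.symmetric_difference(t)    # in s or t, but not in both
print(sorted(sym))

# Methods that change the set in place.
s.add('f')
s.remove('a')            # raises KeyError if 'a' is missing
taken = s.pop()          # removes and returns an arbitrary element
print(taken, len(s))
```

Which element `pop()` returns is not predictable, because sets have no order.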

### `frozenset()`

Frozenset is a class with the characteristics of a set, but once its elements have been assigned, they cannot be changed. Tuples can be seen as immutable lists, while frozensets can be seen as immutable sets.

Sets are mutable and unhashable, which means we cannot use them as dictionary keys. Frozensets are hashable and we can use them as dictionary keys.

To create a frozenset, we use the `frozenset()` constructor. Let us create a frozenset `z`:

In [285]:
z = frozenset('bdxyz')
print(z)

frozenset({'y', 'd', 'b', 'z', 'x'})


Frozensets support the non-mutating Python set methods such as `copy()`, `difference()`, `symmetric_difference()`, `isdisjoint()`, `issubset()`, `intersection()`, `issuperset()`, and `union()`.
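
The properties described in this section can be tried out with a short sketch. The dictionary `labels` and its value are hypothetical and serve only to show a frozenset used as a key:

```python
z = frozenset('bdxyz')

# A frozenset is hashable and can act as a dictionary key.
labels = {z: 'letters of bdxyz'}
print(labels[frozenset('zyxdb')])     # equal frozensets find the same entry

# Non-mutating set operations still work and return a new frozenset.
common = z & frozenset('abcde')
print(common)

# Mutating methods such as add() do not exist on a frozenset.
try:
    z.add('q')
except AttributeError as err:
    print('AttributeError:', err)
```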