# Introduction to Lists and Sets

This is part of a tutorial on Python's basic objects.

Authors: ['Arthur de Fluiter', 'Cheng-yu Lam']

This particular document deals with Python's `list` and `set`.

# List

In Python, a list is:
 - *dynamic*: a list may grow and shrink.
 - *ordered*: elements keep the order in which they were added, so the elements added first come first.
 - *non-type specific*: each element can be of a different type.
 - *expensive to query*: to find out whether something is in a list, Python has to go through the elements one by one.
 - *cheap to store*: relative to other structures, the list is fairly cheap in memory.

In [1]:
# Creating an empty list can be done in the following ways
empty_list = list()
empty_list = []

# A list of strings
string_list = ["apple", "orange", "banana"]

# A list of numbers
number_list = [1, 1, 2, 2, 3, 42, 50, 100, 99999]

# A list of varying types
varying_list = ["not", 24, "specified", "what", "type", True]

## Length of a list

To get the length/size of a list, you can use `len(list)`.

In [2]:
print("The length of the number list: %s" % len(number_list))

The length of the number list: 9


## Add an item to a list

To add a single item to the end of an existing list, you can use the `append()` method.

In [3]:
test = [1,2,3]
test.append(4)
test

[1, 2, 3, 4]

## Add multiple items to a list

Sometimes you want to add multiple elements to a list at once. This can be done with the `extend()` method.

This method takes any 'iterable' object, meaning any object you can loop over, for example:

In [4]:
test = [1,2,3]
test.extend([-1, -2, -3]) # a list
print(test)
test.extend(range(5))     # a range
print(test)
test.extend("abc")        # a string: loops over it, adding each character separately
print(test)

[1, 2, 3, -1, -2, -3]
[1, 2, 3, -1, -2, -3, 0, 1, 2, 3, 4]
[1, 2, 3, -1, -2, -3, 0, 1, 2, 3, 4, 'a', 'b', 'c']
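
The difference between `append()` and `extend()` is easiest to see when both are given the same list. The sketch below uses two throwaway lists (`appended` and `extended`, names chosen only for this illustration).

```python
# Hypothetical lists, used only for this illustration
appended = [1, 2, 3]
appended.append([4, 5])   # the list [4, 5] becomes one single (nested) element
print(appended)           # [1, 2, 3, [4, 5]]

extended = [1, 2, 3]
extended.extend([4, 5])   # each element of [4, 5] is added separately
print(extended)           # [1, 2, 3, 4, 5]
```

So `append()` always makes the list exactly one item longer, while `extend()` adds as many items as the iterable contains.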


## Get an item from a list

Lists are ordered by their index. The index starts from 0, so the first item has index 0, the second item has index 1, and so on.

To get an item from a list, use `list[index_of_item]`, for example `list[0]`

In [5]:
print("The third item in the number list: %s" % number_list[2])

The third item in the number list: 2
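
Indices can also be negative, in which case Python counts from the end of the list. An index that does not exist raises an `IndexError`. A small sketch, using a hypothetical list called `letters`:

```python
letters = ["a", "b", "c", "d"]   # hypothetical example list

print(letters[-1])   # d  (the last item)
print(letters[-2])   # c  (the second to last item)

# An index outside the list raises an IndexError
try:
    letters[10]
except IndexError as error:
    print("IndexError: %s" % error)
```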


## Get multiple items from a list
You can get multiple consecutive items from a list with the [slice notation](https://docs.python.org/3.5/tutorial/introduction.html).

In [6]:
print(number_list[0:5])   # items from index 0 to 4 (5-1)
print(number_list[:5])    # items from the beginning to index 4
print(number_list[5:])    # items from index 5 to the end

[1, 1, 2, 2, 3]
[1, 1, 2, 2, 3]
[42, 50, 100, 99999]
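
Slices also accept negative indices and an optional third value, the step. The sketch below uses a hypothetical list `digits` to show a few common patterns.

```python
digits = list(range(10))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(digits[::2])    # every second item: [0, 2, 4, 6, 8]
print(digits[-3:])    # the last three items: [7, 8, 9]
print(digits[::-1])   # a reversed copy: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(digits[2:20])   # slice bounds past the end are clipped: [2, 3, 4, 5, 6, 7, 8, 9]
```

Unlike a single index, a slice never raises an `IndexError`; it simply returns whatever part of the range exists.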


## Set an item in a list

Similarly, you can assign a new value to an item in a list using its index.

In [7]:
# set a single element
number_list[2] = 1
print(number_list)

[1, 1, 1, 2, 3, 42, 50, 100, 99999]


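## Set multiple items in a list

You can also use the slice notation to replace part of a list. The replacement does not need to have the same length as the slice, so the list may grow or shrink. (Note that this can be fairly slow compared to `append()`/`extend()`, since the remaining items may have to be moved.)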

In [8]:
print(number_list)

# set multiple elements (the six items at indices 0-5 are replaced with the three items [-1,-2,-3])
number_list[:6] = [-1, -2, -3]

print(number_list)

[1, 1, 1, 2, 3, 42, 50, 100, 99999]
[-1, -2, -3, 50, 100, 99999]
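
Because the replacement may be longer or shorter than the slice it replaces, slice assignment can also insert or remove items. A minimal sketch with a hypothetical list `values`:

```python
values = [0, 1, 2, 3, 4, 5]   # hypothetical example list

values[1:3] = ["a", "b", "c", "d"]   # replace 2 items with 4 items
print(values)                        # [0, 'a', 'b', 'c', 'd', 3, 4, 5]

values[1:5] = []                     # replace 4 items with nothing, which removes them
print(values)                        # [0, 3, 4, 5]
```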


## List comprehension

A list comprehension is a quick way to create a list that follows a simple pattern.

It makes the code shorter compared with a `for` loop.

We will use the number list from above. Let's say we want a list of all numbers larger than 50.

This is how one would normally go about it:

In [9]:
larger_than_50 = []
for number in number_list:
    if number > 50:
        larger_than_50.append(number)
        
print(larger_than_50)

[100, 99999]


The code below does the same thing:

In [10]:
larger_than_50_improved = [n for n in number_list if n > 50]
print(larger_than_50_improved)

[100, 99999]
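
A comprehension can also transform each element, not only filter. The sketch below shows the general shape on a hypothetical list `sample`.

```python
sample = [1, 2, 3, 4, 5, 6]   # hypothetical example list

# General shape: [expression for item in iterable if condition]
squares_of_even = [n * n for n in sample if n % 2 == 0]
print(squares_of_even)   # [4, 16, 36]

# The condition is optional
as_strings = [str(n) for n in sample]
print(as_strings)        # ['1', '2', '3', '4', '5', '6']
```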


#### Note

The structures in list comprehensions can become as complicated as you want, but you should not strive for this, as it makes the code harder to read.

The code below demonstrates this with a statement that creates all contiguous sublists of a range. The resulting code is hard to read.

In [11]:
nr = 6
list_of_lists = [[x for x in range(start, start + size)] for size in range(1, nr+1) for start in range(nr) if start + size <= nr]
print(list_of_lists)

[[0], [1], [2], [3], [4], [5], [0, 1], [1, 2], [2, 3], [3, 4], [4, 5], [0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5]]
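
For comparison, here is one possible way to write the same computation with plain `for` loops. This is only a sketch of a more readable alternative; it takes more lines, but each step can be commented.

```python
nr = 6

list_of_lists = []
for size in range(1, nr + 1):            # the length of the sublist
    for start in range(nr - size + 1):   # every start position that still fits
        list_of_lists.append(list(range(start, start + size)))

print(len(list_of_lists))   # 21
print(list_of_lists[:3])    # [[0], [1], [2]]
```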


# Sets

Sets are:

- *dynamic*: sets can grow and shrink
- *non-type specific*: elements can be of different types, as long as they are hashable (for example numbers and strings, but not lists)
- *unordered*: there is no fixed order in which you set or get elements
- *unique*: every element is distinct; adding an element twice results in having the element only once
- *cheap to query*: unlike with lists, checking whether an element is in a set is cheap
- *expensive to store*: sets require a lot of extra data, so data is at times better stored in a list

**Very important note**:

Set literals (e.g. `a = {1,2,3}`) use the `{elem, ...}` notation. However, curly brackets are also used for dicts. When the Python developers had to decide whether `{}` was an empty set or an empty dict, they chose the latter.

**This means an empty set cannot be written as `{}`.**
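
This is easy to check with `type()`. The variable names below are chosen only for this illustration.

```python
empty_braces = {}       # looks like it could be a set, but is a dict
real_empty_set = set()  # the only way to write an empty set
non_empty = {1, 2}      # a non-empty literal without key: value pairs is a set

print(type(empty_braces))     # <class 'dict'>
print(type(real_empty_set))   # <class 'set'>
print(type(non_empty))        # <class 'set'>
```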

In [12]:
# Create an empty set
empty_set = set() # DO NOT CONFUSE WITH {}, which gives a dict

# A set of strings
string_set = {"apple", "orange", "banana"}

# A set of numbers
number_set = {1, 1, 2, 2, 3, 42, 50, 100, 99999}

# All the items in a set must be unique; duplicate items are eliminated
print(number_set)

{1, 2, 3, 100, 42, 50, 99999}
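
A common use of this property is removing duplicates from a list. The sketch below uses hypothetical data. Converting to a set loses the original order, so a second variant based on `dict.fromkeys()` is shown for cases where the order matters (this relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on).

```python
with_duplicates = ["apple", "orange", "apple", "banana", "orange"]   # hypothetical data

# Converting to a set removes duplicates, but the original order is not kept
unique_items = set(with_duplicates)
print(len(unique_items))   # 3

# Dict keys are unique too, and they keep their insertion order
unique_in_order = list(dict.fromkeys(with_duplicates))
print(unique_in_order)     # ['apple', 'orange', 'banana']
```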


## Length of a set
Similar to a list, you use `len(set)` to find the length of a set.

In [13]:
print("The length of the number set: %s" % len(number_set))

The length of the number set: 7


## Adding an item to a set

Similar to the `list`'s `append()`, the `set` has the `add()` method, which adds an element if it wasn't in there already.

In [14]:
primes = set()

primes.add(2)
primes.add(2)
primes.add(3)
primes.add(5)

print(primes)

{2, 3, 5}
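
One restriction applies to what can be added: set elements must be hashable. In practice this means immutable values such as numbers, strings and tuples of those work, while mutable containers such as lists do not. A small sketch with a hypothetical set `mixed`:

```python
mixed = set()
mixed.add(42)           # an int
mixed.add("text")       # a string
mixed.add((1, 2))       # a tuple of ints is immutable, so it can be a set element

try:
    mixed.add([1, 2])   # a list is mutable and therefore unhashable
except TypeError as error:
    print("TypeError: %s" % error)

print(len(mixed))       # 3
```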


## Adding multiple items to a set

Similar to the `list`'s `extend()`, the `set` has `update()`, which adds everything from an iterable (think lists, ranges, etc.).

In [15]:
numbers = set()

print(numbers)

numbers.update([1,2,3])

print(numbers)

numbers.update(range(10))

print(numbers)

set()
{1, 2, 3}
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}


## Getting item(s) from a set
Since a set is not ordered, it is not possible to get an item from a set using an index. However, you can check if an item is in a set:

In [16]:
# The parentheses matter: without them, % would be applied before the if/else
print("Is 42 in the number set? %s" % ("Yes" if 42 in number_set else "No"))

Is 42 in the number set? Yes
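
To get a feeling for how much cheaper a membership test is on a set than on a list, you can time both with the standard `timeit` module. This is only a rough sketch: the size and the number of repetitions are arbitrary choices, and the measured times depend on your machine.

```python
import timeit

size = 100000                      # hypothetical size, chosen only for illustration
big_list = list(range(size))
big_set = set(big_list)
target = size - 1                  # the last element is the worst case for the list

# Run each membership test 100 times and measure the total time
list_time = timeit.timeit(lambda: target in big_list, number=100)
set_time = timeit.timeit(lambda: target in big_set, number=100)

print("list: %.6f seconds" % list_time)
print("set:  %.6f seconds" % set_time)
```

The list has to compare `target` against every element until it finds a match, while the set can look it up directly through its hash.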


## Set comprehension
A set comprehension is similar to a list comprehension. You can use it when you need part of a set according to some condition.

In [17]:
set_larger_than_50 = {n for n in number_set if n > 50}
print(set_larger_than_50)

# A comprehension accepts any iterable, so you can mix lists and sets
print({n for n in number_list if n < 50})     # A set from filtering a list
print([n for n in number_set if n < 50])      # A list from filtering a set

{100, 99999}
{-3, -1, -2}
[1, 2, 3, 42]


## Mathematical set operators

In [18]:
S = {1,2,3,4}
T = {2,3}
U = {3,4,5,6}
a = 1

In mathematics, sets have well-defined operators, namely the following:

Is a an element of the set S:

$$
a \in S
$$

Or its Python equivalent:

In [19]:
a in S

True

Is T a proper (strict) subset of S:

$$
T \subset S
$$

And is T a subset of, or equal to, S:

$$
T \subseteq S
$$

In [20]:
print(T <  S) # T is a proper subset of S
print(S <  S) # S is not a proper subset of itself
print(T <= S) # T is still a subset of S
print(S <= S) # S equals S, so this also holds

True
False
True
True


Similarly, the supersets $\supset$ and $\supseteq$ work with the `>` and `>=` operators.
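
A short sketch of the superset operators, reusing the values of `S` and `T` from above. It also shows the method forms `issubset()` and `issuperset()`, which, unlike the operators, accept any iterable as their argument.

```python
S = {1, 2, 3, 4}
T = {2, 3}

print(S > T)    # True:  S is a proper superset of T
print(S >= S)   # True:  every set is a superset of itself
print(S > S)    # False: but not a proper superset of itself

# Method equivalents of <= and >=
print(T.issubset(S))          # True
print(S.issuperset([2, 3]))   # True, the argument may be a list
```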

### `set` on `set` operations

We also have the union operator for two sets, S and U:
$$
S \cup U
$$

![union](https://upload.wikimedia.org/wikipedia/commons/thumb/3/30/Venn0111.svg/320px-Venn0111.svg.png)

And its Python equivalents:

In [21]:
print(S.union(U))

# or for short
print(S | U)

{1, 2, 3, 4, 5, 6}
{1, 2, 3, 4, 5, 6}


Intersection of sets S and U
$$
S \cap U
$$

![intersection](https://upload.wikimedia.org/wikipedia/commons/thumb/9/99/Venn0001.svg/320px-Venn0001.svg.png)

And its Python equivalents:

In [22]:
print(S.intersection(U))

# or for short
print(S & U)

{3, 4}
{3, 4}


Difference of S and U

$$
S \setminus U
$$

![set](https://upload.wikimedia.org/wikipedia/commons/thumb/5/5a/Venn0010.svg/320px-Venn0010.svg.png)

And its Python equivalents:

In [23]:
print(S.difference(U))

# or for short
print(S - U)

{1, 2}
{1, 2}


Symmetric difference of S and U

$$
S \bigtriangleup U
$$

![symmetric difference](https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Venn0110.svg/320px-Venn0110.svg.png)

And its Python equivalents:

In [24]:
# and the symmetric difference
print(S.symmetric_difference(U))

# or for short
print(S ^ U)

{1, 2, 5, 6}
{1, 2, 5, 6}
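
The operators can be combined. As a small check of how they relate, the sketch below rebuilds the symmetric difference from the other three operations, using the same `S` and `U` as above.

```python
S = {1, 2, 3, 4}
U = {3, 4, 5, 6}

# The symmetric difference is the union without the intersection
print((S | U) - (S & U) == S ^ U)   # True

# Equivalently, it is the union of the two one-sided differences
print((S - U) | (U - S) == S ^ U)   # True

# Note that the plain difference depends on the order of the operands
print(S - U)   # {1, 2}
print(U - S)   # {5, 6}
```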
