<a name='sequences'></a>Overview of Built-in Sequences
===



The standard library offers a rich selection of sequence types implemented in C:

- _Container sequences_ : `list`, `tuple`, and `collections.deque` can hold items of different types. 

- _Flat sequences_ : `str`, `bytes`, `bytearray`, `memoryview`, and `array.array` hold items of one type.

**Container sequences** hold references to the objects they contain, which may be of any type, while **flat sequences** physically store the value of each item within its own memory space, and not as distinct objects. 

Thus, flat sequences are more compact, but they are limited to holding primitive values like characters, bytes, and numbers.
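
As a rough illustration of the difference (a minimal sketch; the variable names are made up for this example), a `list` accepts items of any type, while an `array.array` created with the type code `'d'` only accepts floating point numbers:

```python
from array import array

# A container sequence: holds references to objects of any type.
mixed = [1, 'two', 3.0]

# A flat sequence: type code 'd' means every item is stored as a C double.
floats = array('d', [1.0, 2.5, 3.0])

try:
    floats.append('four')  # a string cannot be stored as a double
except TypeError as error:
    rejected = True
    print('TypeError:', error)
else:
    rejected = False

print(mixed)
print(floats)
```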

Another way of grouping sequence types is by **mutability**:

- _Mutable sequences_: `list`, `bytearray`, `array.array`, `collections.deque`, and `memoryview` 

- _Immutable sequences_: `tuple`, `str`, and `bytes`

<img src="../images/collections_uml.png" width="60%" />

> From the book **"Fluent Python"** by _Luciano Ramalho_ (O'Reilly, 2015)

# A common pattern: Iterating Sequences

Once you have a **sequence** of items, the most common thing you will want to do with it is **iterate** over it.

The **most natural** iteration strategy in Python is via the `for` iteration loop!

## The FOR (iteration) loop

The `for` loop statement is the most widely used iteration mechanism in Python.

* Every **sequence** in Python can be iterated (*element by element*) by a `for` loop
    - a list, a tuple, $\ldots$ (more details will follow)

* Python also has `while` loops, but `for` is the one you will see (and use) most of the time!

* The **Pythonic** iteration schema is: 

```python
    for item in sequence:
        # do smth
```    

### FOR Special keywords

Python allows two **keywords** to be used within a `for` loop: **break** and **continue**.

The two keywords have two **different** meanings:

* **break** is used to *immediately stop the loop and exit it!*
* **continue** is used to *skip to the **next** iteration step!*
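
A minimal sketch of both keywords follows; the list of numbers and the variable names are invented for this illustration:

```python
numbers = [1, 2, 3, 4, 5, 6]

# continue: skip the even numbers and move on to the next item.
odd_numbers = []
for number in numbers:
    if number % 2 == 0:
        continue
    odd_numbers.append(number)

# break: leave the loop entirely as soon as we reach 4.
before_four = []
for number in numbers:
    if number == 4:
        break
    before_four.append(number)

print(odd_numbers)  # [1, 3, 5]
print(before_four)  # [1, 2, 3]
```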

---

# Lists

A list is a collection of items stored in a variable. The items should be related in some way, but there are no restrictions on what can be stored in a list. Here is a simple example of a list:

```python 
>>> students = ['bernice', 'aaron', 'cody']
```

### Naming convention

Since lists are collections of objects, it is good practice to give them a plural name. 

If each item in your list is a car, call the list `cars`. 

If each item is a `dog`, call your list `dogs`. 

This gives you a straightforward way to refer to the entire list (`dogs`), and to a single item in the list (`dog`).

In Python, square brackets designate a list. To define a list, you give the name of the list, the equals sign, and the values you want to include in your list within square brackets.

```python 
>>> dogs = ['border collie', 
...         'australian cattle dog', 
...         'labrador retriever']
```

### Accessing one item in a list

Items in a list are identified by their position in the list, starting with zero. 

This will almost certainly trip you up at some point. 

Programmers even joke about how often we all make "off-by-one" errors, so don't feel bad when you make this kind of error.

To access the first element in a list, you give the name of the list, followed by a zero in square brackets.

```python
dogs = ['border collie', 
        'australian cattle dog', 
        'labrador retriever']

dog = dogs[0]
print(dog)
```

The number in square brackets is called the **index** of the item. 

Because lists start at zero, the index of an item is always one less than its position in the list. 

Because of that, **Python** is said to be a [*zero-indexed*](http://en.wikipedia.org/wiki/Zero-based_numbering) 
language (like many others, such as `C` or `Java`).

So to get the second item in the list, we need to use an index of 1, and so on.

```python 
dog = dogs[1]
print(dog)
```

### Accessing the last items in a list
You can probably see that to get the last item in this list, we would use an index of 2. 

This works, but it would only work because our list has exactly three items. 

To get the **last** item in a list, no matter how long the list is, you can use an index of `-1`.

```python 
dog = dogs[-1]
print(dog)
```

This syntax also works for the **second to last item**, the third to last, and so forth.

```python 
dog = dogs[-2]
print(dog)
```

You can't use a negative index whose absolute value is larger than the length of the list, however.

```python 
dog = dogs[-4]
```
```
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
----> dog = dogs[-4]
IndexError: list index out of range
```

### Getting the length of a List


Another very common operation on a list is finding out how _many items_ it contains. 

Python provides the _built-in_ `len` function for this:

```python 
print(len(dogs))
```
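
As a small sketch tying `len` back to indexing (the name `last_index` is only illustrative), the valid positive indexes run from `0` to `len(dogs) - 1`, so the last item can be reached either way:

```python
dogs = ['border collie', 'australian cattle dog', 'labrador retriever']

# Indexes run from 0 to len(dogs) - 1.
last_index = len(dogs) - 1

print(len(dogs))                     # 3
print(dogs[last_index])              # labrador retriever
print(dogs[last_index] == dogs[-1])  # True
```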

## Lists are Mutable

**Lists are mutable.**

When the bracket operator appears on the left side of an assignment, it identifies the element of the list that will be assigned.

```python
>>> numbers = [17, 123]
>>> numbers[1] = 5
>>> print(numbers)
[17, 5]
```

The one-eth element of numbers, which used to be 123, is now 5.

You can think of a list as a relationship between indices and elements. 

This relationship is called a **mapping**; each index “maps to” one of the elements.

![List mutable](../images/list.png)

## Traversing a List 

This is one of the most important concepts related to lists. You can have a list with a million items in it, and in three lines of code you can write a sentence for each of those million items. If you want to understand lists, and become a competent programmer, make sure you take the time to understand this section.

We use a loop to access all the elements in a list. A loop is a block of code that repeats itself until it runs out of items to work with, or until a certain condition is met. In this case, our loop will run once for every item in our list. With a list that is three items long, our loop will run three times.

Let's take a look at how we access all the items in a list, and then try to understand how it works.

```python 
dogs = ['border collie', 'australian cattle dog', 'labrador retriever']

for dog in dogs:
    print(dog)
```

We have already seen how to create a list, so we are really just trying to understand how the last two lines work. These last two lines make up a loop, and the language here can help us see what is happening:

    for dog in dogs:

- The keyword "for" tells Python to get ready to use a loop.
- The variable "dog", with no "s" on it, is a temporary placeholder variable. This is the variable that Python will place each item in the list into, one at a time.
- The first time through the loop, the value of "dog" will be 'border collie'.
- The second time through the loop, the value of "dog" will be 'australian cattle dog'.
- The third time through, "dog" will be 'labrador retriever'.
- After this, there are no more items in the list, and the loop will end.

#### A common looping error: a.k.a. The _NON_ Pythonic way to do it!

One common looping error occurs when, instead of using the `for each item in the sequence` iteration strategy, the index is used to access the items (_à la C_):

**Side Note**: we are going to introduce the Python built-in `range` function - take it for granted now. More details on this function later !-)

```python 
>>> dogs = ['border collie', 'australian cattle dog', 'labrador retriever']
>>> for i in range(0, len(dogs)):
...    print(dogs[i])
...

border collie
australian cattle dog
labrador retriever
```

In this example, instead of iterating over the list **element by element**, we iterated over a range of 
integers and used them as indexes to access the elements of the `dogs` list.

Although this is not wrong in principle, this solution is considered **not Pythonic** (meaning, **not good**, btw :-)

### Enumerating a list

When we need to iterate over a list and also keep track of the position (aka. `index`) 
of the current element, the **Pythonic** way to do it is to use the `enumerate` function.

At each step of the loop, `enumerate` yields the pair `(index, element)`, which we can use as in the following example:

```python 
>>> dogs = ['border collie', 'australian cattle dog', 'labrador retriever']
>>> for index, dog in enumerate(dogs):
...    place = str(index)
...    print("Place: " + place + " Dog: " + dog)
...
Place: 0 Dog: border collie
Place: 1 Dog: australian cattle dog
Place: 2 Dog: labrador retriever
```

To enumerate a list, you need to add an *index* variable to hold the current index. 

So instead of

    for dog in dogs:
    
You have

    for index, dog in enumerate(dogs):
    
The value in the variable *index* is always an integer. If you want to print it in a string, you have to turn the integer into a string:

    str(index)
    

## List Operations

The `+` operator concatenates lists:

```python 
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> c = a + b
>>> print(c)
[1, 2, 3, 4, 5, 6]
```

Similarly, the `*` operator repeats a list a given number of times:

```python
>>> [0] * 4
[0, 0, 0, 0]
>>> [1, 2, 3] * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]
```

The first example repeats `[0]` four times. The second example repeats the list `[1, 2, 3]` three times.

## List Slices

Since a list is a collection of items, we should be able to get any subset of those items. 

For example, if we want to get just the first three items from the list, we should be able to do so easily. 

The same should be true for any three items in the middle of the list, or the last three items, or any `x` items from anywhere in the list. 

These subsets of a list are called *slices*.

To get a subset of a list, we give the position of the first item we want, and the position of the first item we do *not* want to include in the subset. So the slice *list[0:3]* will return a list containing items 0, 1, and 2, but not item 3. 

Here is how you get a batch containing the first three items.

```python 
usernames = ['bernice', 'cody', 'aaron', 'ever', 'dalia']

# Grab the first three users in the list.
first_batch = usernames[0:3]

for user in first_batch:
    print(user)
```

If you want to grab everything up to a certain position in the list, you can also leave the first index blank:

```python 
usernames = ['bernice', 'cody', 'aaron', 'ever', 'dalia']

# Grab the first three users in the list.
first_batch = usernames[:3]

for user in first_batch:
    print(user)
```

When we grab a slice from a list, the original list is not affected:

```python 
usernames = ['bernice', 'cody', 'aaron', 'ever', 'dalia']

# Grab the first three users in the list.
first_batch = usernames[0:3]

# The original list is unaffected.
for user in usernames:
    print(user)
```

We can get any segment of a list we want, using the slice notation:

```python 
usernames = ['bernice', 'cody', 'aaron', 'ever', 'dalia']

# Grab a batch from the middle of the list.
middle_batch = usernames[1:4]

for user in middle_batch:
    print(user)
```

To get all items from one position in the list to the **end of the list**, we can leave off the second index:

```python 
usernames = ['bernice', 'cody', 'aaron', 'ever', 'dalia']

# Grab all users from the third to the end.
end_batch = usernames[2:]

for user in end_batch:
    print(user)
```

### Special Case: List Copy

You can use the slice notation to make a copy of a list, 
by leaving out both the starting and the ending index. 

This causes the slice to consist of everything from the first item to the last, which is the entire list.
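
A minimal sketch of the difference between copying with a slice and simply assigning the list to another name (the names `copied_usernames` and `same_list`, and the extra user `'fiona'`, are made up for this illustration):

```python
usernames = ['bernice', 'cody', 'aaron', 'ever', 'dalia']

copied_usernames = usernames[:]  # a new list holding the same items
same_list = usernames            # just another name for the same list

# Changing the copy does not touch the original list.
copied_usernames.append('fiona')

print(usernames)
print(copied_usernames)
print(copied_usernames is usernames)  # False
print(same_list is usernames)         # True
```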


## List Methods: 

### Adding Items:

We can add an item to a list using the `append()` method. This method adds the new item to the end of the list.

```python 
dogs = ['border collie', 'australian cattle dog', 'labrador retriever']
dogs.append('poodle')

for dog in dogs:
    print(dog + "s are cool.")
```

### Removing Items:

There is a cool concept in programming called "popping" items from a collection. 

Most programming languages have some sort of data structure similar to Python's lists. 

These structures can be used as _queues_, and there are various ways of processing the items in a queue.

One simple approach is to start with an empty list, and then add items to that list. 

When you want to work with the items in the list, you always take the last item from the list, do something with it, 
and then remove that item. 

The `pop()` method makes this easy. 

It removes the last item from the list, and gives it to us so we can work with it. 

```python 
dogs = ['border collie', 'australian cattle dog', 'labrador retriever']
last_dog = dogs.pop()

print(last_dog)
print(dogs)
```

This is an example of a **first-in, last-out** approach. 

The first item in the list would be the last item processed if you kept using this approach. 

You can indeed **pop** any item you want from a list, by passing the `index` of the item you want to `pop`. 

So we could do a **first-in, first-out** approach by popping the first item in the list:

```python 
dogs = ['border collie', 'australian cattle dog', 'labrador retriever']
first_dog = dogs.pop(0)

print(first_dog)
print(dogs)
```

## Lists and Strings

A **string** is a sequence of characters and a **list** is a sequence of values, but a list of characters is **not** the same as a string.

To convert from a string to a list of characters, you can use `list`:

```python 
>>> s = 'spam'
>>> t = list(s)
>>> print(t)
['s', 'p', 'a', 'm']
```

The `list` function breaks a string into individual letters. 

If you want to break a string into **words**, you can use the `split` method of strings:

```python 
>>> s = 'pining for the fjords'
>>> t = s.split()
>>> print(t)
['pining', 'for', 'the', 'fjords']
```

An optional argument called a **delimiter** specifies which characters to use as word boundaries. 

The following example uses a **hyphen** as a delimiter:
```python 
>>> s = 'spam-spam-spam'
>>> delimiter = '-'
>>> s.split(delimiter)
['spam', 'spam', 'spam']
```

`join` is the inverse of `split`. 

It takes a list of strings and concatenates the elements. `join` is a string method, so you have to invoke it on the delimiter and pass 
the list as a parameter:

```python 
>>> t = ['pining', 'for', 'the', 'fjords']
>>> delimiter = ' '
>>> delimiter.join(t)
'pining for the fjords'
```

In this case the delimiter is a space character (i.e. `' '`), 
so `join` puts a space between words. To concatenate strings without spaces, you can use the empty string, `''`, as a delimiter.
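
As a short sketch of both points (the variable names are illustrative), the empty string joins letters back into a word, and `split` followed by `join` recovers a sentence whose words are separated by single spaces:

```python
letters = ['s', 'p', 'a', 'm']

# The empty string as delimiter puts nothing between the items.
word = ''.join(letters)
print(word)  # spam

sentence = 'pining for the fjords'
# Splitting on whitespace and joining with one space gives the sentence back.
round_trip = ' '.join(sentence.split())
print(round_trip == sentence)  # True
```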

---


## Deques

The list shortens from the end by one when we pop from it, and we also
get the removed item back. So we can add an item to the end of a list
using `append`, and we can remove an item from the end using `pop`.

It's also possible to do these things at the beginning of a list, but
lists were not designed to be used that way and it would be slow if our
list were big. 

The `collections.deque` class makes appending and
popping from both ends easy and fast. It works just like lists, but it
also has `appendleft` and `popleft` methods.

```python
>>> import collections
>>> names = collections.deque(['theelous3', 'Nitori', 'RubyPinch'])
>>> names
deque(['theelous3', 'Nitori', 'RubyPinch'])
>>> names.appendleft('wub_wub')
>>> names.append('go|dfish')
>>> names
deque(['wub_wub', 'theelous3', 'Nitori', 'RubyPinch', 'go|dfish'])
>>> names.popleft()
'wub_wub'
>>> names.pop()
'go|dfish'
>>> names
deque(['theelous3', 'Nitori', 'RubyPinch'])
>>>
```

The deque behaves a lot like lists do, and we can do `list(names)` if we
need a list instead of a deque for some reason.

Deques are often used as queues. It means that items are always added to
one end and popped from the other end.
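
A minimal sketch of a deque used as a first-in, first-out queue; the task names and the `processed` list are invented for this example:

```python
from collections import deque

queue = deque()

# New items join the queue at the right end.
queue.append('first task')
queue.append('second task')
queue.append('third task')

processed = []
while queue:
    task = queue.popleft()  # the oldest item leaves from the left end
    processed.append(task)

print(processed)
```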

---

<a name='tuples'></a>Tuples
===

#### _Tuples as Immutable Lists_

Tuples are basically lists that can never be changed. 

Lists are quite dynamic; they can grow as you append and insert items, and they can shrink as you remove items. You can modify any element you want to in a list. Sometimes we like this behavior, but other times we may want to ensure that no user or no part of a program can change a list. That's what tuples are for.

Technically, lists are *mutable* objects and tuples are *immutable* objects. Mutable objects can change (think of *mutations*), and immutable objects cannot change.

<a name='defining_tuples'></a>Defining tuples, and accessing elements
---

You define a tuple just like you define a list, except you use parentheses instead of square brackets. Once you have a tuple, you can access individual elements just like you can with a list, and you can loop through the tuple with a *for* loop:

```python 
colors = ('red', 'green', 'blue')
print("The first color is: " + colors[0])

print("\nThe available colors are:")
for color in colors:
    print("- " + color)
```

If you try to add something to a tuple, you will get an error:

```python
>>> colors = ('red', 'green', 'blue')
>>> colors.append('purple')
```
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-ed1dbff53ab2> in <module>()
      1 colors = ('red', 'green', 'blue')
----> 2 colors.append('purple')

AttributeError: 'tuple' object has no attribute 'append'
```

The same kind of thing happens when you try to remove something from a tuple, or modify one of its elements. 

**Note:** Once you define a tuple, you can be confident that its values will not change.
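
As a small sketch of what happens when you try to modify an element, and of the usual workaround of building a new tuple (the name `more_colors` is only illustrative):

```python
colors = ('red', 'green', 'blue')

try:
    colors[0] = 'purple'  # item assignment is not allowed on tuples
except TypeError as error:
    assignment_failed = True
    print('TypeError:', error)
else:
    assignment_failed = False

# Concatenation creates a brand new tuple; the original stays as it was.
more_colors = colors + ('purple',)

print(colors)
print(more_colors)
```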

## Tuple as Records & Tuple Unpacking

### Tuple as Record

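```python 
lax_coordinates = (33.9425, -118.408056)  # latitude and longitude

# Unpack a five-field record into five variables in a single statement.
city, year, pop, chg, area = ('Tokyo', 2003, 32450, 0.66, 8014)

traveler_ids = [('USA', '31195855'), ('BRA', 'CE342567')]
for passport in sorted(traveler_ids):
    print('%s/%s' % passport)  # the % operator fills one slot per tuple item

for country, passport_number in traveler_ids:  # tuple unpacking
    print(country, passport_number)
```

### Tuple Unpacking

In the previous example, we assigned `('Tokyo', 2003, 32450, 0.66, 8014)` to `city`, `year`, `pop`, `chg`, `area` in a single statement. 

Then, in the first loop, the `%` operator assigned each item in the `passport` tuple to one slot in the format string in the print argument. 

Finally, the second loop unpacked each tuple directly into `country` and `passport_number`.

Those are all examples of **tuple unpacking**.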

#### Multiple Supported Forms of Tuple Unpacking

```python 
>>> t = (1, 99)
>>> r, d = t
>>> print(r, d)
1 99
```

```python 
>>> t = (1, 99, 77, 12.6, 's')
>>> r, *rest, y = t
>>> print(r)
1
>>> print(y)
s
>>> print(rest)
[99, 77, 12.6]
```
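
Two further forms that are often useful, shown here as a minimal sketch (the variable names are illustrative and do not come from the examples above): swapping two variables without a temporary one, and unpacking a nested tuple.

```python
# Swapping: the right-hand side is packed into a tuple, then unpacked.
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1

# Nested unpacking: the target mirrors the structure of the tuple.
name, (latitude, longitude) = ('Tokyo', (35.689722, 139.691667))
print(name, latitude, longitude)
```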

### Tuple Unpacking in Action (`for` loop)

```python 
values = [('1', '99.3'), ('2', '88.9'), ('3', '79.3'), ('4', '78.9'), ('5', '77.5'),
          ('6', '69.2'), ('7', '58.1'), ('8', '43.3'), ('9', '38.9'), ('10', '33.3')]

for ranking, degree in values:
    print(int(ranking), float(degree))
```
```
1 99.3
2 88.9
3 79.3
4 78.9
5 77.5
6 69.2
7 58.1
8 43.3
9 38.9
10 33.3
```

### Another Example with List of Tuples

#### The `range()` function

The `range()` function generates a sequence of numbers in a specified range. The numbers are produced on demand, so wrap the result in `list()` to see them all at once. 
It accepts up to three parameters: `start`, `stop`, `step`


```python 
numbers = range(0, 20, 2)
print(list(numbers))
for item in enumerate(numbers):
    index, value = item
    print('index: ', index, ' value: ', value )
```
```
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
index:  0  value:  0
index:  1  value:  2
index:  2  value:  4
index:  3  value:  6
index:  4  value:  8
index:  5  value:  10
index:  6  value:  12
index:  7  value:  14
index:  8  value:  16
index:  9  value:  18
```
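
A short sketch of the other ways `range()` can be called (the variable names are made up for this illustration): with only `stop`, with `start` and `stop`, and with a negative `step` to count down. In every case the `stop` value itself is excluded.

```python
# Only stop: starts from 0.
first_five = list(range(5))
print(first_five)  # [0, 1, 2, 3, 4]

# start and stop: step defaults to 1.
two_to_five = list(range(2, 6))
print(two_to_five)  # [2, 3, 4, 5]

# A negative step counts down.
countdown = list(range(10, 0, -3))
print(countdown)  # [10, 7, 4, 1]
```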

---

# Named Tuples

The `collections.namedtuple` function is a **factory** that produces subclasses of tuple enhanced with field names and a 
class name which helps debugging.

```python
>>> import collections
>>> Card = collections.namedtuple('Card', ['rank', 'suit'])
```

## Defining and Using Named Tuples

```python
>>> from collections import namedtuple
>>> City = namedtuple('City', 'name country population coordinates')
>>> tokyo = City('Tokyo', 'JP', 36.933, (35.689722, 139.691667))
>>> tokyo
City(name='Tokyo', country='JP', population=36.933, coordinates=(35.689722, 139.691667))
>>> tokyo.population
36.933
>>> tokyo.coordinates
(35.689722, 139.691667)
>>> tokyo[1]
'JP'
```
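
Beyond field access, named tuples offer a few helpers of their own. The sketch below uses `_fields`, `_asdict()`, and `_replace()`, which are part of the `namedtuple` API; the variable names and the new population figure are invented for this illustration.

```python
from collections import namedtuple

City = namedtuple('City', 'name country population coordinates')
tokyo = City('Tokyo', 'JP', 36.933, (35.689722, 139.691667))

# The field names are available on the class.
print(City._fields)

# A named tuple can be turned into a dictionary.
as_dict = tokyo._asdict()
print(as_dict['country'])  # JP

# Like any tuple, it is immutable: fields cannot be reassigned.
try:
    tokyo.population = 40.0
except AttributeError:
    rejected = True
else:
    rejected = False

# _replace builds a new instance with some fields changed.
updated = tokyo._replace(population=37.0)
print(updated.population)  # 37.0
print(tokyo.population)    # 36.933
```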

<a name='sets'></a>Sets
===

**Sets** are a relatively new addition in the history of Python, and somewhat underused. 

The `set` type and its immutable sibling `frozenset` first appeared in a module in _Python 2.3_ and were 
promoted to built-ins in _Python 2.4_.

A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.

```python
>>> shapes = ['circle','square','triangle','circle']
>>> set_of_shapes = set(shapes)
>>> set_of_shapes
{'circle', 'square', 'triangle'}
```

```python
>>> shapes = {'circle','square','triangle','circle'}
>>> for shape in set_of_shapes:
...     print(shape)
...
square
circle
triangle
```

```python
>>> set_of_shapes.add('polygon')
>>> print(set_of_shapes)
{'square', 'polygon', 'circle', 'triangle'}
```


## Sets vs FrozenSets

Set elements **must** be hashable (that is the way to avoid repetitions). 

The `set` type is **not** hashable, but `frozenset` is, so you can have `frozenset` elements inside a `set`.

```python
>>> set_with_frozenset = {frozenset([2, 3, 4]), frozenset([4, 5, 6])}
>>> print(set_with_frozenset)
{frozenset({4, 5, 6}), frozenset({2, 3, 4})}
```
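
For contrast, here is a minimal sketch of what happens when plain `set` objects are used as elements instead (the variable names are illustrative): Python raises a `TypeError` because a `set` is not hashable.

```python
try:
    # Plain sets are mutable, hence unhashable, hence not valid set elements.
    nested = {{2, 3, 4}, {4, 5, 6}}
except TypeError as error:
    error_raised = True
    print('TypeError:', error)
else:
    error_raised = False

# Frozensets are hashable, so this works.
nested_ok = {frozenset([2, 3, 4]), frozenset([4, 5, 6])}
print(frozenset([2, 3, 4]) in nested_ok)  # True
```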


## Exists (Check)

```python
>>> # Test if circle is IN the set (i.e. exist)
>>> print('Circle is in the set: ', ('circle' in set_of_shapes))
Circle is in the set:  True
>>> print('Rhombus is in the set:', ('rhombus' in set_of_shapes))
Rhombus is in the set: False
```


## Operations

In addition to guaranteeing uniqueness, the set types implement the essential set operations as infix operators, so, given two sets `a` and `b`, `a | b` returns their **union**, `a & b` computes the **intersection**, and `a - b` the **difference**. 

Smart use of set operations can reduce both the line count and the runtime of Python programs, at the same time making code easier to read and reason about by removing loops and lots of conditional logic.


```python
>>> favourites_shapes = set(['circle','triangle','hexagon'])
>>> # Intersection
>>> set_of_shapes.intersection(favourites_shapes)
{'circle', 'triangle'}
>>> # Equivalently
>>> set_of_shapes & favourites_shapes
{'circle', 'triangle'}
>>> # Union
>>> set_of_shapes.union(favourites_shapes)
{'circle', 'hexagon', 'polygon', 'square', 'triangle'}
>>> # Equivalently
>>> set_of_shapes | favourites_shapes
{'circle', 'hexagon', 'polygon', 'square', 'triangle'}
>>> # Difference
>>> set_of_shapes.difference(favourites_shapes)
{'polygon', 'square'}
>>> # Equivalently
>>> set_of_shapes - favourites_shapes
{'polygon', 'square'}
```
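
The symmetric difference mentioned earlier works the same way. The sketch below rebuilds the two sets so that it runs on its own, and sorts the result only to make the printed order predictable:

```python
set_of_shapes = {'circle', 'square', 'triangle', 'polygon'}
favourites_shapes = {'circle', 'triangle', 'hexagon'}

# Symmetric difference: items that are in exactly one of the two sets.
only_in_one = set_of_shapes.symmetric_difference(favourites_shapes)

# Equivalently, with the ^ infix operator.
only_in_one_infix = set_of_shapes ^ favourites_shapes

print(sorted(only_in_one))  # ['hexagon', 'polygon', 'square']
print(only_in_one == only_in_one_infix)  # True
```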