# Collections and Sequences

Collections are containers that hold other objects. They can be ordered or unordered. Python has five built-in collection types:

Ordered:
+ Lists, `[...]`
+ Tuples, `(...)`
+ Strings, `'...'`

Unordered:
+ Sets, `{...}`
+ Dictionaries, like `set` but with key-value pairs, `{key:value, ...}`

But there are [others](https://docs.python.org/2/library/collections.html) available in the standard library.

## Lists

Lists are one of Python's most useful built-in types.

Like a string, a list is a sequence of values. In a string, the values are characters; in a list,
they can be any type. The values in a list are called elements or sometimes items.

In [1]:
my_guitars = ['Fender', 'Martin', 'Gibson', 'stratocaster', 'les paul']

In [3]:
print('this is a {}, with {} elements'.format(type(my_guitars), len(my_guitars)))

this is a <class 'list'>, with 5 elements


The elements of a list don’t have to be the same type. Lists can also contain **nested** lists!

In [4]:
['foo', 2.48, 4, [10, 20]]

['foo', 2.48, 4, [10, 20]]

Lists can also be empty:

In [5]:
empty_list = []
print(empty_list, len(empty_list))

[] 0


### Lists are mutable

The syntax for accessing the elements of a list is the same as for accessing the characters
of a string (the bracket operator). The expression inside the brackets specifies the index.
Remember that indices start at 0.

In [7]:
my_guitars[2]

'Gibson'

Unlike strings, lists are mutable. When the bracket operator appears on the left side of an
assignment, it identifies the element of the list that will be assigned.

In [10]:
numbers = [23, 54, 123]
numbers[1] = 0
numbers

[23, 0, 123]

List indices work the same way as string indices:
+ Any integer expression can be used as an index.
+ If you try to read or write an element that does not exist, you get an `IndexError`.
+ If an index has a negative value, it counts backward from the end of the list.

In [11]:
numbers[5]

IndexError: list index out of range

In [16]:
numbers[-1]  # last element of the list

123

Lists also support the `in` operator. It checks whether the element on the left of `in` is a **member** of the sequence on the right.

In [17]:
'Fender' in my_guitars

True

In [18]:
'mustang' in my_guitars

False

### Iterating 

In [19]:
# iterate over elements
for guitar in my_guitars:
    print(guitar)

Fender
Martin
Gibson
stratocaster
les paul


In [23]:
list(range(len(numbers)))

[0, 1, 2]

In [24]:
# iterate over index
for i in range(len(numbers)):
    numbers[i] *= 2
numbers

[46, 0, 246]

What does `range(len([...]))` do? 

`len()` returns the number of elements in the list, n, and `range()` creates a range that goes from zero to n-1.
This gives us a way to access the list elements by **index**.
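
When both the index and the element are needed, the built-in `enumerate()` is a common alternative to `range(len(...))`. The sketch below is only an illustration; the list name `values` is made up for this example and holds the same data as the original `numbers`.

```python
# Double every element in place, getting index and value together.
values = [23, 0, 123]  # illustrative data

for i, value in enumerate(values):
    values[i] = value * 2

print(values)  # [46, 0, 246]
```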

In [25]:
for i in range(23):
    numbers[i] = numbers[i] * 2

IndexError: list index out of range

### List operations

The `+` operator concatenates lists:

In [26]:
a = [1,2,3]
b = [10,20,30]
c = a + b
c

[1, 2, 3, 10, 20, 30]

The `*` operator repeats a list a given number of times:

In [27]:
[0]*4

[0, 0, 0, 0]

In [28]:
[1, 2, 3] * 3

[1, 2, 3, 1, 2, 3, 1, 2, 3]

### Slicing

List slicing is a must to master! It lets you access single elements and sub-lists via the `[]` operator. The `start` index is included and the `end` index is excluded.

```
list[start:end]
```

In [29]:
l = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
l

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

In [30]:
l[2]

'c'

In [31]:
l[0:2]

['a', 'b']

In [32]:
l[:2]

['a', 'b']

In [33]:
l[2:]

['c', 'd', 'e', 'f', 'g', 'h', 'i']

In [34]:
l[1:4]

['b', 'c', 'd']

In [35]:
l[3:]

['d', 'e', 'f', 'g', 'h', 'i']

In [36]:
l[-1]

'i'

In [37]:
l[2:-2]

['c', 'd', 'e', 'f', 'g']

Lists also support more complex slicing by adding a second `:` followed by a stride (step):
```
list[start:stop:stride]
```

In [38]:
l[::2]

['a', 'c', 'e', 'g', 'i']

In [None]:
l[1::2]

In [39]:
l[::-1]

['i', 'h', 'g', 'f', 'e', 'd', 'c', 'b', 'a']

In [40]:
l

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

In [45]:
l[4:1:-1]

['e', 'd', 'c']
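
A few slicing behaviours are easy to miss in the cells above: a slice is a new list, slicing past the end does not raise an error, and a negative stride walks backward. The sketch below illustrates them with a list named `letters` (a name chosen only for this example).

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

# A slice is a new list: changing it leaves the original alone.
first_three = letters[:3]
first_three[0] = 'Z'
print(first_three)   # ['Z', 'b', 'c']
print(letters[:3])   # ['a', 'b', 'c']

# Unlike indexing, slicing past the end does not raise IndexError.
print(letters[7:100])  # ['h', 'i']

# With a negative stride the slice goes from start down to, but not including, stop.
print(letters[4:1:-1])  # ['e', 'd', 'c']
```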

### List methods

The `list` class also has its own useful methods.


| **List method** | **Description**   |
|------|------|
|   *list*.append(*x*)  | Adds item x to the end of the list |
|   *list*.extend(*L*)  | Adds all items in list *L* to the end of the list |
|   *list*.insert(*i,x*)  | Inserts item *x* in position *i* |
|   *list*.remove(*x*)  | Removes first item *x* from the list |
|   *list*.pop(*i*)  | Removes item at index position *i* and returns it |
|   *list*.index(*x*)  | Returns the index position in the list of first item *x* |
|   *list*.count(*x*)  | Returns the number of times *x* appears in the list |
|   *list*.sort()  | Sorts all list items, in place |
|   *list*.reverse()  | Reverses all list items, in place |
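
Several methods in the table (`insert`, `remove`, `index`, `count`, `sort`, `reverse`) are not exercised in the cells that follow. Here is a minimal sketch of them on a made-up list called `fruits`:

```python
fruits = ['pear', 'apple', 'banana', 'apple']  # illustrative data

fruits.insert(1, 'mango')      # ['pear', 'mango', 'apple', 'banana', 'apple']
fruits.remove('apple')         # removes only the first 'apple'
print(fruits)                  # ['pear', 'mango', 'banana', 'apple']

print(fruits.index('banana'))  # 2
print(fruits.count('apple'))   # 1

fruits.sort()                  # sorts in place and returns None
print(fruits)                  # ['apple', 'banana', 'mango', 'pear']

fruits.reverse()               # reverses in place and returns None
print(fruits)                  # ['pear', 'mango', 'banana', 'apple']
```

Because `sort()` and `reverse()` work in place and return `None`, writing `fruits = fruits.sort()` would lose the list.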


In [46]:
l

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

With `append` we add the object passed as argument to the end of the list.

In [47]:
l.append('j')
l

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

With `pop` we **remove and return** an element of the list, given its index. By default it removes the last element.

In [48]:
l.pop()

'j'

In [49]:
l

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

`extend` takes an iterable as its argument and extends the list by **appending** each element **from the iterable**. Compare it with `append`, which adds its argument as a single element:

In [50]:
l.append([1,2,3])

In [52]:
l.pop()

[1, 2, 3]

In [54]:
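l.extend(['a', 'b', 'c'])  # adds each element of the iterable
l.append(['a', 'b', 'c'])  # adds the whole list as a single element
l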

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'a', 'b', 'c', ['a', 'b', 'c']]

In [56]:
l = 'a b c d e f g h i'.split(' ')
l.extend('ABCDEF')
l

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'A', 'B', 'C', 'D', 'E', 'F']

With `append`, by contrast, each argument is added as a single element:

In [57]:
l = 'a b c d e f g h i'.split(' ')
l.append(['a', 'b', 'c'])
l.append('ABCD')
l

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', ['a', 'b', 'c'], 'ABCD']

`append` is very useful for building new lists while looping over an iterable:

In [59]:
my_guitars = ['Fender', 'Martin', 'Gibson', 'stratocaster', 'les paul']
new_list = []  # create an empty list!

for guitar in my_guitars:
    if len(guitar) > 6:
        new_list.append(guitar)


In [60]:
new_list

['stratocaster', 'les paul']

# Tuples

Tuples are immutable sequences.
+ Tuples have order
+ Tuples don't support item assignment

In [62]:
t = (1, 23, 3, 1)

In [63]:
t[0] = 2

TypeError: 'tuple' object does not support item assignment

## Tuple methods

| **Tuple method** | **Description**   |
|------|------|
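|   *tuple*.count(*value*)  | Returns the number of occurrences of *value* |
|   *tuple*.index(*value, start, stop*)  | Returns the index of the first occurrence of *value* |

The optional *start* and *stop* arguments restrict the search to the slice `tuple[start:stop]`; the returned index is still relative to the whole tuple.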

In [65]:
t

(1, 23, 3, 1)

In [67]:
t.count(2)

0

In [74]:
t.index(1, 1, 4)

3

## Lists vs Tuples

Tuples are for describing multiple properties of one unchanging thing. Lists can be used to store collections of data about completely disparate objects.

Since tuples are immutable, the underlying data structure is simpler, so they are lighter in memory than lists.
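
One way to see the size difference is `sys.getsizeof`, which reports the size of the container itself (not of the objects it refers to). The exact numbers depend on the Python implementation and version, so this sketch only compares them; the variable names are made up for the example.

```python
import sys

as_list = [1, 23, 3, 1]
as_tuple = (1, 23, 3, 1)

# Sizes in bytes of the containers themselves; values vary across Python builds.
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)

print(list_size, tuple_size)
print(tuple_size < list_size)  # True on CPython
```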

Tuples can also contain arbitrary objects, even lists! 

Let's talk about this weird phenomenon:

In [75]:
t = (1,2,['a', 'b', 'c'])
t

(1, 2, ['a', 'b', 'c'])

In [76]:
t[2]

['a', 'b', 'c']

In [77]:
t[2].append('d')

Tuples and lists are just arrays of "addresses" (references) to other Python objects. The third element of our tuple `t` is a reference to the list object `['a', 'b', 'c']`. The tuple itself never changes, because it keeps pointing to the same object; mutating that object in place therefore raises no error.
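
This can be checked with the built-in `id()`, which returns the identity of an object. In the sketch below, `inner_id` is a name introduced only for this illustration.

```python
t = (1, 2, ['a', 'b', 'c'])
inner_id = id(t[2])          # identity of the list object the tuple refers to

t[2].append('d')             # mutates the list in place; the tuple is untouched
print(t)                     # (1, 2, ['a', 'b', 'c', 'd'])
print(id(t[2]) == inner_id)  # True: still the very same object

try:
    t[2] = ['x']             # rebinding a slot of the tuple is not allowed
except TypeError as err:
    print(err)               # 'tuple' object does not support item assignment
```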

# Sets

Python also includes a data type for sets. A set is an unordered collection **with no duplicate elements**. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.

Sets are created with curly braces `{...}` or the function `set()`. Note that empty braces `{}` create an empty dictionary, so an empty set has to be written `set()`.
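
As a small illustration of duplicate removal and of how to create an empty set, consider this sketch (the list `colors` and the other names are invented for the example):

```python
colors = ['red', 'green', 'red', 'blue', 'green']  # illustrative data

unique_colors = set(colors)             # duplicates are dropped, order is not kept
print(len(colors), len(unique_colors))  # 5 3

empty_set = set()   # an empty set needs the set() function
empty_dict = {}     # empty braces create a dict, not a set
print(type(empty_set), type(empty_dict))  # <class 'set'> <class 'dict'>
```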

In [80]:
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}

In [83]:
print(basket)

{'apple', 'orange', 'banana', 'pear'}


In [84]:
'orange' in basket  # fast membership testing

True

In [85]:
basket2 = set(['apple', 'pear', 'banana', 'mango'])
basket2

{'apple', 'banana', 'mango', 'pear'}

In [86]:
basket - basket2  # fruits in basket but not in basket2

{'orange'}

In [87]:
basket2 - basket

{'mango'}

In [88]:
basket | basket2  # in one or other or both

{'apple', 'banana', 'mango', 'orange', 'pear'}

In [89]:
basket & basket2  # fruits in both

{'apple', 'banana', 'pear'}

In [90]:
basket ^ basket2  # fruits in exactly one of the baskets

{'mango', 'orange'}
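
Each of these operators also has a named method on `set`. The sketch below recreates the two baskets and checks that operator and method agree:

```python
basket = {'apple', 'orange', 'pear', 'banana'}
basket2 = {'apple', 'pear', 'banana', 'mango'}

# Each operator has a method with the same meaning.
print(basket.difference(basket2) == basket - basket2)            # True
print(basket.union(basket2) == basket | basket2)                 # True
print(basket.intersection(basket2) == basket & basket2)          # True
print(basket.symmetric_difference(basket2) == basket ^ basket2)  # True

# The methods also accept any iterable, not only sets.
print(basket.intersection(['apple', 'kiwi']))  # {'apple'}
```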

# Dictionaries

Another useful data type built into Python is the dictionary. Dictionaries are sometimes found in other languages as “associative memories” or “associative arrays”. Unlike **sequences, which are indexed by a range of numbers**, dictionaries are **indexed by keys**, which can be any immutable type; strings and numbers can always be keys. The values can be anything.

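It is best to think of a dictionary as a set of key: value pairs, with the requirement that the keys are unique: if a key is repeated, the last value wins, as with `'orange'` below. Dictionaries are not indexed by position; since Python 3.7 they remember insertion order, but they have no built-in notion of sorted order.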
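
To illustrate the rule about keys, here is a hedged sketch with a made-up dictionary `coordinates`: a tuple works as a key, while a list is rejected.

```python
coordinates = {}                      # illustrative dict
coordinates[(40.4, -3.7)] = 'Madrid'  # a tuple of numbers is immutable, so it works as a key
coordinates['origin'] = (0, 0)        # values can be anything
print(coordinates[(40.4, -3.7)])      # Madrid

try:
    coordinates[[40.4, -3.7]] = 'Madrid'  # a list is mutable, so it cannot be a key
except TypeError as err:
    print(err)  # the message says that a list is unhashable
```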

In [95]:
# let's make a new basket with fruit prices 
d = {'apple':0.87, 'pear':1.05, 'banana':0.56, 'orange':0.42, 'orange':0.45}
d

{'apple': 0.87, 'pear': 1.05, 'banana': 0.56, 'orange': 0.45}

In [96]:
d['apple']

0.87

In [97]:
d.keys()  # returns a view of the keys

dict_keys(['apple', 'pear', 'banana', 'orange'])

In [98]:
d.values()  # returns a view of the values

dict_values([0.87, 1.05, 0.56, 0.45])

In [99]:
d.items()  # returns a view of the (key, value) pairs, as tuples

dict_items([('apple', 0.87), ('pear', 1.05), ('banana', 0.56), ('orange', 0.45)])
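
These views are most often used in `for` loops. The sketch below uses a dictionary named `prices` (a name chosen for this example, holding the same data as `d`) and assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
prices = {'apple': 0.87, 'pear': 1.05, 'banana': 0.56, 'orange': 0.45}

# Each item is a (key, value) tuple, so it can be unpacked in the loop header.
for fruit, price in prices.items():
    print(fruit, price)

# Views are not lists: wrap them in list() when indexing is needed.
first_key = list(prices.keys())[0]
print(first_key)  # apple
```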

In [101]:
d['mango'] = 1.43

In [102]:
d['mango']

1.43

In [104]:
d.items()

dict_items([('apple', 0.87), ('pear', 1.05), ('banana', 0.56), ('orange', 0.45), ('mango', 1.43)])

In [110]:
tup = list(d.items())[0]

In [113]:
func = lambda i: i[0]

In [114]:
func(tup)

'apple'

{'apple': 0.87, 'pear': 1.05, 'banana': 0.56, 'orange': 0.45, 'mango': 1.43}

### Sort fruits from expensive to cheap

A `dict` cannot be sorted in place, so how can we do this? We sort its items by value (`i[1]`) in descending order:

In [116]:
sorted(d.items(), key=lambda i: i[1], reverse=True)

[('mango', 1.43),
 ('pear', 1.05),
 ('apple', 0.87),
 ('banana', 0.56),
 ('orange', 0.45)]

Those are the default data structures for containers and sequences. But Python has a few more in the standard library, in the module `collections`. [Go and take a look!](https://docs.python.org/3.6/library/collections.html?highlight=collections#module-collections)

# List Comprehensions

We have seen the basic data structures and one of their most important features: iteration with `for` loops. Python’s `for` statement iterates over the items of any sequence, as we have seen before:

In [None]:
my_guitars =  ['Fender', 'Martin', 'Gibson', 'stratocaster', 'les paul']

In [117]:
# for loop to filter the list given a condition

guitars_with_i = [] # initialize empty list
for guitar in my_guitars:
    if 'i' in guitar:
        guitars_with_i.append(guitar)

In [118]:
guitars_with_i

['Martin', 'Gibson']

In [119]:
# for loop to modify the list
lower_guitars = []
for guitar in my_guitars:
    lower_guitars.append(guitar.lower())  # .lower() converts to lowercase 

In [120]:
lower_guitars

['fender', 'martin', 'gibson', 'stratocaster', 'les paul']

**List comprehensions** provide a concise way to create lists and are one of the goodies of Python!

In [121]:
l = [x**2 for x in range(10)]
l

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [122]:
# the equivalent for loop
l = []
for x in range(10):
    l.append(x**2)

In [124]:
l

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [125]:
[x for x in l if x%2 == 0]  # filtering !

[0, 4, 16, 36, 64]

They look compact, readable and concise:

In [126]:
[guitar for guitar in my_guitars if 'i' in guitar]

['Martin', 'Gibson']

The general structure is (the `if` part is optional):
```python
[ expression for item in list if conditional ]
```

which is equivalent to:

```python
result = []
for item in list:
    if conditional:
        result.append(expression)
```
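
A comprehension can transform and filter at the same time. As a sketch, the example below combines the two earlier loops (lowercasing and filtering on `'i'`) and checks that both forms give the same result; the names `with_comprehension` and `with_loop` are made up for this illustration.

```python
my_guitars = ['Fender', 'Martin', 'Gibson', 'stratocaster', 'les paul']

# Comprehension: transform (lower) and filter ('i' in guitar) in one expression.
with_comprehension = [guitar.lower() for guitar in my_guitars if 'i' in guitar]

# The same thing written as an explicit loop.
with_loop = []
for guitar in my_guitars:
    if 'i' in guitar:
        with_loop.append(guitar.lower())

print(with_comprehension)               # ['martin', 'gibson']
print(with_comprehension == with_loop)  # True
```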

List comprehensions are also usually faster than the equivalent `for` loop with `append`:

In [127]:
%%timeit

l = []
for i in range(100000):
    l.append(i)

8.67 ms ± 190 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [128]:
%%timeit

l = [i for i in range(100000)]

3.28 ms ± 34 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
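
`%%timeit` only works inside IPython or Jupyter. Outside a notebook, a similar comparison can be sketched with the standard `timeit` module. The function names below are invented for this example, and the absolute numbers will differ from machine to machine.

```python
import timeit

def with_loop(n=100_000):
    """Build the list with an explicit loop and append."""
    result = []
    for i in range(n):
        result.append(i)
    return result

def with_comprehension(n=100_000):
    """Build the same list with a comprehension."""
    return [i for i in range(n)]

# Total seconds for 20 calls of each function.
loop_time = timeit.timeit(with_loop, number=20)
comprehension_time = timeit.timeit(with_comprehension, number=20)

print(f'loop: {loop_time:.3f} s, comprehension: {comprehension_time:.3f} s')
```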


## Other uses of comprehensions

Python offers the same comprehension syntax for sets and dictionaries!

### `set` comprehensions

Put the expression between `{...}`:
```python
{ expression for item in list if conditional }
```

In [129]:
{x.lower() for x in my_guitars}

{'fender', 'gibson', 'les paul', 'martin', 'stratocaster'}

### `dict` comprehensions

Put a `key: value` pair between `{...}`, e.g. to swap key:value for value:key of a `dict`:
```python
{ key: value for key, value in list_of_tuples }
```

In [None]:
d = {'apple':0.87, 'pear':1.05, 'banana':0.56, 'orange':0.45, 'mango':1.43}
d

In [130]:
{v:k for k, v in d.items()} 

{0.87: 'apple', 1.05: 'pear', 0.56: 'banana', 0.45: 'orange', 1.43: 'mango'}
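
Swapping keys and values only keeps every entry when the values are unique (and usable as keys). The sketch below uses made-up data, in which two fruits share a price, to show what happens otherwise:

```python
prices = {'apple': 0.5, 'pear': 1.0, 'kiwi': 0.5}  # illustrative data with a repeated value

swapped = {price: fruit for fruit, price in prices.items()}
print(swapped)                    # {0.5: 'kiwi', 1.0: 'pear'}
print(len(prices), len(swapped))  # 3 2
```

The later pair overwrites the earlier one, so `'apple'` is lost.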

Filter the fruits dictionary to get a new `dict` with only the expensive fruits (i.e. > 1 €):

In [132]:
{k:v for k, v in d.items() if v > 1}

{'pear': 1.05, 'mango': 1.43}