# Week 2: Sequences, Functions and Files

This week we discuss three of the most fundamental topics in Python. First we start with sequences and the other built-in collections, which are the most used built-in data structures. We will learn about __Lists__, __Tuples__, __Dictionaries__ and __Sets__.
We then discuss how and when to create user-defined functions. At the end we will learn some basic file loading and saving functionality.

## Sequences: Python's built-in Data structures

### Lists

Sequences are containers in which we store data. They are fairly simple but powerful. In week 1 we saw an example of a list in which we stored a sequence of numbers. Lists are variable-length and can be modified. We define them using square brackets __[ ]__ or the __list()__ type function.

In [37]:
list_a = [2,3,4,None,"Simon"]

In [38]:
list_a

[2, 3, 4, None, 'Simon']

Lists can store any data type including lists or any other sequence. Here is an example of a list of lists, also known as ___nested___ lists:


In [39]:
List_b = [list_a,[2,4],['One','Two','Three']]

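The __list()__ function is more commonly (but not only) used to turn a lazily evaluated iterable, such as the object returned by the __range()__ function, into a list of numbers. Strictly speaking, __range()__ returns a range object rather than a generator, as the __type()__ call below shows.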

In [40]:
generator = range(2,20,2)

In [41]:
generator

range(2, 20, 2)

In [42]:
type(generator)

range

In [43]:
list_c = list(generator)

In [44]:
list_c

[2, 4, 6, 8, 10, 12, 14, 16, 18]

It can also be used to transform other sequences into lists. If you try it on a string you get a list of its characters:

In [45]:
s = "Hello World"

In [46]:
list(s)

['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']

#### Adding and Removing elements

Elements can be added to the end of the list by using the method __.append()__.

In [47]:
list_c.append(100)

In [48]:
list_c

[2, 4, 6, 8, 10, 12, 14, 16, 18, 100]

Using the __insert()__ method, you can insert an element at a specific location in the list.

In [49]:
list_c.insert(5,'Middle')

In [50]:
list_c

[2, 4, 6, 8, 10, 'Middle', 12, 14, 16, 18, 100]

The insertion index must be between 0 and the length of the list, inclusive. The length is returned by the function __len()__:

In [51]:
len(list_c)

11

The inverse of the __insert()__ method is the __pop()__ method. It removes and returns an element from a list based on the index.

In [52]:
list_c.pop(5)

'Middle'

In [53]:
list_c

[2, 4, 6, 8, 10, 12, 14, 16, 18, 100]

Elements can be removed by value using the __remove()__ method.

In [54]:
list_c.remove(100)

In [55]:
list_c

[2, 4, 6, 8, 10, 12, 14, 16, 18]
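Two details of these methods are easy to miss, so here is a minimal sketch (the list `items` is illustrative and not part of the notebook): calling __pop()__ without an index removes the last element, and __remove()__ deletes only the first matching value.

```python
# Illustrative list with a repeated value
items = [5, 3, 5, 8]

last = items.pop()    # no index given: removes and returns the last element
items.remove(5)       # removes only the first occurrence of 5

print(last)           # 8
print(items)          # [3, 5]
```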

You can check whether a list contains a certain element by using the following syntax:

In [56]:
8 in list_c

True

In [57]:
99 not in list_c

True

#### Concatenating and Combining Lists 

Two lists could be concatenated using the add sign __+__.

In [58]:
list_d = list_c + list_a

In [59]:
list_d

[2, 4, 6, 8, 10, 12, 14, 16, 18, 2, 3, 4, None, 'Simon']

Multiplying a list by an integer will concatenate that many copies of the list:

In [60]:
list_e = [1,2,1]
3*list_e

[1, 2, 1, 1, 2, 1, 1, 2, 1]

A more efficient way to merge two lists is to use the __extend()__ method, which appends to the existing list instead of building a new one.

In [61]:
list_c.extend([3,4,5,8])

In [62]:
list_c

[2, 4, 6, 8, 10, 12, 14, 16, 18, 3, 4, 5, 8]
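The difference between the two approaches can be sketched as follows, using illustrative names (`base`, `alias`, `combined`) that do not appear elsewhere in the notebook. The __+__ operator builds a new list, while __extend()__ changes the existing list, so every name bound to that list sees the change.

```python
base = [1, 2, 3]
alias = base               # a second name for the same list object

combined = base + [4, 5]   # new list; base still has three elements here
base.extend([4, 5])        # grows base in place; alias sees the change

print(combined)            # [1, 2, 3, 4, 5]
print(alias)               # [1, 2, 3, 4, 5]
print(combined is base)    # False: two separate list objects
```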

#### Sorting 

The sorting method __.sort()__ sorts a list in place and accepts a few useful options. The following are just a couple of examples:

In [63]:
list_1 = [1,4,3,6,3]


In [64]:
list_1.sort()

In [65]:
list_1

[1, 3, 3, 4, 6]

In [66]:
list_2 = ['chair','axe','table','hammer']
list_2.sort()
list_2

['axe', 'chair', 'hammer', 'table']

The method also accepts optional arguments. In IPython or Jupyter, appending a question mark to a name, as in __list.sort?__, displays its documentation.

In [67]:
list.sort?

In [68]:
list_2.sort(key=len) # Will sort according to length

In [69]:
list_2

['axe', 'chair', 'table', 'hammer']
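As a further sketch of the sorting options, the example below combines the `key` and `reverse` arguments of __.sort()__. The list `words` is made up for illustration.

```python
words = ["chair", "Axe", "table", "Hammer"]

words.sort()                        # default order: uppercase sorts before lowercase
default_order = list(words)

words.sort(key=str.lower)           # case-insensitive alphabetical order
ignore_case = list(words)

words.sort(key=len, reverse=True)   # longest first; ties keep their previous order
longest_first = list(words)

print(default_order)   # ['Axe', 'Hammer', 'chair', 'table']
print(ignore_case)     # ['Axe', 'chair', 'Hammer', 'table']
print(longest_first)   # ['Hammer', 'chair', 'table', 'Axe']
```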

#### Indexing and Slicing

One can apply indexing to a list by using the below syntax:

In [70]:
list_2[0]

'axe'

Indexing starts at zero and ends at the length of the list minus one. In the example above, indexing the list with 0 returns its first element. If the index is equal to or greater than the length, Python will raise an error.

In [71]:
list_2[15]

IndexError: list index out of range

Indexing could be also used to change the value of an element in a list:

In [72]:
list_2[3]='book'
list_2

['axe', 'chair', 'table', 'book']

You can also select a section of a list, and of most sequences, by using the slicing notation, which consists of a start index, followed by a colon (__:__), followed by an end index. The element at the start index is included, the element at the end index is not:

In [73]:
list_1[2:4] # selects the elements from index 2 up to, but not including, index 4

[3, 4]

If you skip the start or the end, the slice will default to the beginning or the end of the list:

In [74]:
list_1[2:]

[3, 4, 6]

In [75]:
list_1[:2]

[1, 3]

Using a negative index will slice the list relative to the end:

In [76]:
list_1[-2]

4

In [77]:
list_1[:-2]

[1, 3, 3]

In [78]:
list_1[1:-2]

[3, 3]

Another method for slicing is the _step_ functionality which looks like this:

In [79]:
list_3 = list(range(20))

In [80]:
list_3

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

In [81]:
list_3[::2]

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Leaving the start and the end empty and giving a step of 2, as in __[::2]__, selects every other element of the list.
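The step can also be combined with a start and an end, and it can be negative. A short sketch with an illustrative list `numbers`:

```python
numbers = list(range(10))

evens = numbers[::2]        # whole list, every second element
odds = numbers[1::2]        # same step, starting from index 1
middle = numbers[2:8:3]     # indices 2 and 5
backwards = numbers[::-1]   # a negative step walks the list in reverse

print(evens)       # [0, 2, 4, 6, 8]
print(odds)        # [1, 3, 5, 7, 9]
print(middle)      # [2, 5]
print(backwards)   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```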

### Tuples

Unlike a list, a tuple is a fixed-length, ___immutable___ sequence of Python objects. Tuples are created by using parentheses __( )__ or the __tuple()__ function.

In [82]:
tuple_1 = (1,2,3)

In [83]:
tuple_1

(1, 2, 3)

In [84]:
tuple_2 = tuple(list_2)

In [85]:
tuple_2

('axe', 'chair', 'table', 'book')

The function __tuple()__ operates in a very similar manner to the __list()__ function we saw earlier.

__Indexing__ and __slicing__ also apply to tuples in the same way we saw with lists.

In [86]:
tuple_2[2]

'table'

In [87]:
tuple_2[:2]

('axe', 'chair')

__Immutable objects: these are built-in types such as int, float, bool, str, bytes and tuple. In simple words, an immutable object can't be changed after it is created. Mutable objects: these are types such as list, dict and set. Custom classes are generally mutable.__

As such, once a tuple is created you cannot change any of its elements:

In [88]:
tuple_3 = ("one","two",3)

In [89]:
tuple_3[0] = 1

TypeError: 'tuple' object does not support item assignment

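Concatenation with the __+__ operator and repetition with the __*__ operator work on tuples as they do on lists, each returning a new tuple. In-place methods such as __.extend()__ and __.append()__ are not available, because a tuple cannot be modified.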
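A minimal sketch of the operators on tuples, with illustrative names. Note that a one-element tuple needs a trailing comma, and that the original tuple is left untouched.

```python
point = (1, 2)

longer = point + (3,)    # (3,) is a one-element tuple; (3) would just be the number 3
repeated = point * 2     # two copies joined into a new tuple

print(longer)     # (1, 2, 3)
print(repeated)   # (1, 2, 1, 2)
print(point)      # (1, 2)
```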

#### Unpacking tuples

If you assign a tuple to several variables at once, Python will unpack the content of the tuple and distribute it among the variables:

In [91]:
tup = ('a','b','c')

In [92]:
x,y,z = tup

In [93]:
x

'a'

In [94]:
y

'b'

In [95]:
z

'c'

This works with __nested__ tuples as well:

In [96]:
tup = (8,9, (10,11))

In [97]:
a,b,(c,d) =tup

In [98]:
a 

8

In [99]:
d

11

A common use of tuple unpacking is when iterating over sequences of tuples or lists:

In [100]:
seq = [(1,2,3),(4,5,6),(7,8,9)]

In [101]:
for a,b,c in seq:
    print ('a={}, b={},c={}'.format(a,b,c))

a=1, b=2,c=3
a=4, b=5,c=6
a=7, b=8,c=9
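Two further unpacking idioms are worth a quick sketch (the variable names are illustrative): a starred name collects whatever is left over, and unpacking lets you swap two variables without a temporary one.

```python
values = (1, 2, 3, 4, 5)

first, second, *rest = values   # rest collects the remaining items as a list
print(first, second, rest)      # 1 2 [3, 4, 5]

left, right = "a", "b"
left, right = right, left       # swap without a temporary variable
print(left, right)              # b a
```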


### Built-in Useful Sequence Functions

#### sorted()

Unlike the __.sort()__ method we saw earlier, the function __sorted()__ returns a new sorted list instead of modifying the original sequence.

In [102]:
list_new = [1,9,8,4,7]

In [103]:
list_sorted = sorted(list_new)

In [104]:
list_sorted

[1, 4, 7, 8, 9]

In [105]:
list_new

[1, 9, 8, 4, 7]

#### zip()

__zip()__ "pairs" up the elements of a number of sequences into tuples. It returns an iterator, which __list()__ turns into a list of tuples:

In [106]:
lst_1 = [4,5,7]
lst_2 = [8,9,9]

In [107]:
zipped = zip(lst_1,lst_2)
list(zipped)

[(4, 8), (5, 9), (7, 9)]
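The sketch below, using made-up data, shows two behaviours of __zip()__ that often come up: it stops at the shortest input, and combined with the `*` operator it can split a list of pairs back into separate tuples.

```python
names = ["Ann", "Bob", "Cid"]
scores = [90, 85]                 # one item shorter on purpose

pairs = list(zip(names, scores))  # stops when the shorter input runs out
print(pairs)                      # [('Ann', 90), ('Bob', 85)]

# "Unzipping": pass the pairs back to zip as separate arguments
unzipped_names, unzipped_scores = zip(*pairs)
print(unzipped_names)             # ('Ann', 'Bob')
print(unzipped_scores)            # (90, 85)
```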

#### reversed()

__reversed()__ iterates over the elements of a sequence in reverse order.

In [108]:
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

Remember that __reversed()__ returns an iterator. Like __range()__, it is evaluated lazily, which means it does not create the reversed sequence until it is materialized, for example with the function __list()__.
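One practical consequence of this laziness, sketched below with illustrative names: the object returned by __reversed()__ can be traversed only once, whereas a range object can be traversed as often as needed.

```python
rev = reversed([1, 2, 3])
first_pass = list(rev)     # consumes the iterator
second_pass = list(rev)    # nothing left to produce

numbers = range(3)
range_twice = (list(numbers), list(numbers))   # a range can be reused

print(first_pass)    # [3, 2, 1]
print(second_pass)   # []
print(range_twice)   # ([0, 1, 2], [0, 1, 2])
```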

### Dictionaries

A __dict__ is a collection of _key-value_ pairs, where _key_ and _value_ are Python objects. Dictionaries are usually created using curly braces __{ }__. For example:

In [109]:
profile = {"Name":"James","Age":29,"Graduate":True}
profile

{'Age': 29, 'Graduate': True, 'Name': 'James'}

Instead of the positional indexing we saw with lists and tuples, values are retrieved or added by using the keys.

In [110]:
profile['Name']

'James'

To add a new value, put the key in square brackets and assign the value on the right-hand side of the equals sign:

In [111]:
profile['City'] = 'London'

In [112]:
profile

{'Age': 29, 'City': 'London', 'Graduate': True, 'Name': 'James'}

The method __.pop()__, which we have already seen with lists, works in a similar way on dictionaries: it takes a key, removes that entry and returns its value.

In [113]:
ret = profile.pop('City')
ret

'London'

In [114]:
profile

{'Age': 29, 'Graduate': True, 'Name': 'James'}

We could also use the keyword __del__ to remove elements as below:

In [115]:
del profile['Age']

In [116]:
profile

{'Graduate': True, 'Name': 'James'}

The __.keys()__ and __.values()__ methods return views of the keys and the values, which __list()__ turns into lists.

In [117]:
list(profile.keys())

['Name', 'Graduate']

In [118]:
list(profile.values())

['James', True]
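A few related dictionary operations, shown as a small sketch on a fresh copy of the data: __.items()__ yields key-value pairs for looping, the `in` operator tests for a key, and __.get()__ returns a fallback value instead of raising a `KeyError` when the key is missing.

```python
profile = {"Name": "James", "Graduate": True}

for key, value in profile.items():     # iterate over key-value pairs
    print(key, "->", value)

has_age = "Age" in profile             # membership tests look at the keys
age = profile.get("Age", "unknown")    # fallback value for a missing key

print(has_age)   # False
print(age)       # unknown
```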

#### Creating dictionaries from other sequences

It is common to be in a situation where you have two sequences that you want to pair up element-wise in a dict:

In [119]:
matching = dict(zip(range(5),reversed(range(5))))
matching

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

### Sets

Like sets in math, a set is a collection of __unique__ elements. They can be created in two different ways:

In [120]:
a = set([1,3,3,4,5,4])
a

{1, 3, 4, 5}

In [121]:
b = {2,3,3,4,5,4}
b

{2, 3, 4, 5}

They support math set operations such as union, intersection and differences.

In [122]:
a.union(b)

{1, 2, 3, 4, 5}

In [123]:
a|b # another way to get union

{1, 2, 3, 4, 5}

In [124]:
a.intersection(b)

{3, 4, 5}

In [125]:
a&b # another way to get intersection

{3, 4, 5}
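Differences follow the same pattern as union and intersection. A minimal sketch, redefining the two sets so the example stands on its own:

```python
a = {1, 3, 4, 5}
b = {2, 3, 4, 5}

only_in_a = a - b          # same as a.difference(b)
in_exactly_one = a ^ b     # same as a.symmetric_difference(b)
is_subset = {3, 4} <= a    # True when every element of the left set is in a

print(only_in_a)        # {1}
print(in_exactly_one)   # {1, 2}
print(is_subset)        # True
```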

## Optional: Sequence Comprehensions

Python fans swear by the list comprehension feature. It allows you to concisely form a new list by filtering the elements of a collection and transforming the elements that pass the filter. List comprehensions take the following basic form:

#### result = [ _expression_ for _value_ in _collection_ if _condition_]

This is equivalent to the following __for loop__:

```python
result = []
for value in collection:
    if condition:
        result.append(expression)

```

The filter condition at the end is optional and can be left out, keeping only the expression. For example, we can take a list of strings, keep only the words longer than 5 letters and convert them to upper case:

In [126]:
list_s = ['Osaka','Tokyo','Kyoto','Kobe','Yokohama','Hiroshima' ]

In [127]:
result = [x.upper() for x in list_s if len(x)>5]
result

['YOKOHAMA', 'HIROSHIMA']
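The same syntax carries over to dictionaries and sets by swapping the square brackets for curly braces. The sketch below uses an illustrative list `cities` and also shows a list comprehension without a filter condition.

```python
cities = ["Osaka", "Tokyo", "Kyoto", "Kobe"]

lengths = {city: len(city) for city in cities}    # dict comprehension
unique_lengths = {len(city) for city in cities}   # set comprehension
upper_all = [city.upper() for city in cities]     # list comprehension, no filter

print(lengths)          # {'Osaka': 5, 'Tokyo': 5, 'Kyoto': 5, 'Kobe': 4}
print(unique_lengths)   # {4, 5}
print(upper_all)        # ['OSAKA', 'TOKYO', 'KYOTO', 'KOBE']
```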

## Functions

Every time you anticipate reusing a piece of code over and over again, you should consider structuring it into a function that can be called as many times as needed. Functions are also an important way to organize your code and make it more readable.

Functions are declared using the __def__ keyword and hand a value back to the caller with the __return__ keyword. For example:

In [128]:
def my_func(x):
    if x < 10:
        return x
    else:
        return 2*x


In [129]:
my_func(1)

1

In [130]:
my_func(20)

40

If the function ends without reaching a __return__ statement, it returns the value None.
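A quick way to see this is to capture the result of a function that only prints. The function `log_value` below is a hypothetical helper written for this illustration.

```python
def log_value(x):
    # Hypothetical helper: prints its argument but has no return statement
    print("value is", x)

result = log_value(3)
print(result is None)   # True
```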

We can elaborate on my_func to make it look as follows:

In [131]:
def my_func(x, y=10):
    if x < y:
        return x
    else:
        return 2*x

In the second example above we have two types of inputs, also known as arguments. The first one, x, is known as a positional argument and is mandatory. The function will not work without it.

The 'y' argument, on the other hand, is a keyword argument. It is optional and, if it is not provided, it defaults to the value assigned when the function is defined, in this case 10. The main restriction on function arguments is that keyword arguments must follow the positional arguments. You can specify keyword arguments in any order; this frees you from having to remember the order in which the function arguments were specified, only what their names are.

It is possible to call the function in any of the below forms:

In [132]:
my_func(5)

5

In [133]:
my_func(5,y=2)

10

In [134]:
my_func(5,2)

10
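With more than one keyword argument, the order in which they are passed no longer matters. The function `describe` below is a made-up example to illustrate this; it is not used elsewhere in the notebook.

```python
def describe(name, greeting="Hello", punctuation="!"):
    # name is positional and required; the other two are optional keywords
    return "{}, {}{}".format(greeting, name, punctuation)

default_call = describe("Ana")
reordered = describe("Ana", punctuation="?", greeting="Hi")

print(default_call)   # Hello, Ana!
print(reordered)      # Hi, Ana?
```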

#### Namespace, Scope and Local Functions

Functions can access variables in two different scopes: _global_ or _local_. The scope in which a name lives is also known as its namespace. Any variable assigned within the function is by default placed in the _local_ namespace. The local namespace is created when the function is called and is destroyed as soon as the function finishes. Variables defined outside the function are global: they exist before the call and survive it, and a function can read them or modify them in place, for example by appending to a list.

You can see the difference with the two examples below:

In [135]:
def func():
    k=[] # Empty list
    for i in range(5):
        k.append(i)
    return k

In [136]:
func()

[0, 1, 2, 3, 4]

In [137]:
k

NameError: name 'k' is not defined

In [138]:
k=[] # Empty list
def func():
 
    for i in range(5):
        k.append(i)
    return k

In [139]:
func()

[0, 1, 2, 3, 4]

In [140]:
k

[0, 1, 2, 3, 4]

The second definition of __func__ is an example where the function uses a global variable that is defined outside the function. That said, a function can also assign to a global variable, provided the name is declared with the __global__ keyword:


In [141]:
a = None
def bind_a_variable():
    global a
    a = []
bind_a_variable()
print(a)

[]


### Lambda Functions

Lambda functions are also known as anonymous functions. They consist of a single expression and are declared with the __lambda__ keyword, without being given a name.

Below we re-write a function into a lambda function:

In [142]:
def simple_function(x):
    return x**3

Is equivalent to:

In [143]:
output = lambda x:x**3

Lambda functions are useful in data analysis because there are many cases where a data transformation function takes another function as an argument.

It is often clearer and more concise to pass a lambda function than to define a new named function. What follows is an example. Say you want to sort the list of strings below based on the number of distinct letters each string has:

In [144]:
list_str = ['books','pen','window','class','a','table']

In [145]:
list_str.sort(key=lambda x: len(set(x))) # set() keeps only the distinct letters

In [146]:
list_str

['a', 'pen', 'books', 'class', 'window', 'table']

Lambda can also be used for 'currying', a piece of jargon (named after the logician Haskell Curry) that here means deriving a new function from an existing one by turning one argument (i.e. variable) into a constant. This is also commonly called partial application. For example:

In [147]:
def addition_f (x,y):
    return (x+y)

In [148]:
add_2 = lambda y: addition_f(2,y)

In [149]:
add_2

<function __main__.<lambda>>

In [150]:
add_2(4)

6
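The standard library offers an alternative to the lambda shown above: `functools.partial` builds the same kind of derived function. The sketch redefines `addition_f` so that it runs on its own.

```python
from functools import partial

def addition_f(x, y):
    return x + y

# Fix the first positional argument (x) to 2; y is supplied at call time
add_2 = partial(addition_f, 2)

print(add_2(4))    # 6
print(add_2(10))   # 12
```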

## Files

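Opening a file for reading or writing in Python is fairly straightforward. The built-in __open()__ function takes a path and, by default, opens the file in read-only text mode ('r'). Iterating over the file object yields one line at a time, and __close()__ releases the file when you are done.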

In [151]:
path = 'examples/text.txt'

In [152]:
f = open(path)

In [153]:
f

<_io.TextIOWrapper name='examples/text.txt' mode='r' encoding='cp1252'>

In [154]:
for line in f:
   pass

In [155]:
lines = [x.rstrip() for x in open(path)]

In [156]:
lines

['', 'Risk Dairies', 'Example', 'third line', '']

In [157]:
f.close()
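A common alternative to calling __close()__ by hand is the `with` statement, which closes the file automatically when the block ends. The sketch below also covers saving: it writes two lines to a file and reads them back. The file name `notes.txt` and the temporary directory are assumptions made for illustration, chosen so that the example leaves nothing behind.

```python
import os
import tempfile

# Work inside a temporary directory that is deleted at the end of the block
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")    # hypothetical file name

    # Mode "w" creates the file, or overwrites it if it already exists
    with open(path, "w", encoding="utf-8") as f:
        f.write("first line\n")
        f.write("second line\n")

    # The default mode is "r"; the file is closed when the with block ends
    with open(path, encoding="utf-8") as f:
        lines = [line.rstrip() for line in f]

print(lines)   # ['first line', 'second line']
```

Passing `encoding` explicitly avoids depending on the platform default, which is why the file object printed earlier reports `cp1252` on that particular machine.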