# Lists
A list is a mutable sequence of heterogeneous objects and can be defined as follows:

In [1]:
g7_countries = ["U.S.", "U.K.", "France", "Germany", "Italy", "Canada", "Japan"]

You can obtain the length of a list using the function ***len***:

In [2]:
len(g7_countries)

7

You can add an element to the end of a list with the ***append*** method:

In [3]:
bills = ["$10", "$20", "$30", "$10", "$10"]

In [6]:
bills.append("$7.77")
bills.append("$8.77")

In [7]:
bills

['$10', '$20', '$30', '$10', '$10', '$7.77', '$8.77']

Delete an element from a list with the ***remove*** method:

In [8]:
bills.remove('$10')

In [9]:
bills

['$20', '$30', '$10', '$10', '$7.77', '$8.77']
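
Note that `remove` deletes only the first matching element, and it raises a `ValueError` when nothing matches. A minimal sketch of both behaviours, using a hypothetical `payments` list rather than the notebook's `bills`:

```python
# Hypothetical data, used only for illustration
payments = ["$10", "$20", "$10"]

# remove() deletes only the first element equal to its argument
payments.remove("$10")
print(payments)  # ['$20', '$10']

# Removing a value that is not in the list raises ValueError
try:
    payments.remove("$99")
    removed = True
except ValueError:
    removed = False
print(removed)  # False
```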

Finally, you can combine two lists with the `+` operator, which creates a new list, or with the `extend` method, which modifies the list in place.

In [10]:
g8_countries = g7_countries + ["Russia"]

In [11]:
g8_countries

['U.S.', 'U.K.', 'France', 'Germany', 'Italy', 'Canada', 'Japan', 'Russia']
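
The cells above only show the `+` operator. As a rough comparison, the sketch below contrasts it with `extend`. The names `founders`, `enlarged` and `members` are made up for this example.

```python
founders = ["U.S.", "U.K.", "France"]

# `+` builds a new list and leaves both operands untouched
enlarged = founders + ["Germany"]

# `extend` appends every item of its argument to the existing list
members = founders.copy()
members.extend(["Germany", "Italy"])

print(founders)  # ['U.S.', 'U.K.', 'France']
print(enlarged)  # ['U.S.', 'U.K.', 'France', 'Germany']
print(members)   # ['U.S.', 'U.K.', 'France', 'Germany', 'Italy']
```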

It's possible to access list items by their ***index***. The first index is 0:

In [12]:
bills[0]

'$20'

In the same way, we can modify a list item by accessing its index:

In [13]:
bills

['$20', '$30', '$10', '$10', '$7.77', '$8.77']

In [14]:
bills[0] = '$50'

In [15]:
bills

['$50', '$30', '$10', '$10', '$7.77', '$8.77']

Lists are heterogeneous sequences since a single list can contain different types of objects:

In [16]:
import os

In [18]:
various = ['Lamp', 10, os]
various

['Lamp', 10, <module 'os' from '/usr/lib/python2.7/os.pyc'>]

# Tuples
Like lists, tuples are heterogeneous, ordered sequences:

In [24]:
address_port = ('http://127.0.0.1', 9099)

In [21]:
singleton_tuple = (555,)
type(singleton_tuple)

tuple

Unlike lists, tuples are ***immutable***: you can't modify them once they've been created. They have no `append`, `remove`, or `extend` methods, and trying to assign to an index raises a `TypeError`:

In [26]:
try:
    address_port[0] = 1

except TypeError:
    print("You can't assign a value to a tuple!")

You can't assign a value to a tuple!
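
Since a tuple cannot be changed in place, a common approach is to build a new tuple from the old one. A minimal sketch, in which the replacement port `8080` and the extra `"http"` element are arbitrary values chosen for illustration:

```python
address_port = ("http://127.0.0.1", 9099)

# Build a new tuple that reuses the address and swaps the port
new_address_port = (address_port[0], 8080)

# Concatenation also creates a new tuple; the original is untouched
with_scheme = address_port + ("http",)

print(address_port)      # ('http://127.0.0.1', 9099)
print(new_address_port)  # ('http://127.0.0.1', 8080)
print(with_scheme)       # ('http://127.0.0.1', 9099, 'http')
```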


## Packing and unpacking

Several values can be grouped into a single tuple, and a tuple can be split back into several variables. We call these operations ***packing*** and ***unpacking***.

For example, you can assign the tuple defined above to two variables in the following way:

In [6]:
# Unpack the tuple defined above into two variables
address, port = address_port
items = ["aaa", "5", "sldkjfsldkjfl"]
print(';;;'.join(items))

aaa;;;5;;;sldkjfsldkjfl


In [28]:
address

'http://127.0.0.1'

In [29]:
port

9099

Packing is done automatically when a function returns several values. For example, a function returning the coordinates of a point on a Cartesian plane could resemble the following:

In [7]:
point = (5, 10)

def jump_point(x, y):
    return (x*2, y*2)

In [9]:
jump_point(*point)

(10, 20)

In [11]:
def greet_person(name, uid, greeting):
    return f"{greeting} {name}, your uid is {uid}!"

In [13]:
person = {
    "uid": 42,
    "name": "John",
    "greeting": "Howdy",
}

greet_person(**person)

'Howdy John, your uid is 42!'

In [1]:
ppp = 777

def get_coordinates(point):
    """Gets the coordinates of a point

    After a very complex set of operations, the coordinates
    of a point are obtained and returned to the user"""
    
    global ppp 
    ppp = 111
    
    # very complex set of operations
    x = 10
    y = 20
    return x, y

ppp

777

In [2]:
get_coordinates("test")
ppp

111

In [1]:
def really_return_a_tuple(x, y):
    return y, x

really_return_a_tuple(55, 22)

(22, 55)

In [4]:
def do_stuff(a, b, c):
    return c, b, a

l = (1, 2, 3)
do_stuff(*l)

(3, 2, 1)
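
The cells above use `*` and `**` in function calls without comment. In a call, `*` spreads a sequence over positional parameters and `**` spreads a dictionary over keyword parameters. The sketch below puts packing, unpacking and both call forms side by side. The `describe` function and its `scheme` parameter are hypothetical and exist only for this example.

```python
def describe(host, port, scheme="http"):
    """Hypothetical helper: format an endpoint as a URL-like string."""
    return f"{scheme}://{host}:{port}"

endpoint = "127.0.0.1", 9099   # packing: the comma creates the tuple
host, port = endpoint          # unpacking: one variable per element

# `*` unpacks the tuple into positional arguments
positional = describe(*endpoint)

# `**` unpacks the dictionary into keyword arguments
options = {"host": "127.0.0.1", "port": 9099, "scheme": "https"}
keyword = describe(**options)

print(positional)  # http://127.0.0.1:9099
print(keyword)     # https://127.0.0.1:9099
```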

## Why use tuples instead of lists?

In general, tuples are used in Python to group values that are closely related and not subject to change, like the IP address and port in the example above.

Another example would be that of geographic coordinates in the context of mapping.

The Python interpreter uses tuples in many places, for example when a function must return several values.

## Strings are sequences
In Python, strings are really ordered sequences of characters.

In [40]:
name = "Savoir-faire Linux"
name[2]

'v'
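
Because a string is a sequence, the usual sequence operations apply to it directly. A short sketch of indexing, slicing, length and membership on the same string:

```python
name = "Savoir-faire Linux"

first = name[0]       # indexing starts at 0
last = name[-1]       # negative indices count from the end
prefix = name[:6]     # slicing returns a new string
suffix = name[-5:]

print(first, last)      # S x
print(prefix)           # Savoir
print(suffix)           # Linux
print(len(name))        # 18
print("faire" in name)  # True
```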

## Operations common to all types of sequences

<table>
    <tr>
        <td>Operation</td>
        <td>Result</td>
    </tr>
    <tr>
        <td>x in s</td>
        <td>True if an item of s is equal to x, else False</td>
    </tr>
    <tr>
        <td>x not in s</td>
        <td>False if an item of s is equal to x, else True</td>
    </tr>
    <tr>
        <td>s + t</td>
        <td>the concatenation of s and t</td>
    </tr>
    <tr>
        <td>s * n, n * s</td>
        <td>n shallow copies of s concatenated</td>
    </tr>
    <tr>
        <td>s[i]</td>
        <td>ith item of s, origin 0</td>
    </tr>
    <tr>
        <td>s[i:j]</td>
        <td>slice of s from i to j</td>
    </tr>
    <tr>
        <td>s[i:j:k]</td>
        <td>slice of s from i to j with step k</td>
    </tr>
    <tr>
        <td>len(s)</td>
        <td>length of s</td>
    </tr>
    <tr>
        <td>min(s)</td>
        <td>smallest item of s</td>
    </tr>
    <tr>
        <td>max(s)</td>
        <td>largest item of s</td>
    </tr>
    <tr>
        <td>s.index(i)</td>
        <td>index of the first occurrence of i in s</td>
    </tr>
    <tr>
        <td>s.count(i)</td>
        <td>total number of occurrences of i in s</td>
    </tr>
</table>
    
* [Source](http://docs.python.org/2/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange)

### Slicing

In Python, it's possible to get a ***sub-sequence*** from a sequence using ***slicing*** operations.

The syntax is as follows:
`s[i:j:k]`

Where:

* ***i*** is the start index
* ***j*** is the end index (the item at this index is not included)
* ***k*** is the step
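
As a rough illustration of how the three parameters interact, the sketch below uses a hypothetical list named `letters`. It also shows two details not spelled out above: negative indices count from the end, and out-of-range slice bounds are clamped rather than raising an error.

```python
letters = ["a", "b", "c", "d", "e", "f"]

middle = letters[1:4]       # start included, end excluded
tail = letters[-2:]         # negative start counts from the end
every_other = letters[::2]  # step of 2
backwards = letters[4:1:-1] # negative step walks right to left
clamped = letters[2:100]    # an end beyond the length is clamped

print(middle)       # ['b', 'c', 'd']
print(tail)         # ['e', 'f']
print(every_other)  # ['a', 'c', 'e']
print(backwards)    # ['e', 'd', 'c']
print(clamped)      # ['c', 'd', 'e', 'f']
```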

Here is a list of 10 numbers:

In [36]:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

We can get the sequence of the first 5 numbers in the following way:

In [37]:
nums[0:5]

[1, 2, 3, 4, 5]

We can get the last five numbers this way:

In [38]:
nums[5:10]

[6, 7, 8, 9, 10]

***Note***: Each of these parameters is optional. The default value of ***i*** is 0, the default value of ***j*** is n, where n is the sequence length, and the default value of ***k*** is 1. These defaults for ***i*** and ***j*** apply when the step is positive.

We could thus write the operations above in the following way:

In [40]:
nums[:5] == nums[0:5] and nums[5:] == nums[5:10]

True

The step determines the increment. To skip every other number:

In [41]:
nums[::2]

[1, 3, 5, 7, 9]

In [42]:
nums[:5]

[1, 2, 3, 4, 5]

The step can also be used to reverse the sequence:

In [43]:
nums[::-1]

[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

## Other sequence operations

#### Obtain the maximum and minimum of a sequence
We can obtain the minimum and maximum of a sequence using the functions ***min*** and ***max***:

In [44]:
min(nums)

1

In [45]:
max(nums)

10

#### Index, number of occurrences

Use the ***index*** method to get the index of the first occurrence of an item in a sequence.
Use the ***count*** method to get the number of occurrences of an item in a sequence.

In [47]:
# Count the number of $10 bills
bills.count('$10')

2

In [48]:
# Position of the letter "i" in Savoir-faire Linux
"Savoir-faire Linux".index("i")

4
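
One detail worth keeping in mind: `index` raises a `ValueError` when the item is absent. A minimal sketch of a guard around it follows. The helper name `index_or_none` is hypothetical and not part of Python.

```python
def index_or_none(sequence, item):
    """Return the index of the first occurrence of item, or None if absent."""
    if item in sequence:
        return sequence.index(item)
    return None

print(index_or_none("Savoir-faire Linux", "i"))  # 4
print(index_or_none(["$10", "$20"], "$30"))      # None

# Without the guard, a missing item raises ValueError
try:
    ["$10", "$20"].index("$30")
    found = True
except ValueError:
    found = False
print(found)  # False
```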

#### Verify if an object equal to x is present in s
`x in s`

In [2]:
"Belgium" in g7_countries

False

In [3]:
g7_countries

['U.S.', 'U.K.', 'France', 'Germany', 'Italy', 'Canada', 'Japan']

In [43]:
"Canada" in g7_countries and "France" in g7_countries or "Johnistan" in g7_countries

True

#### Verify if an object equal to x is absent from s
`x not in s`

In [51]:
"Belgium" not in g7_countries

True

In [53]:
"U.K." not in g7_countries

False
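
For lists and tuples, `in` compares `x` with each item. For strings it tests for a substring instead. A short sketch of the difference, using made-up sample values:

```python
company = "Savoir-faire Linux"
words = ["Savoir-faire", "Linux"]

# On a string, `in` looks for a substring
substring_found = "Lin" in company

# On a list, `in` looks for an item equal to the left operand
item_found = "Lin" in words
whole_item_found = "Linux" in words

print(substring_found)   # True
print(item_found)        # False
print(whole_item_found)  # True
```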

#### Concatenate two different sequences
`s + t`

In [54]:
g7_countries + g8_countries

['U.S.',
 'U.K.',
 'France',
 'Germany',
 'Italy',
 'Canada',
 'Japan',
 'U.S.',
 'U.K.',
 'France',
 'Germany',
 'Italy',
 'Canada',
 'Japan',
 'Russia']

In [4]:
bills_jan = ('$10', '$20', '$30',)

In [5]:
bills_feb = ('$5', '$7', '$6',)

In [6]:
bills_jan + bills_feb

('$10', '$20', '$30', '$5', '$7', '$6')

In [170]:
"Savoir-faire" + " Linux"

'Savoir-faire Linux'

***Important!*** The two sequences must be of the same type:

In [60]:
type(g7_countries)

list

In [61]:
type(bills_jan)

tuple

In [62]:
try:
    g7_countries + bills_jan
except TypeError:
    print("Can't concatenate different sequence types!")

Can't concatenate different sequence types!
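
When two sequences of different types do need to be combined, one option is to unpack both into a new sequence of the desired type. The sketch below assumes Python 3.5 or later for the `*` unpacking syntax, and its sample values are arbitrary.

```python
countries = ["Canada", "Japan"]   # a list
amounts = ("$10", "$20")          # a tuple

# Unpack both operands into a new list or a new tuple
as_list = [*countries, *amounts]
as_tuple = (*countries, *amounts)

print(as_list)   # ['Canada', 'Japan', '$10', '$20']
print(as_tuple)  # ('Canada', 'Japan', '$10', '$20')
```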


#### Multiplying a sequence
`s * n`

In [7]:
bills_jan * 3

('$10', '$20', '$30', '$10', '$20', '$30', '$10', '$20', '$30')
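
The table above describes `s * n` as n *shallow* copies. This matters when the items are themselves mutable, because every copy then refers to the same object. A minimal sketch of the pitfall and one common way around it, using the hypothetical names `row`, `grid` and `independent`:

```python
row = [0, 0]

# Multiplication repeats references to the same inner list
grid = [row] * 3
grid[0][0] = 1
print(grid)  # [[1, 0], [1, 0], [1, 0]]

# A comprehension creates a fresh inner list on each iteration
independent = [[0, 0] for _ in range(3)]
independent[0][0] = 1
print(independent)  # [[1, 0], [0, 0], [0, 0]]
```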

## Exercises

Count the number of occurrences of the letter "i":

In [4]:
name = "Savoir-faire Linux"
cars = ['ferrari', 'ferrari', 'pinto']

In [2]:
name.count('i')

3

In [5]:
cars.count('ferrari')

2

### Numbers

Given the following numbers:

In [48]:
numbers = ["1", "2", "3", "4", "5"]

Write a function that takes a list as an argument and, for each number, prints whether it is odd or even:

In [8]:
iseven([5, 6, 8, 99, 101, 12312, 55])

Value: 5 False
Value: 6 True
Value: 8 True
Value: 99 False
Value: 101 False
Value: 12312 True
Value: 55 False
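
One possible solution is sketched below; it is not necessarily the one intended by the exercise. It assumes that items may be integers or numeric strings (as in `numbers` above) and converts them with `int`. The helper `parity_lines` is a hypothetical addition that makes the output easy to check.

```python
def parity_lines(values):
    """Return one 'Value: n is_even' line per item."""
    lines = []
    for value in values:
        number = int(value)  # accepts both 5 and "5"
        lines.append(f"Value: {number} {number % 2 == 0}")
    return lines

def iseven(values):
    """Print, for each item, whether it is even."""
    for line in parity_lines(values):
        print(line)

iseven(["1", "2", "3", "4", "5"])
```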


### First Names
In this exercise, we will calculate statistics on different first names.

The _names.txt_ file contains 11627 first names. For each name, the information is organized as follows:

    name;sex;language;frequency

##### 1. Open the file (names.txt) and create a list of names

In [17]:
with open('files/names-utf8.txt') as f:
    f.readline()
    names = []
    for l in f:
        names.append(l.split(';')[0])
        
print(names[:10])

['aaliyah', 'aapeli', 'aapo', 'aaren', 'aarne', 'aarón', 'aaron', 'aatami', 'aatto', 'aatu']


##### 2. Create a 4-tuple for each entry and add it to a new list.

In [21]:
namedata = []
with open('files/names-utf8.txt') as f:
    f.readline()
    for l in f:
        linedata = l.rstrip().split(';')
        linedata[1] = tuple(linedata[1].split(','))
        
        languages = []
        for lang in linedata[2].split(','):
            languages.append(lang.strip())
        linedata[2] = tuple(languages)
        linedata[3] = float(linedata[3])
        
        namedata.append(tuple(linedata))
        
namedata[:10]

[('aaliyah', ('f',), ('english (modern)',), 0.0),
 ('aapeli', ('m',), ('finnish',), 0.0),
 ('aapo', ('m',), ('finnish',), 0.0),
 ('aaren', ('m', 'f'), ('english',), 0.0),
 ('aarne', ('m',), ('finnish',), 0.0),
 ('aarón', ('m',), ('spanish',), 0.0),
 ('aaron', ('m',), ('english', 'biblical'), 1.37),
 ('aatami', ('m',), ('finnish',), 0.0),
 ('aatto', ('m',), ('finnish',), 0.0),
 ('aatu', ('m',), ('finnish',), 0.0)]
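
The parsing step can also be factored into a small function that relies on tuple unpacking. The sketch below is one way to do it. The function name `parse_line` is hypothetical, and the sample line is an assumption about the raw file layout, reconstructed from the parsed output above.

```python
def parse_line(line):
    """Turn 'name;sex;language;frequency' into a 4-tuple."""
    # Unpack the four fields in one step
    name, sexes, languages, frequency = line.rstrip().split(";")
    return (
        name,
        tuple(sexes.split(",")),
        tuple(lang.strip() for lang in languages.split(",")),
        float(frequency),
    )

# Assumed raw line, not copied from the actual file
sample_line = "aaron;m;english, biblical;1.37\n"
print(parse_line(sample_line))
# ('aaron', ('m',), ('english', 'biblical'), 1.37)
```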

##### 3. Find the longest name

In [22]:
max_name_len = 0
longest_name = ''
for i in namedata:
    lname = len(i[0])
    if lname > max_name_len:
        max_name_len = lname
        longest_name = i[0]
        
print(f"The longest name is {longest_name} (length: {max_name_len})")

The longest name is mictlantecuhtli (length: 15)
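
The loop above can also be written with the built-in `max` and its `key` argument. The sketch below runs on the ten names printed in step 1 rather than on the full file, so its result differs from the one above.

```python
# The first ten names shown in step 1
sample_names = ['aaliyah', 'aapeli', 'aapo', 'aaren', 'aarne',
                'aarón', 'aaron', 'aatami', 'aatto', 'aatu']

# key=len makes max() compare the names by length;
# on ties, the first longest name is returned
longest = max(sample_names, key=len)

print(f"The longest name is {longest} (length: {len(longest)})")
# The longest name is aaliyah (length: 7)
```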


##### 4. Find the most popular male name

In [28]:
maxpop = 0
maxpopname = ''
for name, sex, lang, pop in namedata:
    if ('m' in sex and pop > maxpop):
        maxpopname = name
        maxpop = pop
            
print(f"The most popular male name is {maxpopname} (pop: {maxpop})")

The most popular male name is les (pop: 15809.49)


##### 5. Count the number of Spanish names in the list

In [29]:
nspanish = 0
for name, sex, lang, pop in namedata:
    if 'spanish' in lang:
        nspanish += 1
            
print(f"The number of spanish names is {nspanish}")

The number of spanish names is 553
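
Steps 4 and 5 can be expressed more compactly with generator expressions and tuple unpacking. The sketch below uses four of the rows printed in step 2 instead of the full `namedata` list, so the numbers differ from those above.

```python
# Four rows taken from the output of step 2
sample = [
    ('aaliyah', ('f',), ('english (modern)',), 0.0),
    ('aaren', ('m', 'f'), ('english',), 0.0),
    ('aarón', ('m',), ('spanish',), 0.0),
    ('aaron', ('m',), ('english', 'biblical'), 1.37),
]

# Count the rows whose language tuple contains 'spanish'
nspanish = sum(1 for _, _, langs, _ in sample if 'spanish' in langs)

# Keep the male names, then pick the row with the highest frequency
males = [row for row in sample if 'm' in row[1]]
name, _, _, pop = max(males, key=lambda row: row[3])

print(nspanish)   # 1
print(name, pop)  # aaron 1.37
```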


### References 

* [5.6 - Sequence types in python](http://docs.python.org/library/stdtypes.html#typesseq)
* [5.6.4 - Mutable sequence types](http://docs.python.org/library/stdtypes.html#typesseq-mutable)
