# Python lists and tuples

## Lists: simple, nested, empty

A **list** is an ordered collection of any number (0, 1, 2, ...) of elements, possibly of different types.  
Each element has a position, and positions are numbered by an index. Note that the first position has index zero.  
**Lists are mutable.** After a list is created, it can be changed: elements can be added, removed or modified.

Let's start with a list of elements of the same type:

In [1]:
dailyKCal = [ 2330, 1990, 2150, 2290, 1920, 2370, 2050 ]
dailyKCal

[2330, 1990, 2150, 2290, 1920, 2370, 2050]

In [2]:
days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]
days

['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

In [3]:
goodMoods = [ True, True, False, True, True, False, True ]
goodMoods

[True, True, False, True, True, False, True]

The `type(...)` function can be used to check whether an object is of type `list`:

In [4]:
type( goodMoods )

list

In [5]:
type( [ 2330, 1990, 2150 ] )

list

The number of elements in a list can be obtained with the `len(...)` function:

In [6]:
len(days)

7

A list may also contain elements of different types:

In [7]:
dataDay1 = [ 2330, "Mon", True ]

In particular, lists can be nested. Here are two examples of lists containing other lists:

In [8]:
dataByDays = [ [ 2330, "Mon", True ], [ 1990, "Tue", True ], [ 2150, "Wed", False ] ]
print( len(dataByDays) )
dataByDays

3


[[2330, 'Mon', True], [1990, 'Tue', True], [2150, 'Wed', False]]

In [9]:
dataByVars = [ dailyKCal, days, goodMoods ]
print( len(dataByVars) )
dataByVars

3


[[2330, 1990, 2150, 2290, 1920, 2370, 2050],
 ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
 [True, True, False, True, True, False, True]]

A list can also be empty. A new empty list can be created as follows:

In [10]:
emptyList = []
print( len(emptyList) )
emptyList

0


[]

## Tuples

A **tuple** is also an ordered collection of any number (0, 1, 2, ...) of elements, possibly of different types.  
Each element has a position, and positions are numbered by an index. Note that the first position has index zero.  
**Tuples are immutable.** Once a tuple is created, it is not possible to change its elements.

Tuples are created with a syntax similar to lists. For tuples use `(...)` instead of `[...]`:

In [11]:
dailyKCal = ( 2330, 1990, 2150, 2290, 1920, 2370, 2050 )
dailyKCal

(2330, 1990, 2150, 2290, 1920, 2370, 2050)

In [12]:
type( dailyKCal )

tuple

In [13]:
len( dailyKCal )

7

Because `(` and `)` are also used in arithmetic expressions, a special notation (with an extra `,`) is needed to create a tuple with exactly one element:

In [14]:
singleDayTuple = ( "Mon", )
type( singleDayTuple )

tuple

Note that omitting the extra `,` leads to an object of a different type. Compare:

In [15]:
type( ( 1 ) )      # int

int

In [16]:
type( ( 1, ) )     # tuple with one element

tuple

An empty tuple is `()`.

In [17]:
emptyTuple = ()
print( type( emptyTuple ) )
print( len( emptyTuple ) )
emptyTuple

<class 'tuple'>
0


()

A tuple can be unpacked into several variables in a single assignment:

In [18]:
x, y, z = ( "a", True, 0 )
print( x )
print( y )
print( z )

a
True
0
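
Unpacking also works when the parentheses are left out, and a starred name can collect the remaining elements. The following is a minimal sketch assuming Python 3; the names `point`, `px`, `py`, `first` and `rest` are illustrative and not part of the original notebook.

```python
# Tuple packing and unpacking; all names here are illustrative.
point = 3, 4                             # parentheses are optional: this is the tuple (3, 4)
px, py = point                           # unpack into two variables
px, py = py, px                          # swap two values through a temporary tuple
first, *rest = ( "Mon", "Tue", "Wed" )   # the starred name collects the rest into a list
print( px, py )                          # 4 3
print( first, rest )                     # Mon ['Tue', 'Wed']
```

The number of names on the left must match the number of elements, unless one of the names is starred.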


## Concatenation and repetition

Let's assume that each list below describes some preparation steps:

In [20]:
americano = [ "espresso", "hot water" ]
caffe_latte = [ "espresso", "steamed milk" ]
latte_macchiato = [ "steamed milk", "espresso" ]
apple_pie_set = [ "apple pie", "whipped cream" ]


Then, steps for a larger order can be combined using the list concatenation (`+`) and list repetition (`*`) operators:

In [21]:
order_steps = 2 * americano + caffe_latte + 3 * apple_pie_set + latte_macchiato
order_steps

['espresso',
 'hot water',
 'espresso',
 'hot water',
 'espresso',
 'steamed milk',
 'apple pie',
 'whipped cream',
 'apple pie',
 'whipped cream',
 'apple pie',
 'whipped cream',
 'steamed milk',
 'espresso']

Tuples support the same `+` and `*` operators: two tuples can be concatenated and a tuple can be repeated.
Concatenating a tuple with a list, however, raises a `TypeError`:

In [24]:
#( 1, ) + [ 2 ]     # Leads to TypeError
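
The tuple versions of both operators can be sketched as follows. The values are illustrative. Converting one operand first is only one possible way around the `TypeError`.

```python
# Concatenation and repetition with tuples (illustrative values).
pair = ( "espresso", "hot water" )
doubled = 2 * pair                       # repetition yields a new tuple
combined = pair + ( "steamed milk", )    # note the one-element tuple with its extra comma
print( doubled )                         # ('espresso', 'hot water', 'espresso', 'hot water')
print( combined )                        # ('espresso', 'hot water', 'steamed milk')

# Mixing a tuple and a list fails.
try:
    mixed = ( 1, ) + [ 2 ]
except TypeError as err:
    mixed = None
    print( "TypeError:", err )

# Converting the list to a tuple first makes the operand types match.
fixed = ( 1, ) + tuple( [ 2 ] )
print( fixed )                           # (1, 2)
```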

## Access/change of a single element

Let's define a list (or a tuple):

In [25]:
days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]
days

['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

The first element of a list/tuple can be accessed at index zero:

In [26]:
days[0]

'Mon'

To get the total number of elements (the length) of a list/tuple, use `len(...)`:

In [27]:
len( days )

7

Since the numbering of elements starts from zero, the last element of a list/tuple can be accessed as follows:

In [28]:
days[ len(days)-1 ]

'Sun'

A negative index counts positions from the end. So another way to access the last element is:

In [29]:
days[ -1 ]

'Sun'
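
The relation between negative and non-negative indexes can be checked directly. This is a small sketch; `secondToLast` and `firstViaNegative` are illustrative names.

```python
days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]

# For k = 1 .. len(days), days[-k] is the same element as days[len(days) - k].
for k in range( 1, len(days) + 1 ):
    assert days[-k] == days[ len(days) - k ]

secondToLast = days[-2]                  # one position before the last element
firstViaNegative = days[ -len(days) ]    # the most negative valid index reaches the first element
print( secondToLast )                    # Sat
print( firstViaNegative )                # Mon
```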

Using an index beyond the range of elements present in a list/tuple raises an `IndexError`:

In [31]:
#days[7]                # IndexError: list index out of range
                         # valid indexes are 0,1,2,3,4,5,6

In lists the elements can be modified:

In [32]:
days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]
days[0] = "MONDAY"       # works fine, days is a list
days

['MONDAY', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

But tuples are immutable, so the following code raises a `TypeError`:

In [34]:
days = ( "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" )
#days[0] = "MONDAY"     # TypeError: 'tuple' object does not support item assignment
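
When a changed version of a tuple is needed, a new tuple has to be built. The sketch below catches the error and then goes through a temporary list. The names `message`, `tmp` and `newDays` are illustrative, and this is only one of several possible approaches.

```python
days = ( "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" )

# Item assignment on a tuple fails with a TypeError.
try:
    days[0] = "MONDAY"
except TypeError as err:
    message = str( err )
    print( "TypeError:", message )

# Build a modified copy instead: tuple -> list -> change -> tuple.
tmp = list( days )
tmp[0] = "MONDAY"
newDays = tuple( tmp )
print( newDays[0], days[0] )             # MONDAY Mon  (the original tuple is untouched)
```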

## Access/change of multiple elements

The slice operator `[n:m]` applied to a list/tuple gets its elements from the positions `n`...`m-1` and creates a new list/tuple containing only them: 

In [36]:
# ----- the following few code cells work both for lists and tuples -----
days = ( "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" )
#days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]
days[2:5]                # note: Elements with indexes 2,3 and 4 (but not 5).
                         #       "Mon" has index 0.

('Wed', 'Thu', 'Fri')

`[:m]` denotes indexes from the beginning (`0`) till `m-1`:

In [37]:
workingDays = days[:5]
workingDays

('Mon', 'Tue', 'Wed', 'Thu', 'Fri')

Similarly, `[n:]` denotes indexes from `n` till the last:

In [38]:
weekendDays = days[5:]
weekendDays

('Sat', 'Sun')

Consequently, the slice `[:]` selects all elements. Applied to a list, it makes a separate (shallow) copy of the whole list:

In [39]:
copiedDays = days[:]

Slicing `[n:m:step]` can take an extra `step` argument: 

In [40]:
days[1:6:2]

('Tue', 'Thu', 'Sat')

The `step` argument can be negative:

In [41]:
days[6:1:-2]

('Sun', 'Fri', 'Wed')
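
A few more slice forms follow from the same rules. This is a short sketch with illustrative variable names.

```python
days = ( "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" )

reversedDays = days[::-1]    # omitted n and m with step -1: the whole tuple in reverse order
everySecond = days[::2]      # every second element, starting at index 0
lastThree = days[-3:]        # negative n counts from the end
tolerant = days[5:100]       # unlike single indexes, slice bounds beyond the end are allowed

print( reversedDays )        # ('Sun', 'Sat', 'Fri', 'Thu', 'Wed', 'Tue', 'Mon')
print( everySecond )         # ('Mon', 'Wed', 'Fri', 'Sun')
print( lastThree )           # ('Fri', 'Sat', 'Sun')
print( tolerant )            # ('Sat', 'Sun')
```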

In lists, several elements can be modified at once by assigning to a slice:

In [42]:
# ----- modification here, so it can't be a tuple -----
days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]
days[1:3] = [ "TUESDAY", "WEDNESDAY" ]
days

['Mon', 'TUESDAY', 'WEDNESDAY', 'Thu', 'Fri', 'Sat', 'Sun']

Note that a slice assignment can not only modify elements but also add or remove them (i.e. the total length of the list changes):

In [43]:
# ----- modification here, so it can't be a tuple -----
days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]
days[1:3] = [ "TUESDAY", "moon-eclipse-night!!!", "WEDNESDAY" ]
days

['Mon',
 'TUESDAY',
 'moon-eclipse-night!!!',
 'WEDNESDAY',
 'Thu',
 'Fri',
 'Sat',
 'Sun']
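
Removing and inserting elements follow the same pattern. The sketch below assigns an empty list to a slice, assigns to an empty slice, and uses `del`. The element `"extra"` and the snapshot names are made up for illustration.

```python
days = [ "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun" ]

days[5:] = []                # assigning an empty list removes the selected elements
afterRemoval = days.copy()   # ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']

days[1:1] = [ "extra" ]      # the empty slice [1:1] selects nothing, so this only inserts
afterInsert = days.copy()    # ['Mon', 'extra', 'Tue', 'Wed', 'Thu', 'Fri']

del days[1]                  # del removes the element at a given index (slices work too)
print( afterRemoval )
print( afterInsert )
print( days )                # ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
```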

## Shallow and deeper copying

Study the following example:

In [44]:
v = [ 1, 2, 3 ]   # []      allocates a new list
                  # 1,2,3   fills the list with 1, 2, 3
                  # v =     makes v point to the list
w = v             # w = v   makes w point to the same thing as v
v[0] = 100        #         changes the element of the list
v

[100, 2, 3]

Because both `v` and `w` point to the same list, `w` has also changed:

In [45]:
w

[100, 2, 3]

When the above behaviour is not desired, an explicit copy needs to be made, for example with `copy()` (a shallow copy: a new list that holds the same elements):

In [46]:
v = [ 1, 2, 3 ]
w = v.copy()      # a new list is allocated, also possible: w = v[:]
                  # the elements of v are appended to the new list
v[0] = 100
v

[100, 2, 3]

Now, the change of `v` has not affected `w`:

In [47]:
w

[1, 2, 3]
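
The difference between a shallow and a deep copy only shows up with nested lists. The sketch below uses the standard-library function `copy.deepcopy`; the variable names are illustrative.

```python
import copy

nested = [ [ 1, 2 ], [ 3, 4 ] ]
shallow = nested.copy()            # new outer list, but the inner lists are shared
deep = copy.deepcopy( nested )     # new outer list and new inner lists

nested[0][0] = 100                 # change inside a shared inner list
nested.append( [ 5, 6 ] )          # change of the outer list only

print( nested )                    # [[100, 2], [3, 4], [5, 6]]
print( shallow )                   # [[100, 2], [3, 4]]  inner change is visible, append is not
print( deep )                      # [[1, 2], [3, 4]]    fully independent
```

For flat lists of numbers or strings, as in the example above, `copy()` and `[:]` are sufficient.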

## List comprehensions

Let's introduce a very powerful mechanism for performing operations on all elements of lists, tuples (or other iterable objects).

The following **list comprehension** code `[x**2 for x in someNums]`:
- outputs a list: it is a list comprehension because of surrounding brackets `[` ... `]`
- iterates through `someNums` (an object which can be iterated over)
- in each iteration `x` is set to a subsequent element value (`for x in`)
- the result of the expression `x**2` is appended to the output list

In [48]:
someNums = [ 1, -2, 3, -4, 5 ]        # an iterable object
[x**2 for x in someNums]              # a list comprehension

[1, 4, 9, 16, 25]

Here, each iteration produces a tuple with two elements `(x, x**2)`. So the result is a list of tuples:

In [49]:
someNums = [ 1, -2, 3, -4, 5 ]        # an iterable object
[(x,x**2) for x in someNums]          # each element is a tuple

[(1, 1), (-2, 4), (3, 9), (-4, 16), (5, 25)]

Note that the list comprehension notation also allows for filtering (here: `x<0`):

In [50]:
someNums = [ 1, -2, 3, -4, 5 ]        # an iterable object
[x for x in someNums if x<0]          # only negative elements are processed

[-2, -4]
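
A filtering comprehension corresponds to an explicit loop with an `if` inside. The sketch below shows both forms side by side; the result names are illustrative.

```python
someNums = [ 1, -2, 3, -4, 5 ]

# Explicit loop: start with an empty list and append the elements that pass the test.
negativesLoop = []
for x in someNums:
    if x < 0:
        negativesLoop.append( x )

# The same result as a comprehension.
negativesComp = [ x for x in someNums if x < 0 ]

# Filtering and transforming can be combined: squares of the negative elements only.
squaresOfNegatives = [ x**2 for x in someNums if x < 0 ]

print( negativesLoop, negativesComp )    # [-2, -4] [-2, -4]
print( squaresOfNegatives )              # [4, 16]
```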

Finally, observe that the variable `x` is local to the list comprehension.  
That local `x` does not exist outside of the comprehension, and it does not affect the value of another variable `x` defined outside of the comprehension:

In [None]:
x = "I am the outside x"              # a global variable
someNums = [ 1, -2, 3, -4, 5 ]        # an iterable object
[x for x in someNums if x<0]          # local x is used, not the global one
x                                     # the global x is not changed
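
For contrast, the loop variable of an ordinary `for` loop is not local and does overwrite an outer variable of the same name. This is a small sketch assuming Python 3 (in Python 2, list comprehensions behaved differently); the names `afterComprehension` and `afterLoop` are illustrative.

```python
x = "I am the outside x"
someNums = [ 1, -2, 3, -4, 5 ]

negatives = [ x for x in someNums if x < 0 ]   # the comprehension uses its own local x
afterComprehension = x
print( afterComprehension )                    # I am the outside x

for x in someNums:                             # a plain for loop rebinds the outer x
    pass
afterLoop = x
print( afterLoop )                             # 5
```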

## Self-study tasks

### Understand errors

Try the following cells to get familiar with error messages.  
Explain the errors.

In [52]:
nums = ( 1, 2, 3 )
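#nums[1] = 22              # what's wrong here?
# Answer: tuples are immutable (TypeError: 'tuple' object does not support item assignment)

In [54]:
nums = [ 1, 2, 3 ]
#nums[3] = 4               # what's wrong here?
# Answer: index 3 is out of range, valid indexes are 0, 1, 2 (IndexError: list assignment index out of range)

In [56]:
txt = "Statistics"
#txt[0] = "s"              # what's wrong here?
# Answer: strings are immutable too (TypeError: 'str' object does not support item assignment)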

### Understand difference

Why does `a` differ from `b`?

In [57]:
v = [ 1, 2, 3 ]
a = v
v = [ 4, 5, 6 ]
a

[1, 2, 3]

In [58]:
v = [ 1, 2, 3 ]
b = v
v[:] = [ 4, 5, 6 ]
b
# The difference: in the first case v is rebound to a new list (a keeps the old one),
# while in the second case the elements of the one shared list are replaced in place.

[4, 5, 6]
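
The two cases can be told apart with the `is` operator, which checks whether two names refer to the same object. This is a minimal sketch; `sameAfterRebind` and `sameAfterSlice` are illustrative names.

```python
# Case 1: plain assignment rebinds the name v to a brand-new list.
v = [ 1, 2, 3 ]
a = v
v = [ 4, 5, 6 ]
sameAfterRebind = a is v
print( sameAfterRebind, a )      # False [1, 2, 3]

# Case 2: slice assignment changes the contents of the existing list.
v = [ 1, 2, 3 ]
b = v
v[:] = [ 4, 5, 6 ]
sameAfterSlice = b is v
print( sameAfterSlice, b )       # True [4, 5, 6]
```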

### Changing (or not) lists

Let's assume that a list `v` with several numbers in random order is given (an example below).  
Check the Python `list` documentation to find programmatic ways to reach the goals stated in the comments below. Each cell shows one possible solution.

In [60]:
v = [ 5, 2, 1, 4, 3 ]
v.sort()
v
# Here v should be sorted in ascending order [ 1, 2, 3, 4, 5 ]

[1, 2, 3, 4, 5]

In [62]:
v = [ 5, 2, 1, 4, 3 ]
v.sort(reverse=True)
v
# Here v should be sorted in descending order [ 5, 4, 3, 2, 1 ]

[5, 4, 3, 2, 1]

In [67]:
v = [ 5, 2, 1, 4, 3 ]
w = v.copy()
v.sort()
print(w)
print(v)
# Here v should be sorted [ 1, 2, 3, 4, 5 ] but w should still be [ 5, 2, 1, 4, 3 ]

[5, 2, 1, 4, 3]
[1, 2, 3, 4, 5]


In [68]:
v = [ 5, 2, 1, 4, 3 ]
v.reverse()
v
# Here v should be reversed [ 3, 4, 1, 2, 5 ]

[3, 4, 1, 2, 5]

In [69]:
v = [ 5, 2, 1, 4, 3 ]
w = v.copy()
v.reverse()
print(w)
print(v)
# Here v should be reversed [ 3, 4, 1, 2, 5 ] but w should still be [ 5, 2, 1, 4, 3 ]

[5, 2, 1, 4, 3]
[3, 4, 1, 2, 5]


In [70]:
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
del v[2]                  # delete by position; v.remove("a") would delete by value instead
v
# Here, the third element (index 2) should be deleted from v: [ "eeeee", "bb", "dddd", "ccc", "bb" ]

['eeeee', 'bb', 'dddd', 'ccc', 'bb']

In [71]:
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
v.remove("bb")
v
# Here, the first element with value "bb" should be removed from v: [ "eeeee", "a", "dddd", "ccc", "bb" ]
# Any ideas how to filter out all "bb" elements?

['eeeee', 'a', 'dddd', 'ccc', 'bb']

In [73]:
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
v.insert(2, "F")          # v[2] = "F" would overwrite "a" instead of inserting
v
# Here, element "F" should be inserted into v at index 2

['eeeee', 'bb', 'F', 'a', 'dddd', 'ccc', 'bb']

In [91]:
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
w = [ "ffffff", "g", "ffffff" ]
v[2:2] = w                # assigning to the empty slice [2:2] inserts all elements of w at index 2
print(v)
len(v)
# Insert ("slice in") elements of w to v at index 2 (so, that v gets length 9 and v[2:5] has elements from w).

['eeeee', 'bb', 'ffffff', 'g', 'ffffff', 'a', 'dddd', 'ccc', 'bb']


9

In [92]:
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
v.append("F")
v
# Here, a single new element "F" should be *appended* to the end of the list

['eeeee', 'bb', 'a', 'dddd', 'ccc', 'bb', 'F']

In [95]:
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
w = [ "ffffff", "g", "ffffff" ]
v.extend(w)
print(v)
len(v)
# Extend the list by appending the elements of w at the end of v (so the length of v should be 9!)
# What goes wrong with `append(w)` here?

['eeeee', 'bb', 'a', 'dddd', 'ccc', 'bb', 'ffffff', 'g', 'ffffff']


9
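
The difference between `append` and `extend` can be seen on shortened example lists. In this sketch `appended` and `extended` are illustrative names.

```python
v = [ "eeeee", "bb" ]
w = [ "ffffff", "g" ]

appended = v.copy()
appended.append( w )      # w is added as ONE element: a nested list at the end
print( appended )         # ['eeeee', 'bb', ['ffffff', 'g']]
print( len( appended ) )  # 3

extended = v.copy()
extended.extend( w )      # each element of w is added separately
print( extended )         # ['eeeee', 'bb', 'ffffff', 'g']
print( len( extended ) )  # 4
```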

### Checking membership

In [97]:
v = [ "ababab", "baaab", "bbbaa", "aabba", "aaaab", "abbaa", "aabbb", "abaaa", "aaaaa", "bbbab", "bbbqb", "aaaqb", "bbbbq" ]
w = "bbbab"
w in v                    # membership test; v.index(w) would give the position (here 9)
# How to programmatically find whether the value of w is *in* the iterable list v?
# The result should be True or False

True


In [121]:
v = [ "ababab", "baaab", "bbbaa", "aabba", "aaaab", "abbaa", "aabbb", "abaaa", "aaaaa", "bbbab", "bbbqb", "aaaqb", "bbbbq" ]
w = "abaab"
w not in v                # negated membership test
# How to programmatically find whether the value of w is *not in* the iterable list v?
# The result should be True or False

True


### Understand conversions

In [103]:
lst = [1,2,3,"x","y","z"]
tuple( lst )              # the argument can be any object which can be iterated over

(1, 2, 3, 'x', 'y', 'z')

In [104]:
tpl = (1,2,3,"x","y","z")
list( tpl )               # the argument can be any object which can be iterated over

[1, 2, 3, 'x', 'y', 'z']

In [105]:
tuple( "Statistics" )     # "Statistics" can be iterated over

('S', 't', 'a', 't', 'i', 's', 't', 'i', 'c', 's')

In [106]:
list( 'Data Science' )

['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']

### Practice comprehensions and try a generator

Let's write a simple comprehension first:

In [128]:
xs = [ 0, 1, 2, 3, 4, 5 ]
ys = [x**2 + x + 1 for x in xs]
ys

# Task: write a comprehension to transform
#   the elements of xs according to the formula:
#   y = x^2 + x + 1
# Equivalent explicit loop, for comparison:
#   ys = []
#   for x in xs:
#       ys.append( x**2 + x + 1 )

[1, 3, 7, 13, 21, 31]

Next, read about `range(...)` and use it to rewrite the definition of `xs` in the above code.  
Check the type of `xs`. Note that `xs` created by `range(...)` is not a list, but the comprehension still works.  
Can you explain the mechanism (keyword: iterable)?

In [137]:
xs = range(6)                       # a range object corresponding to the numbers 0, 1, 2, 3, 4, 5
ys = [x**2 + x +1 for x in xs]      # as before: y = x^2 + x + 1
ys

[1, 3, 7, 13, 21, 31]
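
A `range` object can be iterated over any number of times, whereas a generator is used up after one pass. The sketch below contrasts the two. The generator expression, written with `(` ... `)` instead of `[` ... `]`, is an addition for illustration, and the variable names are made up.

```python
xs = range( 6 )
print( type( xs ) )                            # <class 'range'>

first = [ x**2 + x + 1 for x in xs ]
second = [ x**2 + x + 1 for x in xs ]          # iterating over a range again works fine

gen = ( x**2 + x + 1 for x in xs )             # generator expression: values are produced lazily
once = list( gen )                             # the first pass consumes the generator
again = list( gen )                            # the second pass finds nothing left

print( first == second )                       # True
print( once )                                  # [1, 3, 7, 13, 21, 31]
print( again )                                 # []
```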

Now, write a comprehension to remove all elements equal to `toRemove` from `v`:

In [142]:
toRemove = "bb"
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
w = [ e for e in v if e != toRemove ]    # w should not have elements equal to the value of toRemove
w

['eeeee', 'a', 'dddd', 'ccc']

Next, generalize the last comprehension to handle the situation when `toRemove` contains more than one element.  
Read about the `in` operator for lists and about the `not` logical operator.

In [144]:
toRemove = [ "bb", "a" ]
v = [ "eeeee", "bb", "a", "dddd", "ccc", "bb" ]
w = [ e for e in v if e not in toRemove ]     # w should *not* have elements *in* toRemove list
w

['eeeee', 'dddd', 'ccc']

In [145]:
v = [ "ababab", "baaab", "bbbaa", "aabba", "aaaab", "abbaa", "aabbb", "abaaa", "aaaaa", "bbbab", "bbbqb", "aaaqb", "bbbbq" ]
w = [ "aaaqb", "abbaa", "ababab" ]
all([ e in v for e in w ])
# Write a statement checking whether *all* elements of w are *in* v.
# The result should be a single True or False value.

True

In [146]:
v = [ "ababab", "baaab", "bbbaa", "aabba", "aaaab", "abbaa", "aabbb", "abaaa", "aaaaa", "bbbab", "bbbqb", "aaaqb", "bbbbq" ]
w = [ "abaaa", "bbbab", "qbbbq" ]
any([ e not in v for e in w ])
# Write a statement checking whether *any* element of w is *not in* v.
# The result should be a single True or False value.

True
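
`all(...)` and `any(...)` accept any iterable, so the square brackets of the inner list can be dropped. The sketch below uses shortened lists for illustration. The subset test with `set` is an alternative that ignores order and duplicates.

```python
v = [ "abaaa", "bbbab", "aaaqb" ]
w = [ "abaaa", "qbbbq" ]

allIn = all( e in v for e in w )               # True only if every element of w is in v
anyMissing = any( e not in v for e in w )      # True if at least one element of w is missing
missing = [ e for e in w if e not in v ]       # which elements are missing?
isSubset = set( w ) <= set( v )                # set-based version of the "all in" check

print( allIn, anyMissing )                     # False True
print( missing )                               # ['qbbbq']
print( isSubset )                              # False
```

Note that `anyMissing` is always the logical negation of `allIn`.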

### Comprehensions with tuple elements

`zip(...)` combines the elements at the same positions of two iterables into tuples, which can then be iterated over. It also works for more than two iterables.

In [148]:
heights = [ 173, 179, 167, 195, 173, 184, 162, 169 ]  # 8 persons
weights = [ 57, 58, 62, 84, 64, 74, 57, 44 ]          # same 8 persons, same order
zip( heights, weights )                # a lazy iterator over ( height, weight ) tuples
list( zip( heights, weights ) )        # when converted to a list you see the tuples
BMI = [ w / (h/100)**2 for h, w in zip( heights, weights ) ]
BMI
# Write a statement generating the list of BMIs for the 8 persons.
# Hint: [ ... for h, w in ... ]

[19.045073340238563,
 18.101807059704754,
 22.230987127541326,
 22.090729783037478,
 21.383941996057334,
 21.85727788279773,
 21.719250114311837,
 15.405623052414134]
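
With three iterables, `zip(...)` produces triples, and it stops as soon as the shortest input is exhausted. In this sketch the `names` list and the shortened `weights` list are made up for illustration.

```python
names = [ "Ann", "Bob", "Cy" ]         # hypothetical persons
heights = [ 173, 179, 167 ]
weights = [ 57, 58 ]                   # one value is missing on purpose

triples = list( zip( names, heights, weights ) )
print( triples )                       # [('Ann', 173, 57), ('Bob', 179, 58)]
print( len( triples ) )                # 2: the third person is silently dropped
```

Because of this silent truncation, it is worth checking that the input lists have equal lengths before zipping them.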

`enumerate(...)` pairs each element with the index at which it is located:

In [151]:
heights = [ 173, 179, 167, 195, 173, 184, 162, 169 ]  # 8 persons
enumerate( heights )                   # a lazy iterator over ( index, element ) tuples
tuple( enumerate( heights ) )          # do you understand this result?

((0, 173),
 (1, 179),
 (2, 167),
 (3, 195),
 (4, 173),
 (5, 184),
 (6, 162),
 (7, 169))
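
Inside a comprehension, `enumerate(...)` makes it possible to select indexes by a condition on the elements. The threshold of 175 cm in this sketch is an arbitrary illustrative value.

```python
heights = [ 173, 179, 167, 195, 173, 184, 162, 169 ]

# Indexes of all persons taller than 175 cm.
tallIndexes = [ i for i, h in enumerate( heights ) if h > 175 ]
print( tallIndexes )                             # [1, 3, 5]

# The optional start argument shifts the numbering, e.g. to count from 1.
numbered = list( enumerate( heights, start=1 ) )
print( numbered[0] )                             # (1, 173)
```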

### Indexing nested lists

Let's consider the following nested list:

In [154]:
nestedList = [ "a", [ "ba", [ "bba", "bbb" ], "bc", [ "bda", "bdb" ], "be" ], "c", [ [ "daa", "dab" ] ] ]
nestedList

['a',
 ['ba', ['bba', 'bbb'], 'bc', ['bda', 'bdb'], 'be'],
 'c',
 [['daa', 'dab']]]
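
Indexes can be chained: each `[...]` is applied to the result of the previous one. The sketch below uses a different, smaller list (`grid`, a made-up example) so that the task for `nestedList` stays open.

```python
grid = [ "x", [ "y0", [ "y1a", "y1b" ] ], [ [ "z" ] ] ]

print( len( grid ) )       # 3: only the top-level elements are counted
print( grid[1] )           # ['y0', ['y1a', 'y1b']]
print( grid[1][1] )        # ['y1a', 'y1b']
print( grid[1][1][0] )     # y1a
print( grid[2][0][0] )     # z
print( grid[0][0] )        # x  (indexing a string gives a one-character string)
```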

Without running the code, can you tell what the result of each of the following indexing expressions will be?