# 5 STRUCTURED TYPES, MUTABILITY, AND HIGHER-ORDER FUNCTIONS

The programs we have looked at thus far have dealt with three types of objects: `int`, `float`, and `str`. 

The numeric types `int` and `float` are scalar types. That is to say, objects of these types have `no accessible internal structure`. 

In contrast, `str` can be thought of as a `structured, or non-scalar`, type. One can use indexing to extract individual characters from a string and slicing to extract substrings.

In this chapter, we introduce `four additional structured types`. 

* One, <b style="color:blue">tuple</b>, is a rather simple generalization of `str`. 

* The other three, <b style="color:blue">list</b>, <b style="color:blue">range</b>, and <b style="color:blue">dict</b>, are more **interesting**. 

We also return to the topic of <b style="color:blue">functions</b> with some examples that illustrate the utility of being able to treat functions in the same way as other types of objects.

## 5.3 Lists and Mutability

Like a tuple, a **list** is an <b>ordered sequence</b> of values, where each value is <b>identified by an index</b>. 

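The syntax for expressing literals of type list is similar to that used for tuples; the difference is that we use <b>square brackets []</b> rather than parentheses `()`. 

For example, the following code prints each element of a list twice, first by indexing and then by iterating over the list directly: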

In [None]:
L = ['I did it all', 4, 'love']  # square brackets []

for i in range(len(L)):
    print(L[i])

for li in L:
    print(li)


The <b>empty list</b> is written as <b>[]</b>

<p> <b>Singleton lists</b> are written <b>without a comma</b> before the closing bracket.

In [None]:
Lempty=[]   #empty list

Lonly1=[10] # singleton list: without comma

print('empty list:',Lempty)

print(type(Lonly1))
print(Lonly1)

Occasionally, the fact that square brackets `[]` are used for 

* 1 **literals** of type list

* 2 **indexing** into lists, and

* 3 **slicing** lists

can lead to some `visual confusion`. 

For example, the expression `[1,2,3,4][1:3][1]`, which evaluates to 3, uses the square brackets in three different ways. 


In [None]:
print([1,2,3,4])  # literal of type list

print([1,2,3,4][1:3]) # slicing the list

print([1,2,3,4][1:3][1]) # slicing the list, then indexing into the slice


This is rarely a problem in practice, because most of the time lists are `built incrementally` rather than `written as literals`.

----

### Lists are `mutable`

Lists differ from tuples in one hugely important way:

<b style="color:blue">lists are mutable</b>

**tuples and strings** are `immutable`

There are many operators that can be used to create objects of these immutable types, and variables can be bound to objects of these types.
But objects of `immutable` types **cannot be modified `after they are created`**. 


On the other hand, objects of type `list`  **can be modified `after they are created`**.
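A minimal sketch of the difference (the names `t` and `L` are illustrative, not from the text): item assignment on a tuple raises `TypeError`, while the same operation on a list changes the existing object.

```python
t = (1, 2, 3)      # tuple: immutable
L = [1, 2, 3]      # list: mutable

try:
    t[0] = 99      # item assignment is not supported for tuples
    tuple_changed = True
except TypeError:
    tuple_changed = False

id_before = id(L)
L[0] = 99          # the list object itself is modified
same_object = (id(L) == id_before)

print(tuple_changed)   # False
print(L)               # [99, 2, 3]
print(same_object)     # True
```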


The `distinction` between <b>mutating an object</b> and <b>assigning an object to a variable</b> may, at first, appear subtle. However, if you keep repeating the mantra, 

“In Python a variable is merely a name, i.e., a label that can be attached to an object,” 

it will bring you clarity. When the statements

In [None]:
Techs = ['MIT', 'Caltech']
Ivys = ['Harvard', 'Yale', 'Brown']

are executed, the interpreter creates two new lists and binds the appropriate variables to them, as pictured in Figure 5.1.

![ Figure 5.1](./img/fig51.PNG)

The assignment statements

In [None]:
Univs = [Techs, Ivys]

Univs1 = [['MIT', 'Caltech'], ['Harvard', 'Yale', 'Brown']]

also create new lists and bind variables to them. The elements of these lists are themselves lists. The three print statements


In [None]:
print('Univs =', Univs) 
print('Univs1 =', Univs1)
print(Univs == Univs1)

produce two identical-looking lines for `Univs` and `Univs1`, followed by `True`.

It appears as if `Univs` and `Univs1` are bound to <b style="color:blue">the same value</b>. 

But `appearances` can be `deceiving`. 

As the following picture illustrates, `Univs` and `Univs1` are bound to quite **different** values.
![fig52](./img/fig52.PNG) 

That `Univs` and `Univs1` are bound to different objects can be verified using the built-in Python function 

* <b style="color:blue">id</b>, which returns  <b style="color:blue">a unique integer identifier</b> for an object. 

This function allows us to **test for `object equality`**. When we run the code

In [None]:
print(Univs == Univs1) # test value equality

print(id(Univs) == id(Univs1)) #test object equality

print('Id of Univs =', id(Univs))

print('Id of Univs1 =', id(Univs1))

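The first comparison prints `True` and the second prints `False`: the two lists are equal in value, but they are different objects.

The **elements of `Univs`** are **not copies of the lists** to which `Techs` and `Ivys` are bound, but are rather the lists `themselves`.

The **elements of `Univs1`** are lists that contain <b>the same `elements`</b> as the lists in `Univs`, but they are **not the same `lists`**.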

We can see this by running the code


In [None]:
print('Ids of Techs, Univs[0]', id(Techs), id(Univs[0]))

print('Ids of Ivys, Univs[1]', id(Ivys), id(Univs[1]))

print('Ids of Univs[0] and Univs[1]', id(Univs[0]), id(Univs[1]))

print('Ids of Univs1[0] and Univs1[1]', id(Univs1[0]), id(Univs1[1]))

<b style="color:blue;font-size:150%"> Why does this matter? It matters because lists are mutable</b>

Consider the code

In [None]:
# Techs = ['MIT', 'Caltech']

Techs.append('RPI') #through the variable Techs 
print(Techs)

The **append** method has **a side effect.** 

* Rather than create a `new` list, it **`mutates the existing` list Techs** by adding a new element, `the string 'RPI'`, to the end of it.

The Figure depicts the state of the computation after append is executed.
![fig53](./img/fig53.PNG) 


The object to which **Univs** is bound still contains the `same two lists`, but the `contents` of
one of those lists have been `changed`. Consequently, the print statements

In [None]:
print('Univs =', Univs)
print('Univs1 =', Univs1)

now produce different output: `Univs` shows `'RPI'` in its first element, while `Univs1` does not. What we have here is something called **aliasing**. 

There are `two distinct paths` to the same list object. 

One path is through the variable `Techs` and the other through the `first element` of the list object to which `Univs` is bound. 

One can `mutate` the object via `either` path, and the effect of the mutation will be visible through both paths. 

This can be `convenient`, but it can also be `treacherous`. 

* **Unintentional aliasing leads to programming errors that are often enormously hard to track down**.
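Aliasing often arises without an explicit second variable, for instance when a list is passed to a function. The sketch below uses a hypothetical helper `add_school` and lowercase names `techs` and `univs` (chosen here so as not to disturb the variables above); it shows that the function mutates the caller's list rather than a copy.

```python
def add_school(schools, name):
    """Hypothetical helper: appends name to the list it is given."""
    schools.append(name)   # mutates the caller's list; no copy is made

techs = ['MIT', 'Caltech']
univs = [techs]            # univs[0] and techs are two paths to one list

add_school(univs[0], 'RPI')

print(techs)               # ['MIT', 'Caltech', 'RPI']
print(univs[0] is techs)   # True
```

The `is` operator compares object identity directly, which is equivalent to comparing the values returned by `id`.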

As with tuples, a **for** statement can be used to iterate over the elements of a list. For example

In [None]:
for e in Univs:
    print('Univs contains', e) # list 
    print('   which contains')
    for u in e:
        print('    ', u)   # the elements of a list.

### append VS `concatenation(+) or extend`

When we `append` one list to another, e.g., `Techs.append(Ivys)`, the `original structure is maintained`. That is, the result is `a list that contains a list`. 

In [None]:
Techs = ['MIT', 'Caltech']
Ivys = ['Harvard', 'Yale', 'Brown']
Techs.append(Ivys)
Techs

Suppose we do not want to maintain this structure, but want to add the elements of one list into another list. We can do that by using list `concatenation` or the `extend` method, e.g.,

* Concatenating lists：+  

* Combining lists： extend

In [None]:
L1 = [1,2,3]
L2 = [4,5,6]

#  +  creates a new list
L3 = L1 + L2 
print('L3= L1 + L2,L3 ', L3)

print('id L1=',id(L1))
print('id L2=',id(L2))
print('a new list:id L3=',id(L3))

# extend: adds the items of list L2 to the end of list L1
L1.extend(L2)
print('L1.extend(L2) ,L1 =', L1)
print('mutated L1: id L1.extend(L2)=',id(L1))

# append: adds the object L2 itself as a single element at the end of L1
L1.append(L2) 
print('L1.append(L2), L1 =', L1)
print('mutated L1: id L1.append(L2)=',id(L1))

Notice that:

* The concatenation operator `+` does not have a side effect. It creates **a new list** and returns it. 


* In contrast, **extend** and **append** each **mutate** `L1`. 


### The List's Methods

The list data type has some more methods. Here are all of the methods of list objects:

https://docs.python.org/tutorial/datastructures.html#more-on-lists


#### `list` function
The `list` function is frequently used in data processing as a way to materialize an iterator or generator expression:


In [None]:
gen = range(10)
gen

In [None]:
list(gen) 
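As a hedged illustration of what "materialize" means, the sketch below applies `list` to a generator expression (the name `squares` is illustrative). A generator can be consumed only once, whereas a `range` object can be traversed repeatedly.

```python
squares = (x * x for x in range(5))   # generator expression: nothing computed yet

first_pass = list(squares)    # consumes the generator
second_pass = list(squares)   # the generator is now exhausted

print(first_pass)    # [0, 1, 4, 9, 16]
print(second_pass)   # []

r = range(5)
print(list(r) == list(r))   # True: a range can be traversed repeatedly
```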


![fig54](./img/fig54.PNG) 

Note that: `all of these` except <b style="color:blue">count</b> and <b style="color:blue">index</b> `mutate` the list.

#####   Supplementary list methods

* L.clear(): removes all the items from the list and returns None; same as del L[:].

* L.copy(): returns a shallow copy of L; same as L[:]


In [None]:
L = [1,2,3,3,1]
L.count(3)

In [None]:
L.index(2)

####  Adding and removing elements

* append,insert

* pop

`Elements` can be appended to the end of the list with the `append` method:

In [None]:
L = [1,2,3]
L.append(3) 
L

In [None]:
L = [1,2,3]
L.append([3,4]) 
L

Using `insert` you can insert an element at a specific `location` in the list:

In [None]:
L = [1,2,3]
L.insert(1, 'red')
L

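The insertion index is normally between `0 and the length of the list`, inclusive. An index beyond the end of the list does not raise an error; the element is simply added at the end.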

The inverse operation to insert is `pop`, which `removes` and returns `an element` at a particular `index`:

In [None]:
L.pop(2)

In [None]:
L

`Elements` can be `removed` by `value` with **remove**, which locates the `first such value` and removes it from the list:

In [None]:
L=[1, 'red', 2, 3]
L.append('red')
L

In [None]:
L.remove('red')
L

Check if a list contains a value using the **in** keyword:

In [None]:
'red' in L

The keyword **not** can be used to negate in:

In [None]:
'red' not in L

**NOTE** : Checking whether a list contains a value is **a lot slower** than doing so with **dicts and
sets** (to be introduced shortly), as Python makes a linear scan across the values of the
list, whereas it can check the others (based on hash tables) in constant time.
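One way to see the linear scan without timing anything is to count equality comparisons. The sketch below is an illustration under stated assumptions: `Counted` is a hypothetical wrapper class written for this example, and the exact number of comparisons a set performs depends on the hash-table implementation, so only the list count is stated.

```python
class Counted:
    """Illustrative wrapper that counts how often == is evaluated."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        return isinstance(other, Counted) and self.value == other.value

    def __hash__(self):
        return hash(self.value)

items = [Counted(i) for i in range(1000)]
as_set = set(items)
target = Counted(999)          # equal to the last element, but a distinct object

Counted.comparisons = 0
found_in_list = target in items        # scans the list from the front
list_comparisons = Counted.comparisons

Counted.comparisons = 0
found_in_set = target in as_set        # jumps to a bucket via the hash value
set_comparisons = Counted.comparisons

print(found_in_list, list_comparisons)   # True 1000
print(found_in_set, set_comparisons)     # True, with far fewer comparisons
```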

#### Combining lists: extend

If you have a list already defined, you can append multiple `elements` to it using the `extend` method:

In [None]:
L = [1,2,3]
L.extend(['3',4]) 
L

Using `extend` to append `elements` to an `existing` list is usually preferable to concatenation, especially if you are building up a `large` list. Thus,

```python
everything = []
for chunk in list_of_lists:
    everything.extend(chunk)
```
is **faster** than the concatenative alternative:
```python
everything = []
for chunk in list_of_lists:
    everything = everything + chunk
```
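The reason is that `everything + chunk` builds a new list and copies every element accumulated so far on each iteration, while `extend` grows one list in place. The following sketch (with illustrative data for `list_of_lists`) checks object identity rather than timing:

```python
list_of_lists = [[1, 2], [3], [4, 5, 6]]   # illustrative data

# extend: one list object grows in place
everything = []
original = everything
for chunk in list_of_lists:
    everything.extend(chunk)
grown_in_place = everything is original

# +: each iteration builds a new list and rebinds the name
combined = []
first = combined
for chunk in list_of_lists:
    combined = combined + chunk
rebuilt = combined is not first

print(everything == combined)   # True: same contents either way
print(grown_in_place)           # True
print(rebuilt)                  # True
```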

#### Sorting: sort
You can sort a list `in-place (without creating a new object)` by calling its `sort` method:

In [None]:
a = [7, 2, 5, 1, 3]
a.sort()
a

`sort` has a few options that will occasionally come in handy. One is the ability to pass a secondary sort **key**—that is, `a function` that produces a value to use to sort the objects.

For example, we could sort a collection of strings by their `lengths`

In [None]:
b = ['saw', 'small', 'He', 'foxes', 'six']
b.sort(key=len)
b
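When the original order must be preserved, the built-in function `sorted` can be used instead of the `sort` method. This short sketch contrasts the two; `sorted` accepts the same `key` and `reverse` arguments.

```python
b = ['saw', 'small', 'He', 'foxes', 'six']

by_length = sorted(b, key=len)                     # new list; b is left unchanged
longest_first = sorted(b, key=len, reverse=True)   # ties keep their original order
result_of_sort = b.sort()                          # sorts b in place and returns None

print(by_length)       # ['He', 'saw', 'six', 'small', 'foxes']
print(longest_first)   # ['small', 'foxes', 'saw', 'six', 'He']
print(result_of_sort)  # None
print(b)               # ['He', 'foxes', 'saw', 'six', 'small']
```

Because `sort` returns `None`, writing `b = b.sort()` is a common mistake that discards the list.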

#### reverse

`reverse` reverses the order of the elements in L in place:

In [None]:
L=[1,2,3]
L.reverse()
L

### Slicing

Recap:
> [Lecture1-2-02_INTRODUCTION_TO_PYTHON: Slicing String](./Lecture1-2-02_INTRODUCTION_TO_PYTHON.ipynb)
>
>**Strings are one of several sequence types in Python** 
>
>**They `share` the following operations with `all sequence` types.**
>
>* **Slicing** is used to extract substrings of arbitrary length. If s is a string, the expression <b>s[start:end] </b> denotes the substring of s that starts at index start and ends at index <b>end-1</b>.

You can select sections of most sequence types by using slice notation, which in its basic form consists of `start:stop` passed to the indexing operator `[]`:

In [None]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[1:5]

Slices can also be `assigned` to with a sequence:

In [None]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[3:4]

In [None]:
seq[3:4] = [6, 3]
seq

The single element of `seq[3:4]`, `[7]`, is replaced by the two elements `[6, 3]`.

While the element at the `start` index is `included`, the `stop` index is `not included`, so that the number of elements in the result is `stop - start`.
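The following sketch checks these properties on the same `seq`. It assumes `start` and `stop` both lie within the list and `start <= stop`; otherwise the slice is clipped and may be shorter than `stop - start`.

```python
seq = [7, 2, 3, 7, 5, 6, 0, 1]

start, stop = 1, 5
piece = seq[start:stop]
print(piece)                        # [2, 3, 7, 5]
print(len(piece) == stop - start)   # True

# Assigning to a slice may change the length of the list
seq[3:4] = [6, 3]                   # one element replaced by two
print(seq)                          # [7, 2, 3, 6, 3, 5, 6, 0, 1]
print(len(seq))                     # 9

# An out-of-range stop is clipped instead of raising IndexError
print(seq[7:100])                   # [0, 1]
```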

Either `the start or stop can be omitted`, in which case they default to the start of the sequence and the end of the sequence, respectively:

the start is omitted 

In [None]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[:5]

the stop is omitted

In [None]:
seq[3:]

`Negative` indices slice the sequence `relative to the end`:

In [None]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[-4:]

In [None]:
seq[-6:-2]

The Last item

In [None]:
seq[-1:]

A `step` can also be used after a second colon to, say, take `every other` element:


In [None]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[::2]

A clever use of this is to pass -1, which has the useful effect of reversing a list or tuple:

In [None]:
seq[::-1]

### 5.3.1 Cloning

It is usually prudent to <b>avoid mutating a list over which one is iterating</b>. Consider, for example, the code

In [None]:
def removeDups(L1, L2):
    """Assumes that L1 and L2 are lists.
       Removes any element from L1 that also occurs in L2"""
    for e1 in L1:
       
        # display mutation：L1.remove(e1)
        print('Current Item=',e1) 
        print('Current len(L1)=',len(L1))  
       
        print('L1=',L1,'\n')
        
        if e1 in L2:
            L1.remove(e1) # mutation：L1.remove(e1)

L1 = [1,2,3,4]
L2 = [1,2,5,6]

removeDups(L1, L2)
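# One might expect L1 = [3, 4], but this prints [2, 3, 4]:
# removing 1 shifts 2 into index 0, so the loop never visits it.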
print('\n removeDups L1 =', L1)

#### 1 One way to <b>avoid this kind of problem is to use slicing to clone</b> 

Make a copy of the list by slicing and iterate over the copy:

```python
for e1 in L1[:]:
```

In [None]:
def removeDups(L1, L2):
    """Assumes that L1 and L2 are lists.
       Removes any element from L1 that also occurs in L2"""
 
    for e1 in L1[:]: # use slicing to clone
        
        print('Current Item=',e1) 
        print('Current len(L1)=',len(L1))  
       
        print('L1=',L1,'\n')
        
        if e1 in L2:
            L1.remove(e1)

L1 = [1,2,3,4]
L2 = [1,2,5,6]
removeDups(L1, L2)
print('\n removeDups L1 =', L1)

* Writing <b>newL1 = L1</b> would merely introduce <b>a new name for L1</b>, so the loop would still iterate over the list being mutated.

  * Assignment statements in Python do not copy objects; they create bindings between a target and an object.

In [None]:
#Page 63-64
def removeDups(L1, L2):
    """Assumes that L1 and L2 are lists.
       Removes any element from L1 that also occurs in L2"""
    
    newL1=L1  # Assignment statements in Python do not copy objects, 
              # they create bindings between a target and an object.
    
    for e1 in newL1:
        
        print(len(L1))  # display mutation
        print('L1=',L1)
        
        if e1 in L2:
            L1.remove(e1)

L1 = [1,2,3,4]
L2 = [1,2,5,6]
removeDups(L1, L2)
print('\n removeDups L1 =', L1)

#### 2  The expression <b>list(l)</b> returns a copy of the list l. 

In [None]:
#Page 63-64
def removeDups(L1, L2):
    """Assumes that L1 and L2 are lists.
       Removes any element from L1 that also occurs in L2"""
    
    newL1=list(L1)  # a copy of the list L1
    
    for e1 in newL1:
        
        print(len(L1))  # display mutation
        print('L1=',L1)
        
        if e1 in L2:
            L1.remove(e1)

L1 = [1,2,3,4]
L2 = [1,2,5,6]
removeDups(L1, L2)
print('\n removeDups L1 =', L1)

### Further Reading

**1 Python 8.10 copy — Shallow and deep copy operations**

https://docs.python.org/3/library/copy.html

For collections that are mutable or contain mutable items, 

a copy is sometimes needed so one can change one copy without changing the other.

This module provides generic shallow and deep copy operations (explained below).

<p>Interface summary:
<ul>
<li>copy.copy(x): Return a shallow copy of x.
<li>copy.deepcopy(x): Return a deep copy of x.
</ul>

* A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
* A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
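The difference only becomes visible when the copied object contains mutable objects. A small sketch, using illustrative lowercase names `techs` and `univs`:

```python
import copy

techs = ['MIT', 'Caltech']
univs = [techs, ['Harvard', 'Yale', 'Brown']]

shallow = copy.copy(univs)       # new outer list, same inner lists
deep = copy.deepcopy(univs)      # new outer list, new inner lists

techs.append('RPI')              # mutate an inner list of the original

print(shallow[0])                # ['MIT', 'Caltech', 'RPI']
print(deep[0])                   # ['MIT', 'Caltech']
print(shallow[0] is univs[0])    # True
print(deep[0] is univs[0])       # False
```

Slicing (`univs[:]`) and `list(univs)` also produce shallow copies. For a flat list of numbers or strings, shallow and deep copies behave the same.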

**2. The Python Standard Library by Example 2.8 copy—Duplicate Objects**

In [None]:
import copy

def removeDups(L1, L2):
    """Assumes that L1 and L2 are lists.
       Removes any element from L1 that also occurs in L2"""
    
    newL1=copy.deepcopy(L1)  # a copy of the list L1
    
    for e1 in newL1:
        
        print(len(L1))  # display mutation
        print('L1=',L1)
        
        if e1 in L2:
            L1.remove(e1)

L1 = [1,2,3,4]
L2 = [1,2,5,6]
removeDups(L1, L2)
print('\n removeDups L1 =', L1)

<strong style="color:blue;font-size:200%">Cloning Methods</strong>

* slicing: L1[:]
* list(L1)
* copy.copy(L1), copy.deepcopy(L1)

### 5.3.2 List Comprehension

List comprehension provides a concise way to apply an operation to the values in a sequence.

It creates a new list in which each element is the result of applying a given operation to a value from a sequence 

In [None]:
L = [x**2 for x in range(1,7)]
print(L)

In [None]:
L =[]
for x in range(1,7):
    L.append(x**2)
print(L)

The `for` clause in a list comprehension can be <b>followed</b> by one or more 

* <b>if </b> statements 

* <b>for</b> statements 

that are applied to the values produced by the `for` clause.

* `if` statements

In [None]:
mixed = [1, 2, 'a', 3, 4.0]
print([x**2 for x in mixed if type(x) == int])

* `for` statements 

In [None]:
print([x*y for x in [1,2,3] for y in  [1,2,3]])
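The two kinds of clause can be combined. As a sketch, the comprehension below uses two `for` clauses followed by an `if`, and the nested loops after it spell out the order in which the clauses are evaluated (the names `pairs` and `pairs_loop` are illustrative).

```python
pairs = [(x, y) for x in [1, 2, 3] for y in [1, 2, 3] if x < y]

# The same result written as nested loops
pairs_loop = []
for x in [1, 2, 3]:
    for y in [1, 2, 3]:
        if x < y:
            pairs_loop.append((x, y))

print(pairs)                 # [(1, 2), (1, 3), (2, 3)]
print(pairs == pairs_loop)   # True
```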

#### Further Reading：Python Tutorial

* 5.1.3 List Comprehensions https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions

* 5.1.4 Nested List Comprehensions https://docs.python.org/3/tutorial/datastructures.html#nested-list-comprehensions


Remember that somebody else may need to read your code

* **subtle** is not usually a desirable property 

## 5.4 Functions as Objects

In Python, functions are **first-class objects**. That means that they can be treated `like objects of any other type`, e.g., int or list. They have types, e.g.,

In [None]:
type(abs)

In [None]:
type(removeDups)

they can appear in expressions, e.g., as the right-hand side of an assignment statement or as an argument to a function; they can be elements of lists; etc.

Using `functions as arguments` allows a style of coding called **higher-order programming**. It can be particularly convenient in conjunction with lists, as shown in

In [None]:
%%file functionsCh4.py

def factI(n):
    """Assumes that n is an int > 0
      Returns n!"""
    result = 1
    while n > 1:
        result = result * n
        n -= 1
    return result
   
def factR(n):
    """Assumes that n is an int > 0
      Returns n!"""
    if n == 1:
        return n
    else:
        return n*factR(n - 1)

def fib(n):
    """Assumes n an int >= 0
       Returns Fibonacci of n"""
    if n == 0 or n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)



In [None]:
from functionsCh4 import *

def applyToEach(L, func):
    """Assumes L is a list, func a function
       Mutates L by replacing each element, e, of L by func(e)"""
    for i in range(len(L)):
        L[i] = func(L[i])
      
L = [1, -2, 3.33]
print('L =', L)
print('\nApply abs to each element of L.')

applyToEach(L, abs)

print('L =', L)

print('\nApply int to each element of', L)

applyToEach(L, int)

print('L =', L)

print('\nApply factorial to each element of', L)
#  functionsCh4.py
applyToEach(L, factR)

print('L =', L)

print('\nApply Fibonacci to each element of', L)
# functionsCh4.py
applyToEach(L, fib)

print('L =', L)

The function `applyToEach` is called `higher-order` because it has an `argument` that is itself `a function`
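Since `applyToEach` mutates its argument, the original values are lost after each call. A possible non-mutating variant is sketched below; `applyToEachCopy` is a hypothetical name introduced here, not a function from the text.

```python
def applyToEachCopy(L, func):
    """Hypothetical variant of applyToEach.
       Assumes L is a list, func a function of one argument.
       Returns a new list; L itself is not mutated."""
    return [func(e) for e in L]

L = [1, -2, 3.33]
result = applyToEachCopy(L, abs)

print(result)   # [1, 2, 3.33]
print(L)        # [1, -2, 3.33]
```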

### map: a built-in higher-order function in Python

* the <b>simplest form</b> ：

  * the first argument to `map` is <b>a unary function</b>, a function that has only <b>one parameter</b> 
  * the second argument is any ordered collection of values  suitable as arguments to the first argument.

In [None]:
for i in map(fib, [2, 6, 4]):
    print(i)

In [None]:
list(map(factR, [1, 2, 3]))

In [None]:
l=[]
for i in [1,2,3]:
    l.append(factR(i))
l

* **More generally** 

  * the first argument to `map` can be a <b>function of n arguments</b>, in which case it must be followed by <b>n subsequent ordered collections</b>.

In [None]:
help(min)

In [None]:
#Page 64
L1 = [1, 28, 36]
L2 = [2, 57, 9]

print(list(map(min, L1, L2)))  # min

In [None]:
L1 = [1, 28, 36]
L2 = [2, 57, 9]
lmin=[]
for i in range(3):
    lmin.append(min(L1[i],L2[i]))
print(lmin)
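Two details of `map` are worth keeping in mind: it computes values lazily, and with several collections it stops as soon as the shortest one is exhausted. The sketch below deliberately makes `L2` shorter than `L1` to show both (the names `m`, `first`, `rest`, and `again` are illustrative).

```python
L1 = [1, 28, 36]
L2 = [2, 57]                  # deliberately shorter than L1

m = map(min, L1, L2)          # nothing is computed yet
first = next(m)               # computes min(1, 2)
rest = list(m)                # computes the remaining pairs; 36 has no partner
again = list(m)               # the map object is now exhausted

print(first)   # 1
print(rest)    # [28]
print(again)   # []
```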

### lambda

Python supports the creation of `anonymous` functions (i.e., functions that are not bound to a name), using the reserved word **lambda**. 

The general form of a lambda expression is
```python
lambda <sequence of variable names>: <expression>
```
For example, the lambda expression `lambda x, y: x*y` returns a function that returns  the product of its two arguments.


In [None]:
adder = lambda x, y: x+y
print(adder(3,6))

Lambda expressions are frequently used as arguments to higher-order functions. For example, the following code prints `[1, 4, 3, 1]`:

In [None]:
L = []
for i in map(lambda x, y: x**y, [1 ,2 ,3, 4], [3, 2, 1, 0]):
    L.append(i)
print(L)
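Another common use of `lambda` is as the `key` argument when sorting. The data in this sketch is made up for illustration:

```python
stock = [('pear', 3), ('fig', 10), ('apple', 7)]   # illustrative data

by_count = sorted(stock, key=lambda item: item[1])             # sort by the number
by_name_length = sorted(stock, key=lambda item: len(item[0]))  # sort by name length

print(by_count)         # [('pear', 3), ('apple', 7), ('fig', 10)]
print(by_name_length)   # [('fig', 10), ('pear', 3), ('apple', 7)]
```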


### Further Reading

* The Python Standard Library: Built-in Functions `map`: https://docs.python.org/3/library/functions.html#map

* The Python Language Reference ：6.13 `Lambdas` https://docs.python.org/3/reference/expressions.html#lambda


## 5.5 Strings, Tuples, Ranges, and Lists

We have looked at four different sequence types: `str, tuple, range, and list`. 

They are similar in that objects of these types can be operated upon as described in the Figure
<img src="./img/fig56.PNG"/>

Some of their other similarities and differences are summarized in the Figure

![fig57](./img/fig57.PNG)



Python programmers tend to use <b style="color:blue">lists</b> far more <b style="color:blue">often</b> than <b style="color:blue">tuples</b>. 

Since `lists` are **mutable**, they can be **constructed incrementally** during a computation. For example, the following code incrementally builds a list containing all of the `even` numbers in another list.

In [None]:
L=[1, -2, 3.33,4]
evenElems = []
for e in L:
    if e%2 == 0:
        evenElems.append(e)
        
print(evenElems)

### Built-in Methods of strings

Since strings can contain only characters, there are <b>many built-in methods</b> that make life easy.

Keep in mind that, since strings are immutable, these all return values and have no side effects.
<p>
<img src="./img/fig58.PNG"/>

In [None]:
s='David Guttag plays basketball David'
s.find('David')

In [None]:
s.rfind('David')

In [None]:
s="David Guttag plays basketball     "  # trailing whitespace
s.rstrip()

###  split

One of the more useful built-in methods is `split`, which takes two strings as arguments: the string on which it is called and a separator. The separator is used to split the first string into a sequence of substrings. For example,

* s.split(d): Splits `s` using `d` as a delimiter


In [None]:
print('My favorite professor--John G.--rocks'.split(' '))
print('My favorite professor--John G.--rocks'.split('-'))
print('My favorite professor--John G.--rocks'.split('--'))

In [None]:
s='David*Guttag*plays*basketball'
s.split('*')

In [None]:
s

####  whitespace  characters:

The second argument is optional. If that argument is omitted the first string is split using arbitrary strings of whitespace characters (space, tab, newline, return, and formfeed).

If `d` is omitted,
```python
s.split()
```
the substrings are separated by  whitespace  characters:

|space| tab |newline | return |formfeed|
|:---:|----:|-------:|-------:|------:|
|  space    |  \t |  \n  | \r    |  \f  |
 

In [None]:
s='David\t Guttag \n plays\r basketball\f whitespace characters '
s.split()   

In [None]:
print(s)
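Omitting the separator is not the same as passing a single space. As a brief sketch (the string and the name `words` are illustrative): `split()` treats any run of whitespace as one separator, while `split(' ')` splits at every individual space and so produces empty strings. The string method `join` performs the reverse operation.

```python
s = 'a  b   c'                  # two and three consecutive spaces

print(s.split())                # ['a', 'b', 'c']
print(s.split(' '))             # ['a', '', 'b', '', '', 'c']

words = s.split()
print('-'.join(words))          # a-b-c
```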

#### s.split(d) to read plain text files:

* Data Table,Dict and List

* [Lecture5-1-18_UNDERSTANDING_EXPERIMENTAL_DATA](./Lecture5-1-18_UNDERSTANDING_EXPERIMENTAL_DATA.ipynb)
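A rough sketch of the idea, assuming comma-separated text with a header line. The data below is made up and held in a string rather than read from a file, and the names `text`, `header`, and `table` are illustrative. Each line is split into fields and stored as a dict, giving a list of rows.

```python
text = """name,distance,mass
trial1,0.0865,0.1
trial2,0.1015,0.15
trial3,0.1106,0.2"""            # illustrative data, not from a real file

lines = text.split('\n')          # one string per line
header = lines[0].split(',')      # column names from the first line

table = []                        # list of rows, each row a dict
for line in lines[1:]:
    fields = line.split(',')
    row = {header[0]: fields[0],
           header[1]: float(fields[1]),   # numeric columns are converted
           header[2]: float(fields[2])}
    table.append(row)

print(header)          # ['name', 'distance', 'mass']
print(table[0])        # {'name': 'trial1', 'distance': 0.0865, 'mass': 0.1}
print(len(table))      # 3
```

For real files, a line read from disk usually ends with a newline character, so stripping it (for example with `rstrip`) before splitting is advisable.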
