# Jupyter Notebook Tutorial

+ Run selected cell: CTRL + ENTER
+ insert a cell above: a
+ insert a cell below: b
+ delete a cell: x
+ code auto complete: tab
+ review function API: SHIFT + tab

# Python Review

- Python script is case sensitive
- Indentation is used to denote code blocks
- \# starts a line comment; a triple-quoted string (`''' '''` or `""" """`) is often used as a multi-line comment
- A Python identifier is a sequence of characters that consists of letters, digits and underscores (_). An identifier must start with a letter or an underscore. It cannot start with a digit
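
The identifier rules above can be checked from Python itself. The following is a minimal sketch using `str.isidentifier()` and the standard `keyword` module; the candidate names are made up purely for illustration.

```python
import keyword

# Hypothetical candidate names, chosen only to exercise the rules
candidates = ['total', '_hidden', 'value2', '2value', 'my-var', 'for']

# Valid identifier syntax: letters, digits, underscores, not starting with a digit
valid_syntax = [name for name in candidates if name.isidentifier()]

# Reserved words pass isidentifier() but still cannot be used as variable names
usable = [name for name in valid_syntax if not keyword.iskeyword(name)]

print(valid_syntax)
print(usable)
```

Note that `isidentifier()` only checks the spelling rules, which is why the keyword check is done separately.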

In [None]:
# this is my first comment
"""comment1
comment2
comment3
"""
# comment1
# comment2
# comment3

#### Indentation
- Python uses whitespace (tabs or spaces) to structure code blocks instead of using braces as in many other languages
- Python statements also do not need to be terminated by semicolons. Semicolons can be used, however, to separate multiple statements on a single line

In [None]:
x = 1
if x > 0:
    x = x + 1
    print(x)
else:
    x = x - 1
    print(x)

In [None]:
x = 3.14
id(x)

In [None]:
x = 'hello'

In [None]:
a = 1; b = 2; c = 3

In [None]:
s = 'hello'
for i in s:
    print(i, end=' ')

In [None]:
print(1, 2, 3, sep='+')

### Line Continuation
In Python code, a statement can be continued from one line to the next in two different ways: implicit and explicit line continuation
- Implicit line continuation
  - Any statement containing opening parentheses (`(`), brackets (`[`), or curly braces (`{`) is presumed to be incomplete until all matching parentheses, brackets, and braces have been encountered
  - Until then, the statement can be implicitly continued across lines without raising an error
  
- Explicit line continuation
  - To indicate explicit line continuation, you can specify a backslash (`\`) character as the final character on the line
  - The backslash character must be the last character on the line. Not even whitespace is allowed after it
  - In that case, Python ignores the following newline, and the statement is effectively continued on next line

In [None]:
x = (1 + 2 + 3 + 4 + 5 + 
max(33, 66, 99))
x

In [None]:
x = 1 + 2 + 3 + 4 + 5 + \
max(33, 66, 99)
x

#### Everything is an Object
An important characteristic of the Python language is the consistency of its object model. Every number, string, data structure, function, class, module, etc. is a Python object. **Each object has an associated type and internal data**

#### Dynamic Typing and Type Checking
Python uses dynamic typing—it determines the type of the object a variable refers to while executing your code. You can use `type()` or `isinstance()` to check the type of an object

A variable in Python is actually a reference to an object. For `x = 7`, we say `x` is an integer variable that holds value 7. Strictly speaking, `x` is a variable that references an `int` object for value 7. To check if two variables reference the same object, use the `is` keyword. `is not` is also perfectly valid if you want to check that two objects are not the same

<div>
<img src="attachment:f1.png" width="260"/>
</div>

In [None]:
a = 6.0
type(a)

In [None]:
isinstance(a, int)

In [None]:
#isinstance() can accept a tuple of types
isinstance(a, (int, float, str))

In [None]:
a = [1, 2, 3]
b = a
c = list(a)
a is b

In [None]:
id(a)

In [None]:
id(b)

In [None]:
id(c)

In [None]:
a is not c

Since `list` always creates a new Python list (i.e., a copy), we can be sure that `c` is distinct from `a`. Comparing with `is` is not the same as the `==` operator

In [None]:
a == c

A very common use of `is` and `is not` is to check if a variable is `None`, since there is only one instance of `None`

In [None]:
a = None
a is None

### Data Type in Python
Every number in Python is an object. For example, when we define an integer in Python, such as x = 100, x is not just a "raw" integer. It's actually a pointer to a compound C structure, which contains several values.
This means that there is some overhead in storing an integer in Python as compared to an integer in a compiled language like C.

<div>
<img src="attachment:f1.png" width="350"/>
</div>

A single integer in Python actually contains four pieces (see the structure in the cell below):
    
`struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};`

- ob_refcnt, a reference count that helps Python silently handle memory allocation and deallocation
- ob_type, which encodes the type of the variable
- ob_size, which specifies the size of the following data members
- ob_digit, which contains the actual integer value that we expect the Python variable to represent
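
The storage overhead can be observed from Python itself. This is only a sketch and assumes the CPython interpreter; the exact byte counts depend on the interpreter version and platform, so no specific numbers are claimed here.

```python
import sys

x = 100
big = 2 ** 100  # needs more internal digits than a small integer

size_small = sys.getsizeof(x)   # bytes used by the whole int object
size_big = sys.getsizeof(big)

print(type(x))                  # the type recorded in the object header
print(size_small, size_big)
```

On CPython, both sizes are larger than the 8 bytes a C `long` typically occupies, and the larger value needs a bigger object.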

#### Standard Python Scalar Type
- `int`: arbitrary precision signed integer
- `float`: double precision (64-bit) floating point number
- `bytes`: raw binary data (a sequence of byte values)
- `str`: string type, holds Unicode strings
- `bool`: Boolean type, `True` or `False`
- `None`: Python's "null" value (only one instance of the `None` object exists)
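
A quick way to see these types side by side is to ask each value for its type. The sample values below are arbitrary and chosen only for illustration.

```python
values = [2 ** 100, 3.14, b'abc', 'h\u00e9llo', True, None]

for v in values:
    print(type(v).__name__, repr(v))

# int has arbitrary precision: no overflow for very large values
huge = 2 ** 100
# bool is a subclass of int, so True behaves like 1 in arithmetic
total = True + True
# str and bytes are distinct types; encoding converts one into the other
encoded = 'h\u00e9llo'.encode('utf-8')
print(huge, total, encoded)
```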

### String
A string is a sequence of Unicode characters. String literals can be enclosed in matching single quotes (') or double quotes ("). **Python strings are immutable**. Python does not have a data type for characters. A single-character string represents a character. 
You can use the + operator to add two numbers. The + operator can also be used to concatenate (combine) two strings.
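
As a small sketch of what immutability means in practice: item assignment on a string fails, so a modified string has to be built as a new object. The variable names are illustrative only.

```python
s = 'python'

try:
    s[0] = 'P'              # strings do not support item assignment
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

# Build a new string instead of modifying the old one
s2 = 'P' + s[1:]
print(s, s2)

# A "character" is just a string of length 1
ch = s[0]
print(type(ch), len(ch))
```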
#### Testing Strings
<div>
<img src="attachment:f1.png" width="500"/>
</div>

In [1]:
# the space character is not alnum
s = 'welcome to python3'
s.isalnum()

False

In [None]:
s1 = ' \t \n'
s1.isspace()

In [None]:
''.isspace()

In [None]:
'3abc'.isidentifier()

In [1]:
'for'.isidentifier()

True

In [None]:
'abc$abc'.islower()

In [None]:
'123456'.isdigit()

#### Searching for Substrings
<div>
<img src="attachment:f2.png" width="500"/>
</div>

In [None]:
s

In [None]:
s.endswith('thon')

In [None]:
s.count('o')

In [None]:
s.find('o')

#### Converting Strings
<div>
<img src="attachment:f1.png" width="500"/>
</div>

In [None]:
s = 'welcome to python3'

In [None]:
s1 = s.capitalize()
s1

In [None]:
s

In [None]:
s.title()

In [None]:
s.replace('python', 'java')

In [None]:
s

In [None]:
s.replace('o', 'x', 2)

`replace` will substitute occurrences of one pattern for another. It is commonly used to delete patterns, too, by passing an empty string

In [None]:
s = '$360'
s.replace('$', '')

#### Stripping Whitespace Characters
<div>
<img src="attachment:f2.png" width="500"/>
</div>

In [3]:
s = '   Welcome to Python\t'
s1 = s.lstrip()
s1

'Welcome to Python\t'

In [None]:
s

In [2]:
s = 'www.tiktok.com'
s.strip('wcom.')

'tiktok'

#### Two More Methods
- `split()`: splits the string at the specified separator and returns a list
- `splitlines()`: splits the string at line breaks (`\n`) and returns a list

In [None]:
items = 'Jane John Peter Susan'.split()
items

In [None]:
lines = """hello world
DSA 5100
is fun"""
lines

In [None]:
lines.splitlines()

In [4]:
# Extract 2021 from the string '11/01/2021'
items = '11/01/2021'.split('/')
items[2]

'2021'

In [2]:
s1 = 'abc@ucmo.edu'
result = s1.split('@')
result[0]

'abc'

In [None]:
val = 'a,b,  guido'
val.split(',')

`split` is often combined with `strip` to trim whitespace (including line breaks)

In [None]:
pieces = [x.strip() for x in val.split(',')]
pieces

These substrings can be concatenated together using `+`

In [None]:
first, second, third = pieces

In [None]:
first + '::' + second + '::' + third

A faster and more Pythonic way is to pass a list or tuple to the `join` method on the string '::'

In [None]:
'::'.join(pieces)

In [None]:
'::'.join('hello')

In [None]:
'::'.join('hello', 'world')

In [None]:
'::'.join(3)

Other methods are concerned with locating substrings. Using Python's `in` keyword is the best way to detect a substring, though `index` and `find` can also be used

In [None]:
val

In [None]:
'guido' in val

In [None]:
# if not found, an exception is raised
val.index(':')

In [None]:
# if not found, -1 is returned
val.find(':')

### Formatted String Literals
f-strings are string literals that have an `f` at the beginning and curly braces containing expressions that will be replaced with their values. Because f-strings are evaluated at runtime, you can put any and all valid Python expressions in them

In [None]:
name = 'John'
age = 36
print(f'Hello {name}! You are {age}.')

In [None]:
print(f'Hello {name.lower()}! You are {age}.')

In [None]:
# using format method
print('Hello {}! You are {}'.format(name, age))

In [None]:
print('{1} {0} {1}'.format('Happy', 'Birthday'))

### List

A list is a sequence defined by the list class. Lists are used to store multiple items in a single variable. A list can contain elements of the same type or of mixed types. The elements in a list are separated by commas and are enclosed by a pair of brackets ([ ]).

**List items are ordered, changeable, and allow duplicate values**

List items are indexed, the first item has index `[0]`, the second item has index `[1]` etc.

In [None]:
# For convenience, you may create a list using the following syntax

list1 = [] # Same as list()
list2 = [2, 3, 4] # Same as list([2, 3, 4]) 
list3 = ["red", "green"] # Same as list(["red", "green"])

### `list()` Function
You can define a list with the built-in `list()` function

```python
list(<iter>)
```

The argument `<iter>` is an `iterable` which contains objects to be included in the list

In [4]:
# Creating Lists

list1 = list() # Create an empty list
list2 = list([2, 3, 4]) # Create a list with elements 2, 3, 4
list3 = list(["red", "green", "blue"]) # Create a list with strings
list4 = list(range(3, 6)) # Create a list with elements 3, 4, 5
list5 = list("abcd") # Create a list with characters a, b, c, d

In [5]:
range(3, 6)

range(3, 6)

In [6]:
list(range(3, 6))

[3, 4, 5]

In [7]:
list5 = list("abcd")
list5

['a', 'b', 'c', 'd']

In [8]:
l1 = list(True)
l1

TypeError: 'bool' object is not iterable

In [None]:
list([2, 3, 4])

In [None]:
list(2, 3, 4)

In [2]:
list(3)

TypeError: 'int' object is not iterable

### `del` Statement
The `del` statement removes the item at a specified index from a list. It can also delete a slice of the list or the list completely. In fact, it can delete any variable from the interactive session

In [9]:
list3

['red', 'green', 'blue']

In [10]:
del list3[0]
list3

['green', 'blue']

In [None]:
del list2

In [None]:
list2

The `list` function is frequently used in data processing as a way to **materialize an iterator or generator expression**
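
The `range` objects shown here are one case; a generator expression is another. The sketch below assumes nothing beyond built-ins and shows that a generator produces its values only when something such as `list` consumes it.

```python
squares = (n * n for n in range(5))   # generator expression: nothing computed yet

materialized = list(squares)          # consumes the generator and builds a list
print(materialized)

# A generator can only be consumed once; a second pass yields nothing
second_pass = list(squares)
print(second_pass)
```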

In [11]:
range(6)

range(0, 6)

In [12]:
list(range(6))

[0, 1, 2, 3, 4, 5]

### Copy a List
Typing `list2 = list1` does not copy a list. `list2` becomes a second reference to the same list object, so changes made through `list1` are also visible through `list2`

There are several ways to make a copy. One is the built-in list method `copy()`, which returns a shallow copy

In [None]:
list1 = ['apple', 'banana', 'cherry']
list2 = list1
list2[0] = 'orange'
list1

In [None]:
list1 = ['apple', 'banana', 'cherry']
list2 = list1.copy()
print(id(list1), id(list2))

In [None]:
print(id(list1[0]), id(list2[0]))

In [None]:
id('apple')

In [None]:
list2[0] = 'orange'
list2

In [None]:
list1

In [None]:
print(id(list1[0]), id(list2[0]))

In [None]:
id('orange')

### Common Operations for Lists

- `x in s # True if element x is in list s`
- `x not in s # True if element x is not in list s`
- `s1 + s2 # concatenate two lists s1 and s2`
- `s * n`, `n * s` `# n copies of list s concatenated`
- `s[i] # ith element in list s`
- `s[i : j] # slice of list from index i to j - 1`
- `len(s) # length of list s (number of elements in s)`
- `min(s) # smallest element in list s`
- `max(s) # largest element in list s`
- `sum(s) # sum of all elements in list s`
- `for loop # traverse elements from left to right in a for loop`
- `<`, `<=`, `>`, `>=`, `==`, `!=` `# compare two lists`
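
The operations listed above can be tried together on two small lists. This is a minimal sketch; the lists `s` and `t` are arbitrary examples.

```python
s = [3, 1, 4]
t = [1, 5]

print(4 in s, 9 not in s)     # membership tests
print(s + t)                  # concatenation builds a new list
print(t * 2)                  # repetition
print(s[0], s[1:3])           # indexing and slicing
print(len(s), min(s), max(s), sum(s))

# Comparison is lexicographic: the first differing pair decides
print(s > t)                  # 3 > 1, so s compares as greater
```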

### Index Operator [ ]

An element in a list can be accessed through the index operator by using the syntax `s[index]`. List indexes are `0` based. `s[index]` can be used just like a variable so it is also known as an index variable. Python also allows the use of negative numbers as indexes to reference positions relative to the end of the list. For example, `s[-1]` is the same element as `s[-1 + len(s)]`

<div>
<img src="attachment:f1.png" width="600"/>
</div>
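
A short sketch of positive and negative indexing on a flat list, using an arbitrary four-element example:

```python
s = ['a', 'b', 'c', 'd']

last = s[-1]
same_last = s[-1 + len(s)]     # index 3
first = s[-len(s)]             # the most negative valid index

s[-2] = 'C'                    # an indexed element can be assigned like a variable
print(last, same_last, first, s)
```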

In [None]:
s2 = [1, 2, 3, [1, 2, ['one', 'two', 'three']]]

In [None]:
s2[3][2][2]

In [None]:
s2[-1][-1][-1]

### List Slicing [start : end : step]

- The index operator allows you to select an element at the specified index
- The slicing operator returns a slice of the list using syntax `list[start:end]`
- The slice is a sublist from index `start` to index `end - 1`
- If `start >= end`, `list[start:end]` returns an empty list
- If `end > len(list)`, Python will use the length of the list for `end` instead
- The starting index or ending index may be omitted. In that case, the starting index is `0` and the ending index is the list length `len(list)`
- The step is optional. If not specified, the default step is 1
- **When a negative step is used, Python steps backward through the list**
- **In Python list slicing, slices will be copies (shallow copies)**

In [13]:
s = [1, 2, 3, 4, 5, 6]

In [14]:
s[1:4:1]

[2, 3, 4]

In [15]:
s[:4]

[1, 2, 3, 4]

In [None]:
s[1:]

In [None]:
s[5]

In [None]:
s[-1]

In [None]:
s[-2]

In [None]:
s[1::2]

In [None]:
s[:]

In [None]:
s1 = [1, 2, 3, 4, 5, 6]

In [None]:
s1[1:1]

In [None]:
s1[1:2]

In [None]:
s1[1]

In [None]:
s1[:100]

In [None]:
s1_sub = s1[1:3]
s1_sub

In [None]:
s1_sub[0] = 99

In [None]:
s1_sub

In [None]:
s1

In [None]:
s2 = [[1, 2, 3], 4, 5, 6]

In [None]:
s2_sub = s2[:2]
s2_sub

In [None]:
s2_sub[0][0] = 99
s2_sub

In [None]:
s2

A clever use of the step is to pass `-1`, which has the effect of reversing a list

In [None]:
s

In [None]:
s[::-1]

In [None]:
s[1:4:-1]

In [None]:
#delete a slice
del s[0:3]

In [None]:
s

In [None]:
s = [1, 2, 3, 4, 5, 6]

In [None]:
del s[1::2]
s

### Difference between Strings and Lists
The `[:]` syntax works for both strings and lists. However, there is an important difference
- If `s` is a string, `s[:]` returns a reference to the same object
- If `s` is a list, `s[:]` returns a new object that is a copy of `s`

In [None]:
s = 'hello'
s[:]

In [None]:
s[:] is s

In [None]:
s = [1, 2, 3]
s[:]

In [None]:
s[:] is s

### Modifying Multiple List Values
Slice assignment can be used to change several contiguous elements in a list at one time

```python
a[start:end] = <iterable>
```

* The number of elements inserted need not be equal to the number replaced. Python just grows or shrinks the list as needed
* You can insert multiple elements in place of a single element - just use a slice that denotes only one element
* Use `a[start:end] = []` to delete multiple elements out of the middle of a list
* You can also insert elements into a list without removing anything. Simply specify a slice of the form `[n:n]` (a zero-length slice) at the desired index

In [None]:
s = [1, 2, 3, 4, 5, 6, 7, 8, 9]

In [None]:
s[1:5]

In [None]:
s[1:5] = [1.1, 2.2, 3.3, 4.4, 5.5, 6.6]
s

In [None]:
s[1:2] = [1.1, 1.2, 1.3]
s

Note that this is not the same as replacing the single element with a list

In [None]:
s[1] = [1.1, 1.2, 1.3]
s

In [None]:
s[1:5] = []
s

In [None]:
# the item at the index and the remaining list elements are pushed to the right
s[1:1] = [33, 44, 55, 66]
s

In [None]:
s[1:1]

### The `+`, `*`, `+=`, `*=`, `in` and `not in` Operators

In [None]:
list1 = [1, 6]
list1

In [None]:
list2 = [2, 9]
list2

In [None]:
list1 + list2

In [None]:
list1 * 3

In [None]:
3 * list1

In [None]:
1 in list1

In [None]:
6 not in list1

In [None]:
list1 += list2 # extends list1 in place with the elements of list2
list1

When the left operand of `+=` is a list, the right operand must be an iterable

In [8]:
a_list = []
for number in range(1, 6):
    a_list += [number] # a_list = a_list + [number]
a_list

[1, 2, 3, 4, 5]

In [None]:
a_list = []
for number in range(1, 6):
    a_list += number
a_list

In [None]:
letters = []
letters += 'Python' # string is an iterable
letters

### List Methods

<div>
<img src="attachment:f1.png" width="500"/>
</div>

In [6]:
s1 = [1, 2, 3, 4, 5, 6]
s1

[1, 2, 3, 4, 5, 6]

In [None]:
# the item at the index and the remaining list elements are pushed to the right
s1.insert(1, 99)
s1

In [7]:
s1.append(-1)
s1

[1, 2, 3, 4, 5, 6, -1]

- When the `+=` operator is used to add to a list, the right operand can be any iterable; its elements are broken out and appended to the list individually (the plain `+` operator requires both operands to be lists)
- The `append()` method does not work that way! If an iterable is appended to a list with `append()`, it is added as a single object
- `extend()` also adds to the end of a list, but the argument is expected to be an `iterable`. The items in `<iterable>` are added individually
- `extend()` behaves like the `+` operator. More precisely, since it modifies the list in place, **it behaves like the `+=` operator**

In [None]:
s = ['a', 'b']
id(s)

In [None]:
id(s + [1, 2, 3])

In [None]:
s + [1, 2, 3]

In [None]:
s = ['a', 'b']
id(s)

In [None]:
s += [1, 2, 3] # modifies s in place, like s.extend([1, 2, 3])
id(s)

In [None]:
s

In [None]:
#appending tuple to list
s = ['a', 'b']
s += (1, 2, 3)
s

In [None]:
#appending string to list
s = ['a', 'b']
s += 'hello'
s

In [None]:
s = ['a', 'b']

In [None]:
s.append([1, 2, 3])
s

In [9]:
s = ['a', 'b']
s.extend([1, 2, 3])
s

['a', 'b', 1, 2, 3]

In [None]:
s = ['a', 'b']
s.extend('hello')
s

In [None]:
s = ['a', 'b']
s.append('hello')
s

In [None]:
s = ['a', 'b']
s.extend(3)
s

In [None]:
s1

In [None]:
s1.sort(reverse=True)
s1

In [None]:
s1 = [34, 67, 99, -1, 66]
s1

In [None]:
s1.reverse()
s1

In [None]:
s1[::-1]

In [None]:
str1 = 'hello'
str1.reverse()

In [None]:
str1[::-1]

In [None]:
s1 = [1, 3, 5, 6, 3, 8, 9, 10]
s1

In [None]:
# a ValueError occurs if the value is not in the list
s1.index(-1)

In [None]:
#specify the starting index of a search
s1.index(3, 2)

In [12]:
list1 = [1, 2, 3, 4, 3, 5, 6]
list1

[1, 2, 3, 4, 3, 5, 6]

In [None]:
list1.remove(3)
list1

In [13]:
n = list1.pop(2)
n

3

In [14]:
n = list1.pop()
n

6

In [None]:
list1.clear()
list1

In [None]:
del list1

In [None]:
list1

### Difference Between `remove()` and `pop()`

- `pop()` takes the index of the item to remove (the last item by default), whereas `remove()` takes the value to remove
- `pop()` returns a value: the item that was removed, `remove()` returns `None`
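
The two differences can be seen side by side in a small sketch; the list contents are arbitrary.

```python
items = ['a', 'b', 'c', 'b']

removed = items.pop(1)          # by index: returns the removed item
print(removed, items)

result = items.remove('b')      # by value: removes the first match, returns None
print(result, items)
```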

### Sorting
You can sort a list **in place** by calling its `sort()` method. `sort()` has a few options that will occasionally come in handy. One is the ability to pass a secondary sort key - a function that produces a value to use to sort the objects. For example, we can sort a collection of strings by their lengths

In [18]:
b = ['saw', 'small', 'He', 'foxes', 'six']
b.sort(key = len)
b

['He', 'saw', 'six', 'small', 'foxes']

### Traversing a List
A Python list is iterable. Python supports a convenient `for` loop, which enables you to traverse the list sequentially without using an index variable

In [None]:
list1 = [1, 2, 3, 4, 5, 6]

for i in list1:
    i = i * 2
    print(i, end = ' ')
    
print()
print(list1)

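When iterating over a list with `for`, the loop variable is bound to each element in turn. Reassigning the loop variable (as in `i = i * 2` above) only rebinds the name and does not change the list. To update the list itself, assign through an index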
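
A short sketch of how the loop variable behaves in practice: reassigning it leaves the list alone, while mutating a mutable element through it is visible in the list. The variable names are illustrative only.

```python
numbers = [1, 2, 3]
for n in numbers:
    n = n * 2                  # rebinds the name n; the list is untouched

nested = [[1], [2], [3]]
for inner in nested:
    inner.append(0)            # mutates the object the list still refers to

# One way to transform immutable elements: build a new list
doubled = [n * 2 for n in numbers]
print(numbers, nested, doubled)
```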

In [None]:
# use index to iterate the list

for i in range(len(list1)):
    list1[i] = list1[i] * 2
    print(list1[i], end = ' ')
    
print()
print(list1)

In [None]:
# use while loop to iterate the list

i = 0

while i < len(list1):
    list1[i] = list1[i] * 2
    print(list1[i], end = ' ')
    i += 1
    
print()
print(list1)

## Built-in Sequence Functions
### `enumerate()`
When iterating over a sequence, it is common to want to keep track of the index of the current item. Python has a built-in function `enumerate` which returns an iterator of `(i, value)` tuples
```python
for index, value in enumerate(collection):
   # do something with value
```

In [15]:
enumerate(list1)

<enumerate at 0x106389990>

In [16]:
list(enumerate(list1))

[(0, 1), (1, 2), (2, 4), (3, 3), (4, 5)]

In [17]:
for i, v in enumerate(list1):
    print(i, v)

0 1
1 2
2 4
3 3
4 5


### `sorted()`
The `sorted()` function returns a **new** sorted list from the elements of any sequence. The original sequence is unmodified. The `sorted()` function accepts the same arguments as the `sort()` method on lists.
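
As a sketch of the shared arguments, `key` and `reverse` work with `sorted()` just as they do with `sort()`. The word list is reused from the sorting example purely for illustration.

```python
words = ['saw', 'small', 'He', 'foxes', 'six']

by_length = sorted(words, key=len)            # stable: ties keep original order
descending = sorted(words, reverse=True)      # uppercase sorts before lowercase

# sorted() accepts any iterable, for example a tuple, and always returns a list
from_tuple = sorted((3, 1, 2))

print(by_length)
print(descending)
print(from_tuple)
print(words)                                  # the original list is unchanged
```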

In [None]:
list1 = [7, 1, 2, 6, 0, 3, 2]

In [None]:
sorted(list1)

In [None]:
list1

In [19]:
sorted('Horse Race1')

[' ', '1', 'H', 'R', 'a', 'c', 'e', 'e', 'o', 'r', 's']

### `reversed()`
`reversed()` iterates over the elements of a sequence in reverse order. `reversed()` returns an iterator, so it does not create the reversed sequence until materialized (e.g., with `list` or a `for` loop)

In [16]:
reversed(range(10))

<range_iterator at 0x107df6ac0>

In [17]:
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [18]:
list(reversed('hello'))

['o', 'l', 'l', 'e', 'h']

### List Comprehension
List comprehensions provide a concise way to create lists from sequences and other iterables. A list comprehension consists of brackets containing an expression followed by a `for` clause, then zero or more `for` or `if` clauses. The result will be a new list resulting from evaluating the expression for each item.

**Syntax**

`newlist = [expression for item in iterable if condition == True]`

The return value is a new list, leaving the old list unchanged. The condition is like a filter that only accepts the items that evaluate to `True`. The condition is optional and can be omitted. The iterable can be any iterable object, like a list, tuple, set etc. 

In [None]:
list1 = [x for x in range(0, 5)] # Returns a list of 0, 1, 2, 3, 4
list1

In [None]:
list(range(0, 5))

In [None]:
list1 = []
for i in range(0, 5):
    list1.append(i)
    
list1

In [None]:
list1 = [x * 2 for x in range(0, 5) if x < 3] # Returns a list of 0, 2, 4
list1

In [None]:
list1 / 2

In [None]:
list2 = [0.5 * x for x in list1] 
list2

In [None]:
list3 = [x for x in list2 if x < 1.5]
list3

In [None]:
s = ['a', 'as', 'bat', 'car', 'dove', 'python']
[x.upper() for x in s if len(x) > 2]

### Nested List Comprehension
The `for` parts of the list comprehension are arranged according to the order of nesting, and any filter condition is put at the end as before

In [None]:
# to get a single list containing all names with two or more e's in them
data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

# for loop approach
names_of_interest = [] 
for names in data: 
    enough_es = [name for name in names if name.count('e') >= 2]
    names_of_interest.extend(enough_es)
names_of_interest

In [None]:
# nested list comprehension approach
result = [name for names in data for name in names if name.count('e') >= 2]
result

In [None]:
# flatten a list of tuples of integers into a simple list of integers
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

flattened = []
for tup in some_tuples:
    for x in tup:
        flattened.append(x)
flattened

In [None]:
r1 = [x for tup in some_tuples for x in tup]
r1

It is important to distinguish the syntax just shown from a list comprehension inside a list comprehension. This produces a list of lists, rather than a flattened list of all of the inner elements

In [None]:
# a list comprehension inside a list comprehension
[[x for x in tup] for tup in some_tuples]

### Shallow Copies vs Deep Copies
- The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances)
- Shallow copy: copy the elements' references but not the objects they point to
- Deep copy: actually copy the referenced objects themselves

In [None]:
# copy() returns a shallow copy
a = [1, 2, 3]
b = [11, 22, 33]
data = [a, b]
data1 = data.copy()

In [None]:
a[0] = 99
data

In [None]:
data1

```python
a = [1, 2, 3]
b = [11, 22, 33]
data = [a, b]
data_mycopy = data
```

<div>
<img src="attachment:f1.png" width="300"/>
</div>

```python
import copy
data_copy = copy.copy(data)
```

<div>
<img src="attachment:f2.png" width="300"/>
</div>

```python
data_deepcopy = copy.deepcopy(data)
```

<div>
<img src="attachment:f3.png" width="300"/>
</div>

In [None]:
import copy
a = [1, 2, 3]
b = [11, 22, 33]
data = [a, b]
data1 = copy.deepcopy(data)

In [None]:
a[0] = 99
data

In [None]:
data1

### List Comparison
You can compare entire lists **element-by-element** using comparison operators. Since lists are ordered, lists that have the same elements in a different order are not the same

In [None]:
a = [1, 2, 3]
b = [1, 3, 2]
c = [1, 2, 3, 4]
d = [1, 2, 6]
e = [1, 1, 6]

In [None]:
a == b

In [None]:
a == c

In [None]:
a < c # a has fewer elements than c

In [None]:
c >= b 

In [None]:
c < d

In [None]:
a > e

In [None]:
a = ['foo', 'bar', 'baz', 'qux']
b = ['baz', 'qux', 'bar', 'foo']
a == b

In [None]:
a is b

### Tuples
Tuples are like lists except they are immutable. Once they are created, their contents cannot be changed. In other words, you cannot add new elements, delete elements, replace elements or reorder the elements in the tuple. If the contents of a list in your application do not change, you should use a tuple to prevent data from being modified accidentally. Furthermore, tuples are more efficient than lists. **Tuples and lists are semantically similar and can be used interchangeably in many functions**
<br>

You create a tuple by enclosing its elements inside a pair of ( ). The elements are separated by commas. You can create an empty tuple and create a tuple from a list. Tuples are sequences. The common operations for sequences (those listed in the list section) can be used for tuples.

**Tuple items are ordered, unchangeable, and allow duplicate values**

Tuple items are indexed, the first item has index `[0]`, the second item has index `[1]` etc.


### `tuple()` Function
You can define a tuple with the built-in `tuple()` function

```python
tuple(<iter>)
```

The argument `<iter>` is an `iterable` which contains objects to be included in the tuple

In [None]:
# Creating Tuples

t1 = () # Create an empty tuple

t2 = (1, 3, 5) # Create a tuple with three elements

t3 = 1, 3, 5 # Create a tuple with three elements without ()

In [None]:
# Create a tuple from a list
t4 = tuple([2 * x for x in range(1, 5)]) 

# Create a tuple from a string
t5 = tuple("abac") # t5 is ('a', 'b', 'a', 'c')

In [None]:
t5 = tuple("abac")
t5

In [None]:
tuple([1, 2, 3])

In [None]:
tuple(1, 2, 3)

In [None]:
tuple(1)

In [None]:
t4 = tuple([2 * x for x in range(1, 5)]) 
t4

Elements can be accessed with square brackets `[]` as with most other sequence types

In [None]:
t4[1]

### Tuples May Contain Mutable Objects
While the objects stored in a tuple may be mutable themselves, once the tuple is created, it is not possible to modify which object is stored in each slot

In [None]:
tup = tuple(['foo', [1, 2], True])
tup

In [None]:
tup[0] = 'hi'

If an object inside a tuple is mutable, such as a list, you can modify it in place

In [None]:
tup[1].append(3)

In [None]:
tup

In [None]:
tup[1] = [1, 2, 3, 5]

In [None]:
tup

When creating a tuple with only one item, remember to include a comma after the item, otherwise it will not be identified as a tuple.

In [None]:
# Wrong way to define a tuple with one element
t6 = ('Apple')
type(t6)

In [None]:
t6 = ('Apple', )

In [None]:
type(t6)

The `del` keyword cannot delete individual items from a tuple because tuples are immutable, but it can delete the tuple completely

In [None]:
del t6[0]

In [None]:
del t6

In [None]:
print(t6)

### `+`, `+=`, `*`  and `*=` Operators
As with lists, you can concatenate tuples using the `+` or `+=` operator to produce longer tuples. **For a string or tuple, the item to the right of `+=` must be a string or tuple, respectively - mixed types cause a `TypeError`**. Multiplying a tuple by an integer, as with lists, has the effect of concatenating together that many copies of the tuple

In [None]:
(3, None, 'foo') + (6, 0) + ('bar',)

In [None]:
('foo', 'bar') * 3

In [None]:
tuple1 = (10, 20, 30)
tuple2 = (40, 50)

In [None]:
tuple1 += tuple2 # tuple1 = tuple1 + tuple2
tuple1

In [None]:
tuple2

In [None]:
tuple1 += [1, 2, 3]
tuple1

In [None]:
#appending tuple to list
numbers = [1, 2, 3, 4, 5]

In [None]:
numbers += (6, 7)
numbers

In [None]:
numbers += 'hello'
numbers

In [None]:
tuple2 *= 3 # tuple2 = tuple2 * 3
tuple2

### `in` and `not in` Operators
To check whether a tuple contains a value, use `in` or `not in` operator

In [None]:
fruits = ('apple', 'banana', 'cherry', 'strawberry', 'raspberry')

In [None]:
'apple' in fruits

In [None]:
'kiwi' not in fruits

### Tuple Unpacking
When we create a tuple, we normally assign values to it. This is called "packing" a tuple. In Python, we are also allowed to extract the values back into variables. This is called tuple unpacking

- `()` on the left side of the equal sign is optional
- the number of variables must match the number of values in the tuple
- if not, you must use an asterisk to collect the remaining values as a list

In [None]:
fruits = ('apple', 'banana', 'cherry')

In [None]:
n1, n2, n3 = fruits

print(n1)
print(n2)
print(n3)

In [None]:
x = 1
y = 2

In [None]:
x, y = y, x

If the number of variables is less than the number of values, you can add an * to the variable name and the values will be assigned to the variable as a list

In [None]:
fruits = ('apple', 'banana', 'cherry', 'strawberry', 'raspberry')

n1, n2, *n3 = fruits

print(n1)
print(n2)
print(n3)

If the asterisk is added to a variable other than the last one, Python assigns values to the starred variable until the number of values left matches the number of variables left

In [None]:
(n1, *n2, n3) = fruits

print(n1)
print(n2)
print(n3)

Even sequences with nested tuples can be unpacked

In [None]:
tup = (1, 2, (3, 6))
a, b, c = tup
c

A common use of tuple (variable) unpacking is iterating over sequences of tuples or lists

In [None]:
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
for a, b, c in seq:
    print('a = {0}, b = {1}, c = {2}'.format(a, b, c))

### Sequence Unpacking
Similar to tuple unpacking, you can unpack any sequence's elements by assigning the sequence to a comma-separated list of variables

In [None]:
fruits = ['apple', 'banana', 'cherry', 'strawberry', 'raspberry']

n1, n2, *n3 = fruits

print(n1)
print(n2)
print(n3)

In [None]:
#unpacking a string
s1, s2 = 'hi'

In [None]:
s1

In [None]:
s2

In [None]:
#unpacking a sequence produced by range
n1, n2, n3 = range(10, 40, 10)

In [None]:
n1

In [None]:
n2

In [None]:
n3

### Tuple Methods
Python has two built-in methods that you can use on tuples

- `count()`: Returns the number of times a specified value occurs in a tuple
- `index()`: Searches the tuple for a specified value and returns the position of where it was found

In [None]:
a = (1, 2, 2, 2, 3, 2)
a.count(2)

In [None]:
a.index(2)

### Sets
Like lists, sets store a collection of items. Unlike lists, the elements in a set are unique and are not kept in any particular order. If your application does not care about the order of the elements, using a set can be more efficient than using a list, particularly for membership tests. Set literals are written with braces { }. A set can be created in two ways: via the `set()` function or via a set literal with curly braces. **The elements in a set can be objects of different types**

**Set is a collection which is unordered and unindexed. No duplicate members**


### Construct a Set
You can define a set with the built-in `set()` function or with curly braces (`{}`)

```python
set(<iter>)
{<obj>, <obj>, ..., <obj>}
```

- The argument to `set()` is an iterable. It generates a list of elements to be placed into the set.
- The objects in curly braces are placed into the set intact (each `<obj>` becomes a distinct element of the set), even if they are iterable
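
The difference between the two forms is easiest to see with an iterable argument. A minimal sketch with arbitrary sample values:

```python
from_iterable = set('foo')        # iterates over the string: unique characters
literal = {'foo'}                 # one element: the whole string

from_list = set([1, 2, 2, 3])     # duplicates collapse
literal_tuple = {(1, 2, 2, 3)}    # one element: the tuple itself

print(from_iterable, literal)
print(from_list, literal_tuple)
```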

In [None]:
# Creating Sets

s1 = set() # Create an empty set

s2 = {1, 3, 5} # Create a set with three elements

s3 = set((1, 3, 5)) # Create a set from a tuple

# Create a set from a list
s4 = set([x * 2 for x in range(1, 10)]) 

# Create a set from a string
s5 = set("abac") # s5 is {'a', 'b', 'c'} 

In [None]:
s5

In [None]:
set(1, 2, 3)

In [None]:
set(1)

In [None]:
# {} does not create an empty set: it creates an empty dict
s  = {}
type(s)

A set can contain the elements of the same type or mixed types

In [None]:
s = {1, 2, 3, 'one', 'two', 'three'}
s

### Hashable
- Each element in a set must be hashable
- **An object is hashable if it has a hash value that never changes during its lifetime**
- Most immutable built-in objects (such as numbers and strings) are hashable, but not all hashable objects are immutable. A tuple is hashable only if all of its elements are hashable
- Objects created from user-defined classes are hashable by default
- Lists, sets and dictionaries are not hashable
- You cannot add a list, a set or a dictionary to a set as an element
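These rules can be checked with a small sketch. The `Point` class is a hypothetical example, not something defined elsewhere in this notebook, and `frozenset` is the built-in immutable counterpart of `set`:

```python
class Point:
    """A hypothetical user-defined class, used only for illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
hash_before = hash(p)
p.x = 100                          # mutating the object ...
hash_after = hash(p)               # ... does not change its default hash
print(hash_before == hash_after)   # True

# A list cannot be a set element, but a tuple or frozenset can
try:
    bad = {1, 2, [3, 4]}
except TypeError as err:
    print('TypeError:', err)

ok = {1, 2, (3, 4), frozenset({5, 6}), p}
print(len(ok))                     # 5
```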

In [None]:
hash(6)

In [None]:
hash('hello')

In [None]:
hash([1, 2, 3])

In [None]:
hash({1, 2, 3})

In [None]:
hash((1, 2, (2, 3)))

In [None]:
hash((1, 2, [2, 3]))

In [None]:
s = {1, 2, 3, [11, 22, 33]}

In [None]:
s = {1, 2, 3, {1, 2, 3}}

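### `+` and `*` Operators
Unlike lists and tuples, sets do not support the `+` and `*` operators. Both expressions below raise a `TypeError`
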
In [None]:
s = {1, 2, 3}

In [None]:
s * 3

In [None]:
s + s

#### Python Set Operations
- `s.add(x)`: add element `x` to the set `s`
- `s.clear()`: reset the set `s` to an empty state, discarding all of its elements
- `s.remove(x)`: remove element `x` from the set `s`. A `KeyError` occurs if `x` is not in the set
- `s.discard(x)`: remove element `x` from the set `s`. Does not cause an exception if `x` is not in the set
- `s.pop()`: remove an arbitrary set element and return it. A `KeyError` occurs if the set is empty when you call `pop()`
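The difference between `remove()` and `discard()` only appears when the element is missing. A short sketch with illustrative values:

```python
s = {10, 20, 30}

s.discard(99)            # 99 is absent: nothing happens
try:
    s.remove(99)         # 99 is absent: remove() raises KeyError
except KeyError as err:
    print('KeyError:', err)

item = s.pop()           # removes and returns some element; which one is not specified
print(item in s)         # False: the popped element is gone
size_after_pop = len(s)
print(size_after_pop)    # 2

s.clear()
print(s)                 # set()
```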

In [None]:
# Manipulating and Accessing Sets

s3.add(6)

len(s3)

min(s3)

max(s3)

sum(s3)

3 in s3

s3.remove(5)

s3.discard(5)

In [None]:
s3 = set((1, 3, 5))
s3.add(6)
s3

In [None]:
len(s3)

In [None]:
3 in s3

In [None]:
s3.pop()

Sets are equal if and only if their contents are equal

In [None]:
{1, 2, 3} == {3, 2, 1}

In [None]:
{1, 1, 1, 2, 3, 4, 5, 6, 6, 6}

In [None]:
# find unique values
set([1, 1, 1, 2, 3, 4, 5, 6, 6, 6])

In [None]:
# count the unique characters in a string
s1 = 'aabcdeffg'
len(set(s1))

#### Set Logical Operations
Python provides methods and operators for performing set union, intersection, difference and symmetric difference operations

- `union`: the union of two sets is a set that contains all the elements from both sets. You can use the `union` method or the `|` operator to perform this operation
- `intersection`: the intersection of two sets is a set that contains the elements that appear in both sets. You can use the `intersection` method or the `&` operator to perform this operation
- `difference`: the difference of two sets is a set that contains the elements in set1 but not in set2. You can use the `difference` method or the `-` operator to perform this operation
- `symmetric difference`: the symmetric difference (or exclusive or) of two sets is a set that contains the elements in either set, but not in both sets. You can use the `symmetric_difference` method or the `^` operator to perform this operation
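Each operator gives the same result as its method. One difference worth knowing, shown in this sketch, is that the methods accept any iterable while the operators require both operands to be sets:

```python
s1 = {1, 2, 4}
s2 = {1, 3, 5}

# Each operator has an equivalent method
print(s1.union(s2) == s1 | s2)                  # True
print(s1.intersection(s2) == s1 & s2)           # True
print(s1.difference(s2) == s1 - s2)             # True
print(s1.symmetric_difference(s2) == s1 ^ s2)   # True

# The methods accept any iterable, such as a list
from_list = s1.union([7, 8])
print(from_list == {1, 2, 4, 7, 8})             # True

# The operators do not
try:
    s1 | [7, 8]
    operator_accepts_list = True
except TypeError as err:
    operator_accepts_list = False
    print('TypeError:', err)
```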

In [None]:
s1 = {1, 2, 4}
s2 = {1, 3, 5}
s1 | s2

In [None]:
s1 = {1, 2, 4}
s2 = {1, 3, 5}
s1 & s2

In [None]:
s1 = {1, 2, 4}
s2 = {1, 3, 5}
s1 - s2

In [None]:
s1 = {1, 2, 4}
s2 = {1, 3, 5}
s1 ^ s2

#### Set Comprehension
A set comprehension looks like the equivalent list comprehension except with curly braces instead of square brackets

**Syntax**

`newset = {expression for item in iterable if condition == True}`

For example, if we wanted a set containing just the lengths of the strings contained in a list, we could easily compute this using a set comprehension

In [None]:
s = ['a', 'as', 'bat', 'car', 'dove', 'python']
unique_lengths = {len(x) for x in s}
unique_lengths

### Dictionary
A dictionary is a collection that stores values along with their keys. A key works like an index: it enables fast retrieval, deletion and updating of the associated value. **A dictionary is a collection which is ordered** (as of Python 3.7, dictionaries are guaranteed to preserve insertion order; in Python 3.5 and earlier they are unordered, and in CPython 3.6 the ordering is only an implementation detail), **changeable and does not allow duplicate keys**. Dictionaries cannot have two items with the same key

**Dictionary order is guaranteed to be insertion order. Dictionaries preserve insertion order**. Note that updating a key does not affect the order. Keys added after deletion are inserted at the end

Although access to items in a dictionary does not depend on order, Python does guarantee that the order of items in a dictionary is preserved. When displayed, items will appear in the order they were defined, and iteration through the keys will occur in that order as well. Items added to a dictionary are added at the end. If items are deleted, the order of the remaining items is retained
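The ordering rules described above can be observed directly. A minimal sketch, assuming Python 3.7 or later:

```python
d = {'a': 1, 'b': 2, 'c': 3}

d['a'] = 100          # updating an existing key keeps its position
print(list(d))        # ['a', 'b', 'c']

del d['b']            # deleting keeps the order of the remaining items
print(list(d))        # ['a', 'c']

d['b'] = 2            # a key added after deletion goes to the end
print(list(d))        # ['a', 'c', 'b']
```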

You can create a dictionary by enclosing the items inside a pair of curly braces (`{ }`). Each item consists of a key, followed by a colon, followed by a value. The items are separated by commas. A dictionary cannot contain duplicate keys. The key must be of a hashable type such as `int`, `float`, `str` or `tuple` (a tuple is hashable only if all of its elements are). The value can be of any type. The Python class for dictionaries is `dict`

- To add an item to a dictionary, use the syntax `dictionaryName[key] = value`. If the key is already in the dictionary, this statement replaces the value for the key
- To retrieve a value, write an expression using `dictionaryName[key]`. A `KeyError` occurs if the key is not in the dictionary
- To delete an item from a dictionary, use the syntax `del dictionaryName[key]`
- To delete the whole dictionary, use the syntax `del dictionaryName`. This removes the name itself; use `dictionaryName.clear()` to empty the dictionary but keep the name
- To find the number of items in a dictionary, use `len(dictionaryName)`
- To check if a dictionary contains a key, use `key in dictionaryName`
- To retrieve all the keys contained in a dictionary, use `dictionaryName.keys()`
- To retrieve all the values contained in a dictionary, use `dictionaryName.values()`
- To retrieve all the items (key-value pairs) contained in a dictionary, use `dictionaryName.items()`

In [27]:
### Creating Dictionaries

students = {} # Create an empty dictionary
students = {"111-11-1111":'John', "222-22-2222":'Frank'} # Create a dictionary

The built-in function `len` returns the number of key-value pairs in a dictionary

In [None]:
len(students)

It is common to end up with two sequences that you want to pair up element-wise in a dict. Since a dict is essentially a collection of 2-tuples, the `dict()` function accepts a list of 2-tuples
```python
dict(iterable) # new dictionary initialized as if via:
    d = {}
    for k, v in iterable:
        d[k] = v
```

The `dict()` constructor builds dictionaries directly from sequences of key-value pairs

In [None]:
# create dict from list of tuples
dict([('john', 111), ('jason', 222), ('jack', 333)])
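When the keys and values start out as two separate sequences, the built-in `zip()` function (covered later in this notebook) can produce the 2-tuples that `dict()` expects. A short sketch with illustrative data:

```python
keys = ['john', 'jason', 'jack']     # illustrative data
values = [111, 222, 333]

# zip() yields ('john', 111), ('jason', 222), ('jack', 333)
mapping = dict(zip(keys, values))
print(mapping)   # {'john': 111, 'jason': 222, 'jack': 333}
```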

When the keys are simple strings, it is sometimes easier to specify pairs using keyword arguments

In [None]:
# no '' needed for the string
dict(john = 111, jason = 222, jack = 333)

In [None]:
students['333-33-3333'] = "Grace" # Add a new item
students

In [None]:
students['111-11-1111'] = 'John Smith' # modify an item
students

In [None]:
del students['222-22-2222'] # Delete an item
students

In [20]:
dict1 = {'k1':{'k2':{'k3':[1, 2, 'get me']}}, 'k5': 3}

In [21]:
dict1['k1']['k2']['k3'][-1]

'get me'

### Looping Items
You can use a `for` loop to traverse all the keys in a dictionary

In [None]:
for key in students:
    print(key + ":" + students[key])

### Testing Whether a Key is in a Dictionary
You can use the `in` and `not in` operators to determine whether a key is in the dictionary

In [None]:
'333-33-3333' in students

### Dictionary Comprehension
**Syntax**

`newdict = {key-expr: value-expr for item in iterable if condition == True}`

In [None]:
# for loop approach
dict2 = {}
dict1 = {'k1': 1, 'k2': 2, 'k3': 3}
for k, v in dict1.items():
    dict2[k] = 3 * v
dict2

In [None]:
# for loop approach
dict2 = {}
dict1 = {'k1': 1, 'k2': 2, 'k3': 3}
for k in dict1:
    dict2[k] = 3 * dict1[k]
dict2

In [None]:
# dictionary comprehension
dict1 = {'k1': 1, 'k2': 2, 'k3': 3}
dict2 = {k: 3 * v for k, v in dict1.items()}
dict2

In [None]:
# dictionary comprehension
dict1 = {'k1': 1, 'k2': 2, 'k3': 3}
dict2 = {k: 3 * dict1[k] for k in dict1}
dict2

In [None]:
# create a lookup map of strings to their locations on the list
s = ['a', 'as', 'bat', 'car', 'dove', 'python']
loc_mapping = {val : index for index, val in enumerate(s)}
loc_mapping

### Equality Test
You can use the `==` and `!=` operators to test whether two dictionaries contain the same items (regardless of the order of the items in a dictionary). You cannot use the ordering operators (`>`, `>=`, `<` and `<=`) to compare dictionaries: dictionaries do not define an ordering, so these comparisons raise a `TypeError`

In [None]:
d1 = {'red': 1, 'green': 2}
d2 = {'green': 2, 'red': 1}
d1 == d2
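Both halves of the rule can be checked in a few lines. A minimal sketch with illustrative dictionaries:

```python
d1 = {'red': 1, 'green': 2}
d2 = {'green': 2, 'red': 1}
d3 = {'red': 1, 'green': 3}

print(d1 == d2)    # True: same key-value pairs, different insertion order
print(d1 != d3)    # True: the value for 'green' differs

# Ordering comparisons are not defined for dictionaries
try:
    d1 < d2
    ordering_supported = True
except TypeError as err:
    ordering_supported = False
    print('TypeError:', err)
```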

### Dictionary Methods
<div>
<img src="attachment:f1.png" width="500"/>
</div>

In [28]:
students

{'111-11-1111': 'John', '222-22-2222': 'Frank'}

In [29]:
students.keys()

dict_keys(['111-11-1111', '222-22-2222'])

In [None]:
students.values()

In [30]:
students.items()

dict_items([('111-11-1111', 'John'), ('222-22-2222', 'Frank')])

#### `update()` Method
- You can merge one dict into another using the `update()` method
- The `update()` method changes dicts in-place so any existing keys in the data passed to `update()` will have their old values replaced
- If the input argument `<obj>` of `update()` is a dictionary, `update()` merges the entries from `<obj>` into the dictionary. For each key in `<obj>`:
  - If the key is not present in the dictionary, the key-value pair from `<obj>` is added to the dictionary
  - If the key is already present in the dictionary, the corresponding value in the dictionary for that key is updated to the value from `<obj>`

In [None]:
d1 = {'a': 'hello', 'b': [1, 2, 3], 'c': (3, 6)}
d1.update({'b': 'foo', 'd': 'bar', 'e': 'baz'})
d1

The `update()` method also accepts keyword arguments and converts them into key-value pairs to insert

In [None]:
d1.update(f='hello')
d1

#### `get()` Method
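The dict method `get` returns the value for a key if the key is present. If the key is not present, `get` returns the default value passed as its second argument, or `None` if no default is given. Unlike `d1[key]`, `get` never raises a `KeyError`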

In [None]:
print(d1.get('f'))

In [None]:
d1['f']

In [None]:
# for the get method, you can specify an optional value to return if the specified key does not exist. Default value None
print(d1.get('z', -1))

In [None]:
d1['z']
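A common use of the default value is counting, where a key may not exist yet. A minimal sketch with made-up data:

```python
words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']  # illustrative data

counts = {}
for word in words:
    # get() returns 0 the first time a word is seen, so no KeyError is raised
    counts[word] = counts.get(word, 0) + 1

print(counts)   # {'apple': 3, 'banana': 2, 'cherry': 1}
```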

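#### `pop()` Method
- `pop(key)` removes the key from the dictionary and returns its value. A `KeyError` occurs if the key is not present
- `pop(key, default)` returns `default` instead of raising an exception if the key is not present
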
In [None]:
d1.pop('z')

In [None]:
d1.pop('z', -1)

In [None]:
d1.pop('f')

In [None]:
d1

#### `popitem()` Method
- `popitem()` removes the last key-value pair added to the dictionary and returns it as a tuple. A `KeyError` occurs if the dictionary is empty
- In Python versions before 3.7, `popitem()` would return an arbitrary key-value pair, since dictionaries were not guaranteed to be ordered before version 3.7

In [None]:
d1.popitem()
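The last-in, first-out behaviour is easiest to see on a small dictionary. A sketch assuming Python 3.7 or later:

```python
d = {'a': 1, 'b': 2}
d['c'] = 3                  # 'c' is now the most recently inserted key

last = d.popitem()
second = d.popitem()
print(last)                 # ('c', 3)
print(second)               # ('b', 2)
print(d)                    # {'a': 1}

d.popitem()                 # removes the last remaining pair
try:
    d.popitem()             # the dictionary is now empty
except KeyError as err:
    print('KeyError:', err)
```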

### Summary

<div>
<img src="attachment:f1.png" width="600"/>
</div>

### Lambda Function
A lambda function is a small anonymous function. A lambda function can take any number of arguments, but **can only have a single expression**. Semantically, they are just syntactic sugar for a normal function definition

**Syntax** <br>

> ```python
> lambda arguments : expression
> ```

The expression is evaluated and its result is returned

In [None]:
def f(a, b):
    return a * b

In [None]:
lambda a, b : a * b

In [None]:
x = lambda a, b : a * b
print(x(5, 6))

In [None]:
def myfunc(n):
    return lambda a : a * n

mydoubler = myfunc(2)
mytripler = myfunc(3)

print(mydoubler(11))
print(mytripler(11))

In [None]:
# sort a collection of strings by the number of distinct letters in each string
s = ['foo', 'card', 'bar', 'aaaa', 'abab']
s.sort(key=lambda x: len(set(x)))
s

In [None]:
pairs = [(1, 'one'), (4, 'four'), (2, 'two'), (3, 'three')]
pairs.sort(key=lambda x: x[0])
pairs

Strings are compared by their characters’ underlying numerical values (*lexicographical order*), and lowercase letters have higher numerical values than uppercase letters. Assume that we’d like to determine the minimum and maximum strings using *alphabetical order*

In [None]:
colors = ['Red', 'black', 'Blue']
print(max(colors), min(colors))

In [None]:
max(colors, key=lambda s: s.lower())

In [None]:
min(colors, key=lambda s: s.lower())
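The default ordering follows from the code points of the characters, which the built-in `ord()` function exposes. A short sketch:

```python
# Every uppercase ASCII letter has a smaller code point than every lowercase one
print(ord('B'), ord('R'), ord('b'))   # 66 82 98

colors = ['Red', 'black', 'Blue']
default_order = sorted(colors)
alphabetical = sorted(colors, key=str.lower)
print(default_order)    # ['Blue', 'Red', 'black']
print(alphabetical)     # ['black', 'Blue', 'Red']
```

Passing `str.lower` directly works the same way as `lambda s: s.lower()`, because `key` only needs a callable that takes one argument.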

### `map()` Function
- The `map()` function executes a specified function for each item in an iterable
- Function `map()`’s first argument is a function that receives one value and returns a new value
- The second argument is an iterable of values to map. Function `map()` uses lazy evaluation - the function returns an iterator, so map’s results are not produced until you iterate through them

**Syntax**
> ```python
> map(function, iterables)
> ```

- `function`: Required. The function to execute for each item
- `iterables`: Required. One or more sequences, collections or iterator objects. You can pass as many iterables as you like, as long as the function has one parameter for each iterable
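The lazy behaviour can be made visible by pulling results one at a time. A minimal sketch with illustrative data:

```python
prices = [10, 20, 30]        # illustrative data
quantities = [1, 2, 3]

# With two iterables, the function must accept two arguments
totals = map(lambda p, q: p * q, prices, quantities)

first = next(totals)         # only the first result is computed here
rest = list(totals)          # list() consumes whatever is left
again = list(totals)         # the iterator is now exhausted

print(first)   # 10
print(rest)    # [40, 90]
print(again)   # []
```

Because a map object can be consumed only once, store the result in a list if it is needed more than once.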

### `filter()` Function
- The `filter()` function returns an iterator where the items are filtered through a function to test if the item is accepted or not
- `filter()`’s first argument must be a function that receives one argument and returns `True` if the value should be included in the result
- `filter()` returns an iterator, so filter’s results are not produced until you iterate through them - lazy evaluation

**Syntax**
> ```python
> filter(function, iterable)
> ```

- `function`: A function to be run for each item in the iterable
- `iterable`: The iterable to be filtered
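To see that the test function runs only when items are requested, the sketch below records every value it is asked about. The `calls` list and `is_even` function are hypothetical helpers for illustration:

```python
calls = []

def is_even(n):
    calls.append(n)          # record each value that gets tested
    return n % 2 == 0

evens = filter(is_even, range(6))
calls_before = list(calls)
print(calls_before)          # []: nothing has been tested yet

first = next(evens)
print(first)                 # 0
print(calls)                 # [0]

rest = list(evens)
print(rest)                  # [2, 4]
print(calls)                 # [0, 1, 2, 3, 4, 5]
```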

### `zip()` Function
- Built-in function `zip()` enables you to iterate over multiple iterables of data at the same time
- It pairs up the elements of a number of lists, tuples or other iterables
- The `zip()` function returns a zip object, which is an iterator that produces tuples containing the elements at the same index in each iterable. Pass it to `list()` to get a list of tuples
- If the passed iterables have different lengths, the iterable with the fewest items decides the length of the new iterator

**Syntax**
> ```python
> zip(iterator1, iterator2, iterator3 ...)
> ```

`iterator1, iterator2, iterator3 ...`: The iterables that will be joined together. The `zip()` function takes zero or more iterables and returns an iterator of **tuples**, where the i-th tuple contains the i-th element from each iterable
    
The `*` operator can be used in conjunction with `zip()` to unzip the list of tuples. It can unpack multiple tuples contained in a list

`zip(*zippedList)`
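Zipping and unzipping work the same way with more than two iterables. A minimal sketch with illustrative data:

```python
ids = [1, 2, 3]
names = ['John', 'Charles', 'Mike', 'Vicky']   # one extra item
active = (True, False, True)

rows = list(zip(ids, names, active))   # stops after the shortest input (3 items)
print(rows)   # [(1, 'John', True), (2, 'Charles', False), (3, 'Mike', True)]

# Unzipping: each row becomes one argument to zip(), which regroups by position
id_col, name_col, active_col = zip(*rows)
print(name_col)   # ('John', 'Charles', 'Mike')
```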

In [23]:
#def myfunc(n):
  #return len(n)

x = map(len, ('apple', 'banana', 'cherry'))
x

<map at 0x106303460>

In [24]:
list(x)

[5, 6, 6]

In [None]:
def myfunc(a, b):
    return a + b

# x = map(myfunc, ('apple', 'banana', 'cherry'), (' orange', ' lemon', ' pineapple'))
x = map(lambda a, b: a + b, ('apple', 'banana', 'cherry'), (' orange', ' lemon', ' pineapple'))

print(x)

#convert the map into a list, for readability
print(list(x))

In [22]:
ages = [5, 12, 17, 18, 24, 32]

def myFunc(x):
    return x >= 18  # True for adults, False otherwise

adults = filter(myFunc, ages)

print(adults)

for x in adults:
    print(x)

<filter object at 0x106302cb0>
18
24
32


In [None]:
a = ("John", "Charles", "Mike")
b = ("Jenny", "Christy", "Monica", "Vicky")

x = zip(a, b)

print(x)

#use the list() function to display a readable version of the result

print(list(x))

In [None]:
def times3(var):
    return var * 3

In [None]:
s = [1, 2, 3, 4, 5, 6]

In [None]:
map(times3, s)

In [None]:
list(map(times3, s))

In [None]:
s = ['a', 'as', 'bat', 'car', 'dove', 'python']
set(map(len, s))

In [None]:
s = [1, 2, 3, 4, 5, 6]
list(map(lambda var: var * 3, s))

In [None]:
filter(lambda item: item % 2 == 0, s)

In [None]:
list(filter(lambda item: item % 2 == 0, s))

A very common use of `zip()` is simultaneously iterating over multiple sequences, possibly also combined with `enumerate()`

In [None]:
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']
for i, (a, b) in enumerate(zip(seq1, seq2)):
    print('{0}: {1}, {2}'.format(i, a, b))

Given a zipped sequence, `zip()` can be applied in a clever way to unzip the sequence. Another way to think about this is converting a list of rows into a list of columns

In [None]:
coordinate = ['x', 'y', 'z']
value = [1, 2, 3]

result = zip(coordinate, value)
result_list = list(result)
print(result_list)

In [None]:
c, v =  zip(*result_list)
print('c =', c)
print('v =', v)

### Functions as Objects
Since Python functions are objects, many constructs can be easily expressed that are difficult to do in other languages. Suppose we are doing some data cleaning and need to apply a series of transformations to the following list of strings

In [None]:
states = ['   Alabama ', 'Georgia!', 'Georgia', 'georgia', 'FlOrIda', 'south   carolina##', 'West virginia?']

In [None]:
# approach 1
import re

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub('[!#?]', '', value)
        value = value.title()
        result.append(value)
    return result

clean_strings(states)

In [None]:
# approach 2
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

clean_ops = [str.strip, remove_punctuation, str.title]

def clean_strings(strings, ops):
    result = []
    for value in strings:
        for function in ops:
            value = function(value)
        result.append(value)
    return result

clean_strings(states, clean_ops)

In [None]:
# approach 3: pass a function as an argument to map()
for x in map(remove_punctuation, states):
    print(x)
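The two ideas can be combined: keep the operations in a list and let `map()` drive the loop. This is only a sketch; `apply_all` is a hypothetical helper, and the state list is shortened for illustration:

```python
import re

def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

# Functions are ordinary objects: they can be stored in a list and passed around
clean_ops = [str.strip, remove_punctuation, str.title]

def apply_all(value, ops):
    """Apply each function in ops to value, in order (hypothetical helper)."""
    for function in ops:
        value = function(value)
    return value

states = ['   Alabama ', 'Georgia!', 'FlOrIda', 'south   carolina##']
cleaned = list(map(lambda s: apply_all(s, clean_ops), states))
print(cleaned)   # ['Alabama', 'Georgia', 'Florida', 'South   Carolina']
```

The inner spaces in `'South   Carolina'` survive because none of the three operations touches whitespace inside a string.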

### Generators
A generator is a convenient way, similar to writing a normal function, to construct a new iterable object. Whereas normal functions execute and return a single result at a time, generators return a sequence of multiple results lazily, pausing after each one until the next one is requested. To create a generator, use the `yield` keyword instead of `return` in a function

In [None]:
def squares(n = 10):
    print('Generating squares from 1 to {0}'.format(n ** 2))
    for i in range(1, n + 1):
        yield i ** 2 

When you actually call the generator, no code is immediately executed

In [None]:
gen = squares()
gen

In [None]:
next(gen)

In [None]:
next(gen)

In [None]:
next(gen)

In [None]:
next(gen)

The `list()` built-in function is frequently used in data processing as a way to materialize an iterator or generator expression

In [None]:
list(gen)

In [None]:
next(gen)

It is not until you request elements from the generator that it begins executing its code

In [None]:
gen = squares()  # the generator above is exhausted, so create a new one
for x in gen:
    print(x, end = ' ')
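Both the delayed start and the exhaustion of a generator can be observed by recording when the body runs. In this sketch, `squares_logged` and `log` are hypothetical names used only for illustration:

```python
log = []

def squares_logged(n=3):
    log.append('started')          # runs only when the first value is requested
    for i in range(1, n + 1):
        yield i ** 2

gen2 = squares_logged()
log_after_call = list(log)
print(log_after_call)       # []: calling the function has not run its body yet

first = next(gen2)
print(first)                # 1
print(log)                  # ['started']

rest = list(gen2)
print(rest)                 # [4, 9]
print(list(gen2))           # []: an exhausted generator yields nothing more

# next() with a default value avoids StopIteration on an exhausted generator
fallback = next(gen2, 'done')
print(fallback)             # done
```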

### Generator Expressions
Another even more concise way to make a generator is by using a generator expression. This is a generator analogue to list, set and dict comprehensions. To create a generator expression, enclose what would otherwise be a list comprehension within parentheses instead of brackets

In [None]:
gen = (x ** 2 for x in range(100))
gen

- Generator expressions can be used instead of list comprehensions as function arguments
- Note that the function call's parentheses also act as the generator expression's parentheses
- These expressions are designed for situations where the generator is used right away by an enclosing function
- Generator expressions are more compact but less versatile than full generator definitions and tend to be more memory friendly than equivalent list comprehensions
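The memory difference can be estimated with `sys.getsizeof`. This is a rough sketch: `getsizeof` measures only the container object, not the integers it refers to, and the exact numbers depend on the Python build:

```python
import sys

n = 100_000
as_list = [x ** 2 for x in range(n)]   # builds all n values up front
as_gen = (x ** 2 for x in range(n))    # produces values one at a time, on demand

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(list_size > gen_size)            # True

# Both give the same total when consumed
same_total = sum(as_list) == sum(as_gen)
print(same_total)                      # True
```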

In [None]:
sum(x ** 2 for x in range(100))

In [None]:
dict((i, i ** 2) for i in range(5))