# Intro to Python3 - Class02

### Class agenda:
##### 4. Collections: `list`, `tuple`, `set`, and `dictionary`
##### 5. Iteration: loops and comprehensions
##### 6. Writing functions

##### Before we start the new class, a few tips/resources:
- Jupyter Notebook [quickstart](https://jupyter.readthedocs.io/en/latest/content-quickstart.html): learn how to use the basic functions and shortcuts.
- Windows vs. Unix commands [cheat sheet](https://www.lemoda.net/windows/windows2unix/windows2unix.html): if you are not fluent in both (like me), feel free to use this guide to understand what I'm trying to do and to pick the right command for your OS.
- Git [interactive tutorial](https://learngitbranching.js.org/?locale=en_US): very easy to follow, and you can try all the `git` commands in a web browser.
- Try to get comfortable with [using command line/terminal](https://www.codecademy.com/articles/command-line-commands); this resource gives you a general idea of the basics.
- More on [virtual environment](https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html#managing-environments) if you are not sure why/how we set up `py3basics` after downloading Anaconda.
- Use [issues](https://github.com/emma-oc/ds-class-intro/issues) in class repo for Q&A.

[How to exit jupyter notebook in terminal?](https://stackoverflow.com/questions/10162707/how-to-close-ipython-notebook-properly)

- hit `control+C` twice
- use `Close and Halt` under the `File` menu in Jupyter Notebook
- open another terminal and run the following:
    ```
    jupyter notebook list # list all the port number for running apps
    jupyter notebook stop [port number] 
    # port number is not required if you're only running one app under default port 8888
    ```

#### Exercise 0.
Follow the instructions in [git tutorial](https://github.com/emma-oc/ds-class-intro/blob/master/class01/git_setup.md#task-for-the-class) to create your commit and push changes to your forked repo.

### 4. Collections: `list`, `tuple`, `set`, and `dictionary`

Python has numerous built-in collection types to help programmers manage their data. They behave differently and serve different purposes.

#### 4.1 List
Lists are the basic data container for accessing data that is stored by position. They are created using brackets `[ ]` or the `list()` function.

Note: `list` belongs to the category of `sequence` data types, meaning the order of items in the list matters.

A list can contain mixed data types.

In [1]:
one_list = list("Data Science")
num_list = [101, 3, 7, 9, 10]
empty_list = list()
empty_list = []
mixed_list = [1,2,'3',False,'five']

In [2]:
print(one_list)
print(num_list)
print(empty_list)
print(mixed_list)
print(type(mixed_list[2]), type(mixed_list[3]))

['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']
[101, 3, 7, 9, 10]
[]
[1, 2, '3', False, 'five']
<class 'str'> <class 'bool'>


##### Slicing and indexing

As we already mentioned for `str`, a `list` can also be indexed, meaning you can refer to an item in the list by its position/order. Like all other indices in Python, list indices start at `0`.

In [3]:
print(one_list[0])
print(one_list[-1])

D
e


We've already seen some slicing with strings; slicing works the same way for lists.

In [4]:
print(num_list[2:-1])
print(one_list[::-1])

[7, 9]
['e', 'c', 'n', 'e', 'i', 'c', 'S', ' ', 'a', 't', 'a', 'D']


Lists can be nested, meaning the elements of a list can be lists too. You can chain indices to access items in a multi-dimensional list.

In [5]:
nested_list = [ list(range(5)), ['a', 'b', ['c'*8]] ]
print(nested_list)

[[0, 1, 2, 3, 4], ['a', 'b', ['cccccccc']]]


In [6]:
# what does the following mean?
print(nested_list[1][2][0][-1])

c


In [7]:
# how to index the number 4 in the list above? how about 'cccccccc'?
# solution
print(nested_list[0][-1])
print(nested_list[1][2][0])

4
cccccccc


##### List methods

Many list methods modify the object in place and return `None` rather than a new list. This is different from a function that `return`s a value you can assign to a variable name.

This also means two things: 1) lists are [mutable](https://docs.python.org/3/glossary.html#term-mutable), and 2) once you apply such a method, the original state is gone unless you kept a copy. We will look at some examples below.

In [8]:
print(one_list)

['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']


In [9]:
one_list.reverse()
print(one_list)

['e', 'c', 'n', 'e', 'i', 'c', 'S', ' ', 'a', 't', 'a', 'D']


In [10]:
# can you do this? is it behaving similarly as you expected?
one_list = one_list.reverse()
print(one_list)

None
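
The `None` appears because `list.reverse()` changes the list in place and returns nothing. When a new object is wanted and the original should stay untouched, the built-ins `sorted()` and `reversed()` are one option. A minimal sketch (the variable names are made up for illustration):

```python
letters = list("Data")

# sorted() builds a new list; the original is left untouched
sorted_letters = sorted(letters)

# reversed() returns an iterator, so wrap it in list() to see the items
reversed_letters = list(reversed(letters))

# in-place methods such as sort() return None
result = letters.sort()

print(sorted_letters)    # ['D', 'a', 'a', 't']
print(reversed_letters)  # ['a', 't', 'a', 'D']
print(result)            # None
```

Uppercase letters sort before lowercase ones, which is why `'D'` comes first.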


In [11]:
one_list = ['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']

one_list.sort()
print(one_list)

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', 't']


In [12]:
one_list.index('a')

3

In [13]:
one_list + num_list

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', 't', 101, 3, 7, 9, 10]

In [14]:
print(one_list * 2)

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', 't', ' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', 't']


In [15]:
'i' in one_list

True

In [16]:
one_list.append('Data Science')
print(one_list)

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', 't', 'Data Science']


In [17]:
one_list.remove('Data Science')
print(one_list)

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', 't']


The `del` statement removes an item by index. It can also remove a slice (`del one_list[1:3]`) or clear the entire list (`del one_list[:]`).

In [18]:
del one_list[-1]
print(one_list)

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n']


In [19]:
# index
popped = one_list.pop(2)
print(popped)
print(one_list)

S
[' ', 'D', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n']


In [20]:
# we can put it back
one_list.insert(2, popped)
print(one_list)

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n']


Lists can contain anything: your dataframe or your model object could be stored in a list, and it is definitely one of the most widely used collections. [Read more on lists](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) on your own.

#### 4.2 Tuples

A tuple is a collection of Python objects similar to a list, except that tuples are immutable. Tuples are usually written with parentheses `()`, although it is the comma that actually makes the tuple. Tuples can be a little bit faster to use than lists, since they never need to support updates or deletions. If you know the data is not supposed to change, using a tuple also protects you against changing it accidentally.

In [21]:
one_tuple = ('Data Science') # use () instead of [] to create a tuple
print(one_tuple)
one_tuple = (list('Data Science'))
print(one_tuple)
# why the difference?

mixed_tuple = (1, 2, 3, 5, "a", [7,8,9])
print(mixed_tuple[-1][0])
# there are weird things though...
mixed_tuple[-1][0] = 10
print(mixed_tuple)
mixed_tuple[-2] = 6

Data Science
['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']
7
(1, 2, 3, 5, 'a', [10, 8, 9])


TypeError: 'tuple' object does not support item assignment

Indexing and slicing work for tuples as they do for lists. However, because tuples are immutable, list methods that modify the object in place (such as `append()` or `sort()`) are not available. You can read more about [tuple methods](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences).
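
One detail that is easy to miss in the cell above: parentheses alone do not create a tuple. The sketch below uses made-up variable names to show which spellings actually produce one.

```python
not_a_tuple = ('Data Science')       # parentheses alone: still a str
one_item_tuple = ('Data Science',)   # the trailing comma makes it a tuple
no_parens = 1, 2, 3                  # parentheses are optional
from_list = tuple(list('Data'))      # tuple() converts any iterable

for value in (not_a_tuple, one_item_tuple, no_parens, from_list):
    print(type(value).__name__, value)
```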

#### 4.3 Sets

Sets are also a kind of collection in Python. A set differs from lists and tuples mainly because it has no order or position, so unlike the previous two it is not a `sequence`. A set also does not allow multiple copies of the same item: it contains either `0` or `1` copy of any given item. Sets are created with `set()` or with braces around items, e.g. `{1, 2, 3}`. Note that empty braces `{}` create an empty dictionary, not an empty set.

A set can have items of multiple data types. A set cannot be indexed because it is not ordered. A set is mutable, but the elements it contains must be hashable, which in practice means immutable (e.g. you can't have a list as an item in a set).
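
A short sketch of these rules, using made-up variable names: how to create an empty set, how duplicates collapse, and what happens when a list is used as a set item.

```python
empty_braces = {}           # empty braces give a dict, not a set
empty_set = set()           # the way to write an empty set
letters = {'a', 'b', 'a'}   # duplicates collapse to a single item

print(type(empty_braces).__name__, type(empty_set).__name__)
print(len(letters))

# a list is mutable and therefore unhashable, so it cannot be a set item
try:
    bad_set = {[1, 2], 3}
except TypeError as err:
    message = str(err)
    print('TypeError:', message)
```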

In [22]:
one_set = set('Data Science')
print(one_set) # no guaranteed order
print(list('Data Science'))

print(set(['abc', 1.3, True]))

{'n', 'S', 't', ' ', 'e', 'D', 'a', 'i', 'c'}
['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']
{1.3, True, 'abc'}


In [23]:
one_set.add('!')
print(one_set)

{'n', 'S', 't', ' ', 'e', 'D', '!', 'a', 'i', 'c'}


In [24]:
one_set[0]

TypeError: 'set' object is not subscriptable

In [25]:
'!' in one_set

True

Set objects also support mathematical operations like union, intersection, difference, and symmetric difference. These are defined similarly as in math.

In [26]:
a = set('abracadabra')
b = set('alacazam')
print(a - b)                              # letters in a but not in b
print(a | b)                              # letters in a or b or both
print(a & b)                              # letters in both a and b
print(a ^ b)                              # letters in a or b but not both

{'b', 'd', 'r'}
{'z', 'm', 'd', 'l', 'r', 'b', 'a', 'c'}
{'a', 'c'}
{'z', 'm', 'd', 'l', 'r', 'b'}


In [27]:
# what are these?
print(a.difference(b))
print(a.union(b))
print(a.intersection(b))
print(a.symmetric_difference(b))

{'b', 'd', 'r'}
{'z', 'm', 'd', 'l', 'r', 'b', 'a', 'c'}
{'a', 'c'}
{'z', 'm', 'd', 'l', 'r', 'b'}


#### 4.4 Dictionary

Python dictionaries are used whenever you have a collection of objects that you want to access by name rather than by position. A dictionary can be defined with `dict()` or `{key: value}`; the `key: value` pairs are separated by commas `,` inside `{}`. The key is the name you'll index the value by. Dictionaries are referred to as “associative memories” or “associative arrays” in some other languages.

A dictionary is a mutable collection that is indexed by key rather than by position. Since Python 3.7 a dictionary remembers the order in which keys were inserted, but items still cannot be accessed by position.

The name you use to refer to an item is called the `key`, and the item itself is called the `value`. Dictionaries are just a way of organizing a collection of key-value pairs. Since the `key` is used for indexing, all keys are unique: assigning to a `key` that already exists replaces its value rather than adding a second entry.

`key` and `value` can be many data types, even collections. However, you can’t use lists as keys, since lists can be modified in place using index assignments, slice assignments, or methods like `append()` and `extend()`.

Below are 3 ways to generate a dictionary.

In [35]:
# one_dict = dict(data='science', science=0, 0='data')
one_dict = dict(data='science', science=0)
print(one_dict)

{'data': 'science', 'science': 0}


In [36]:
print(one_dict['science'])
print(one_dict['data'])

0
science


In [37]:
print(one_dict.keys())

dict_keys(['data', 'science'])


In [38]:
# you can also do this...
two_dict = {'data':'science', 'science':0, (0,1):one_dict}
print(two_dict)
print(two_dict[0])

{'data': 'science', 'science': 0, (0, 1): {'data': 'science', 'science': 0}}


KeyError: 0

In [39]:
print(two_dict.keys())
print(two_dict.values())
print(two_dict.items())

dict_keys(['data', 'science', (0, 1)])
dict_values(['science', 0, {'data': 'science', 'science': 0}])
dict_items([('data', 'science'), ('science', 0), ((0, 1), {'data': 'science', 'science': 0})])


In [40]:
# you can't index by position
print(two_dict.values()[0])

TypeError: 'dict_values' object is not subscriptable

In [41]:
# what if i have to? (but seriously why...)
list(two_dict.values())[0]

'science'

In [42]:
three_dict = {}
type(three_dict)

dict

In [43]:
# append
three_dict['data'] = 'science'
three_dict['science'] = 0
three_dict[0] = one_dict

In [44]:
print(three_dict)

{'data': 'science', 'science': 0, 0: {'data': 'science', 'science': 0}}


In [45]:
# can you do this?
three_dict[(1,2)] = [1,2]
print(three_dict[(1,2)])

[1, 2]


In [46]:
# can you do this?
three_dict[one_dict] = (1,2)
print(three_dict[[1,2]])

TypeError: unhashable type: 'dict'

Strings, ints, floats, and tuples all worked as keys, but a dictionary (and, likewise, a list) didn't. The error says `unhashable type`; what does that mean?

Essentially, to make dictionaries function properly, Python needs a way to convert an object to a number really fast. This function is called `hash`, and strings, ints, floats, and tuples (as long as everything inside the tuple is hashable too) all have an implementation that Python can use. Unfortunately, mutable collections such as lists and dictionaries do not come with a hash function.
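
To see which objects can serve as keys, you can call the built-in `hash()` directly. The helper `is_hashable` below is a made-up name for this illustration, not a standard function.

```python
def is_hashable(obj):
    """Hypothetical helper: True if the built-in hash() accepts obj."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable('data'))        # str
print(is_hashable(3.14))          # float
print(is_hashable((1, 2)))        # tuple of hashable items
print(is_hashable([1, 2]))        # list
print(is_hashable({'a': 1}))      # dict
print(is_hashable((1, [2, 3])))   # tuple that contains a list
```

The last line shows that a tuple is only usable as a key when everything inside it is hashable as well.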

You can access a value with `dictionary[key]` or `dictionary.get(key)`, but assignment is only possible with the first form. Why?

In [47]:
print(three_dict['data'])
print(three_dict.get('data'))

science
science


In [48]:
three_dict['data']

'science'

In [49]:
three_dict['data'] = 'nonsense'
three_dict.get('data') = 'nonsense'
# get(key) returns a value; you cannot assign a value to a function call

SyntaxError: cannot assign to function call (<ipython-input-49-6455d93a8ede>, line 2)
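
Besides assignment, the two access forms also differ in how they treat a missing key. A small sketch with a made-up dictionary `params`:

```python
params = {'data': 'science'}

found = params.get('data')                 # same as params['data']
missing = params.get('model')              # missing key: returns None
fallback = params.get('model', 'not set')  # or a default of your choice

# square brackets raise an error for a missing key
try:
    params['model']
except KeyError as err:
    error_key = err.args[0]

print(found, missing, fallback, error_key)
```

`get()` is handy when a missing key is an expected situation; square brackets are the better fit when a missing key should stop the program.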

Dictionaries are very special in Python thanks to their flexibility and readability. They are usually used to store parameters and other paired values. Since we powered through this section, I'd recommend reading the [dict() documentation](https://docs.python.org/3/library/stdtypes.html#dict) if you still have lots of questions.

In general, referring to objects by name is better coding practice than referring to them by position, so prefer it whenever possible. It certainly makes your code much more readable.

#### 4.5 Some functions for collections

There are lots of functions that can take collections as input. Most common ones include `len()` and `sum()`. 

`len()` checks the length of a collection

In [51]:
one_dict

{'data': 'science', 'science': 0}

In [52]:
print(len(one_list))

print(len(one_set))

print(len(one_dict))

11
10
2


In [53]:
len('asjdfla')

7

`sum()` calculates the sum of all items in a collection. As you can imagine, this only applies to collections of numbers.

In [54]:
print(sum(num_list))
num_set = {1,2,3,4,5}
sum(num_set)
print(sum(one_list))

130


TypeError: unsupported operand type(s) for +: 'int' and 'str'

#### 4.6 `copy` of collections

Assignment statements in Python do not copy objects, they create bindings between a target and an object. 

For collections that are mutable or contain mutable items, a copy is sometimes needed so that one copy can be changed without changing the other. The `copy` module provides generic shallow and deep copy operations (explained in detail [here](https://docs.python.org/2/library/copy.html)).

In [55]:
print(one_list)
a = one_list
print(a)

[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n']
[' ', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n']


Here `a = one_list` does not copy `one_list`; assignment never copies. Instead, `a` points to the same object that is labeled `one_list`, and both labels are now bound to the same list. Because a list is mutable, a change made through one label is visible through the other. We can check that this is true.

In [56]:
a[0] = 'd'
print(one_list)

['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n']


This confirms our thought. You can also confirm it by comparing the two with `is`.

In [57]:
a is one_list

True

What if you want two objects instead of two labels? You can explicitly make a copy instead of assigning. There are deep and shallow copies.
- shallow copy: two labels, two objects, but they share the same underlying elements. When the collections share a mutable item (list, set, dict, etc.), that item can be changed through both references.
- deep copy: two labels, two completely separate objects. No sharing. In practice, you're unlikely to need a deep copy very often.

In [58]:
# shallow copy
one_list.append([1,2,3])
a = one_list.copy()
print(one_list)
print(a)
print(one_list is a)

['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', [1, 2, 3]]
['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', [1, 2, 3]]
False


In [59]:
a[0] = 'D'
print(a)
print(one_list)

['D', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', [1, 2, 3]]
['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', [1, 2, 3]]


In [60]:
a[-1][0] = 'magic'
print(a)
print(one_list)

['D', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', ['magic', 2, 3]]
['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', ['magic', 2, 3]]


In [61]:
a[-1] = 'magic'
print(a)
print(one_list)

['D', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', 'magic']
['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', ['magic', 2, 3]]


In [62]:
# deep copy
import copy
deep_a = copy.deepcopy(one_list)
print(one_list)
print(deep_a)
print(one_list is deep_a)

['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', ['magic', 2, 3]]
['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', ['magic', 2, 3]]
False


In [63]:
deep_a[-1][0] = 1
print(deep_a)
print(one_list)

['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', [1, 2, 3]]
['d', 'D', 'S', 'a', 'a', 'c', 'c', 'e', 'e', 'i', 'n', ['magic', 2, 3]]


The key concepts of `mutability` and `copy` exist in other data structures too, like `dataframe` and `series`. Keep in mind that plain assignment only gives you another label for the same object; it does not create a new object.
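
For lists there are several equivalent spellings of a shallow copy besides `.copy()`. The sketch below uses a made-up list `original` to compare them with a deep copy.

```python
import copy

original = [1, 2, [3, 4]]

by_slice = original[:]             # shallow copy via a full slice
by_constructor = list(original)    # shallow copy via list()
by_module = copy.copy(original)    # shallow copy via the copy module
independent = copy.deepcopy(original)

# mutate the nested list through the original
original[-1].append(5)

print(by_slice[-1])        # shares the nested list with original
print(by_constructor[-1])
print(by_module[-1])
print(independent[-1])     # deep copy keeps its own nested list
```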

#### Exercise 4. 

1. [Read more on `list`](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) and select 2 list methods not mentioned in class to give examples of using them.

2. How can you check if a `key` is already in a dictionary? Give an example.

In [64]:
'key' in one_dict.keys()

False

3. [Explore how `Counter`](https://docs.python.org/3/library/collections.html#collections.Counter) works as another type of collection. Give 2 examples to use `Counter`.

In [68]:
from collections import Counter
word = 'Date Scientist'
print(Counter(word))
# more operations can be found in https://realpython.com/python-counter/

c = Counter({'red': 4, 'blue': 2})
c

Counter({'t': 3, 'e': 2, 'i': 2, 'D': 1, 'a': 1, ' ': 1, 'S': 1, 'c': 1, 'n': 1, 's': 1})


Counter({'red': 4, 'blue': 2})

4. The Fibonacci Sequence is the series of numbers:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

The next number is found by adding up the two numbers before it.
Find the last element of the `fibolist`

In [69]:
fibolist=[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811]

# code your solution here

fibolist[-1]

317811

5. Calculate the sum of `fibolist`

In [70]:
# code your solution here
sum(fibolist)

832039

6. Calculate and append the next fibonacci number to `fibolist`.

In [71]:
# code your solution here
fibolist.append(fibolist[-1]+fibolist[-2])
fibolist

[0,
 1,
 1,
 2,
 3,
 5,
 8,
 13,
 21,
 34,
 55,
 89,
 144,
 233,
 377,
 610,
 987,
 1597,
 2584,
 4181,
 6765,
 10946,
 17711,
 28657,
 46368,
 75025,
 121393,
 196418,
 317811,
 514229]

7. Create a reversed copy of `fibolist` without permanently reversing `fibolist` itself.

In [72]:
# code your solution here
fibo_copy = fibolist.copy()
fibo_copy.reverse()
print(fibo_copy)
print(fibolist)

[514229, 317811, 196418, 121393, 75025, 46368, 28657, 17711, 10946, 6765, 4181, 2584, 1597, 987, 610, 377, 233, 144, 89, 55, 34, 21, 13, 8, 5, 3, 2, 1, 1, 0]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229]


8. Check if `29473` is a fibonacci number, and assign the answer to variable `fibo_29473`

In [73]:
# code your solution here
fibo_29473 = 29473 in fibolist
fibo_29473

False

9. Create two sets. Calculate the union, intersection, difference and symmetric difference for the two sets.

In [74]:
# code your solution here
a = set('abracadabra')
b = set('alacazam')
print(a - b)                              # letters in a but not in b, - for difference
print(a | b)                              # letters in a or b or both, | for union
print(a & b)                              # letters in both a and b, & for intersection
print(a ^ b)                              # ^ for symmetric

{'b', 'd', 'r'}
{'z', 'm', 'd', 'l', 'r', 'b', 'a', 'c'}
{'a', 'c'}
{'z', 'm', 'd', 'l', 'r', 'b'}


10. Create a dictionary using at least two different methods.

In [77]:
# code your solution here
# refer to class material above for 3 ways
from collections import defaultdict
d1 = defaultdict(int)
d2 = {}

print('d1 type:{0}'.format(type(d1)))
print('d2 type:{0}'.format(type(d2)))

d1 type:<class 'collections.defaultdict'>
d2 type:<class 'dict'>


11. Check if `2` and `9999` are fibonacci numbers. Create a dictionary `fibo_dict` with the numbers as keys, and boolean value=`True` if the number is fibonacci number, and `False` otherwise.

In [78]:
# code your solution here
fibo_dict = {}
num = [2, 9999]
for n in num:
    fibo_dict[n] = n in fibolist

In [79]:
fibo_dict

{2: True, 9999: False}

## 5. Iteration: loops and comprehensions

We have already learned the `if-elif-else` statement for controlling which code runs, and we now know how to store data in collections. Now we can piece them together and make them powerful with iteration. Here we will cover two methods first.

### 5.1 `for` loops
```
for item in iterable(collection):
____execute
```

In [252]:
one_list

['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']

In [255]:
one_list = list('Data Science')
for c in one_list:
    print(c*4)
    
print(c)    

DDDD
aaaa
tttt
aaaa
    
SSSS
cccc
iiii
eeee
nnnn
cccc
eeee
e


As we learned, collections usually contain multiple (sometimes just one) items. The `for` statement iterates over the iterable `one_list`, and the indented code block is executed once for each item in the list. Note that the loop variable `c` still holds the last item after the loop ends, which is why the final `print(c)` shows `e`.

The `for` loop is simple yet super powerful. We can now combine it with the conditional statements we learned.

In [264]:
one_list = list('Data Science')
print(one_list)
string = ''
upper = []
lower = []
other = []
# c = 0
for c in one_list:
    string += c
#     string = string + c
    if c.isupper():
        upper.append(c)
        #
    elif c.islower():
        lower.append(c)
    else:
        other.append(c)
        
one_dict = {'upper':upper, 'lower':lower, 'other':other, 'string':string}

['D', 'a', 't', 'a', ' ', 'S', 'c', 'i', 'e', 'n', 'c', 'e']


In [269]:
# replace these initializations with one line
string = ''
upper = []
lower = []
other = []
# initialize a dictionary directly
one_dict = {}

{'upper': ['D', 'S'],
 'lower': ['a', 't', 'a', 'c', 'i', 'e', 'n', 'c', 'e'],
 'other': [' '],
 'string': 'Data Science'}

In [276]:
for i in range(10): # iterate 10 times
    print('c'*i)


c
cc
ccc
cccc
ccccc
cccccc
ccccccc
cccccccc
ccccccccc


In [279]:
for c in num_list: # 
    print(c)

101
3
7
9
10


In [None]:
# what is this final dict?
print(one_dict)

A `for` loop will work on any [iterable](https://docs.python.org/3/glossary.html) object. The glossary defines an iterable as `An object capable of returning its members one at a time`. Examples include most of the collections we have learned, plus others such as file objects. Iterables can be used in a `for` loop and in many other places where a sequence is needed (`zip()`, `map()`, ...).

The `for` statement will execute the indented code block until the items in the iterable are exhausted. You don't need to know the size/length of the iterable ahead of time.
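
Roughly speaking, a `for` loop asks the iterable for an iterator with `iter()` and then calls `next()` on it until `StopIteration` is raised. The sketch below spells that out by hand; it is a simplified picture of the mechanism, and the variable names are made up.

```python
letters = ['a', 'b', 'c']

iterator = iter(letters)     # ask the iterable for an iterator
collected = []
while True:
    try:
        item = next(iterator)    # fetch one member at a time
    except StopIteration:        # raised when nothing is left
        break
    collected.append(item.upper())

print(collected)
```

This is also why the length does not need to be known up front: the loop simply stops when the iterator reports that it is exhausted.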

#### Use `zip()` to make an iterator that aggregates elements from each of the iterables

Documentation: https://docs.python.org/3.3/library/functions.html#zip

- can be applied to any number of iterables (usually two or more)
- the result is as long as the shortest input iterable

In [328]:
x = ['four', 'five', 'six']
y = [ 4,      5,      6]
z = [ 0, 2 ]
zipped = zip(x, y, z)
print(list(zipped))
# print(list(zipped))

[('four', 4, 0), ('five', 5, 2)]


In [329]:
# each zipped tuple is unpacked into three names
# (new names are used so the lists x, y, z are not overwritten)
for xi, yi, zi in zip(x, y, z):
    print(xi)
    print(yi)
    print(zi)

four
4
0
five
5
2


`zip` is for "one time use" only, because Python does not store the entire sequence in memory; items are produced one at a time -- [look at what yield does in python](https://www.geeksforgeeks.org/use-yield-keyword-instead-return-keyword-python/)

The following code prints nothing: `zipped` was already consumed by the `print(list(zipped))` call two cells above, so the loop body never runs.

In [330]:
for x,y in zipped:
    print(x)
    print(y)
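
The one-time-use behaviour can be checked directly. In this sketch (made-up variable names) the same `zip` object is turned into a list twice; storing the result in a list is one way to reuse it.

```python
names = ['four', 'five', 'six']
numbers = [4, 5, 6]

pairs = zip(names, numbers)
first_pass = list(pairs)     # consumes the zip object
second_pass = list(pairs)    # nothing left to consume

# keep the result in a list if it is needed more than once
reusable = list(zip(names, numbers))

print(first_pass)
print(second_pass)
print(len(reusable), len(reusable))
```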

In [333]:
# the * operator can be used to unzip a list
x = ['four', 'five', 'six']
y = [ 4,      5,      6]
x2, y2 = zip(*zip(x, y))
print(x == list(x2) and y == list(y2))

True


What is the type of `x2` and `y2` here? Why?

Tuple. `zip` yields tuples, and the assignment unpacks the two tuples it produces into `x2` and `y2`.

In [359]:
# zip(x, y) yields the tuples ('four', 4), ('five', 5), ('six', 6); * unpacks them into a list
[*zip(x,y)]

('four', 4)
('five', 5)
('six', 6)

[('four', 4), ('five', 5), ('six', 6)]

In [341]:
# zipping the three tuples above regroups them into ('four', 'five', 'six') and (4, 5, 6),
# which the assignment then unpacks
x2, y2 = zip(*zip(x,y))
print(x2)
print(y2)

('four', 'five', 'six')
(4, 5, 6)


#### Sequence unpacking

In [315]:
t = 12345, 54321, 'hello!' # tuple packing
print(type(t))
x, y, z = t # tuple unpacking
print(type(x), type(y), type(z))

<class 'tuple'>
<class 'int'> <class 'int'> <class 'str'>


Sequence packing is length agnostic: you can pack as many items as you like from the right-hand side into a single tuple. Sequence unpacking requires that there are as many variables on the left side of the equals sign as there are elements in the sequence on the right side. Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.

If you don't need or care about some of the unpacked values, you can use `_` as a placeholder, like the following.

In [317]:
a, b, _ = t
print(a)
print(b)
# the unpacking will fail without _:
# `a, b = t` raises ValueError: too many values to unpack (expected 2)

12345
54321
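
When the number of leftover items is unknown, a starred name can collect them instead of a fixed number of `_` placeholders. A minimal sketch of this standard Python 3 syntax, with made-up variable names:

```python
t = 12345, 54321, 'hello!'

first, *rest = t          # rest collects everything after the first item
*_, last = t              # _ collects everything before the last item
head, *middle, tail = [1, 2, 3, 4, 5]

print(first, rest)
print(last)
print(head, middle, tail)
```

The starred name always receives a list, even when it ends up empty.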

#### 4th way to generate a dictionary

In [321]:
dict_list = zip(one_dict.keys(), one_dict.values())
print(list(dict_list))

[('upper', ['D', 'S']), ('lower', ['a', 't', 'a', 'c', 'i', 'e', 'n', 'c', 'e']), ('other', [' ']), ('string', 'Data Science')]


In [322]:
print(dict(zip(one_dict.keys(), one_dict.values())))
print(one_dict)

{'upper': ['D', 'S'], 'lower': ['a', 't', 'a', 'c', 'i', 'e', 'n', 'c', 'e'], 'other': [' '], 'string': 'Data Science'}
{'upper': ['D', 'S'], 'lower': ['a', 't', 'a', 'c', 'i', 'e', 'n', 'c', 'e'], 'other': [' '], 'string': 'Data Science'}


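The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using `zip(*[iter(s)]*n)`. The following example is a simpler relative of that idiom: `[x, y]*4` repeats the two lists four times, and `*` passes all eight lists to `zip` as separate arguments.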

In [349]:
a = zip(*[x,y]*4)
for t in a:
    print(t)

('four', 4, 'four', 4, 'four', 4, 'four', 4)
('five', 5, 'five', 5, 'five', 5, 'five', 5)
('six', 6, 'six', 6, 'six', 6, 'six', 6)
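
The cell above repeats whole lists. The grouping idiom `zip(*[iter(s)]*n)` works differently because it repeats a single shared iterator. A minimal sketch with a made-up list `s`:

```python
s = [1, 2, 3, 4, 5, 6, 7]
n = 3

# [iter(s)] * n is a list holding the SAME iterator n times,
# so zip pulls n consecutive items from it for every output tuple
groups = list(zip(*[iter(s)] * n))
print(groups)  # [(1, 2, 3), (4, 5, 6)] -- the leftover 7 is dropped
```

Because `zip` stops at the shortest input, an incomplete final group is silently discarded.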


### 5.2 Partial loop using `break` and `continue`

`for` loop will be executed for all items in iterable, but what if I don't want that?

`break` acts as a hard stop on the loop execution, code will exit from the loop. Note that if `break` is not wrapped within a condition, it will always break out of the loop.

In [360]:
for i in range(1,10):
    print(i)
    if i%2==0 and i%3==0:
        break
print(i)

1
2
3
4
5
6
6


- indentation
- `range(start, stop, [step])`. This is very convenient to use if you want to generate index for your iterations. [Read how to use `range`](https://docs.python.org/3/library/stdtypes.html#range) on your own for the exercise
    ```
    num_list = [2,4,6,8]
    for i in range(len(num_list)):
        print('value at index {0} is {1}'.format(i, num_list[i]))
    ```
- `==` logical condition
- the loop is broken when the `if` condition is met and `break` is reached

`continue` is a shortcut that skips the rest of the code in the indented block and moves on to the next iteration. It has a similar placeholder flavor to `pass` in an `if-elif-else` statement, but the two are actually different: `pass` simply does nothing, while `continue` jumps to the next loop iteration.

In [361]:
# global

for i in range(1,10):
    if i%2==0 and i%3==0:
        continue
    print(i)

1
2
3
4
5
7
8
9


In [370]:
a = [0, 1, 2]
for element in a:
    if not element:
#     `not element` is True for falsy values: 0, '', [], ()
#     (see boolean variables and the if-else statement)
#     if element == 0:
        pass # do nothing
    print(element)

0
1
2


In [368]:
a = [0, 1, 2]
for element in a:
    if not element:
#     if element ==0: 
        continue # jump to the next iteration
    print(element)

1
2


### 5.3 A few iterables
We have seen `list` used with `for`, and it's said other collections will work too. 

#### `enumerate()`
We have seen that `range()` makes handling indices easier, but there is actually an easier way to get both the item and its index at the same time. `enumerate()` yields an iterable sequence of values, where each value is a pair of the index and the actual item.

In [None]:
# enumerate(values) behaves much like zip(range(len(values)), values)

In [371]:
# compare with the code using range above
num_list = [2,4,6,8]
for i, number in enumerate(num_list):
    print('value at index {0} is {1}'.format(i, number))

value at index 0 is 2
value at index 1 is 4
value at index 2 is 6
value at index 3 is 8


`enumerate` seems more powerful when you want to iterate through existing iterable, while `range` gives you more flexibility to run any number of iterations.

`i, number` ~ sequence unpacking, already mentioned above, [read more here](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences)
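
`enumerate()` also accepts an optional `start` argument, which helps when positions should be counted from 1 for display. A short sketch (the list contents and variable names are only for illustration):

```python
num_list = [2, 4, 6, 8]

# each element produced by enumerate is an (index, item) tuple
pairs = list(enumerate(num_list))
print(pairs)

# start=1 shifts the counter without changing the items
lines = []
for position, number in enumerate(num_list, start=1):
    lines.append('item {0} is {1}'.format(position, number))
print(lines)
```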

#### Dictionaries (`dict`)
You can use a `for` statement to iterate through the keys of a dictionary.

In [372]:
print(three_dict)

{'data': 'science', 'science': 0, 0: {'data': 'science', 'science': 0}, (1, 2): [1, 2]}


In [375]:
# iterate over keys
for key in three_dict:
    print('{0} : {1}'.format(key, three_dict[key]))

data : science
science : 0
0 : {'data': 'science', 'science': 0}
(1, 2) : [1, 2]


In [374]:
# you can also do this by unpacking
for key, value in three_dict.items():
    print('{0} : {1}'.format(key, value))

data : science
science : 0
0 : {'data': 'science', 'science': 0}
(1, 2) : [1, 2]


Note: before Python 3.7, dictionaries did not guarantee any order, so you could not rely on iterating over the keys in the order they were defined. Since Python 3.7, a plain `dict` preserves insertion order. If you need a different order, check out [how to use `sorted()`](https://docs.python.org/3/library/functions.html#sorted); on older versions, see the [OrderedDict collection](https://docs.python.org/2/library/collections.html#ordereddict-objects).
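
If a particular order matters, `sorted()` can be applied to the keys or to the items. The sketch below uses a made-up dictionary `scores`; note that `sorted()` needs keys that can be compared with each other, so it would fail on a dictionary that mixes strings, ints, and tuples as keys.

```python
scores = {'carol': 3, 'alice': 1, 'bob': 2}   # made-up data

# iterate over keys in alphabetical order
by_key = [(name, scores[name]) for name in sorted(scores)]

# iterate over items ordered by value, largest first
by_value = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

print(by_key)
print(by_value)
```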

### 5.4 Comprehensions
A comprehension is nothing but a more concise way of writing a `for` loop that builds a collection, and is allegedly more "pythonic". If you are familiar with set-builder notation in mathematics, you can probably see the similarity.

`[expression/code for item in iterable]`

Comprehension [may be a bit faster but technically still on the same scale as `for` loop](https://stackoverflow.com/questions/22108488/are-list-comprehensions-and-functional-functions-faster-than-for-loops) with slightly less overhead.

#### List comprehension

In [382]:
print(num_list)
# list comprehension
num_list_2 = [number**2 for number in num_list]
print(num_list_2)

[2, 4, 6, 8]
[4, 16, 36, 64]


In [383]:
# it is basically doing this in one line:
num_list_2 = []
for number in num_list:
    num_list_2.append(number**2)
print(num_list_2)

[4, 16, 36, 64]


In [499]:
# then what's the equivalent for the following comprehension?
num_list_3 = [num**3 for num in range(8)]
print(num_list_3)

# code up the for loop version:
num_list_3 = []
for num in range(8):
    num_list_3.append(num**3)
print(num_list_3)    

[0, 1, 8, 27, 64, 125, 216, 343]
[0, 1, 8, 27, 64, 125, 216, 343]


In [493]:
abc='ajf2345' 
print(abc.isalnum())

True


You can make it more complex and chain other statements to it.

You can combine `if-else` statement with comprehension:

`[f(x) for x in iterable if condition]`

`[f(x) if condition else g(x) for x in iterable]`

In [384]:
# with if-else
print([number**3 if number < 5 else number**2 for number in num_list])

[8, 64, 36, 64]
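
The two patterns behave differently: a trailing `if` filters items out, while `if-else` before the `for` keeps every item and only chooses which expression to apply. A short sketch with the same kind of list (variable names are made up):

```python
num_list = [2, 4, 6, 8]

# trailing if: items that fail the condition are dropped
big_squares = [number**2 for number in num_list if number > 4]

# if-else in front: every item is kept, the expression changes
labels = ['big' if number > 4 else 'small' for number in num_list]

print(big_squares)
print(labels)
```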


In [392]:
# nested for loops
print([(x, y) for x in [1,2,3] for y in [3,1,4] if x == y])

[(1, 1), (3, 3)]


In [391]:
xy = []
for x in [1,2,3]:
    for y in [3,1,4]:
        # what do you want program to do
        if x == y:
            xy.append((x,y))
print(xy)

[(1, 1), (3, 3)]


In [None]:
# helper function
%time
%timeit

It can get pretty complex really fast. Comprehension is usually quite elegant and short to write, but can be hard to read sometimes. For more examples, read [nested list comprehension here](https://docs.python.org/3/tutorial/datastructures.html#nested-list-comprehensions).

#### Set and dictionary comprehension
A comprehension creates a collection. Different kinds of comprehension differ only in the type of object they generate; the idea is the same even though the syntax differs slightly.

- Dictionary: `{key: value for item in iterable}`
- Set: `{expression for item in iterable}`  -- duplicates are removed from the generated set

In [393]:
# set
num_list = [1,1,2,3,5,8]
set_odd = {x for x in num_list if x%2 == 1}
print(set_odd)

{1, 3, 5}


In [394]:
# dict
num_list = [1,1,2,3,5,8]
squared_dict = {x:x**2 for x in num_list}
print(squared_dict)

{1: 1, 2: 4, 3: 9, 5: 25, 8: 64}


In [395]:
# slicing with a step gives the same result as this list comprehension :)
print([num_list[i] for i in range(0, len(num_list), 2)])
print(num_list[::2])

[1, 2, 5]
[1, 2, 5]


In [415]:
num_list[::2]

[1, 2, 5]

In [414]:
[num_list[i] for i in range(0, len(num_list), 2)]

[1, 2, 5]

### 5.5 `while` loop

With the `while` loop we can execute a set of statements as long as a condition is `True`. The loop exits only when the condition becomes `False` or when a `break` statement is reached.

```
while condition:
____execute code block
```

In [409]:
i = 1 # iter/i iteration
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1
    # i = i+1
# see final value of i    
print('while loop is broken at i={}'.format(i))

1
2
3
while loop is broken at i=3


In [410]:
i = 0
while i < 6:
    i += 1
    if i == 3:
        continue
    print(i)
print('while loop is broken at i={}'.format(i))

1
2
4
5
6
while loop is broken at i=6


With an `else` clause after a `while` loop, we can run a block of code once when the condition is no longer `True`.

In [357]:
i = 1
while i < 6:
    print(i)
    i += 1
else:
    print("i is no longer less than 6")

1
2
3
4
5
i is no longer less than 6
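One detail worth knowing: the `else` block is skipped when the loop ends through `break`. A minimal sketch, with made-up flag names `ended_normally` and `hit_else`:

```python
# loop ends because the condition becomes False -> else runs
i = 1
ended_normally = False
while i < 4:
    i += 1
else:
    ended_normally = True

# loop ends through break -> else is skipped
j = 1
hit_else = False
while j < 4:
    if j == 2:
        break
    j += 1
else:
    hit_else = True

print(ended_normally)  # True
print(hit_else)        # False
```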


When to use `while` or `for` loops?
- `for` repeats once for each item of an iterable, so the number of iterations is fixed up front
- `while` runs until its condition is no longer met (**warning: a condition that never becomes `False` gives an infinite loop**)

In [427]:
# equivalent
# while (iterable is not exhausted) or (number of iteration not met):
#     do something

list1 = [0,2,3,4,5]
for number in list1: # iterate over a list
    print(number)
    
while list1: # as long as the list is not empty
    number = list1.pop()
    print(number)
    print(list1)

0
2
3
4
5
5
[0, 2, 3, 4]
4
[0, 2, 3]
3
[0, 2]
2
[0]
0
[]


What counts as `True`? See [truth value testing](https://docs.python.org/3/library/stdtypes.html#truth-value-testing) in the Python documentation.
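A short sketch of the most common cases; the list names are made up for illustration.

```python
# values that count as False in a condition
falsy_values = [None, False, 0, 0.0, '', [], (), {}, set()]
print([bool(v) for v in falsy_values])   # every entry is False

# most other values count as True, including non-empty strings and lists
truthy_values = [1, -1, 'False', [0], (None,)]
print([bool(v) for v in truthy_values])  # every entry is True

# this is why `while list1:` stops once the list is empty
```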

#### Exercise 5.

1. Write a Python program to find the numbers between 1500 and 2700 (both included) that are divisible by both 7 and 5.

In [505]:
# code up your solution here
start = 1500
end = 2700
numbers = []
for num in range(start, end+1):
    if num % 7 == 0 and num % 5 ==0:
        numbers.append(num)
print(numbers)

[1505, 1540, 1575, 1610, 1645, 1680, 1715, 1750, 1785, 1820, 1855, 1890, 1925, 1960, 1995, 2030, 2065, 2100, 2135, 2170, 2205, 2240, 2275, 2310, 2345, 2380, 2415, 2450, 2485, 2520, 2555, 2590, 2625, 2660, 2695]


2. Write a Python program to count the number of even and odd numbers from a series of numbers.

Sample: `numbers = (1, 2, 3, 4, 5, 6, 7, 8, 9)`

Expected Output :

```
Number of even numbers : 4
Number of odd numbers : 5
```

In [521]:
# code up your solution here
numbers = (1, 2, 3, 4, 5, 6, 7, 8, 9)
odds = 0
evens = 0
for num in numbers:
    if num % 2:
        odds += 1
    else:
        evens += 1
print('Number of even numbers : {}'.format(evens))
print('Number of odd numbers : {}'.format(odds))

Number of even numbers : 4
Number of odd numbers : 5


3. Write a Python program which iterates the integers from 0 to 50. For multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". For numbers which are multiples of both three and five print "FizzBuzz".

Expected Output :
```
FizzBuzz
1
2
Fizz
4
Buzz
...
```

In [508]:
# code up your solution here
for i in range(0, 51):
    if i % 15==0:
        print('FizzBuzz')
    elif i % 3==0:
        print('Fizz')
    elif i % 5==0:
        print('Buzz')
    else:
        print(i)

FizzBuzz
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz
Fizz
22
23
Fizz
Buzz
26
Fizz
28
29
FizzBuzz
31
32
Fizz
34
Buzz
Fizz
37
38
Fizz
Buzz
41
Fizz
43
44
FizzBuzz
46
47
Fizz
49
Buzz


4. Given a list, iterate over it and display the numbers that are divisible by 5. If you find a number greater than 150, stop the loop.

`list1 = [12, 15, 32, 42, 55, 75, 122, 132, 150, 180, 200]`

Expected output:
```
15
55
75
150
```

In [511]:
list1 = [12, 15, 32, 42, 55, 75, 122, 132, 150, 180, 200]

for num in list1:
    if num > 150:
        break
    elif num % 5 == 0:
        print(num)

15
55
75
150


5. Pick one of the questions above and use `range()` for a different solution

In [None]:
# code up your solution here
# problem 1 & 3 are using range()

6. Pick one of the questions above and use comprehensions for a different solution

In [516]:
# code up your solution here
# problem 1
[num for num in range(start, end+1) if num % 7 == 0 and num % 5 == 0]

[1505,
 1540,
 1575,
 1610,
 1645,
 1680,
 1715,
 1750,
 1785,
 1820,
 1855,
 1890,
 1925,
 1960,
 1995,
 2030,
 2065,
 2100,
 2135,
 2170,
 2205,
 2240,
 2275,
 2310,
 2345,
 2380,
 2415,
 2450,
 2485,
 2520,
 2555,
 2590,
 2625,
 2660,
 2695]

7. Pick one of the questions above and use a `while` loop for a different solution

In [517]:
# code up your solution here
i = 0
while i <= 50:
    if i % 15==0:
        print('FizzBuzz')
    elif i % 3==0:
        print('Fizz')
    elif i % 5==0:
        print('Buzz')
    else:
        print(i)
    i += 1

FizzBuzz
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz
Fizz
22
23
Fizz
Buzz
26
Fizz
28
29
FizzBuzz
31
32
Fizz
34
Buzz
Fizz
37
38
Fizz
Buzz
41
Fizz
43
44
FizzBuzz
46
47
Fizz
49
Buzz


## 6. Writing functions
We have learned a bunch of techniques for writing programs to solve problems. In some of the exercises we did, you needed to update variables manually, which quickly becomes difficult to manage. Imagine you have written a more complex program with 500+ lines of code, and now you need to update two variables everywhere they appear. It's not only tiring but also error-prone. Now we are introducing functions (we have actually used a lot of them already!) to make our code more reusable.

Functions can:
- Reduce complexity. You can refer to a chunk of code by a short name. Once you have made sure it behaves as expected, you can be confident it will work anywhere.
- Be reused. It is easy for you and others to reuse them for similar tasks.
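As a sketch of both points, the divisibility search from Exercise 5 can be wrapped in a function so that the range and the divisors become arguments instead of variables to edit by hand. The name `find_multiples` and its parameters are made up for illustration.

```python
def find_multiples(start, end, divisors):
    # return the numbers in [start, end] divisible by every number in divisors
    result = []
    for num in range(start, end + 1):
        if all(num % d == 0 for d in divisors):
            result.append(num)
    return result

# the same code serves different inputs without editing its body
print(find_multiples(1500, 1550, [5, 7]))  # [1505, 1540]
print(find_multiples(1, 30, [2, 3]))       # [6, 12, 18, 24, 30]
```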

### 6.1 Define functions with `def`

The goal of a function is to solve a problem. It could be a very simple task or a group of small tasks stacked together. Python functions take inputs (`arguments`), run some code, and return results or do whatever else you defined. This is similar to how functions are defined in math, where there is a domain, a mapping, and a value.

Let's start with some simple examples.

In [428]:
print('Data Science', end='!')
# print is a built-in function

Data Science!

Functions are usually declared by 

```
def function_name(arguments):
    
    code with arguments
    code with arguments
    
    [optional]
    [return output]
```

In [455]:
def say_my_name(name, excited=True):
    '''
    Print name, followed by '!' when excited is True.
    '''
    if excited:
        end = '!\n'
        print(name, end=end)
    else:
        print(name)
    
say_my_name('Heisenberg')
say_my_name('A girl', False)

Heisenberg!
A girl


Syntax:
- `def` needs a matching `:`
- indentation
    - the indented block is the function body and is executed when the function is called
    - levels of indentation: `if-else` has its own level inside
- arguments
    - if keywords are not specified, arguments are matched by position
    - you can specify a default value for a parameter, which makes that argument optional
    - when a value is passed, the default is overridden

### 6.2 Getting value with `return`
We noticed there is no `return` in the function above. What does that mean?

In [7]:
a = say_my_name('Heisenberg')
print(say_my_name('Heisenberg'))
print(a)
# what is happening here?

Heisenberg!
Heisenberg!
None
None


If `return` is omitted, a `None` object is returned. If we want to return value(s), there are different ways to do it. There are 3 ways to end a function:
1. a `return expression` statement: this is the result of the function and can be any object
2. an empty `return` statement: no explicit return value, technically returns `None`
3. no `return` statement: the function ends where the indented block ends, same as 2.
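The three cases can be sketched side by side. The function names below are made up for illustration.

```python
def with_value(x):
    return x * 2      # 1. return an expression

def with_empty_return(x):
    if x < 0:
        return        # 2. empty return: stop early, result is None
    print('x is non-negative')

def without_return(x):
    x * 2             # 3. no return: the result is computed, then thrown away

print(with_value(3))          # 6
print(with_empty_return(-1))  # None
print(without_return(3))      # None
```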

In [461]:
name = 'abc'
end = '!'
print(name, end)
print(name+end)

abc !
abc!


In [456]:
def say_my_name(name, excited=True):
    if excited:
        end = '!'
        print(name, end=end)
    else:
        end = ''
        print(name)
        
    return name+end

In [458]:
# try to understand output
a = say_my_name('Heisenberg')
print(say_my_name('Heisenberg'))
print(a)

Heisenberg!Heisenberg!Heisenberg!
Heisenberg!


In [462]:
def say_my_name(name, excited=True):
    if excited:
        end = '!'
        print(name, end=end)
    else:
        end = ''
        print(name)
        
    return name, end
#     return 

In [463]:
# try to understand output
a = say_my_name('Heisenberg')
print(say_my_name('Heisenberg'))
print(a)

# what if only want the '!' in output?


Heisenberg!Heisenberg!('Heisenberg', '!')
('Heisenberg', '!')
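One possible answer to the question in the cell above: a function that returns several values really returns a single tuple, which can be unpacked or indexed. The sketch uses `name_and_ending`, a hypothetical print-free variant of `say_my_name`, so that the output stays easy to read.

```python
def name_and_ending(name, excited=True):
    # hypothetical variant that returns two values as one tuple
    end = '!' if excited else ''
    return name, end

# unpack both values, using `_` for the one we don't need
_, ending = name_and_ending('Heisenberg')
print(ending)                             # !

# or index into the returned tuple
print(name_and_ending('Heisenberg')[1])   # !
```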


### 6.3 Arguments
Functions can take any number of parameters: zero or as many as you want.

Arguments are usually provided by parameter **name** or by **position**, or a mixture of the two. As we have seen in the example above, we can pass arguments by position. This seems very straightforward, but when you have more parameters it is easy to get confused, and the code becomes hard to read.


In [466]:
def print_100():
    print('!'*100)
print_100()

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


In [467]:
name, _ = say_my_name(name='A girl', excited=False)
name, _ = say_my_name(excited=False, name='A girl')

A girl
A girl


In the example above we assigned arguments by keyword, and we noticed the order of keyword arguments does not matter. However, this is not true for positional arguments.

When mixing keyword and positional arguments, be sure to provide the positional arguments first, and they will be matched to the parameters in order. Keyword arguments are not affected by the order in which they are provided.

In [479]:
print_out = 'False'
print(print_out == True)
print(bool(print_out) == True)

False
True


In [486]:
def print_numbers(num1, num2, num3, num4, num5, print_out=True):
    # `if print_out:` behaves like `if bool(print_out) == True:`,
    # which is different from `if print_out == True:`
    if print_out:
        print("Input has [{0}, {1}, {2}, {3}, {4}].".format(num1, num2, num3, num4, num5))

In [469]:
print_numbers(1,2,3,4,5)

Input has [1, 2, 3, 4, 5].


In [470]:
print_numbers(num4=4, num5=5, num3=3, num1=1, num2=2)

Input has [1, 2, 3, 4, 5].


In [471]:
print_numbers(1,2,3,num4=4, num5=5)

Input has [1, 2, 3, 4, 5].


In [472]:
# what's the output?
print_numbers(4,5,3,num2=2, num1=1)

TypeError: print_numbers() got multiple values for argument 'num2'

In [473]:
# what's the output?
print_numbers(1,2,3,num4=4, num5=5, 0)

SyntaxError: positional argument follows keyword argument (<ipython-input-473-22dfdc2e2586>, line 2)

In [474]:
# what's the output?
print_numbers(1,2,3,4,5,0)

You might have noticed that even though the function has 6 parameters, we can run it with 5 arguments. The reason is that the 6th parameter is optional, as we already provided a default value in the definition. Parameters without a default value are called mandatory.

When there is a default value for an optional parameter, we don't have to provide an argument for it. The parameter takes the default value unless an actual value is specified (by keyword or position).

*Note: mandatory parameters must come before any optional parameters when you define the function.*

In [488]:
def print_numbers(print_out=True, num1, num2, num3, num4, num5):
    if print_out:
        print("Input has [{0}, {1}, {2}, {3}, {4}].".format(num1, num2, num3, num4, num5))

SyntaxError: non-default argument follows default argument (<ipython-input-488-3df2860add0f>, line 1)

In [490]:
print_numbers(1,2,3,4,5,False)

It's also best practice to use only **immutable** objects as default values for optional parameters, because a default value is created once, when the function is defined, and is then shared across all calls of the function.

In [141]:
def sum_ab(a, b, c=[]):
    c.append(a+b)
    return c

print(sum_ab(1,2))
print(sum_ab(3,4))

[3]
[3, 7]


In [528]:
def sum_ab(a, b, a_b=None):
    if a_b is None:  # `not a_b` would also replace an empty list passed by the caller
        a_b = []
    a_b.append(a+b)
    return a_b

print(sum_ab(1,2))
print(sum_ab(3,4))

[3]
[7]


In [529]:
print(a_b)

NameError: name 'a_b' is not defined

### 6.4 Scope and passing by assignment
It's important to understand the "scope" or "space" of a function. Each function has its own "local" space, which is different from the "global" one.

What happens inside a function (assignments to local variables) exists only inside that function.

![img](https://cdn.askpython.com/wp-content/uploads/2019/08/python-namespace-example.png)



In [530]:
def say_my_name():
    name = 'Heisenberg'
    print(name)

In [531]:
name = 'Not Heisenberg'
print(name)
say_my_name()
print(name)

Not Heisenberg
Heisenberg
Not Heisenberg


Note that here we defined `name = 'Not Heisenberg'` in the global environment, while `name = 'Heisenberg'` is defined within the function. Even though a different value is assigned to `name` within the function, that value only exists within the function, and the original global value persists after the function is called. The inside and outside of the function stay independent. You can imagine the function as a "container", which has limited access to what is outside the current `def`, while the outside has no idea what's happening inside.

This is usually a desirable property, as you don't need to worry about overcrowding the namespace or accidentally overwriting another variable. You can use the same name for a parameter as for a global variable.

The variable scope resolution follows the LEGB rule:

`Local -> Enclosed -> Global -> Built-in`

![img](https://cdn.askpython.com/wp-content/uploads/2019/08/python-variable-scope-resolution-legb.png)

The following is an example illustrating all definitions, [read more about LEGB here](https://www.askpython.com/python/python-namespace-variable-scope-resolution-legb) if you are interested.

![img](https://cdn.askpython.com/wp-content/uploads/2019/08/python-variable-scope-example.png)
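In case the images do not load, here is a minimal text sketch of the same lookup order. All names in it are made up for illustration.

```python
x = 'global x'            # G: defined at the top level of the module

def outer():
    x = 'enclosed x'      # E: local to outer, enclosing for inner

    def inner():
        x = 'local x'     # L: local to inner, found first
        return x

    return inner(), x

def uses_global():
    return x              # no local or enclosing x, so the global one is used

def uses_builtin():
    return len('abc')     # len is not defined in this file: B (built-in)

print(outer())         # ('local x', 'enclosed x')
print(uses_global())   # global x
print(uses_builtin())  # 3
```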

Let's look at a few examples to make sure we understand that!

In [532]:
# difference between global and local
global_var = 'foo'

def ex1():
    local_var = 'bar'
    print(global_var)
    print(local_var)
    
ex1()
print(global_var)
print(local_var)

# global namespace: global_var
# local namespace: local_var (+ LEGB ^)

foo
bar
foo


NameError: name 'local_var' is not defined

In [533]:
# how NOT to set global variable
var = 'foo'

def ex2():
    var = 'bar'
    print('inside the function var is '+var)
    
ex2()
print('outside the function var is '+var)

# this is not setting global variable, but creating a local variable with same name

inside the function var is bar
outside the function var is foo


In [534]:
# how to set global variable - BUT STRONGLY NOT RECOMMENDED
var = 'foo'

def ex3():
    global var # declare variable as global
    var = 'bar'
    print('inside the function var is '+var)
    
ex3()
print('outside the function var is '+var)

inside the function var is bar
outside the function var is bar


In [535]:
# nested functions
# function has access to global or outer vars (LEGB), but not vice versa
def ex4():
    var_outer = 'foo'
    def inner():
        var_inner = 'bar'
        print(var_outer)
        print(var_inner)
    inner()
    print(var_outer)
    print(var_inner)
    
ex4()

foo
bar
foo


NameError: name 'var_inner' is not defined

In [537]:
# global and outer are different
def ex6():
    var = 'foo'
    def inner():
        global var
        var = 'bar'
        print('inside inner, var is '+var)
    inner()
    print('inside outer, var is '+var)

ex6()

inside inner, var is bar
inside outer, var is foo


When the function is called, a namespace of local variables is created for it, and it is removed after the function finishes running, with all local variables gone. Value assignments and variable changes are local to the function.

Also, though `global` seems helpful, it's generally not recommended, as it can lead to unexpected changes. Try to avoid it.

Then how to solve this? If you are curious, [read more here](https://www.saltycrane.com/blog/2008/01/python-variable-scope-notes/).
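Two common alternatives, sketched with made-up names: pass the value in and return the new one, or, for nested functions, use `nonlocal` to rebind a variable of the enclosing function. This is one possible approach, not necessarily the one the linked post recommends.

```python
# alternative 1: take the value as an argument and return the new value
def add_one(counter):
    return counter + 1

counter = 0
counter = add_one(counter)   # the caller decides to update the global
print(counter)               # 1

# alternative 2: nonlocal rebinds a variable of the enclosing function
def make_counter():
    count = 0
    def increment():
        nonlocal count       # refers to count in make_counter, not a global
        count += 1
        return count
    return increment

increment = make_counter()
increment()
print(increment())           # 2
```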

The key takeaway from the exercises above is: don't confuse the layers of scope, and when possible, name your parameters and variables properly to avoid unintended results.

This means that, most of the time, a function can't change variables assigned outside of its own namespace.

In [539]:
def remember_my_name(name):
    return name

In [540]:
name = 'Not Heisenberg'
print(name)

remember_my_name('Heisenberg')
print(name)

name = remember_my_name('Heisenberg')
print(name)

Not Heisenberg
Not Heisenberg
Heisenberg


The first time we called the function `remember_my_name`, the value `'Heisenberg'` was bound to the parameter `name` only inside the function. Even though the function returns a value, the returned object is not captured, hence there is no change to the global `name` variable.

The second time we called the function, exactly the same happens within the function, but the returned value is assigned to the global `name`, so it changes.

Some objects are mutable (`list`, `dict`, `set`, etc.), and you can use their methods to change their value inside a function. What's the difference between functions and methods, you may ask? [Check out the answer here.](https://stackoverflow.com/questions/155609/whats-the-difference-between-a-method-and-a-function)

In [541]:
def remember_names(name_list):
    name_list.append('name1')
    name_list.append('name2')
    
name_list = ['aaa', 'bbb']
print(name_list)

remember_names(name_list)
print(name_list)

# no explicit assignment, but object value modified via method

['aaa', 'bbb']
['aaa', 'bbb', 'name1', 'name2']
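The difference between rebinding a name and mutating an object can be sketched as follows. The function names `rebind` and `mutate` are made up for illustration.

```python
def rebind(names):
    # assignment creates a new local list; the caller's list is untouched
    names = names + ['name1']
    return names

def mutate(names):
    # append changes the very object the caller passed in
    names.append('name1')

original = ['aaa', 'bbb']
rebind(original)
print(original)   # ['aaa', 'bbb']

mutate(original)
print(original)   # ['aaa', 'bbb', 'name1']

# pass a copy if the caller's list must stay unchanged
safe = ['aaa', 'bbb']
mutate(list(safe))
print(safe)       # ['aaa', 'bbb']
```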


### 6.5 Lambda 
Lambda functions are usually simple functions with only one expression. A lambda typically exists as a one-time-use function, with no name assigned and no `def` or `return` written out. Another way to look at a lambda is as a "transformation": it takes the input, does something simple, and outputs the value.

Generally, you probably won't be writing a lot of lambdas, but they can be handy once in a while for some very simple calculations. [Read more about lambdas here.](https://realpython.com/python-lambda/)

`lambda input1, input2, ... : returned expression using inputs`

In [544]:
function = lambda x,y  : x**2+y**2  # lambda object is a function
print(function(x=3, y=4))
print(type(function))

25
<class 'function'>


In [547]:
# quick use of lambda

scores = [('student1', 99), ('student2', 65), ('student3', 80)]
scores.sort()
print(scores) # tuples are compared item by item, so this sorts by name first

scores.sort(key=lambda x : x[1], reverse=False)
print(scores)  

[('student1', 99), ('student2', 65), ('student3', 80)]
[('student2', 65), ('student3', 80), ('student1', 99)]


In [552]:
# lambda x : x[1]
[x[1] for x in scores]

[65, 80, 99]
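The same key-function idea works with the built-ins `sorted()`, `max()` and `min()`. The sketch below also shows a lambda next to its `def` equivalent; the names `get_score` and `get_score_def` are made up for illustration.

```python
scores = [('student1', 99), ('student2', 65), ('student3', 80)]

# a lambda and the equivalent def
get_score = lambda pair: pair[1]
def get_score_def(pair):
    return pair[1]

# sorted() returns a new list and leaves scores unchanged
by_score_desc = sorted(scores, key=get_score, reverse=True)
print(by_score_desc)  # [('student1', 99), ('student3', 80), ('student2', 65)]

# max() and min() accept the same kind of key function
print(max(scores, key=get_score_def))         # ('student1', 99)
print(min(scores, key=lambda pair: pair[1]))  # ('student2', 65)
```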

### 6.6 Script file
So far we have only worked in Jupyter notebooks. They are a powerful tool, but not the only way to use Python. More often than in a notebook, you'll use Python in a terminal with `.py` scripts. To execute a Python script, simply run `python3 [script_name]` in your terminal/command line. The code runs behind the scenes and its output is shown on the terminal's stdout.

A typical script looks something like this:

In [560]:
%load_ext autoreload
%autoreload 2
from new_func import *  

In [None]:
# test.py
# run from the command line:
# $ python3 test.py

import os
import datetime

def function1(kw1, kw2):
    # placeholder body: do something with the inputs
    output1 = kw1 + kw2
    return output1

def function2(kw1, kw2):
    # placeholder body: do something else with the inputs
    output2 = kw1 * kw2
    return output2
# ...
# ...
# ...

if __name__=="__main__":
    # code to be executed only when the script is run directly
    print(function1(1,2))

In [561]:
__name__

'__main__'

The last `if` statement basically means: only run this part when the script is run directly by the user, and not when it is imported from another file. Writing your program this way makes it more reusable (for imports) and testable (unit testing). Though we are still working primarily in notebooks, it's a good idea to write your code this way whenever you write a script.

To learn what `if __name__=="__main__":` does exactly, [read this post](https://medium.com/@jonathan.turnock/what-does-if-name-main-do-pythons-main-scope-b6fd6b227f25), as it provides a more detailed explanation and examples to play with.
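To see the effect without leaving the notebook, the sketch below writes a tiny throwaway script to a temporary folder and executes it twice with the standard-library `runpy` module: once under the name `"__main__"` (as when run from the terminal) and once under another name (similar to what happens on import). The file name `demo_script.py` and its contents are made up for illustration.

```python
import runpy
import tempfile
from pathlib import Path

# contents of a hypothetical script
script = '''
def greet():
    return "hello"

if __name__ == "__main__":
    result = greet() + " from main"
else:
    result = "imported as " + __name__
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_script.py"
    path.write_text(script)
    # run_path executes the file and returns its global variables as a dict
    as_script = runpy.run_path(str(path), run_name="__main__")
    as_module = runpy.run_path(str(path), run_name="demo_script")

print(as_script["result"])  # hello from main
print(as_module["result"])  # imported as demo_script
```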

### 6.7 Notes on writing functions
Now that you have learned about functions, you have taken the first step toward reusable code. Converting repeated code into functions can help prevent potential bugs and save you and others time, while keeping everything more readable and easier to collaborate on.

Some generally good practices to follow:
- naming conventions: use descriptive names and only easily understood abbreviations to improve readability, for both functions and variables
- [docstring](https://www.python.org/dev/peps/pep-0257/#what-is-a-docstring): explain the purpose of the function, the meaning of its arguments, and the expected return value
- a function should do one thing only: keeping it simple and flexible makes your code less likely to break and easier to test (unit testing)
- most of your program should be contained inside functions, especially in more complex programs. It will save you A LOT OF time.
- it's considered bad practice to write long blocks of code at zero indentation; only imports and function `def`s are exceptions.
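As a sketch of these practices combined, here is the even/odd counter from Exercise 5 rewritten as a small function with a descriptive name, a docstring, a single job, and a `__main__` guard. The name `count_even_odd` and the docstring layout are just one reasonable choice, not a required style.

```python
def count_even_odd(numbers):
    """Count the even and odd integers in an iterable.

    Arguments:
        numbers: iterable of integers

    Returns:
        tuple (evens, odds) with the two counts
    """
    evens = 0
    odds = 0
    for num in numbers:
        if num % 2 == 0:
            evens += 1
        else:
            odds += 1
    return evens, odds


if __name__ == "__main__":
    evens, odds = count_even_odd((1, 2, 3, 4, 5, 6, 7, 8, 9))
    print('Number of even numbers : {}'.format(evens))
    print('Number of odd numbers : {}'.format(odds))
```

Because the function returns the counts instead of printing them, it can be reused and tested separately from the code that formats the output.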

#### Exercise 6.
Please finish this exercise in the `excercise_6.py` script.