## Name Binding (What You Think of as Assignment)
Whenever you use assignment (`=`), you can think of it as putting a name tag (or a lanyard with a name tag) onto the data--this is called name binding. You can use the `id()` function to see an object's identity, which in CPython is its memory address. For the duration of the object's existence (i.e., until it is deleted), the id is unique and unchanging. Python's assignment (`=`) NEVER copies data--it merely puts another name tag on (or wraps another lanyard around) the same object. The `is` keyword is closely related to `id()`: `is` checks whether 2 names refer to the same object, that is, whether the 2 variables point to the same memory address. If `x is y` is `True`, then `id(x) == id(y)`.

For all the Fancy Pants: the technical term for a variable name/name tag/lanyard is <i>identifier</i>.

In [1]:
a = (1, 2)
b = (1, 2)
print(a == b)  # they have the same VALUES
print(a is b, id(a), id(b))  # but they are NOT the same object

True
False 2600551240000 2600551759296


In [2]:
a = (1, 2)
b = a  # now they are both pointing to the same object
print(a == b)
print(a is b, id(a), id(b))  # same ids

True
True 2600551249152 2600551249152


In [3]:
# Statistics people say "correlation does not imply causation."
# Python people *should* say "equality does NOT imply identity."
x = (1,)
lst = [x, (1,), (1,), (1,), (1,)]
print([tup == x for tup in lst])
print([tup is x for tup in lst])  # equality does NOT imply identity
print(set(lst))  # even though only 1 unique element

[True, True, True, True, True]
[True, False, False, False, False]
{(1,)}


In [4]:
import random

random.shuffle(lst)  # round and round x goes. where x stops, nobody knows.
print([tup is x for tup in lst])  # random position

[False, False, False, False, True]


#### Reference Counting
If you want to see how many name tags/lanyards are attached to the same object, use `sys.getrefcount`. CPython internally keeps a reference count for every object. Deleting a variable removes the name tag (or cuts off the lanyard). Python's `del` statement doesn't directly delete an object. Instead, `del` deletes the name binding (the name tag) on the object, which lowers the object's reference count by 1. Once an object's reference count goes to 0, CPython removes the object from memory. In short, the `del` statement only indirectly deletes an object.

There's some fine print in the documentation: ```The count returned is generally one higher than you might expect, because it includes the (temporary) reference as an argument to getrefcount().```
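
To make the effect of `del` concrete, here is a minimal sketch. The names `data`, `alias`, `Tagged`, and `watcher` are made up for illustration, and the behavior shown relies on CPython's reference counting; other implementations may free objects at a different moment.

```python
import sys
import weakref

data = ["payload"]
alias = data                    # second name tag on the same list
before = sys.getrefcount(data)

del alias                       # removes the name tag, not the list
after = sys.getrefcount(data)
print(before - after)           # 1: exactly one reference went away
print(data)                     # the list is still reachable through `data`


class Tagged:                   # hypothetical class; plain lists cannot be weakly referenced
    pass


obj = Tagged()
watcher = weakref.ref(obj)      # a weak reference does not raise the reference count
del obj                         # the last name tag is removed, so the count drops to 0
print(watcher() is None)        # True in CPython: the object was freed right away
```

The weak reference is only a way to observe the object without keeping it alive.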

In [5]:
import sys

a = []
print("reference count: {}; memory location: {}".format(sys.getrefcount(a), id(a)))

reference count: 2; memory location: 2600552108864


In [6]:
# a list is a mutable object
a = []
print(sys.getrefcount(a))
b = a  # same object
print(sys.getrefcount(a))
print(a is b)
print(id(a), id(b))
a.append(None)
print(a, b)  # both are mutated since they are the same object

2
3
True
2600551891840 2600551891840
[None] [None]


In [7]:
# a tuple is immutable, so you cannot modify it
a = (1, )
print(sys.getrefcount(a))
b = a
print(sys.getrefcount(a))
print(a is b)  # same object so far
print(id(a), id(b))  # same memory location

print()

a += (2,)  # original tuple not modified, another tuple is created (that exists in a different memory location), so different object
print(a, b)
print(a is b)  # 2 different objects
print(id(a), id(b))  # notice that id(a) has changed memory location. id(b) still the same
print(sys.getrefcount(a))  # notice that the reference count has fallen back to 2

2
3
True
2600552143504 2600552143504

(1, 2) (1,)
False
2600551842688 2600552143504
2


#### Tuple Unpacking
You can have multiple expressions on the right side of the assignment and multiple variable names on the left side. This is most often used with tuples, but lists work as well.
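
As a quick sketch of the "lists work as well" remark, the snippet below unpacks from a list, into a list-style target, and into nested targets. All variable names are made up for illustration.

```python
first, second = [10, 20]                 # the right side can be a list
[x, y] = (1, 2)                          # the left side can be written as a list too
(name, (row, col)) = ("cell", (3, 4))    # nested targets unpack nested values

print(first, second)   # 10 20
print(x, y)            # 1 2
print(name, row, col)  # cell 3 4
```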

In [8]:
a, b = 1, 2  # tuple unpacking
a, b = b, a  # this is a safe operation in Python. You don't need a temporary third variable.
print(a, b)

2 1


In [9]:
a = 1, 2, 3  # no problem. no tuple unpacking is happening here. Just a tuple (without needing the extra parentheses)
print(a)

(1, 2, 3)


In [10]:
a, b = 1, 2, 3  # failed tuple unpacking due to "imbalance"

ValueError: too many values to unpack (expected 2)

In [11]:
a, b, c = 1, 2  # also causes problems due to "imbalance"

ValueError: not enough values to unpack (expected 3, got 2)

As a practical example, the Fibonacci sequence is often implemented with recursion. It can also be implemented iteratively with tuple unpacking.

In [12]:
def fibonacci(n):
    n_previous, n_current = 0, 1
    for iteration in range(n - 1):
        n_previous, n_current = n_current, n_previous + n_current # tuple unpacking
    return n_current

fibonacci(10)

55

#### Multiple Assignment
Also, you don't have to assign 1 object to only 1 variable name. You can do multiple assignment--1 object bound to multiple variable names. The right-most expression is evaluated first and then bound to the targets from left to right.

In [13]:
a = b = 9000 + 1  # bind 9001 to `a` first, then `b`
print(a, b)
print(a is b)

# logically equal to: 
# 9000 + 1 -> 9001
# a = 9001
# b = a

9001 9001
True
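
One consequence worth sketching: because multiple assignment binds 1 object to several names, using it with a mutable object means every name sees every mutation. The variable names below are only for illustration.

```python
a = b = []        # one list, two name tags
a.append(1)
print(b)          # [1]: `b` sees the change because it is the same object
print(a is b)     # True

c, d = [], []     # tuple unpacking of two separate lists
c.append(1)
print(d)          # []
print(c is d)     # False
```

If you want independent lists, create them separately as in the second half.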


#### Multiple Assignment + Tuple Unpacking
Use both tricks at the same time. You get these fun puzzles.

In [14]:
# The values for a and b are swapped and then swapped back. Basically, nothing happens.
a, b = 1, 2

(b, a) = (a, b) = (a, b)
print(a, b)

# logically equal to:
# b, a = a, b       # the left most = the right most
# a, b = b, a       # the middle = the left most

1 2


To prove that multiple assignment happens left to right, I use a `property` attribute.

In [15]:
class SillyAssignment:
    def __init__(self, value):
        self.id = value
        self._gotit = None
        self._counter = 0
    
    @property
    def gotit(self):
        return self._gotit
        
    @gotit.setter
    def gotit(self, value):
        self._counter += 1
        print("Setter {} to {}; set {} time(s)".format(self.id, value, self._counter))
        self._gotit = value

s1 = SillyAssignment("s1")
s2 = SillyAssignment("s2")
a, b = 1, 2

In [16]:
s2.gotit, s1.gotit = s1.gotit, s2.gotit = (a, b)
# logically equal to:
# s2.gotit, s1.gotit = a, b                    # the left most = the right most
# s1.gotit, s2.gotit = s2.gotit, s1.gotit      # the middle = the left most

Setter s2 to 1; set 1 time(s)
Setter s1 to 2; set 1 time(s)
Setter s1 to 1; set 2 time(s)
Setter s2 to 2; set 2 time(s)


Magic trick: Any time you see a magic trick ✨, it's just for fun. Don't think about it too hard--unless you want to. 😃  
A recursive list: instead of a recursive function, perhaps this is a recursive version of a data structure.

In [17]:
a = a[0] = [None]

print(a)
print(a is a[0] is a[0][0])
print(id(a), id(a[0]), id(a[0][0]))

[[...]]
True
2600551901248 2600551901248 2600551901248


Another magic trick ✨: twisting 2 lists together like a Möbius strip

In [18]:
a, b = b[0], a[0] =  [None], [None]

print(a, b)
print(a[0] is b, b[0] is a)
print(a is a[0][0])
print(id(a), id(a[0][0]))

[[[...]]] [[[...]]]
True True
True
2600551884608 2600551884608


#### Accidentally Overwriting a Built-in
Here's a common ~~n00b~~ beginner mistake. There's nothing special about built-in types: you can "overwrite" them. And here's how you undo it.

In [19]:
list = list(range(10))  # I just lost the ability to create lists
list

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

How do I get `list()` function back?  
Note: Technically, list() is a class constructor, not a function.

In [20]:
# Option 1: get the object's type
list, my_list = type(list), list

print(my_list)
print(list(range(5)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4]


In [21]:
# Option 2: go into `builtins` and extract the `list` function. 
# builtins will be covered more in depth in the variable scoping section below
import builtins

list = list(range(10))  # I just lost the ability to create lists
print(list)

list = builtins.list
print(list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
<class 'list'>


In [22]:
# Option 3: secret cheat: delete the global variable `list`, so you can access the builtin `list` function again: LEGB rule
list = list(range(10))  # I just lost the ability to create lists
print(list)

del list
print(list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
<class 'list'>


## Multiple Comparison
Similar to multiple assignment, comparisons can also be chained together. The motivation is that some function calls can take a long time, so you don't want to call them twice. The chain `a < b < c` is equivalent to `(a < b) and (b < c)`, except that the middle expression `b` is evaluated only once.

In [1]:
print(1 < 2 < 3)  # equivalent to (1 < 2) and (2 < 3)
print(1 < 3 > 2)  # equivalent to (1 < 3) and (3 > 2)

In [2]:
import time

def slow_call(value, nap_time):
    time.sleep(nap_time)
    return value

In [3]:
%%time
(1 < slow_call(2, 5)) and (slow_call(2, 5) < 3)

Wall time: 10 s


True

In [4]:
%%time
1 < slow_call(2, 5) < 3  # takes half the time!

Wall time: 5.01 s


True
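
If you don't want to wait on `time.sleep`, the same point can be sketched by counting calls instead of timing them. `tracked` and `calls` are hypothetical names introduced only for this example.

```python
calls = []


def tracked(value):
    """Hypothetical helper: record each call, then return the value unchanged."""
    calls.append(value)
    return value


chained = 1 < tracked(2) < 3
calls_chained = list(calls)
print(chained, calls_chained)            # True [2]: the middle expression ran once

calls.clear()
spelled_out = (1 < tracked(2)) and (tracked(2) < 3)
calls_spelled_out = list(calls)
print(spelled_out, calls_spelled_out)    # True [2, 2]: written out by hand, it runs twice

calls.clear()
stopped_early = 5 < tracked(2) < tracked(3)
calls_stopped_early = list(calls)
print(stopped_early, calls_stopped_early)  # False [2]: the chain stops once 5 < 2 fails
```

The last case shows that a chain also stops early, just like `and`.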

## Short Circuiting: `and` + `or`
`or` and `and` have a cool feature called <i>short circuiting</i>: they do not have to evaluate all expressions. Short circuiting is similar to lazy evaluation in that not all expressions are executed.

`or` evaluates the left-hand side first, and if the left-hand expression is logically `True`, then `or` will not evaluate the right-hand argument. Only if the left-hand side is logically `False` is the right-hand side evaluated. Hence, `or` evaluates expressions from left to right, stops at the first logically `True` expression, and returns it. If all expressions are logically `False`, then you get the last expression.
* Logically `False` objects: `False`, `None`, 0, empty string `""`, empty tuple `()`, empty list `[]`, empty dict `{}`, empty set `set()`
* Logically `True` objects: pretty much anything else. Even `(0,)` or `[None]` is truthy because they are non-empty containers.

The function `any()` is very similar to `or`. `any()` returns `True` or `False` by iterating your object and checking the truthiness of every element. If any of the elements is logically `True`, then it will return `True`. Otherwise, return `False` since all elements are logically `False`. Note, if you iterate over an empty container, then `any()` will return `False`.
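
Here is a small sketch showing that `any()` stops early just like `or` does. The generator `noisy` and the list `seen` are made-up names used only to record which elements were actually inspected.

```python
seen = []


def noisy(values):
    """Hypothetical generator: remember every element that gets inspected."""
    for value in values:
        seen.append(value)
        yield value


found = any(noisy([0, "", 3, 4, 5]))
print(found)     # True
print(seen)      # [0, '', 3]: 4 and 5 were never looked at

fallback = "" or None or "default"   # `or` hands back the first truthy value
print(fallback)  # default
```

The last line is the common "use a default if the value is empty" idiom, which works because `or` returns the object itself rather than `True`.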

In [1]:
# immediately short circuit
print(5 or False)
print(5 or 6 or 7 or 8)
print(5 or print("HI"))  # print() is not executed

5
5
5


In [2]:
# no short circuiting happens since all elements are False
print((1 - 1) or [] or print("BYE"))  # left hand side is evaluated, middle is evaluated, right side is evaluated since print() returns None

BYE
None


In [3]:
print(any(range(10)))
print(any([False, None, 0, "", (), [], {}, set()]))
print(any([]))

True
False
False


`and` is the opposite of `or` because `and` stops at the first logically `False` argument and returns it. If there are no falsy (real word!) expressions, then return the last argument (which is necessarily truthy). Just like `or`, `and` evaluates expressions left to right.
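
One way to put this to work is as a guard: the right-hand side only runs when the left-hand side is truthy. The function `safe_ratio` below is a hypothetical helper, not something from a library.

```python
def safe_ratio(numerator, denominator):
    """Hypothetical helper: return the ratio, or the falsy denominator itself."""
    # `and` stops at a falsy denominator, so the division never runs with 0
    return denominator and numerator / denominator


print(safe_ratio(10, 4))   # 2.5
print(safe_ratio(10, 0))   # 0: the division was skipped
```

Note that the second call returns the `0` that was passed in, not `False`, because `and` returns the object it stopped at.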

In [4]:
print(1 and 2 and 3 and 0 and 4 and 5)  # stops at the falsy element
print(1 and 2 and 3 and 4 and 5)  # all truthy so get the last truthy element
print(False and 0 and "")  # stops at the first one
print(0 and "" and False)  # stops at the first one

0
5
False
0


Short circuiting comes in handy if you are evaluating multiple conditions. You want to put the expression you think will short circuit more often or quicker on the left side.

In [5]:
import time

def slow_call(value, nap_time):
    time.sleep(nap_time)
    return value

In [6]:
%%time
if slow_call(True, 1) and slow_call(True, 2):  # has to wait the full 3 seconds
    print("HI")  # executed

HI
Wall time: 3.01 s


In [7]:
%%time
if slow_call(True, 1) or slow_call(True, 2):  # since the first expression is truthy, then skip the second one
    print("HI")  # executed

HI
Wall time: 1 s


In [8]:
%%time
if slow_call(False, 1) or slow_call(False, 2):  # has to wait the full 3 seconds
    print("HI")  # not executed

Wall time: 3.01 s


In [9]:
%%time
if slow_call(False, 1) and slow_call(False, 2):  # since first expression is falsy, then skip the second one
    print("HI")  # not executed

Wall time: 1 s


In [10]:
%%time
if slow_call(True, 1) or slow_call(True, 2) or slow_call(True, 3) or slow_call(True, 4):  # still only takes 1 second
    print("HI")

HI
Wall time: 1.01 s


## What's My Name? Variable Scoping
When you use multiple variables with the same name, how do you know which variable you get? For example, you have a globally defined variable `x`, an `x` in a function, and an `x` in some imported module. Which one do you get?
* `Namespaces`: A namespace is a container where names are mapped to objects. Hence, the same variable name can be associated with different values/objects in different namespaces.
* `Scope`: A scope defines the hierarchical order in which the namespaces are searched to resolve a name to an object. Python follows the LEGB rule: `Local, Enclosing, Global, Built-in`. If a variable is not found in the local namespace, Python looks in the outer namespaces. If it is not found at all, Python raises a `NameError` exception. Variables with the same name in the same namespace will overwrite each other--only 1 can exist at a time.

The vocabulary words `namespace` and `scope` are often used interchangeably, but they are not the same. But don't worry about it, as the distinction isn't very important. What's more important is understanding how they work.

<p align="center"><img src="images/LEGB_Variable_Scoping.png"></p>
<b>Local</b>: inside a function, look to see if variable is defined. If it is there, use that one.

In [1]:
a = 1  # this is a global variable
def func():
    a = 2  # this is the local variable
    return a
print(func())
print(a)

2
1


<b>Enclosing</b>: if local variable not found inside inner function, look for the variable in the outer function.

In [2]:
a = 1  # this is a global variable
def func1():
    a = 2  # get this enclosing variable
    def func2():
        return a
    return func2()
print(func1())
print(a)

2
1


In [3]:
a = 1  # this is a global variable
def func1():
    a = 2  # this is still an enclosing variable as it is not global
    def func2():
        def func3():
            return a
        return func3()
    return func2()
print(func1())
print(a)

2
1


In [4]:
# this works
def a1():
    def a2():
        print(a2_var)  # a2_var is defined AFTER the function is defined. 
        # This is late binding from closure. Look at 4_Functional_Python.ipynb to learn more

    a2_var = 2
    a2()

a1()

2


In [5]:
# but why doesn't this one work?
def a1():
    a2_var = 2
    a2()

def a2():
    print(a2_var)  # a2_var is not a local or enclosing variable with respect to a2()
    
a1()

NameError: name 'a2_var' is not defined

<b>Global</b>: variables defined at the module level. Remember, a module is just a fancy way of saying 1 `.py` file. In your function call, if variables are not defined locally or enclosing, go to the module level to find the variable.

In [6]:
a = 1 # global variable
def func1():
    def func2():
        return a
    return func2()
print(func1())
print(a)

1
1


In [7]:
# recursion uses scoping to "discover" itself
def fibonacci(n):
    if (n == 1) or (n == 2):
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)  # where is fibonacci() defined? It's a global variable

fibonacci(10)

55

In [8]:
# Magic trick: you can "twist" functions to see how LEGB really works
def func1(n):
    def func2(n):
        if n:
            print("I'm inside func2. The number is {}".format(n))
            return func1(n - 1)
        return

    if n:
        print("I'm inside func1. The number is {}".format(n))
        return func2(n - 1)
        
print(func1(5))

I'm inside func1. The number is 5
I'm inside func2. The number is 4
I'm inside func1. The number is 3
I'm inside func2. The number is 2
I'm inside func1. The number is 1
None


Global variables are not _truly_ global in the sense that variables cannot communicate between modules. If `pandas` and `sklearn` both have a variable `a`, then they cannot communicate with each other. `pandas.a` does not know about `sklearn.a`.

In [9]:
%%writefile temp1.py
a = 1

Writing temp1.py


In [10]:
%%writefile temp2.py
a = 2

Writing temp2.py


In [11]:
%%writefile temp3.py
print(a)

Writing temp3.py


In [12]:
import temp1
print(temp1.a)

1


In [13]:
import temp2
print(temp2.a)

2


In [14]:
import temp3  # fails to load

NameError: name 'a' is not defined

In [15]:
a = 1  # even if you define it now, temp3 cannot be imported since `a` in temp3 does not exist
import temp3

NameError: name 'a' is not defined

The inability of a global variable to transcend its module is not a bug; it is a feature. A variable with a boring name like `x` will almost certainly be used in multiple modules, and you don't want each module's `x` to overwrite the others. Hence, Python allows you to have variables called `x` in different modules by giving each module its own global namespace that does not interact with the others.

<b>Built-ins</b> are the things that you generally use without thinking: `sum()`, `range()`, `list`, `None`, `True`. These variables/functions are like "universal globals" in the sense that they are visible from every module.
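
Because built-ins sit at the very end of the LEGB search, any closer namespace can shadow them. A minimal sketch (the function name `shadowing` is made up for illustration):

```python
import builtins


def shadowing():
    len = lambda _: -1          # local name `len` shadows the built-in inside this function only
    return len("abc")


print(shadowing())              # -1
print(len("abc"))               # 3: outside the function, lookup falls through to the built-in
print(len is builtins.len)      # True: the name resolves to the object stored in `builtins`
```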

In [16]:
def printer():
    print("Money tree: sorry, my gold is green--the color of envy!") # where does the function print() come from? It's a built-in!

Magic trick ✨: how do you make your own "universal global" variable or your own builtin variable? How can you communicate between different modules? You can do that by hacking away at the `builtins` module. Built-in functions/variables are not special--they can be overwritten like everything else. As you can see, you can cause some chaos.

In [17]:
%%writefile temp1.py
import builtins

builtins.len = lambda _: "len() is now gone"
builtins.list = tuple
builtins.blah_blah_blah = "Ke$ha"

Overwriting temp1.py


In [18]:
%%writefile temp2.py
import temp1

print(len("HI"))  # len no longer works
print(list(range(10)))  # became a tuple
print(blah_blah_blah)  # some random variable now exists

Overwriting temp2.py


In [19]:
!python temp2.py

len() is now gone
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Ke$ha


Magic trick 2 ✨: you don't have to corrupt the `builtins` module. In fact, any module <i>behaves</i> like a singleton--there is only 1 out there. So mutating a module's variable in 1 place is instantaneously reflected wherever else that module's variable is used--even if it is referenced in another module. Not an ansible, not a teleportation machine, not magic, simply Python's <i>pedantic</i> rules. 😃

In [20]:
%%writefile module_with_universal_variable.py
a = None

Writing module_with_universal_variable.py


In [21]:
%%writefile temp1.py
import module_with_universal_variable
module_with_universal_variable.a = 42

Overwriting temp1.py


In [22]:
%%writefile temp2.py
import module_with_universal_variable
module_with_universal_variable.a = 9001

Overwriting temp2.py


In [23]:
%%writefile temp3.py
import module_with_universal_variable
print(module_with_universal_variable.a)

import temp1
print(module_with_universal_variable.a)

import temp2
print(module_with_universal_variable.a)

Overwriting temp3.py


In [24]:
!python temp3.py

None
42
9001


In [25]:
!rm temp1.py temp2.py temp3.py module_with_universal_variable.py

Both approaches (messing with the `builtins` module or your own module) are just for funsies ✨. Don't do it in real life--or a kitten will die every time you think about it. There are very few circumstances where you will want a universally accessible and mutable variable.

#### There Can Only Be One!
If there are 2 variables of the same name in the same namespace, then (like Highlander) there can only be one! The reason is that namespaces are themselves implemented with a dictionary. Hence, the most recent variable overwrites the previous one.
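
The "namespace is a dictionary" claim can be sketched with a throwaway module object. `demo` and `replacement` are hypothetical names; `types.ModuleType` is used here only to get an empty namespace to poke at.

```python
import types

demo = types.ModuleType("demo")     # hypothetical empty module, used only as a namespace
demo.a = 1
print(vars(demo)["a"])              # 1: the variable lives in the module's dict


def replacement():
    return "I replaced the integer"


demo.a = replacement                # same key, so the old value is overwritten
print(vars(demo)["a"] is replacement)   # True
print(list(vars(demo)).count("a"))      # 1: a dict holds each key only once
```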

In [26]:
a = 1  # global variable will be overwritten by the following function
def a():  # this function overwrites/replaces the previous assignment. The value of 1 is totally lost/gone.
    def b():
        return a # refers to the outer function and thus not the variable
    return b()
print(a())
print(a is a() is a()() is a()()())
print(a, a(), a()(), a()()())

<function a at 0x00000188B314D708>
True
<function a at 0x00000188B314D708> <function a at 0x00000188B314D708> <function a at 0x00000188B314D708> <function a at 0x00000188B314D708>


## (Re)assignment within inner namespaces: `global` and `nonlocal`
As a general rule, code in an inner namespace cannot rebind (reassign) a variable that lives in an outer namespace. However, you can mutate a mutable object from an inner namespace (for example, appending to a list). That is mutation, not reassignment of the outer variable.  
There is a way to get around this rule: by using `global` or `nonlocal` (fun fact: these are the only 2 declarations in Python...until Python 3.6+ where you get variable annotation).
* `global` tells Python you are using a global variable, so inside a function call, you can permanently change a global variable.
* `nonlocal` is used to assign/reassign enclosing variables.

Caveat: I have never used `global` or `nonlocal` in real code, so use them sparingly, as they violate the Principle of Least Surprise.
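
To see why these declarations exist at all, here is a minimal sketch of what happens without them. The names `make_counters`, `broken`, and `working` are made up for illustration. Assigning to a name inside a function makes that name local, so `count += 1` tries to read a local variable that has no value yet.

```python
def make_counters():
    count = 0                   # enclosing variable

    def broken():
        count += 1              # assignment makes `count` local here, and it has no value yet

    def working():
        nonlocal count          # rebind the enclosing `count` instead of creating a local
        count += 1
        return count

    return broken, working


broken, working = make_counters()

try:
    broken()
    outcome = "no error"
except UnboundLocalError as error:
    outcome = type(error).__name__
print(outcome)                  # UnboundLocalError

print(working(), working())     # 1 2
```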

In [1]:
a = []
def func():
    a.append(None)
print(a)
func()
print(a)  # a is mutated

[]
[None]


In [2]:
a = []  # global
def inner():
    a = []  # local
    a.append(None)  # refers to the local variable
print(a)
inner()
print(a)  # a is not mutated or assigned. Basically nothing has happened

[]
[]


In [3]:
a = 1
def reassign():
    global a
    a = 2

print(a)
reassign()
print(a)

1
2


In [4]:
a = 1  # re-assign this variable
def reassign():
    a = 2
    def func():
        global a
        print(a, "I am inside nested function")
        a = 3
    func()
    print(a, "I am the enclosing variable")  # nothing changed here
print(a, "before global reassignment")
reassign()
print(a, "after global reassignment")

1 before global reassignment
1 I am inside nested function
2 I am the enclosing variable
3 after global reassignment


In [5]:
def global_assign():
    global xyz
    xyz = 1
    
try:
    del xyz
    print("deleted xyz")
except NameError:
    print("xyz doesn't exist")

try:
    print(xyz)
except NameError:
    print("xyz doesn't exist")

global_assign()
print(xyz)

xyz doesn't exist
xyz doesn't exist
1


In [6]:
a = 1
def func1():
    a = 2  # reassign this enclosing variable
    def func2():
        nonlocal a
        a = 3
    print(a, "enclosing variable (relative to func2()) before reassignment")
    func2()
    print(a, "enclosing variable (relative to func2()) after reassignment")
print(a, "global variable unchanged")
func1()

1 global variable unchanged
2 enclosing variable (relative to func2()) before reassignment
3 enclosing variable (relative to func2()) after reassignment


## Executive Power: `exec` and `eval`
These 2 functions are very powerful. As Uncle Ben once said: with great power comes great responsibility. In practice, you will rarely have to use these.

`exec()` takes a string and executes it as if it were Python code. You should never run arbitrary code from untrusted sources, since it can delete your entire hard drive. `exec()` is a function in Python 3 (that always returns `None`) whereas `exec` is a statement in Python 2 (so there is no result to save). This is the same change that `print` went through: it is a function in Python 3 and a statement in Python 2.
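
A small sketch of one way to keep `exec()` from scribbling over your own variables: pass it a dictionary to use as its namespace. The name `namespace` is made up for illustration, and this is only about tidiness; it is not a security sandbox.

```python
namespace = {}                              # hypothetical dict that receives the executed code's names
result = exec("x = 1; x += 2", namespace)

print(result)                               # None: exec() never returns a value
print(namespace["x"])                       # 3: the variable landed in the dict

# Python also adds a `__builtins__` entry to the dict, so filter out dunder keys
user_names = sorted(key for key in namespace if not key.startswith("__"))
print(user_names)                           # ['x']
```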

In [1]:
exec("print('HI')")

exec("import sys")  # can mutate the environment
print(sys.version)  # sys is now imported

lister = []
exec("lister.append(42)")  # can mutate the environment
print(lister)

exec("x = 1; x += 2; print(x)")  # can have multiple statements

print(exec(""))  # exec returns None

exec("print({} + {})".format(1, 2))  # you can dynamically generate a string

HI
3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
[42]
3
None
3


In [2]:
print(print("HI"))  # the inner print() returns None. This syntax cannot be used in Python 2

HI
None


`eval()` is a bit safer: you are allowed 1 expression and no statements. Because it evaluates an expression, you can save the resulting value. It still should not be given untrusted input.

In [3]:
x = eval("1 + 2")
print(x)

3


In [4]:
print(eval("(1, 2) + (3, 4)"))  # no statement, the expression must return something

(1, 2, 3, 4)


In [5]:
eval("1 + 2; 3 + 4")  # only 1 expression

SyntaxError: invalid syntax (<string>, line 1)

In [6]:
eval("x = 10")  # no statement, the expression must return something

SyntaxError: invalid syntax (<string>, line 1)

There's an even safer option: `ast.literal_eval`, which only evaluates 1 literal object. It cannot understand variables.
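
As a sketch of where `ast.literal_eval` is handy, the snippet below parses a nested literal from a string and shows that a function call is rejected. The strings and the names `config` and `rejected` are made up for illustration.

```python
import ast

config = ast.literal_eval("{'name': 'demo', 'sizes': [1, 2, 3], 'ok': True}")
print(config["sizes"])        # [1, 2, 3]

rejected = False
try:
    ast.literal_eval("len([1, 2, 3])")   # a function call is not a literal
except ValueError:
    rejected = True
print(rejected)               # True
```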

In [7]:
import ast

print(ast.literal_eval("1"), ast.literal_eval("None"), ast.literal_eval("()"), ast.literal_eval("True"))

1 None () True


In [8]:
print(eval("[] + []"))
print(ast.literal_eval("[] + []"))  # not 1 element

[]


ValueError: malformed node or string: <_ast.List object at 0x00000293C1AFE608>

In [9]:
x = 1
print(eval("x"))
print(ast.literal_eval("x"))  # not a literal object, is a variable

1


ValueError: malformed node or string: <_ast.Name object at 0x00000293C19E10C8>

## Don't Use Mutable Default Arguments
Defining a function (or importing a module that defines it) immediately evaluates the function signature. Hence, default arguments are evaluated immediately and only 1 time; they are not re-evaluated on every function call.

In [1]:
def silly_func(element=print("this feels weird")):  # the print is immediate
    print(element)  # the return of a print function is None

this feels weird


In [2]:
silly_func()

None


In [3]:
def extender(sequence, lst=[]):
    lst += sequence  # equivalent to lst.extend(sequence)
    return lst

lst = ["hi", "thanks"]
print(extender(["bye"], lst))  # looks normal

print(extender(["bye"]))  # looks normal
print(extender(["bye"]))  # something is off
print(extender(["bye"]))  # something is definitely wrong

['hi', 'thanks', 'bye']
['bye']
['bye', 'bye']
['bye', 'bye', 'bye']


In [4]:
the_function_argument = extender([None])  # because the default list is returned, we can keep a reference to it
for i in range(10):
    the_function_argument.append(i)

print(extender("bye"))

['bye', 'bye', 'bye', None, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 'b', 'y', 'e']


In [5]:
extender.__defaults__  # this is how you can inspect the function's default argument

(['bye', 'bye', 'bye', None, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 'b', 'y', 'e'],)

This problem does not occur with immutable default arguments.

In [6]:
def string_doubler(string="abc"):
    string += string
    return string

print(string_doubler())
print(string_doubler())  # totally safe
print(string_doubler.__defaults__)

abcabc
abcabc
('abc',)


In [7]:
def tuple_doubler(tup=("x", "y", "z")):
    tup += tup
    return tup

print(tuple_doubler())
print(tuple_doubler())  # totally safe
print(tuple_doubler.__defaults__)

('x', 'y', 'z', 'x', 'y', 'z')
('x', 'y', 'z', 'x', 'y', 'z')
(('x', 'y', 'z'),)


To avoid this problem with mutable default arguments, use an immutable default instead. The usual solution is to use `None` and, in the 1st lines of the function body, replace it with the mutable object that you really want.

In [8]:
def extender_corrected(sequence, lst=None):
    lst = lst if lst is not None else []  # this line is run every function call, so the list is new every time
    lst += sequence
    return lst

lst = ["hi", "thanks"]
print(extender_corrected(["bye"], lst))  # looks normal

print(extender_corrected(["bye"]))  # looks normal
print(extender_corrected(["bye"]))  # fixed
print(extender_corrected(["bye"]))  # fixed

print(extender_corrected.__defaults__)

['hi', 'thanks', 'bye']
['bye']
['bye']
['bye']
(None,)


In [9]:
def horrible_default_dict_implementation(
    key,
    default_value="you get the leftovers",  # string is immutable so safe to put as default argument
    dct=None,  # dict is mutable so not safe to put as default argument
):
    dct = dct if dct is not None else {}  # this line is run every function call, so the dict is new every time
    if key not in dct:
        dct[key] = default_value
    return dct

dct = {}
print(horrible_default_dict_implementation("late?", dct=dct))  # looks normal

print(horrible_default_dict_implementation("hungry?"))  # looks normal
print(horrible_default_dict_implementation("famished?"))  # still works
print(horrible_default_dict_implementation("Uncle Bob?"))  # still works

print(horrible_default_dict_implementation.__defaults__)

{'late?': 'you get the leftovers'}
{'hungry?': 'you get the leftovers'}
{'famished?': 'you get the leftovers'}
{'Uncle Bob?': 'you get the leftovers'}
('you get the leftovers', None)


**Fun Fact Time!**
<p align="center"><img src="images/party_emoji.jpg" width=70></p>
As mentioned, the function definition (or importing a module/function) will immediately evaluate the function signature. However, the function body should NOT be executed. In reality, it is *kinda* executed. Python is a peeping tom! Python has something called peephole optimization that tries to reduce instructions into simpler instructions that run faster. This is implementation specific, so don't rely on it too much.

In [10]:
def silly_func():
    string = "abc" + "xyz"  # peephole optimized
    return string

def functionally_equivalent_silly_func():
    string1 = "abc"
    string2 = "xyz"
    string = string1 + string2  # not peephole optimized
    return string

silly_func() == functionally_equivalent_silly_func()

True

In [11]:
import dis

dis.dis(silly_func)  # simplified instruction

  2           0 LOAD_CONST               1 ('abcxyz')
              2 STORE_FAST               0 (string)

  3           4 LOAD_FAST                0 (string)
              6 RETURN_VALUE


In [12]:
dis.dis(functionally_equivalent_silly_func)  # unsimplified instruction

  6           0 LOAD_CONST               1 ('abc')
              2 STORE_FAST               0 (string1)

  7           4 LOAD_CONST               2 ('xyz')
              6 STORE_FAST               1 (string2)

  8           8 LOAD_FAST                0 (string1)
             10 LOAD_FAST                1 (string2)
             12 BINARY_ADD
             14 STORE_FAST               2 (string)

  9          16 LOAD_FAST                2 (string)
             18 RETURN_VALUE


Also, at function definition time (or when importing a module/function), Python's parser will look into the function body to determine whether it is syntactically valid Python code, i.e., something on both sides of an equals sign, valid variable names, no missing parentheses, etc. If things are not kosher, then Python will raise `SyntaxError`.

In [13]:
def invalid_func():
    1 /  # Python knows this is just not valid Python code

SyntaxError: invalid syntax (<ipython-input-13-cef3b0c3f188>, line 2)

In [14]:
def valid_func():
    1 / 0  # however Python is fine with this at function definition time; error will be raised during runtime
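
The two cells above can be sketched in one runnable piece by using the built-in `compile()` on source strings, so the syntax error can be caught instead of stopping the cell. The strings, the `"<sketch>"` filename label, and the `namespace` dict are made up for illustration.

```python
bad_source = "def invalid_func():\n    1 /\n"
good_source = "def valid_func():\n    1 / 0\n"

try:
    compile(bad_source, "<sketch>", "exec")     # parsing alone is enough to fail
    parse_error = None
except SyntaxError as error:
    parse_error = type(error).__name__
print(parse_error)                              # SyntaxError

namespace = {}
exec(compile(good_source, "<sketch>", "exec"), namespace)   # defining the function is fine

try:
    namespace["valid_func"]()                   # the problem only shows up when the body runs
    runtime_error = None
except ZeroDivisionError as error:
    runtime_error = type(error).__name__
print(runtime_error)                            # ZeroDivisionError
```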

## Don't Fear Importing a Module Multiple Times
Some people worry about importing a module multiple times. They think there's a time cost for each import. The truth is that only the 1st import takes time, because Python caches every loaded module in `sys.modules`. You can import it thousands of times, and it will take almost no time. In fact, it's actually hard to get rid of an import after it's imported.
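
Here is a minimal sketch of where the import "stays alive". It uses the standard-library `json` module as a stand-in for `pandas` (an assumption made so the example runs anywhere), and `cached` is a made-up name.

```python
import sys
import json                      # stand-in for any module

cached = sys.modules["json"]     # every loaded module is stored in this dictionary
print(cached is json)            # True

del json                         # removes only the name `json` from this namespace
print("json" in sys.modules)     # True: the module object itself is still cached

import json                      # a dictionary lookup, not a fresh load
print(json is cached)            # True
```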

In [1]:
%time import pandas as pd
%time import pandas as pd  # fast as lightning

Wall time: 1.98 s
Wall time: 0 ns


In [2]:
%%time
for _ in range(10000):
    import pandas as pd

Wall time: 0 ns


In [3]:
del pd

In [4]:
pd  # you think it's gone

NameError: name 'pd' is not defined

In [5]:
%time import pandas as pd  # but it's still there!
# In the previous line, the import stays alive, but just not accessible as `pd`

Wall time: 0 ns


## (Extra) Fun Facts/Pedantic Details About 🐍
Here are some things you can keep in your mind just for fun ✨. They are not very important to know (unless you want to argue with your friends about <i>pedantic</i> details like a <i>real</i> computer scientist! Ya know, like a pro...).

#### Specification vs Implementation
Often when we think of Python, we are really thinking of CPython. "Python" is really a specification--a blueprint/whitepaper for what the syntax and features should be. A specification has no code; it is an idea. An implementation is the actual code that builds and brings the specification into real life. There are multiple implementations of Python:
* CPython: the most common one that most people use, implemented in C
* PyPy: Python implemented in (restricted) Python. Its logo is a snake eating itself!
* Jython: implemented in Java, so can interoperate with Java classes and libraries
* IronPython: implemented in C#, so can interoperate with C#/CLR/.NET classes and libraries
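
If you are curious which implementation is running your code, the standard library can tell you. A small sketch (the printed values depend on your installation, so no output is shown):

```python
import platform
import sys

# Name of the implementation running this code, e.g. "CPython" or "PyPy".
implementation = platform.python_implementation()

# sys.implementation carries similar information in lowercase form.
print(implementation, sys.implementation.name, platform.python_version())
```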

#### Interpreted vs Compiled
We often hear that Python is an interpreted language. What does that mean?  
An interpreted language executes the source code 1 line at a time. A compiled language "compiles" the entire source code file into machine code that is native to the hardware--hence your original code is translated into something non-human-readable but faster for the machine to process. Hence compiled languages like C tend to be faster than interpreted languages. Also in compiled languages, there are 2 types of errors: <i>compile-time errors</i> and <i>runtime errors</i>. The compilation step can discover compile-time errors, which are invalid operations such as adding a list to a dictionary or syntax errors such as forgetting a closing parenthesis. When it finds one, the compilation step will not generate the machine code. A runtime error is discovered while the machine code is running, such as dividing by 0. In interpreted implementations (such as CPython), we primarily have runtime errors (though we do get syntax errors from parsing a function/class/module). Most errors/exceptions in CPython are runtime errors: adding a list to a dictionary and dividing by 0 are both discovered at runtime.

Do note that a language's <i>specification</i> is neither compiled nor interpreted. It is the <i>implementation</i> that is compiled or interpreted. The main implementation of Python is CPython, which is interpreted: your source code is first compiled to bytecode, and the bytecode is then interpreted one instruction at a time by the Python Virtual Machine (often simply referred to as the Python interpreter). If you want to see a function's bytecode, you can use Python's disassembler module called `dis` to see what steps your function actually consists of.
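
Besides printing a listing with `dis.dis()`, the `dis` module can hand you the instructions as objects. The sketch below uses a made-up `add` function; the exact instruction names vary between CPython versions, so no output is shown.

```python
import dis

def add(a, b):  # hypothetical function, used only for illustration
    return a + b

# Each Instruction is one bytecode step executed by the virtual machine.
# A single line of source usually compiles to several of them.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```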

#### Dynamically Typed vs Statically Typed
Python is a dynamically typed language. What does that mean?  
Early programming languages tended to be statically typed. This means you have to declare a variable along with its type, so that the variable's type is known at all times and type errors can be determined at compile time. For example, in C you would write `int c = 42;` and that variable would stay an integer. `c` cannot be 42.0 or "42" in the scope of that function (though another function with a variable `c` can be another type). Hence, the code becomes more verbose.  
Dynamically typed languages allow a variable to change types. For example, in Python you can write `x = 42` and in the next line `x = "42"`--this is not possible in a statically typed language. In Python, you don't declare a variable's type, so code tends to be more succinct. However, type errors are discovered at runtime. For example, can you add 10 to `x`? Python doesn't know until it runs that line of code.
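
The `x = 42` then `x = "42"` scenario can be played out directly. A minimal sketch (variable names are only for illustration):

```python
x = 42
type_before = type(x).__name__
x = "42"  # rebinding the same name to a str is perfectly legal
type_after = type(x).__name__

try:
    x + 10  # the type error only surfaces when this line actually runs
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(type_before, type_after, error_name)  # → int str TypeError
```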

One big advantage of interpreted languages is that you get an <i>interactive prompt</i> where you can run things 1 line at a time. This read-eval-print loop is called REPL and is very handy for quick-and-dirty code or debugging.  
Python is a duck typed language, which is a special case of a dynamically typed language. Basically, you can call a method on an object without knowing its class. For example, you can run `x.pop()` and at runtime it will either run successfully or give you an error, because Python doesn't know until runtime if `x` is an instance of a class that has defined a `.pop()` method.
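
Here is the `.pop()` idea as a small sketch. The helper `take_one` is hypothetical; the point is that it never checks the type of its argument.

```python
def take_one(container):
    # No type check: anything with a no-argument .pop() method will do.
    return container.pop()

from_list = take_one([1, 2, 3])  # list.pop() removes and returns the last item
from_set = take_one({"only"})    # set.pop() removes and returns an arbitrary item

try:
    take_one(42)  # int has no .pop() method, discovered only at runtime
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

print(from_list, from_set, error_name)  # → 3 only AttributeError
```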

#### Strongly Typed vs Weakly Typed
Python is a strongly typed language. What does that mean?  
A strongly typed language will not automatically convert an object's type. For example, in Python, you cannot do `"2" + 8`. Why would you ever want to do that? I don't know, but you CAN do that in a weakly typed language, which will automatically convert 1 type to another. In JavaScript, you can do the following horrendous things. <i>Wat</i>?
* `[] + {}` -> `[object Object]`
* `{} + []` -> 0
* `"wat" + 1` -> "wat1"
* `"wat" - 1` -> `NaN`
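
For contrast, here is what Python does with `"2" + 8`, plus the explicit conversions you would write instead. A minimal sketch:

```python
try:
    "2" + 8  # Python refuses to guess which type you meant
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

as_number = int("2") + 8  # explicit conversion to int
as_text = "2" + str(8)    # explicit conversion to str

print(error_name, as_number, as_text)  # → TypeError 10 28
```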

#### Pass by Object
Often there is a question of whether Python is `pass by value` or `pass by reference`. The answer is neither: Python is `pass by object`.
* <i>Pass by value</i> is when in a function call, the variable is copied over, so there are 2 things with the same data. R is pass by value--that means when you pass a dataframe into a function, the dataframe is copied. If you mutate the dataframe inside an R function, nothing will happen to the original dataframe. 
* <i>Pass by reference</i> is that you pass in a memory address (a pointer) as the argument to the function, so if you change the variable <i>inside</i> the function call, then that variable is also changed <i>outside</i> of the function.
* Python is <i>pass by object</i>. That means Python <i>appears</i> to be pass by value (when the variable does not change) and pass by reference (when the variable does change); it is dependent on the mutability of the object. When you pass in an immutable object (tuple, string, numeric [float, int, bool]), the variable outside the function call will not be changed by what happens inside the function call, because all the function can do is rebind its own local name to a new object. When you pass in a mutable object (list, set, dictionary) and the function mutates it in place, the change is also visible outside of the function.

Remember, Python's assignment (=) NEVER copies data, not even in a function's argument assignment. The assignment just creates another name tag pointing to the same object.

In [1]:
# immutable object remains unchanged
def concatenate(string):
    string = "Hello " + string  # this new `string` variable is no longer bound to the same object. 
    print(id(string), string)  # You have moved the "name tag" to a different object.

string = "ML Study Group"
print(id(string), string)
concatenate(string)
print(id(string), string)

2193488123568 ML Study Group
2193488362032 Hello ML Study Group
2193488123568 ML Study Group


In [2]:
# mutable object changed
def appender(lst):
    lst.append(None)
    
a = []
print(a)
appender(a)
print(a)

[]
[None]
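
Mutability is only half the story: what matters is whether the function <i>mutates</i> the object or merely <i>rebinds</i> its local name. The sketch below passes the same mutable list to two hypothetical functions, `rebind` and `mutate`.

```python
def rebind(lst):
    lst = lst + [None]  # builds a NEW list and moves the local name tag to it

def mutate(lst):
    lst += [None]  # extends the SAME list in place, so the caller sees it

a = []
rebind(a)
after_rebind = list(a)  # snapshot of `a` after rebind()
mutate(a)
after_mutate = list(a)  # snapshot of `a` after mutate()

print(after_rebind, after_mutate)  # → [] [None]
```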


# Not (Really) One Way to Do It
In "The Zen of Python", the 13th line is: "There should be one-- and preferably only one --obvious way to do it." This is not actually true. What it means to be "Pythonic" or "idiomatic Python" changes over time.

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


This article (https://treyhunner.com/2015/11/counting-things-in-python/) shows how the idiomatic way of counting things changed across different versions of Python.
* Python 1.4 (look before you leap): use `if` statement to check existence of key in a dict
* Python 1.4 (easier to ask forgiveness than permission): use `try/except` block
* Python 1.5: use `.get()` method
* Python 2.0: use `.setdefault()` method
* Python 2.3: use `.fromkeys()` method
* Python 2.4: use `set()` and list comprehension
* Python 2.5: use `defaultdict()`
* Python 2.7: use `Counter()`

What it means to be Pythonic changes over time. Perhaps, the line should say: "There should be one-- and preferably only one --obvious way to do it-- in a given point in time or version of Python."
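
To make a few of those styles concrete, here is a sketch that counts the same made-up list of colors four ways. The sample data and variable names are illustrative and not taken from the article.

```python
from collections import Counter, defaultdict

colors = ["brown", "red", "green", "red", "brown", "red"]  # made-up sample data

# Look before you leap: check for the key first.
counts_lbyl = {}
for color in colors:
    if color not in counts_lbyl:
        counts_lbyl[color] = 0
    counts_lbyl[color] += 1

# .get() with a default value.
counts_get = {}
for color in colors:
    counts_get[color] = counts_get.get(color, 0) + 1

# defaultdict fills in missing keys automatically.
counts_dd = defaultdict(int)
for color in colors:
    counts_dd[color] += 1

# Counter does the whole job in one call.
counts_counter = Counter(colors)

print(counts_lbyl == counts_get == dict(counts_dd) == dict(counts_counter))  # → True
```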

## New Features: Looking into the Future
As Python keeps improving with new versions, there will be new and better ways to do things. The future slowly unfurls to reveal itself.
* list comprehensions -> generator expressions -> dictionary comprehensions -> asynchronous comprehensions
* function annotation -> `mypy` module -> `typing` module

Python 3 improvements over Python 2:
* superior framework for asynchronous programming
* clear delineation of bytes and str objects
* superior memory management (by releasing unused memory)
* `print()` as function (instead of a statement)
* true division (instead of integer division). You can still do integer division using `//`
* list comprehensions no longer leak the looping variable
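
Several of these Python 3 behaviors are easy to poke at directly. A minimal sketch (the sample string and variable names are made up):

```python
true_division = 7 / 2    # `/` always gives a float in Python 3
floor_division = 7 // 2  # `//` gives the old integer-style result

text = "naïve"                     # str holds Unicode characters
data = text.encode("utf-8")        # bytes holds the raw encoded bytes
round_trip = data.decode("utf-8")  # decoding explicitly gets the str back

squares = [n_inside * n_inside for n_inside in range(3)]
try:
    n_inside  # the comprehension's loop variable does not exist out here
    leaked = True
except NameError:
    leaked = False

print(true_division, floor_division, len(text), len(data), leaked)  # → 3.5 3 5 6 False
```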

Python 3.4:
* `asyncio` library introduced
* `singledispatch()` added to `functools` module. Emulates function overloading/multimethod; basically a switch statement on a function depending on the type of the argument
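
A small sketch of `singledispatch()` in action. The function name `describe` and its return strings are hypothetical.

```python
from functools import singledispatch

@singledispatch
def describe(value):
    # Fallback used when no more specific type is registered.
    return "something else"

@describe.register(int)
def _(value):
    return "an integer"

@describe.register(list)
def _(value):
    return "a list"

# The implementation is chosen by the type of the first argument.
results = (describe(1), describe([1]), describe(1.5))
print(results)  # → ('an integer', 'a list', 'something else')
```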

Python 3.5:
* creating coroutines with `async/await`
* type annotations/type hints for function arguments
* `typing` module introduced (used by static type checkers such as the third-party `mypy` library)
* `@` operator for matrix multiplication
* additional unpacking generalizations: `[*range(4), 4]`
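
The `@` operator and the unpacking generalizations can be tried without any third-party library. In the sketch below, `Vector` is a toy class invented for illustration; it implements `@` as a dot product through the `__matmul__` special method.

```python
class Vector:
    """Toy vector class (hypothetical) that supports the @ operator."""

    def __init__(self, *components):
        self.components = components

    def __matmul__(self, other):
        # `a @ b` calls a.__matmul__(b); here it computes a dot product.
        return sum(x * y for x, y in zip(self.components, other.components))

dot = Vector(1, 2, 3) @ Vector(4, 5, 6)  # 1*4 + 2*5 + 3*6
merged = [*range(4), 4]                  # unpacking inside a list display
combined = {**{"a": 1}, **{"b": 2}}      # unpacking inside a dict display

print(dot, merged, combined)  # → 32 [0, 1, 2, 3, 4] {'a': 1, 'b': 2}
```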

Python 3.6: Raymond Hettinger (core Python dev) calls it the first Python 3 that is "worthy of the name" due to 3.6 being much superior to Python 2.7
* dicts preserve insertion order (an implementation detail of CPython 3.6 that became a language guarantee in Python 3.7). Previously, dicts were unordered, so the order when iterating over the keys was effectively unpredictable
* variable annotation for class variables and instance variables
* asynchronous generators/comprehensions
* f-strings for string interpolation
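
A short sketch of variable annotations and f-strings working together. The `Point` class is made up for illustration.

```python
class Point:
    x: int = 0  # annotated class variable

    def __init__(self, y: int) -> None:
        self.y: int = y  # annotated instance variable

point = Point(3)

# f-strings evaluate the expressions inside the braces.
label = f"Point(x={point.x}, y={point.y})"

# Class-level annotations are recorded and can be inspected.
class_annotations = Point.__annotations__

print(label, class_annotations)  # → Point(x=0, y=3) {'x': <class 'int'>}
```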

Python 3.7:
* `asyncio` upgraded. Basically, the asynchronous framework is very different and significantly improved compared to how it was in Python 3.4--making older ways to do asynchronous programming obsolete.
* `dataclasses` module
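
Both items fit in one small sketch: a dataclass plus a coroutine started with `asyncio.run()`, which was added in Python 3.7. The `Pet` class and the `greet` coroutine are hypothetical. Run this as a script: inside a notebook an event loop is typically already running, and `asyncio.run()` will refuse to start a second one.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Pet:
    # __init__, __repr__ and __eq__ are generated from these annotations.
    name: str
    legs: int = 4

async def greet(pet):
    await asyncio.sleep(0)  # hand control back to the event loop once
    return f"hello {pet.name}"

pet = Pet("snake", legs=0)
greeting = asyncio.run(greet(pet))  # creates a loop, runs the coroutine, closes the loop

print(pet, greeting)  # → Pet(name='snake', legs=0) hello snake
```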

Python 3.8:
* assignment expression (AKA walrus operator because `:=` looks like a walrus): can assign and evaluate in the same expression: `if (n := len(a)) > 10:`. The debate over it was so heated that Guido van Rossum cited it as his reason for stepping down as BDFL.
* `singledispatchmethod()` added to `functools` module. `singledispatchmethod()` is the method version of `singledispatch()` to emulate method overloading/multimethod
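
Both Python 3.8 features in one sketch. The `Formatter` class and its return strings are invented for illustration.

```python
from functools import singledispatchmethod

class Formatter:
    @singledispatchmethod
    def show(self, value):
        # Fallback used when no more specific type is registered.
        return "other"

    @show.register(int)
    def _(self, value):
        return "int"

    @show.register(str)
    def _(self, value):
        return "str"

formatter = Formatter()
shown = (formatter.show(1), formatter.show("a"), formatter.show(1.0))

a = list(range(12))
if (n := len(a)) > 10:  # assign n and test it in the same expression
    message = f"too long ({n} items)"
else:
    message = "fine"

print(shown, message)  # → ('int', 'str', 'other') too long (12 items)
```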

Python 3.9+:
* Who knows?

## Extra Resources
* https://www.toptal.com/python/top-10-mistakes-that-python-programmers-make: one of the first articles I read when I started learning Python. It lists some idiosyncratic things about Python. Basically, the initial seed idea for Pedantic Python and the other notebooks. When I first started out in Python, I had no idea what the article was trying to say. Now, I understand about 80% of what the article says. Don't worry--you don't need to understand it all to be a pro.
* wtf python (https://github.com/satwikkansal/wtfpython): idiosyncratic things about Python. Basically, even more Pedantic Python!
* Bloomberg's `What is Code` (https://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/): an epic 38,000 word, comprehensive ~~diatribe~~ essay about the history and practice of software programming--thus answering the question: what is code? Also has fun interactive widgets/graphics to play with in the article.
* Unicode and UTF-8 (https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/): Joel Spolsky, co-founder of Stack Overflow, explains exactly what the article title says. With Python 3 separating `bytes` and `str`, Unicode becomes very important for non-English languages (such as Chinese, Japanese, Korean--often abbreviated to CJK) for internationalization (abbreviated to i18n), localization (abbreviated to L10n), and weird encodings (CP-1252 encoding on a Windows system! 😩).
* Fun list of programming laws (https://github.com/dwmkerr/hacker-laws): whenever something in software breaks, you can cite a programming "law". Whenever you need a philosophical justification for what you are doing, you can find a programming "principle". Make yourself unbeatable in coding debates! 😎
* The Fun of Reinvention - David Beazley - Pycon Israel 2017 (https://www.youtube.com/watch?v=5nXmq1PsoJ0): David goes all out in using metaprogramming techniques (which he calls the "secrets of the framework builders") to change how Python behaves. Basically, David shows how Python is so expressive that you can bend Python to your will. There is no ~~spoon~~ Python. It is not the ~~spoon~~ language that bends, it is only yourself. This video is quite dense with use of type annotations and metaclasses. Just watch for the magic ✨. David Beazley is a giant in the field of Python. Having been in the Python community for nearly 20 years, David publishes practical books on Python and gives cool talks and is constantly rethinking Python. He charges \\$2500 for his Practical Python Programming course and \\$2750 for his Advanced Python Mastery. And you got all that here for free! What a steal, you thief!
* Design Patterns (https://github.com/faif/python-patterns): Decades ago, 4 authors (known as the Gang of Four) wrote a book called <i>Design Patterns</i> to explain how to make a language do cool stuff (ie reusable solutions to commonly occurring problems). Older languages benefited from design patterns due to their limitations. Fortunately in Python, you don't have to worry about design patterns as much since Python has built in a lot of cool functionality for you. However, if you ever have a tough problem to solve, you can take a look at this GitHub repo to see if you can benefit from an already solved/battle-tested solution.

## Concluding Remarks
Python is a multi-paradigm language. And Python is always changing and improving. You can use Python in different styles to approach different problems. Hence, there is no one "ultimate" style that rules them all: the best style is the one that solves your problem. For example, there's a phrase called "thesis, antithesis, synthesis" and there's also the parable of the blind men and an elephant; you choose the parts you like and put them together. Python can be what you want it to be: it is a matter of perspective.
<table><tr>
<td> <img src="images/Tensorflow_logo.png" alt="Drawing" style="width: 250px;"/> </td>
<td> <img src="images/GED_triplet.png" alt="Drawing" style="width: 250px;"/> </td>
</tr></table>
Here is the TensorFlow logo and the cover art for <i>Gödel, Escher, Bach: an Eternal Golden Braid</i>. GEB explains through self-reference how systems can acquire meaning despite being made of "meaningless" elements.

Remember the reason you learn about Advanced Python is not because you want to ~~get a pay raise~~ argue with your (nerdy) friends. It is because you care about interesting problems and how to solve them (in a better way). And it isn't really about Python--it's about you. In a programmer's lifetime, all the code you will ever write will fit neatly into a small USB flash drive. The millions of lines of code you write will weigh 0.00000000001 pounds. Code is invisible and weightless but ever-present and evermore expanding.

But the power of your imagination is boundless. Napoleon Hill's epiphany is "Whatever the mind can conceive and believe, the mind can achieve." What will you want to create? What you imagine, you can bring forth into the real world--and code is just a tool to realize that dream. Though "software is eating the world", at the end of the day the problems you try to solve with code are about improving human lives. Coding is fundamentally a human-centered mission. Advanced Python is a MacGuffin: Python isn't a destination, it is a journey. I wish you an exciting and meaningful journey! -- Eugene Huang
<table><tr>
<td> <img src="images/Python-logo.png" style="width: 100px;"/> </td>
<td> <img src="images/python_yin_yang.png" style="width: 100px;"/> </td>
</tr></table>

#### Python Short Story
```python
I once went to the pet store
Slithering Slytherin Emporium was its name
At the Counter, I asked for a special breed
A pet with max-imum bite and pow-er.
Something super fast like an animal on high oct-ane fuel
except it must be gentle and human friendly
The clerk said: whatever float-s your boat
True, I said--None but the most Exception-al will do
The clerk said to give him a min-ute to eval-uate my options
Some time to run through his list
Given these input-s, he will search the property
He needed to enumerate all the possibilities
And with this context, he will talk to his manager
I assert-ed that this wait is too slow
He will def-initely help me out

while I was await-ing his return
I wanted to try my luck
What would my int-erest yield?
What else could I do?
Hence, I took a break by looking through the aisles
Animals of any type and all sizes
Something nonlocal to the area
Even the import-ed kinds and global-ly sourced
Some were in a bin, some out in the open 
Some were round and some looked slice-d up
The range of options were so complex, not even sorted
There was no order or filter-ing going on
In this jungle of a store, even the signs were reversed and out of place
I needed a map, a dir-ectory of this place
I must not give up. I must not hit a breakpoint
I must continue. 

The clerk called me back
He suggested a mouse inside an Easter egg.
Give it a wheel and go to the cheese shop.
I thought this rodent was just a pip squeak
I wanted to cry. I wanted to scream
I wanted to raise my voice.
Instead, I just said pass
I will come back next time
Even though that was a False statement
His facial expression belied his true motives
OOPs, he said: "I was just trying to be Pragmatic"
Something Functional, not Pedantic
Alas, he was just following Procedures



Just when I was about to leave
I finally found the one I liked
The clerk print-ed up a receipt
And I zip-ped out of the store
I was ec-static like a child with his parent
Who just inherited a Water Gun 3.0
I could barely compose myself
The trip was a dynamic experience

Ultimately the pet store was not a destination
It was a journey. What an adventure!
Plus, I found my favorite pet, a class of its own
The kind that bytes!
```

while, await, global, assert, def, import, finally, or, and, except, not, True, break, None, return, try, nonlocal, yield, continue, class, else, raise, pass

'False', 'async', 'del', 'elif', 'lambda'  
'for', 'from', 'if', 'in', 'is', 'with', 'as'

all, type, any, bin, open, bytes, map, dir, sorted, oct, breakpoint, filter, pow, super, max, float, enumerate, eval, int, min, round, slice, print, range, complex, help, zip, property, next, input

abs, hash, set, dict, hex, id, object, bool, str, exec, ord, sum, iter, format, len, 