## Enumerate the Possibilities
People who are starting to learn Python often carry over C or Java habits and loop by index. 
```python
simple_string = "asdf"
for i in range(len(simple_string)):
    print(simple_string[i])
```
When they figure out you can iterate through containers (list, tuple, set, dict, string, etc.), they are amazed.
```python
simple_string = "asdf"
for element in simple_string:
    print(element)
```
But then they want the index (like for filtering purposes) and the element, so they go back to the original syntax:
```python
simple_string = "asdf"
for i in range(len(simple_string)):
    print(simple_string[i])
    if i == ...: do_something
```
However, you can use `enumerate()` instead to get the index. `enumerate()` yields tuples of (index, element). By default, the index starts at 0. `enumerate()` can take a 2nd argument, which is the index you __actually__ want to start at.

In [1]:
for index, item in enumerate('asdf'): 
    print(index, item)
    
print()

for index, item in enumerate('asdf', 1):
    print(index, item)

0 a
1 s
2 d
3 f

1 a
2 s
3 d
4 f
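
Returning to the filtering use case mentioned earlier, here is a minimal sketch of using the index from `enumerate()` as a filter. Keeping the even indexes is an arbitrary choice made only for illustration.

```python
simple_string = "asdf"

# keep only the elements at even indexes, without range(len(...))
even_index_letters = [
    element for index, element in enumerate(simple_string) if index % 2 == 0
]
print(even_index_letters)  # ['a', 'd']
```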


## `zip` it up!
If you want to pair 2 iterables together, use `zip()`, which stops at the shortest iterable. If you want to zip to the longest iterable, use `itertools.zip_longest()`. `zip()` can also take more than 2 arguments--you can have an arbitrary number of iterables.

In [2]:
for a, b in zip('asdfghjkl', range(42)):
    print(a, b)
    
print()

for a, b, c in zip(range(5), range(100, 105), range(200, 205)):
    print(a, b, c)

a 0
s 1
d 2
f 3
g 4
h 5
j 6
k 7
l 8

0 100 200
1 101 201
2 102 202
3 103 203
4 104 204
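
The cell above only shows `zip()`. As a small sketch of the difference, the example below runs `zip()` and `itertools.zip_longest()` on the same inputs; the variable names and the `"?"` fill value are just illustrative choices.

```python
from itertools import zip_longest

letters = "abc"
numbers = range(5)

# zip() stops as soon as the shorter iterable runs out
truncated = list(zip(letters, numbers))
print(truncated)
# [('a', 0), ('b', 1), ('c', 2)]

# zip_longest() keeps going and pads the shorter iterable with fillvalue (None by default)
padded = list(zip_longest(letters, numbers, fillvalue="?"))
print(padded)
# [('a', 0), ('b', 1), ('c', 2), ('?', 3), ('?', 4)]
```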


## Practical ~~oops~~ OOP
Everybody talks about Object Oriented Programming. Here's some actual useful functionality. 

#### Lingo:
* `class`: blueprints or definitions for creating an object. Another synonym for class is `type`.  
* `object` (also called an instance): an actual, living, breathing creation of a `class`--the manifestation of building out what was in your blueprint. The process of creating an instance is called `instantiation`. A secondary definition is that everything (storable) in Python is an object. 
* `method`: a function you put inside a class.
* `attribute`: a variable inside an instance (also called instance variable) or class (also called class variable).

In short, a Python object has 4 things: a type/class, data/attributes, methods, and a unique identity (which can be found by calling `id()`). 
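
To make those 4 things concrete, here is a minimal sketch using a hypothetical `Dog` class (the class and its names are invented for illustration only).

```python
class Dog:  # hypothetical class, only for illustration
    def __init__(self, name):
        self.name = name  # data/attribute

    def speak(self):  # method
        return f"{self.name} says woof"


rex = Dog("Rex")
twin = Dog("Rex")

print(type(rex))             # the type/class the object was built from
print(vars(rex))             # {'name': 'Rex'}
print(rex.speak())           # Rex says woof
print(id(rex) == id(twin))   # False: same data, but two distinct identities
```

Two objects can hold identical data and still be different objects, which is what the identity check at the end shows.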

#### 3 Types (What an OOP pun!) of Methods:
* `instance method`: most common (vast majority of the time). Does something with the instance, i.e. updates the instance's attributes or returns something from that instance. Call from the instance.
* `class method`: uncommon (maybe 10% of the time). Does something with the class, i.e. updates the class's attributes or returns something from that class. Call from the class or an instance.
* `static method`: very uncommon (<5% of the time). Uses no information about an instance or class, i.e. knows nothing about the instance or class. Just a regular function that you attached to a class (because it might be helpful). Call from the class or an instance.

In [1]:
class MyClass:
    my_class_attribute = 42 # notice it looks like a regular assignment within a class
    
    def my_instance_method(self, my_instance_attribute):
        self.my_instance_attribute = my_instance_attribute
        return my_instance_attribute    
    
    @classmethod # notice this decorator
    def my_class_method(cls): # notice `cls`, not `self`
        return cls.my_class_attribute # notice `cls`, not `self`
    
    @staticmethod # notice this decorator
    def my_static_method(x, y): # no `self` or `cls`
        return x + y 

In [2]:
# instance method
my_instance = MyClass()
print(my_instance.my_instance_method(10)) # instance method call on instance works
print(MyClass.my_instance_method(10)) # instance method call on class does not work

10


TypeError: my_instance_method() missing 1 required positional argument: 'my_instance_attribute'

In [3]:
# each instance has its own `copy` of the data/attributes
my_instance.my_instance_method(1)

my_instance_2 = MyClass()
my_instance_2.my_instance_method(2)

print(my_instance.my_instance_attribute)
print(my_instance_2.my_instance_attribute)

1
2


In [4]:
# class method
print(my_instance.my_class_method()) # class method call on instance works
print(MyClass.my_class_method()) # class method call on class works, though this notation is probably 
# preferred to denote that it is class method


# class attributes are visible to all instances. There is only 1 `copy` of the class attribute
my_instance_2 = MyClass()
print(my_instance_2.my_class_method())

42
42
42


In [5]:
# class attributes can be updated, which then all instances can see
MyClass.my_class_attribute += 1
print(my_instance.my_class_method())
print(my_instance_2.my_class_method())
print(MyClass.my_class_method())

43
43
43


In [6]:
# static method
print(my_instance.my_static_method(1, 2)) # static method call on instance works
print(MyClass.my_static_method(1, 2)) # static method call on class works

3
3


#### Food for Thought (~~Tongue~~ Brain Twister): 
* What if an instance attribute has the same name as a class attribute?
* If an instance attribute has the same name as a class attribute, what happens if you try to delete the attribute? Which one gets deleted?
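
One way to explore these questions is a small experiment. The sketch below uses a hypothetical `Config` class (not part of the notebook's `MyClass`) so it can be run on its own.

```python
class Config:  # hypothetical class, only for illustration
    setting = "class value"


config = Config()
config.setting = "instance value"  # creates an instance attribute with the same name

print(config.setting)  # instance value: the instance attribute shadows the class attribute
print(Config.setting)  # class value: the class attribute is untouched

del config.setting     # deletes only the instance attribute
print(config.setting)  # class value: the lookup falls back to the class

try:
    del config.setting  # nothing is left on the instance to delete
except AttributeError as error:
    print("AttributeError:", error)
```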

## I Need Help: Can you `lambda` hand?  
Lambda functions (AKA anonymous functions) are a quick way to define a function. Lambda functions are for the most part equivalent to regular `def` functions, but specifically for simple, 1-line, single expressions. Common use cases for lambdas are in `sorted()`, functional programming (`map()`, `filter()`, `reduce()`), pandas DataFrame `.apply()`, and DAG-based transformations like Spark (which uses functional programming). Lambdas cannot be used for statements (like assignment).

In [1]:
sorted([(1, "z"), (2, "y"), (3, "x")], key=lambda tup: tup[1])

In [2]:
from functools import reduce

reduce(
    lambda x, y: x + y, 
    filter(
        lambda x: x % 2, 
        map(
            lambda x: x * 3, 
            range(10)
        )
    )
) # multiply all the numbers by 3, filter and keep numbers that are odd, add them together

75

In [6]:
# the previous example is a bit contrived since list comprehensions (and generators) can perform both map and filter, and a for-loop is recommended over a reduce function
summation = 0
map_and_filter = (element * 3 for element in range(10) if element * 3 % 2)
for element in map_and_filter:
    summation += element
summation

75

In [3]:
import pandas as pd

df = pd.DataFrame({"a": range(2, 10, 2)})
df["a"].apply(lambda x: x ** 2)

0     4
1    16
2    36
3    64
Name: a, dtype: int64

In [10]:
# technically you can name your function and still have default values. lambda truly is like a `def` function with a 1-line body.
summer = lambda x, y=42: x + y

print(summer(1))
print(summer(1, 2))
print(reduce(summer, range(10)))

43
3
45
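
As a quick check of the two claims made at the start of this section (a lambda behaves like a 1-line `def`, and a lambda cannot contain a statement), here is a minimal sketch. The function names are made up for illustration, and `compile()` is used only so the syntax error can be caught at runtime.

```python
def double_def(x):
    return x * 2

double_lambda = lambda x: x * 2

print(double_def(21) == double_lambda(21))  # True

# a statement such as assignment is not allowed inside a lambda
try:
    compile("lambda x: y = x", "<example>", "eval")
    compiled_ok = True
except SyntaxError as error:
    compiled_ok = False
    print("SyntaxError:", error.msg)
```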


## The Key in Getting into Arguments
Oftentimes, you see `*args` and `**kwargs`, and you were always curious about what they do but were too afraid to ask.  
It's a bit confusing in that where `*args` appears affects how it works.  

For example, if you see `*args` in the function **definition/signature** (`def my_function(*args)`), then it is accumulating all the arguments into 1 tuple stored in a variable called `args`.

In [11]:
def silly_function(*args):
    print(args)
    print(type(args))
    
print(silly_function(1, 2, 3, 4, 4))

(1, 2, 3, 4, 4)
<class 'tuple'>
None


However, if you see `*args` in the function **call**, then it is separating the iterable into multiple arguments: 1 argument becomes multiple arguments. The order of the separated arguments is retained.  

In [17]:
def silly_function(a, b, c):
    print(a)
    print(b)
    print(c)

silly_function(*(1, 2, 3)) # the * unpacks the tuple into 3 separate positional arguments
print()
silly_function(*"xyz") # a string is also iterable, so * unpacks it into its characters

1
2
3

x
y
z


`**kwargs` stands for keyword arguments. It means that you are explicitly passing in an argument name and argument value. Again, the placement of `**kwargs` changes the behavior. `**kwargs` in the function **definition/signature** means that all named arguments will be placed in a dictionary.

In [18]:
def silly_function(**kwargs):
    print(kwargs)
    print(type(kwargs))
    
silly_function(a=1, b=2, c=3)

{'a': 1, 'b': 2, 'c': 3}
<class 'dict'>


`**kwargs` in the function **call** means separate the dictionary into named arguments. This is useful if you want to guarantee that the variables are explicitly given an argument name and argument value because you cannot rely on the order of the arguments like in `*args`.

In [22]:
def silly_function(a, b, c):
    print(a)
    print(b)
    print(c)
    
silly_function(**{"c": 3, "b": 2, "a": 1}) # equivalent to silly_function(c=3, b=2, a=1)

1
2
3


You can also use `*args` and `**kwargs` together. Also, the names `args` and `kwargs` are not special; they are just a Python convention. You can name them anything.

In [30]:
def silly_function(*my_silly_args, **my_silly_kwargs): # * in the function definition
    print(my_silly_args)
    print(my_silly_kwargs)
    
silly_function(1, 2, 3, d=4, e=5, f=6)

(1, 2, 3)
{'d': 4, 'e': 5, 'f': 6}


In [32]:
def silly_function(a, b, c, d, e, f):
    print(a, b, c, d, e, f)
    
silly_function(
    *[1, 2, 3], # * in the function call
    **{"f": 6, "e": 5, "d": 4}
)

1 2 3 4 5 6


In [24]:
# stars everywhere: it's a constellation!
def silly_function(*my_silly_args, **my_silly_kwargs):
    print(my_silly_args)
    print(my_silly_kwargs)
    
silly_function(
    *[1, 2, 3],
    **{"d": 4, "e": 5, "f": 6}
)

(1, 2, 3)
{'d': 4, 'e': 5, 'f': 6}


Often times, you will see a function called a decorator that has both `*args` and `**kwargs` and the purpose is to give the full expressiveness of the original function (without loss of generality). Whatever you could do with the original function, you can do with the decorated function.

In [36]:
def identity_decorator(func): # does nothing special
    def inner(*args, **kwargs):
        return func(*args, **kwargs)
    return inner
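
To see the "without loss of generality" claim in action, here is a small sketch that wraps a function with a deliberately awkward signature. The `describe` function and its arguments are hypothetical and exist only for this example.

```python
def identity_decorator(func):  # same definition as above, repeated so this runs on its own
    def inner(*args, **kwargs):
        return func(*args, **kwargs)
    return inner


def describe(name, *hobbies, greeting="hello", **extras):  # hypothetical function
    return name, hobbies, greeting, extras


wrapped = identity_decorator(describe)

# every calling style accepted by the original also works on the wrapped version
print(wrapped("Ada"))
# ('Ada', (), 'hello', {})
print(wrapped("Ada", "chess", "math", greeting="hi", city="London"))
# ('Ada', ('chess', 'math'), 'hi', {'city': 'London'})
```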

## Decorate like a Boss!
Everything you wanted to know about **decorators** but were too afraid to ask.  

Decorators sound like ornaments you put on a Christmas tree. Then you put your programming hat on, and you think they must be very complicated. BUT it's actually a very simple concept.  
Boring, academic definition: decorators are higher-order functions that take in a function and/or return a function for function composition.  
Actual practical importance: You have a function that you want to change some behavior before or after the function call.  

Now you may be wondering: why would you want a decorator if you can just change the original function--if all you are going to do is change something before and/or after the original function call. The reason could be: 
* You can't change the original function because it is too complicated to understand.  
* You can't change the original function because you don't have access to change it. For example, many numpy functions are written in C, so you literally could not change the source code.  
* You can't change the original function because it is used everywhere and you don't know which ones you want to change.  
* You just wanna be ~~cool~~ a pro!  

\* *Nota bene*: subtitle stolen/appropriated/borrowed/adapted from an article called "Everything You Wanted to Know about the Kernel Trick (But Were Too Afraid to Ask)".

In [61]:
def make_my_function_polite(func):
    def inner(arg):
        print("hello!")
        result = func(arg) # notice I'm storing results if I want to do something after my function call
        print("bye!")
        return result
    return inner # notice I am returning a function back

In [62]:
def double(x):
    return x * 2
print(double(42))

84


In [67]:
make_my_function_polite(double) # notice that a function is returned

<function __main__.make_my_function_polite.<locals>.inner>

In [63]:
make_my_function_polite(double)(42) # apply decorator 1 time

hello!
bye!


84

In [64]:
# however, often times, you want to make the effects of the decorator "permanent" to all function calls
double = make_my_function_polite(double)
print(double(1))
print()
print(double(3))

hello!
bye!
2

hello!
bye!
6


In [65]:
# however, the previous syntax is ugly--nobody uses it. Here's the decorator syntax you see in the real-world
@make_my_function_polite
def double(x):
    return x * 2

double(42)

hello!
bye!


84

Now you may be thinking: what are some practical things you would want to change before your original function call? Change the arguments or check for valid types. Some practical things you would want after the original function call: logging the action to disk, closing a connection to a database, or returning the output of the original function call along with an extra flag depending on the output.

In [83]:
def penalize_type(func):
    def inner(arg):
        if not isinstance(arg, (float, int)):
            raise ValueError("Not a valid argument")
        else:
            return func(arg)
    return inner

@penalize_type
def double(x):
    return x * 2

double("42")

ValueError: Not a valid argument

In [85]:
def accommodate_type(func):
    def inner(arg):
        if isinstance(arg, str):
            arg = float(arg)
        return func(arg)
    return inner

@accommodate_type
def double(x):
    return x * 2

double("42")

84.0
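
The two decorators above act before the original function call. As a sketch of the "after" case, the hypothetical decorator below returns the original output plus an extra flag that depends on it. The names `flag_negative` and `subtract` are invented for illustration.

```python
def flag_negative(func):  # hypothetical decorator name
    def inner(*args, **kwargs):
        result = func(*args, **kwargs)  # run the original function first
        return result, result < 0       # original output plus an extra flag
    return inner


@flag_negative
def subtract(x, y):
    return x - y


print(subtract(5, 3))  # (2, False)
print(subtract(3, 5))  # (-2, True)
```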

There is more content about decorators in `4_Function_Python.ipynb`. But for now, are there any more questions? ;-)

## Helpful Libraries
#### `collections`:
* Marketing docstring: This module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple. 
* What it actually does: Whatever types you have, they can be cooler!
* Useful classes: `Counter` and `defaultdict`

In [None]:
# Typical Way of Letter Count
letters = "abcdcacdacadacdabbbabc"

letter_count = {}
for letter in letters:
    if letter in letter_count:
        letter_count[letter] += 1
    else:
        letter_count[letter] = 1
letter_count

In [2]:
# Using `dict.get`, which is a safe operator for unknown keys
letters = "abcdcacdacadacdabbbabc"

letter_count = {}
for letter in letters:
    letter_count[letter] = letter_count.get(letter, 0) + 1
letter_count

{'a': 7, 'b': 5, 'c': 6, 'd': 4}

In [3]:
# Use Counter instead! Counter is really an upgraded dictionary: it counts!
from collections import Counter

letters = "abcdcacdacadacdabbbabc"
counter = Counter(letters) # just put your iterable here and Counter will do the rest
print(counter)
print(counter.most_common())
print(counter['d']) # key is inside dict
print(counter['z']) # key is not inside dict, so will output 0. Hence you don't need to counter.get('z', 0)

Counter({'a': 7, 'c': 6, 'b': 5, 'd': 4})
[('a', 7), ('c', 6), ('b', 5), ('d', 4)]
4
0
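
Two more `Counter` features that tend to be handy are `most_common(n)`, which limits the result to the top `n` entries, and `update()`, which adds counts from another iterable. A minimal sketch on the same string:

```python
from collections import Counter

counter = Counter("abcdcacdacadacdabbbabc")

top_two = counter.most_common(2)  # only the 2 most frequent letters
print(top_two)  # [('a', 7), ('c', 6)]

counter.update("zz")  # count more items into the existing Counter
print(counter['z'])   # 2
```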


In [4]:
# defaultdict takes in a function, and that will be your default value if the key doesn't exist yet
from collections import defaultdict

dict_key_always_has_value = defaultdict(list) # put a function here
print(dict_key_always_has_value['a']) # the key doesn't exist in your dictionary but the value is already available
dict_key_always_has_value['a'].append(1)
print(dict_key_always_has_value)

[]
defaultdict(<class 'list'>, {'a': [1]})


In [5]:
# Here is how you do letter counts using a defaultdict.
# If you think about it, a Counter is just a defaultdict using int, since int() returns 0.
letters = "abcdcacdacadacdabbbabc"

dict_key_always_has_value = defaultdict(int) # because int() returns 0
for letter in letters:
    dict_key_always_has_value[letter] = dict_key_always_has_value[letter] + 1 # equivalent to dict.get(letter, function_here())
print(dict_key_always_has_value)

defaultdict(<class 'int'>, {'a': 7, 'b': 5, 'c': 6, 'd': 4})


In [6]:
# since the defaultdict argument is just a function, you can create whatever default values you like
dict_key_always_has_value = defaultdict(lambda: [None] * 10)
print(dict_key_always_has_value['a']) # the value by default is automatically [None] * 10
print(dict_key_always_has_value) # notice that once you lookup ANY key, the key-value pair now exists in your dictionary
print('b' in dict_key_always_has_value) # use this notation if you just want to check membership, but not add key-value pair to dict
print(dict_key_always_has_value)

[None, None, None, None, None, None, None, None, None, None]
defaultdict(<function <lambda> at 0x00000195D3B952F0>, {'a': [None, None, None, None, None, None, None, None, None, None]})
False
defaultdict(<function <lambda> at 0x00000195D3B952F0>, {'a': [None, None, None, None, None, None, None, None, None, None]})


In [7]:
# can nest defaultdicts for interesting data structure. I have actually used dict in dict before 
nested_defaultdict = defaultdict(lambda: defaultdict(list)) # each argument of a defaultdict must be a function
print(nested_defaultdict['a']) # gives you the inner defaultdict back
print(nested_defaultdict['a']['a']) # gives you the nested list
nested_defaultdict['a']['a'].append(42)
print(nested_defaultdict)

defaultdict(<class 'list'>, {})
[]
defaultdict(<function <lambda> at 0x00000195D3B95510>, {'a': defaultdict(<class 'list'>, {'a': [42]})})


#### `tqdm`:
* Marketing docstring: A Fast, Extensible Progress Bar for Python and CLI.
* What it actually does: put a timer everywhere! If something is slow, time it!
* tqdm means "progress" in Arabic (taqadum, تقدّم) and is an abbreviation for "I love you so much" in Spanish (te quiero demasiado).

In [8]:
from tqdm import tqdm # tqdm is a great library to show progress bar
import time

for i in tqdm(range(10)): # just wrap tqdm() around your iterable; useful Time Lapsed and Estimated Time of Completion for loops
    time.sleep(1)

100%|██████████| 10/10 [00:10<00:00,  1.00s/it]


#### `pdb` or `ipdb`
When your back is against the wall, use the debugger! I tend to use `print()` to try to reason about my code when I hit a bug. However, when the problem is so deep that I cannot easily think it through, then it's time to use the trusty debugger.  
Put `pdb.set_trace()` right before the line where you get an error message--you can even use it in the notebook!  
2 important keys:  
 - n - run next line  
 - q - quit the debugger  
 While inside the debugger, all your variables are alive, so you can literally see what state your variables/objects are in. You can run arbitrary functions, expressions, and statements in the debugger. For example, you can overwrite a variable with a different value to see how your function would work under those conditions.  

`ipdb` is a fancier debugger that needs to be installed, but it offers tab completion and nicer printouts.

In [1]:
%%writefile script_to_debug.py
import pdb

def flawed_function():
    x = 1
    y = 0
    pdb.set_trace()
    return x / y

flawed_function()

Overwriting script_to_debug.py


If you run `python script_to_debug.py` in the terminal, the Python interpreter will stop where you put `set_trace()` and the variables will be alive. You will be at the interactive `(Pdb)` prompt, which works much like the Python REPL.

You can also run `pdb` in the notebook itself!

In [2]:
import pdb

def flawed_function():
    x = 1
    y = 0
    pdb.set_trace()
    return x / y

flawed_function()

> <ipython-input-2-19bdbc5020a5>(7)flawed_function()
-> return x / y
(Pdb) x
1
(Pdb) y
0
(Pdb) n
ZeroDivisionError: division by zero
> <ipython-input-2-19bdbc5020a5>(7)flawed_function()
-> return x / y
(Pdb) y = 1
(Pdb) n
--Return--
> <ipython-input-2-19bdbc5020a5>(7)flawed_function()->None
-> return x / y
(Pdb) q


BdbQuit: 

#### `inspect/dir`
Another equally handy tool is the `inspect` library. Specifically, `getsource()` extracts the source code for modules, classes, functions, and methods--the main exceptions are built-ins and Cython code.

In [3]:
import inspect

print(inspect.getsource(flawed_function)) # get the source code--you can do this inside the debugger

def flawed_function():
    x = 1
    y = 0
    pdb.set_trace()
    return x / y



In [4]:
import pandas as pd

print(inspect.getsource(pd.DataFrame.merge))

    @Substitution('')
    @Appender(_merge_doc, indents=2)
    def merge(self, right, how='inner', on=None, left_on=None, right_on=None,
              left_index=False, right_index=False, sort=False,
              suffixes=('_x', '_y'), copy=True, indicator=False,
              validate=None):
        from pandas.core.reshape.merge import merge
        return merge(self, right, how=how, on=on, left_on=left_on,
                     right_on=right_on, left_index=left_index,
                     right_index=right_index, sort=sort, suffixes=suffixes,
                     copy=copy, indicator=indicator, validate=validate)
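
To illustrate the exception for built-ins, here is a small sketch that compares a pure-Python standard library function (`json.dumps`, chosen only as a convenient example) with the built-in `len`, which is implemented in C in CPython.

```python
import inspect
import json

# json.dumps is written in Python, so its source can be retrieved
source = inspect.getsource(json.dumps)
print(source.splitlines()[0])  # first line of the function definition

# len is a built-in implemented in C, so there is no Python source to show
try:
    inspect.getsource(len)
    builtin_has_source = True
except TypeError as error:
    builtin_has_source = False
    print("TypeError:", error)
```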



`dir()` is a built-in function (not a module) that allows you to inspect modules, classes, and objects. I use it primarily to see what attributes/methods an object has. Once I know what attributes it has, then I can go inspect and see what values the attributes contain.

In [16]:
x = range(10)
dir(x)

['__bool__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'count',
 'index',
 'start',
 'step',
 'stop']
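
The follow-up step described above (going from attribute names to their values) can be sketched with `getattr()`. Filtering out the double-underscore names is just one convenient way to shorten the list.

```python
x = range(10)

# skip the double-underscore ("dunder") names to see the public attributes and methods
public_names = [name for name in dir(x) if not name.startswith("__")]
print(public_names)  # ['count', 'index', 'start', 'step', 'stop']

# look up each name to see what it actually holds
for name in public_names:
    print(name, getattr(x, name))
```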

#### `os`:
Instead of `directory_name + "/" + filename`, use `os.path.join()`: it doesn't require adding '/' to directory_name, is safer across operating systems, and won't add an unnecessary "/".

In [5]:
import os

print(os.path.join('directory_name', 'file_name')) # sorry, wrote this on a Windows, so the printout looks funny
print(os.path.join('directory_name/', 'file_name'))

directory_name\file_name
directory_name/file_name
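
`os.path.join()` also accepts more than 2 components. The sketch below uses made-up directory and file names and avoids hard-coding a separator, so it should behave the same on Windows, macOS, and Linux.

```python
import os

# os.path.join accepts any number of components
parts = ["directory_name", "sub_directory", "file_name.txt"]  # hypothetical names
path = os.path.join(*parts)
print(path)

# the separator comes from the operating system (os.sep)
print(path == os.sep.join(parts))  # True
print(os.path.basename(path))      # file_name.txt
```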


## Extra Materials
* https://www.curiousefficiency.org/posts/2015/10/languages-to-improve-your-python.html: This article inspired me to make this series of notebooks. The author, Nick Coghlan, who is a core Python dev, showed me the universe of possibilities using Python.  
* Advanced Python (https://www.youtube.com/watch?v=uOzdG3lwcB4): Thomas Wouters, a core Python dev, gave a great talk on Python that focused more on the Python syntax and tricks. It is amazing how timeless his lessons are as he gave this talk in 2007 before Python 3 existed. He was working with Python 2.5 at the time. I learned quite a lot from his talks, as he tells you how Python _really_ works with the magic methods.  

## Concluding Remarks
Since software is nothing but knowledge representation, the answer to the question "What is the best way to represent my problem?" is choosing a specific programming paradigm. Python is not strictly OOP like Java, nor is it purely functional like Haskell. You choose the parts you like and put them together. Python takes a multi-paradigm approach, and it is constantly evolving. Your style should evolve too, to fit the problem you are trying to solve. Also, don't take paradigms _too_ seriously, because you can combine stylistic elements to formulate a more comprehensive solution. There's a phrase called "thesis, antithesis, synthesis": you have an idea, somebody has an opposite idea, then you try to think of a path that reconciles the two. OOP is not the opposite of functional; for example, Spark uses elements of both OOP and functional programming. Don't let the paradigm be the bounding box of your imagination.  

Also, I recommend you take notes when you learn Python. Once upon a time, I was an R programmer, and I was pretty good at it; I made a 2-player snake game that plays itself. A student smarter than me told me to learn Python, but at the time I had no motivation to do so. However, once I jumped on the Python bandwagon (i.e. was forced to learn it), it opened my eyes to a whole universe of possibilities. As a student of Python, I recommend that you keep taking notes on Python, so you can look back at them for useful tips and also see your improvement, as your notes will become more advanced over time.  