## Python Review

![image](https://user-images.githubusercontent.com/26397102/228243047-c320449a-166d-4dec-9051-fe5bd071c4ac.png)


Python is the programming language of winners.

But in all seriousness, Python has become the go-to language for numerous developers, analysts, and engineers.

This is because Python is highly flexible. We could find ourselves using Python in a variety of contexts:
* Engineering a dynamic website that connects to a database via Django
* Designing a data pipeline that extracts & transforms data
* Programming a script that runs in the background of someone's computer & keeps track of keyboard-clicks

However, Python is not the only language that a data developer should take an interest in. I recommend taking a look at the following languages at some point in your self-study journey.

* [C](https://pll.harvard.edu/course/cs50-introduction-computer-science?delta=0)
* [R](https://www.tutorialspoint.com/r/index.htm)
* [Scala](https://www.tutorialspoint.com/scala/index.htm)

## Features of Python

### Python is dynamically-typed

```python
x = 5
x = "hello"
```

It is completely unnecessary to declare types in Python! This sometimes spoils us, as many other programming languages require type declarations. Take a look at how we create variables in the `c` programming language.

```c
int x = 5;
char* c = "hello";
```
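Keep in mind that values in Python still carry a type; it is only the variable name that is free to point at anything. A minimal sketch using the built-in `type` function (the names `first_type` and `second_type` are just for illustration):

```python
x = 5
first_type = type(x)
print(first_type)    # <class 'int'>

# the same name can be rebound to a value of a different type
x = "hello"
second_type = type(x)
print(second_type)   # <class 'str'>
```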

### Python is an interpreted language

In the world of computer languages, we have two ways to execute a program:

* **interpretation** : an interpreter reads a file line by line and executes individual steps of the program
* **compilation** : a compiler translates a file into machine code, and then this binary file is run independently of the compiler

While compilation usually takes more time up front than interpretation, it produces a binary file that typically executes faster than an interpreted program.

### Python is an object oriented language

Keep in mind that **everything** in Python is an object. And likewise keep in mind that an object is just a bundle of data that contains `methods` (behavior) and `attributes` (descriptors). This means that every value we create in Python has attributes & methods.

```python
x = "hello!"
print(x.upper())
```

If we really boil this concept down to its bare utility, we see that Object-Oriented-Programming (OOP) is a realization of the concept of digitization of the real world. We are attempting to describe **all** types of objects within a digital space, whether abstract (such as an available time-slot) or physical (such as an Amazon order).

This unfortunately works to our detriment when we are trying to create programs with fast execution time. Creating an object entails **huge** overhead in memory & time compared to a raw value in a language like `c`.

Overhead: extra-time or cost needed to do something. *ex*: Buying a pie vs baking a pie. *Sure baking a pie probably tastes better, but there is huge overhead associated with that.*

OOP, however, is not the only school of thought that exists. I recommend exploring [Scala](https://www.tutorialspoint.com/scala/index.htm) if you're interested in other paradigms of programming.

### Python is open source

We are free to make a profit, build an organization, or tear down an organization (maybe) all in the context of Python. 

In the next few code-blocks, we will go over some fundamental concepts of Python, as well as new & exciting features that will allow us to become more **powerful** developers.

## Basic Data Types

Our universe is made out of fundamental & microscopic components such as neutrons & electrons. The unique arrangement of these components leads to nuanced & interesting structures.

Similarly, we can boil down our data-types in Python to some essential types:

* int : whole numbers with no decimal
* char : single character (notice how I am not talking about strings! Python itself has no separate char type; a single character is just a `str` of length 1)
* float : number with decimal
* boolean : true or false 

There are of course more data-types than just these 4, but these are the most widely used amongst all programming languages. In addition, these are the most fundamental, and most other data-types are just an extension (or limitation) of these data-types! For example, we have a concept called an `unsigned int` in `c`. This is just an int without a sign! AKA a non-negative int (zero or positive).

As mentioned, these data-types are apparent in almost every other programming language, albeit more *explicitly*. 

**Explicit**: Making itself obviously known

Take for example the `c` programming language. Notice how each data-type is *explicitly* written out:

```c
int x = 5;
char c = 'h';
float f = 3.14159;
```

Python on the other-hand is *implicitly-typed*. This means that the Python interpreter does **not** want you worrying about types as you are coding your programs! Instead, the interpreter looks at the data you are assigning to variables, and assigns type as it executes.

```python
# Interpreter Vision: I see you want an int
x = 5

# Interpreter Vision: I see you want a str (Python has no separate char type)
c = 'h'

# Interpreter Vision: I see you want a float
f = 3.14159
```

Whether you should prefer `explicit`-typing or `implicit`-typing is mostly up to you. Some prefer explicit types since it enables static type-checking at compile time. This allows your compiler to check that you have not made an obvious mistake before allowing your program to run.

C Static Typing
```c
int x = 5;

// strlen (from <string.h>) only takes strings (const char *). However, we are passing an int. Your compiler will flag this.
strlen(x);
```

Python Dynamic Typing
```python
# define a toupper function
def toupper(x):
    return x.upper()

x = 5

# Python will happily start running this program, and only fails once this line executes:
# AttributeError, since an int has no upper() method
toupper(x)
```

However, others prefer the wild-west nature of dynamic typing:

```python
# a function that adds
def adder(v1, v2):
    return v1 + v2

x = 5
y = 10
z = "hello"
a = " world"

# the future is now!
adder(x, y)
adder(z, a)
```

```python
# look at me I'm dynamic
b = False

x = 2000
y = 23
z = "hello"
a = " world"

p = 3.14
e = 2.718
```

## Operations

To complete this discussion of basic data-types, we also have simple operations that we can use to create calculations, combined values, & more. 

In a nutshell, they are: 

* Addition: `+`
* Concatenation: (joining together two or more strs/chars) `+`
* Subtraction: `-`
* Division: `/`
* Floor Division (division rounded down to a whole number): `//`
* Modulo (calculate the remainder): `%`
* Multiplication: `*`
* Equality: `==`
* Inequality: `>=` `<=` `>` `<` `!=`

Keep in mind that applying these operators might actually *change* our data-types, especially as it applies to numeric values! Operators will always select the **most** general numeric-type as the data-type of the result when adding multiple types together. 

Think back to the mathematical grouping of numbers. Of the types covered here, floats (which approximate real numbers) are the most general number-type, whereas bools (which can be represented as 1 or 0) & ints are the most specific.

![image](https://user-images.githubusercontent.com/26397102/228243430-3485307d-07a4-47be-b337-4919e257d1ac.png)


Our hierarchy of *least general* to *most general* looks like the following:

* Bools 
* Ints
* Floats
  
`Bool + Int = Int`  
`Bool + Float = Float`  
`Int + Float = Float`  

Division with `/` will always result in a floating point value, even when both operands are ints.
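The hierarchy above can be checked directly with the built-in `type` function. This is a small sketch; the variable names are made up for illustration:

```python
flag = True
whole = 2000
decimal = 3.14

print(type(flag + whole))      # <class 'int'>
print(type(flag + decimal))    # <class 'float'>
print(type(whole + decimal))   # <class 'float'>

# `/` always gives a float; `//` on two ints stays an int
print(type(10 / 5))    # <class 'float'>
print(type(10 // 5))   # <class 'int'>
```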

Attempt to solve the 5 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question.

```python
# look at me I'm dynamic
b = False

x = 2000
y = 23
z = "hello"
a = " world"

p = 3.14
e = 2.718

# try to solve these CONCEPTUAL questions WITHOUT PRINTING to gauge your understanding of types & operations ...
# use the variables defined at the top of this code-block

# Q1: Without running, what is the output of this line of code?: the year is now 2023
print("the year is now " + str(x + y))
# Q2: What kind of data-type does str(x + y) evaluate to?: string

# Q3: What kind of data-type does this addition result in?
# float
x + p

# Q4: What kind of data-type does this addition result in?
x + p

# Q5: What kind of data-type does this addition result in?
# Integer
b + x
```

```
the year is now 2023

int
```

## Floating Point Imprecision

You'll notice that working with floating point values is imprecise & messy. This is because a computer stores floats in binary with a limited number of digits, and most decimal fractions (such as 0.1) cannot be represented exactly in that system.

```python
# this should be true
.1 + .1 + .1 == .3
```

```
False
```

```python
# it would be nice to display this as 7.30
2.00 + 5.30
```

```
7.3
```

```python
# this should be 5.53
3.53 + 2.00
```

```
5.529999999999999
```

We usually get around this issue by rounding. However, another good approach (especially in the context of cash-money) is to utilize the `Decimal` data-type from the standard library's `decimal` module.

This is an object that solves a lot of the issues that we usually have with imprecise values.

It preserves accuracy, provided that we construct it from a string rather than directly from a float.

https://docs.python.org/3/library/decimal.html
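Before turning to `Decimal`, here is a rough sketch of the rounding approach mentioned above, alongside `math.isclose` from the standard library, which checks whether two floats are "close enough." The variable name `total` is just for illustration:

```python
import math

total = 0.1 + 0.1 + 0.1

# round both sides to a fixed number of decimal places before comparing
print(round(total, 2) == round(0.3, 2))   # True

# or ask whether the two values are close enough to be treated as equal
print(math.isclose(total, 0.3))           # True
```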

```python
from decimal import Decimal

x = 3.53
y = 2.00

# same issue, but with more precision
Decimal(x) + Decimal(y)
```

```
Decimal('5.529999999999999804600747666')
```

```python
# doesn't this look better?
Decimal(str(x)) + Decimal(str(y))
```

```
Decimal('5.53')
```

```python
Decimal(str(.1)) + Decimal(str(.1)) + Decimal(str(.1)) == Decimal(str(.3))
```

```
True
```

```python
# with a little extra coding, we can get a monetary amount
# check out the 'format' method: https://www.programiz.com/python-programming/methods/built-in/format
format(Decimal(str(2.00)) + Decimal(str(5.30)), '.2f')
```

```
'7.30'
```

```python
# however simple operations no longer work. This is because "Decimal" is a different object than `float`
Decimal(1.30) + 1.20
```

```
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'
```

Attempt to solve the 5 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question.

```python
# Q1: Without running, what will be the result of the following line of code?: 
print(5.30 + 2.20)

# Q2: Without running, answer the following: will this output 0.0?: 
print(0.1 + 0.1 + 0.1 - 0.3)


# Q3: Without running, what will the following output?
print(format(Decimal(str(5.30)) + Decimal(str(2.20)), '.2f'))

# Q4: Without running, what will the following output?
# print(Decimal(str(5.30)) + Decimal((2.20)))

# Q5: Without running, what will the following output?
print(Decimal(str(5.30)) + 4.00)
```

```
7.5
5.551115123125783e-17
7.50

TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'
```

## Conditionals

We often do not want lines of code to run exactly once. Instead, we want them to run conditionally (if something is true) or iteratively (multiple times).

As a recap, we write conditionals in the following format:
```python
if cond1:
    print("cond1")
elif cond2:
    print("cond2")
else:
    print("everything else was false")
```

Here we execute the body of the `if` statement if `cond1` is true. Alternatively, if `cond1` is false but `cond2` is true, we execute the body of the `elif` statement. And lastly, if both `cond1` and `cond2` are false, we execute the body of the `else` statement. Only one of the branches ever runs.

Lastly, let's recall how logical operators work in Python:

### Conjunctions (and) 

False-sticky. If anything is false, everything becomes false:

```
T and T and F = F
T and T and T and F = F
F and F and T = F
```

### Disjunctions (or)
True-sticky. If anything is true, everything becomes true:

```
T or T or F = T
T or T or T or F = T
F or F or T = T
```

### Negations (not) 

Simply negates the statement:

```
not T = F
not F = T
not not T = T
```
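The same rules can be tried out in Python itself. This is a small sketch with made-up variable names; `all()` and `any()` are built-ins that apply the "and" and "or" rules to a whole list of values:

```python
is_weekday = True
is_holiday = False
has_meeting = True

# and: a single False makes the whole expression False
print(is_weekday and has_meeting and is_holiday)   # False

# or: a single True makes the whole expression True
print(is_holiday or is_weekday)                    # True

# not: flips the value
print(not is_holiday)                              # True

checks = [is_weekday, is_holiday, has_meeting]
print(all(checks))   # False
print(any(checks))   # True
```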

Attempt to solve the 4 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

```python
#Q1: what does this evaluate to?: False
print(True and True and True and False)

#Q2: what does this evaluate to?: True
print(True or True or True or False)

#Q3: what does this evaluate to?: True
print(not not not not True)

#Q4: what does this print out?:
x = "hello"

if x[0] == "h":
    print(x + " world")
elif x[-1] == "o":
    print(x + " goodbye")
else:
    print("foo")
```

```
False
True
True
hello world
```


## Loops

A brief review of loops:

### For-Loops

Primarily used for when we KNOW how many times something should iterate (i.e. loop through elements of a data-structure)

```python
# start, stop, step
for i in range(0, 21, 5):
    print(i)
```

The for-loop steps automatically through an object using what we call an "iterator." Many of our data-structs and objects provide an iterator that automatically serves up the next element to loop over. In the above case, we create a `range` object that starts at 0, and its iterator serves up the previous value plus `step` as the next value. In this case, `0 + 5` is served right after 0.

To keep ideas simple, just imagine this as a variable that is initially set to 0, and then we keep adding 5 to it until we violate the `stop` condition.
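To peek behind the curtain, we can drive the iterator by hand with the built-ins `iter` and `next`. This is only a sketch of what the for-loop does for us automatically; the variable names are made up:

```python
numbers = iter(range(0, 21, 5))

first = next(numbers)
second = next(numbers)
print(first, second)   # 0 5

# whatever has not been served yet is still waiting inside the iterator
rest = list(numbers)
print(rest)            # [10, 15, 20]
```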

### While-Loops

Primarily used when we DO NOT KNOW how many times something should run for (i.e. keep looping until something is false)

```python
i = 0
while i < 21:
    print(i)
    i += 5
```

This while-loop **behaves** exactly the same as the above for-loop. However this time, we have to create the `i` variable ourselves, and we tell the loop which condition keeps it going (`i < 21`). The loop stops as soon as that condition is false.

Attempt to solve the 4 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

```python
for i in range(5):
    print(i)
# Q1: translate the above for-loop into a while-loop
i = 0
while i < 5:
    print(i)
    i += 1

#for i in range(20, 10, -1):
    #print(i)
# Q2: translate the above for-loop into a while-loop


i = 2
while i < 10:
    print(i)
    i += 2
# Q3: translate the above while-loop into a for-loop
for i in range(2, 10, 2):
    print(i)

#i = 0
#while i > -10:
    #print(i)
    #i -= 1
# Q4: translate the above while-loop into a for-loop
```

```
0
1
2
3
4
0
1
2
3
4
2
4
6
8
2
4
6
8
```


## Data-Structures

In our quest to increase complexity (in return for more functionality) we will also be joining singular data-types together in a concept called "data-structures" (a structure of data!)

Recall our fundamental data-structs.

### Lists

Details:
* Maintains order
* Utilizes an index, starting from 0
    * We use the index to access certain positions of the list: `list[0]`
    * We can access slices of lists via `list[0:5]` (get all elements from index 0 to index 4)
* Mutable
    * We can change values through the following pattern `list[0] = new_val`

```python
example = ["a", "b", "c"]

# access first position
print(example[0])

# access second position
print(example[1])

# access last position
print(example[-1])

# access slice of list
print(example[0:3])
```

In other languages, the closest equivalent to a list is usually called an array.

### Tuples

Details:
* Maintains order
* Utilizes an index, starting from 0
* Immutable
    * We CANNOT change values through the following pattern `tuple[0] = new_val`
    * We CANNOT add or remove elements either

```python
example = ("a", "b", "c")

# access first position
print(example[0])

# access second position
print(example[1])

# access last position
print(example[-1])

# access slice of tuple
print(example[0:3])
```

Tuples are essentially the SAME thing as lists, but we are not able to "mutate" them.

### Sets

Details:
* DOES NOT maintain order
    * Think of a set as a bag, everything gets mixed in there, so there is no guarantee
    * Therefore we do NOT utilize an index
* Mutable
    * While we cannot access index positions, we can still add and remove elements

```python
example = {"a", "b", "c"}

example.add("a")
example.add("a")
example.remove("b")
print(example)
```

Sets are great intermediate data-structs to extract unique values from a list (or any other object)
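As a quick sketch of that idea (the list of names is made up for illustration), passing a list into `set()` drops the duplicates:

```python
submissions = ["Juan", "Ahmed", "Juan", "Ana", "Ahmed"]

# duplicates disappear, but the order of a set is not guaranteed
unique_names = set(submissions)
print(len(unique_names))      # 3

# sorted() turns the set back into a list with a predictable order
print(sorted(unique_names))   # ['Ahmed', 'Ana', 'Juan']
```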

### Dictionaries

Details:
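* Maintains order
    * Be aware! Insertion order is only guaranteed from Python 3.7 onward (CPython 3.6 preserved it as an implementation detail). Before that, dictionaries DID NOT maintain order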
* Utilizes keys to map to values
* Mutable
    * We can change values through the following pattern `dict[key] = new_val`
* Support near-INSTANT lookup.
    * When we check for membership in ordered data-structs (lists & tuples), this takes O(n) time-complexity
    * Dictionaries & sets, however, only take O(1) on average! Use this to your advantage when you need a quick lookup table!

```python
example = {"a": 1, "b": 2, "c": 3}

# access value of "a"
print(example["a"])

# access value of "b"
print(example["b"])

# delete & return key-value
print(example.pop("c"))
```

Dictionaries are also known as hash-maps or hash-tables in other programming languages.
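A small sketch of using a dictionary as a lookup table. The `prices` data is made up, and `.get()` is the standard dictionary method that returns a fallback instead of raising a `KeyError`:

```python
prices = {"apple": 0.98, "dozen eggs": 3.54, "chips": 5.83}

# membership checks on a dictionary look at the keys
print("apple" in prices)   # True
print("bread" in prices)   # False

# .get() hands back a fallback value when the key is missing
print(prices.get("bread", 0.0))   # 0.0
```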

Attempt to solve the 5 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

```python
lst = ["a", "b", "c", "d", "e"]
submissions = {"Ulysses", "Juan", "Juan", "Ahmed", "Ahmed", "Ahmed", "Ana", "Frank", "Yazmin", "Yazmin"}
prices = {"apple": 0.98, "dozen eggs": 3.54, "chips": 5.83}

# Q1: what is the output of the following code?
print(lst[:2])
# ['a', 'b']

# Q2: what is the output of the following code?
print(submissions)
# {"Ulysses", "Juan", "Ahmed", "Ana", "Frank", "Yazmin"}

# Q3: what is the output of the following code?
# print(submissions[:2])
# ERROR

# Q4: what is the output of the following code?
print("Jazmin" in submissions)
# False
# INSTANT ACCESS: sets & dictionaries support instant membership checking, O(1); lists take O(n)

# inflation
prices["dozen eggs"] = 4.00
prices.pop("dozen eggs")

# Q5: what is the output of the following code?
print(prices["dozen eggs"])
# ERROR: key does not exist
```

```
['a', 'b']
{'Frank', 'Juan', 'Ulysses', 'Yazmin', 'Ahmed', 'Ana'}
False

KeyError: 'dozen eggs'
```

## List Comprehension

The concept of writing "Pythonic" lines of code is the practice of using all Python functionality to create code that is easy to read & also efficiently accomplishes our intention. As Python engineers, we value this skill.

We can combine for-loops & lists to quickly generate list data-structures in one line of code. This is called `list comprehension`.

The pattern goes as follows:

```python
new_list = [variable.method() for variable in object]
```

This is the equivalent of doing:

```python
new_list = []
for variable in object:
    new_list.append(variable.method())
```

We can even attach a conditional to a list-comprehension so that we only select the elements from the object that pass some condition (sort of like a filter!)

```python
new_list = [variable.method() for variable in object if condition]
```

This is the equivalent of doing:

```python
new_list = []
for variable in object:
    if condition:
        new_list.append(variable.method())
```
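To make the pattern concrete, here is a small sketch with a made-up list of words. The comprehension and the loop below it build the same list:

```python
words = ["apple", "kiwi", "banana", "fig"]

# keep only the words longer than 4 letters, and uppercase them
long_words = [word.upper() for word in words if len(word) > 4]
print(long_words)   # ['APPLE', 'BANANA']

# the equivalent for-loop
long_words_loop = []
for word in words:
    if len(word) > 4:
        long_words_loop.append(word.upper())
```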

Attempt to solve the 4 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

```python
new_list = []
for i in range(5):
    new_list.append(i)
# Q1: Translate the above loop into list comprehension
new_list = [i for i in range(5)]

new_list = []
for i in range(10, 5, -1):
    new_list.append(i)
# Q2: Translate the above loop into list comprehension
new_list = [i for i in range(10, 5, -1)]

new_str = []
word = "hello world"
for letter in word:
    if letter == "e" or letter == "o":
        new_str.append(letter)
# Q3: Translate the above loop into list comprehension
new_str = [letter for letter in word if letter == "e" or letter == "o"]

old_emails = ["scam@gmail.com", "bob@scam.com", "realperson@gmail.com"]
new_emails = []
for email in old_emails:
    if "scam" not in email:
        new_emails.append(email)
# Q4: Translate the above loop into list comprehension
new_emails = [email for email in old_emails if "scam" not in email]
```

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]


## Lesser Known Data-Structs

### Strings
It turns out that "strings" are actually "chars" in disguise! In some other programming languages (such as `c`), "strings" are simply arrays of chars that we manage ourselves.

Of course in Python, this is expressed as a pre-built object that already has some functionality built into it.

Details:
* Maintains order
* Utilizes an index, starting from 0
* Immutable
    * We CANNOT change values through the following pattern `str[0] = new_val`
    * We CANNOT add or remove characters either

```python
example = "abcdefghik"

# access first position
print(example[0])

# access second position
print(example[1])

# access last position
print(example[-1])

# access slice of string
print(example[1:3])
```

Think of a string as just an immutable list (aka a tuple) that only accepts chars.

https://www.w3schools.com/python/python_ref_string.asp

## Abstract Data Types

### Queues

Queues are simply lists that maintain order as we add and remove elements. A queue is FIFO (first in first out). We implement a queue simply by popping from the front of a list and simply by appending to the back of the list.

```python
queue = []

queue.append("a")
queue.append("b")
queue.append("c")

# who gets removed first?
queue.pop(0)
```

https://www.geeksforgeeks.org/queue-in-python/
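One thing to be aware of: `list.pop(0)` has to shift every remaining element forward, so it slows down as the list grows. As an alternative sketch (not the approach used above), the standard library's `collections.deque` supports fast removal from the front via `popleft()`:

```python
from collections import deque

queue = deque()

queue.append("a")
queue.append("b")
queue.append("c")

# first in, first out: "a" was added first, so it leaves first
first_out = queue.popleft()
print(first_out)     # a
print(list(queue))   # ['b', 'c']
```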

### Stacks

Stacks are simply lists that maintain order as we add and remove elements. A stack is FILO (first in last out), also known as LIFO (last in first out). We implement a stack simply by appending to the back of a list and popping from the back of the list as well.

```python
stack = []

stack.append("a")
stack.append("b")
stack.append("c")

# who gets removed first?
stack.pop()
```

https://www.geeksforgeeks.org/stack-in-python/

### Trees

A tree is an object composed of nodes & connections. We start from a root node and then generate branches that lead to children. Abstractly, think of this as a family-tree, where the great-grandparents are at the top, the grandparents are a level below, the parents are at the 3rd level, and then finally the bottom-most nodes constitute the children.

For now, we will not discuss this concept further, but as it is useful for engineering tasks, we will be discussing it in the Sunday technical interview sessions.

https://www.tutorialspoint.com/python_data_structure/python_binary_tree.htm

## Collection Data Types

### NamedTuple
A namedtuple is a tuple (immutable list) that has labels for each value that is stored inside of it. This is a quick and easy way to implement immutable "method-less objects."

We implement a namedtuple through the following pattern: `x = namedtuple("Name", ["attr1", "attr2", "attr3"])`

```python
from collections import namedtuple

Coordinate = namedtuple("Coordinate", ["x", "y", "z"])

p1 = Coordinate(10, 20, 3)
p2 = Coordinate(5, 3, 19)

# this calculates to 15
print(p1.x + p2.x)

# this also calculates to 15
print(p1[0] + p2[0])

# this calculates to 39
print(p1.y + p2.z)
```

There are a variety of other useful data-structs in "Collections", and I encourage you all to explore this documentation: https://docs.python.org/3/library/collections.html
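To see the "immutable" part in action, here is a small sketch with a hypothetical `Point` namedtuple. Assigning to a field raises an `AttributeError`, while the `_replace` method returns a new tuple with the change applied:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)

try:
    p.x = 99
except AttributeError as err:
    print("cannot modify a namedtuple:", err)

# _replace leaves the original alone and builds a modified copy
moved = p._replace(x=99)
print(p)       # Point(x=10, y=20)
print(moved)   # Point(x=99, y=20)
```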

Attempt to solve the 5 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

```python
# Q1: What is a real life example of a stack?
# write answer here

# Q2: What is a real life example of a queue?
# write answer here
queue = []

queue.append("a")
queue.append("b")
queue.append("c")

queue.pop(0)
# Q3: what will be the next element popped from the queue?

stack = []

stack.append("a")
stack.append("b")
stack.append("c")

stack.pop()
# Q4: what will be the next element popped from the stack?

# Q5: What is the result of this code?
from collections import namedtuple

Coordinate = namedtuple("Coordinate", ["x", "y", "z"])

p1 = Coordinate("53", 20, "19")
p2 = Coordinate("21", 19, "28")

print(p1.x + p2.x)
```

```
5321
```


## Functions

You should notice by this point that we run functions by writing out the function name and then follow up with a left and right parenthesis. 

`function()`  

We can nest functions within one another 

`function3(function2(function1()))`

*This will result in the inner-most function being run first, then the next inner-most, and finally the outer-most.*

And lastly, functions are occasionally attached to objects. In this case we call them `methods`, but for our purposes methods & functions behave the same way, so harping on this distinction is unnecessary.

`object.method()`

As we know, we have built-in functions:

```python
a = [1,2,3,4]

print(sum(a))

print(len(a))
```

and we also have functions that we can **construct** ourselves:

```python
def func(a):
    print("Wow I am a new function!")
    print("here is your argument", a)

func("hello")
```

Let's break down the structure of a function in Python:

```python
def func(a):
```

We create functions using the function header `def name(param1, param2)`. The variables that we place inside of the parentheses of a function are called **parameters**. These are the variables that will keep track of your positional arguments! For example, if we were to call `name(20, 30)`, 20 would be assigned to param1 and 30 would be assigned to param2.

Even if variables with similar names exist outside of the function, inside the function the parameters only keep track of the values that were passed in!

```python
    print("Wow I am a new function!")
    print("here is your argument", a)
```

This is simply the body of the function. Here we print a few lines of text and also the value of the argument that we passed in.

```python
func("hello")
```

Lastly, we call a function by writing out the name of the function followed by parentheses. The values we write inside the parentheses of a function call are called **arguments**.

Attempt to solve the 3 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

```python
def divider(a, b):
    return a / b

def multiplier(a, b):
    return a * b

a = 1
b = 5

# Q1: What is x equal to ? Without running.
x = divider(b, a)

# Q2: which function runs first?
y = multiplier(x, divider(10, 2))

# Q3: what is the output of y
```

Recall the importance of a `return` statement. Printing simply outputs text to the terminal. `return` saves a discrete value and returns it to the main flow of code.
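A small sketch of the difference, using two hypothetical functions. A function that only prints hands back `None` to the caller:

```python
def print_sum(a, b):
    print(a + b)    # shows the value, but hands nothing back

def return_sum(a, b):
    return a + b    # hands the value back to the caller

printed = print_sum(2, 3)      # prints 5
returned = return_sum(2, 3)

print(printed)    # None
print(returned)   # 5
```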

Lastly, keep in mind that we utilize [numpy documentation standards](https://numpydoc.readthedocs.io/en/latest/format.html) for our code.

```python
def divider(a, b):
    ''' Divide a by b.
    
    Parameters
    ----------
    a : int
        First int (the dividend)
    b : int
        Second int (the divisor)

    Returns
    -------
    float
        The quotient of a & b
    ''' 
    return a / b
```

## Optional Typing

As a refresher, we can also specify types in our python code via type-hints. Python will not yell at you for violating these types, but this basically serves as documentation for other developers. Some organizations choose to do this, others do not. 

```python
def divider(a: int, b: int) -> float:
    ''' Divide a by b.
    
    Parameters
    ----------
    a : int
        First int (the dividend)
    b : int
        Second int (the divisor)

    Returns
    -------
    float
        The quotient of a & b
    ''' 
    return a / b
```

https://docs.python.org/3/library/typing.html
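To see that type-hints really are just documentation at runtime, here is a minimal sketch. The hints are stored on the function's `__annotations__` attribute, and nothing stops us from passing floats:

```python
def divider(a: int, b: int) -> float:
    return a / b

# the hints say int, but floats are accepted without complaint
print(divider(7.5, 2.5))   # 3.0

# the hints are stored on the function for tools (and curious developers) to read
print(divider.__annotations__)
```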

## Generator

Let's say you are reading in a large file or calculating a series of numbers that entails too much memory usage. This could easily lead to a runtime memory error if we utilize more memory than is allocated on our OS for running programs. 

```python
def infinite_sequence():
    num = 0
    # this prints forever: the caller never gets a value back and cannot pause the loop
    while True:
        print(num)
        num += 1
```

```python
def infinite_sequence():
    num = 0
    # this can also run "forever", but it only produces the next number when the caller asks for it
    while True:
        yield num
        num += 1
```

In general, this is a great memory saving technique that we should implement when we need to keep track of some internal state (variable) while also pulling up data in a memory friendly way.
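As a usage sketch, we can pull values out of the generator one at a time with the built-in `next`, or take a fixed number of values with `itertools.islice` from the standard library. The variable name `numbers` is made up:

```python
from itertools import islice

def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1

numbers = infinite_sequence()

# each call to next() resumes the function until the next yield
print(next(numbers))   # 0
print(next(numbers))   # 1

# islice takes a fixed number of values from an endless generator
print(list(islice(numbers, 3)))   # [2, 3, 4]
```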

We can likewise create generators by utilizing parentheses instead of square-brackets in our list-comprehension syntax!

```python
nums_squared = (num**2 for num in range(5))
```

I recommend you take a look at the following RealPython article to get a good idea of how generators are used:

https://realpython.com/introduction-to-python-generators/

```python
import sys

# RealPython Example

# list comprehension
nums_squared_lc = [i ** 2 for i in range(10000)]
print(sys.getsizeof(nums_squared_lc))


# generator
nums_squared_gc = (i ** 2 for i in range(10000))
print(sys.getsizeof(nums_squared_gc))
```

```
85176
104
```


## Decorators

Decorators, in essence, are just functions that wrap around other functions to implement additional functionality. We can always implement our own decorators, but we often use the [functools](https://docs.python.org/3/library/functools.html) module, which provides some useful decorators that we can use to save on repeated computation.

I recommend you take a look at the following RealPython article(s) to get a good idea of how decorators are used.

https://realpython.com/primer-on-python-decorators/  
https://docs.python.org/3/library/functools.html  
https://refactoring.guru/design-patterns/decorator  
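When writing our own decorators, a common pattern is to accept `*args` and `**kwargs` so the wrapper works for any function, and to apply `functools.wraps` so the wrapped function keeps its original name. The decorator `announce` below is a hypothetical example, not something from the articles above:

```python
from functools import wraps

def announce(func):
    @wraps(func)   # keeps func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        print("calling", func.__name__)
        return func(*args, **kwargs)
    return wrapper

@announce
def add(a, b):
    return a + b

print(add(1, 2))      # prints "calling add", then 3
print(add.__name__)   # add
```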

```python
# Real Python example

def my_decorator(func):
    def wrapper():
        print("hello world!")
        func()
        print("goodbye world!")
    return wrapper

@my_decorator
def do_maths():
    print("1 + 1 = 2")

do_maths()
```

```
hello world!
1 + 1 = 2
goodbye world!
```


```python
from functools import cache

# caches are super useful when it comes to recursion, let's compare
@cache
def cache_factorial(n):
    return n * cache_factorial(n-1) if n else 1


def factorial(n):
    return n * factorial(n-1) if n else 1
```

```python
import time

start = time.time()
res = cache_factorial(1200)
end = time.time()

print("This took ", end - start, " seconds")

start = time.time()
res = factorial(1200)
end = time.time()

print("This took ", end - start, " seconds")
```

```
This took  0.0020008087158203125  seconds
This took  0.003013134002685547  seconds
```

```python
start = time.time()
res = cache_factorial(1000)
end = time.time()

print("This took ", end - start, " seconds")

start = time.time()
res = factorial(1000)
end = time.time()

print("This took ", end - start, " seconds")

# notice the difference in time efficiency!
```

```
This took  0.0  seconds
This took  0.0010120868682861328  seconds
```


## Packing & Unpacking

A feature of Python you all might have seen already in the domain of data-structures is packing & unpacking. 

We can pack a list using the `*` operator, and similarly unpack a list by placing the asterisk in front of the list variable name as we pass it into a function.

**list packing**
```python
a, *b, c = [1, 2, 3, 4, 5, 6]
```
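To make the result of the packing line explicit: the starred name collects whatever is left over after the other names take one value each.

```python
a, *b, c = [1, 2, 3, 4, 5, 6]

print(a)   # 1
print(b)   # [2, 3, 4, 5]
print(c)   # 6
```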

**list unpacking**
```python
def adder(a, b, c, d):
    return a + b + c + d

x = [1, 2, 3, 4]
adder(*x)
```

The same applies for dictionaries, except this time we utilize two asterisks `**`.

**dictionary unpacking**
```python
def adder(a, b, c, d):
    return a + b + c + d

x = {"a": 1, "b": 2, "c": 3, "d": 4}
adder(**x)
```

https://www.geeksforgeeks.org/packing-and-unpacking-arguments-in-python/


Attempt to solve the 3 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

```python
x = [1, 2, 3, 4]
y = [5, 6, 7, 8]

z = {"a": 1, "b": 2, "c": 3}

def var_reveal(a, b, c):
    print("The value of a is", a)
    print("The value of b is", b)
    print("The value of c is", c)

# Q1: what will be the result of this print statement?
print([*x, *y])

# Q2: what will be the result of this print statement?
print(var_reveal(**z))

x, *y, z = [1, 2, 3, 4]
# Q3: what will be the result of this print statement?
print(y)
```

## Command Line Arguments

We can actually take in "arguments" from a command line in our Python program. This is especially useful for when we are running programs from the terminal. As a refresher, we run programs using the python interpreter via `python filename.py`. Let's assume that this program prints out "hello world" once.

```python
#filename.py
runtime = 1

for i in range(runtime):
    print("hello world")
```

Let's say we want to provide some arguments to modify how our program behaves via a command argument. Let's say we want "hello world" to be printed out `n` times, where we provide the `n`.

```
python filename.py 10
```

This would print out "hello world" 10 times.

Using the `sys` module, we could actually check what inputs were provided in the command-line using the following syntax!

```python
#filename.py
import sys

runtime = sys.argv[1]

# convert to int
runtime = int(runtime)

for i in range(runtime):
    print("hello world")
```

As you can see, the `sys.argv` object acts as a list of command-line arguments. As we provide more arguments, our list of arguments also becomes larger (the first argument is `sys.argv[1]`, the second is `sys.argv[2]`, and so on). Each entry arrives as a string, which is why we convert `runtime` with `int` above. I challenge you all to inspect what `sys.argv[0]` holds... 
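
The script above raises an `IndexError` if no argument is given. Here is a hedged sketch of one way to guard against that, assuming we want a default of one repetition. The helper `parse_runtime` and its `default` parameter are hypothetical names, not part of the exercise files.

```python
#filename.py
import sys

def parse_runtime(argv, default=1):
    """Return the repeat count from argv[1], or default when it is missing."""
    if len(argv) < 2:
        return default
    try:
        return int(argv[1])
    except ValueError:
        # stop with a readable message instead of a traceback
        raise SystemExit(f"expected a whole number, got {argv[1]!r}")

if __name__ == "__main__":
    for _ in range(parse_runtime(sys.argv)):
        print("hello world")
```

Passing the list into a function (rather than reading `sys.argv` directly inside the loop) makes the parsing easy to try out with hand-written lists.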

In compute-environments, all of our programs should be run in the terminal. Going forward, I want you all to run your `.py` files using the command `python filename.py` or `python3 filename.py` if you're on Mac.

This functionality is usually used in `.py` files as opposed to `jupyter` notebooks. Therefore, you will see a practice example in `cli.py` in the `fix-me-week2` exercise. 

## Lambda Functions

Lambda functions are Python's take on [lambda calculus](https://plato.stanford.edu/entries/lambda-calculus/), a hallmark of functional programming languages such as Scala.

The pattern for a lambda function is `lambda var1, var2: expression`. Notice how a lambda function only has "room" for a single expression. For any functionality that requires more than one line of code, simply use a regular function.

Whatever that single expression evaluates to will be your return value.

```python 
def x(n1, n2):
    return n1 + n2

# can be expressed as...
x = lambda n1, n2: n1 + n2

def y(n1):
    if n1 == "red":
        return "blue!"
    else:
        return "green!"

# can be expressed as...
y = lambda n1: "blue!" if n1 == "red" else "green!"
```

We use lambda functions to quickly generate nameless "anonymous functions". 

```python
z = lambda x, y: x + y
print(z(2, 4))
```

This concept usually finds its utility in conjunction with pandas dataframes, as we'll see in the questions below.

```python
# this lambda function will take each value in the `col1` series and lowercase it
df["col1"].apply(lambda x: x.lower())
```
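
Outside of pandas, the same pattern shows up whenever a built-in accepts a function as an argument. A small sketch using `sorted` and `map` from the standard library, with a made-up `words` list:

```python
words = ["banana", "fig", "cherry"]

# sort by length instead of alphabetically
by_length = sorted(words, key=lambda w: len(w))
print(by_length)  # ['fig', 'banana', 'cherry']

# apply a one-off transformation to every element
shouted = list(map(lambda w: w.upper() + "!", words))
print(shouted)    # ['BANANA!', 'FIG!', 'CHERRY!']
```

In both calls the lambda is never given a name; it exists only for the duration of the call.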

https://realpython.com/python-lambda/

Attempt to solve the 3 questions in the below code-block. Questions are labeled as `Q#` and set as comments. Write your answer below or next to the question. Attempt to do this without running!

In [None]:
import pandas as pd

def clean_str(word):
    return word.replace("spam", "")
#Q1: Translate the above function into a lambda function

clean_str = ...

def get_last_char(word):
    return word[-1]
#Q2: Translate the above function into a lambda function

get_last_char = ...

data = {'col1': [0, 1, 2, 3], 'col2': ["red", "blue", "yellow", "green"]}
df = pd.DataFrame(data=data, index=[0, 1, 2, 3])

#Q3: Write a lambda function in the `apply` method to add 100 to each row in the `col1` column
df["col1"] = df["col1"].apply(...)

## Imports & Python Modules

Lastly, to complete our engineering skill-set, we should also be aware of how we can import modules (aka Python files) into other modules. 

We usually take the pattern of:

```python
import module_name
```

to import all the objects & methods of a single module. Keep in mind that if you do this, you must refer to that module at every instance that you want to refer to its objects:

```python
# this gets the "var1" variable from "module_name"
module_name.var1

# this gets the "method" function from "module_name"
module_name.method()
```

If you want to make your life easier, you can alias your module name so that you have less text to type out. This is especially useful if you are a "one-finger" typer like me:

```python
import module_name as mn

mn.var1

mn.method()
```

And lastly, if you only want certain variables or methods from your module, you should specify as such. This removes the need to refer to the module.

```python
from module_name import var1, method

# this prints var1
print(var1)

# this calls the method
print(method())
```
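
Since `module_name` is a placeholder, here is a runnable sketch of the same three import styles, using the standard library's `math` module as a stand-in:

```python
# 1) import the whole module and refer to it by name
import math
print(math.sqrt(16))   # 4.0

# 2) import the module under a shorter alias
import math as m
print(m.floor(2.7))    # 2

# 3) import only the names we need; no module prefix required
from math import pi, sqrt
print(sqrt(9))         # 3.0
print(pi > 3)          # True
```

All three styles load the same module; they differ only in which names end up available in our file.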

This is usually done in `.py` files as opposed to `.ipynb` files (especially when we are composing a data pipeline).

Keep in mind, `.ipynb` is good for outward presentation. `.py` is good for engineering.

Attempt the 4 questions below listed in the comments.

In [None]:
# Q1: import the "secret1" & "secret2" variables from module1
from module1 import 

# Q2: import all of module2


# Q3: print out secret1


# print out secret2


# Q4: call the "secret_func" method from module2

## Python for Data Pipelines

And just to leave you all with some good pipeline practices for Python, we usually create a "runner" file for our data pipelines that will combine all our ETL steps into one script. This allows us to have one central script that runs all steps of the pipeline.

The example below is an abstract one. Think of each module as doing complex SQL queries & pandas transformations, all of which is hidden behind our import statements.

*ex*:
```python
import extract as ext
import transform as tf
import load
import predict as pred

def main():
    # extract & update my data
    df = ext.extract()

    # transform this new data
    tf.transform(df)

    # load this back in
    load.save(df)

    # and make predictions
    pred.predict(df)

if __name__ == "__main__":
    main()
```
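
The modules above are not shown, so the runner cannot be executed as-is. Below is a minimal single-file sketch of the same shape, assuming each ETL step is a plain function that returns its result. The functions, the sample rows, and the in-memory `store` are all hypothetical stand-ins for the real `extract`, `transform`, and `load` modules.

```python
def extract():
    """Stand-in for the extract module: returns raw rows."""
    return [{"name": " Ada ", "score": "90"}, {"name": "Grace", "score": "85"}]

def transform(rows):
    """Stand-in for the transform module: clean names, cast scores to int."""
    return [{"name": r["name"].strip(), "score": int(r["score"])} for r in rows]

def load(rows, store):
    """Stand-in for the load module: append rows to an in-memory store."""
    store.extend(rows)
    return len(rows)

def main():
    store = []
    rows = extract()            # extract the raw data
    rows = transform(rows)      # transform it
    saved = load(rows, store)   # load it into the store
    print("saved", saved, "rows")
    return store

if __name__ == "__main__":
    main()
```

Because each step hands its return value to the next, any stand-in can later be swapped for a real module without changing `main`.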