# 3.1 Lesson: What is "None" in Python? 

`None` is a special value used in Python when no specific value has been specified. This happens, for example, when a function finishes running without a return statement. `None` is often used as a placeholder or missing-data marker. The dictionary `get()` method will return `None` if the requested key is not in the dictionary and no default is specified. Many libraries for reading data similarly use `None` for missing data.
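
As a minimal sketch of the two cases described above, the snippet below calls a function with no return statement and looks up a missing dictionary key. The function `log_message` and the dictionary `ages` are made-up names for illustration only.

```python
def log_message(text):
    # No return statement, so a call to this function evaluates to None.
    print(text)

result = log_message("hello")

ages = {"ann": 31}              # hypothetical sample data
missing = ages.get("bob")       # key absent, no default given
fallback = ages.get("bob", 0)   # key absent, explicit default supplied

print(result, missing, fallback)
```

Passing a second argument to `get()` is one way to avoid handling `None` later when a sensible default exists.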

## Using None

In [1]:
None

In [2]:
_ = ...

In [3]:
type(None)

NoneType

In [4]:
x = 3
x is None

False

In [5]:
None is None

True

In [6]:
id(None)

4310722176

In [7]:
x is not None

True

In [8]:
None is not None

False

### Warning
ChatGPT can be helpful when coding, but be careful using its code without reading and testing it. Subtleties like `is None` vs. `== None` are areas where it tends to make mistakes.
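
To see why the distinction matters, here is a small sketch in which `==` and `is` disagree. The class `AlwaysEqual` is a hypothetical example written for this illustration, not something from a library.

```python
class AlwaysEqual:
    # Hypothetical class whose __eq__ claims equality with everything.
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
equals_none = (obj == None)   # __eq__ is consulted, so this is True
is_none = (obj is None)       # identity check, cannot be overridden

print(equals_none, is_none)
```

Because `is` compares identity and cannot be customized, `x is None` is the safer test.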

# 3.2 Lesson: Booleans - Truth in Python

Control flow in all programming languages is based on boolean decisions. True or false? Do we do it or not? In Python specifically, `if` and `while` statements make a boolean decision about whether to execute the code that follows. In typical Pythonic fashion, any expression can be used as the condition, and once you know the rules, they generally make sense.

Read | A Whirlwind Tour of Python, Built-In Types: Simple Values, the Boolean section.

https://jakevdp.github.io/WhirlwindTourOfPython/05-built-in-scalar-types.html#Boolean-Type

### Boolean Type

The boolean type is a simple type with two possible values: `True` and `False`, and is returned by comparison operators discussed previously:

In [9]:
result = (4 < 5)
result

True

In [10]:
type(result)

bool

Keep in mind that the Boolean values are case sensitive: unlike some other languages, `True` and `False` must be capitalized.

In [11]:
print(True, False)

True False


Booleans can also be constructed using the `bool()` object constructor: values of any other type can be converted to Boolean via predictable rules. For example, any numeric type is False if equal to zero, and True otherwise:

In [12]:
bool(2014)

True

In [13]:
bool(0)

False

In [14]:
bool(3.1415)

True

The boolean conversion of `None` is always False:

In [15]:
bool(None)

False

In [16]:
bool('abc')

True

For strings, `bool(s)` is False for empty strings and True otherwise:

In [17]:
bool("")

False

In [18]:
bool("abc")

True

For sequences, which we'll see in the next section, the Boolean representation is False for empty sequences and True for any other sequence. 

In [19]:
bool([1,2,3])

True

In [20]:
bool([])

False

https://docs.python.org/3/library/stdtypes.html

### Built-in Types:

The following sections describe the standard types that are built into the interpreter.

The principal built-in types are numerics, sequences, mappings, classes, instances and exceptions.

Some collection classes are mutable. The methods that add, subtract, or rearrange their members in place, and don’t return a specific item, never return the collection instance itself but None.

Some operations are supported by several object types; in particular, practically all objects can be compared for equality, tested for truth value, and converted to a string (with the `repr()` function or the slightly different `str()` function). The latter function is implicitly used when an object is written by the `print()` function.

### Truth Value Testing

Any object can be tested for truth value, for use in an `if` or `while` condition or as operand of the Boolean operations below.

By default, an object is considered true unless its class defines either a `__bool__()` method that returns False or a `__len__()` method that returns zero, when called with the object. `[1]` Here are most of the built-in objects considered false:

- constants defined to be false: `None` and `False`
- zero of any numeric type: `0, 0.0, 0j, Decimal(0), Fraction(0, 1)`
- empty sequences and collections: `'', (), [], {}, set(), range(0)`

Operations and built-in functions that have a Boolean result always return 0 or False for false and 1 or True for true, unless otherwise stated. (Important exception: the Boolean operations `or` and `and` always return one of their operands.)
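
The default rule above can be checked with a short sketch. The classes `Basket` and `Switch` are hypothetical and exist only to show `__len__()` and `__bool__()` driving truth value testing.

```python
class Basket:
    # Hypothetical container: truthiness falls back to __len__().
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Switch:
    # Hypothetical class: __bool__() decides truthiness directly.
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on


empty, full = bool(Basket([])), bool(Basket(["apple"]))
off, on = bool(Switch(False)), bool(Switch(True))
plain = bool(object())  # defines neither method, so it is true by default

print(empty, full, off, on, plain)
```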


### Boolean Operations - `and`, `or`, `not`

These are the Boolean Operations, ordered by ascending priority:

| Operation | Result | Notes |
| :--- | :--- | :--- |
| `x or y` | if x is true, then x, else y | (1) |
| `x and y` | if x is false, then x, else y | (2) |
| `not x` | if x is false, then `True`, else `False` | (3) |

Notes:
1. This is a short-circuit operator, so it only evaluates the second argument if the first one is false.
2. This is a short-circuit operator, so it only evaluates the second argument if the first one is true.
3. `not` has a lower priority than non-Boolean operators, so `not a == b` is interpreted as `not (a == b)`, and `a == not b` is a syntax error.

### Comparisons

There are eight comparison operations in Python. They all have the same priority (which is higher than that of the Boolean operations). Comparisons can be chained arbitrarily; for example, `x < y <= z` is equivalent to `x < y and y <= z`, except that `y` is evaluated only once (but in both cases `z` is not evaluated at all when `x < y` is found to be `False`).
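
The "evaluated only once" detail can be observed with a small sketch. The helper `middle` is a made-up function that records each time it is called.

```python
calls = []

def middle():
    # Hypothetical helper that records each evaluation.
    calls.append("middle")
    return 5

chained = 1 < middle() <= 10            # middle() runs once
chained_calls = len(calls)

calls.clear()
expanded = 1 < middle() and middle() <= 10   # middle() runs twice
expanded_calls = len(calls)

print(chained, chained_calls, expanded, expanded_calls)
```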

This table summarizes the comparison operations:

| Operation | Meaning |
| :--- | :--- |
| `<` | strictly less than |
| `<=` | less than or equal |
| `>` | strictly greater than |
| `>=` | greater than or equal |
| `==` | equal |
| `!=` | not equal |
| `is` | object identity |
| `is not` | negated object identity |


Objects of different types, except different numeric types, never compare equal. The `==` operator is always defined but for some object types (for example, class objects) is equivalent to `is`. The `<`, `<=`, `>` and `>=` operators are only defined where they make sense; for example, they raise a `TypeError` exception when one of the arguments is a complex number.

Non-identical instances of a class normally compare as non-equal unless the class defines the `__eq__()` method.

Instances of a class cannot be ordered with respect to other instances of the same class, or other types of object, unless the class defines enough of the methods `__lt__()`, `__le__()`, `__gt__()`, and `__ge__()` (in general, `__lt__()` and `__eq__()` are sufficient, if you want the conventional meanings of the comparison operators).

The behavior of the `is` and `is not` operators cannot be customized; also they can be applied to any two objects and never raise an exception.

Two more operations with the same syntactic priority, `in` and `not in`, are supported by types that are iterable or implement the `__contains__()` method.


## Converting to Boolean

Both `if` and `while` statements will accept any expression as their condition, and will automatically convert that expression into a boolean value. So an important question is what values are converted into `True` or `False`?

| Type | False values | True values |
| :--- | :--- | :--- |
| `int`, `float`, other numeric types | 0, 0.0, anything that equals zero | All other values |
| `list`, `tuple`, `dict`, anything that defines `__len__()` | Anything with `len() == 0` (an empty collection) | Everything else |
| `NoneType` (default return value) | `None` | n/a |
| Objects that define `__bool__()` | Depends on the custom `__bool__()` | Depends on the custom `__bool__()` |
| Everything else | n/a | All values |

These rules are sometimes described as "trivial objects are false." By these rules:
- `None` is trivial.
- `False` is trivial.
- Numbers evaluating to zero are trivial (`False` becomes zero if converted to an integer).
- Sequences of length zero are trivial. Sequences without length support are non-trivial.
- Everything else is non-trivial. 

Here's an example of using a list as a while condition:

In [22]:
while todo_list:
   # remove an item from the TODO list
   todo_item = todo_list.pop()
 
   # process that item
   result = process(todo_item)
 
   # add new items to the TODO list if any work is still left
   todo_list.extend(followup_if_any(result))

NameError: name 'todo_list' is not defined
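
The cell above fails only because `todo_list`, `process`, and `followup_if_any` were never defined. A runnable sketch with stand-in definitions follows; the halving rule and the starting values are assumptions chosen purely to make the loop terminate.

```python
def process(item):
    # Stand-in for real work: halve the number.
    return item // 2

def followup_if_any(result):
    # Stand-in rule: keep working until the result reaches zero.
    return [result] if result > 0 else []

todo_list = [8, 3]
processed = []

while todo_list:                 # an empty list is false, so the loop ends
    todo_item = todo_list.pop()
    processed.append(todo_item)
    result = process(todo_item)
    todo_list.extend(followup_if_any(result))

print(processed)
```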

## And/Or Expressions:

Expressions using `and` or `or` are closely related to Boolean expressions but do not return Booleans. Instead, they return the last value checked while evaluating the `and` or `or` expressions. 

- An `and` expression returns a value that converts to `True` only if both of its inputs convert to `True`.

In [23]:
True and True

True

In [24]:
True and False

False

In [25]:
False and True

False

In [26]:
False and False

False

If you chain a number of `and` expressions together, you get back a value that is `True` only if all of the inputs are true:

In [27]:
True and True and True and True

True

In [28]:
False and True and True and True

False

These expressions will still work if the input expressions are not Boolean values:

In [29]:
3 and 4

4

In [30]:
0 and 2

0

In [31]:
[] and 2

[]

If all of the input expressions convert to `True`, the whole expression returns the last one. If any of the input expressions convert to `False`, then the first expression that converts to `False` is returned. The later expressions are skipped and not evaluated at all.

This behavior where the later expressions are skipped is called **short circuit logic**. It lets you write expressions that involve a number of potentially expensive checks, but they will stop once the value of the whole expression is determined, so you don't have to pay all the costs. You only get this speedup if you write them all in one expression together. If you save the parts into variables first, they have all run before you start the `and` expression. You will get the same value back from the `and` expression, but it will have evaluated all of the expressions first.

This short-circuit behavior also works with `or` expressions, but they stop at the first true input and return that. This makes them the opposite of `and` expressions.
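
A small sketch can make the skipping visible. The helper `check` is hypothetical: it records its name in a list before returning its value, so the list shows which operands actually ran.

```python
evaluated = []

def check(name, value):
    # Hypothetical helper: records that it ran, then returns value.
    evaluated.append(name)
    return value

# `and` stops at the first false operand and returns it.
and_result = check("a", 0) and check("b", 5)
and_trace = list(evaluated)

evaluated.clear()

# `or` stops at the first true operand and returns it.
or_result = check("a", "") or check("b", "default") or check("c", "unused")
or_trace = list(evaluated)

print(and_result, and_trace, or_result, or_trace)
```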

Why does Python return the last value here? So we can write expressions like this:

In [32]:
first_choice, second_choice = 1, 2
choice = first_choice or second_choice

# If the first choice evaluates to True, go with that; otherwise fall back to the second choice.

# 3.3 Lesson: More about Sequences:

In Python, a sequence is any object that can return an iterator, and an iterator is any object that can repeatedly return the next item in the sequence. This iterator support is enough to support for loops, list comprehensions, and every other Python feature that uses sequences. Why belabor the point?

Programmers who know C, C++, or Java know that it is possible to simulate a `for` loop with a `while` loop. They may have even been taught how a `for` loop works by rewriting it into a `while` loop. Here is how that works. To reproduce this `for` loop over list `my_data`,

In [33]:
my_data = []
for x in my_data:
    do_something(x)

you can rewrite it to

In [34]:
i = 0
while i < len(my_data):
    x = my_data[i]  # index with the counter i
    do_something(x)
    i = i + 1

In [35]:
my_data

[]

From a certain point of view, you may know exactly what is going on in minute detail. But it took more than twice as many lines, and the extra lines are spent maintaining a variable that wasn’t needed in the first example, and it doesn’t work with all sequences. The `for` loop version does.

### Loop like a Native

The video and slides are from a PyCon 2013 talk. This talk starts with a similar example and covers many examples of clean `for` loop usage in Python that are more verbose or cumbersome in other languages. Again, this will be particularly valuable to programmers of other languages who already know how to achieve the same behavior but will need to write a lot more code with less clarity than using the native version. There are several examples of creating new iterators to match your desired behavior while preserving the simplicity of the `for` loop.

Python has been evolving to make these common patterns easy to express briefly and concisely. The Python community, particularly the Python developers, refers to these patterns as “Pythonic programming”. One distinctive principle that guides Python development is the goal of having a clear and best way to code in Python. (This is in strict contrast to a competing language, Perl, with the motto “there’s more than one way to do it”.) For processes that repeat the same operation over a lot of data, that answer is the for loop.

### Iteration Basics:

This is an example of a loop. Let's say you have a list, and you want to iterate over the values of the list and print out all of the values. You can start a counter at 0, you can loop as long as the counter is still within the range of the list, you can get the element out of the list, you can print that value, and you can increment `i` to go onto the next element, so you can go back to the while loop and it continues around and around.

In [36]:
my_list = [1,2,3,4]

In [37]:
i = 0
while i < len(my_list):
    v = my_list[i]
    print(v)
    i += 1

1
2
3
4


Python programmers know you shouldn't do this. If you want to loop over all of the indexes of a list, you can use the `range()` function. `range(len(my_list))` will give you an `i` that ranges from 0 to one less than the length of the list. You can get the element out of the list and you can print it.

In [38]:
for i in range(len(my_list)):
    v = my_list[i]
    print(v)

1
2
3
4


*What you really should be doing* is just using a `for` loop directly over the elements in the list, binding each one to `v` and printing `v`.

In [39]:
for v in my_list:
    print(v)

1
2
3
4


**The For Loop:**

**`for` name `in` iterable**:\
&emsp;**statements**

- Iterable produces a stream of values
- Assign stream values to name
- Execute statement once for each value in iterable
- Iterable decides what values it produces
- Lots of things are iterable


**Strings => Characters**\
If you iterate a string, you get its characters:

In [40]:
for c in "Hello":
    print(c)

H
e
l
l
o


If you iterate a dictionary, you get its keys:

In [41]:
d = {'a' : 1, 'b': 2, 'c': 3}
for k in d:
    print(k)

a
b
c


**Files => lines**\
If you iterate over a file, you get its lines:

In [43]:
with open("gettysburg.txt.txt") as f:
    for line in f:
        print(repr(line))

'Fourscore and seven years ago our fathers brought forth on this\n'
'continent a new nation, conceived in liberty and dedicated to the\n'
'proposition that all men are created equal.\n'
'Now we are engaged in a great civil war, testing whether that nation\n'
'or any nation so conceived and so dedicated can long endure. We are\n'
'met on a great battle field of that war. We have come to dedicate a\n'
'portion of that field, as a final resting place for those who here\n'
'gave their lives that that nation might live. It is altogether\n'
'fitting and proper that we should do this.\n'
'But, in a larger sense, we can not dedicate - we can not consecrate\n'
'- we can not hallow - this ground. The brave men, living and dead,\n'
'who struggled here, have consecrated it, far above our poor power to\n'
'add or detract. The world will little note, nor long remember, what\n'
'we say here, but it can never forget what they did here. It is for\n'
'us the living, rather, to be dedicated here to the u

The standard library has interesting iterables as well.

For example, the `re` module has `finditer()`, which gives you a stream of match objects, one for each place in the string where your pattern matches.

In [46]:
import re
pattern, string = r"\d+", "10 apples, 20 pears"  # hypothetical sample inputs
for match in re.finditer(pattern, string):
    # once for each regex match:
    print(match.group())

The `os.walk` function iterates over all of the subdirectories in a tree of directories, flattening the two-dimensional structure of the tree into a single stream of directories.

In [47]:
import os
for root, dirs, files in os.walk('/some/dir'):
    # once for each sub-directory...
    print(root, len(files))

Itertools is a module full of all sorts of tools for playing with iteration. The count function gives you an infinite stream of integers starting with zero and it keeps going forever. It never stops, so you'll need to close your loop somehow. Iterables can be indefinite.

In [48]:
import itertools
for num in itertools.count():
    # once for each integer... Infinite!
    break  # exit explicitly, or the loop never ends

Lastly, the itertools module has all sorts of cool functions that you can use to put things together. Here we've made a chain which repeats 17 three times and then cycles forever through `range(4)`:

In [50]:
from itertools import chain, repeat, cycle
seq = chain(repeat(17, 3), cycle(range(4)))
for num in seq:
    # 17, 17, 17, 0, 1, 2, 3, 0, 1, 2, 3, ...
    break  # the stream is infinite, so exit explicitly

Bottom line: Python gives you a way of creating iterables, which are streams of values, and they can take all sorts of forms.
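
Since some of these streams never end, one way to look at them safely is to take a finite prefix. The sketch below uses `itertools.islice` for that; the choice of ten items is arbitrary.

```python
from itertools import chain, cycle, islice, repeat

# 17 three times, then 0, 1, 2, 3 repeated forever.
seq = chain(repeat(17, 3), cycle(range(4)))

# islice stops after ten items, so list() terminates.
first_ten = list(islice(seq, 10))
print(first_ten)
```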

### Other uses for iterables
The list function will take an iterable and pull all of the values out of it and give you a list of all of them. 

In [1]:
iterable = [1,2,3,4]
new_list = list(iterable)

A list comprehension can loop over an iterable, and give you, for example a list of the values f of x, for every x in your iterable.

In [63]:
def f(x):
    return x * 2
iterable = [1,2,3,4]
results = [f(x) for x in iterable] #pass the iterable to a function.
print(results)

[2, 4, 6, 8]


Sum will take a list of things that can be added together and give you the total of all of them.

In [65]:
total = sum(iterable)
print(total)

10


`min` and `max` will take your iterables of comparable values, like numbers or strings and find you the least or greatest value.

In [66]:
smallest = min(iterable)
largest = max(iterable)

Join will take an iterable of strings and join them all together. 

In [69]:
iterable = ["bob", "larry"]
combined = "".join(iterable)
print(combined)

boblarry


### Common Questions 
1. How do I get the index?

In [77]:
#No:
for i in range(len(my_list)):
    v = my_list[i]
    print(i, v)

0 1
1 2
2 3
3 4


In [78]:
#Yes:
for i, v in enumerate(my_list):
    print(i,v)

0 1
1 2
2 3
3 4


In [79]:
#enumerate() makes useful pairs 
names = ["Eiffel Tower", "Empire State", "Sears Tower"]
list(enumerate(names))

[(0, 'Eiffel Tower'), (1, 'Empire State'), (2, 'Sears Tower')]

If I enumerate those names and make a list of what `enumerate()` gives me, you can see that I've gotten a list of tuples: `[(0, 'Eiffel Tower'), (1, 'Empire State'), (2, 'Sears Tower')]`. So `enumerate()` takes an iterable, pulls values off of the iterable, bundles each one together with an index that keeps track of how many it has gotten, and gives you pairs of the index and the value.

So now we can number our values the right way:

In [80]:
for num, name in enumerate(names):
    print(num, name)

0 Eiffel Tower
1 Empire State
2 Sears Tower


### Iteration vs Indexing
This step here only works if the value you are iterating can be indexed. Lists can be iterated over, but you can also ask for the 17th element. Lots of iterables don't let you do that.

In [81]:
#limited:
for i in range(len(my_list)):
    v = my_list[i] #indexing!
    print(i, v)

0 1
1 2
2 3
3 4


In [None]:
# more powerful
for i, v in enumerate(iterable):
    print(i,v)
    
for linenum, line in enumerate(f, start=1):
    ...  # process each numbered line

When we use `enumerate(iterable)`, we don't have to index into the list. Take an open file object, for example: you can't get the hundredth line in a file by writing `f[100]`; you have to read the lines in order.

**How do I loop over two lists?**

I've got two lists, the towers and their heights, and I want to print out the corresponding data between them. So now I have to go back and forth between the two lists by index:

In [3]:
names = ["Eiffel Tower", "Empire State", "Sears Tower"]
heights = [324, 381, 442]

#bad
for i in range(len(names)):
    name = names[i]
    height = heights[i]
    print("%s: %s meters" % (name, height))

Eiffel Tower: 324 meters
Empire State: 381 meters
Sears Tower: 442 meters


We still don't have to do this; you can use the `zip()` function. Zip takes a pair of streams and gives you a stream of pairs.

In [5]:
for name, height in zip(names, heights):
     print("%s: %s meters" % (name, height))

Eiffel Tower: 324 meters
Empire State: 381 meters
Sears Tower: 442 meters
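
One detail worth knowing is what happens when the inputs differ in length. In this sketch the `heights` list is deliberately one item short (an assumption made for the demonstration); `zip()` stops at the shortest input, while `itertools.zip_longest` pads with `None`.

```python
from itertools import zip_longest

names = ["Eiffel Tower", "Empire State", "Sears Tower"]
heights = [324, 381]  # deliberately one short for this demonstration

pairs = list(zip(names, heights))          # stops at the shortest input
padded = list(zip_longest(names, heights)) # pads missing values with None

print(pairs)
print(padded)
```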


**`dict()` accepts a stream of pairs**

The dict constructor will also take a stream of pairs and give us a dictionary.

In [6]:
names = ["Eiffel Tower", "Empire State", "Sears Tower"]
heights = [324, 381, 442]

dict(zip(names, heights))

{'Eiffel Tower': 324, 'Empire State': 381, 'Sears Tower': 442}

In [8]:
#Powerful
tall_buildings = {"Empire State": 381, "Sears Tower": 442, "Burj Khalifa": 828, "Taipei 101": 509}

print(max(tall_buildings.values()))

828


In [13]:
print(max(tall_buildings.items(), key=lambda b: b[1]))

('Burj Khalifa', 828)


In [14]:
print(max(tall_buildings, key=tall_buildings.get))

Burj Khalifa


### Customizing Iteration

We've seen that Python gives you very powerful ways of dealing with data. We can also customize it to be more direct.

**Generators**\
Functions return one value; generators produce a stream. So what's a generator? A generator is like a function. When you call a function, it runs all of its statements and returns one value. When you call a generator, it produces an iterator, and as you iterate over the values in that iterator, it runs the statements in the generator. Every time it hits a `yield` statement, it produces one more value. So it's kind of like a function that can keep producing values over and over again.

In [27]:
def hello_world():
    yield "Hello"
    yield "World"
    
for x in hello_world():
    print(x)

Hello
World


In [28]:
#evens generator
def evens(stream):
    for n in stream:
        if n% 2 == 0:
            yield n

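`range()` behaves much like a generator in that it produces its values on demand, although strictly speaking it returns a lazy sequence object rather than a generator.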
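
As a sketch of how the `evens` generator might be used, the snippet below repeats its definition so it runs on its own, then pulls one value by hand with `next()` and collects the rest with `list()`.

```python
def evens(stream):
    # Yield only the even numbers from any iterable of integers.
    for n in stream:
        if n % 2 == 0:
            yield n

gen = evens(range(10))
first = next(gen)   # runs the generator up to its first yield
rest = list(gen)    # drains the remaining values

print(first, rest)
```

Nothing in the generator body runs until a value is requested, which is why it also works on very long or infinite streams.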

**Abstracting your iteration**

In [None]:
#more real world example
f = open("my_config.ini")
for line in f:
    line = line.strip()
    if line.startswith("#"):
        # A comment line, skip it.
        continue
    if not line:
        # A blank line, skip it.
        continue
    # An interesting line.
    do_something(line)

In [30]:
# generator version
def interesting_lines(f):
    for line in f:
        line = line.strip()
        if line.startswith('#'):
            continue
        if not line:
            continue
        yield line
        
# this function will take any value that can produce strings, not just files. Now we can use it:

In [None]:
with open("my_config.ini") as f:
    for line in interesting_lines(f):
        do_something(line)
        
with open("my_other_dat") as f2:
    for line in interesting_lines(f2):
        do_something_else(line)
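
Because `interesting_lines` only needs something that produces strings, it can be tried without any file at all. The sketch below repeats the generator and feeds it a plain list; the sample lines are invented for illustration.

```python
def interesting_lines(f):
    # Yield stripped lines, skipping comments and blank lines.
    for line in f:
        line = line.strip()
        if line.startswith('#'):
            continue
        if not line:
            continue
        yield line

# Hypothetical config content, supplied as a list instead of a file.
sample = [
    "# settings\n",
    "\n",
    "name = demo\n",
    "  # indented comment\n",
    "debug = true\n",
]

kept = list(interesting_lines(sample))
print(kept)
```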

**Q: How do I break out of two loops?**

In [31]:
for row in range(height):
    for col in range(width):
        
        value = spreadsheet.get_value(col,row)
        do_something(value)
        
        if this_is_my_value(value):
            break # <- ???

NameError: name 'width' is not defined

In [34]:
# Make the double loop single:
def range_2d(width, height):
    ''' Produce a stream of two-D coordinates.'''
    for y in range(height):
        for x in range(width):
            yield x,y
#good            
for col, row in range_2d(width, height):
    value = spreadsheet.get_value(col, row)
    do_something(value)
    
    if this_is_my_value(value):
        break
        
#better
for cell in spreadsheet.cells():
    value = cell.get_value()
    do_something(value)
    
    if this_is_my_value(value):
        break

NameError: name 'width' is not defined

### Low Level Iteration

Iterable: produces an iterator\
Iterator: produces a stream of values

In [35]:
iterator = iter(iterable) # iterable.__iter__()
value = next(iterator)  # iterator.__next__() (iterator.next() in Python 2)
value = next(iterator)

NameError: name 'iterable' is not defined

The only operation on iterators is `next()`.
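
Roughly speaking, a `for` loop can be thought of as calling `iter()` once and then `next()` until the iterator signals that it is finished by raising `StopIteration`. The sketch below spells that out by hand; the sample list is arbitrary.

```python
iterable = ["a", "b"]          # arbitrary sample data
iterator = iter(iterable)

seen = []
while True:
    try:
        value = next(iterator)
    except StopIteration:
        break                  # the iterator has no more values
    seen.append(value)

# next() also accepts a default to return instead of raising.
exhausted_default = next(iterator, "done")

print(seen, exhausted_default)
```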

In [37]:
with open('blah.dat') as f:
    #read the first line
    header_line = next(f)
    
    # read the rest
    for data_line in f:
        ...  # process each remaining line

**Making your own objects iterable**

In [39]:
class ToDoList(object):
    def __init__(self):
        self.tasks = []
        
    def __iter__(self):
        return iter(self.tasks)

In [40]:
todo = ToDoList()

for task in todo:
    print(task)

In [43]:
# __iter__ generators
class ToDoList(object):
    def __init__(self):
        self.tasks = []
        
    def __iter__(self):
        for task in self.tasks:
            if not task.done:
                yield task
                
    def all(self):
        return iter(self.tasks)
    
    def done(self):
        return (t for t in self.tasks if t.done)
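
To try the class above, the tasks need a `done` attribute. The sketch below assumes a small `Task` dataclass, which is not part of the original example, and repeats `ToDoList` so the snippet is self-contained.

```python
from dataclasses import dataclass

@dataclass
class Task:
    # Hypothetical task type with the `done` flag that ToDoList expects.
    name: str
    done: bool = False

class ToDoList:
    def __init__(self):
        self.tasks = []

    def __iter__(self):
        # Default iteration yields only unfinished tasks.
        for task in self.tasks:
            if not task.done:
                yield task

    def all(self):
        return iter(self.tasks)

    def done(self):
        return (t for t in self.tasks if t.done)

todo = ToDoList()
todo.tasks = [Task("write"), Task("test", done=True), Task("ship")]

pending = [t.name for t in todo]
finished = [t.name for t in todo.done()]
everything = [t.name for t in todo.all()]
print(pending, finished, everything)
```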

**Common ways to create more sequences**

| Function/Pattern | Comment |
| :--- | :--- |
| Use a built-in container | Covered in Week 2. All the built-in containers act as sequences. |
| List comprehensions | Covered in week 2. Easy way to transform sequences. |
| `enumerate()` | Takes in a sequence, and returns a sequence of (`index`, `value`) tuples. Use this to keep track of how far you are in a sequence without writing the bookkeeping code. If looping over a list, you can use the index to modify the value in the list. |
| `range()` | Returns a sequence of integers. Can specify both the start and end, and the spacing (e.g. counting by fives). |
| `zip()` | Takes in two or more sequences and returns tuples with one value from each input sequence. Useful for making dictionaries. |
| Any function returning a list | This is the easiest way to make a new sequence. The main cost is keeping the whole list around at once. This won't be a big deal until we start working with big data. |
| Generator Functions | If you write a function with the yield keyword, the function result becomes an iterator for all the values yielded by the function. You will use them soon to read files. |
| Itertools module | This built-in Python library has many useful functions to compose sequences in different ways to easily build iterators with common behavior. |

# 3.4 Lesson: First Class Functions

We have already seen functions as a way to reuse code. Python treats functions as first class citizens, a programming language expression meaning that functions can be treated like ordinary values, just with different types. It also means that we can write “higher order” functions that take other functions as input. What does this give us?

**Functions As variables**
The practice of storing function references in new variables.

In [44]:
import math
f = math.sin

In [45]:
f

<function math.sin(x, /)>

In [46]:
type(f)

builtin_function_or_method

Now, it's just like using the sine function from the math library, because it is using the sine function from the math library. `f` is just a new variable referring to that same function.

In [47]:
f(0.2)

0.19866933079506122

In [50]:
math.sin(0.2)

0.19866933079506122

**Functions as Expressions (Lambda Functions)**\
How to create new functions using lambda functions.

Let's make new functions now:

In [51]:
f = lambda x: x + 1

This just defined a new function, which we then gave the name f. 
- The keyword `lambda` is used to make a new anonymous function. 
- Lambda is a reference to Alonzo Church's lambda calculus.
- After the keyword `lambda`, there are zero or more variable names before the colon.
- Those variable names are the input to the new function. 
- After the colon, there is an expression that is returned from the anonymous function. You don't need to use a return. 

Lambda expressions are intended for brief functions that don't need to have a persistent name. One feature of lambda functions, like other functions, is that they can access the variables around them when they are defined. 

In [53]:
def make_linear_function(m,b):
    return lambda x: m * x + b

In [54]:
f = make_linear_function(3,2)

In [55]:
f

<function __main__.make_linear_function.<locals>.<lambda>(x)>

In [56]:
f(1)

5

If you call `make_linear_function` again, you'll have a different m and b, and a different function using the different m and b.

In [57]:
f2 = make_linear_function(4,3)

In [58]:
f2

<function __main__.make_linear_function.<locals>.<lambda>(x)>

In [59]:
f2(0)

3

The most common use of higher-order functions is custom sort keys. The sort function has an optional parameter called a key. If you pass a function in this key parameter, then it will be applied to each row in the list, and the list will be sorted according to the output of the function. In data science, a function returning a particular column in the data would be used to sort by that column. Or a model’s prediction function might be used to sort by the model’s predictions. This essentially lets you select a sorting order by selecting a key function. The same key functions can be reused with the min and max functions if you want an extreme value from the data by the same criteria.

### Flexible sorting in Python
How to use lambda functions and the reverse option to change the behavior of Python's built-in sorting. 

In [2]:
# python makes it very easy to sort data by criteria
my_data = [{"a" : 3}, {"a": 4}, {"a": 2}]

In [3]:
my_data.sort()

TypeError: '<' not supported between instances of 'dict' and 'dict'

Lambda functions are an easy way to define a new sort order.

In [62]:
my_data.sort(key = lambda r: r["a"])

In [63]:
my_data

[{'a': 2}, {'a': 3}, {'a': 4}]

In [64]:
my_data.sort(key = lambda r: -r["a"])

In [66]:
my_data

[{'a': 4}, {'a': 3}, {'a': 2}]

In [67]:
#easier way to reverse sort
my_data.sort(key= lambda r: r["a"], reverse=True)

In [68]:
my_data

[{'a': 4}, {'a': 3}, {'a': 2}]

In [69]:
def my_key_func(r):
    return r["a"]

In [70]:
my_data.sort(key = my_key_func, reverse=True)

In [71]:
my_data

[{'a': 4}, {'a': 3}, {'a': 2}]

In [72]:
#lets look at a few more examples

sorted(my_data, key = lambda r: r["a"])

[{'a': 2}, {'a': 3}, {'a': 4}]

That returns a sorted list with the data from `my_data`, sorted by the `"a"` key. The difference is that `my_data.sort` reorders my data, and `sorted()` makes a new list with the same data.
- `my_data.sort` works for lists
- `sorted()` works for any sequence

If you already have your data in a list, and don't care about the old order, use sort. If you just want the first row when sorting by that key, then:

In [73]:
min(my_data, key = lambda r: r["a"])

{'a': 2}

In [74]:
max(my_data, key = lambda r: r["a"])

{'a': 4}

Another example where first-class functions are useful is selecting functionality based on outside choices. For example, Module 1: Mathematical Foundations of Data Science introduces loss functions as an important modeling component best chosen based on the specific modeling goals. As such, most modeling frameworks allow the name of the loss function to be passed in as an argument to the various functions. This argument could be implemented with a sequence of `if`/`else` statements but is much more concisely implemented with a dictionary from loss function names to loss functions. Here’s an abbreviated example.

In [None]:
def calculate_l1_loss(data):
   ...
 
def calculate_l2_loss(data):
   ...
 
LOSS_FUNCTIONS = {"L1": calculate_l1_loss, "L2": calculate_l2_loss}
 
def calculate_loss(data, loss_function_name):
   loss_function = LOSS_FUNCTIONS[loss_function_name]
   return loss_function(data)
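
For a version that actually runs, the elided bodies need filling in. The sketch below assumes `data` is a list of `(prediction, target)` pairs and uses the usual sum-of-absolute-errors and sum-of-squared-errors definitions; both the data layout and the formulas are assumptions for illustration, not the course's implementation.

```python
def calculate_l1_loss(data):
    # Assumed definition: sum of absolute errors over (prediction, target) pairs.
    return sum(abs(p - t) for p, t in data)

def calculate_l2_loss(data):
    # Assumed definition: sum of squared errors over (prediction, target) pairs.
    return sum((p - t) ** 2 for p, t in data)

# Functions are ordinary values, so they can be stored in a dictionary.
LOSS_FUNCTIONS = {"L1": calculate_l1_loss, "L2": calculate_l2_loss}

def calculate_loss(data, loss_function_name):
    loss_function = LOSS_FUNCTIONS[loss_function_name]
    return loss_function(data)

data = [(3, 1), (2, 5)]           # hypothetical (prediction, target) pairs
l1 = calculate_loss(data, "L1")   # |3-1| + |2-5|
l2 = calculate_loss(data, "L2")   # (3-1)**2 + (2-5)**2
print(l1, l2)
```

Adding a new loss then only requires a new dictionary entry rather than another `if`/`else` branch.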

### Variable Scoping

When we first covered functions last week, we noted that function arguments were different from variables of the same name outside the function. We also saw that lambda functions can access variables from the functions around them. Let’s go over the rules for when variables are shared or not.

The basic concept controlling variable sharing is called scope. Scopes can be nested, but visibility goes in only one direction: code in an inner scope can see the outer scopes, but not vice versa. Each function call starts a new scope that is nested inside the scope where the function was defined, and it can access variables in the outer scopes containing the function.

The tricky part about Python scoping is that if a variable is assigned anywhere inside a scope, that name refers to the local variable everywhere in that scope. This happens even if the assignment is never reached, and it makes any variable with the same name in the outer scopes inaccessible from there. For the purposes of scoping, function arguments are considered to have been assigned in the scope.
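
The "even if that assignment is never reached" rule can be demonstrated with a short sketch. The names `counter`, `read_only`, and `assigns_later` are invented for this example.

```python
counter = 10

def read_only():
    # No assignment to counter here, so the outer variable is visible.
    return counter

def assigns_later():
    try:
        # counter is local in this function because of the assignment
        # below, and it has no value yet, so reading it fails.
        return counter
    except UnboundLocalError:
        return "unbound"
    counter = 0  # never reached, but still makes counter local

outer_value = read_only()
inner_value = assigns_later()
print(outer_value, inner_value)
```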

In [75]:
a_totally_outside_variable = 3
 
def outside_function(x):
   y = x + 1
 
   def inside_function(z):
       x = z / 2
       return x
 
   def another_inside_function(z):
       return x

In this example, there are four scopes. There is the outermost scope accessible by all of this code. There is the scope inside `outside_function`. And there are two separate scopes for the inside functions. The following code rewrites the previous code block with the variable names changed to reflect the scoping rules.

In [76]:
SCOPE_1_a_totally_outside_variable = 3
 
def outside_function(SCOPE_2_x):
   SCOPE_2_y = SCOPE_2_x + 1
 
   def inside_function(SCOPE_3_z):
       SCOPE_3_x = SCOPE_3_z / 2
       return SCOPE_3_x
 
   def another_inside_function(SCOPE_4_z):
       return SCOPE_2_x

Each function's arguments reflect the new scope of that function. And note that the two inside functions that ended with “return x” originally now return two different scoped variables. That is because the first inside function has an assignment to x and makes its own scoped version of x, while the second inside function does not and shares with the outer scope.

This scope rewriting is sufficient for this simple example but is not enough for a case with recursion, where a function ends up calling itself. To get the scoping right in this case, the key detail is that each function call creates a new scope. So you can imagine repeated calls to a function having different scope prefixes each time. Though to be clear, Python does not really rename the variables with these scope prefixes. That was just a way to illustrate the scoping differences.
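
A small recursive sketch shows each call keeping its own variables. The function `countdown` is a made-up example.

```python
def countdown(n):
    # Each call gets its own local n and rest.
    if n == 0:
        return [0]
    rest = countdown(n - 1)   # inner calls use their own n
    return [n] + rest         # this call's n is unchanged by them

result = countdown(3)
print(result)
```

If the calls shared a single `n`, the outer calls would see the value left behind by the innermost one instead of their own.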

Compare and contrast:
- C: The C language has a function qsort in the standard library, which takes in a comparator function (compares two elements) instead of a key function.
- C++: The C++ language has `std::sort` in the standard library. `std::sort` will use the natural ordering (which can be overridden), or take in a comparator, which is like a function. C++ also has anonymous functions (lambda expressions); see `std::function` in the `<functional>` header for details.
- Java: Java has sorting with an optional comparator function in `java.util.Arrays.sort`. Java has had lambda expressions since Java 8.

In [77]:
bool(2*3-6)

False

In [78]:
bool(3 + 4)

True

In [79]:
bool("hello")

True

In [80]:
bool(None)

False

In [81]:
bool([x for x in range(5)])

True

In [82]:
def my_add(x, y):
    return x+y

In [83]:
f = my_add
x = f(3,4)
print(x)

7


In [84]:
z = f(5,2)

In [85]:
x

7

In [86]:
def f(x):
    x = x * 3
    return x
def g(y):
    f(y)
    return y

x = 4
f(x)
g(x)
print(x)

4


In [87]:
def f(x):
    def g(y):
        def h(x):
            return x + y

        return h(x)

    x = g(2)
    return g(3)

print(f(5))

10


In [88]:
d = {"a": "b", "c": 3, "d": 3}
for k in d:
    print(k, d[k])

a b
c 3
d 3


In [4]:
x = 7
y = 4
for z in range(y):
    if z:
        x = x * z

In [5]:
x

42

In [10]:
x = False
x is not x

False

In [11]:
False is not False

False

In [None]:
[0, 1, 2, None, 4, "a", "b", False, True]