# Python Data Science Toolbox (Part 1)

## Chapter 1: Writing your own functions

### User-defined functions

#### You'll learn:
* Define functions without parameters
* Define functions with one parameter
* Define functions that return a value
* Later: multiple arguments, multiple return values

#### Built-in functions
* str()

In [1]:
x = str(5)
print(x)

5


In [2]:
print(type(x))

<class 'str'>


#### Defining a function

In [7]:
def square():    # <- Function header
    new_value = 4 ** 2
    print(new_value)
    
square()

16


#### Function parameters

In [8]:
def square(value):   
    new_value = value ** 2
    print(new_value)
    
square(4)

16


In [9]:
square(5)

25


#### Return values from functions
* Return a value from a function using the keyword `return`

In [12]:
def square(value):   
    new_value = value ** 2
    return new_value

num = square(6)
print(num)

36


#### Docstrings
* Docstrings describe what your function does
* Serve as documentation for your function
* Placed on the line immediately after the function header
* Enclosed in triple double quotes `"""`

In [13]:
def square(value):   
    """Return the square of a value."""
    new_value = value ** 2
    return new_value
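
A docstring is stored on the function object, so it can be read back at runtime. The short sketch below redefines `square()` so that it runs on its own and prints the docstring through the `__doc__` attribute.

```python
def square(value):
    """Return the square of a value."""
    new_value = value ** 2
    return new_value

# The docstring is available as an attribute of the function object.
print(square.__doc__)
```

In an interactive session, `help(square)` displays the same text along with the function signature.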

### Multiple parameters and return values

#### Multiple function parameters
* Accept more than 1 parameter

In [79]:
def raise_to_power(value1, value2):   
    """Raise value1 to the power of value2."""
    new_value = value1 ** value2
    return new_value

* Call the function with as many arguments as it has parameters

In [15]:
result = raise_to_power(2,3)
print(result)

8


#### A quick jump into tuples
* Make functions return multiple values: Tuples!
* Tuples:
    * Like a list - can contain multiple values
    * Immutable - can't modify values!
    * Constructed using parentheses

In [16]:
even_nums = (2, 4, 6)
print(type(even_nums))

<class 'tuple'>


#### Unpacking tuples
* Unpack a tuple into several variables

In [17]:
even_nums = (2, 4, 6)
a, b, c = even_nums
print(a)
print(b)
print(c)

2
4
6


#### Accessing tuple elements
* Access tuple elements like you do with lists

In [18]:
even_nums = (2, 4, 6)
print(even_nums[1])

4


#### Returning multiple values

In [19]:
def raise_both(value1, value2):
    """Raise value1 to the power of value2 and vice versa."""
    
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    
    new_tuple = (new_value1, new_value2)
    
    return new_tuple

In [20]:
result = raise_both(2, 3)
result

(8, 9)
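
Because the function returns a tuple, the result can also be unpacked directly into separate names, combining the two ideas above. This is a minimal sketch; the names `first` and `second` are chosen here for illustration.

```python
def raise_both(value1, value2):
    """Raise value1 to the power of value2 and vice versa."""
    return (value1 ** value2, value2 ** value1)

# Unpack the returned tuple straight into two names.
first, second = raise_both(2, 3)
print(first)   # 8
print(second)  # 9
```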

## Chapter 2: Default arguments, variable-length arguments and scope

### Scope and user-defined functions

#### Crash course on scope in functions
* Not all objects are accessible everywhere in a script
* Scope - part of the program where an object or name may be accessible
    * Global scope - defined in the main body of a script
    * Local scope - defined inside a function
    * Built-in scope - names in the pre-defined built-ins module
    
#### Global vs. local scope (1)

In [21]:
def square(value):   
    new_value = value ** 2
    return new_value

square(3)

9

In [22]:
new_value # not defined because it's inside the function

NameError: name 'new_value' is not defined

#### Global vs. local scope (2)
* Any time we reference the name in the global scope, Python accesses the name defined in the global scope
* Any time we reference the name inside the function, Python looks first in the local scope (that's why calling `square(3)` returns 9 instead of 10). Only if Python cannot find the name in the local scope does it look in the global scope
* If Python can't find the name in the local or global scope, it then looks in the built-in scope

In [23]:
new_value = 10

def square(value):   
    new_value = value ** 2
    return new_value

square(3)

9

In [24]:
new_value

10

#### Global vs. local scope (3)
* Here, we access `new_value` defined globally within the function `square()`
* The global value accessed is the value at the time the function is called, not the value when the function is defined
* Thus, if we re-assign `new_value` and call the function `square()`, we see that the new value of `new_value` is accessed

In [27]:
new_value = 10

def square(value):   
    new_value2 = new_value ** 2
    return new_value2

square(3)

100

In [29]:
new_value = 20

square(3)

400

#### Global vs. local scope (4)
* What if we want to alter the value of a global name within a function call? Then we use the keyword `global`

In [30]:
new_value = 10

def square(value):
    global new_value
    new_value = new_value ** 2
    return new_value

square(3)

100

In [32]:
# Value of new_value has been updated by calling `square()`
new_value

100
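
To see why the `global` keyword is needed, it helps to look at what happens without it. In the sketch below (the names `counter`, `increment_without_global` and `increment_with_global` are made up for illustration), assigning to a name inside a function makes that name local, so reading it before the assignment raises an `UnboundLocalError`.

```python
counter = 10

def increment_without_global():
    # The assignment makes `counter` local to this function, so the read on
    # the right-hand side happens before the local name has a value.
    counter = counter + 1
    return counter

def increment_with_global():
    # `global` tells Python to use the module-level `counter` instead.
    global counter
    counter = counter + 1
    return counter

try:
    increment_without_global()
except UnboundLocalError as err:
    print("Failed:", err)

print(increment_with_global())  # 11
print(counter)                  # 11
```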

#### Built-in Functions

In [33]:
import builtins
dir(builtins)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecodeError',
 ...]

### Nested functions

#### Nested functions (1)
* What if we have a function `inner()` defined inside our function `outer()` and we reference a name `x` in the inner function?
* Python searches the local scope of `inner()` first. If it doesn't find `x` there, it searches the scope of `outer()`, which is called an "enclosing function" because it encloses `inner()`.
* Only if Python can't find `x` in the scope of the enclosing function does it search the global scope and then the built-in scope.

In [None]:
def outer( ... ):
    """ ... """
    x = ...
    
    def inner ( ... ):
        """ ... """
        y = x ** 2
    return ...
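
The template above leaves its details elided. As one possible concrete instance (the parameter and the return value are assumptions made for this sketch, not the original's), `inner()` below has no local `x`, so Python finds `x` in the enclosing scope of `outer()`.

```python
def outer(x):
    """Return the square of x, computed by a nested helper."""

    def inner():
        """Square x, which is looked up in the enclosing scope of outer()."""
        y = x ** 2
        return y

    return inner()

print(outer(3))  # 9
```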

#### Nested functions (2)
* But why do we even need to nest functions?
* There are many reasons:
    * Say we want to repeat the same computation several times within a function, for example a function that takes 3 numbers as parameters and applies the same operation to each of them
    * One way would be to write out the computation 3 times, but this does not scale.

In [34]:
def mod2plus5(x1, x2, x3):
    """Returns the remainder plus 5 of three values."""
    
    new_x1 = x1 % 2 + 5
    new_x2 = x2 % 2 + 5
    new_x3 = x3 % 2 + 5
    
    return (new_x1, new_x2, new_x3)

#### Nested functions (3)
* What we can do instead is define an inner function within our function definition, and call it where necessary.


In [37]:
def mod2plus5(x1, x2, x3):
    """Returns the remainder plus 5 of three values."""
    
    def inner(x):
        """Returns the remainder plus 5 of a value."""
        return x % 2 + 5
    
    return (inner(x1), inner(x2), inner(x3))

print(mod2plus5(1, 2, 3))

(6, 5, 6)


#### Returning functions
* Another use case for nested functions: a function can build and return another function based on its parameters.
* One interesting detail: when we call `square()`, it remembers the value `n = 2`, even though the enclosing scope defined by `raise_val()`, to which `n = 2` is local, has finished executing. This subtlety is referred to as a "closure" in Computer Science circles, but it shouldn't concern us too much.

In [80]:
def raise_val(n):
    """Return the inner function."""
    
    def inner(x):
        """Raise x to the power of n."""
        raised = x ** n
        return raised
    
    return inner

In [81]:
square = raise_val(2)
cube = raise_val(3)
print(square(2), cube(4))

4 64
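
For the curious, the remembered value can be inspected. The sketch below rebuilds `raise_val()` and reads the captured `n` through the function's `__closure__` attribute, a tuple of cells holding the values taken from the enclosing scope.

```python
def raise_val(n):
    """Return the inner function."""

    def inner(x):
        """Raise x to the power of n."""
        return x ** n

    return inner

square = raise_val(2)
cube = raise_val(3)

# Each returned function carries its own captured value of n.
print(square.__closure__[0].cell_contents)  # 2
print(cube.__closure__[0].cell_contents)    # 3
```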


#### Using nonlocal
* The keyword `nonlocal` lets an inner function rebind a name that already exists in an enclosing function's scope (it cannot create a new name there).
* In the example below, we assign to `n` in the inner function; because `n` is declared `nonlocal`, the assignment changes the value of `n` in the enclosing scope as well.
* This is why the final `print(n)` in `outer()` prints the value of `n` set within `inner()`.

In [82]:
def outer():
    """Prints the value of n."""
    n = 1
    
    def inner():
        nonlocal n
        n = 2
        print(n)
        
    inner()
    print(n)

In [83]:
outer()

2
2
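
A common use of `nonlocal` is to keep state between calls of a returned function. The following is a small sketch with made-up names (`make_counter`, `increment`), not part of the original notes.

```python
def make_counter():
    """Return a function that counts how many times it has been called."""
    count = 0

    def increment():
        # Rebind `count` in the enclosing scope instead of creating a local.
        nonlocal count
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```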


#### Scopes searched
* Local scope
* Enclosing functions
* Global
* Built-in

This is known as the LEGB rule, where L is for local, E for enclosing, G for global, and B for built-ins!
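
The lookup order can be checked with one name defined at several levels. In this sketch (all function and variable names are illustrative), each function returns the value of `name` that Python finds first, and `len` is resolved in the built-in scope because it is defined nowhere else.

```python
name = "global"

def outer():
    name = "enclosing"

    def inner_local():
        name = "local"
        return name        # found in the local scope

    def inner_enclosing():
        return name        # not local, so found in outer()'s scope

    return inner_local(), inner_enclosing()

def read_global():
    return name            # not local, no enclosing function: global scope

print(outer())        # ('local', 'enclosing')
print(read_global())  # global
print(len("abc"))     # 3, `len` comes from the built-in scope
```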

### Default and flexible arguments

#### Add a default argument

In [48]:
def power(number, pow=1):
    """Raise number to the power of pow."""
    new_value = number ** pow
    return new_value

power(9,2)

81

In [49]:
power(9,1)

9

In [50]:
power(9)

9
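
Arguments can also be passed by name, which makes it explicit when a default is being overridden. A brief sketch using the same `power()` function:

```python
def power(number, pow=1):
    """Raise number to the power of pow."""
    return number ** pow

# Override the default by naming the parameter.
print(power(9, pow=2))         # 81

# With names, the order of the arguments no longer matters.
print(power(pow=3, number=2))  # 8
```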

#### Flexible arguments: *args (1)
* Let's say you want to write a function but you aren't sure how many arguments a user will want to pass it
* For example, a function that takes floats or ints and adds them all up, irrespective of how many there are. Enter flexible arguments!
* In the function definition, we use the parameter `*args`. This collects all the positional arguments passed in a function call into a tuple called `args` in the function body.

In [84]:
def add_all(*args):
    """Sum all the values in *args together."""
    
    # Initialize sum
    sum_all = 0
    
    # Accumulate the sum
    for num in args:
        sum_all += num
        
    return sum_all

#### Flexible arguments: *args (2)

In [85]:
add_all(1)

1

In [86]:
add_all(1, 2)

3

In [87]:
add_all(5, 10, 15, 20)

50
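
If the numbers already live in a list or tuple, the `*` operator can unpack them into separate positional arguments at the call site. A minimal sketch, with `nums` as an illustrative list:

```python
def add_all(*args):
    """Sum all the values in *args together."""
    sum_all = 0
    for num in args:
        sum_all += num
    return sum_all

nums = [5, 10, 15, 20]

# Equivalent to add_all(5, 10, 15, 20).
print(add_all(*nums))  # 50

# With no arguments, args is an empty tuple and the sum stays 0.
print(add_all())       # 0
```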

#### Flexible arguments: **kwargs
* Keyword arguments are arguments preceded by identifiers, for example `name='dumbledore'`
* In the function definition, we use the parameter `**kwargs`. This turns the identifier-value pairs into a dictionary called `kwargs` within the function body.

In [88]:
def print_all(**kwargs):
    """Print out key-value pairs in **kwargs."""
    
    for key, value in kwargs.items():
        print(key + ": " + value)

In [89]:
print_all(name='dumbledore', job='headmaster')

name: dumbledore
job: headmaster
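
The `print_all()` function above concatenates strings, so it only works when every value is a string. The variant below is a sketch under that observation: `format_all` and the `details` dictionary are made up for illustration, an f-string handles values of any type, and `**` unpacks an existing dictionary into keyword arguments.

```python
def format_all(**kwargs):
    """Return one 'key: value' string per keyword argument."""
    return [f"{key}: {value}" for key, value in kwargs.items()]

# Illustrative data; the integer value would break string concatenation.
details = {"name": "dumbledore", "job": "headmaster", "wands": 1}

# Equivalent to format_all(name='dumbledore', job='headmaster', wands=1).
for line in format_all(**details):
    print(line)
```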


## Chapter 3: Lambda functions and error-handling

### Lambda functions

#### Lambda functions

In [90]:
raise_to_power = lambda x, y: x ** y
raise_to_power(2, 3)

8

#### Anonymous functions
* The function `map()` takes two arguments: `map(func, seq)`
* `map()` applies the function to ALL elements in the sequence
* This returns a `map` object, which needs to be converted into a list in order to see the contents

In [93]:
nums = [48, 6, 9, 21, 1]
square_all = map(lambda num: num ** 2, nums)
print(square_all)

<map object at 0x106477c50>


In [94]:
list(square_all)

[2304, 36, 81, 441, 1]
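
One detail worth knowing: a `map` object is an iterator, so it can only be consumed once. The sketch below converts the same object to a list twice; the variable names `first_pass` and `second_pass` are illustrative.

```python
nums = [48, 6, 9, 21, 1]
square_all = map(lambda num: num ** 2, nums)

first_pass = list(square_all)   # consumes the iterator
second_pass = list(square_all)  # nothing left to consume

print(first_pass)   # [2304, 36, 81, 441, 1]
print(second_pass)  # []
```

If the results are needed more than once, keep the list rather than the `map` object.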

#### Using map()
* map() applies a lambda function to each element in a list.

In [95]:
spells = ["protego", "accio", "expecto patronum", "legilimens"]
shout_spells = map(lambda item: item + '!!!', spells)

# Convert shout_spells to a list
shout_spells_list = list(shout_spells)

print(shout_spells_list)

['protego!!!', 'accio!!!', 'expecto patronum!!!', 'legilimens!!!']


#### Using filter()
* filter() keeps only the elements of a list for which the lambda function returns `True`.

In [96]:
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas', 'gimli', 'gandalf']
result = filter(lambda member: len(member) > 6, fellowship)

# Convert result to a list
result_list = list(result)

print(fellowship)
print(result_list)

['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas', 'gimli', 'gandalf']
['samwise', 'aragorn', 'boromir', 'legolas', 'gandalf']
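
The same selection can be written as a list comprehension, which many Python programmers find easier to read. This sketch compares the two forms on the same data:

```python
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn',
              'boromir', 'legolas', 'gimli', 'gandalf']

# filter() with a lambda, converted to a list.
filtered = list(filter(lambda member: len(member) > 6, fellowship))

# Equivalent list comprehension.
comprehension = [member for member in fellowship if len(member) > 6]

print(filtered == comprehension)  # True
```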


#### Using reduce()
* reduce() takes a kind of "accumulator" function that "reduces" a list to a single value.

In [97]:
from functools import reduce

stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']
result = reduce(lambda item1, item2: item1 + item2, stark)
print(result)

robbsansaaryabrandonrickon
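
`reduce()` works on numbers in the same way, and it accepts an optional third argument that serves as the starting value of the accumulator. The example below is a sketch with illustrative data, not taken from the original notes.

```python
from functools import reduce

nums = [1, 2, 3, 4]

# Evaluated as ((1 + 2) + 3) + 4.
total = reduce(lambda acc, num: acc + num, nums)
print(total)    # 10

# The third argument is the initial accumulator value.
product = reduce(lambda acc, num: acc * num, nums, 1)
print(product)  # 24

# With an initial value, an empty list is safe to reduce.
empty_total = reduce(lambda acc, num: acc + num, [], 0)
print(empty_total)  # 0
```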


### Introduction to error handling

#### Passing an incorrect argument

In [98]:
float(2)

2.0

In [99]:
float('2.3')

2.3

In [100]:
float('hello')

ValueError: could not convert string to float: 'hello'

#### Passing valid arguments

In [101]:
def sqrt(x):
    """Returns the square root of a number."""
    return x ** (0.5)

sqrt(4)

2.0

In [102]:
sqrt(10)

3.1622776601683795

#### Passing invalid arguments

In [103]:
sqrt('hello')

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'float'

#### Errors and exceptions
* Exceptions - errors detected during execution
* Catch exceptions with a `try`-`except` clause
    * Runs the code following `try`
    * If there's an exception, run the code following `except`

In [104]:
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** (0.5)
    except:
        print('x must be an int or float')

sqrt('hello')

x must be an int or float


* If we only want to catch a `TypeError` and let other errors pass through, we would use:

In [105]:
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** (0.5)
    except TypeError:
        print('x must be an int or float')

sqrt('hello')

x must be an int or float


* More often than not, instead of merely printing an error message, we'll actually raise an error by using the keyword `raise`
* For example, our function may do something we may not desire when passed a negative number

In [106]:
sqrt(-9)

(1.8369701987210297e-16+3j)

In [107]:
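def sqrt(x):
    """Returns the square root of a number."""
    try:
        # The comparison sits inside the try block because `x < 0`
        # itself raises a TypeError when x is a string.
        if x < 0:
            raise ValueError('x must be nonnegative')
        return x ** (0.5)
    except TypeError:
        print('x must be an int or float')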

In [108]:
sqrt(-9)

ValueError: x must be nonnegative
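
A raised error stops the program unless the caller handles it. The sketch below shows the caller's side: it wraps each call in `try`-`except` and uses `as err` to read the message attached to the exception. The simplified `sqrt()` and the loop values are assumptions made for this illustration.

```python
def sqrt(x):
    """Return the square root of a nonnegative number."""
    if x < 0:
        raise ValueError('x must be nonnegative')
    return x ** 0.5

for candidate in (16, -9):
    try:
        print(sqrt(candidate))
    except ValueError as err:
        # `err` holds the exception object; printing it shows the message.
        print(f"Skipping {candidate}: {err}")
```

The first call prints `4.0`; the second prints `Skipping -9: x must be nonnegative` instead of halting the loop.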