# Python Data Science Toolbox (Part 1)
Run the hidden code cell below to import the data used in this course.

In [None]:
# Import the course packages
import pandas as pd
from functools import reduce

# Import the dataset
tweets = pd.read_csv('datasets/tweets.csv')

## User-defined functions

Learning to:
- Define functions without parameters
- Define functions with parameters
- Define functions that return a value

The part of the header that specifies the function name and parameters (like square(value)) is the signature of the function

When you define a function, you write **_parameters_** in the function header. When you call a function, you pass **_arguments_** into the function

If you assign the result of a function that prints a value but does not return one to a variable, that variable holds None, which is of type _NoneType_.

You can have values be printed when the function is executed, but you could also _return_ the value and assign it to some variable. 
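The difference between printing and returning can be sketched as follows. The function names `print_square` and `return_square` are hypothetical and used only for illustration.

```python
def print_square(value):
    """Print the square of value (nothing is returned)."""
    print(value ** 2)


def return_square(value):
    """Return the square of value."""
    return value ** 2


printed = print_square(3)    # prints 9, but the call itself evaluates to None
returned = return_square(3)  # prints nothing; the result is stored in the variable

print(type(printed))  # <class 'NoneType'>
print(returned)       # 9
```

Only the returned value can be reused in later computations.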

Docstrings describe what your function does, such as the computations it performs or its return values, and serve as its documentation. They are placed on the line immediately after the function header, between triple quotation marks.
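As a small illustration, Python stores a docstring on the function object in the `__doc__` attribute, which is also what `help()` displays. The function below is a sketch written for this note.

```python
def square(value):
    """Return the square of a value."""
    return value ** 2


# The docstring is stored on the function object itself
print(square.__doc__)  # Return the square of a value.
```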

In [1]:
def square(): # <- Function header, no parameters in the parentheses
    new_value = 4 ** 2 # <- Function body
    print(new_value)
    
square()

16


In [3]:
def square(value): # <- Function header, with parameter in the parentheses
    """Print the square of a value.""" # <- Docstring
    new_value = value ** 2 # <- Function body, now uses the parameter
    print(new_value)
    
square(5)

25


In [None]:
def square(value): # <- Function header, with parameter in the parentheses
    new_value = value ** 2 # <- Function body
    return new_value
num = square(4)

print(num)

## Multiple Parameters and Return Values

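When you call a function, the number of arguments you pass must match the number of parameters in its definition (default and flexible arguments, covered later, relax this).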

To have a function return multiple values, construct a tuple in the function and return it. You can unpack a tuple into several variables, and you can access individual elements with zero-based indexing, just as you can with lists.
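A minimal sketch of returning a tuple and unpacking it directly at the call site. The helper `min_and_max` is hypothetical and not part of the course material.

```python
def min_and_max(values):
    """Return the smallest and largest item of values as a tuple."""
    return min(values), max(values)  # the comma builds a tuple


# Unpack the returned tuple into two variables
lowest, highest = min_and_max([4, 1, 7])
print(lowest, highest)  # 1 7

# Or keep the tuple and index into it
result = min_and_max([4, 1, 7])
print(result[0])  # 1
```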

In [3]:
def raise_both(value1, value2):
    '''Raise value1 to the power of value2 and vice versa'''
    
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    
    new_tuple = (new_value1, new_value2) # Returns multiple values in a tuple
    
    return new_tuple

result = raise_both(2, 3)
print(result)

(8, 9)


In [2]:
# Unpacking tuples

even_nums = (2, 4, 6)

a, b, c = even_nums

print(a)

# Accessing tuple elements with indexing

print(even_nums[1])

2
4


## Scope and user-defined functions

Not every object you define is accessible everywhere in a program. Scope is the part of the program where an object or name is accessible. Names refer to variables or, more generally, to objects such as functions that are defined in your program. For example, a variable x has a name, as does the function sum.

Three types of scopes:
1. Global scope - defined in the main body of a script or a Python program
2. Local scope - defined in a function. Once the function finishes executing, the names in its local scope cease to exist, so you can't access them from outside the function definition
3. Built-in scope - names in the pre-defined built-ins module Python provides, such as print and sum 

Whenever a name is referenced inside a function, Python looks first in the local scope, so if a variable is defined both in the main body of the script and in the function, the function's value is used. If the name is not in the local scope, _then_ Python looks in the global scope. If it is not there either, the built-in scope is searched.
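The lookup order described above can be sketched like this. The names `x`, `show_local`, `show_global`, and `show_builtin` are made up for the example.

```python
x = 'global x'  # defined in the global scope


def show_local():
    x = 'local x'  # a separate, local name that shadows the global one
    return x


def show_global():
    return x  # no local x, so Python falls back to the global scope


def show_builtin():
    return len('abc')  # len is neither local nor global, so it comes from built-ins


print(show_local())    # local x
print(show_global())   # global x
print(show_builtin())  # 3
print(x)               # global x (unchanged by show_local)
```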

To alter the value of a global name within a function call, use the global keyword in the function. After the function is executed, the global value will reflect any change made inside the function.

Python's built-in scope is implemented as a module called builtins. To inspect it, import builtins and then run dir(builtins), which returns a list of all the names in the module.
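A short sketch of inspecting the built-in scope. The exact number of names depends on the Python version, so no count is shown.

```python
import builtins

# dir() returns a list of the names defined in the module
names = dir(builtins)

print('print' in names)  # True
print('sum' in names)    # True
print(names[-5:])        # a few of the names; the full list is long
```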

Learning Python: https://www.oreilly.com/library/view/learning-python-5th/9781449355722/

In [None]:
new_val = 10 # <- name defined in the global scope

def square(value):
    '''Returns the square of a number.'''
    global new_val # <- global keyword lets the function rebind the name in the global scope
    new_val = new_val ** 2
    return new_val

## Nested Functions

If function x is nested within function y and you reference a name in the inner function, what happens? Python searches the local scope of the inner function, and if it doesn't find the name, it looks in the outer function (called an enclosing function). If the name is not there, Python looks in the global scope and then the built-in scope.

Nested functions are good for repeating a process a number of times within a function, like applying the same computation to several parameters.

Returning functions: the outer function returns the inner function, so the function is essentially creating a function! You can use nonlocal to change names that already exist in an enclosing scope. Declaring a name nonlocal in the inner function means that assigning to it updates the variable in the enclosing scope rather than creating a new local one.

Another good reason for nesting functions is closure. This means that the nested or inner function remembers the state of its enclosing scope when called. So anything defined locally in the enclosing scope is available to the inner function even when the outer function has finished execution

Remember the LEGB rule for scope searches: local, enclosing, global, and built-ins

Assigning to a name only creates or changes a local name, unless the name is declared with the global or nonlocal keyword.
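One consequence of this rule is sketched below. The names `counter`, `increment_broken`, and `increment_global` are hypothetical. Because the first function assigns to `counter`, Python treats it as local, and reading it before it has a local value raises `UnboundLocalError`.

```python
counter = 0  # global name


def increment_broken():
    # The assignment makes counter local to this function, so the
    # read on the right-hand side fails: no local value exists yet.
    counter = counter + 1
    return counter


def increment_global():
    global counter  # assignments now rebind the global name
    counter = counter + 1
    return counter


try:
    increment_broken()
except UnboundLocalError as err:
    print('UnboundLocalError:', err)

print(increment_global())  # 1
print(counter)             # 1
```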

In [1]:
def mod2plus5(x1, x2, x3):
    '''Returns the remainder plus 5 of three values.'''
    
    def inner(x):
        '''Returns the remainder plus 5 of a value.'''
        return x % 2 + 5
    
    return (inner(x1), inner(x2), inner(x3))

print(mod2plus5(1, 2, 3))

(6, 5, 6)


In [2]:
# Returning functions

def raise_val(n):
    '''Return the inner function.'''
    
    def inner(x):
        '''Raise x to the power of n.'''
        raised = x ** n
        return raised
    return inner

square = raise_val(2) # <- this is creating a function that squares any number
cube = raise_val(3) # <- this is creating a function that cubes any number
print(square(2), cube(4))

# When you call square, it remembers n=2 even though raise_val, the enclosing scope in which n is local, has finished executing. In computer science this is called a closure

4 64


In [4]:
# Using nonlocal

def outer():
    '''Prints the value of n.'''
    n = 1
    
    def inner():
        nonlocal n # <- n now refers to the n defined in outer
        n = 2
        print(n)
        
    inner()
    print(n)
    
outer()

2
2


## Default and flexible arguments

Flexible arguments: *args in the parentheses allows the user to pass any number of positional arguments into the function. All the arguments passed in the function call are collected into a tuple called args in the function body.

Keyword arguments: **kwargs in the parentheses allows you to pass an arbitrary number of keyword arguments (arguments preceded by identifiers). The identifier-value pairs are collected into a dictionary called kwargs in the function body, whose key-value pairs you can then print.

The names args and kwargs themselves aren't important when using flexible arguments; what matters is that they are preceded by a single and a double star, respectively.
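To illustrate that only the stars matter, here is a sketch that uses the names `values` and `options` instead of `args` and `kwargs`. The function `describe` is made up for this example.

```python
def describe(*values, **options):
    """Return the positional and keyword arguments exactly as received."""
    return values, options


positional, keyword = describe(1, 2, 3, unit='cm', rounded=True)

print(positional)  # (1, 2, 3)
print(keyword)     # {'unit': 'cm', 'rounded': True}
```

The positional arguments arrive as a tuple and the keyword arguments as a dictionary, regardless of what the parameters are called.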

In [None]:
def power(number, pow=1): # <- pow=1 gives the parameter a default value
    '''Raise number to the power of pow.'''
    new_value = number ** pow
    return new_value

# With this function, if you pass two arguments, the first is raised to the power of the second; if you pass only one, it is raised to the default pow=1

In [1]:
def add_all(*args):
    '''Sum all values in *args together.'''
    
    # Initialize sum
    sum_all = 0
    
    # Accumulate the sum
    for num in args:
        sum_all += num
        
    return sum_all 

add_all(5, 78, 2829)

2912

In [3]:
def print_all(**kwargs):
    '''Print out key-value pairs in **kwargs.'''
    
    # Print out the key-value pairs
    for key, value in kwargs.items():
        print(key + ': ' + value)
        
print_all(name='Macy Watson', job='Data Scientist')

name: Macy Watson
job: Data Scientist


## Lambda functions

Lambda functions let you write functions in a quick and potentially dirty way, so don't use them all the time.

Anonymous functions: 

The function map takes two arguments: a function, and a sequence such as a list, and applies the function to every element of the sequence. Lambda functions passed to map without being named are called anonymous functions. map returns a map object, so to see the results you have to convert it with the function list and print that list; printing the map object directly only shows that it is a map object.

filter() can also take a lambda function to filter a sequence like a list
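A minimal sketch of `filter()` with a lambda, using a made-up list of words. Like `map()`, `filter()` returns a lazy object, so it is converted to a list here.

```python
words = ['scope', 'lambda', 'map', 'closure', 'tuple']

# Keep only the words with more than five characters
long_words = list(filter(lambda word: len(word) > 5, words))

print(long_words)  # ['lambda', 'closure']
```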

reduce(), imported from functools, applies a function cumulatively to the elements of a sequence and returns a single value, rather than a sequence of values like map()
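A short sketch of `reduce()` with a lambda. The lists are invented for the example, and the lambda receives the running result and the next element on each step.

```python
from functools import reduce

nums = [1, 2, 3, 4]
# ((1 + 2) + 3) + 4
total = reduce(lambda acc, num: acc + num, nums)
print(total)  # 10

parts = ['py', 'th', 'on']
# ('py' + 'th') + 'on'
joined = reduce(lambda left, right: left + right, parts)
print(joined)  # python
```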

In [1]:
raise_to_power = lambda x, y: x ** y

raise_to_power(2, 3)

8

In [4]:
nums = [48, 474, 9, 30]

square_all = map(lambda num: num ** 2, nums)

# print(square_all) <- this would just print that it's a map object

print(list(square_all))

[2304, 224676, 81, 900]


## Intro to error handling

We can provide useful error messages for the functions we write.

Exceptions are errors caught during execution. The main way to catch these exceptions is with a try-except clause, where Python tries to run the code following _try_. If it can't run it due to an exception, it runs the code following _except_. 

In [None]:
def sqrt(x):
    '''Returns the square root of a number.'''
    try:
        return x ** 0.5
    except: 
        print('x must be an int or float')

If we only want to catch TypeErrors and let other exceptions pass through, we use except TypeError.

In [None]:
def sqrt(x):
    '''Returns the square root of a number.'''
    try:
        return x ** 0.5
    except TypeError: 
        print('x must be an int or float')

There are other types of exceptions that can be caught. Check out the Python documentation for the full list.
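As an illustration of handling more than one exception type, here is a sketch with a hypothetical `safe_divide` function. `ZeroDivisionError` and `TypeError` are both built-in exceptions, and each gets its own except clause.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the division fails."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        print('denominator must not be zero')
    except TypeError:
        print('both arguments must be numbers')


print(safe_divide(6, 3))    # 2.0
print(safe_divide(6, 0))    # prints the zero message, then None
print(safe_divide(6, 'a'))  # prints the type message, then None
```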

Instead of printing an error message, we may want to actually raise an error by using the keyword 'raise'. 

In [None]:
def sqrt(x):
    '''Returns square root of a number.'''
    try:
        if x < 0: # <- comparing a non-number with 0 raises TypeError, so the check sits inside try
            raise ValueError('x must be non-negative')
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')

## Bringing it all together



### Tuples

A tuple is like a list and can contain multiple values, but it is immutable: you can't modify its values. Tuples are constructed using parentheses.
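Immutability can be seen in a small sketch that compares a tuple with a list. The variable names are made up for the example.

```python
coords_list = [3, 4]
coords_tuple = (3, 4)

coords_list[0] = 10  # lists are mutable, so this works

try:
    coords_tuple[0] = 10  # tuples are immutable, so this raises TypeError
except TypeError as err:
    print('TypeError:', err)

print(coords_list)   # [10, 4]
print(coords_tuple)  # (3, 4)
```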

### Scope Search Order, the LEGB Rule

- Local scope
- Enclosing functions
- Global
- Built-in
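The four levels can be sketched in one place. All names here (`name`, `outer`, `inner_local`, `inner_enclosing`, `read_global`) are hypothetical and chosen only to show which scope answers each lookup.

```python
name = 'global'  # G: global scope


def outer():
    name = 'enclosing'  # E: local to outer, enclosing for the inner functions

    def inner_local():
        name = 'local'  # L: found first
        return name

    def inner_enclosing():
        return name  # not local, so found in outer's scope

    return inner_local(), inner_enclosing()


def read_global():
    return name  # not local, no enclosing function, so found globally


print(outer())        # ('local', 'enclosing')
print(read_global())  # global
print(max(1, 2))      # 2 (max is found in the built-in scope)
```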

## Explore Datasets
Use the DataFrame imported in the first cell to explore the data and practice your skills!
- Write a function that takes a timestamp (see column `timestamp_ms`) and returns the text of any tweet published at that timestamp. Additionally, make it so that users can pass column names as flexible arguments (`*args`) so that the function can print out any other columns users want to see.
- In a `filter()` call, write a lambda function to return tweets created on a Tuesday. Tip: look at the first three characters of the `created_at` column.
- Make sure to add error handling on the functions you've created!