## More on functions

Last week, we learned that functions can return values to be used by other code. Here's an example:

    def double(num):
        return 2 * num
        
    def square(num):
        return num ** 2
        
    x = 5
    x_doubled = double(x)
    x_doubled_and_squared = square(x_doubled)
    
This is the power of `return`ing values - they can then be used by other code. Function _arguments_ - the values inside the parentheses of a function call - are evaluated first, so the single line below is equivalent to the code above (nesting calls this way is referred to as function _composition_):

    x_doubled_and_squared = square(double(5))

Functions can also return multiple values. Just separate the values after the `return` keyword by a comma:

    def means(num1,num2):
        """
        returns the arithmetic and geometric means of two numbers.
        inputs: num1, num2; ints or floats to have their means taken.
        output: a tuple of floats containing the arithmetic and geometric means, in that order.
        """
        ari_mean = (num1 + num2) / 2
        geo_mean = (num1 * num2) ** 0.5
        
        return ari_mean, geo_mean
        
The return values can be assigned, also using commas, like this:

    ari_output, geo_output = means(4, 16)
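
To see what actually comes back, here is a small sketch that repeats the `means` function and inspects its result both ways. The variable name `both_means` is just an illustrative choice, not something defined elsewhere in these notes.

```python
def means(num1, num2):
    """Return the arithmetic and geometric means of two numbers as a tuple."""
    ari_mean = (num1 + num2) / 2
    geo_mean = (num1 * num2) ** 0.5
    return ari_mean, geo_mean

# capture both values in a single variable: the result is a tuple
both_means = means(4, 16)
print(both_means)        # (10.0, 8.0)

# or unpack the tuple into two separate variables
ari_output, geo_output = means(4, 16)
print(ari_output)        # 10.0
print(geo_output)        # 8.0
```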
    


In [11]:
# write your code here

## A note on commenting

Note that I used triple double quotes to write a multi-line comment (formally called a _docstring_) explaining the function above. You should always comment your functions to include this information:
- An explanation of what the function does
- An explanation of each input, including its data type
- An explanation of each output

This is the bare minimum. Part of your grade on every assignment going forward will be your application of comments.

In [None]:
# write your code here

### Default values

Function parameters can have _default values_ - values that are inserted if no value is supplied when the function is called. The one rule about default values is that all parameters that _do not_ have default values have to come first.

    def greet_person(name,greeting="hello"):
        print(greeting + " " + name)
        
Also note that you can refer to function parameters out of order if you call them by name:

    def greet_person(name, greeting):
        print(greeting + " " + name)
        
    greet_person("toby","hello")
    greet_person(greeting="hello",name="toby")
    
These two function calls produce identical results.
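
Here is a short sketch that puts default values and named arguments together. It uses a hypothetical variant called `build_greeting` that returns the text instead of printing it, only so the results are easy to compare.

```python
def build_greeting(name, greeting="hello"):
    # hypothetical variant of greet_person that returns the text instead of printing it
    return greeting + " " + name

default_used = build_greeting("toby")                      # greeting falls back to "hello"
positional = build_greeting("toby", "howdy")               # both values given in order
by_name = build_greeting(greeting="howdy", name="toby")    # same values, given by name

print(default_used)   # hello toby
print(positional)     # howdy toby
print(by_name)        # howdy toby
```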

In [7]:
# write your code here

## Useful built-in functions

We've used some functions already - for example, we used `str()` to turn numbers into strings. Python has a number of <a href="https://docs.python.org/3/library/functions.html">built-in functions</a> that can be used right out of the box. These include `abs` for absolute value, `max`, `min`, `sum`, and `type`, among others.
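
As a quick illustration, here are the built-ins named above applied to a made-up list of numbers (the `temperatures` data is invented for this example):

```python
temperatures = [-3, 7, 12, -8, 5]   # made-up sample data

largest = max(temperatures)          # 12
smallest = min(temperatures)         # -8
total = sum(temperatures)            # 13
distance_from_zero = abs(smallest)   # 8
kind = type(total)                   # <class 'int'>

print(largest, smallest, total, distance_from_zero, kind)
```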

In [None]:
# write your code here

## Modules

There are many, many, many, many things you will want to do in Python that require functionality beyond the few dozen built-in functions Python provides. To service these needs, Python offers modules, some of which are built and maintained by the official Python developers, and others of which are built and maintained by Python's enormous, global, open-source community.

Here are a few of the most important libraries:
- `math` (mathematical functions)
- `pandas` (dataframe package for manipulating spreadsheet-like datasets such as CSV files)
- `numpy` (extensive math library including matrix functions required for statistics)
- `scikit-learn` (machine learning package)
- `matplotlib` (data visualization)

Modules are added to your code using the `import` keyword:

    import math
    x = 5.99
    x_floor = math.floor(x)
    
Two principles are customary when it comes to modules:
- Don't just import your modules as needed in the code - when you realize the need for a module, add its `import` statement at the top of your code, along with all the other `import` statements for your entire script. This may result in a large number of module import statements; that's totally fine.
- Import modules only for the scope they are needed - if a module is used only in one function, it is considered good hygiene to limit that import to the function itself:
    ```
    def function_that_uses_some_module():
        import some_module
        [...]
    ```
    This way, the module and its functions will be visible only within the function scope, not the global scope. This increases the cleanliness of your _namespace_ - the set of all variable names and pieces of data you've loaded into your script.
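
As a concrete sketch of the second principle, the hypothetical `circle_area` function below imports `math` inside its own body. The function name is invented for illustration; `math.pi` is part of the standard `math` module.

```python
def circle_area(radius):
    # math is imported inside the function, so the name `math`
    # is a local variable that exists only while this function runs
    import math
    return math.pi * radius ** 2

area = circle_area(2)
print(area)
```

Python keeps a loaded module in memory after the first import, so calling the function repeatedly does not reload `math` each time.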
    
Note - if you are trying to use a module but Python says it's not found, use `pip` to install it from the command line:

    pip install some_module
    

In [16]:
# write your code here

Modules are typically unified by some central theme - for `pandas` it is dataframe manipulation, for example. The module will contain the functions to do almost anything you could want to do with that type of data - pandas contains summary functions like averages that can be applied per column, column renaming functionality, reshaping functions, and much more.

## A couple more functions concepts

The next two topics - functions of functions, and recursion - are pretty advanced. I am going to cover them so you will be aware of them; however, you're not expected to implement these or be an expert in them unless you're just the kind of person who likes that stuff.

## Functions of functions

Functions have a different "feel" than data types like ints or strings, because they are dynamic and not static; however, functions are treated as data by your machine, and can therefore be `return`ed by functions just like anything else. This means you can create things like _function generators_:

    def make_a_multiplier(m):
        def multiplier_instance(n):
            return n * m
        
        return multiplier_instance
        
    multiply_by_5 = make_a_multiplier(5)
    multiply_by_5(15)
    
What will this code do?

In [19]:
# write your code here

## Recursion

The last topic we'll touch on with respect to functions is the most mind-bending one of all: recursion. Recursion refers to functions that call themselves.

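    def recursive_sequence_incrementer(step, target_number, start=0):
        # parameters with default values must come last, so start is moved to the end
        # base case: stop once we have reached or passed the target
        if start >= target_number:
            return start
        # recursive case: take one step forward and call the function again
        return recursive_sequence_incrementer(step, target_number, start + step)
    
    recursive_sequence_incrementer(5, 102)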
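
If it helps, here is a second, smaller sketch of the same idea. The function `sum_up_to` is a hypothetical example, not something we will use later; the point is the two parts every recursive function needs: a base case that stops, and a recursive case that calls the function again on a smaller problem.

```python
def sum_up_to(n):
    """Hypothetical example: add up the whole numbers from 1 to n recursively."""
    # base case: nothing left to add, so stop calling ourselves
    if n <= 0:
        return 0
    # recursive case: n plus the sum of everything below it
    return n + sum_up_to(n - 1)

result = sum_up_to(4)   # 4 + 3 + 2 + 1 + 0
print(result)           # 10
```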

In [53]:
# write your code here

## Data structures

So far we've dealt with singleton data types: one int, one float, one string at a time - the one exception being functions that return multiple things in a _tuple_. But datasets are not singletons; they're large structures of many pieces of data. One way to represent many pieces of data is to use a _list_.

Lists are declared with either the "list()" function, or by using square brackets ([]). The following are equivalent:
- list((1,2,3))
    - Note the double parentheses - we'll understand this in a moment.
- [1,2,3]
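
A quick way to convince yourself the two forms are equivalent is to build both and compare them. The variable names here are just for illustration.

```python
from_function = list((1, 2, 3))   # the inner parentheses make a tuple; list() converts it
from_brackets = [1, 2, 3]

print(from_function == from_brackets)   # True
print(type(from_function))              # <class 'list'>

empty_list = list()                     # with no argument, list() gives an empty list
print(empty_list)                       # []
```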

Lists are exactly what they sound like: ordered collections of data. In some languages, lists (often called "arrays" in other settings) can contain only one data type, e.g., all ints, or all floats, or all strings. However, in Python they can contain anything, including mixed data. The following are legal lists - though I will say, while there's nothing wrong with them, these grab-bags of data do make most developers feel icky:
- [1,1.5,2] (ints mixed with floats)
- ["Pineapple", 2, 3.14159, ["another","list",2], "Jelly"] (ints, floats, strings, and lists)
- [1,1,2,3,5,8] (ints only)

Try it below:
- Create a list with three elements: your first name, your last name, and your age.
- Assign the list to a variable with whatever name you want.
- Print the list.

Note - lists and tuples behave in the same way, except tuples are created using parentheses, whereas lists are created using square brackets. Tuples cannot be modified in-place, meaning you can't, for example, set the second value of a tuple to be something else. Tuples are useful for things like constants (e.g., pi) or function return values - things you will use but not modify.

In [None]:
## write your code here

Now you know how to put data in lists. How do you get it out? For example, what if you want to access just your last name from the list above?

Python uses bracket notation to access entries in lists. So if you have a list called "my_list," you can access the first element of that list by typing:
- my_list[0]

Note that the first element of the list is element _zero_, not element one - if you asked for my_list[1], you'd get the _second_ element. This is because Python is "zero-indexed." This takes some getting used to conceptually, but is convenient for a lot of reasons which we'll find out about later.
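
Here is a small sketch of bracket notation in action, using a made-up list called `colors`:

```python
colors = ["red", "green", "blue"]   # made-up example list

first = colors[0]     # "red"   (element zero)
second = colors[1]    # "green" (element one is the second item)

# lists are mutable, so an index can also be used to replace an element
colors[0] = "purple"
print(colors)         # ['purple', 'green', 'blue']
```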

Try it below:
- Create the list ["the","quick","brown","fox","jumped","over","the","lazy","dog"].
- Assign the last element of the list to a new variable and print it out.
- Create a list with the first 10 Fibonacci numbers and assign it to a variable.
- Modify the list so the last number is divided by 3.


In [None]:
## write your code here

## Tuples

Similar to lists, tuples store multiple pieces of data. Unlike lists, tuples cannot be modified.

Tuples are created the same way as lists, except they are wrapped in regular parentheses instead of square brackets. Everything else about tuples is the same as with lists: they can contain different types of data; they are accessed by bracket notation; they are zero-indexed. Tuples are useful for more advanced programming tasks, which we'll encounter in future weeks. For now, just remember they exist.

Try it below:
- Create a tuple with the first 10 Fibonacci numbers and assign it to a variable.
- Try to modify the tuple so the last number is divided by 3. What happens?


In [None]:
## write your code here

<div class="alert alert-block alert-info">
Changing data is also technically known as "mutating" it; therefore, data types that can be modified, like lists, are called "mutable," as in, "able to be mutated." Data types like tuples that can't be changed are called "immutable."
</div>

## Dictionaries

There's one more important "collection" type object in Python: the dictionary. Dictionaries are also known as collections of "key-value pairs," and they are used as lookup lists for pieces of related data. Think about a collection of student ID -> student name mappings:

- "1030204": "Penk, Toby"
- "1030205": "Doe, John"
- ... etc

So now if you have a program that knows student IDs, you can look up the student names here to display them on a webpage, print an ID badge, etc. The syntax of dictionaries uses curly braces ({}), commas, and colons:

```
{
    "1030204": "Penk, Toby",
    "1030205": "Doe, John"
}
```

All of the keys (the entries on the left) and the values (the entries on the right) are separated by colons, and the key-value pairs themselves are separated by commas. Strictly speaking, the line spacing and visual formatting don't matter - you could create this dictionary all on one line, or with different indentation, etc. But the way I've declared it is standard because it's easy to read.

Keys are usually ints or strings (strictly speaking, any immutable value such as a float or a tuple of immutable values will also work), whereas values can be any data type at all, including other dictionaries. Dictionaries can be deeply nested, with dictionaries being the values of dictionaries being the values of dictionaries... Data types like this are the nightmares of CS students, and we will not discuss them in this class.
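
To get a value back out of a dictionary, put its key in square brackets, much like indexing a list. Here is a minimal sketch using the student dictionary from above; the variable name `students` and the extra entry are invented for illustration.

```python
students = {
    "1030204": "Penk, Toby",
    "1030205": "Doe, John"
}

# look up a value by putting its key in square brackets
name = students["1030204"]
print(name)                         # Penk, Toby

# assigning to a new key adds another key-value pair
students["1030206"] = "Roe, Jane"   # invented entry for illustration
print(len(students))                # 3
```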

Try it yourself:
- Create a dictionary where the keys are your class names, and the values are the names of the teachers of those classes.
- Print out the name of your favorite teacher by calling the appropriate key-value pair from the dictionary.


In [None]:
# write your code here

## Using dictionaries to test functions

Dictionaries can be useful for writing tests for functions you've written, to make sure they work correctly in every scenario. Think about testing this function:

    def divide_by_2(n):
        return n / 2
        
Is this always going to return correct results? Let's declare a dictionary containing the answers we know are correct:

    test_dict = {
        2: 1.0,
        5: 2.5,
        1000: 500.0
    }
    
You can iterate over the keys of a `dict` the same way you iterate over a `list`:

    for key in test_dict:
        test_val = divide_by_2(key)
        if test_val != test_dict[key]:
            print("error in key-value pair " + str(key) + ":" + str(test_dict[key]))
            
We gave the function some pretty easy tests, but the real point of testing is to expose the weird, unusual situations, or _edge cases_, where the function might fail. Try some more unusual values:

    test_dict = {
        2: 1.0,
        5: 2.5,
        1000: 500.0,
        1000000000000000: 500000000000000.0,
        0.1: 0.05,
        0.000000001: 0.0000000005,
        "string": "ERROR"
    }
   
Currently our function will pass all the tests here except the last one, where dividing a string by 2 raises a `TypeError` and stops the loop. How can we make it handle the last one correctly?


In [70]:
# write your code here

## Three common data structure transformations

There are three functions that are commonly implemented in programming languages to enable canonical data operations:
- `filter` returns only values meeting some criteria you set
- `map` transforms every value in a list by some function you specify
- `reduce` collapses a list into a single value by repeatedly combining elements, two at a time, with a function you specify

### filter()

There are two ways to specify the function that will be used to `filter` a dataset.
1. Declare a function outside the `filter` and call it by name
2. Use function logic entirely inside the `filter` with the keyword `lambda`

```
#first method
def is_even(n):
    """
    determines whether integer n is even
    inputs: n, an int
    output: one boolean, True if n is even; False otherwise
    """
    return n % 2 == 0

my_list = list(range(20)) # list function makes range more usable
my_list_evens = list(filter(is_even,my_list))
```

We declare an even-number-checker called `is_even`; then, we give that function, and a list of numbers, to the `filter` function. `filter` returns a `filter` object, which is not technically a `list`, so we coerce the `filter` to a `list` with the `list()` function.

We can also do this by passing the logic of `is_even` directly to the `filter` without declaring a separate function. This is how `lambda` works:

```
#second method
my_list = list(range(20)) # list function makes range more usable
my_list_evens = list(filter(lambda x: x % 2 == 0,my_list))
```

These two methods are exactly equivalent; therefore, your consideration should mainly be which is easier to implement and to read. `lambda` can be easier to write, and for some reason (I think most coders would agree) is simply more fun to code. However, when the logic gets more complicated, `lambda` can be a lot less readable.
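
If you want to check the equivalence for yourself, here is a small sketch with a different test. The `is_big` function and the `numbers` list are made up for this example.

```python
def is_big(n):
    # hypothetical predicate: True for numbers greater than 10
    return n > 10

numbers = [4, 15, 8, 23, 42, 10]   # made-up sample data

with_named_function = list(filter(is_big, numbers))
with_lambda = list(filter(lambda n: n > 10, numbers))

print(with_named_function)                  # [15, 23, 42]
print(with_named_function == with_lambda)   # True
```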

Try it yourself:
- Create a list of strings giving the first names of ten people
- Filter the list to the names that are exactly 4 letters long
    - Hint: use the len() function
    - You can implement this with a named function or with lambda; it's up to you


In [14]:
# write your code here

### map()

`map` works very similarly to `filter`, except the function you pass will be used to _transform_, rather than _filter_, your data:

    def cube(n):
        """
        returns the cube of a number, aka that number raised to the third power
        inputs: n, an int or float
        outputs: an int or float representing the cube
        """
        return n ** 3
        
    my_list = list(range(20))
    my_list_cubed = list(map(cube, my_list))
    
This is exactly equivalent to the `lambda` version:

    my_list = list(range(20))
    my_list_cubed = list(map(lambda x: x ** 3, my_list))
    
Try it yourself:
- Declare a list containing the first ten letters of the alphabet in lowercase
- `map` that list to an uppercase version


In [73]:
# write your code here

### reduce()

The last common list transformation is `reduce`, which returns one value from a list. This can also be done with a dedicated function or a `lambda`. Note that unlike `map` and `filter`, `reduce` is not a built-in function; it comes from the `functools` module:

    import functools
    
    def add(x,y):
        return x + y
        
    my_list = list(range(10))
    list_sum = functools.reduce(add,my_list)
    
This is equivalent to:

    import functools
        
    my_list = list(range(10))
    list_sum = functools.reduce(lambda x, y: x + y, my_list)
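
If `reduce` feels opaque, it may help to see roughly what it does step by step. The loop below is a simplified sketch of the idea, not the actual implementation inside `functools`: it starts with the first element and keeps combining the running result with the next element.

```python
import functools

my_list = list(range(10))

# roughly what reduce does: start with the first element,
# then combine the running result with each remaining element
running_total = my_list[0]
for value in my_list[1:]:
    running_total = running_total + value

list_sum = functools.reduce(lambda x, y: x + y, my_list)

print(running_total)              # 45
print(running_total == list_sum)  # True
```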
    
Try it yourself:
- Write a function that uses `reduce` to calculate the factorial of an integer.
    - Recall that the factorial of n is the product of all positive integers up to and including n. So the factorial of 5 is 5 * 4 * 3 * 2 * 1.
    

In [76]:
# write your code here

## String methods

In [None]:
string splitting and joining



## Classes