<a href="https://colab.research.google.com/github/jbpost2/ST-554-Big-Data-With-Python-Course-Notes/blob/main/01_Programming_in_python/18_More_Function_Writing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# More on Writing Functions

Justin Post


---

## Programming in Python

We've covered a lot of Python already!

- `JupyterLab` & Markdown
- Basic data types
    + Strings, Floats, Ints, Booleans
- Compound data types
    + Lists, Tuples, Dictionaries, `Numpy` arrays, `pandas` series & data frames
- Writing Functions
- Control flow (if/then/else, Looping)
- Data uses and basic summarizations

Next up we'll cover a bit more about writing our own functions, with tools that greatly increase their usefulness! Then we'll talk about how to plot our data using `matplotlib` and `pandas`.

Then we'll really be ready to start talking about modeling, data sources, and generally moving towards doing fun things with big data!

---

## Writing Functions Recap

- Writing functions is super cool!
- Recall the basic syntax

In [None]:
def func_name(args):
    """
    Doc string
    """
    body
    return object

- We saw that there were many ways to set up your function arguments and to call your function
- Remember that variables defined within the function are not generally available outside of the function
  + That is, a new *symbol table* is used when the function is called
  + We can define global variables if we really want to
- We return what we want from the function with `return`
  + If we don't return anything then `None` is returned!
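
As a quick refresher, here is a minimal sketch of the scope and `return` points above. The names `add_tax`, `rate`, and `no_return` are made up for illustration.

```python
def add_tax(price):
    # rate lives only in the function's local symbol table
    rate = 0.07
    return price * (1 + rate)

total = add_tax(100)
print(total)

try:
    print(rate)  # rate was never defined outside the function
except NameError as err:
    print("NameError:", err)

def no_return(x):
    x + 1  # computed, but never returned

result = no_return(5)
print(result is None)  # True
```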

The topics we'll cover in this notebook are:
- Packing and unpacking with functions
  + Catching extra arguments given to a function
  + Passing your arguments to a function from an object
- **lambda** functions
- `map()`, `filter()`, and `functools.reduce()`

In a later notebook we'll talk about how to handle errors or exceptions!

---

## Packing and Unpacking

Reminder: We can **unpack** a list into separate variables by listing the variable names, separated by commas, on the left side of the assignment:

`first, second, third = ...`

Let's look at an example:

In [2]:
animals = ["Dog", "Cat", "Horse", "Frog", "Cow", "Buffalo", "Deer", "Fish", "Bird", "Fox", "Racoon"]
short_animals = animals[:3]

first, second, third = short_animals
print(first + " " + second + " " + third)

Dog Cat Horse


We saw that we can pack leftover elements into a list using `*variable`:

In [3]:
first, second, third, *other = animals
print(first + " " + second + " " + third)
print(other)

Dog Cat Horse
['Frog', 'Cow', 'Buffalo', 'Deer', 'Fish', 'Bird', 'Fox', 'Racoon']
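
The starred variable does not have to come last. A small sketch, using a shortened version of the list and hypothetical names `middle` and `rest`:

```python
animals = ["Dog", "Cat", "Horse", "Frog", "Cow"]

# the starred name collects whatever is left over, wherever it sits
first, *middle, last = animals
print(first, last)   # Dog Cow
print(middle)        # ['Cat', 'Horse', 'Frog']

# with nothing left over, the starred name is an empty list
a, b, *rest = ["Dog", "Cat"]
print(rest)          # []
```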


---

### Unlimited Positional Arguments

- This idea can be used when writing a function!
- In this case we define an argument to our function with `*variable`
- This allows us to pass unlimited **positional** arguments to our function ([variadic](https://docs.python.org/3/tutorial/controlflow.html#defining-functions) arguments)
- The inputs are handled as a `tuple` in the function!

Let's write a silly function to print out all arguments passed via this idea

In [12]:
def basic_print(*args):
  print(type(args))
  print(args)
  return None

We can pass this function as many things as we'd like and they will all be accessible within the function body as a tuple. We can see this as the printed values are surrounded by `(...)`, which implies we are printing a tuple!

In [13]:
basic_print("hi", ["a list", "how fun"], 3, 10)

<class 'tuple'>
('hi', ['a list', 'how fun'], 3, 10)


As tuples are iterable, we can iterate across these elements via a loop!

In [16]:
def basic_print_elements(*args):
  for i in args:
    print(type(i),i)
  return None

In [17]:
basic_print_elements("hi", ["a list", "how fun"], 3, 10)

<class 'str'> hi
<class 'list'> ['a list', 'how fun']
<class 'int'> 3
<class 'int'> 10


Let's define a function that takes in as many 1D numpy arrays or pandas series as the user would like and returns the mean of each input.

We'll also take an argument for the number of decimal places to return for the means.

In [18]:
def find_means(*args, decimals = 4):
    """
    Assume that args will be a bunch of numpy arrays (1D) or pandas series
    Return the mean of each, rounded to `decimals` places
    """
    means = []
    for x in args: #iterate over the tuple values
        means.append(np.mean(x).round(decimals))
    return means

- Create some data with `numpy` to send to this

In [19]:
import numpy as np
from numpy.random import default_rng
rng = default_rng(3) #set seed to 3

#generate a few samples of standard normal data
n5 = rng.standard_normal(5)       #sample size of 5
n25 = rng.standard_normal(25)     #sample size of 25
n100 = rng.standard_normal(100)   #sample size of 100
n1000 = rng.standard_normal(1000) #sample size of 1000

Let's pass these to our function!

In [20]:
find_means(n5, n25, n100, n1000, decimals = 2)

[-0.22, 0.11, -0.01, 0.04]

Awesome! This gives us a lot more functionality with our function writing.
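
One detail worth spelling out: any argument defined after `*args` (like `decimals` above) can only be supplied by keyword, and ordinary positional arguments can still come before `*args`. The sketch below is an illustration only. The function `summarize` and its `label` argument are hypothetical, and it uses `statistics.fmean` from the standard library in place of `np.mean` so that it runs on plain lists.

```python
from statistics import fmean

def summarize(label, *samples, decimals=4):
    """label is an ordinary positional argument, samples catches every
    remaining positional argument, and decimals is keyword-only."""
    means = [round(fmean(s), decimals) for s in samples]
    return label, means

a = [1.0, 2.0, 4.0]
b = [10.0, 20.0]

print(summarize("run 1", a, b, decimals=1))   # ('run 1', [2.3, 15.0])

# a trailing positional value is NOT read as decimals; it is caught
# by *samples as one more sample
label, means = summarize("run 2", a, b, [2.0])
print(len(means))                             # 3
```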

---

### Unlimited Keyword Arguments

- You can also pass unlimited **keyword** arguments if you define the arg with a `**`
- Handled as a **dictionary** in the function

Let's write a basic function to print out the keywords with their values.

In [27]:
def print_key_value_pairs(**kwargs):
    """
    key word args can be anything
    """
    print(type(kwargs), kwargs)
    for x in kwargs:
        print(x + " : " + str(kwargs.get(x))) #cast the value to a string for printing

Now we pass as many named arguments as we'd like!

In [29]:
print_key_value_pairs(
  name = "Justin",
  job = "Professor",
  phone = 9195150637)

<class 'dict'> {'name': 'Justin', 'job': 'Professor', 'phone': 9195150637}
name : Justin
job : Professor
phone : 9195150637
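
Both ideas can be combined in one function definition. A minimal sketch, with a hypothetical function `show_call` that reports what it caught:

```python
def show_call(*args, **kwargs):
    """Report what was caught positionally and by keyword."""
    print("positional:", args)
    for key, value in kwargs.items():   # kwargs is an ordinary dictionary
        print("keyword:", key, "=", value)
    return len(args), len(kwargs)

counts = show_call(1, "two", three=3, four="4")
print(counts)   # (2, 2)
```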


---

## Unpacking Arguments

- Suppose we want to call our function but our *arguments are stored in a list or tuple*
  - We'll do this a bit when we do our machine learning models!
- We can *unpack* this list or tuple to be our function arguments by calling our function in a particular way.

In [30]:
#We want to call our find_means function with these arguments
call_args = [n5, n25, n100, n1000]

- Call the function using `*call_args` (unpacking)

In [31]:
find_means(*call_args, decimals = 3)

[-0.223, 0.114, -0.014, 0.04]

Nice! Now we can more easily call our function too!

- We can do the same thing with our keyword arguments.
- Suppose our *keyword arguments* are stored in a *dictionary*
- Can call the function using `**kw_call_args` (unpacking)

Define a quick function.

In [None]:
def print_items(name, job, number):
  print("Name is: ", name)
  print("Job is: ", job)
  print("Phone number is: ", number)
  return

Create a dictionary with key-value pairs corresponding to our inputs.

In [34]:
kw_call_args = {"name": "Justin Post", "job": "Professor", "number": "9195150637"}
kw_call_args

{'name': 'Justin Post', 'job': 'Professor', 'number': '9195150637'}

Call our function using `**` with our dictionary!

In [36]:
print_items(**kw_call_args)

Name is:  Justin Post
Job is:  Professor
Phone number is:  9195150637
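
Note that the dictionary keys must match the function's parameter names. The sketch below repeats `print_items` so that it runs on its own; `good_args` and `bad_args` are made-up names.

```python
def print_items(name, job, number):
    print("Name is:", name)
    print("Job is:", job)
    print("Phone number is:", number)

good_args = {"name": "Justin Post", "job": "Professor", "number": "9195150637"}
print_items(**good_args)   # same as print_items(name=..., job=..., number=...)

bad_args = {"name": "Justin Post", "job": "Professor", "phone": "9195150637"}
try:
    print_items(**bad_args)   # there is no parameter called phone
except TypeError as err:
    print("TypeError:", err)
```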


- Passing named and unnamed arguments can both be done at once!
- Recall our `find_means` function inputs:
`def find_means(*args, decimals = 4):`


In [39]:
dec_dictionary = {"decimals": 6}
find_means(*call_args, **dec_dictionary)

[-0.223413, 0.114454, -0.014443, 0.039762]

---

## Lambda Functions

We often want to create a quick function for a single purpose that we don't want to reuse for later.

Rather than defining a function and storing it as an object the way we've been doing, we can create a **lambda** function (also sometimes called an **in-line** function or an **anonymous** function).

- Use keyword `lambda`
- Define arguments followed by a `:`
- Give the action for the function to perform
  - The body must be a *single expression*, so it cannot contain `return` or other statements

In [40]:
square_it = lambda x : x**2
square_it(10)

100

In [41]:
square_then_add = lambda x, y : x**2 + y
square_then_add(10, 5)

105

- Can still define the arguments in many ways

In [49]:
my_print = lambda x, y = "ho": print(x, y)
my_print("hi")

hi ho


In [50]:
my_print = lambda *x: [print("Input: " + str(z)) for z in x]
my_print("hi", "ho")

Input: hi
Input: ho


[None, None]
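
The `[None, None]` above appears because `print()` displays its argument but returns `None`. The sketch below shows this, along with the fact that a lambda behaves like a `def` with the same body. The names `square_lambda`, `square_def`, and `results` are just for illustration.

```python
# a lambda and a def with the same body behave the same way
square_lambda = lambda x: x ** 2

def square_def(x):
    return x ** 2

print(square_lambda(10) == square_def(10))   # True

# print() returns None, so a list comprehension built from
# print() calls is a list of None values
results = [print("Input: " + str(z)) for z in ("hi", "ho")]
print(results)   # [None, None]

# a plain loop avoids building that throwaway list
for z in ("hi", "ho"):
    print("Input: " + str(z))
```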

Now, saving the function in an object is really kind of counter to the point of an anonymous (lambda) function. We don't usually save these for later use! We'll see many uses for lambda functions. Let's cover one of those here.

### `map()`

Using lambda functions comes up a lot in the `map/reduce` idea. This is important for what we'll do!

Map/reduce idea:
- Apply (or *map*) a function to each element of an iterable object
- Combine (or *reduce*) the results where able

Example: Counting words
- Want to take a list of words and create a tuple with the word and the value 1
- Syntax for `map`:
  + `map(function, object_to_apply_function_to)`

In [51]:
res = map(
    lambda word: (word, 1),
    ["these", "are", "my", "words", "these", "words", "rule"]
    )

Similar to other functions like `range` or `zip`, `map()` doesn't immediately return the mapped values. Instead we get a `map` object (a lazy iterator) that produces the mapped values when we ask for them.

In [47]:
print(type(res))
res

<class 'map'>


<map at 0x785ed8b403d0>

We can convert the `map` object to a list using `list()`

In [48]:
list(res)

[('these', 1),
 ('are', 1),
 ('my', 1),
 ('words', 1),
 ('these', 1),
 ('words', 1),
 ('rule', 1)]
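
To finish the word-count idea, the `(word, 1)` pairs can be combined into counts. This is only one possible sketch of the *reduce* step. It uses `functools.reduce()`, which is covered later in this notebook, and the helper `add_pair` is a hypothetical name.

```python
from functools import reduce

words = ["these", "are", "my", "words", "these", "words", "rule"]
pairs = map(lambda word: (word, 1), words)   # the "map" step

def add_pair(counts, pair):
    """Fold one (word, 1) pair into the running dictionary of counts."""
    word, n = pair
    counts[word] = counts.get(word, 0) + n
    return counts

word_counts = reduce(add_pair, pairs, {})     # the "reduce" step
print(word_counts)
# {'these': 2, 'are': 1, 'my': 1, 'words': 2, 'rule': 1}
```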

Let's return the square of some values without defining a square function via `map()`

In [52]:
map(lambda r: r **2, range(0,5))

<map at 0x785ed8b40cd0>

In [54]:
list(map(lambda r: r **2, range(0,5)))

[0, 1, 4, 9, 16]

Note: this can equivalently be done using a list comprehension!

In [55]:
[r ** 2 for r in range(0,5)]

[0, 1, 4, 9, 16]

Another example of using map with a lambda function might be to quickly uppercase a list of strings.

In [56]:
list(map(lambda x: x.upper(), ['cat', 'dog', 'wolf', 'bear', 'parrot']))

['CAT', 'DOG', 'WOLF', 'BEAR', 'PARROT']

Again, this could be done with a list comprehension!

In [57]:
[x.upper() for x in ['cat', 'dog', 'wolf', 'bear', 'parrot']] #equivalent

['CAT', 'DOG', 'WOLF', 'BEAR', 'PARROT']
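
Two more behaviors of `map()` that are easy to trip over, shown as a short sketch: a `map` object can only be consumed once, and `map()` accepts more than one iterable when the function takes more than one argument.

```python
res = map(lambda r: r ** 2, range(0, 5))

first_pass = list(res)
second_pass = list(res)
print(first_pass)    # [0, 1, 4, 9, 16]
print(second_pass)   # [] because the map object has already been used up

# map() also accepts several iterables and passes one element from each
sums = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))
print(sums)          # [11, 22, 33]
```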

---

### Lambda Functions and `map()`

- Can use lambda functions to create a function generator (a function that builds and returns other functions)

In [None]:
def raise_power(k):
    return lambda r: r ** k   #the returned function remembers k

square = raise_power(2)
print(square(10))   #100
cube = raise_power(3)
print(cube(10))     #1000

We can also use `map()` with `raise_power` to create several of these functions at once.

In [None]:
ident, square, cube = map(raise_power, range(1,4))
print(ident(4))    #4
print(square(4))   #16
print(cube(4))     #64
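
A self-contained sketch of what is going on: each call to `raise_power` returns a new function that remembers its own value of `k`. The list `powers` is an extra illustration, not part of the original example.

```python
def raise_power(k):
    # the returned lambda remembers the value of k it was created with
    return lambda r: r ** k

ident, square, cube = map(raise_power, range(1, 4))
print(ident(4), square(4), cube(4))   # 4 16 64

# the same idea builds a whole table of powers of 2
powers = [raise_power(k)(2) for k in range(0, 5)]
print(powers)                          # [1, 2, 4, 8, 16]
```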

---

### `filter()`

- Lambda functions can be used with `filter()`
    + `filter()` takes a **predicate** (a function that returns `True` or `False` for each element) as the first arg and an iterable as the second, and keeps only the elements for which the predicate is true

In [None]:
list(filter(lambda x: x in "aeiou", "We want to return just the vowels."))
[x for x in "We want to return just the vowels." if x in "aeiou"] #equivalent


In [None]:
list(filter(lambda x: (x % 2) != 0, range(0, 10)))
[x for x in range(0, 10) if (x % 2) != 0]#equivalent
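
The predicate does not have to be a lambda. The sketch below uses a named function (the hypothetical `is_vowel`) and also shows that passing `None` as the predicate keeps only the "truthy" elements.

```python
def is_vowel(ch):
    """Predicate: True when ch is a vowel, ignoring case."""
    return ch.lower() in "aeiou"

vowels = list(filter(is_vowel, "We want Only vowels"))
print(vowels)   # ['e', 'a', 'O', 'o', 'e']

# with None as the predicate, filter() drops falsy values
# such as 0, "", None, and empty lists
kept = list(filter(None, [0, 1, "", "a", None, [], [2]]))
print(kept)     # [1, 'a', [2]]
```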

---

### `functools.reduce()`

- Lambda functions can be used with `functools.reduce()`
    + `reduce()` takes in a function of two variables and an iterable, applies the function cumulatively across the elements (to the running result and the next element), and returns the final result

In [None]:
from functools import reduce
reduce(lambda x, y: x + y, range(1,11)) # sum first 10 numbers
sum(x for x in range(1,11))

In [None]:
#add an initial value to the computation
reduce(lambda x, y: x + y, range(1,11), 45) # sum first 10 numbers + 45
sum(x for x in range(1,11)) + 45
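
To make the mechanics concrete, here is a rough sketch of `reduce()` written as a loop. The function `my_reduce` is a hypothetical stand-in, not the real implementation, and it assumes `None` is never used as a genuine initial value.

```python
from functools import reduce

def my_reduce(func, iterable, initial=None):
    """Simplified stand-in for functools.reduce, written as a loop."""
    items = iter(iterable)
    # without an initial value, the first element starts the running result
    result = next(items) if initial is None else initial
    for item in items:
        result = func(result, item)   # combine running result with next element
    return result

add = lambda x, y: x + y
print(my_reduce(add, range(1, 11)))       # 55
print(my_reduce(add, range(1, 11), 45))   # 100
print(reduce(add, range(1, 11), 45))      # 100, same as the real reduce()
```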

In [None]:
#create a list of numbers to find the max of
my_list = [53, 13, 103, 2, 15, -10, 201, 6]
reduce(lambda x, y: x if x > y else y, my_list)      #201
reduce(lambda x, y: x if x > y else y, my_list, 500) #500, since the initial value is the largest

---

## To JupyterLab!

- Use lambda functions with `sorted()` to define the `key` to sort on

- Use `map()` to demonstrate the LLN
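
A possible sketch of both exercises. The `ids` list comes from these notes; the law of large numbers (LLN) part is an assumption about what the demonstration might look like, and it uses the standard library's `random` module and a hypothetical helper `sample_mean` rather than `numpy`.

```python
import random

ids = ["id1", "id2", "id30", "id3", "id22", "id100"]
print(sorted(ids))                             # strings sort character by character
# ['id1', 'id100', 'id2', 'id22', 'id3', 'id30']
print(sorted(ids, key=lambda x: int(x[2:])))   # sort on the number after "id"
# ['id1', 'id2', 'id3', 'id22', 'id30', 'id100']

random.seed(3)
sample_sizes = [10, 100, 1000, 10000]

def sample_mean(n):
    """Mean of n draws from a standard normal distribution."""
    return sum(random.gauss(0, 1) for _ in range(n)) / n

# map the sample-mean function over increasing sample sizes;
# the means should tend to settle near the true mean of 0
means = list(map(sample_mean, sample_sizes))
for n, m in zip(sample_sizes, means):
    print(n, round(m, 3))
```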

<!--
ids = ['id1', 'id2', 'id30', 'id3', 'id22', 'id100']
print(sorted(ids)) # Lexicographic sort
['id1', 'id100', 'id2', 'id22', 'id3', 'id30']
sorted_ids = sorted(ids, key=lambda x: int(x[2:])) # Integer sort
print(sorted_ids)-->

---

## Recap

- Catching extra arguments to a function

- Passing your arguments to a function from an object

- **lambda** functions

- `map()`, `filter()`, and `functools.reduce()`
