<a href="https://colab.research.google.com/github/wesleybeckner/python_foundations/blob/main/notebooks/S3_Functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python Foundations, Session 3: Functions and Pandas Introduction

**Instructor**: Wesley Beckner<br>

**Contact**: wesleybeckner@gmail.com

<i>special thanks to [David Beck](https://www.cheme.washington.edu/facultyfinder/david-beck) for their contribution to this material</i>

<br>

---

<br>

Today, we will discuss **_functions_** in more depth.  We've seen them previously and used them, for example the `.append()` **_function_** for lists, or the even more general `print()` function.  Here, we'll dig into how you can make your own functions to encapsulate code that you will reuse over and over.  

<br>

---

## 3.0 Review from Session on Data Structures and Flow Control

In our last session, we discussed **_lists_**, **_dictionaries_**, and **_flow control_**.

**_Lists_** are **_ordered collections_** of data that can be used to hold multiple pieces of information while preserving their order.  We use `[` and `]` to access elements by their indices, which start at `0`.  Operations on **_lists_**, such as slices, use an inclusive lower bound and an exclusive upper bound.  So, the following gets elements from the **_list_** `my_list` with index values of `0`, `1`, and `2`, but **not** `3`!

```
my_list[0:3]
```
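As a quick check of the inclusive-lower, exclusive-upper rule, here is a minimal sketch. The list `letters` is a made-up example, not something from the previous session.

```python
# A hypothetical list, used only to illustrate slice bounds
letters = ['a', 'b', 'c', 'd', 'e']

first_three = letters[0:3]   # indices 0, 1, 2 (index 3 is excluded)
middle = letters[1:4]        # indices 1, 2, 3 (index 4 is excluded)

print(first_three)           # ['a', 'b', 'c']
print(middle)                # ['b', 'c', 'd']

# The length of a slice is the upper bound minus the lower bound
print(len(middle) == 4 - 1)  # True
```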



#### 🙋 Question 1: Slicing

> What other way is there of writing the same statement using **_slicing_**?  Hint, think about leaving out one of the numbers in the slice!

**_Dictionaries_** are **_named_** **_collections_** of data that can be used to hold multiple pieces of information as **_values_** that are addressed by **_keys_**, resulting in a **_key_** to **_value_** data structure.  They are accessed with `[` and `]` but initialized with `{` and `}`.  E.g.

```
my_dict = { 'cake' : 'Tasty!', 'toenails' : 'Gross!' }
my_dict['cake']
```

Finally, we talked about **_flow control_** and using the concept of **_conditional execution_** to decide which code statements were executed.  Remember this figure?

<img src="https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/lnpls008.gif" alt="Flow control figure">

> What are the **_if_** statements? <br> 
Where do **_for_** loops fit in? <br>

## 3.1 Functions

For loops let you repeat some code for every item in a list.  Functions are similar in that they run the same lines of code each time they are called, frequently with new values for some variables (we call these **_parameters_**).  They are different in that functions are not limited to looping over items.

Functions are a critical part of writing easy to read, reusable code.

Create a function like:
```
def function_name (parameters):
    """
    optional docstring
    """
    function expressions
    return [variable]
```

Here is a simple example.  It prints a string that was passed in and returns nothing.

```
def print_string(string):
    """This prints out a string passed as the parameter."""
    print(string)
    return
```

In [None]:
def print_string(string):
  """This prints out a string passed as the parameter"""
  print(string)
  return

To call the function, use:
```
print_string("GIX is awesome!")
```

_Note:_ The function has to be defined before you can call it!

In [None]:
print_string("GIX is awesome!")

GIX is awesome!


### 3.1.1 Reserved words: def, return, and yield

Notice the highlighted words in our function definition: `def` and `return`. These are *reserved words* in Python. Every function definition requires `def`; `return` is optional, and a function that ends without it returns `None`. `yield` is another reserved word that is similar to `return` but operates slightly differently. It is beyond the scope of what we are covering in this session. This tutorial from [realpython](https://realpython.com/introduction-to-python-generators/) has good information on the topic.
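To see what a function gives back when it has no `return` statement, here is a small sketch. The function `greet` is a hypothetical name used only for illustration; `keyword.iskeyword` comes from Python's standard library.

```python
import keyword

def greet(name):
    # No return statement: Python returns None implicitly
    print(f"Hello, {name}!")

result = greet("GIX")
print(result)  # None

# def, return, and yield are all reserved words
for word in ("def", "return", "yield"):
    print(word, keyword.iskeyword(word))
```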

In [None]:
# what is return doing in this function?
def my_square(a):
  return a ** 2

`return` outputs whatever value(s) follow the keyword when we call our function.

In [None]:
a = 2
my_square(a)

4

I'm going to return two values...

In [None]:
def my_square(a):
  return a ** 2, a

and we see how the output updates accordingly

In [None]:
my_square(a)

(4, 2)

We can capture these values on the output with...

In [None]:
square, new_a = my_square(a)

In [None]:
print(square, new_a)

4 2
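Under the hood, returning two values separated by a comma returns a single tuple, which is why it can be unpacked into two names. A minimal sketch (the names `result`, `sq`, and `orig` are chosen here for illustration):

```python
def my_square(a):
    return a ** 2, a

result = my_square(3)
print(type(result))  # <class 'tuple'>
print(result[0])     # 9

# Unpacking assigns each element of the tuple to its own name
sq, orig = my_square(3)
print(sq, orig)      # 9 3
```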


### 3.1.2 Global vs local variables and function parameters

In a function, new variables that you create are not saved when the function returns; these are **_local_** variables.  Variables defined outside of the function are **_global_** variables: a function can read them, but assigning to the same name inside the function creates a new local variable rather than changing the global one.

let's define the following function

In [None]:
def my_little_func(a):
  b = 10
  return a * b

In [None]:
my_little_func(2)

20

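if I run the following (with the `#` removed), Python raises a `NameError`, because so far `b` has only been defined inside `my_little_func`...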

In [None]:
# b

Let's play with this a little further...

...now let's define b outside the function and call our function with `a=5`

In [None]:
# what happens here?
b = 100
my_little_func(5)

50

The result is 50, not 500: inside `my_little_func`, `b` is a *local* variable set to 10. The global `b` is still 100.

It doesn't matter how I define `b` outside the function, because within the function it is set locally.

... Let's do this A LITTLE MORE

In [None]:
def my_new_func(a):
  print(b)
  return a*b

now if I call on my new function, because `b` is not defined locally within the function, it takes on the global value. 

This is typically not happy happy fun fun behavior for us, we want to be explicit about how we define and use our variables (but there are some times when this is appropriate to do)

In [None]:
b= 1e4 # side note, what did I do here????
my_new_func(a)

10000.0


20000.0
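The two behaviors can be put side by side in one sketch. The names `counter`, `read_counter`, and `shadow_counter` are hypothetical and chosen so they do not clash with the `b` used in the cells above.

```python
counter = 100  # a global variable (hypothetical name)

def read_counter():
    # No local 'counter' exists here, so the global value is read
    return counter + 1

def shadow_counter():
    # Assignment creates a new local 'counter'; the global is untouched
    counter = 5
    return counter + 1

print(read_counter())    # 101
print(shadow_counter())  # 6
print(counter)           # 100
```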

#### 3.1.2.1 Function Parameters

Arguments in Python are passed as references to objects.  This means that if you modify a mutable argument, such as a list, inside the function, the change is also visible outside of the function. Immutable objects behave differently (Enrichment: Exceptions, see below).

See the following example:

```
def change_list(my_list):
    """This appends an item to the list passed into this function"""
    my_list.append('four')
    print('list inside the function: ', my_list)
    return

my_list = [1, 2, 3]
print('list before the function: ', my_list)
change_list(my_list)
print('list after the function: ', my_list)
```

In [None]:
def change_list(my_list):
    """This appends an item to the list passed into this function"""
    my_list.append('four')
    print('list inside the function: ', my_list)
    return

my_list = [1, 2, 3]
print('list before the function: ', my_list)
change_list(my_list)
print('list after the function: ', my_list)

list before the function:  [1, 2, 3]
list inside the function:  [1, 2, 3, 'four']
list after the function:  [1, 2, 3, 'four']
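The change is visible outside because `.append()` modifies the list object itself. Assigning a new value to the parameter name, by contrast, only re-points the local name. A minimal sketch of the difference (the function names `mutate` and `rebind` are hypothetical):

```python
def mutate(items):
    # Changes the object the caller passed in
    items.append('four')

def rebind(items):
    # Points the local name at a brand-new list; the caller's list is untouched
    items = ['replaced']
    return items

numbers = [1, 2, 3]
mutate(numbers)
print(numbers)  # [1, 2, 3, 'four']

rebind(numbers)
print(numbers)  # [1, 2, 3, 'four']
```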


#### 3.1.2.2 Enrichment: Global, local, and immutables

Let's go back to our former example...

immutables:

* integers, float, str, tuples

In [None]:
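# b is reassigned on every line, so only the last assignment is in effect.
# Comment out lines from the bottom up to try each type in turn.
b = "a string"
b = 10
b = 10.2
b = (10, 2)
b = [10, 2]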
a = 2

def my_little_func(a, b):
  if type(b) == str:
    b += "20"
  elif (type(b) == int) or (type(b) == float):
    b += 10
  elif (type(b) == tuple):
    print("AYYY no tuple changes, Dude")
    pass
  elif (type(b) == list):
    b.append('whoaaaa')
  print(b)
  return

print(b)
my_little_func(a, b)
print(b)

[10, 2]
[10, 2, 'whoaaaa']
[10, 2, 'whoaaaa']
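To compare all the types in one run, without commenting lines in and out, here is a sketch. The function `add_to_itself` and the variable names are hypothetical. For immutable types, `+=` builds a new object and rebinds the local name; for a list, `+=` extends the same object in place.

```python
def add_to_itself(value):
    # For immutable types, += creates a new object bound to the local name
    value += value
    return value

n = 10
s = "ab"
t = (10, 2)
lst = [10, 2]

print(add_to_itself(n), n)      # 20 10
print(add_to_itself(s), s)      # abab ab
print(add_to_itself(t), t)      # (10, 2, 10, 2) (10, 2)

# A list is mutable: += extends it in place, so the caller sees the change
print(add_to_itself(lst), lst)  # [10, 2, 10, 2] [10, 2, 10, 2]
```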


There is a way to change a global variable within a function: the **_global_** keyword.  Generally, the use of **_global_** variables is discouraged; use parameters instead. We won't cover the `global` keyword in depth, but the cell below shows a minimal example, and you can [explore further](https://www.programiz.com/python-programming/global-keyword) on your own if you are interested. 

In [None]:
b = 10
a = 2

def my_little_func(a):
  global b
  b += 20
  print(b)
  return 

print(b)
my_little_func(a)
print(b)

10
30
30
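For comparison, the same update can be written with a parameter and a return value, which is the style recommended above. This is only a sketch; `add_twenty` and `total` are hypothetical names.

```python
def add_twenty(value):
    # No global state: the input arrives as a parameter,
    # and the result leaves through return
    return value + 20

total = 10
total = add_twenty(total)  # the caller decides whether to overwrite
print(total)               # 30
```

Written this way, the function can be reused with any starting value, and nothing outside it changes unless the caller assigns the result.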


#### 🏋️ Exercise 1: My first function

Write a function that takes one parameter and returns any data structure

> If you are going to return multiple objects, what data structure that we talked about can be used?  Give an example below.

In [None]:
# Cell for exercise 1

### 3.1.3 Parameter types

**Function calling:**

* positional 
    * `func(10, 20)`
* keyword
    * `func(a=10, b=20)` or `func(b=20, a=10)`

**Function writing:**
* default
    * `def func(a=10, b=20)`



```
def print_name(first, last='Beckner'):
    print(f'Your name is {first} {last}')
    return
```

In [1]:
def print_name(first, last='Beckner'):
    print("Your name is {} {}".format(first, last))
    return

In [2]:
print_name(last='Beckner', first='Wesley')

Your name is Wesley Beckner


Play around with the above function.

In [3]:
print_name('Wesley', last='the Python Foundations Instructor')

Your name is Wesley the Python Foundations Instructor
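The calling styles listed above can be mixed, as long as positional arguments come before keyword arguments. The sketch below uses a hypothetical function `describe`, which returns the string instead of printing it so the results are easy to compare.

```python
def describe(first, last='Beckner'):
    return f'{first} {last}'

print(describe('Wesley'))                   # default used for last
print(describe('Wesley', 'B.'))             # both positional
print(describe(last='B.', first='Wesley'))  # keywords, in any order
print(describe('Wesley', last='B.'))        # positional first, then keyword

# describe(first='Wesley', 'B.') would be a SyntaxError:
# positional arguments cannot follow keyword arguments
```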


Functions can contain any code that you put anywhere else including:
* `if`...`elif`...`else`
* `for`...`while`
* other function calls

```
def print_name_age(first, last, age):
    print_name(first, last)
    print('Your age is %d' % (age))
    if age > 25 and age < 40:
        print('You are a millennial!')
    return
```


In [None]:
def print_name_age(first, last, age):
    print_name(first, last)
    print('Your age is %d' % (age))
    if age > 25 and age < 40:
        print('You are a millennial!')
    return

```
print_name_age(age=29, last='Beckner', first='Wesley')
```

In [None]:
print_name_age(age=29, last='Beckner', first='Wesley')

Your name is Wesley Beckner
Your age is 29
You are a millennial!
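The example above uses an `if` statement and another function call. A function can hold a `for` loop in the same way. In the sketch below, `square` and `sum_of_squares` are hypothetical names, and skipping negative numbers is an arbitrary choice made only to show an `if` inside the loop.

```python
def square(x):
    return x ** 2

def sum_of_squares(values):
    """Add up the squares of the non-negative items in values."""
    total = 0
    for v in values:            # a for loop inside a function
        if v >= 0:              # conditional execution inside the loop
            total += square(v)  # calling another function
    return total

print(sum_of_squares([1, 2, -3, 4]))  # 21
```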


## 3.2 The scientific python stack

In addition to Python's built-in modules, such as the ``math`` module, there are also many often-used third-party modules that are core tools for doing data science with Python.
Some of the most important ones are:

#### [``numpy``](http://numpy.org/): Numerical Python

Numpy is short for "Numerical Python", and contains tools for efficient manipulation of arrays of data.
If you have used other computational tools like IDL or MATLAB, Numpy should feel very familiar.

#### [``scipy``](http://scipy.org/): Scientific Python

Scipy is short for "Scientific Python", and contains a wide range of functionality for accomplishing common scientific tasks, such as optimization/minimization, numerical integration, interpolation, and much more.
We will not look closely at Scipy today, but we will use its functionality later in the course.

#### [``pandas``](http://pandas.pydata.org/): Labeled Data Manipulation in Python

Pandas is short for "Panel Data", and contains tools for doing more advanced manipulation of labeled data in Python, in particular with a columnar data structure called a `DataFrame`.
If you've used the [R](http://rstats.org) statistical language (and in particular the so-called "Hadley Stack"), much of the functionality in Pandas should feel very familiar.

#### [``matplotlib``](http://matplotlib.org): Visualization in Python

Matplotlib started out as a MATLAB plotting clone in Python, and has grown steadily since its creation in 2003. It is the most popular data visualization tool currently in the Python data world (though other recent packages are starting to encroach on its monopoly).

#### [``scikit-learn``](https://scikit-learn.org/stable/): Machine Learning in Python

Scikit-learn is a machine learning library.

It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN.

The library is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.