# <span style="color:darkblue"> Lecture 9: Local/Global and Apply </span>

<font size = "5">

In the previous lecture we ...

- Worked through the definition of functions
- Illustrated some examples

In this lecture, we will ...

- Discuss the syntax of functions (local/global)
- Apply functions to multiple elements in a data frame
- Introduce ".py" files

## <span style="color:darkblue"> I. Import Libraries </span>

In [2]:
# the "pandas" library is for manipulating datasets

import pandas as pd


## <span style="color:darkblue"> II. Local/Global Variables </span>

<font size="5"> 

Most of the variables we've defined so far are "global"

- Stored in working environment
- Can be referenced in other parts of the notebook



<font size = "5">
Example:

In [1]:
message_hello = "hello"
number3       = 3

In [3]:
print(message_hello + " world")
print(number3 * 2)

hello world
6


<font size = "5">

Any "global" variable can be referenced inside functions

- However, this can lead to mistakes
- Preferably, include **all** the inputs as parameters

<font size = "5">

$f(x,y,z) = x + y + z$

In [4]:
# Correct Example:
def fn_add_recommended(x,y,z):
    return(x + y + z)

print(fn_add_recommended(x = 1, y = 2, z = 5))
print(fn_add_recommended(x = 1, y = 2, z = 10))


8
13


In [5]:
# Example that runs (but not recommended)
# Python looks up any name not defined inside the function (here, z)
# among the global variables at the time of the call
def fn_add_notrecommended(x,y):
    return(x + y + z)

z = 5
print(fn_add_notrecommended(x = 1, y = 2))
z = 10
print(fn_add_notrecommended(x = 1, y = 2))



8
13


<font size ="5">

Variables defined inside functions are "local"

- Stored "temporarily" while running
- Includes: Parameters + Intermediate variables


<font size = "5">

Local variables supersede global variables

In [6]:
# This is an example where we define a quadratic function
# (x,y) are both local variables of the function
#
# When we call the function, only the arguments matter.
# The global values of x and y are ignored, and any intermediate
# value inside the function (such as y) stays local

def fn_square(x):
    y = x**2
    return(y)

x = 0
y = 1
print(fn_square(x = 10))


100


<font size = "5">

Local variables are **not** stored in the working environment

In [7]:
# The following code assigns the global variables x and y
# Calling the function does not change them, because the x and y
# inside the function are local

x = 5
y = 4

print("Example 1:")
print(fn_square(x = 10))
print(x)
print(y)

print("Example 2:")
print(fn_square(x = 20))
print(x)
print(y)


Example 1:
100
5
4
Example 2:
400
5
4
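<font size = "5">

A minimal sketch of the same point from the other direction: a name that only exists inside a function cannot be read after the call. The function `fn_cube` and the variable `cubed` are hypothetical names chosen so that they do not clash with anything defined earlier in the notebook.

```python
def fn_cube(x):
    # "cubed" is an intermediate (local) variable
    cubed = x**3
    return cubed

result = fn_cube(x = 3)
print(result)

# Outside the function, "cubed" was never stored,
# so asking for it raises a NameError
try:
    print(cubed)
    lookup_failed = False
except NameError:
    lookup_failed = True
    print("cubed is not defined outside fn_cube")
```

Only the returned value (stored here in `result`) survives the call.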


<font size = "5">

To permanently modify a global variable inside a function, use the "global" statement

In [7]:
def modify_x():
    global x
    x = x + 5

x = 1
# Now, running the function will permanently increase x by 5.
modify_x()
print(x)

6
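<font size = "5">

To see why the "global" line matters, the sketch below compares two versions of the same function. The names `counter`, `increase_without_global`, and `increase_with_global` are hypothetical and used only for illustration. Without "global", Python treats a variable that is assigned inside the function as local, so reading it before it has a value fails.

```python
counter = 1

def increase_without_global():
    # Assigning to "counter" makes it local to this function,
    # so the right-hand side reads a local that has no value yet
    counter = counter + 5

def increase_with_global():
    global counter
    counter = counter + 5

try:
    increase_without_global()
    error_name = None
except UnboundLocalError as error:
    error_name = type(error).__name__
    print("Failed with:", error_name)

# This version modifies the global variable
increase_with_global()
print(counter)
```

The first call fails and leaves `counter` untouched; the second call raises it from 1 to 6.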


<font size = "5">

Try it yourself:

- What happens if we run "modify_x" twice?
- What happens if we add "global y" inside "fn_square"?

In [10]:
# Write your own code here

def modify_x():
    global y
    global x
    x = x + 5

x = 5
y=10
# Now, running the function will permanently increase x by 5.
modify_x()
print(x)

modify_x()
print(x)


10
15


## <span style="color:darkblue"> III. Operations over data frames (apply/map) </span>


<font size = "5">

Create an empty data frame

In [12]:
data  = pd.DataFrame([])

<font size = "5">

Add variables

In [13]:
# The following are lists with values for different individuals
# "age" is the number of years
# "num_underage_siblings" is the total number of underage siblings
# "num_adult_siblings" is the total number of adult siblings

data["age"]                   = [18,29,15,32,6]
data["num_underage_siblings"] = [0,0,1,1,0]
data["num_adult_siblings"]    = [1,0,0,1,0]


<font size = "5">

Define functions

In [14]:
# The first two functions return True/False depending on age constraints
# The third function returns the sum of two numbers
# The fourth function returns a string with the age bracket

fn_iseligible_vote = lambda age: age >= 18
fn_istwenties      = lambda age: (age >= 20) & (age < 30)
fn_sum             = lambda x,y: x + y

def fn_agebracket(age):
    if (age >= 18):
        status = "Adult"
    elif (age >= 10) & (age < 18):
        status = "Adolescent"
    else:
        status = "Child"
    return(status)


<font size = "5">
Applying functions with one argument: <br>

```python
 apply(myfunction)
 ```
 - Takes a dataframe series (a column vector) as an input
 - Computes function separately for each individual


In [15]:
# The function "apply" will extract each element and return the function value
# It is similar to running a "for-loop" over each element

data["can_vote"]    = data["age"].apply(fn_iseligible_vote)
data["in_twenties"] = data["age"].apply(fn_istwenties)
data["age_bracket"] = data["age"].apply(fn_agebracket)


# NOTE: The following code also works:
# data["can_vote"]    = data["age"].apply(lambda age: age >= 18)
# data["in_twenties"] = data["age"].apply(lambda age: (age >= 20) & (age < 30))

display(data)


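| | age | num_underage_siblings | num_adult_siblings | can_vote | in_twenties | age_bracket |
|---|---|---|---|---|---|---|
| 0 | 18 | 0 | 1 | True | False | Adult |
| 1 | 29 | 0 | 0 | True | True | Adult |
| 2 | 15 | 1 | 0 | False | False | Adolescent |
| 3 | 32 | 1 | 1 | True | False | Adult |
| 4 | 6 | 0 | 0 | False | False | Child |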
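<font size = "5">

The comparison between "apply" and a for-loop can be made concrete. The sketch below is self-contained: it rebuilds the "age" column as a standalone series (the name `ages` is introduced only for this illustration) and computes the same True/False values in both ways.

```python
import pandas as pd

ages = pd.Series([18, 29, 15, 32, 6])
fn_iseligible_vote = lambda age: age >= 18

# Option 1: apply() calls the function once per element
with_apply = ages.apply(fn_iseligible_vote)

# Option 2: an explicit for-loop that builds the same values
with_loop = []
for age in ages:
    with_loop.append(fn_iseligible_vote(age))

print(with_apply.tolist())
print(with_loop)
```

Both options produce the same values; "apply" simply avoids writing the loop by hand and returns a series that can be assigned directly to a column.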


<font size = "5">

Mapping functions with one or more arguments <br>


```python
list(map(myfunction, list1,list2, ....))
```

In [16]:
# Repeat the above example with map
# We use list() to convert the output to a list
# The first argument of map() is a function
# The following arguments are the subarguments of the function

data["can_vote"] = list(map(  fn_iseligible_vote  , data["age"]   ))

In [17]:
# In this example, the function takes two arguments,
# so map() receives two columns after the function

data["num_siblings"] = list(map(fn_sum,data["num_underage_siblings"],data["num_adult_siblings"]))
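<font size = "5">

A small sketch of how map() pairs up its inputs when the function has two arguments. It uses plain lists with the same values as the two sibling columns; the names `list_underage`, `list_adult`, `with_map`, and `with_zip` are introduced only for this illustration.

```python
fn_sum = lambda x, y: x + y

list_underage = [0, 0, 1, 1, 0]
list_adult    = [1, 0, 0, 1, 0]

# map() takes the i-th element of each list and passes both to fn_sum
with_map = list(map(fn_sum, list_underage, list_adult))

# The same pairing written with zip() and a list comprehension
with_zip = [fn_sum(x, y) for x, y in zip(list_underage, list_adult)]

print(with_map)
print(with_zip)
```

Both lines print `[1, 0, 1, 2, 0]`. If the lists had different lengths, map() would stop at the end of the shortest one.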

<font size = "5">

<span style="color:darkgreen"> Recommended! </span>

- Arguments can be split into multiple lines!
- Start a separate line after a comma
- PEP 8, the Python style guide, recommends at most 79 characters per line

In [18]:
data["num_siblings"] = list(map(fn_sum,
                                data["num_underage_siblings"],
                                data["num_adult_siblings"]))

<font size = "5">

Try it yourself!

- Write a function checking whether num_siblings $\ge$ 1
- Add a variable to the dataset called "has_siblings"
- Assign True/False to this variable using "apply()"

In [22]:
# Write your own code
fn_has_siblings = lambda num_siblings: num_siblings >= 1
data["has_siblings"] = data["num_siblings"].apply(fn_has_siblings)


<font size = "5">

Try it yourself!

- Read the car dataset "data_raw/features.csv"
- Create a function that tests whether mpg $\ge$ 29
- Add a variable "mpg_above_29" which is True/False if mpg $\ge$ 29
- Store the new dataset to "data_clean/features.csv"


In [23]:
# Write your own code
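dataset = pd.read_csv("data_raw/features.csv")
fn_mpg = lambda mpg: mpg >= 29
dataset["mpg_above_29"] = dataset["mpg"].apply(fn_mpg)
# Store the new dataset (index = False leaves out the row numbers)
dataset.to_csv("data_clean/features.csv", index = False)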



<font size = "5">

Try it yourself!

- Map can also be applied to simple lists!
- Create a lambda function with arguments {fruit,color}.
- The function returns the string <br>
"A {fruit} is {color}"
- Create the following two lists:

``` list_fruits  = ["banana","strawberry","kiwi"] ```

``` list_colors  = ["yellow","red","green"] ```
- Use list(map()) to output a list with one such string per fruit

In [24]:
# Write your own code
list_fruits = ["banana","strawberry","kiwi"]
list_colors = ["yellow","red","green"]

fn_match =lambda fruit,color: "A " + fruit + " is " + color

list(map(fn_match,list_fruits,list_colors))






['A banana is yellow', 'A strawberry is red', 'A kiwi is green']

## <span style="color:darkblue"> IV. (Optional) External Scripts </span>

<font size = "5">

".ipynb" files ...

- Markdown + python code
- Great for interactive output!

".py" files ...

- Python (only) script
- Used for specific tasks
- Why? Split code into smaller, more manageable files



<font size = "5">

<table><tr>
<td style = "border:0px"> <img src="figures/screenshot_py_functions.png" alt="drawing" width="300"/>  </td>
<td style = "border:0px">

File with functions

 </td>
</tr></table>




In [20]:
#------------------------------------------------------------------------------#
# The "%run -i" command executes a Python script that is
# in the subfolder "scripts/"
#
#      Breaking down the command:
#      (a) The percentage sign "%" is associated with "magic commands"
#          which are special functions used in IPython notebooks.
#      (b) The suboption "-i" ensures that the program can use any variables
#          defined in the current working environment (in case it's necessary)
#------------------------------------------------------------------------------#

%run -i "scripts/example_functions.py"

x = 1
print(fn_quadratic(1))
print(fn_quadratic(5))


1
25
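<font size = "5">

"%run" only works inside IPython/Jupyter. As a rough analogue in plain Python, the sketch below uses the standard-library function `runpy.run_path`, which executes a file and returns the names it defined. The script contents are an assumption made for illustration (the actual file is only shown in the screenshot above), and the script is written to a temporary folder so that the example is self-contained.

```python
import os
import runpy
import tempfile

# Hypothetical script contents, assumed for this illustration
script_text = "def fn_quadratic(x):\n    return x**2\n"

with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "example_functions.py")
    with open(script_path, "w") as file:
        file.write(script_text)

    # run_path() executes the file and returns a dictionary
    # with the names that the script defined
    script_names = runpy.run_path(script_path)

# Unlike "%run", the names are not added to the working
# environment automatically, so we pick out the one we need
fn_quadratic = script_names["fn_quadratic"]
print(fn_quadratic(1))
print(fn_quadratic(5))
```

This is not what "%run -i" does internally; it only illustrates the idea of keeping functions in a separate ".py" file and loading them when needed.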



<font size = "5">

<table><tr>
<td style = "border:0px"> <img src="figures/screenshot_py_variables.png" alt="drawing" width="300"/>  </td>
<td style = "border:0px">

File with variables

- Storing values/settings
- Variables are global <br>
(can be referenced later)

</td>
</tr></table>

In [21]:
# When we run this program
# the value of alpha will be overwritten

alpha = 1
print(alpha)

%run -i "scripts/example_variables.py"
print(alpha)


1
5
