<h1 style='color:white'> Statistics 21 <br/> Python & Other Technologies for Data Science </h1>

<h3 style='color:white'>Vivian Lew, PhD - Wednesday, Week 3</h3>

# Functions in Python
## Many Thanks to Miles Chen, PhD who developed the materials
### Adapted from *Think Python* by Allen B. Downey and *A Whirlwind Tour of Python* by Jake VanderPlas

## Parameters and Arguments

When a function is called, its arguments are assigned to variables inside the function called parameters.

In [1]:
# a silly function
def print_twice(bruce):
    print(bruce)
    print(bruce)

The function assigns the argument to a parameter named `bruce`. When the function is called, it prints the value of the parameter (whatever it is).

In [2]:
print_twice("spam")

spam
spam


In [3]:
import math
print_twice(math.sin(math.pi / 2))

1.0
1.0


In [4]:
print_twice("Spam " * 3)

Spam Spam Spam 
Spam Spam Spam 


In [5]:
print_twice(print_twice("Spam"))

Spam
Spam
None
None


What happened here?

The inner `print_twice()` ran first. It printed "Spam" on one line and printed "Spam" again on the next line.

However, the function `print_twice()` has no return value. It returns `None`. So the outer call of `print_twice()` prints `None` two times.
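
To make the `None` return value visible, the short sketch below stores the result of a call in a variable. The name `result` is introduced here only for illustration.

```python
def print_twice(bruce):
    print(bruce)
    print(bruce)

# print_twice has no return statement, so the call evaluates to None
result = print_twice("Spam")
print(result is None)
```

The last line prints `True`, which is why the outer call in the cell above ends up printing `None` twice.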

## Default arguments

You can also specify default arguments, which are used when the caller does not provide a value explicitly.

In [6]:
# example without defaults
def stuff(a, b, c):
    print(a**1, b**2, c**3)

In [7]:
stuff(3, 6, 9)

3 36 729


In [8]:
stuff(1, 2) # omitting a required argument raises an error

TypeError: stuff() missing 1 required positional argument: 'c'

In [9]:
# example with defaults
def junk(a = 1, b = 2, c = 3):
    print(a**1, b**2, c**3)

In [10]:
junk()

1 4 27


In [11]:
junk(3) # specifying only one will put it in the first argument

3 4 27


In [12]:
junk(b = 4) # be specific

1 16 27


In [13]:
junk(5, 10, 0)

5 100 0


In [14]:
# positional arguments are matched first, so 5 is assigned to a.
# passing a = 10 as well gives a two values, which raises an error.
junk(5, a = 10, b = 0)

TypeError: junk() got multiple values for argument 'a'

In [15]:
junk(c = 5, a = 10, b = 0) # but this works

10 0 125
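
Positional and keyword arguments can be mixed, as long as the positional ones come first and no parameter receives two values. The sketch below uses a hypothetical variant, `junk_values`, that returns the results instead of printing them so they are easy to inspect.

```python
# junk_values is a hypothetical variant of junk() that returns a tuple
def junk_values(a=1, b=2, c=3):
    return a**1, b**2, c**3

# 5 fills a positionally, c is named, b keeps its default of 2
print(junk_values(5, c=2))

# 5 and 10 fill a and b positionally, c is named
print(junk_values(5, 10, c=0))
```

The first call gives `(5, 4, 8)` and the second gives `(5, 100, 0)`.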


## Function Variables and Parameters are Local
When you create a variable inside a function, it is local, which means that it only exists inside the scope of the function. 

In [16]:
def print_twice(bruce):
    print(bruce)
    print(bruce)

def cat_twice(part1, part2):
    cat = part1 + " " + part2
    print_twice(cat)

line1 = 'bidi bidi'
line2 = 'bom bom'
cat_twice(line1, line2)

bidi bidi bom bom
bidi bidi bom bom


When `cat_twice` terminates, the variable `cat` is destroyed.

If we try to refer to `cat` outside of the function, we get an error. 

In [17]:
print(cat)

NameError: name 'cat' is not defined

Parameters are also local. For example, outside of the `print_twice` function, there is nothing named `bruce`.

In [18]:
print(bruce)

NameError: name 'bruce' is not defined

## Error Tracebacks

If an error occurs during a function call, Python prints a traceback. It starts with the line of the initial call. If that line calls a function, the traceback shows the offending line inside that function, and it continues this way until it reaches the line where the error was actually raised.

Names that are not defined inside a function are looked up in the global environment (AKA the top-level script environment).

For example, I modified the function `print_twice()`. It tries to access the variable `cat` which is not defined inside `print_twice()`. 

In [19]:
def print_twice(bruce):
    print(cat)
    print(cat)

def cat_twice(part1, part2):
    cat = part1 + " " + part2
    print_twice(cat)


In [20]:
line1 = 'bidi bidi'
line2 = 'bom bom'
cat_twice(line1, line2)

NameError: name 'cat' is not defined

```
Cell In[20], line 3
      1 line1 = 'bidi bidi'
      2 line2 = 'bom bom'
----> 3 cat_twice(line1, line2)
```

The traceback starts with the lines we just executed. There are no problems with lines 1 and 2, where we simply assign some lyrics to variable names. Python tells us the offending line is line 3, where we called `cat_twice()`.

```
Cell In[19], line 7, in cat_twice(part1, part2)
      5 def cat_twice(part1, part2):
      6     cat = part1 + " " + part2
----> 7     print_twice(cat)
```

The next part of the traceback enters the function `cat_twice()`.
It tells us that the offending line is line 7, where we made a call to `print_twice()`.

```
Cell In[19], line 2, in print_twice(bruce)
      1 def print_twice(bruce):
----> 2     print(cat)
      3     print(cat)

NameError: name 'cat' is not defined
```

Finally, the traceback shows us the contents of `print_twice()` and says the offending line is line 2: when we try to print the variable `cat`.

`NameError: name 'cat' is not defined`

It gives us a `NameError`: Python cannot find the variable or function being used. The name may be undefined, or it may be defined in a different scope.
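
If you want to inspect an error like this without stopping the program, one option is to catch it with `try`/`except`. The sketch below is only an illustration: `show_lyric`, `join_and_show`, and `missing_lyric` are hypothetical names, and `missing_lyric` is deliberately never defined.

```python
import traceback

def show_lyric(text):
    # Bug kept on purpose: missing_lyric is not defined in any scope,
    # so looking it up raises NameError
    print(missing_lyric)

def join_and_show(part1, part2):
    joined = part1 + " " + part2
    show_lyric(joined)

try:
    join_and_show("bidi bidi", "bom bom")
except NameError as err:
    message = str(err)               # text of the final error line
    report = traceback.format_exc()  # the full traceback as a string
    print(message)
    print(report)
```

The report lists the call to `join_and_show` first and the line inside `show_lyric` last, following the same outermost-to-innermost order as the traceback above.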

## Global Scope

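In the following cell, I run the same code but define `cat` in the global scope. Even though `cat` is not found inside the local scope of the function `print_twice()`, it is defined in the global scope. When `print_twice()` is called from within `cat_twice()`, the variable `cat` is found in the global environment and printed. Note that the printed value is the global `cat`, not the local `cat` created inside `cat_twice()`, because `print_twice()` cannot see the local variables of the function that called it.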

In [21]:
def print_twice(bruce):
    print(cat)
    print(cat)

def cat_twice(part1, part2):
    cat = part1 + " " + part2
    print_twice(cat)
    
line1 = 'bidi bidi'
line2 = 'bom bom'

cat = "something else entirely"

In [22]:
cat_twice(line1, line2)

something else entirely
something else entirely


## `%who`, `%whos`, and `%who_ls`

IPython has a few magic commands that list the objects defined in the global environment. `%who` prints the names, `%whos` prints the names and details of each object, and `%who_ls` returns a list with object names as strings.

In [23]:
%who

cat	 cat_twice	 junk	 line1	 line2	 math	 print_twice	 stuff	 


In [24]:
%whos

Variable      Type        Data/Info
-----------------------------------
cat           str         something else entirely
cat_twice     function    <function cat_twice at 0x1089122a0>
junk          function    <function junk at 0x10846a3e0>
line1         str         bidi bidi
line2         str         bom bom
math          module      <module 'math' from '/Lib<...>h.cpython-311-darwin.so'>
print_twice   function    <function print_twice at 0x108913d80>
stuff         function    <function stuff at 0x108413740>


In [25]:
%who_ls

['cat', 'cat_twice', 'junk', 'line1', 'line2', 'math', 'print_twice', 'stuff']
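
The magic commands only work in IPython and Jupyter. In plain Python, a rough equivalent is the built-in `globals()` function, which returns the dictionary of names in the global environment. This is a sketch rather than an exact replacement: the filter on a leading underscore is an assumption made here, and the result can include names that `%who_ls` would hide. The names `example_value` and `example_function` are placeholders.

```python
example_value = 42

def example_function():
    return example_value

# Copy the keys first, then keep the names that do not start with "_"
public_names = sorted(name for name in list(globals()) if not name.startswith("_"))
print(public_names)
```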

## Scoping rules

Assignment inside a function only affects the local variable of that name. It does not change a variable with the same name outside the function.

In [26]:
x = 5

In [27]:
x

5

In [28]:
def alter_x(x):
    x = x + 1
    return x

In [29]:
alter_x(x)

6

In [30]:
x

5

## Global variables

If you want your function to alter variables outside of its own scope, you can use the keyword `global`.

Be careful with this keyword.  Really, try to avoid doing this...

In [31]:
def alter_global_x():
    global x
    x = x + 1
    return x

In [32]:
x = 5

In [33]:
alter_global_x()

6

In [34]:
x

6
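
A common way to avoid `global` is to return the new value and reassign it at the call site. The sketch below shows that pattern. The function name `add_one` is made up for this example.

```python
def add_one(value):
    # works only with its parameter and touches nothing outside
    return value + 1

x = 5
x = add_one(x)   # the caller decides to update x
print(x)
```

This prints `6`, and the change to `x` is visible on the line where it happens rather than hidden inside the function.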

If a function calls for a value that is not provided in the arguments or is not defined inside the function, Python will search for the value in the higher scopes.

In [35]:
# in this function, we ask Python to print the value of x 
# even though we do not define its value. 
# Python finds x in the global environment

def search_for_x():
    print(x)
    return x

In [36]:
search_for_x()

6


6
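
One caveat is worth a sketch: the search in higher scopes applies only to names the function never assigns. If a function assigns to a name anywhere in its body (without `global`), Python treats that name as local throughout the function, so reading it before the assignment fails. The names below are hypothetical.

```python
counter = 6

def read_counter():
    return counter          # only reads, so the global value is found

def bump_counter_broken():
    counter = counter + 1   # the assignment makes counter local here
    return counter

print(read_counter())

try:
    bump_counter_broken()
except UnboundLocalError as err:
    caught = type(err).__name__
    print(caught)
```

The first call prints `6`. The second raises `UnboundLocalError` because the local `counter` has no value yet when the right-hand side is evaluated.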

## Scope Order in Python

Taken from: https://realpython.com/python-scope-legb-rule/

Python will search scopes in the following order:

- Local (or function) scope is the code block or body of any Python function. This Python scope contains the names that you define inside the function. These names will only be visible from the code of the function.

- Enclosing (or nonlocal) scope is a special scope that *only exists for functions nested inside other functions.* If the local scope is an inner or nested function, then the enclosing scope is the scope of the outer or enclosing function. This scope contains the names that you define in the enclosing function. The names in the enclosing scope are visible from the code of the inner and enclosing functions.

- Global scope is the top-most scope in a Python program, script, or module. This Python scope contains all of the names that you define at the top level of a program or a module. Names in this Python scope are visible from everywhere in your code.

- Built-in scope is a special Python scope that's created whenever you run a script or open an interactive session. This scope contains names such as keywords, functions, exceptions, and other attributes that are built into Python.

![](https://miro.medium.com/v2/resize:fit:654/0*hsE2OKgoLM3L6RL6.png)

A visual from https://miro.medium.com/v2/resize:fit:654/0*hsE2OKgoLM3L6RL6.png
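
The four scopes can be seen side by side in a small sketch. The name `label` and the function names are chosen just for this illustration.

```python
label = "global"

def outer():
    label = "enclosing"

    def inner():
        label = "local"
        return label      # found in the local scope

    def inner_no_local():
        return label      # not local, so found in the enclosing scope

    return inner(), inner_no_local()

def no_local_or_enclosing():
    return label          # found in the global scope

print(outer())
print(no_local_or_enclosing())
print(len("abc"))         # len is not defined here, so it comes from the built-in scope
```

The three lines print `('local', 'enclosing')`, `global`, and `3`.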

In [37]:
x, y, z = 1, 1, 1

def f():
    y = 2  # changing y to 2, only affects the value inside the function
    return x, y, z   # it does not find x or z in the local environment, 
                     # so it searches the higher scope

print(f())
print(x, y, z)

(1, 2, 1)
1 1 1


In [38]:
x, y, z = 1, 1, 1

def f():
    y = 2
    def g():
        z = 3
        return x, y, z 
    return g()

print(f())
print(x, y, z)

(1, 2, 3)
1 1 1


- `g()` is defined inside `f()`
- When we call the function `f()`, the final line of `f()` calls `g()` and returns the value of `g()`.
- When `g()` runs, it sets `z = 3`. Inside `g()`, `x` and `y` are not defined. 
- To find those values, it searches the higher scope `f()` for `x` and `y`. It finds the value of `y = 2` defined inside `f()`. It finds `x = 1` in the top level scope.
- When `f()` runs, it returns `x = 1`, `y = 2`, `z = 3` while x, y, z are all equal to 1 in the top-level environment.

In [39]:
x, y, z = 1, 1, 1

def g():
    z = 3
    return x, y, z

def f():
    y = 2
    return g()

print(f())
print(x, y, z)

(1, 1, 3)
1 1 1


- `g()` and `f()` are both defined in the global environment.  
- The function `f()` returns the value of `g()`
- When `g()` runs, it sets `z = 3`. Inside `g()`, `x` and `y` are not defined. 
- To find those values, it searches the higher scope which is the global environment because `g()` is defined inside the global environment. It uses the values in the global environment `x = 1` and `y = 1`.
- It does not matter that `g()` was called from inside `f()`. When `g()` needs to search a higher scope, it searches the environment in which the function is defined.
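
The same rule can be checked from the other direction. In the sketch below, an inner function is defined inside one function and then called from a different one. All names are hypothetical, and returning the inner function so that it can be called elsewhere is an extra step added for this illustration.

```python
y = 1

def make_reader():
    y = 2
    def read_y():
        return y      # resolved in make_reader, where read_y is defined
    return read_y     # hand back the function itself without calling it

def call_elsewhere(func):
    y = 99            # local to call_elsewhere and invisible to func
    return func()

reader = make_reader()
print(call_elsewhere(reader))
```

This prints `2`. The `y = 99` in the calling function and the global `y = 1` are both ignored, because the lookup starts from the place where `read_y` was defined.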

In [40]:
# keyword global gives the function access to the value in the global environment
x, y, z = 1, 1, 1

def f():
    y = 2
    def g():
        global z  # declaring z as global makes assignments in g() target the global z
        z = 3     # will assign 3 to the global variable z
        return x, y, z
    return g()

print(f())
print(x, y, z)

(1, 2, 3)
1 1 3


- `g()` is defined inside `f()`

- When we call the function `f()`, the final line of `f()` calls `g()` and returns the value of `g()`.

- When `g()` runs, it accesses the global variable `z`. It sets `z = 3` in the global environment. 
- Inside `g()`, `x` and `y` are not defined. To find those values, `g()` searches the higher scope `f()` for `x` and `y`.
- `g()` finds the value of `y = 2` defined inside `f()`. It finds `x = 1` in the top level scope.

- When `f()` runs, it returns `x = 1`, `y = 2`, `z = 3`.
- Because `g()` has access to `z` in the global environment, the value of z is now 3 after the function runs.

In [41]:
x, y, z = 1, 1, 1

def g():
    z = 3
    return x, y, z

def f():
    global y
    y = 2
    return g()

# when we first run g(), it uses the global values of x and y but the local value of z.
# the local z does not change the global z.
print(g())
print(x, y, z)

(1, 1, 3)
1 1 1


- `g()` and `f()` are both defined in the global environment.

- When `g()` runs, it sets `z = 3`. 
- Inside `g()`, `x` and `y` are not defined. 
- To find those values, it searches the higher scope which is the global environment because `g()` is defined inside the global environment. 
- It uses the values in the global environment `x = 1` and `y = 1`.

In [42]:
print(f())  # when we run f(), the global value of y is changed.
print(x, y, z)

(1, 2, 3)
1 2 1


- When we call the function `f()`, it modifies the value of `y` in the global environment. The final line of `f()` calls and returns the value of `g()`. 
- This time, when `g()` looks for a value of `y`, it finds the value of `y` in the global environment which is now 2.

In [43]:
p, q = 1, 1

def f():
    global s   # s will be created in the global environment when it is assigned
    s = 2
    return p, q, s
f()

(1, 1, 2)

In [44]:
s

2

If you declare a name with the keyword `global` inside a function and then assign to it, Python creates the variable in the global environment if it does not already exist.
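
A quick way to confirm this is to check the global namespace before and after the call. The sketch below assumes a fresh session in which the hypothetical name `new_total` has not been defined.

```python
def create_global():
    global new_total   # new_total is a hypothetical name, not defined yet
    new_total = 2      # the assignment creates it in the global environment

before = "new_total" in globals()
create_global()
after = "new_total" in globals()
print(before, after)
```

This prints `False True`: the variable does not exist until the function runs and performs the assignment.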

In [45]:
x, y, z = 1, 1, 1

def f():
    global y
    print("current value of y is " + str(y))
    y = 4
    def g():
        global y
        print("current value of y is now " + str(y))
        y = 10 
        print("current value of y is finally " + str(y))
        global z  
        z = 3
        return x, y, z
    return g() 

In [46]:
print(f())
print(x, y, z)

current value of y is 1
current value of y is now 4
current value of y is finally 10
(1, 10, 3)
1 10 3


Both functions, `f()` and `g()`, access the global variable `y`. Each time we assign a new value to `y`, it updates the value in the global environment.

In [47]:
x, y, z = 1, 1, 1

def f():
    y = 4
    def g():
        nonlocal y  
        y = 10  # affects the y defined inside f
        global z  
        z = 3
        return x, y, z
    print(x, y, z)  # this line is run before g() is called
    return g()  # when g() is called, y will be modified

print(f())
print(x, y, z)

1 4 1
(1, 10, 3)
1 1 3


- When we call the function `f()`, it sets a local variable `y = 4`. 
- It defines a function `g()` inside `f()`. It prints the values `x, y, z`. At this time, `y = 4`.
- The final line of `f()` calls `g()` and returns the value of `g()`. 
- When `g()` is called, it accesses the nonlocal variable `y`. The nonlocal keyword tells the function to search the higher scope, in this case, the scope of `f()`. It sets nonlocal `y = 10` and global `z = 3`. It returns `x = 1` global, `y = 10` nonlocal, `z = 3` global.

- Because `g()` has access to `z` in the global environment, the value of z is now 3 after the function runs. 
- However, the value `y` in the global environment remains 1 because it only modified the nonlocal variable `y`.
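
A typical use of `nonlocal` is an inner function that keeps updating a variable owned by its enclosing function. The counter below is a minimal sketch with made-up names, not part of the original materials.

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count       # refers to count inside make_counter
        count = count + 1
        return count
    return increment

count = 100                  # a global with the same name, left untouched
counter = make_counter()
print(counter(), counter(), counter())
print(count)
```

This prints `1 2 3` and then `100`. As in the example above, the nonlocal assignment changes only the variable in the enclosing function and never the global of the same name.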

In [48]:
p, q = 1, 1

def f():
    nonlocal r   # raises a SyntaxError because no enclosing function defines r
    r = 2
    return p, q, r

f()

SyntaxError: no binding for nonlocal 'r' found (3454248193.py, line 4)

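If you declare a variable `nonlocal` but no enclosing function scope defines it (the global environment does not count), Python raises a `SyntaxError`. The error is detected when the code is compiled, so `f()` is never actually called.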
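
One way to see that this error comes from compilation rather than from running the function is to pass the code as a string to the built-in `compile()`. This is a sketch for illustration: the string `source` and the filename label `"<nonlocal_example>"` are arbitrary choices.

```python
source = """
def f():
    nonlocal r
    r = 2
    return r
"""

try:
    # compile() only parses and compiles the code; nothing is executed
    compile(source, "<nonlocal_example>", "exec")
except SyntaxError as err:
    message = err.msg
    print(message)
```

The message matches the one shown in the cell above, even though `f()` is never defined or called.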

<h1> Statistics 21 <br/> Have a good night! </h1>