## Some fundamental elements of programming IV
### How to create functions

### Functions

Imagine you are taking a data science class and are often asked to create:

  - Datasets of different means and standard deviations, say using code like: 
     `data = mu + sd*np.random.randn(siz,1)` 
  - Plots (like histograms) of the datasets. For example using `sns.kdeplot`.
  
You might be tempted to memorize, or even write down on a notepad, the specific code that seems to be used most often. 

Looking up the code snippet from a piece of paper is probably fine in a few cases. But imagine being a pro data scientist, handling dozens of code and analysis requests from multiple managers. Many of these requests will require reusing the same block of code. You *could* technically copy and paste from your notepad, but... 

Given that your work day will likely only end after all requests have been attended to, copying and pasting might soon become "suboptimal", to say the least.

Methods exist to speed up your work, reduce mistakes, and eliminate the typing needed to reuse code. These methods are called [**functions**](https://en.wikipedia.org/wiki/Subroutine).

**Functions** are often used to address situations just like the ones described above, where a section of code is used over and over again.

[Functions, a.k.a. subroutines](https://en.wikipedia.org/wiki/Subroutine), or "methods" if they are functions that objects (like numpy arrays) can perform, are sequences of code packaged as units. The units can be called at different locations in code to quickly and efficiently perform the operations implemented in the packaged code.

Functions accept inputs and return outputs. They contain useful code that performs operations that are likely to be used multiple times.

Functions can help make code shorter, more nimble, and easier to read. In Python, a function is defined with the special word `def` followed by the function's name. Let's look at our first example of a function.

Imagine wanting to make a function that returns the equivalent of a coin flip. The code below will do that for us. Let's analyze this function.

In [7]:
def coinflip() :
    import numpy as np
    flip = np.random.randint(2) # return a random 0 or 1
    if flip == 0 :
        result = 'Tails!'
    else :
        result = 'Heads!'
    return result

In [8]:
# Let's test this function. 
# Run this cell multiple times
coinflip()

'Heads!'

You should get Tails or Heads!

### Anatomy of a function

Let's analyze the function above. 

  - def: The function begins with the word def.
  
  `def`
  
  - The function name: def is followed by the function's name. The name is chosen by the programmer to reflect what the function does.
  
  `def coinflip`
  
  - The parentheses: The function name is followed by a pair of parentheses `()`. 
  
  `def coinflip()`
  
  - The colon (`:`): The parentheses are followed by a colon. 
  
  `def coinflip() :`

  - The code lines: Indented within the def are the "actual code" lines which make up the function. When a function runs, the computer runs its body lines from top to bottom.

  `import numpy as np
   flip = np.random.randint(2) # return a random 0 or 1
   if flip == 0 :
      result = 'Tails!'
   else :
      result = 'Heads!'
   return result`

Importantly, all the code that belongs inside the function must be indented relative to the `def` line. By convention the indentation is four spaces, which happens to align the code with the first letter of the function name. In other words, the following is incorrect:

In [25]:
def coinflip() :
import numpy as np
flip = np.random.randint(2) # return a random 0 or 1
if flip == 0 :
    result = 'Tails!'
else :
    result = 'Heads!'
return result

IndentationError: expected an indented block (2455045183.py, line 2)

Just like in loops, *indentation is critical in Python*!

#### Operations and variables inside a function are isolated

The code inside the function is a self-contained unit with its own variables. Variables defined inside the function are separate from those defined outside of it, even when they share a name: an `A` defined inside the function and an `A` defined outside of the function are not the same variable.

Let's test this. Let's define a variable, say `A`, and assign a value to it.

In [1]:
A = 10

Next, let's define a function with a variable inside also called A. Let's assign a different value to the `A` inside the function.

In [2]:
def myFuncCodeIsIsolated() :
    A = 0

Next let's run our function (this will assign the value 0 to the variable A inside the function).

In [4]:
myFuncCodeIsIsolated()

Let's evaluate the variable `A` outside of the function (remember it was set to 0 inside):

In [5]:
A

10

The code above should return `10`, the value assigned to `A` **outside** of the function. The variable `A` inside the function was assigned the value `0`, but it is never returned outside of the function.

This example shows that the variable `A` inside of the function and the one outside are different. This is because variables defined inside of the function are isolated from those outside.
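The same idea can be sketched in a single cell. The function name `set_local_a` and the variable `returned_value` are hypothetical, chosen only for this illustration. The assignment inside the function leaves the outer `A` untouched, and the inner value only reaches the outside through `return`.

```python
A = 10  # variable defined outside of the function

def set_local_a():
    # This A is local: it exists only while the function runs
    A = 0
    return A  # hand the local value back to the caller

returned_value = set_local_a()

print(A)               # the outer A is unchanged
print(returned_value)  # the value of the inner A, obtained via return
```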

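Because the variables inside a function are isolated, it is good practice to make the function self-contained: do all the imports needed and define all the variables needed **inside the function**. Imports done outside of the function are also visible inside it, but importing inside the function guarantees that it works wherever it is reused.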

Going back to `coinflip()`, its second line imports numpy because we call numpy functions ("hey!") inside our function: `import numpy as np`

The following lines are standard code. They are the lines of code we actually care about: the ones we would like to modularize and make easily reusable by embedding them into the function. 

```
    flip = np.random.randint(2) # return a random 0 or 1
    if flip == 0 :
        result = 'Tails!'
    else :
        result = 'Heads!'
```

The final line of our example function is a special one. It specifies the variable that is returned to the rest of the code, outside of the function.

`return result`

The variable `result` is the only variable that communicates outside of this function. The variable is generated inside `coinflip()` and its value is returned outside of `coinflip()` for the rest of the code to use. 
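To make use of the returned value, the caller can store it in a variable. The sketch below repeats `coinflip()` so that it runs on its own. The names `outcome`, `n_heads`, and `n_flips` are assumptions introduced only for this illustration.

```python
def coinflip():
    import numpy as np
    flip = np.random.randint(2)  # random integer, either 0 or 1
    if flip == 0:
        result = 'Tails!'
    else:
        result = 'Heads!'
    return result

# Store the returned value in a variable for later use
outcome = coinflip()
print(outcome)

# Reuse the function: flip many times and count the heads
n_flips = 1000
n_heads = 0
for i in range(n_flips):
    if coinflip() == 'Heads!':
        n_heads = n_heads + 1

print(n_heads, 'heads out of', n_flips)
```

Because the flips are random, the count of heads changes from run to run, but it should usually land near half of `n_flips`.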

### A more advanced version of a function

Let's go back to one of our needs. Imagine wanting to use what you just learned about functions to write one that generates datasets of normally distributed random numbers with a given mean and standard deviation.

The code we have been using is the following: `data = mu + sd*np.random.randn(siz,1)`

Let's make a function out of that so as not to have to rewrite the same lines over and over.

In [None]:
def my_data(mu,sd,siz):
    import numpy as np
    data = mu + sd*np.random.randn(siz,1)
    return data

In [None]:
my_data(5,2,10)

OK. It works. The function imports numpy, calls `randn` to generate the data, and then returns the data. That is pretty similar to the previous example.

But this new function has inputs. The code inside the function needs to receive values from the code outside of it. These inputs define the mean of the distribution (`mu`), the standard deviation (`sd`), and the size of the dataset (`siz`).

The number and type of inputs ("arguments") that a function can take are all defined when the function is created.
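As a minimal sketch of how the inputs are passed, the cell below repeats `my_data` and calls it in two ways: by position and by name. The variable names `a` and `b` are assumptions introduced for this illustration.

```python
import numpy as np

def my_data(mu, sd, siz):
    # Same computation as the function defined above
    data = mu + sd * np.random.randn(siz, 1)
    return data

# Positional call: values are matched to mu, sd, siz in that order
a = my_data(5, 2, 1000)

# Keyword call: the same inputs, named explicitly, so the order does not matter
b = my_data(siz=1000, mu=5, sd=2)

print(a.shape)            # (1000, 1): siz rows, one column
print(a.mean(), a.std())  # close to mu and sd, but not exact (random data)
```

Naming the inputs in the call is optional, but it can make the code easier to read when a function takes several arguments.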

We will learn more about functions in the exercise and in future tutorials. Below, try to write a function that generates correlated datasets: