# Basics of Programming II

## 2.1 Functions, For Loops and If Statements

Functions, 'for' loops and 'if' statements are among the most common and useful tools in your programming arsenal. They allow us to solve more complicated (and interesting) problems by writing scripts that repeat steps and make decisions. They also allow us to make our code more succinct, generalisable and efficient.

## Functions

A function is a piece of reusable code that performs some action. Python has 'built-in' functions, which we have already seen, but there are also *user-defined functions*: functions that we write ourselves. Functions in programming are similar to mathematical functions: you take a set of inputs, pass them through the function and get some outputs.


![Function](https://github.com/varunsatish/Coding-Tutorials/blob/master/images/function_dia.jpg?raw=true)

In order to use a function, we must first define it with the keyword `def`, give it a name, and specify its inputs.

**NOTE**: When I call the function in practice, the variable I pass in does not need to have the same name as the input I specified when I defined the function.

In [1]:
def my_function(x):  # We need to give the function a name and specify its arguments
    y = x**2  # Raises x to the power 2
    return (y)

This function takes in some object `x` as an input and squares it. For this to work, `x` must be of a type that supports the power operator, such as an integer or a float.

In [4]:
my_function(5)

25
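
To illustrate the note about input names, here is a minimal sketch. The names `square` and `side_length` are hypothetical and chosen only for this example; the point is that the variable passed in does not need to be called `x`.

```python
def square(x):
    # x is the name used for the input inside the function
    return x**2

side_length = 3  # the variable we pass in has a different name
area = square(side_length)
print(area)  # prints 9
```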

We can also have functions that take in multiple inputs.

In [23]:
def other_function(x , y):
    z = x*y + y**2 + 3 # ** takes a power
    return (z)

There are two ways to call this function: either I keep the ordering that has been specified in the function definition, or I assign values to the inputs by name.

In [26]:
number_1 = 4
number_2 = 9

#ordering kept the same 

other_function(number_1, number_2)

#inputs specified

other_function(y = number_2, x = number_1)


120

### Why are functions useful?

Functions allow us to write code that is compact and efficient. If you are writing code that follows a structured, repeated pattern, it is much easier and much more efficient to define functions. 
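
As a rough illustration of a repeated pattern, suppose we need the same percentage-change calculation for several pairs of numbers. The function name `percent_change` and the values below are hypothetical; the sketch only shows that the formula is written once and then reused.

```python
def percent_change(old, new):
    # The formula is written once, inside the function
    return 100 * (new - old) / old

# The same calculation is reused without retyping the formula
print(percent_change(50, 55))    # prints 10.0
print(percent_change(200, 150))  # prints -25.0
print(percent_change(80, 100))   # prints 25.0
```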

## For Loops

It is hard to overstate the importance of for loops in any programming language. **These are fundamental**. They allow us to create iterative blocks of code, meaning we can 'cycle' through the elements of an object. Let's suppose that we are working with a list first.

![For Loop](https://github.com/varunsatish/Coding-Tutorials/blob/master/images/for_loop.jpg?raw=true)

In [7]:
numbers = [1,2,3,4,5]
print(numbers)

[1, 2, 3, 4, 5]


In [8]:
def complex_function(x):
    y = x + x**5 + (3*x - 5)**2
    return(y)

In [9]:
for number in numbers:
    out = complex_function(number)
    print(out)

6
35
262
1077
3230


In [10]:
out_list = []

for x in numbers:  # Note that the name of the loop variable (in general) does not matter; some choices, such as Python keywords, cause problems, but you will receive an error message
    out = complex_function(x)
    out_list.append(out)
    
print(out_list)
                    

[6, 35, 262, 1077, 3230]


### Why are for loops useful?

For loops allow us to write code that is *iterative*. This means that the same block of code is repeated once for each element we loop over. This is useful for many computational exercises, but also for tasks such as data cleaning.
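
A small data-cleaning sketch, under assumptions: the list `raw_names` is made-up messy data, and `strip()` and `title()` are standard Python string methods that have not been introduced in this tutorial. The loop applies the same clean-up step to every element.

```python
raw_names = ['  lea', 'TOMMY ', ' pat  ', 'chris']  # hypothetical messy data

clean_names = []
for raw_name in raw_names:
    # strip() removes surrounding spaces; title() capitalises the first
    # letter of each word and lower-cases the rest
    clean_names.append(raw_name.strip().title())

print(clean_names)  # prints ['Lea', 'Tommy', 'Pat', 'Chris']
```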

## A (kind of important) note about indexing

We use indexes to keep track of data in a convenient fashion. The example most of us will be most familiar with is indexing a variable $x$ with respect to an individual $i$. This is where the $x_i$ notation comes from: the value of $x$ belonging to the $i^{th}$ individual.

We can do a similar thing with programming. When we initialise a for loop we need to tell the for loop which objects it must operate on. When I have an initialisation that looks like `for <element> in <list>:`, I am telling the loop that I want to operate on each object within the list. 

Let's say that we have a list that looks like this: `friends = ['Lea', 'Tommy', 'Pat', 'Chris']`. We have seen already that we can 'index' a list, meaning I can draw out the element at position $i$. Because Python starts counting at zero, `friends[1]` returns `'Tommy'`. Depending on what we are trying to achieve, we may want to iterate over each **object** within the list or we may want to iterate over each **index** within the list. What does this mean?

Suppose I want to print each element in the list. I can write a piece of code that looks like:

```python
for name in friends:
    print(name)
```

This iterates over each object within the list. Note that it has nothing to do with the indexes: it simply prints each object.

We could also specify the for loop in the following way:

```python
for i in range(0, len(friends)):
    print(friends[i])
```
    
Here, we are not iterating over each object in the list, we are iterating over a set of numbers which we can use to index objects within the list.

Which one to use? For simply visiting every element it doesn't really matter. The first way is easier, but if you are familiar with MATLAB the second way might feel more natural. Note that the second way only works with objects that support integer indexing, such as lists; some objects in Python, such as sets, can be iterated over but cannot be indexed by position. There are, however, applications where you need to iterate over numbers in order to select variables by their indices.
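
The sketch below compares the two loop styles. The `ages` list is hypothetical data added for illustration, and `try`/`except` is a standard Python construct not covered in this tutorial; it is used here only to show that a set cannot be indexed by position.

```python
friends = ['Lea', 'Tommy', 'Pat', 'Chris']
ages = [31, 27, 45, 38]  # hypothetical data, stored in the same order as friends

# Style 1: iterate over the objects themselves
names_by_object = []
for name in friends:
    names_by_object.append(name)

# Style 2: iterate over the index numbers 0, 1, 2, 3
names_by_index = []
for i in range(len(friends)):
    names_by_index.append(friends[i])

print(names_by_object == names_by_index)  # prints True

# Indexes are handy when the same position is needed in two lists
descriptions = []
for i in range(len(friends)):
    descriptions.append(friends[i] + ' is ' + str(ages[i]))
print(descriptions[1])  # prints Tommy is 27

# A set can be looped over, but it cannot be indexed by position
friend_set = set(friends)
try:
    friend_set[0]
    set_is_indexable = True
except TypeError:
    set_is_indexable = False
print(set_is_indexable)  # prints False
```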



## If Statements

If statements act like a yes/no question. If a condition is met, we can program the script to carry out an action. We can also specify what happens if the condition does not hold, and we can combine if statements with for loops and functions we have defined ourselves.

![Boolean Logic](https://github.com/varunsatish/Coding-Tutorials/blob/master/images/boolean_logic.png?raw=true)

In [None]:
numbers = [0,1,2,3,4,5]

for element in numbers:
    if element == 2:  # == tests for equality and evaluates to True or False
        print('Great')
    else:
        print('Not Great')
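
As a sketch of how an if statement can sit inside a user-defined function that is then called from a for loop, consider the following. The function name `describe` and the list `labels` are hypothetical; the logic mirrors the example above but stores the results instead of printing them one by one.

```python
def describe(number):
    # The if statement decides which value the function returns
    if number == 2:
        return 'Great'
    else:
        return 'Not Great'

numbers = [0, 1, 2, 3, 4, 5]
labels = []

for element in numbers:
    labels.append(describe(element))

print(labels)
# prints ['Not Great', 'Not Great', 'Great', 'Not Great', 'Not Great', 'Not Great']
```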

## Storing Data: Appending a List

Lists are extremely useful for storing unstructured data. One of the most useful ways of storing data is to append to an empty list. You can think of this intuitively as creating an empty box and iteratively placing values in the box. In practice, this method works really well with for loops.

The algorithm looks like this:

![append list](https://github.com/varunsatish/Coding-Tutorials/blob/master/images/app_list.jpg?raw=true)

This is an example of the code:

In [3]:
output_list = [] #creating an empty list

inputs = [1,2,3,4,5]

for i in range(0, len(inputs)):
    output = inputs[i]**(2)
    output_list.append(output)
    
print(output_list)

[1, 4, 9, 16, 25]


## Exercise 2.1

Suppose we are conducting linear regression in the univariate case, that is, our model looks like this:

$$y_i = x_i\beta + u_i, \quad u_i \sim N(0, \sigma^2)$$

where $x_i$ is a regressor capturing the $i^{th}$ individual's age, and $u_i$ is the error term. You should hopefully be familiar with something that looks like this from your Econometrics/Statistics classes.

**Question 1**: We know the formula for $\hat{y}_i$ is as follows:

$$\hat{y}_i = x_i \hat{\beta}$$

Now, suppose we have estimated $\hat{\beta}$ to be 0.43. Define a function that would predict $\hat{y}_i$ given some $x_i$.

**Question 2**: Using a for loop, construct a list of $\hat{y}$ values given a list of `x` which looks like `x = [1, -5, 9, 4, 1, -3]`.

**Question 3**: For each $\hat{y}_i$, apply the following and store the results in a list:

![question](https://github.com/varunsatish/Coding-Tutorials/blob/master/images/2_1q3.jpg?raw=true)





**Advanced**: We know that the formula for the OLS estimate of $\beta$ is
$$\hat{\beta} = \frac{\sum_{i = 1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i = 1}^n (x_i - \bar{x})^2}$$




Define a function that will calculate $\hat{\beta}$ for a given list of $x = [x_1, x_2 \dots x_n]$ and $y = [y_1, y_2 \dots y_n]$. **Hint**: Try to think about what we would actually do if we did this by hand: we calculate $(x_1 - \bar{x})(y_1 - \bar{y}), (x_2 - \bar{x})(y_2 - \bar{y}) \dots (x_n - \bar{x})(y_n - \bar{y})$ and $(x_1 - \bar{x})^2, (x_2 - \bar{x})^2 \dots (x_n - \bar{x})^2$, then take sums and divide.

In [None]:
# Solution.

# Question one.

def compute_y_hat(beta, x):
    y_hat = beta*x
    return(y_hat)

# Question two.

x = [1, -5, 9, 4, 1, -3]
y = []

for x_i in x:
    y.append(compute_y_hat(0.43, x_i)) 
    
    


# Advanced.

# Numpy is a package that allows for mathematical computation.
import numpy as np

def beta_hat(x,y):
    # Finding the means -- we use a package known as numpy, typically read in as np.
    x_bar = np.mean(x)
    y_bar = np.mean(y)
    
    # Creating empty lists so that we can store individual numerators and denominators.
    numerator_list = []
    denominator_list = []
    for i in range(len(x)):
        # For each individual calculate the numerator
        numerator = (x[i] - x_bar)*(y[i] - y_bar)
        # For each individual calculate the denominator
        denominator = (x[i] - x_bar)**2
        # Store both the numerator and denominator in lists
        numerator_list.append(numerator)
        denominator_list.append(denominator)
    # Outside of the loop, take the sum of all the individual numerators and denominators to find beta hat
    b = sum(numerator_list)/sum(denominator_list)
    # Return the output
    return(b)
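
One way to sanity-check a function like this is to feed it data whose slope is known in advance. The sketch below is an assumption-laden illustration rather than part of the solution: `ols_slope` is a hypothetical pure-Python version of the same formula (it uses `sum` and `len` instead of numpy), and the data are constructed as $y = 2x + 1$, so the estimated slope should be 2.

```python
def ols_slope(x, y):
    # Means of x and y, computed without numpy
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)

    numerator = 0
    denominator = 0
    for i in range(len(x)):
        # Add each individual's contribution to the running totals
        numerator += (x[i] - x_bar) * (y[i] - y_bar)
        denominator += (x[i] - x_bar) ** 2

    return numerator / denominator

# Hypothetical data lying exactly on the line y = 2x + 1
x_values = [1, 2, 3, 4, 5]
y_values = [3, 5, 7, 9, 11]

slope = ols_slope(x_values, y_values)
print(slope)  # prints 2.0
```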

        
        