# Lecture 4 - Intro to Python for Econometrics

## Software

### What is Python?

Python is a high-level, interpreted programming language that has gained massive popularity in the last decade. Its simple and readable syntax makes it relatively easy to learn, while its many libraries and modules allow for advanced functionality.

Python is a versatile language that can be used for various applications, from web development to data analysis, scientific computing, machine learning, and more. In econometrics, Python's powerful data analysis libraries, such as NumPy, Pandas, Scikit-learn and Statsmodels, make it an ideal tool for statistical analysis and modelling.

### What is a Jupyter Notebook?

A Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualisations, and text. The term "notebook" references the traditional notebook used in labs or research to keep track of one's work and observations.

Jupyter notebooks support over 40 programming languages, including Python, and are widely used in data cleaning and transformation, numerical simulation, statistical modelling, data visualisation, machine learning, and more.

Each notebook is made up of a series of cells. These cells can contain either code or markdown text. The markdown cells are used for text and include formatted text, images, hyperlinks, LaTeX equations, etc. The code cells contain the actual code that you want to run. You can run each cell individually, and it will display any output directly below the cell.

### What is Binder?

Binder is a free, open-source online service that allows you to turn a GitHub repository into a collection of interactive notebooks. It's powered by BinderHub, an open-source tool that deploys the Binder service in the cloud. This makes it an excellent tool for sharing reproducible research, as others can view your work and interact with the code and data.

In this class, we'll use Binder to host our Python notebooks, and it is what this notebook is running on right now. This means you can run and experiment with the code provided without installing Python or any of the necessary libraries on your local machine.

## Python Basics


### Variables
Variable assignment associates a value with a variable.

Below, we assign the value “Hello World” to the variable `x`.

In [1]:
x = "Hello World"

Once we have assigned a value to a variable, Python will remember this variable as long as the current session of Python is still running.

Notice how writing x into the prompt below outputs the value “Hello World”.

In [2]:
x

'Hello World'

### Comments
Comments are lines of code that are not run by Python. They are used to leave short notes or explanations to make your code easier to understand. 

A comment is created by using the # symbol. Any text that follows the # symbol on the same line is ignored by Python.

In [3]:
i = 10   # Assign 10 to i
j = 100  # Assign 100 to j

# Add i and j and assign the result to k
k = i + j
k

110

### Functions
Functions are blocks of code that perform a specific task. They are useful for executing repetitive tasks without having to rewrite the same code over and over again.

Functions can take in inputs, called arguments, and return outputs.

For example, the `len()` function takes in a string or list as an argument and returns the length of the string or list.

In [4]:
len("Hello World") # Length of a string

11

In [5]:
len([1, 2, 3, 4, 5]) # Length of a list

5
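
You can also write your own functions. As a minimal sketch, the example below uses Python's `def` keyword to define a hypothetical function, `add_numbers` (the name and its arguments are illustrative and not part of the lecture code), that takes two arguments and returns their sum.

```python
# `add_numbers` is a made-up example function, not a built-in
def add_numbers(first, second):
    """Return the sum of the two arguments."""
    return first + second

total = add_numbers(10, 100)  # Call the function with two arguments
print(total)
```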

#### Help
We can figure out what a function does by asking for help.

In Jupyter notebooks, this is done by placing a `?` after the function name (without using parentheses) and evaluating the cell.

For example, we can ask for help on the print function by writing `print?`.

Depending on how you launched Jupyter, the help appears in one of two ways:

- JupyterLab: the help is displayed as text below the cell.
- Classic Jupyter Notebook: the help is displayed in a new panel at the bottom of your screen. You can close this panel by pressing the escape key or clicking the x at the top right of the panel.


In [6]:
print?

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method

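The `?` syntax is specific to Jupyter and IPython. As a sketch of an alternative that should also work in a plain Python session, the built-in `help()` function prints the same kind of documentation, and the text itself is stored in the function's `__doc__` attribute. The variable names `doc_text` and `first_line` are illustrative.

```python
# Print the documentation for the built-in print function
help(print)

# The same text is available as a string via the __doc__ attribute
doc_text = print.__doc__
first_line = doc_text.splitlines()[0]
print(first_line)
```

The exact wording of the documentation may differ between Python versions.
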
#### Objects and Types
Everything in Python is an object.

Objects are “things” that contain 1) data and 2) functions that can operate on the data.

Sometimes we refer to the functions inside an object as methods.

We can investigate what data is inside an object and which methods it supports by typing `.` after that particular variable, then hitting `TAB`.

It should then list data and method names to the right of the variable name like this:

![Auto Complete list](img/obj_list.png)

We often refer to this as “tab completion”.

Let’s do this below. Keep scrolling down until you find the method `split`.



In [7]:
# Type a period after x and then find split in the list of methods
# Once you find it, press enter and add () to the end of it like split()
x 

'Hello World'
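
If tab completion is not available, the built-in `dir()` function gives similar information as a list of names. The sketch below, which assumes `x` still holds "Hello World", filters out the names that start with an underscore and then calls `split`. The names `public_names` and `words` are illustrative.

```python
x = "Hello World"

# List the attribute and method names of x, skipping the "_"-prefixed ones
public_names = [name for name in dir(x) if not name.startswith("_")]
print("split" in public_names)

# With no arguments, split() breaks the string on whitespace
words = x.split()
print(words)
```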

We often want to identify what kind of object some value is, called its “type”.

A “type” is an abstraction which defines a set of behaviors for any “instance” of that type. For example, `2.0` and `3.0` are instances of `float`, where `float` has a particular set of common behaviors.

In particular, the type determines:

- the available data for any “instance” of the type (where each instance may have different values of the data).
- the methods that can be applied to the object and its data.

We can find the type of an object by using the `type` function.

The `type` function takes a single argument and outputs the type of that argument.



In [8]:
type(2)

int

In [9]:
type("Hello World")

str

In [10]:
type([1, 2, 3])

list
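
To illustrate how the type determines which methods are available, the sketch below collects the type names of a few values and checks whether an `upper` method exists on a string and on an integer. The built-in `hasattr()` and `isinstance()` functions are not covered elsewhere in this lecture, and the variable names are illustrative.

```python
values = [2, 2.0, "Hello World", [1, 2, 3]]

# Collect the name of each value's type
type_names = [type(value).__name__ for value in values]
print(type_names)

# Strings have an upper method, integers do not
print(hasattr("Hello World", "upper"))
print(hasattr(2, "upper"))

# isinstance checks whether a value is an instance of a given type
print(isinstance(2.0, float))
```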

## Modules
Python takes a modular approach to tools, meaning that it does not come with all the functionality built in. Instead, we need to import the modules that we want to use.

The main modules that we will use are:

- `numpy` for numerical computing
- `pandas` for working with data
- `matplotlib` for plotting
- `statsmodels` for statistical modelling
- `scikit-learn` for machine learning

We can import a module using the `import` keyword.

```
import package_name
```

We can also import a module and give it a shorter alias using the `as` keyword.

```
import package_name as alias
```

We are then able to use the functions and methods from the module by prefixing them with the module name or alias.

```
package_name.function_name()  # or
alias.function_name()
```

For example, to find the Python version we are using, we can use the `sys` module.

In [11]:
import sys
sys.version

'3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:39:40) [Clang 15.0.7 ]'
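
The alias form works the same way. As a small illustration, the sketch below imports the standard-library `statistics` module under the alias `stats` (the alias is an arbitrary choice for this example) and calls its `mean` function.

```python
import statistics as stats  # `stats` is an arbitrary alias

# Prefix the function with the alias to call it
average = stats.mean([1, 2, 3, 4])
print(average)
```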

We can also import specific functions from a module using the `from` keyword.

```
from package_name import function_name
```

For example, to import the `sqrt` function from the `math` module, we can use:


In [12]:
from math import sqrt
sqrt(4)

2.0

## Operations

### Arithmetic Operations
Python supports the following arithmetic operations:

- Addition: `+`
- Subtraction: `-`
- Multiplication: `*`
- Division: `/`
- Exponentiation: `**`
- Modulo: `%`
- Floor Division: `//`

Use parentheses `( )` to group operations.

In [13]:
a = 4
b = 2.0

print("a + b is", a + b)
print("a - b is", a - b)
print("a * b is", a * b)
print("a / b is", a / b)
print("a ** b is", a**b)
print("a // b is", a//b)

# Putting operations together
print("(a^2 + b^3)/(1/2) is", (a**2 + b**3)/(1/2))

a + b is 6.0
a - b is 2.0
a * b is 8.0
a / b is 2.0
a ** b is 16.0
a // b is 2.0
(a^2 + b^3)/(1/2) is 48.0
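
The modulo operator `%` can be demonstrated in the same way. As a brief sketch with illustrative values (`m` and `n` are not part of the lecture code), the example below shows how `/`, `//` and `%` relate when dividing two integers: floor division gives the whole-number quotient and modulo gives the remainder.

```python
m = 7  # illustrative values, not from the lecture
n = 2

print("m / n is", m / n)    # true division always returns a float
print("m // n is", m // n)  # floor division rounds down to a whole number
print("m % n is", m % n)    # modulo returns the remainder

# The quotient and remainder recombine to give the original number
print("(m // n) * n + m % n is", (m // n) * n + m % n)
```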


### Booleans and Comparison Operations

A boolean is a type that denotes true or false.

In [14]:
x = True
y = False  # Note the capitalisation

type(x)

bool

In [15]:
y

False


We usually create Booleans by making comparisons. Python supports the following comparison, identity, membership and Boolean operations:

- Equal to: `==`
- Not equal to: `!=`
- Greater than: `>`
- Less than: `<`
- Greater than or equal to: `>=`
- Less than or equal to: `<=`
- Identity: `is`
- Negated identity: `is not`
- Membership: `in`
- Negated membership: `not in`
- Boolean OR: `or` 
- Boolean AND: `and`
- Boolean NOT: `not`

In [16]:
# Boolean values
a = 4
b = 2

print("a > b", "is", a > b)
print("a == b", "is", a == b)
print("a >= b", "is", a >= b)

# Membership operators
print("a in [1, 2, 3]", "is", a in [1, 2, 3])
print("a not in [1, 2, 3]", "is", a not in [1, 2, 3])


a > b is True
a == b is False
a >= b is True
a in [1, 2, 3] is False
a not in [1, 2, 3] is True
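
The `and`, `or`, `not` and `is` operators can be explored in the same way. The sketch below uses illustrative variables (`n`, `p` and `q` are not part of the lecture code) to show how they behave. Note that `==` compares values, while `is` checks whether two names refer to the same object.

```python
n = 5
p = [1, 2, 3]
q = [1, 2, 3]  # equal contents, but a separate list object

print("n > 0 and n < 10", "is", n > 0 and n < 10)
print("n < 0 or n > 3", "is", n < 0 or n > 3)
print("not n > 0", "is", not n > 0)

print("p == q", "is", p == q)
print("p is q", "is", p is q)
```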


What do you think the following expressions will evaluate to? (Try to guess before uncommenting and evaluating the cell below)

In [27]:
# bool_1 =  a in [1, 2, 3] or a in [4, 5, 6] 
# print("a in [1, 2, 3] or a in [4, 5, 6] : ", bool_1)

# bool_2 =  a in [1, 2, 3] and a in [4, 5, 6]
# print("a in [1, 2, 3] and a in [4, 5, 6] : ", bool_2)

## Strings
Python works well with text data. Text data is represented using strings.

To denote that you would like something to be stored as a string, you place it inside of quotation marks.

For example,

```
"This is a string"      # Double quotes
'This is also a string' # Single quotes
This is not a string    # No quotes
```

Some operations that we can perform on strings include:

- Putting two strings together: `+`
- Repeating a string: `*`

In [56]:
x = "Hello"
y = "World"

print("x + y is", x + y)
print("x * 3 is", x * 3)

x + y is HelloWorld
x * 3 is HelloHelloHello


Other arithmetic operations are not supported on strings, and will result in a `TypeError`.

A `TypeError` is an error that is raised when an operation or function is applied to an object of an inappropriate type.

In [57]:
x - y

TypeError: unsupported operand type(s) for -: 'str' and 'str'
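
A similar error arises when combining a string with a number using `+`. As a hedged sketch, the example below uses `try` and `except` (not covered in this lecture) to catch the `TypeError` instead of stopping, and then avoids the problem by converting the number with `str()`. The names `label`, `years`, `message` and `fixed` are illustrative.

```python
label = "T = "
years = 30

try:
    message = label + years  # str + int is not supported
except TypeError as error:
    message = "TypeError: " + str(error)
print(message)

# Converting the integer to a string first makes the concatenation valid
fixed = label + str(years)
print(fixed)
```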

There are many methods that we can use on strings. We can see a list of them by using tab completion.

In [61]:
# Add a `.` after x and press tab to see the list of methods
x.find("e")

1

In [64]:
x.upper()

'HELLO'

In [60]:
x.count("l")

2
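
String methods return a new value and leave the original string unchanged. The sketch below shows this with a few more methods that appear in the tab-completion list, assuming `x` still holds "Hello". The result names are illustrative.

```python
x = "Hello"

lower_x = x.lower()               # all characters in lower case
replaced_x = x.replace("l", "L")  # swap every "l" for "L"
starts_with_he = x.startswith("He")

print(lower_x, replaced_x, starts_with_he)
print(x)  # the original string is unchanged
```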

## Exercises

Create the following variables:

- `D`: A floating point number with a value of $10,000$
- `r`: A floating point number with a value of $0.025$
- `T`: An integer with a value of $30$

In [None]:
# Your code here


Now let's calculate the present discounted value of a single payment `D` received `T` years from now, discounted at rate `r`, using the formula

$$
PDV = \frac{D}{(1 + r)^T}
$$

Assign the result to a variable called `pdv`.

In [None]:
# Your code here
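
To see the formula in action without giving away the exercise, here is a sketch with different, made-up numbers: a payment of 500 received in 2 years and discounted at 5% per year. The variable names `payment`, `rate`, `years` and `present_value` are illustrative.

```python
payment = 500.0  # illustrative values, not the exercise's D, r and T
rate = 0.05
years = 2

# Discount the payment back to today: payment / (1 + rate)^years
present_value = payment / (1 + rate) ** years
print(round(present_value, 2))
```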


## Resources

There are many ways to set up Python on your computer. The easiest way is to download and install the Anaconda distribution, which comes with Python and many of the necessary libraries pre-installed, including Jupyter Notebooks. You can find out more about Anaconda and download it from [here](https://docs.anaconda.com/free/anaconda/install/). You can find out more about working with Anaconda [here](https://docs.anaconda.com/free/anaconda/getting-started/) and the references therein.




