# Introduction to Python

## Python 3


### Variables and data types

Variables allow us to store values. They are created by assigning a value to them.




In [None]:
x = 5
y = "Finance"
print(x)

### Comment

To insert a comment in your code, type # and Python will treat the rest of the line as a comment.

In [None]:
print(y) # The print function takes an object and prints an output associated with it on the screen

Variables store values of different types. These can be specified directly. 

<b>Strings</b> can store any sequence of characters. They must be enclosed in quotes (preferably double).

<b>Integers</b> can store whole numbers.

<b>Floating points</b> can store numbers with decimals.

You can also get the type of a variable with the type() function.

In [None]:
a = str("Bubble")   # Data type = string
b = int(10)         # Data type = integer
c = float(1.3)      # Data type = float

print(type(a))
print(type(b))
print(type(c))
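The constructor calls above are optional, because Python infers the type from the value itself. The sketch below shows this, along with converting between types; the variable names are illustrative and do not come from the original notebook.

```python
# Python infers the type from the literal, so no constructor call is needed
name = "Bubble"   # quotes make this a str
count = 10        # a whole number is an int
rate = 1.3        # a decimal point makes this a float

print(type(name), type(count), type(rate))

# Constructors are mainly useful for converting between types
count_from_text = int("10")   # str -> int
rate_as_text = str(rate)      # float -> str

print(count_from_text + 1)
print(rate_as_text)
```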

### Naming conventions

To name variables, it is important that we follow certain rules and conventions:

- Variable names should be meaningful;
- Variable names cannot start with a number;
- Variable names can only contain alphanumeric characters and underscores;
- Variable names cannot contain blank spaces; use an underscore instead;
- Variable names are case sensitive.

In [None]:
var2 = 1
variableName = 2
var_name = 3

print(var2, variableName, var_name)
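The rules above can be checked directly in code. This is a small sketch with made-up names: it shows that names differing only in case are separate variables, and it uses the built-in string method isidentifier() to test candidate names. Note that isidentifier() only checks the character rules; it does not check whether a name is meaningful or whether it is a reserved keyword.

```python
# Names are case sensitive: these are two different variables
price = 100
Price = 250
print(price, Price)

# str.isidentifier() reports whether a string follows the character rules
for candidate in ["var_name", "var2", "2var", "var name", "var-name"]:
    print(candidate, candidate.isidentifier())
```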

### Basic mathematical operations

Arithmetic operators:

| Operator | Name | Example |
| :-: | :-: | :-: |
| + | Addition | a + b |
| - | Subtraction | a - b |
| * | Multiplication | a * b |
| / | Division | a / b |
| ** | Exponentiation | a ** b |



In [None]:
a = 4 + 7
b = 4 ** 2
print(a)
print(b)
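Two details of these operators are worth seeing in a short example: division always returns a float, and multiplication is evaluated before addition unless parentheses say otherwise. The variable names below are illustrative only.

```python
quotient = 7 / 2        # division returns a float
even_split = 8 / 2      # still a float, even though 8 divides evenly by 2
power = 2 ** 3          # 2 raised to the power 3

print(quotient, even_split, power)
print(type(even_split))

# Multiplication binds more tightly than addition
without_parentheses = 2 + 3 * 4
with_parentheses = (2 + 3) * 4

print(without_parentheses, with_parentheses)
```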

Comparison operators:

| Operator | Name | Example |
| :-: | :-: | :-: |
| == | Equals | a == b |
| != | Not equals | a != b |
| > | Greater than | a > b |
| < | Less than | a < b |
| >= | Greater than or equal to | a >= b |
| <= | Less than or equal to | a <= b |

In [None]:
a != b
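Each comparison evaluates to a Boolean value, either True or False, which can be printed or stored like any other value. In the sketch below, the values 11 and 16 are set explicitly so that the block is self-contained.

```python
a = 11
b = 16

print(a == b)   # False
print(a != b)   # True
print(a > b)    # False
print(a <= b)   # True

# The result of a comparison can be stored in a variable
is_different = a != b
print(type(is_different))
```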

Assignment operators:

| Operator | Name | Example | Represents |
| :-: | :-: | :-: | :-: |
| `+=` | Addition assignment | a `+=` b | a = a + b |
| `-=` | Subtraction assignment | a `-=` b | a = a - b |
| `*=` | Multiplication assignment | a `*=` b | a = a * b |
| `/=` | Division assignment | a `/=` b | a = a / b |
| `**=` | Exponentiation assignment | a `**=` b | a = a ** b |

In [None]:
a = 11
a += 5
print(a)
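The remaining assignment operators in the table work the same way. Here is a short sketch that applies them one after another to a single illustrative variable called value.

```python
value = 11
value += 5     # value = value + 5   -> 16
value -= 6     # value = value - 6   -> 10
value *= 3     # value = value * 3   -> 30
value /= 4     # value = value / 4   -> 7.5 (division returns a float)
value **= 2    # value = value ** 2  -> 56.25

print(value)
```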

### If statements

Conditional logic can be constructed with <b>if</b> statements: the indented block runs only when the condition is true.

Python requires indentation to identify a block of code.


In [None]:
if a > b:
    print("a is greater than b")

<b>Else</b> statements can be combined with <b>if</b> to execute code when the previous condition is not satisfied.

In [None]:
if a > b:
    print("a is greater than b")
else:
    print("a is not greater than b")

<b>Elif</b> statements allow us to test multiple conditions.

In [None]:
if a > b:
    print("a is greater than b")
elif a == b:
    print("a equals b")
elif a == b:
    print("a equals b")
elif a < b:
    print("a is less than b")
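The three keywords can also be combined in a single chain, with a final else acting as a catch-all for every case not matched earlier. This is a minimal sketch; the values of a and b and the variable message are chosen only for illustration.

```python
a = 5
b = 5

if a > b:
    message = "a is greater than b"
elif a == b:
    message = "a equals b"
else:
    # Reached only when both conditions above are false
    message = "a is less than b"

print(message)
```

Python checks the conditions from top to bottom and runs only the first block whose condition is true.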

### For loops

<b>For</b> statements allow us to loop through a block of code a specified number of times, or once for each item in a sequence.

The <b>range()</b> function is useful to create a counter.

Called with a single argument, it starts at 0 and increases in increments of 1, producing as many values as the number we specify. The number itself is not included.

In [None]:
for c in range(5):
    print(c)

It is also possible to specify a different initial value and an ending value for the <b>range()</b> function. The ending value is not included.

In [None]:
for c in range(2, 5):
    print(c)
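To see every value a range produces without writing a loop, it can be converted to a list. The sketch below also shows an optional third argument, the step size, which is not covered in the text above.

```python
first_five = list(range(5))          # starts at 0, stops before 5
two_to_four = list(range(2, 5))      # starts at 2, stops before 5
every_third = list(range(0, 10, 3))  # optional third argument sets the increment

print(first_five)
print(two_to_four)
print(every_third)
```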

An <b>else</b> statement can also be used to execute an operation after the loop has concluded. Unlike with <b>if</b>, <b>elif</b> cannot be attached to a loop.

In [None]:
for c in range(2, 5):
    print(c)
else:
    print("Task completed")
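The else block of a loop runs only when the loop finishes without being interrupted. If the loop is stopped early with break, a statement not used elsewhere in this notebook, the else block is skipped. A small sketch, collecting the output in an illustrative list called messages:

```python
messages = []

for c in range(2, 5):
    if c == 3:
        messages.append("stopped early")
        break                      # leaves the loop immediately
    messages.append(c)
else:
    # Skipped here, because the loop ended with break
    messages.append("Task completed")

print(messages)
```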

### Libraries

Python libraries provide ready-made functionality that is helpful for performing mathematical operations.

[<b>NumPy</b>](https://numpy.org/doc/stable/reference/index.html) is a library used for working with arrays.

In [None]:
import numpy as np

array = np.array((0, 1, 2, 3, 4, 5, 6, 7, 8)) # creating an array

print(array)

In [None]:
x = array[0] # getting the value of the first array item (index 0) and assigning to variable x

print("The value of the first item is", x)

In [None]:
print(array[1:4]) # slicing items from index 1 to 4 (excluded)

In [None]:
print(array[2:]) # slicing items from index 2 to the end of the array

In [None]:
print(array[:5]) # slicing items from the beginning of the array to index 5 (excluded)
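Because the end index is excluded, a slice from index i to index j always contains j - i items, and splitting an array at any index gives two pieces that fit back together exactly. The sketch below recreates the same array so that it runs on its own; negative indexing is an addition not shown in the cells above.

```python
import numpy as np

array = np.array((0, 1, 2, 3, 4, 5, 6, 7, 8))

middle = array[1:4]     # items at index 1, 2 and 3
print(middle, len(middle))

last_item = array[-1]   # negative indices count from the end
print(last_item)

# Splitting at index 5 gives two pieces that together cover the whole array
head = array[:5]
tail = array[5:]
print(head, tail)
```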

In [None]:
np.exp(4) # Exponential function: e raised to the power 4

In [None]:
np.log(4) # Natural logarithm

In [None]:
np.sum(array) # Sum 

In [None]:
np.mean(array) # Mean

In [None]:
np.std(array) # Standard deviation (population formula by default)
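The results of these functions can be stored in variables and reused. The sketch below recreates the same array and also shows the ddof argument of np.std, which switches from the population formula (divide by n, the default) to the sample formula (divide by n - 1). The variable names are illustrative.

```python
import numpy as np

array = np.array((0, 1, 2, 3, 4, 5, 6, 7, 8))

total = np.sum(array)
average = np.mean(array)
print(total, average)

population_std = np.std(array)       # divides by n (default, ddof=0)
sample_std = np.std(array, ddof=1)   # divides by n - 1
print(population_std, sample_std)

# The natural logarithm undoes the exponential function
round_trip = np.log(np.exp(4))
print(round_trip)
```

The sample formula is the one usually wanted when the data are a sample drawn from a larger population.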

[<b>Pandas</b>](https://pandas.pydata.org/docs/) is a library used for working with data sets.

In [None]:
import pandas as pd

data = pd.read_excel("caschool.xlsx") # Importing our data set as a Pandas Data Frame

data

We can also use Pandas to create tables from arrays.

Suppose we wish to create the following table:


| row name | column 1 | column 2 |
| --- | --- | --- |
| first row | 1 | 2 |
| second row | 3 | 4 |



In [None]:
# Initialize data in rows
data = [['first row', 1, 2], ['second row', 3, 4]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['row name', 'column 1', 'column 2'])

print(df)
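Once the table exists, individual cells, columns and row totals can be read from it. This is a minimal sketch that rebuilds the same Data Frame and uses the 'row name' column as row labels; the variable name indexed is an illustrative choice.

```python
import pandas as pd

data = [['first row', 1, 2], ['second row', 3, 4]]
df = pd.DataFrame(data, columns=['row name', 'column 1', 'column 2'])

# Use the 'row name' column as the row labels
indexed = df.set_index('row name')

# Look up a single cell by row label and column label
cell = indexed.loc['second row', 'column 2']
print(cell)

# Sum down a column, and across each row
column_total = indexed['column 1'].sum()
row_totals = indexed.sum(axis=1)
print(column_total)
print(row_totals)
```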

# Practice Questions

## Question 1

Using the table below, answer the following questions.

|  | Unemployed (Y=0) | Employed (Y=1) | Total |
| :-: | :-: | :-: | :-: |
| Non-college grads (X=0) | 0.078 | 0.673 | 0.751 |
| College grads (X=1) | 0.042 | 0.207 | 0.249 |
| Total | 0.120 | 0.880 | 1.000 |

<b> a. Recreate this table in Python as a Pandas Data Frame </b>

<b> b. Compute the marginal distribution of Y </b> 

<b> c. Find E(Y) </b> 

<b> d. Find E(Y|X=0) and E(Y|X=1) </b>

<b> e. Find the difference in means </b>

## Question 2

The spreadsheet 'Age_HourlyEarnings.xlsx' contains the joint distribution of age (Age) and average hourly earnings (AHE) for 25- to 34-year-old full-time workers in 2015 with an education level that exceeds a high school diploma. Use this joint distribution to carry out the following exercises.

The dataset is used in the reference textbook Introduction to Econometrics, 4th edition (Stock and Watson).

<b>Download the dataset [here](https://www.princeton.edu/~mwatson/Stock-Watson_3u/Students/EE_Datasets/Age_HourlyEarnings.xlsx)</b>

In [None]:
ahe = pd.read_excel('Age_HourlyEarnings.xlsx')

<b> a. Compute the marginal distribution of Age </b>

<b> b. Compute the mean of AHE for each value of Age; that is, compute E(AHE|Age = 25), and so forth </b>

<b> c. Compute and plot the mean of AHE versus Age. Are average hourly earnings and age related? Explain. </b>