# QF 625 Introduction to Programming
## Lesson 2 | An Introduction to `NumPy` | `RE`view

> Welcome back, Team :) Let's begin our first meeting of Week 2.

> In the previous lesson, you learned about the `methods and functions` that are available in built-in Python, along with variables and data types.

> First, let us begin with some basic built-in Python that you need to fully understand before we proceed.

> Here's a quick reminder regarding the useful hotkeys for scripting on Jupyter Notebook :)

- `a` inserts a cell above 
- `b` inserts a cell below
- `dd` (double d) deletes a cell 
- `esc` jumps out of the cell
- `return/enter` gets into the cell
- `m` makes your cell markdown
- `y` makes your cell code
- `shift + control + -` splits the cell
- `shift + m` merges the cells
- `shift + l` toggles line numbers in all cells (`l` alone toggles them in the current cell)

> Now you will start learning about how to use `packages`.

> As Python is an open-source language, its huge ecosystem of packages will help you work more efficiently.

> A `package is a collection of Python modules and scripts` giving you new data types, functions, and methods.


### How to `install` a ***package***?

> To use a package, you need to download it first.

> Let's use the command `pip3 install target_package` so that you can install packages of your interest.

> You need to install a package only once, but you must import it into your workspace every time you wish to use it.

> The command below will load the NumPy package into Python for your use.

In [None]:
import numpy as np

***Wait, why do we use an alias here (i.e., `as np`)?***

> As you will see below, 

- To access the array() function, you need to use np.array() to indicate that the function is from the NumPy package.

> Yes, we use `np` as an alias to reduce typing: `np.array()` is shorter than `numpy.array()`.
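> As a minimal sketch of what the alias changes (the variable names `array_long` and `array_short` are only illustrative), both import styles give access to the same function:

```python
import numpy
import numpy as np

# Both names refer to the same module, so both calls build the same array.
array_long = numpy.array([1, 2, 3])
array_short = np.array([1, 2, 3])

print(np is numpy)
print(array_long, array_short)
```

> The alias `np` is a community convention rather than a requirement, but following it makes your code easier for others to read.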

### The Basics

> Using `NumPy`, you can create a new data type called `array`.

> Why use the `array` data type?

**`array` is useful for financial analysis because...**

- array `stores` data more efficiently

- array `performs` computations faster than built-in Python lists (reading and writing items is faster because the package is optimized for numerical analysis)

- array `shows` better performance with relatively larger datasets

- array, most importantly, **`enables` you to utilize `array-related functions`**--you can perform statistical modelling and visualization more easily, which is critical for financial analysis.
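> To make the storage point concrete, here is a rough sketch that compares the memory used by a list and by an array holding the same 1,000 integers. The names `numbers_list` and `numbers_array` are illustrative, and the exact byte counts for the list depend on your Python version, so treat the figures as indicative only.

```python
import sys
import numpy as np

numbers_list = list(range(1000))
numbers_array = np.array(numbers_list, dtype=np.int64)

# A list stores references to separate Python int objects,
# so we count the list container plus every element it points to.
list_bytes = sys.getsizeof(numbers_list) + sum(sys.getsizeof(n) for n in numbers_list)

# An array stores the raw values side by side: 8 bytes per int64.
array_bytes = numbers_array.nbytes

print(list_bytes, array_bytes)
```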





> **A good way to understand the usefulness of NumPy is to compare arrays with lists (yes, the lists that you learned about in the previous lesson).**

#### Difference 1. Arrays can contain only a single data type (unlike lists).

In [None]:
your_list = ["Year", 2019, False]
print(your_list)

In [None]:
type(your_list)

> As you will see below, NumPy converts all the elements in the list to a single common data type (here, strings).

In [None]:
# Note that function array() takes a list as its input. 
your_array = np.array(["Year", 2019, False])

# As noted above, to use array(), you need to use np.array()
print(your_array)

your_array2 = np.array([2020, "Coronavirus", True])
print(your_array2)

In [None]:
print(type(your_array))
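> To see the conversion more directly, you can inspect the `dtype` attribute of an array. The sketch below uses two illustrative arrays (`mixed_array` and `numeric_array` are names introduced here, not part of the lesson's data).

```python
import numpy as np

mixed_array = np.array(["Year", 2019, False])   # text is present, so everything becomes text
numeric_array = np.array([1, 2.5, 3])           # integers and floats become floats

print(mixed_array.dtype.kind)    # 'U' stands for Unicode string
print(numeric_array.dtype)
print(mixed_array[1] == "2019")  # the number is now stored as a string
```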

In [None]:
# Here are lists.
earnings_list = [10.09, 10.28, 2.21, 6.19, 8.24]
prices_list = [99.98, 87.68, 154.23, 162.12, 121.11]

> How would you make objects `earnings` and `prices` arrays?

In [None]:
earnings_array = np.array(earnings_list)
prices_array = np.array(prices_list)

print(earnings_array)
print(prices_array)

#### Difference 2. Arrays behave differently from lists under arithmetic operations.

> Let's see how lists behave first.

In [None]:
pe_ratio_list = prices_list + earnings_list
print(pe_ratio_list)

> The two objects were merely concatenated. That's not what we want...

> ***Arrays allow for efficient numerical manipulation of their elements.***

> Let's calculate `the dollar amount an investor can expect to invest in a company to receive one dollar of that company’s earnings`--yes, the `price to earnings ratio`--using two arrays, earnings_array and prices_array above.

In [None]:
pe_ratio_array = prices_array / earnings_array
print(pe_ratio_array)

> You can see here that arrays perform `element-wise mathematical operations`.
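> For comparison, here is a small sketch of the same calculation done with plain lists and with arrays. The names `sample_prices` and `sample_earnings` are illustrative and hold only the first three values from the lesson's data.

```python
import numpy as np

sample_prices = [99.98, 87.68, 154.23]
sample_earnings = [10.09, 10.28, 2.21]

# With lists, the division has to be spelled out pair by pair.
pe_from_lists = [p / e for p, e in zip(sample_prices, sample_earnings)]

# With arrays, the same calculation is a single expression.
pe_from_arrays = np.array(sample_prices) / np.array(sample_earnings)

print(pe_from_lists)
print(pe_from_arrays)
```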

#### Indexing, Subsetting, Filtering, & Slicing: Similarities between `array` and `list`

> We have seen differences between arrays and lists.

> There are also similarities.

In [None]:
earnings_subset_three_in_the_middle = earnings_array[1:4]
print(earnings_subset_three_in_the_middle)

In [None]:
earnings_subset_the_last_two = earnings_array[-2:]
print(earnings_subset_the_last_two)

In [None]:
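earnings_subset_every_other_element = earnings_array[0:5:2]
print(earnings_subset_every_other_element)

> The slice above follows the pattern `[start:stop:step]`, so `[0:5:2]` selects every other element. If you see an error message above, please address it before moving on.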

### Arrays in NumPy can be `multi`dimensional.

![](ndim.png)
#### How to add an image (double-click this cell to see the Markdown)

> Financial data often comes in a rectangular form with rows and columns.

> Such data can be represented with two-dimensional arrays.

> To create a two-dimensional array using NumPy, you can use the same function array().

> Instead of providing a single list as your input, let's pass in a list of two lists as your input.

> Here, let's pass earnings and prices to create a two-dimensional array.

In [None]:
pe_array = np.array([[10.09, 10.28, 2.21, 6.19, 8.24],[99.98, 87.68, 154.23, 162.12, 121.11]])
print(pe_array)

# Recall that there were two lists of earnings_list and prices_list
pe_array2 = np.array([earnings_list, prices_list])
print(pe_array2)

In [None]:
pe_array == pe_array2

> You might want to use `boolean` arrays as well. 

> As you will see below, Boolean arrays are quite useful for subsetting--stay tuned :)
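> The `==` comparison above returns one Boolean per element. If you only want a single yes/no answer for the whole array, one option is `np.array_equal()`. A minimal sketch with two illustrative arrays (`first` and `second`):

```python
import numpy as np

first = np.array([[1.0, 2.0], [3.0, 4.0]])
second = np.array([[1.0, 2.0], [3.0, 4.0]])

elementwise = (first == second)             # Boolean array with the same shape as the inputs
all_equal = np.array_equal(first, second)   # one Boolean for the whole comparison

print(elementwise)
print(all_equal)
```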

#### Methods in Array

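> Like lists, arrays also have many useful methods and attributes. Each one below is also available as a NumPy function; for example, `np.shape(pe_array)` gives the same result as `pe_array.shape`.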

##### array.shape

In [None]:
np.shape(pe_array)

##### array.size

In [None]:
np.size(pe_array)

##### array.transpose

In [None]:
pe_array_transposed = np.transpose(pe_array)

In [None]:
print(pe_array_transposed)

In [None]:
print(pe_array_transposed.shape)
print(pe_array_transposed.size)

> Remember how to subset nested lists? Subsetting two-dimensional arrays is similar to subsetting nested lists. 

> In a 2D array, the indexing/slicing should be specific to the dimension of the array: **`array[row, column]`**
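> As a quick sketch of the difference in syntax, the example below indexes the same data as a nested list and as an array. The names `nested_list` and `demo_array` are illustrative.

```python
import numpy as np

nested_list = [[10.09, 10.28, 2.21], [99.98, 87.68, 154.23]]
demo_array = np.array(nested_list)

print(nested_list[1][2])   # nested list: one pair of brackets per level
print(demo_array[1, 2])    # array: row and column inside one pair of brackets

# A whole column needs a loop with nested lists but only a slice with arrays.
column_from_list = [row[0] for row in nested_list]
column_from_array = demo_array[:, 0]
print(column_from_list, column_from_array)
```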

##### How would you subset `earnings` from the `transposed pe_array`? 

In [None]:
earnings = pe_array_transposed[ : , 0]
print(earnings)

##### How would you subset `prices` from the `transposed pe_array`? 

In [None]:
prices = pe_array_transposed[ : , -1]
print(prices)

##### How would you subset the `earnings and prices for third and fourth companies` from the `transposed pe_array`?

In [None]:
pe_34 = pe_array_transposed[2:4, : ]
print(pe_34)

> ***Review & Expansion of Your Vocabulary: Below are some useful basics for array.***

In [None]:
# Get Dimension
pe_array_transposed.ndim

In [None]:
# Get Shape
pe_array_transposed.shape

In [None]:
# Get Type
pe_array_transposed.dtype

In [None]:
# Get the Size in Bytes of One Element in Your Array
pe_array_transposed.itemsize

In [None]:
# Get the Total Size in Bytes
pe_array_transposed.nbytes

In [None]:
# Get the Number of Elements
pe_array_transposed.size

In [None]:
# Get a specific element [row, column]
pe_array_transposed[1, 1]

In [None]:
# Get a specific row 
pe_array[1, :]

In [None]:
# Get a specific column
pe_array[:, 4]

In [None]:
# Getting a little more fancy [start:end:step]
pe_array[0, 1:-1:2]

#### `WARNING`: Please be careful when `copying arrays`!

In [None]:
a = np.array([1,2,3]) # Imagine that we have one array.
b = a
b[0] = 100

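print(b) # This is fine: b has changed, as expected.
print(a) # This is weird: a has changed too, because b = a makes both names refer to the same array.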

In [None]:
c = np.array([1,2,3])
d = c.copy() # use copy method 
d[0] = 100

print(c) # Now this will be fine :)
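> The same caution applies to slices: a slice of an array is a view of the original data, not a copy. The sketch below, with illustrative names, uses `np.shares_memory()` to check whether two arrays share the same underlying data.

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

view = original[1:4]                  # slicing does not copy the data
independent = original[1:4].copy()    # an explicit copy owns its own data

view[0] = 100                         # also changes original[1]
independent[1] = -1                   # leaves original untouched

print(original)
print(np.shares_memory(original, view))
print(np.shares_memory(original, independent))
```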

### Mathematics with NumPy

> **`We all love mathematics`. For a lot more**, [check this out](https://docs.scipy.org/doc/numpy/reference/routines.math.html).

- For example, for `linear algebra`, look [here](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html).

#### Statistics

> Not only can you perform element-wise calculations on NumPy arrays, you can also calculate summary statistics such as range, mean, and standard deviation of arrays using functions from NumPy.

##### Calculating the range (minimum and maximum values)

In [None]:
print(pe_array)

In [None]:
np.min(pe_array)

In [None]:
np.max(pe_array, axis=1)

In [None]:
np.sum(pe_array, axis=0)
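> The `axis` argument decides which dimension is collapsed. Here is a minimal sketch on a small illustrative array (`demo` is not part of the lesson's data):

```python
import numpy as np

demo = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = np.sum(demo)               # no axis: one number for the whole array
per_column = np.sum(demo, axis=0)  # collapse the rows: one result per column
per_row = np.sum(demo, axis=1)     # collapse the columns: one result per row

print(total)
print(per_column)
print(per_row)
```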

##### Calculating the mean (`mean`) and standard deviation (`std`)

In [None]:
earnings_mean = np.mean(earnings_array)
print(earnings_mean)

In [None]:
earnings_mean2 = np.mean(pe_array[0,:])
earnings_mean == earnings_mean2

In [None]:
prices_std = np.std(prices_array)
print(prices_std)

In [None]:
prices_std2 = np.std(pe_array[-1,:])
prices_std == prices_std2
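> One detail worth knowing: by default, `np.std()` computes the population standard deviation (dividing by n). If you treat your data as a sample, you can pass `ddof=1` to divide by n - 1 instead. A short sketch, using an illustrative copy of the prices:

```python
import numpy as np

sample_prices = np.array([99.98, 87.68, 154.23, 162.12, 121.11])

population_std = np.std(sample_prices)        # default ddof=0: divide by n
sample_std = np.std(sample_prices, ddof=1)    # ddof=1: divide by n - 1

print(population_std, sample_std)
```

> The sample version is always slightly larger, and the gap shrinks as the number of observations grows.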

##### Generating a sequence of numbers

> Often you may want to create an array of a range of numbers (e.g., 1 to 500) without having to type in every single number. 

> The NumPy function `arange()` is an efficient way to create numeric arrays of a range of numbers--using arange() can be much faster than typing each individual element.

> The arguments for `arange()` are the `start`, `stop`, and `step interval`, as follows: `np.arange(start, stop, step)`. Note that `stop` itself is excluded, which is why the code below uses 501 to end at 500.


In [None]:
ticker_ids = np.arange(1, 501 , 1)
print(ticker_ids)

> How would you create an array of `odd numbers only`?

In [None]:
ticker_ids_odd = np.arange(1, 501, 2)
print(ticker_ids_odd)

> How would you create an array of **`even`** numbers only, then?

In [None]:
ticker_ids_even = ticker_ids_odd + 1
print(ticker_ids_even)
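> Adding 1 to the odd numbers is one way to do it. Another option, sketched below with illustrative names, is to ask `arange()` for the even numbers directly by starting at 2.

```python
import numpy as np

up_to_nine = np.arange(1, 10)         # stop is excluded, and step defaults to 1
evens_direct = np.arange(2, 501, 2)   # start at 2 and step by 2

print(up_to_nine)
print(evens_direct[:5], evens_direct[-1])
```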

#### Boolean arrays can be a very powerful way to subset arrays. 

> As a case in point, let's try to identify the earnings that are lower than average from an array of earnings.

> To do so, let's find the mean value of earnings first.

In [None]:
earnings_mean = np.mean(earnings_array)

##### How would you index earnings that are less than average?

> Hint: You might want to create a boolean array first.

In [None]:
boolean_array = (earnings_array < earnings_mean)
print(boolean_array)

In [None]:
earnings_below_mean = earnings_array[boolean_array]
print(earnings_below_mean)

In [None]:
earnings_below_mean2 = earnings_array[earnings_array < earnings_mean]
earnings_below_mean == earnings_below_mean2

> Boolean arrays can be used with arrays of strings as well.

> Let's create arrays of company names and their associated industries first.

> Here, you want to find all companies that are categorized in the `Investment Services` industry.

In [None]:
company_array = np.array(["Facebook", "Amazon", "Goldman Sachs", "Red Bull", 
                         "Wells Fargo", "McKinsey", "Tesla"])
industry_array = np.array(["Internet", "Internet", "Investment Services", "Food & Beverage", 
                          "Investment Services", "Management Consulting", "Mobility"])

company_industry_array = np.array([company_array, industry_array])
print(company_industry_array)

##### How would you subset the Investment Services industry and print the companies in Investment Services?

In [None]:
bool_array = (industry_array == "Investment Services")
print(bool_array)

In [None]:
investment_services = company_array[bool_array]
print(investment_services)
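> Boolean arrays can also be combined when you need more than one condition. The sketch below uses illustrative copies of the earnings and prices, with thresholds chosen only for demonstration.

```python
import numpy as np

demo_earnings = np.array([10.09, 10.28, 2.21, 6.19, 8.24])
demo_prices = np.array([99.98, 87.68, 154.23, 162.12, 121.11])

# Each condition needs its own parentheses; use & for "and", | for "or".
high_earnings_low_price = (demo_earnings > 8) & (demo_prices < 100)

print(high_earnings_low_price)
print(demo_prices[high_earnings_low_price])
```

> Note that the Python keywords `and` and `or` do not work element-wise on arrays, which is why `&` and `|` are used here.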

> For your information, the `numpy_financial` package contains a collection of elementary financial functions.

> It will make your life easier when working with financial values.

> For example, the function `.pv(rate, nper, pmt, fv)` allows you to calculate the present value of an investment with the following parameters:

- `rate` The rate of return of the investment
- `nper` The number of periods (the lifespan of the investment)
- `pmt` The (fixed) payment at the beginning or end of each period
- `fv` The future value of the investment

> You can use this formula in many ways (e.g., you can calculate the present value of future investments in today's dollars).

In [None]:
import numpy_financial as npf

> Before you run the code above, you should have installed the package `numpy-financial` (e.g., with `pip3 install numpy-financial`).

In [None]:
your_investment = npf.pv(rate=0.04, nper=20, pmt=0, fv=15000)

> Here, the present value returned is negative (it represents cash going out), so we multiply the result by -1.

In [None]:
print("Your investment is worth " + str(round(-your_investment, 2)) + " in today's dollars")

In [None]:
your_friend_investment = npf.pv(rate=0.02, nper=40, pmt=0, fv=15000)
print("Your friend's investment is worth " + str(round(-your_friend_investment, 2)) + " in today's dollars")
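> When there are no periodic payments (`pmt=0`), the present value reduces to the standard discounting formula PV = FV / (1 + rate)^nper. The sketch below reproduces that arithmetic in plain Python, without `numpy_financial`. The helper name `present_value` is hypothetical, and unlike `npf.pv()` it returns a positive number.

```python
def present_value(rate, nper, fv):
    """Discount a single future amount back to today (no periodic payments)."""
    return fv / (1 + rate) ** nper

your_pv = present_value(rate=0.04, nper=20, fv=15000)
friend_pv = present_value(rate=0.02, nper=40, fv=15000)

print(round(your_pv, 2), round(friend_pv, 2))
```

> Writing the formula out by hand is a useful check that you understand what the package function computes.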

> Similarly, you can also calculate the future value of an investment with the following parameters:

- `rate` The rate of return of the investment
- `nper` The number of periods (the lifespan of the investment)
- `pmt` The (fixed) payment at the beginning or end of each period (which is 0 in our example)
- `pv` The present value of the investment

> Here, you can use the function .fv(rate, nper, pmt, pv).

> Note that you should `input a negative value into the pv parameter` if it represents `a negative cash flow (cash going out)`. 

> That is, if you were to compute the future value of an investment, requiring an up-front cash payment, you would need to `input a negative value to the pv parameter` in the function .fv().

##### Estimate Your Investment's Future Value

In [None]:
your_investment_future = npf.fv(rate=0.04, nper=20, pmt=0, pv=-20000)
print("Your investment will return a total of $" + str(round(your_investment_future, 2)) + " in 20 years")

##### Estimate the Future Value of Your Friend's Investment

In [None]:
your_friend_investment_future = npf.fv(rate=0.08, nper=20, pmt=0, pv=-20000)
print("Your friend's investment will return a total of $" + str(round(your_friend_investment_future, 2)) + " in 20 years")

##### Now let's adjust future values of your investment for inflation with the following steps:

**1. forecast the future value of an investment given a rate of return**

**2. discount the future value of the investment by a projected inflation rate**

> Here, we will `utilize both functions .fv() and .pv()` to estimate the projected value of a given investment in today's dollars, adjusted for inflation.

> ***Scenario***: `Investment returning 7% per year for 25 years`

In [None]:
your_brother_investment = npf.fv(rate=0.07, nper=25, pmt=0, pv=-15000)
print("Your brother's investment will return a total of $" + str(round(your_brother_investment, 2)) + " in 25 years")

> ***Scenario***: `Inflation rate of 2.5% per year for 25 years`

In [None]:
your_brother_investment_discounted = npf.pv(rate=0.025, nper=25, pmt=0, fv=your_brother_investment)
print("After adjusting for inflation, your brother's investment is worth $" + str(round(-your_brother_investment_discounted, 2)) + " in today's dollars")
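> The two steps above can also be written out by hand. The sketch below, in plain Python with illustrative variable names, compounds the investment at the rate of return and then discounts it at the inflation rate. It assumes no periodic payments, as in the scenario above.

```python
nominal_rate = 0.07
inflation_rate = 0.025
years = 25
amount_invested = 15000

# Step 1: grow the investment at the rate of return.
future_value = amount_invested * (1 + nominal_rate) ** years

# Step 2: discount the future value by the inflation rate.
real_value = future_value / (1 + inflation_rate) ** years

# The two steps combine into a single "real" growth factor per year.
real_growth = (1 + nominal_rate) / (1 + inflation_rate)
print(round(real_value, 2), round(amount_invested * real_growth ** years, 2))
```

> Both printed numbers should agree, which shows that adjusting for inflation is the same as compounding at the real growth factor.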

> `Thank you for working with the script :)`

In [None]:
exit()