# **NumPy And Pandas**


## <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">**Preparing the Notebook**</span>


## <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">Importing libraries and functions</span>

<span style="font-family: 'Garamond', serif;
    font-size: 16px;
    text-indent: 0.25in;
    line-height: 1.5;">

It's good practice to import all required libraries and modules in the first code cell of a Notebook. Unless a new function or module is being introduced, this chapter follows the practice. If a library is already installed, it's imported to the Notebook. If a library isn't installed, it's installed and then imported.</span>

<span style="font-family: 'Garamond', serif;
    font-size: 16px;
    text-indent: 0.25in;
    line-height: 1.5;">
This Notebook uses five modules: <font color='green'>os</font>, <font color='green'>sys</font>, <font color='green'>requests</font>, <font color='green'>types</font>, and <font color='green'>datetime</font>. Four of them (<font color='green'>os</font>, <font color='green'>sys</font>, <font color='green'>types</font>, and <font color='green'>datetime</font>) ship with the standard Python installation; <font color='green'>requests</font> is a third-party library that comes preinstalled in many Notebook environments, including Colab. The imports <font color='green'>os</font> and <font color='green'>sys</font> facilitate interaction with the operating system. The <font color='green'>requests</font> module enables sending data to and retrieving data from external URLs; in this notebook, the module is specifically used to access files from Dropbox. The <font color='green'>types</font> module lets us create a module object from the Python code retrieved from Dropbox. Finally, <font color='green'>datetime</font> allows for the creation and manipulation of date objects.</span>

<span style="font-family: 'Garamond', serif;
    font-size: 16px;
    text-indent: 0.25in;
    line-height: 1.5;">
The code below imports these modules; from <font color='green'>types</font> and <font color='green'>datetime</font> it imports only <font color='green'>ModuleType</font> and <font color='green'>date</font>, respectively. <font color='green'>ModuleType</font> will allow us to *instantiate* new modules, and we'll use <font color='green'>date</font> for our date calculations later.
</span>

<span style="margin-left: 1in;">

```
import os
import sys
import requests
from types import ModuleType
from datetime import date
```
</span>
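
As a rough illustration of what these two imports make possible, the sketch below builds a module from a string of source code and subtracts two dates. The source string, the module name `helpers`, and the function `pv_factor` are hypothetical stand-ins for code that would be retrieved from a URL; they are not part of this Notebook.

```python
from types import ModuleType
from datetime import date

# Hypothetical source text standing in for code downloaded from a URL.
source_code = "def pv_factor(rate):\n    return 1 / (1 + rate)\n"

# Instantiate an empty module, then run the source inside its namespace.
helpers = ModuleType("helpers")
exec(source_code, helpers.__dict__)
print(helpers.pv_factor(0.25))   # 0.8

# Subtracting two date objects gives a timedelta; .days is the day count.
days_elapsed = (date(2025, 12, 31) - date(2025, 1, 1)).days
print(days_elapsed)              # 364
```

Running source text with `exec` is only appropriate for code from a source you trust.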

<br>

<span style="font-family: 'Garamond', serif;
    font-size: 16px;
    text-indent: 0.25in;
    line-height: 1.5;">

NumPy and Pandas, though not standard Python libraries, are often preinstalled in Jupyter Notebooks as well as included in Notebook environments like Google Colaboratory (Colab). To ensure library availability, a <font color='green'>try</font> and <font color='green'>except</font> block is used to import NumPy and Pandas in the code below. The <font color='green'>try</font> and <font color='green'>except</font> block is used to resolve instances where the import fails: If the attempted import within the <font color='green'>try</font> segment fails, the <font color='green'>except</font> portion will install the libraries using <font color='green'>pip</font>, a package installer for Python. The exclamation mark preceding <font color='green'>pip</font> indicates that <font color='green'>pip</font> is running through the machine's console rather than within the Notebook. Upon successful import, NumPy and Pandas are *aliased as* (or assigned the names of) <font color='green'>np</font> and <font color='green'>pd</font>, respectively.$^{2}$
</span>

<span style="margin-left: 1in;">

```
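try:
    import numpy as np
except ImportError:
    # NumPy isn't installed: install it, then import it
    !pip install numpy
    import numpy as np
try:
    import pandas as pd
except ImportError:
    # Pandas isn't installed: install it, then import it
    !pip install pandas
    import pandas as pd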
```

</span>
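
If you only want to know whether a library is available, without importing or installing it, one possible approach is the standard library's `importlib.util.find_spec()`. This is a minimal sketch rather than the method this chapter relies on, and the helper name `is_installed` is our own.

```python
import importlib.util

def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None

# Report the status of the two libraries used in this chapter.
for name in ("numpy", "pandas"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```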

<br>

<span style="font-family: 'Garamond', serif;
    font-size: 16px;
    text-indent: 0.25in;
    line-height: 1.5;">
The following cell shows our completed code cell for imports.
</span>



<hr>

<span style="font-family: 'Garamond', serif;
          font-size: 10.5px;
          line-height: 1.3;">

$^{2}$ For more information on <font color='green'>try</font> and <font color='green'>except</font> statements, see "<a href='https://patrickjhess.github.io/Introduction-To-Python-For-Financial-Python/Control_Statements.html#the-try-and-except'>Control Statements</a>."
</span>

In [None]:
import os
import sys
import requests
from types import ModuleType
from datetime import date
try:
    import numpy as np
except ImportError:
    !pip install numpy
    import numpy as np
try:
    import pandas as pd
except ImportError:
    !pip install pandas
    import pandas as pd

## <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">NumPy Arrays</span>


<span style="font-family: 'Garamond', serif;
    font-size: 14px;
    text-indent: 0.25in;
    line-height: 1.5;">

The NumPy library offers efficient numerical procedures that simplify complex calculations.$^{3}$ The fundamental object in this library is the NumPy array, a data structure that stores elements in order and lets each one be accessed by its position. An array can be formed from a single value or from an iterable data type like a list, which holds a series of values.$^{4}$ NumPy must be imported before use; recall that in this Notebook, it was imported and aliased as <font color='green'>np</font> in the first code cell. The <font color='green'>array()</font> function converts such values into NumPy arrays.$^{5}$ The following code creates an array from the list <font color='green'>[3%, 5%, 7%]</font>, written as decimals (lists are written with square brackets), and assigns that array to <font color='green'>interest_rates</font>. This is then passed to <font color='green'>display()</font>, which simply displays the contents of <font color='green'>interest_rates</font>.
</span>



<hr>

<span style="font-family: 'Garamond', serif;
          font-size: 10.5px;
          text-indent: 0.13in;
          line-height: 1.3;">

$^{3}$ For more information on NumPy, see "<a href='https://patrickjhess.github.io/Introduction-To-Python-For-Financial-Python/An_Introduction_To_NumPy.html#numpy'>A Quick Introduction To NumPy</a>."

$^{4}$ For more information on list objects, see "<a href='https://patrickjhess.github.io/Introduction-To-Python-For-Financial-Python/A_First_Look_At_Lists.html#a-first-look-at-lists'>A First Look at Lists</a>."

$^{5}$ As with functions from other libraries, the function name must be prefixed with the alias <font color='green'>np</font>, as in <font color='green'>np.array</font> instead of <font color='green'>array</font>.
</span>

In [None]:
interest_rates=np.array([0.03,0.05,0.07])
display(interest_rates)

array([0.03, 0.05, 0.07])
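
To make the difference between an array built from a single value and one built from a list concrete, the short sketch below inspects a few array attributes. The variable names are ours, and `print()` is used in place of `display()` so the code also runs outside a Notebook.

```python
import numpy as np

single_rate = np.array(0.05)                  # from a single value
several_rates = np.array([0.03, 0.05, 0.07])  # from a list

# ndim is the number of dimensions; shape is the length along each one.
print(single_rate.ndim, single_rate.shape)      # 0 ()
print(several_rates.ndim, several_rates.shape)  # 1 (3,)

# dtype is the data type shared by every element of the array.
print(several_rates.dtype)                      # float64
```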

## <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">NumPy Calculations</span>


<span style="font-family: 'Garamond', serif;
    font-size: 14px;
    text-indent: 0.25in;
    line-height: 1.5;">
    
Now that we've created an array and assigned it to <font color='green'>interest_rates</font>, we can perform calculations on that array. NumPy applies a calculation to each array element individually. For instance, the <font color='green'>exp()</font> function used below raises Euler's number, $e$, to the power of each element of <font color='green'>interest_rates</font>. This approach is faster and more convenient than looping through the rates one at a time.
</span>

In [None]:
display(np.exp(interest_rates))

array([1.03045453, 1.0512711 , 1.07250818])
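
For comparison, the sketch below computes the same values twice: once with an explicit loop over the rates using the standard library's `math.exp()`, and once with a single call to `np.exp()`. The names `looped` and `vectorized` are illustrative only.

```python
import math
import numpy as np

interest_rates = np.array([0.03, 0.05, 0.07])

# Element-by-element loop using the standard library.
looped = [math.exp(rate) for rate in interest_rates]

# One vectorized call over the whole array.
vectorized = np.exp(interest_rates)

# Both approaches agree to within floating-point tolerance.
print(np.allclose(looped, vectorized))  # True
```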

## <font color='green'>***Application: Create a NumPy Array***</font>


<div style="background-color:LightGray;
    border-left: 12px solid green;
    font-family: 'Garamond', serif;
    font-size: 17px;
    line-height: 1.5;
    padding: 15px">
<br>

Create a NumPy array of the present value factors for the rates of <font color='green'>interest_rates</font>. For hints, see [Chapter One Hints: Create a NumPy Array](https://patrickjhess.github.io/Hints-Results/Chapter_One_Hints.html#create-a-numpy-array), and check the [expected results here](https://patrickjhess.github.io/Hints-Results/Chapter_One_Results.html#create-a-numpy-array).

<br>
</div>


## <span style="text-align:center;font-family:Franklin Gothic Book', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">Two-Dimensional NumPy arrays</span>

<span style="font-family: 'Garamond', serif;
    font-size: 14px;
    text-indent: 0.25in;
    line-height: 1.5;">
    
The <font color='green'>pv_rates</font> array in the code below is a *two-dimensional* NumPy array (or matrix). It's created by passing a list of two one-dimensional arrays, <font color='green'>interest_rates</font> and <font color='green'>pv_factors</font>, to the <font color='green'>array()</font> function. To access a row of this two-dimensional array, we'll use the corresponding row *index* value. (The index value is simply an element's sequential position in the array. Remember that in Python, sequences begin at position 0, not 1.) Finally, both the interest rates and present-value factors are displayed below using *f-strings*. These are strings that begin with an "f" and can include the values of variables and expressions. Each variable or expression must be enclosed in curly brackets (i.e., { }), and an index value follows the variable name in square brackets ([ ]).

</span>


In [None]:
#A calculation is made for each element of a NumPy array
pv_factors=1/np.exp(interest_rates)
pv_rates=np.array([interest_rates,pv_factors])
display(pv_rates)
#Create an f-string for each row.
#Each indexed row must be enclosed in curly brackets
display(f'Interest Rates {pv_rates[0]}')
display(f'Present Value Factors {pv_rates[1]}')

array([[0.03      , 0.05      , 0.07      ],
       [0.97044553, 0.95122942, 0.93239382]])

'Interest Rates [0.03 0.05 0.07]'

'Present Value Factors [0.97044553 0.95122942 0.93239382]'
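
Beyond whole rows, a two-dimensional array can also be indexed by row and column together. The sketch below rebuilds `pv_rates` so that it runs on its own and shows a few indexing forms, along with an f-string that applies format specifiers; the specifiers `.0%` and `.4f` are our choice for illustration.

```python
import numpy as np

interest_rates = np.array([0.03, 0.05, 0.07])
pv_factors = 1 / np.exp(interest_rates)
pv_rates = np.array([interest_rates, pv_factors])

print(pv_rates.shape)   # (2, 3): two rows, three columns
print(pv_rates[0, 2])   # row 0, column 2 -> 0.07
print(pv_rates[:, 0])   # every row of column 0

# Expressions inside curly brackets may carry a format specifier.
print(f"PV factor at {pv_rates[0, 1]:.0%}: {pv_rates[1, 1]:.4f}")
```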

## <font color='green'>***Application: Manipulate a Two-Dimensional Array***</font>

<div style="background-color:LightGray;
    border-left: 12px solid green;
    font-family: 'Garamond', serif;
    font-size: 17px;
    line-height: 1.5;
    padding: 15px">
<br>

Create a two-dimensional array by raising Euler's number to the power of each rate in the interest-rate row and multiplying the results by the present value factors. For hints, see [Chapter One Hints: Manipulate a Two-Dimensional Array](https://patrickjhess.github.io/Hints-Results/Chapter_One_Hints.html#manipulate-two-dimensional-array), and check the [expected results here](https://patrickjhess.github.io/Hints-Results/Chapter_One_Results.html#manipulate-two-dimensional-array).

</div>

## <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">Creating a Pandas DataFrame from NumPy Arrays</span>

### <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">A simple example</span>


<span style="font-family: 'Garamond', serif;
    font-size: 14px;
    text-indent: 0.25in;
    line-height: 1.5;">
    
The Pandas library is a powerful resource for data analysis. Recall that Pandas, like NumPy, was imported in our earlier code cell. In this section, we'll use Pandas to work with a structure called a *DataFrame*. A Pandas DataFrame organizes data into rows and columns, much like an Excel spreadsheet or a two-dimensional NumPy array. It can be created from a two-dimensional array, such as our newly created <font color='green'>pv_rates</font>. The simplest DataFrames automatically assign row and column index values; a more functional approach allows for the explicit naming of both. In the simple example below, the <font color='green'>DataFrame()</font> constructor of Pandas receives a tuple of three arrays and generates a row for each array.

</span>


In [None]:
# Dataframe created with DataFrame() method
simple_example=pd.DataFrame((interest_rates,
              np.exp(interest_rates),
              1/np.exp(interest_rates)))
display(simple_example)

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.03 | 0.05 | 0.07 |
| 1 | 1.030455 | 1.051271 | 1.072508 |
| 2 | 0.970446 | 0.951229 | 0.932394 |
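
The text above mentions that rows and columns can also be named explicitly. One way to do so, sketched here under our own choice of labels, is to pass the `index` and `columns` arguments to `pd.DataFrame()`. The column labels `Low`, `Mid`, and `High` are hypothetical.

```python
import numpy as np
import pandas as pd

interest_rates = np.array([0.03, 0.05, 0.07])

labeled_example = pd.DataFrame(
    [interest_rates, np.exp(interest_rates), 1 / np.exp(interest_rates)],
    index=["Rates", "Future Value", "Present Value"],  # row labels
    columns=["Low", "Mid", "High"],                    # hypothetical column labels
)
print(labeled_example)

# With labels in place, .loc selects a value by row and column name.
print(labeled_example.loc["Rates", "Mid"])  # 0.05
```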


## <font color='green'>***Application: Create a DataFrame from a Two-Dimensional Array***</font>

<div style="background-color:LightGray;
    border-left: 12px solid green;
    font-family: 'Garamond', serif;
    font-size: 17px;
    line-height: 1.5;
    padding: 15px">

<br>
    
Create a DataFrame with the two-dimensional array <font color='green'>***pv_rates***</font>.

For hints, see [Chapter One Hints: Create A DataFrame From A Two-Dimensional Array](https://patrickjhess.github.io/Hints-Results/Chapter_One_Hints.html#create-a-dataframe-from-a-two-dimensional-array).

Check [expected results here](https://patrickjhess.github.io/Hints-Results/Chapter_One_Results.html#create-a-dataframe-from-a-two-dimensional-array)
</div>

## <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">Making rows into columns and labeling the columns</span>



<span style="font-family: 'Garamond', serif;
    font-size: 14px;
    text-indent: 0.25in;
    line-height: 1.5;">

In the code below, the rows of our earlier DataFrame, <font color='green'>simple_example</font>, are converted to columns with the <font color='green'>transpose()</font> method.$^{6}$ The transposed result is assigned to <font color='green'>rows_to_columns</font>, and its columns are then labeled with the elements of a list (<font color='green'>['Rates', 'Future Value', 'Present Value']</font>). Transposing the DataFrame makes <font color='green'>'Rates'</font>, <font color='green'>'Future Value'</font>, and <font color='green'>'Present Value'</font> the variables and makes each row an observation of those variables.

</span>

<hr>
<span style="font-family: 'Garamond', serif;
          font-size: 10.5px;
          text-indent: 0.13in;
          line-height: 1.3;">

6.&nbsp;The transpose is also available through the shorthand attribute <font color='green'>T</font>, as in <font color='green'>simple_example.T</font>. See "<a href='https://patrickjhess.github.io/Introduction-To-Python-For-Financial-Python/An_Introduction_To_Pandas.html#the-transpose'>A Quick Introduction To Pandas</a>."

</span>



In [None]:
#transpose the simple_example dataframe and call it rows_to_columns
rows_to_columns=simple_example.transpose()
#(the transpose method can also be written as simple_example.T)
rows_to_columns.columns=['Rates','Future Value','Present Value']
display(rows_to_columns)

|   | Rates | Future Value | Present Value |
|---|---|---|---|
| 0 | 0.03 | 1.030455 | 0.970446 |
| 1 | 0.05 | 1.051271 | 0.951229 |
| 2 | 0.07 | 1.072508 | 0.932394 |
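
The sketch below checks that the `T` attribute and `transpose()` give the same result, and shows an alternative we are adding for comparison: building the labeled layout directly from a dictionary of columns. The dictionary approach is not the one this chapter uses; it simply reaches the same table in one step.

```python
import numpy as np
import pandas as pd

interest_rates = np.array([0.03, 0.05, 0.07])
simple_example = pd.DataFrame(
    (interest_rates, np.exp(interest_rates), 1 / np.exp(interest_rates))
)

# .T is an attribute that returns the same result as transpose().
same_result = simple_example.T.equals(simple_example.transpose())
print(same_result)  # True

# Alternative: a dict of columns yields the labeled layout in one step.
from_dict = pd.DataFrame({
    "Rates": interest_rates,
    "Future Value": np.exp(interest_rates),
    "Present Value": 1 / np.exp(interest_rates),
})
print(from_dict.shape)  # (3, 3)
```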


## <span style="text-align:center;font-family:Franklin Gothic Medium', sans-serif;margin-top: 1.0em;margin-bottom: 0.5em;ont-style: italic;">The set_index() method</span>


Once the DataFrame has been transposed, the next step is to make the <font color='green'>Rates</font> column the index. Why? Because the values of <font color='green'>Rates</font> determine the values of the columns <font color='green'>Future Value</font> and <font color='green'>Present Value</font>. This change isn't necessary, but it draws our eye to a natural connection. Instead of an index that points only to a row number, the new index points to the rate that results in the future and present values.

The method <font color='green'>set_index()</font> takes a column name and makes that column the index, removing it from the DataFrame's regular columns. Pandas has a built-in *fail-safe switch*. By default, this method (and others we'll encounter later) returns a new DataFrame and leaves the original unchanged. This might seem a bit strange, but <font color='SeaGreen'>this characteristic prevents mistakes that are difficult to fix.</font> To keep the change, you can do either of the following:$^{7}$

<span style="font-family: 'Garamond', serif;
    font-size: 14px;
    text-indent: 0.25in;
    line-height: 1.5;">

1.&nbsp;Assign the DataFrame returned by <font color='green'>set_index()</font> to a new variable:
</span>

<span style="margin-left: 1in;">

```
name_index_set_index=rows_to_columns.set_index('Rates')
display(name_index_set_index)
display(rows_to_columns)
```
</span>
<span style="font-family: 'Garamond', serif;
    font-size: 14px;
    text-indent: 0.25in;
    line-height: 1.5;">

2.&nbsp;Change the <font color='green'>inplace</font> argument of <font color='green'>set_index()</font> from its default value of False to True:
</span>

<span style="margin-left: 1in;">

```
rows_to_columns.set_index('Rates',inplace=True)
display(rows_to_columns)
```
</span>

<hr>
<span style="font-family: 'Garamond', serif;
          font-size: 10.5px;
          text-indent: 0.13in;
          line-height: 1.3;">

7.&nbsp;See "<a href='https://patrickjhess.github.io/Introduction-To-Python-For-Financial-Python/An_Introduction_To_Pandas.html#making-a-column-the-index'>A Quick Introduction To Pandas</a>."

</span>


In [None]:
name_index_set_index=rows_to_columns.set_index('Rates')
display(name_index_set_index)
display(rows_to_columns)
rows_to_columns.set_index('Rates',inplace=True)
display(rows_to_columns)

| Rates | Future Value | Present Value |
|---|---|---|
| 0.03 | 1.030455 | 0.970446 |
| 0.05 | 1.051271 | 0.951229 |
| 0.07 | 1.072508 | 0.932394 |


|   | Rates | Future Value | Present Value |
|---|---|---|---|
| 0 | 0.03 | 1.030455 | 0.970446 |
| 1 | 0.05 | 1.051271 | 0.951229 |
| 2 | 0.07 | 1.072508 | 0.932394 |


| Rates | Future Value | Present Value |
|---|---|---|
| 0.03 | 1.030455 | 0.970446 |
| 0.05 | 1.051271 | 0.951229 |
| 0.07 | 1.072508 | 0.932394 |
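
As a self-contained sketch of the behavior described above, the code below confirms that `set_index()` leaves the original DataFrame untouched by default, looks up a row by its rate with `.loc`, and reverses the change with `reset_index()`. The names `indexed` and `restored` are ours, and the DataFrame is rebuilt from a dictionary so the example runs independently of the earlier cells.

```python
import numpy as np
import pandas as pd

interest_rates = np.array([0.03, 0.05, 0.07])
rows_to_columns = pd.DataFrame({
    "Rates": interest_rates,
    "Future Value": np.exp(interest_rates),
    "Present Value": 1 / np.exp(interest_rates),
})

# set_index() returns a new DataFrame; the original keeps its Rates column.
indexed = rows_to_columns.set_index("Rates")
print("Rates" in rows_to_columns.columns)  # True
print("Rates" in indexed.columns)          # False

# With rates as the index, a row can be looked up by its rate.
print(indexed.loc[0.05, "Future Value"])

# reset_index() moves the index back into an ordinary column.
restored = indexed.reset_index()
print(list(restored.columns))
```

Looking up a floating-point index value with `.loc` requires an exact match, which works here because the same literal `0.05` was used to build the array.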
