# NumPy Operations
NumPy is a Python package that provides much of the underlying functionality of pandas. In fact, we encounter NumPy every time we see a `NaN` value, and pandas uses NumPy under the hood to optimise many of its internal computations.

Before we start, let's load `pandas`, `numpy` and our dataset. Notice that NumPy also has a preferred short form, `np`.

In [None]:
import numpy as np
import pandas as pd


In [None]:
df = pd.read_csv('TSLA_clean.csv')
df.Date = pd.to_datetime(df.Date)
df.set_index('Date', inplace=True)
## Data Frame Clean Up

## Log Returns

Logarithmic returns are often used in finance due to their useful statistical properties. They are **additive over time**, making them ideal for analysing historical returns across multiple periods.

Consider the following example:

* You invest £100
* On the first day, the simple return is +10% → your investment grows to £110
* On the second day, the simple return is -10% → your investment drops to £99

If you simply add the simple returns, you might assume the net change is 0%, since +10% and -10% appear to cancel out. In reality, you've lost £1, which is a **-1%** return overall.

Logarithmic returns correctly account for compounding. Here's how:

* Day 1: $\ln(110 / 100) ≈ 0.09531$
* Day 2: $\ln(99 / 110) ≈ -0.10536$
* Total log return: $0.09531 + (-0.10536) = -0.01005$

To convert back to a simple return:
$e^{-0.01005} - 1 ≈ -0.01$ → **-1%**, matching the actual loss.
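
The arithmetic above can be checked with a few lines of NumPy. This is a minimal sketch using the £100, £110 and £99 values from the example; the variable names are illustrative only.

```python
import numpy as np

start, day1, day2 = 100.0, 110.0, 99.0

# Simple returns for each day
simple_1 = day1 / start - 1   # +10%
simple_2 = day2 / day1 - 1    # -10%

# Log returns for each day
log_1 = np.log(day1 / start)
log_2 = np.log(day2 / day1)

# Log returns add up to the log of the overall price ratio
total_log = log_1 + log_2

# Exponentiate and subtract 1 to get back to a simple return
total_simple = np.exp(total_log) - 1

print(f"Sum of simple returns:   {simple_1 + simple_2:.4f}")
print(f"Sum of log returns:      {total_log:.5f}")
print(f"Recovered simple return: {total_simple:.4f}")
```

Adding the simple returns hides the loss, whereas the summed log returns convert back to the actual change from £100 to £99.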

The formula for calculating log returns is:

$\ln\left(\frac{{\text{price}_{\text{current}}}}{{\text{price}_{\text{previous}}}}\right)$

To compute daily log returns in code, use `.shift()` to get the previous day's price, then apply NumPy’s `log` function.


In [None]:
df['PrevClose'] = df.Close.shift()
df['LogReturns'] = np.log(df.Close / df.PrevClose)
## Created a new column for Previous Close in order to facilitate easier calculations
## 'LogReturns' is the application of the formula

df['SimpleReturns'] = df.Close.pct_change()
np.log(1+df.SimpleReturns)
## Calculating LogReturns from SimpleReturns
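
To see what `.shift()` does in isolation, here is a small sketch on a made-up three-day price series (the dates and prices are hypothetical, not taken from the dataset). It also checks that the two routes to log returns agree.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    name="Close",
)

# shift() moves every value down one row, so the first row has no previous price (NaN)
prev_close = close.shift()
log_returns = np.log(close / prev_close)

# The same values computed from simple returns
log_from_simple = np.log(1 + close.pct_change())

print(pd.DataFrame({"Close": close, "PrevClose": prev_close, "LogReturns": log_returns}))
print(np.allclose(log_returns.iloc[1:], log_from_simple.iloc[1:]))
```

The first row is `NaN` in both cases because there is no earlier price to compare against.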


### Exercise: Cumulatively Comparing

The sum of the log returns is the natural logarithm of one plus the cumulative simple return. To calculate the cumulative simple return from the log returns, sum the log returns over the period, exponentiate the sum (NumPy has an `exp` function for this) and subtract 1.

Calculate the cumulative return based on the simple returns, and then compare this to the cumulative simple return calculated from the log return.

In [None]:
## YOUR CODE GOES HERE
df['DailyReturn'] = df.Close.pct_change()
cumulative_return = (1 + df['DailyReturn']).prod() - 1
print(f'Cumulative Simple Returns is {cumulative_return}')
## Derived from the formula Rtotal = product of all (1 + R) - 1
## The result is a single number, so it is stored in a variable rather than a column
cumulative_log_return = np.exp(df.LogReturns.sum()) - 1
print(f'Cumulative Logarithmic Returns is {cumulative_log_return}')
## Derived from the formula Rtotal = e^(sum of log returns) - 1
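
One way to make the comparison explicit is `np.isclose`, which tests equality up to floating-point tolerance. The sketch below uses a short hypothetical price series rather than the dataset, so it can be run on its own.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([100.0, 110.0, 99.0, 104.0, 101.0])

simple_returns = close.pct_change()
log_returns = np.log(close / close.shift())

# Compound the simple returns; prod() skips the leading NaN by default
cum_from_simple = (1 + simple_returns).prod() - 1

# Sum the log returns, then exponentiate; sum() also skips NaN by default
cum_from_log = np.exp(log_returns.sum()) - 1

# Both should match last price / first price - 1
print(cum_from_simple, cum_from_log)
print(np.isclose(cum_from_simple, cum_from_log))
```

Any difference between the two numbers should be rounding error only, which is why a tolerance-based check is preferable to `==`.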


## Other useful functions

Another useful NumPy function is `np.where()`, often used for populating a column with a signal or indicator depending on whether a condition is met. Let's create a column to colour code our trading days. Each day gets a colour depending on whether the market closes higher (green) or lower (red) than it opened.

In [None]:
df['Colour'] = np.where(df.Close > df.Open, 'green', 'red')
## np.where takes three arguments here: (condition, value if true, value if false)
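
The behaviour is easier to see on a tiny table. This is a sketch with hypothetical prices; note that with a strict `>` comparison, a day that closes exactly at its opening price falls into the "else" branch.

```python
import numpy as np
import pandas as pd

# Hypothetical open/close prices, for illustration only
prices = pd.DataFrame({
    "Open":  [10.0, 12.0, 11.0],
    "Close": [12.0, 11.0, 11.0],
})

# Close above open -> 'green'; everything else, including an unchanged day -> 'red'
prices["Colour"] = np.where(prices.Close > prices.Open, "green", "red")
print(prices)
```

If unchanged days need their own label, the condition or the labels would have to be adjusted accordingly.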


## The VWAP

VWAP (Volume-Weighted Average Price) over a period can be an important metric for evaluating trading activity. It is often thought of as the true average price for a stock, and is calculated using the following formula:

$$ \text{VWAP} = \frac{\sum_{i=1}^{n} \text{typical price}_i \cdot \text{volume}_i}{\sum_{i=1}^{n} \text{volume}_i} $$

The *typical price* for each period is calculated as:

$$ \text{typical price}_i = \frac{\text{high}_i + \text{low}_i + \text{close}_i}{3} $$

We can calculate VWAP with the help of the `np.dot()` function. This performs a dot product, which will multiply corresponding elements in the price and volume columns and then find the sum. Let's calculate the VWAP for our stock over this period, which will give us an indication of the price at which the bulk of trading took place.


In [None]:
df['TP'] = (df.High + df.Low + df.Close) / 3
## The typical price must be defined first, since the VWAP formula depends on it
vwap = np.dot(df.TP, df.Volume) / np.sum(df.Volume)
## Derived from the formula VWAP = sum of (typical price * volume) / sum of volume
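
To confirm that the dot product matches the formula term by term, here is a minimal sketch on three hypothetical bars (the prices and volumes are invented for illustration). It compares `np.dot` with the written-out sum and with an unweighted mean.

```python
import numpy as np
import pandas as pd

# Hypothetical price and volume data, for illustration only
bars = pd.DataFrame({
    "High":   [11.0, 12.0, 13.0],
    "Low":    [9.0, 10.0, 11.0],
    "Close":  [10.0, 11.0, 12.0],
    "Volume": [100, 300, 600],
})

typical_price = (bars.High + bars.Low + bars.Close) / 3

# Dot product: multiply element-wise, then sum
vwap_dot = np.dot(typical_price, bars.Volume) / bars.Volume.sum()

# The same calculation written out explicitly
vwap_explicit = (typical_price * bars.Volume).sum() / bars.Volume.sum()

# Unweighted mean of the typical price, for comparison
plain_mean = typical_price.mean()

print(vwap_dot, vwap_explicit, plain_mean)
```

Because most of the volume in this made-up data trades on the highest-priced day, the VWAP sits above the unweighted mean.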

There is more to NumPy that we'll explore over the next few days, and even more that we won't get a chance to use.