# Applied Math With NumPy

## Introduction

NumPy is a fundamental package for scientific computing in Python. It's particularly powerful for handling large, multi-dimensional arrays and matrices, along with a vast collection of high-level mathematical functions. This makes it an invaluable tool for students and professionals in economics and business.

In this tutorial, we'll explore:
1. The hardware concepts behind NumPy's efficiency
2. Key features of NumPy
3. Practical examples in linear algebra, probability/statistics, and optimization

Let's begin by importing NumPy and setting up our environment:


In [2]:
import numpy as np
import time
import matplotlib.pyplot as plt

# Set random seed for reproducibility
np.random.seed(42)

## 1. Hardware Concepts Behind NumPy's Efficiency

NumPy's speed and efficiency stem from several key factors:

### 1.1 Vectorization

Vectorization lets you apply an operation to an entire array without writing an explicit Python loop. The loop runs inside NumPy's compiled code, which can take advantage of the CPU's SIMD (Single Instruction, Multiple Data) instructions to process several elements at once.

Let's compare a vectorized NumPy operation with a traditional Python loop:


In [3]:
def python_sum(arr):
    total = 0
    for i in range(len(arr)):
        total += arr[i]
    return total

def numpy_sum(arr):
    return np.sum(arr)

# Create a large array
arr = np.random.rand(1000000)

# Time Python sum (perf_counter is a high-resolution clock intended for benchmarking)
start = time.perf_counter()
python_result = python_sum(arr)
python_time = time.perf_counter() - start

# Time NumPy sum
start = time.perf_counter()
numpy_result = numpy_sum(arr)
numpy_time = time.perf_counter() - start

print(f"Python time: {python_time:.6f} seconds")
print(f"NumPy time: {numpy_time:.6f} seconds")
print(f"Speedup: {python_time / numpy_time:.2f}x")


Python time: 0.133897 seconds
NumPy time: 0.001046 seconds
Speedup: 128.02x


### 1.2 Memory Layout

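NumPy stores an array's elements in a single contiguous block of memory, each with the same fixed-size data type. Neighboring elements are therefore adjacent in memory, which makes good use of the CPU cache. This is particularly important for the large datasets common in economic and financial analyses.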
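
As a rough illustration of the layout described here, the sketch below inspects an array's `flags` and `strides` (the number of bytes to step to reach the next element along each axis). The array shapes are arbitrary choices for the example, not anything taken from the tutorial's data.

```python
import numpy as np

# A 3x4 float64 array: 8 bytes per element, rows stored one after another (C order)
a = np.zeros((3, 4), dtype=np.float64)
print(a.flags['C_CONTIGUOUS'], a.strides)

# Transposing returns a view on the same buffer with the strides swapped,
# so the transposed array is no longer row-major contiguous
t = a.T
print(t.flags['C_CONTIGUOUS'], t.strides)
print(np.shares_memory(a, t))

# np.ascontiguousarray copies the data into a fresh row-major block when needed
c = np.ascontiguousarray(t)
print(c.flags['C_CONTIGUOUS'], c.strides)
```

Operations that walk memory in stride order tend to be the cache-friendly ones, which is why the layout of a large array can matter for performance.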

### 1.3 Compiled C Code

NumPy's core is written in C and compiled to machine code. Array operations run in these compiled loops, which avoids the per-element overhead of the Python interpreter.

## 2. Key Features of NumPy

Now, let's explore some of NumPy's key features that make it powerful for economic and business applications.

### 2.1 N-dimensional Arrays

NumPy's fundamental object is the ndarray (N-dimensional array). Let's create and manipulate some arrays:


In [5]:
# Create a 1D array
arr1d = np.array([1, 2, 3, 4, 5])
print("1D array:", arr1d)

# Create a 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n2D array:\n", arr2d)

# Create a 3D array
arr3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\n3D array:\n", arr3d)

# Array attributes
print("\nShape of 2D array:", arr2d.shape)
print("Dimensions of 3D array:", arr3d.ndim)
print("Data type of 1D array:", arr1d.dtype)


1D array: [1 2 3 4 5]

2D array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

3D array:
 [[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]

Shape of 2D array: (3, 3)
Dimensions of 3D array: 3
Data type of 1D array: int64


### 2.2 Array Creation Functions

NumPy provides various functions to create arrays efficiently:


In [None]:
# Create an array of zeros
zeros_arr = np.zeros((3, 4))
print("Array of zeros:\n", zeros_arr)

# Create an array of ones
ones_arr = np.ones((2, 3, 2))
print("\nArray of ones:\n", ones_arr)

# Create an array with a range of values
range_arr = np.arange(0, 10, 2)
print("\nArray with range:", range_arr)

# Create an array with linearly spaced values
linspace_arr = np.linspace(0, 1, 5)
print("\nLinearly spaced array:", linspace_arr)


### 2.3 Array Indexing and Slicing

Efficient data access is crucial for analyzing economic data. NumPy provides powerful indexing and slicing capabilities:

In [None]:
# Create a sample 2D array
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print("Original array:\n", arr)

# Indexing
print("\nElement at (1, 2):", arr[1, 2])

# Slicing
print("\nFirst two rows, last two columns:\n", arr[:2, 2:])

# Boolean indexing
bool_idx = arr > 5
print("\nElements greater than 5:\n", arr[bool_idx])

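# Fancy indexing: the column-shaped row index broadcasts against col_indices,
# selecting the 2x2 block where rows 0 and 2 meet columns 1 and 3
row_indices = np.array([0, 2])
col_indices = np.array([1, 3])
print("\nSelected elements:\n", arr[row_indices[:, np.newaxis], col_indices])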
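
The fancy-indexing line above is easy to misread, so here is a small sketch contrasting two behaviors: passing two index arrays of the same shape pairs them element by element, whereas adding `np.newaxis` (or using the helper `np.ix_`) selects every row/column combination. The variable names `paired`, `grid`, and `grid_ix` are introduced only for this illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
rows = np.array([0, 2])
cols = np.array([1, 3])

# Paired indexing: picks elements (0, 1) and (2, 3), giving a 1D result
paired = arr[rows, cols]
print("Paired:", paired)

# Grid indexing: every combination of the chosen rows and columns, giving a 2x2 block
grid = arr[rows[:, np.newaxis], cols]
print("Grid:\n", grid)

# np.ix_ builds the same broadcastable index arrays for you
grid_ix = arr[np.ix_(rows, cols)]
print("Same result with np.ix_:", np.array_equal(grid, grid_ix))
```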

### 2.4 Broadcasting

Broadcasting lets NumPy combine arrays of different but compatible shapes by virtually stretching the smaller array to match the larger one, without copying data. This is particularly useful in economic modeling:


In [None]:
# Create arrays
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([10, 20, 30])

# Broadcasting
result = arr1 + arr2
print("Result of broadcasting:\n", result)
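
To make the shape rules more concrete, the following sketch applies broadcasting to two small, made-up datasets: removing each column's mean from a table of returns, and combining a column of base values with a row of shocks to build a scenario grid. All numbers and names (`asset_returns`, `base`, `shocks`) are hypothetical.

```python
import numpy as np

# Hypothetical data: 4 periods of returns for 3 assets
asset_returns = np.array([[0.01, 0.02, -0.01],
                          [0.03, 0.00, 0.02],
                          [-0.02, 0.01, 0.01],
                          [0.02, 0.01, 0.02]])

# (4, 3) minus (3,): the column means are stretched across all 4 rows
col_means = asset_returns.mean(axis=0)
demeaned = asset_returns - col_means
print("Demeaned shape:", demeaned.shape)

# (3, 1) times (4,): a column of base values against a row of shocks gives a (3, 4) grid
base = np.array([[100.0], [200.0], [300.0]])
shocks = np.array([-0.1, 0.0, 0.1, 0.2])
scenarios = base * (1 + shocks)
print("Scenario grid shape:", scenarios.shape)

# Shapes that cannot be aligned raise a ValueError instead of guessing
try:
    asset_returns + np.array([1.0, 2.0])
except ValueError as exc:
    print("Shape mismatch:", exc)
```

Shapes are compared from the last axis backwards, and two axes are compatible when they are equal or one of them has length 1.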

## 3. Practical Examples for Economics and Business

Now, let's apply these NumPy features to solve problems relevant to economics and business.

### 3.1 Linear Algebra: Portfolio Optimization


Portfolio optimization is a fundamental concept in finance, based on Modern Portfolio Theory developed by Harry Markowitz in 1952. The key idea is to maximize the portfolio's expected return for a given level of risk, or minimize risk for a given level of expected return.

Key concepts:
1. **Expected Return**: The anticipated return of an investment over a period of time.
2. **Volatility**: A measure of the risk, calculated as the standard deviation of returns.
3. **Covariance**: A measure of how two assets move together.
4. **Efficient Frontier**: The set of optimal portfolios that offer the highest expected return for a defined level of risk.

The Markowitz model uses these inputs:
- Expected returns for each asset
- Covariance matrix of returns
- Portfolio weights (percentage invested in each asset)

The model calculates:
- Portfolio expected return: $E(R_p) = \sum_{i=1}^n w_i E(R_i)$
- Portfolio variance: $\sigma_p^2 = \sum_{i=1}^n \sum_{j=1}^n w_i w_j \sigma_{ij}$

Where:
- $w_i, w_j$ are the weights of assets i and j
- $E(R_i)$ is the expected return of asset i
- $\sigma_{ij}$ is the covariance between assets i and j
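
Both formulas have a compact matrix form, $E(R_p) = w^T \mu$ and $\sigma_p^2 = w^T \Sigma w$, where $\mu$ is the vector of expected returns and $\Sigma$ the covariance matrix. The sketch below checks, on made-up numbers for three assets, that the explicit double sum and the matrix product agree. The symbols `mu`, `cov`, and `w` and their values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers only)
mu = np.array([0.08, 0.12, 0.10])          # expected returns E(R_i)
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])       # covariances sigma_ij
w = np.array([0.5, 0.3, 0.2])              # portfolio weights, summing to 1

# Expected return: sum_i w_i E(R_i)
exp_return = w @ mu

# Variance as the explicit double sum over i and j
n = len(w)
var_double_sum = sum(w[i] * w[j] * cov[i, j] for i in range(n) for j in range(n))

# The same variance as the quadratic form w^T Sigma w
var_matrix = w @ cov @ w

print(f"Expected return: {exp_return:.4f}")
print(f"Variance (double sum): {var_double_sum:.4f}")
print(f"Variance (matrix form): {var_matrix:.4f}")
print(f"Volatility: {np.sqrt(var_matrix):.4f}")
```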

The efficient frontier is traced out by varying the weights and keeping, for each level of risk, the portfolio with the highest expected return.

In the following example, we'll use NumPy to implement a simple version of this model:

In [None]:
# Generate random returns for 5 assets over 1000 days
returns = np.random.normal(0.001, 0.02, (1000, 5))

# Calculate mean returns and covariance matrix
mean_returns = np.mean(returns, axis=0)
cov_matrix = np.cov(returns.T)

# Generate random portfolio weights
weights = np.random.random(5)
weights /= np.sum(weights)

# Calculate portfolio return and volatility
portfolio_return = np.sum(mean_returns * weights)
portfolio_volatility = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))

print("Portfolio Weights:", weights)
print("Expected Return:", portfolio_return)
print("Expected Volatility:", portfolio_volatility)

# Simulate random portfolios; the efficient frontier is the upper-left edge of the resulting cloud
num_portfolios = 10000
results = np.zeros((3, num_portfolios))

for i in range(num_portfolios):
    weights = np.random.random(5)
    weights /= np.sum(weights)
    portfolio_return = np.sum(mean_returns * weights)
    portfolio_volatility = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
    results[0,i] = portfolio_return
    results[1,i] = portfolio_volatility
    # Sharpe ratio, assuming a risk-free rate of zero
    results[2,i] = portfolio_return / portfolio_volatility

plt.figure(figsize=(10, 6))
plt.scatter(results[1,:], results[0,:], c=results[2,:], cmap='viridis')
plt.colorbar(label='Sharpe Ratio')
plt.xlabel('Volatility')
plt.ylabel('Return')
plt.title('Efficient Frontier')
plt.show()
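
In keeping with the vectorization theme from Section 1, the portfolio loop can also be written without a Python `for` loop. The sketch below is one possible formulation, not the only one: it draws all weight vectors at once and uses `np.einsum` to evaluate $w^T \Sigma w$ for every portfolio. It regenerates its own random data with `np.random.default_rng`, so the numbers will differ from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same setup as the example: 1000 days of simulated returns for 5 assets
returns = rng.normal(0.001, 0.02, size=(1000, 5))
mean_returns = returns.mean(axis=0)
cov_matrix = np.cov(returns, rowvar=False)

# Draw all weight vectors at once: one row per portfolio, each row summing to 1
num_portfolios = 10_000
W = rng.random((num_portfolios, 5))
W /= W.sum(axis=1, keepdims=True)

# Portfolio returns and volatilities for every row of W in one shot
port_returns = W @ mean_returns
port_vols = np.sqrt(np.einsum('ij,jk,ik->i', W, cov_matrix, W))

# Return per unit of volatility (a Sharpe ratio with a zero risk-free rate)
ratios = port_returns / port_vols

best = np.argmax(ratios)
print("Best simulated weights:", W[best].round(3))
print(f"Return: {port_returns[best]:.5f}, volatility: {port_vols[best]:.5f}")
```

The `einsum` subscripts `'ij,jk,ik->i'` sum over the asset indices `j` and `k` separately for each portfolio `i`, which is the double sum from the variance formula.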


### The Sharpe Ratio: Measuring Risk-Adjusted Return

The Sharpe ratio, developed by Nobel laureate William F. Sharpe, is a widely used measure in finance to evaluate the performance of an investment or a portfolio. It provides a way to compare different investments by considering both their returns and their risks.

### Definition

The Sharpe ratio is calculated as:

$S = \frac{R_p - R_f}{\sigma_p}$

Where:
- $S$ is the Sharpe ratio
- $R_p$ is the return of the portfolio
- $R_f$ is the risk-free rate of return
- $\sigma_p$ is the standard deviation of the portfolio's excess return (i.e., the portfolio's volatility)

### Interpretation

- A higher Sharpe ratio indicates better risk-adjusted performance.
- The ratio quantifies the additional return an investor can expect to receive for taking on extra risk.
- It allows for comparison between portfolios with different levels of risk.

### Key Points

1. **Risk-Adjusted Return**: The Sharpe ratio adjusts return by the level of risk taken to achieve that return.

2. **Benchmark Comparison**: It uses the risk-free rate as a benchmark, showing excess return per unit of risk.

3. **Volatility as Risk Measure**: The ratio uses standard deviation as a proxy for risk, which assumes returns are normally distributed.

4. **Time Period Sensitivity**: The Sharpe ratio can vary significantly depending on the time period used for calculation.

5. **Limitations**: It doesn't distinguish between upside and downside volatility and may not be suitable for non-normally distributed returns.

### Applications

- **Portfolio Evaluation**: Comparing the performance of different portfolios or investment strategies.
- **Manager Assessment**: Evaluating the skill of investment managers.
- **Risk Management**: Helping in the construction of portfolios with optimal risk-return tradeoffs.

### Example

If a portfolio has an average annual return of 15%, the risk-free rate is 3%, and the portfolio's annual standard deviation is 10%, the Sharpe ratio would be:

$S = \frac{15\% - 3\%}{10\%} = 1.2$

This indicates that for each unit of risk (volatility) taken, the portfolio is generating 1.2 units of excess return over the risk-free rate.
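
The calculation above is short enough to express directly in code. The helper names `sharpe_ratio` and `sharpe_from_series` are made up for this sketch, and the yearly return series is invented data used only to show how the ratio might be estimated from observations.

```python
import numpy as np

def sharpe_ratio(portfolio_return, risk_free_rate, volatility):
    """Excess return over the risk-free rate per unit of volatility."""
    return (portfolio_return - risk_free_rate) / volatility

# The worked example: 15% return, 3% risk-free rate, 10% standard deviation
s = sharpe_ratio(0.15, 0.03, 0.10)
print(f"Sharpe ratio: {s:.2f}")

def sharpe_from_series(period_returns, risk_free_rate):
    """Estimate the ratio from a series of per-period returns (same period for both inputs)."""
    excess = np.asarray(period_returns) - risk_free_rate
    return excess.mean() / excess.std(ddof=1)

# Hypothetical yearly returns, for illustration only
yearly_returns = [0.12, 0.25, -0.04, 0.18, 0.09]
print(f"Estimated from series: {sharpe_from_series(yearly_returns, 0.03):.2f}")
```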

Understanding the Sharpe ratio is crucial for investors and financial analysts as it provides a standardized measure of risk-adjusted performance, allowing for meaningful comparisons across different investment opportunities.


### 3.2 Probability and Statistics: Monte Carlo Simulation

#### Monte Carlo Simulation: Embracing Uncertainty in Predictions

Monte Carlo simulation is a powerful technique used to model the probability of different outcomes in processes that involve significant uncertainty. It's named after the famous casino in Monaco, reflecting the method's reliance on repeated random sampling to obtain numerical results.

### The Concept of "Intervening Random Variables"

In many real-world scenarios, we encounter processes that are influenced by factors which are inherently uncertain or unpredictable. These factors are represented as random variables in our models, and they "intervene" in the process, making exact predictions challenging. Monte Carlo simulation helps us handle this uncertainty by running many scenarios with different random outcomes.

### A Simple Example: Weather Forecasting

Consider the task of predicting tomorrow's temperature. While we know today's temperature, several uncertain factors influence tomorrow's weather:

1. Normal daily temperature fluctuations
2. Wind effects
3. Cloud cover
4. Unexpected events (e.g., sudden weather system changes)

Each of these factors can be modeled as a random variable with a specific probability distribution based on historical data or expert knowledge.

### The Monte Carlo Process

1. **Model Definition**: Create a model that includes both known factors and random variables.
2. **Distribution Assignment**: Assign probability distributions to the random variables.
3. **Repeated Sampling**: Run the model many times (e.g., 10,000 iterations), each time randomly sampling from the defined distributions.
4. **Result Analysis**: Analyze the collection of results to understand the range and likelihood of possible outcomes.
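
The four steps can be sketched for the weather example as follows. Every distribution and parameter below (the 2-degree daily fluctuation, the wind and cloud effects, the 5% chance of a cold front) is an invented assumption chosen only to make the mechanics visible, not a meteorological model.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims = 10_000

# Step 1: model definition. Tomorrow = today + a sum of random effects
today_temp = 20.0  # degrees Celsius, hypothetical

# Step 2: distribution assignment (all parameters are illustrative assumptions)
daily_fluctuation = rng.normal(0.0, 2.0, n_sims)
wind_effect = rng.normal(-0.5, 1.0, n_sims)
cloud_cover = rng.uniform(-2.0, 0.0, n_sims)
# Rare event: a 5% chance of a cold front that lowers the temperature by 5 degrees
unexpected = np.where(rng.random(n_sims) < 0.05, -5.0, 0.0)

# Step 3: repeated sampling. Each array element is one simulated scenario
tomorrow = today_temp + daily_fluctuation + wind_effect + cloud_cover + unexpected

# Step 4: result analysis
mean_temp = tomorrow.mean()
low, high = np.percentile(tomorrow, [5, 95])
prob_below_15 = np.mean(tomorrow < 15.0)

print(f"Mean forecast: {mean_temp:.1f} C")
print(f"90% interval: {low:.1f} C to {high:.1f} C")
print(f"P(below 15 C): {prob_below_15:.1%}")
```

Because the samples are stored in arrays, all 10,000 scenarios are evaluated with a handful of vectorized operations rather than an explicit loop.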

### Applications Beyond Weather

This same approach is invaluable in finance, economics, and business for problems like:

- Stock price prediction
- Risk assessment in project management
- Supply chain optimization
- Portfolio performance evaluation

By embracing the uncertainty inherent in these problems, Monte Carlo simulation provides insights into the range of possible outcomes and their probabilities, enabling more informed decision-making in complex, uncertain environments.



In finance, Monte Carlo simulation is often used for option pricing, particularly where the Black-Scholes-Merton model does not apply (e.g., for more complex options or when the model's assumptions are violated).

Key concepts:
1. **Random Walk**: The idea that stock prices follow a random and unpredictable path.
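2. **Risk-Neutral Valuation**: The principle that the value of an option is its expected future payoff under the risk-neutral measure (where the underlying asset's expected growth rate is the risk-free rate), discounted at the risk-free rate.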
3. **Geometric Brownian Motion**: A continuous-time stochastic process often used to model stock prices.

The Monte Carlo method for option pricing involves these steps:
1. Simulate many random price paths for the underlying asset.
2. Calculate the option payoff for each path.
3. Take the average of these payoffs.
4. Discount this average back to today at the risk-free rate.

The formula for simulating stock prices using Geometric Brownian Motion is:

$S_T = S_0 \exp((r - \frac{1}{2} \sigma^2) T + \sigma \sqrt{T} z)$

Where:
- $S_T$ is the stock price at time T
- $S_0$ is the initial stock price
- $r$ is the risk-free rate
- $\sigma$ (sigma) is the volatility of the stock
- $T$ is the time to maturity
- $z$ is a random number drawn from a standard normal distribution

In the following example, we'll use NumPy to implement a Monte Carlo simulation for pricing a European call option:


In [None]:
def monte_carlo_call_option(S, K, T, r, sigma, num_simulations):
    """Estimate the price of a European call option by Monte Carlo simulation."""
    # Simulate terminal stock prices under risk-neutral Geometric Brownian Motion
    z = np.random.standard_normal(num_simulations)
    ST = S * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)

    # Calculate the call payoff max(ST - K, 0) for each simulated price
    payoffs = np.maximum(ST - K, 0)

    # Average the payoffs and discount to present value
    option_price = np.exp(-r * T) * np.mean(payoffs)

    return option_price

# Option parameters
S = 100  # Current stock price
K = 100  # Strike price
T = 1    # Time to maturity (in years)
r = 0.05 # Risk-free rate
sigma = 0.2 # Volatility

num_simulations = 1000000

option_price = monte_carlo_call_option(S, K, T, r, sigma, num_simulations)
print(f"Estimated Call Option Price: ${option_price:.2f}")
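
For a plain European call, the Black-Scholes-Merton model has a closed-form price, which makes it a convenient sanity check for the simulation. The sketch below is a self-contained comparison under the same parameters. The function names `black_scholes_call` and `monte_carlo_call_with_error` are introduced here for illustration, and the standard-error estimate is an addition that the tutorial's function does not compute.

```python
import math
import numpy as np

def black_scholes_call(S, K, T, r, sigma):
    """Closed-form Black-Scholes-Merton price of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    # Standard normal CDF written with the error function
    norm_cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def monte_carlo_call_with_error(S, K, T, r, sigma, num_simulations, rng):
    """Monte Carlo price plus the standard error of the estimate."""
    z = rng.standard_normal(num_simulations)
    ST = S * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    discounted = np.exp(-r * T) * np.maximum(ST - K, 0.0)
    price = discounted.mean()
    std_error = discounted.std(ddof=1) / np.sqrt(num_simulations)
    return price, std_error

rng = np.random.default_rng(0)
mc_price, mc_se = monte_carlo_call_with_error(100, 100, 1, 0.05, 0.2, 200_000, rng)
bs_price = black_scholes_call(100, 100, 1, 0.05, 0.2)

print(f"Monte Carlo price: {mc_price:.4f} (standard error {mc_se:.4f})")
print(f"Black-Scholes price: {bs_price:.4f}")
```

The standard error shrinks in proportion to $1/\sqrt{N}$, so halving the error requires roughly four times as many simulations.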

### 3.3 Optimization: Linear Programming

Linear Programming (LP) is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships.

LP is widely used in economics, business, and operations research for problems such as:
- Resource allocation
- Production planning
- Transportation and logistics optimization
- Portfolio optimization

Key components of a linear programming problem:
1. **Objective Function**: The quantity to be maximized or minimized.
2. **Decision Variables**: The unknowns whose values we're trying to determine.
3. **Constraints**: Limitations on the possible values of the decision variables.

A standard form of a linear programming problem is:

#### Maximize (or Minimize): 
$$c^T x$$

#### Subject to: 
$$Ax \leq b$$
$$x \geq 0 $$

Where:
- $x$ is the vector of decision variables
- $c^T x$ is the objective function
- $Ax \leq b$ represents the constraints
- $A$ is the coefficient matrix of the constraints
- $b$ is the right-hand side vector of the constraints

Common methods to solve LP problems include:
- Simplex Method
- Interior Point Methods

In the following example, we'll use SciPy's `linprog` function with the HiGHS solvers, which include both simplex and interior-point implementations, to solve a simple production optimization problem:


In [None]:
from scipy.optimize import linprog

# Objective function coefficients: linprog minimizes, so negate the profits to maximize
c = [-20, -12]  # Profit per unit for products A and B

# Inequality constraint matrix (row i encodes A[i] @ x <= b[i])
A = [[1, 2],    # Raw material constraint
     [1, 1],    # Labor hours constraint
     [-1, 0],   # Minimum production for A, written as -x_A <= -10
     [0, -1]]   # Minimum production for B, written as -x_B <= -5

# Inequality constraint vector
b = [100,  # Maximum raw material
     80,   # Maximum labor hours
     -10,  # Minimum production for A (x_A >= 10)
     -5]   # Minimum production for B (x_B >= 5)

# Solve the linear programming problem
result = linprog(c, A_ub=A, b_ub=b, method="highs")

print("Optimal production:")
print(f"Product A: {result.x[0]:.2f} units")
print(f"Product B: {result.x[1]:.2f} units")
print(f"Maximum profit: ${-result.fun:.2f}")
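
When an LP has an optimal solution and its feasible region has corner points, the optimum is attained at one of them. For a two-variable problem this allows an independent check of the solver's answer: intersect every pair of constraint lines, discard infeasible points, and keep the best. The brute-force sketch below uses only NumPy and is meant as a cross-check for tiny problems, not as a practical solution method.

```python
import itertools
import numpy as np

# Same problem as above, with profits kept positive because we maximize directly
profit = np.array([20.0, 12.0])
A = np.array([[1, 2], [1, 1], [-1, 0], [0, -1]], dtype=float)
b = np.array([100, 80, -10, -5], dtype=float)

best_value, best_point = -np.inf, None
for i, j in itertools.combinations(range(len(b)), 2):
    A_pair = A[[i, j]]
    if abs(np.linalg.det(A_pair)) < 1e-12:
        continue  # parallel constraint lines never intersect
    # Candidate vertex: the point where constraints i and j both hold with equality
    point = np.linalg.solve(A_pair, b[[i, j]])
    # Keep it only if it satisfies every constraint (small tolerance for rounding)
    if np.all(A @ point <= b + 1e-9):
        value = profit @ point
        if value > best_value:
            best_value, best_point = value, point

print(f"Best vertex: A = {best_point[0]:.2f}, B = {best_point[1]:.2f}")
print(f"Profit at best vertex: ${best_value:.2f}")
```

The number of candidate vertices grows combinatorially with the number of constraints and variables, which is why real solvers rely on the simplex or interior-point methods instead.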

### NOTE: Numerical Solvers for Optimization Problems

Optimization problems in economics, finance, and operations research often involve finding the best solution from a set of possible alternatives. While some simple problems can be solved analytically, most real-world optimization problems require numerical methods due to their complexity.

A numerical solver is an algorithm or a set of algorithms designed to find approximate solutions to mathematical problems that cannot be solved analytically. In the context of optimization, these solvers iteratively improve a solution until they reach an optimal (or near-optimal) result within a specified tolerance.
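
As a toy illustration of "iteratively improve until within a tolerance", the sketch below minimizes a one-dimensional quadratic with plain gradient descent. The function `minimize_1d`, its step size, and the objective $f(x) = (x - 3)^2 + 2$ are all assumptions made for this example. Production solvers such as HiGHS use far more sophisticated algorithms.

```python
def minimize_1d(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Gradient descent that stops once successive iterates differ by less than tol."""
    x = x0
    for iteration in range(1, max_iter + 1):
        x_new = x - step * grad(x)
        if abs(x_new - x) < tol:
            return x_new, iteration
        x = x_new
    return x, max_iter

# Hypothetical objective f(x) = (x - 3)^2 + 2, whose gradient is 2 (x - 3)
grad = lambda x: 2.0 * (x - 3.0)

x_star, n_iter = minimize_1d(grad, x0=0.0)
print(f"Approximate minimizer: {x_star:.6f} after {n_iter} iterations")
```

The result is approximate by design: tightening `tol` buys more accuracy at the cost of more iterations.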

### The HiGHS Solver

HiGHS (High Performance Software for Linear Optimization) is a state-of-the-art solver for linear programming problems. It was developed at the University of Edinburgh. HiGHS-based methods were added to SciPy's `linprog` function in version 1.6.0, and `'highs'` is the default method in current SciPy releases [3].

Key features of HiGHS include:

1. **High Performance**: HiGHS is designed to be fast and efficient, often outperforming older solvers like the revised simplex method.

2. **Robustness**: It can handle a wide range of problem types and sizes with numerical stability.

3. **Multiple Algorithms**: HiGHS incorporates both simplex and interior-point methods, automatically choosing the most appropriate method for a given problem.

4. **Open Source**: The solver is freely available and can be integrated into various optimization frameworks.

HiGHS uses three main solution methods [2]:

- **Dual Revised Simplex**: An efficient implementation of the classic simplex algorithm.
- **Interior Point**: A primal-dual interior point method, useful for large-scale problems.
- **Simplex**: A primal simplex method, which can be faster for certain problem types.

When using `scipy.optimize.linprog` with `method='highs'`, SciPy chooses automatically between the dual revised simplex (`'highs-ds'`) and interior-point (`'highs-ipm'`) solvers; either can also be requested explicitly by name.
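
The sketch below runs the production problem from Section 3.3 with each of the HiGHS method names that `linprog` accepts, assuming a SciPy version of 1.6.0 or later. On a problem this small, all three should agree on the optimum; any differences would only be expected in run time on much larger models.

```python
from scipy.optimize import linprog

# Production problem from Section 3.3 (profits negated because linprog minimizes)
c = [-20, -12]
A = [[1, 2], [1, 1], [-1, 0], [0, -1]]
b = [100, 80, -10, -5]

profits = {}
for method in ("highs", "highs-ds", "highs-ipm"):
    res = linprog(c, A_ub=A, b_ub=b, method=method)
    profits[method] = -res.fun
    print(f"{method:>9}: x = {res.x.round(2)}, profit = {-res.fun:.2f}, success = {res.success}")
```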

References:

[1] Anjos, M. F., Lodi, A., & Tanneau, M. (2019). A decentralized framework for the optimal coordination of distributed energy resources. IEEE Transactions on Power Systems, 34(1), 349-359.

[2] Huangfu, Q., & Hall, J. A. J. (2018). Parallelizing the dual revised simplex method. Mathematical Programming Computation, 10(1), 119-142.

[3] SciPy documentation: https://docs.scipy.org/doc/scipy/reference/optimize.linprog-highs.html


## Conclusion

In this tutorial, we've explored the key features of NumPy and how they can be applied to solve problems in economics and business. We've seen how NumPy's efficient array operations, powered by underlying hardware optimizations, can handle complex calculations quickly and easily.

From portfolio optimization to Monte Carlo simulations and linear programming, NumPy provides a solid foundation for numerical computing in Python. As you continue your studies or work in economics and business, you'll find NumPy to be an invaluable tool for data analysis, modeling, and decision-making.

## Further Reading

1. [NumPy User Guide](https://numpy.org/doc/stable/user/index.html)
2. [Quantitative Economics with Python](https://python.quantecon.org/) by Thomas J. Sargent and John Stachurski
