# NumPy

## What is NumPy?

NumPy (Numerical Python) is a powerful library for numerical computing in Python. It provides support for arrays, matrices, and a large collection of mathematical functions.

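## Why use NumPy?
NumPy stores data in contiguous, typed arrays and runs operations in compiled code, so it handles large datasets much faster than plain Python lists.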


Installation:

In [2]:
pip install numpy

Note: you may need to restart the kernel to use updated packages.


## 1. Basics of NumPy Arrays
### 1.1. Creating Arrays
- Array from Lists:

In [3]:
import numpy as np
arr = np.array([1, 2, 3])
print(arr)

[1 2 3]


Converting Python lists into NumPy arrays is the foundation of using the library.

- Multidimensional Arrays:

In [4]:
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr_2d)

[[1 2 3]
 [4 5 6]]


### 1.2. Array Properties
- Shape, Size, and Data Types:

In [5]:
print(arr_2d.shape)   # Dimensions of the array
print(arr_2d.size)    # Total elements
print(arr_2d.dtype)   # Data type (platform-dependent default integer, often int64)

(2, 3)
6
int32


### 1.3. Initializing Arrays
- Zeros and Ones:

In [6]:
zeros = np.zeros((3, 3))
ones = np.ones((2, 2))
print(zeros)
print()
print(ones)


[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

[[1. 1.]
 [1. 1.]]


- Array with a specific value or uninitialized:

In [7]:
full_array = np.full((2, 2), 7)
empty_array = np.empty((3, 3))  # Uninitialized memory: values are arbitrary, not guaranteed to be zeros

print(full_array)
print()
print(empty_array)

[[7 7]
 [7 7]]

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


### 1.4. Ranges and Random Arrays
- arange (Similar to Python’s range):

In [8]:
range_arr = np.arange(10)
print(range_arr)

[0 1 2 3 4 5 6 7 8 9]
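
As a small sketch beyond the single-argument call above, np.arange also accepts a start, stop, and step, and np.linspace is a related function that takes a number of points instead of a step. The values chosen here are illustrative only.

```python
import numpy as np

# arange takes start, stop (exclusive), and step, much like range()
evens = np.arange(10, 50, 2)

# linspace takes start, stop (inclusive by default), and a number of points
points = np.linspace(0.0, 1.0, 5)

print(evens)
print(points)
```

np.arange is usually the better fit for integer steps, while np.linspace avoids floating-point surprises when you need a fixed number of evenly spaced values.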


- Random Array:

In [9]:
random_arr = np.random.rand(3, 3)
print(random_arr)

[[0.7631761  0.71362149 0.08107203]
 [0.13656534 0.83951101 0.13888289]
 [0.28902778 0.55899107 0.35547378]]
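
The random values above change on every run. If reproducible output is needed, one option is NumPy's Generator API with a fixed seed. This is a minimal sketch; the seed 42 is an arbitrary choice, not something from the notebook.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(42)            # 42 is an arbitrary example seed
uniform = rng.random((3, 3))               # floats in [0, 1)
ints = rng.integers(1, 101, size=5)        # integers from 1 to 100 inclusive

# Re-creating the generator with the same seed repeats the sequence
rng_again = np.random.default_rng(42)
same_uniform = rng_again.random((3, 3))

print(np.array_equal(uniform, same_uniform))
```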


## 2. Array Indexing and Slicing
### 2.1. Accessing Elements
- 1D Arrays:

In [10]:
print(arr[1])  # Accessing the second element


2


- 2D Arrays:



In [11]:
print(arr_2d[0, 1])  # Element from the first row, second column


2


### 2.2. Slicing Arrays
- Slicing 1D and 2D Arrays:

In [12]:
slice_arr = arr[0:2]  # First two elements
slice_2d = arr_2d[0, 0:2]  # First row, first two columns

print(slice_arr)
print()
print(slice_2d)

[1 2]

[1 2]


Slicing allows you to extract specific portions of arrays.
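
One detail worth knowing about slices: basic slicing returns a view of the original data rather than a copy, so writing to the slice also changes the source array. The sketch below illustrates this with a small example array (the names are illustrative).

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

view = data[0:2]      # Basic slicing returns a view onto data
view[0] = 99          # ... so this write also changes data[0]

independent = data[2:4].copy()  # .copy() makes an independent array
independent[0] = -1             # data is not affected by this write

print(data)
print(np.shares_memory(data, view), np.shares_memory(data, independent))
```

Call .copy() whenever you need to modify a slice without touching the original.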



## 3. Array Operations
### 3.1. Basic Arithmetic
- Element-wise Operations:

In [13]:
arr_sum = arr + 5
arr_prod = arr * 2


In [14]:
print(arr_sum)
print()
print(arr_prod)

[6 7 8]

[2 4 6]


NumPy applies arithmetic element-wise. When one operand is a scalar, broadcasting stretches it to match the array's shape, so arr + 5 adds 5 to every element.

### 3.2. Matrix Operations
- Matrix Multiplication:

In [15]:
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
result = np.dot(mat1, mat2)
print(result)

[[19 22]
 [43 50]]


NumPy performs matrix multiplication with the dot() function; for 2D arrays, the @ operator (np.matmul) gives the same result.
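
A common source of confusion is the difference between matrix multiplication and element-wise multiplication. The following sketch reuses the two matrices from the cell above to contrast np.dot, the @ operator, and the * operator.

```python
import numpy as np

mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])

matrix_product = mat1 @ mat2      # Matrix multiplication (same as np.dot for 2D arrays)
elementwise = mat1 * mat2         # Multiplies matching entries only

print(matrix_product)
print(elementwise)
```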

### 3.3. Aggregate Functions
- Sum, Mean, Max, Min:

In [16]:
print(np.sum(arr))
print(np.mean(arr))
print(np.max(arr))
print(np.min(arr))


6
2.0
3
1


### 3.4. Axis-wise Operations
- Sum along axis:

In [17]:
sum_cols = np.sum(arr_2d, axis=0)  # Sum of columns
sum_rows = np.sum(arr_2d, axis=1)  # Sum of rows

print(sum_cols)
print(sum_rows)


[5 7 9]
[ 6 15]
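
Axis-wise results are often combined with the original array afterwards. As a sketch, the keepdims=True argument keeps the reduced axis with length 1, so the result lines up with the source array for a follow-up operation such as centering each row.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# keepdims=True keeps the reduced axis, giving shape (2, 1) instead of (2,)
row_means = arr_2d.mean(axis=1, keepdims=True)

# The (2, 1) means broadcast across the three columns of each row
centered = arr_2d - row_means

print(row_means.shape)
print(centered)
```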


## 4. Advanced Array Manipulation
### 4.1. Reshaping Arrays
- Reshape and Flatten:

In [18]:
reshaped = arr_2d.reshape(3, 2)
flattened = arr_2d.flatten()

print(reshaped)
print()
print(flattened)



[[1 2]
 [3 4]
 [5 6]]

[1 2 3 4 5 6]


Reshaping changes the dimensions, and flattening converts an array into a 1D array.
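
Two related details, shown here as a minimal sketch: passing -1 to reshape lets NumPy infer that dimension, and ravel() returns a view when it can, whereas flatten() always returns a copy.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

inferred = arr_2d.reshape(3, -1)   # -1 is inferred as 2, because 6 / 3 = 2

flat_copy = arr_2d.flatten()       # Always a copy
flat_view = arr_2d.ravel()         # A view here, because arr_2d is contiguous

flat_copy[0] = 100                 # Does not affect arr_2d
print(arr_2d[0, 0])

flat_view[0] = 100                 # Writes through to arr_2d
print(arr_2d[0, 0])
```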

### 4.2. Stacking and Splitting Arrays
- Vertical and Horizontal Stacking:

In [19]:
stacked = np.vstack((arr, arr))
hstacked = np.hstack((arr, arr))

print(stacked)
print()
print(hstacked)


[[1 2 3]
 [1 2 3]]

[1 2 3 1 2 3]


- Splitting

In [20]:
split_arr = np.split(arr_2d, 2, axis=0)
print(split_arr)

[array([[1, 2, 3]]), array([[4, 5, 6]])]


## 5. Boolean and Advanced Indexing
### 5.1. Boolean Masking
- Creating a Boolean Mask

In [21]:
bool_mask = arr > 1
print(arr[bool_mask])  # Elements greater than 1

[2 3]
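
Masks can also be combined. The sketch below uses & (and) with parentheses around each condition, and np.where to replace values conditionally; the example array is illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# Each condition needs its own parentheses; use & and | rather than "and"/"or"
even_and_large = arr[(arr > 1) & (arr % 2 == 0)]

# np.where(condition, value_if_true, value_if_false) works element-wise
capped = np.where(arr > 3, 0, arr)

print(even_and_large)
print(capped)
```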


### 5.2. Fancy Indexing
- Using specific indices:

In [22]:
fancy_index = arr[[0, 2]]
print(fancy_index)


[1 3]


## 6. Broadcasting in NumPy
Broadcasting lets NumPy combine arrays of different but compatible shapes. Below, the 1D array of shape (2,) is added to every row of the (3, 2) array.

In [23]:
arr_large = np.array([[1, 2], [3, 4], [5, 6]])
arr_small = np.array([1, 2])
result = arr_large + arr_small

print(result)

[[2 4]
 [4 6]
 [6 8]]


## 7. Performance Comparison with Lists
### 7.1. Time Comparison with Python Lists
- Using timeit:

In [24]:
import timeit
python_list = list(range(1000))
numpy_array = np.arange(1000)
print(timeit.timeit(lambda: [x * 2 for x in python_list], number=1000))
print(timeit.timeit(lambda: numpy_array * 2, number=1000))


0.12460509999999658
0.007357100000007222


## 8. Advanced Functions
### 8.1. Linear Algebra
- Inverse, Determinants, and Eigenvalues:

In [26]:
matrix = np.array([[1, 2], [3, 4]])
inverse = np.linalg.inv(matrix)
det = np.linalg.det(matrix)
eigenvalues, eigenvectors = np.linalg.eig(matrix)

print(inverse)
print(det)
print(eigenvalues)
print(eigenvectors)

[[-2.   1. ]
 [ 1.5 -0.5]]
-2.0000000000000004
[-0.37228132  5.37228132]
[[-0.82456484 -0.41597356]
 [ 0.56576746 -0.90937671]]
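
Results like these are easy to sanity-check. As a sketch, the inverse should multiply back to the identity, and each eigenvector column should satisfy A v = λ v. The right-hand side b used with np.linalg.solve is a hypothetical example.

```python
import numpy as np

matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
inverse = np.linalg.inv(matrix)
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# A @ A^-1 should equal the identity, up to floating-point error
inverse_ok = np.allclose(matrix @ inverse, np.eye(2))

# Column j of eigenvectors pairs with eigenvalues[j]: A @ V == V * eigenvalues
eig_ok = np.allclose(matrix @ eigenvectors, eigenvectors * eigenvalues)

# To solve A x = b, np.linalg.solve is usually preferred over forming the inverse
b = np.array([5.0, 6.0])   # Hypothetical right-hand side
x = np.linalg.solve(matrix, b)

print(inverse_ok, eig_ok, x)
```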


### 8.2. Random Numbers and Distributions
- Normal Distribution:
NumPy allows you to generate random numbers from different distributions.

In [27]:
normal_dist = np.random.normal(0, 1, (2, 2))

print(normal_dist)

[[-1.56675637 -0.20813885]
 [-0.72898846 -0.48740057]]


## 9. Working with Multidimensional Arrays
### 9.1. Creating 3D Arrays
- 3D Array Example:

In [28]:
arr_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr_3d)


[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


### 9.2. Indexing and Slicing 3D Arrays
- Accessing Elements in 3D Arrays:

In [29]:
print(arr_3d[0, 1, 1])  # Accessing element at depth 0, row 1, column 1

4


- Slicing in 3D Arrays:

In [30]:
slice_3d = arr_3d[:, 1, :]  # Slicing across the second row of each depth
print(slice_3d)


[[3 4]
 [7 8]]


### 9.3. Reshaping Multidimensional Arrays
- Reshaping a 3D Array:
Reshaping allows you to change the structure of multidimensional arrays while maintaining the data.



In [31]:
reshaped_3d = arr_3d.reshape(4, 2)  # Flatten and reshape the 3D array
print(reshaped_3d)

[[1 2]
 [3 4]
 [5 6]
 [7 8]]


## 10. Broadcasting with Higher Dimensional Arrays
### 10.1. Advanced Broadcasting Examples
- Broadcasting a 1D Array across a 2D Array:

In [32]:
arr_a = np.array([[1, 2], [3, 4]])
arr_b = np.array([1, 2])
broadcasted = arr_a + arr_b  # Broadcasts along the matching dimensions
print(broadcasted)


[[2 4]
 [4 6]]


Broadcasting enables operations on arrays of different dimensions without explicitly reshaping them.

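### 10.2. Broadcasting Rules
Broadcasting follows these rules:
- If the arrays have different numbers of dimensions, the shape of the one with fewer dimensions is padded with ones on the left.
- Shapes are then compared dimension by dimension, starting from the trailing dimension. Two dimensions are compatible when they are equal or when one of them is 1.
- A dimension of size 1 is stretched to match the other array; any other mismatch raises a ValueError.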
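
The rules can be checked directly on shapes. This is a minimal sketch with arbitrary example shapes: one compatible pair and one pair whose trailing dimensions do not match.

```python
import numpy as np

a = np.ones((3, 1))
b = np.ones((4,))        # Treated as shape (1, 4) for broadcasting

# (3, 1) with (1, 4): each pair of dimensions contains a 1, so the result is (3, 4)
compatible_shape = (a + b).shape

# (3, 2) with (3,): trailing dimensions are 2 and 3, neither is 1, so this fails
try:
    np.ones((3, 2)) + np.ones((3,))
    failed = False
except ValueError:
    failed = True

print(compatible_shape, failed)
```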

## 11. NumPy and Pandas Integration
### 11.1. Converting between NumPy Arrays and Pandas DataFrames
- Array to DataFrame:

In [36]:
import pandas as pd
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df = pd.DataFrame(arr, columns=['A', 'B', 'C'])
print(df)

   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9


- DataFrame to Array:

In [37]:
arr_from_df = df.to_numpy()  # Equivalent to the older df.values attribute
print(arr_from_df)


[[1 2 3]
 [4 5 6]
 [7 8 9]]


Pandas commonly uses NumPy arrays as its underlying data structure for efficient computation and manipulation.

## 12. Handling Missing Values with NumPy
### 12.1. Using np.nan for Missing Values
- Creating Arrays with NaNs: np.nan is used to represent missing or invalid data.

In [38]:
arr_with_nan = np.array([1, 2, np.nan, 4])
print(arr_with_nan)


[ 1.  2. nan  4.]


### 12.2. Functions Ignoring NaN Values
- Using np.nan* Functions: NumPy provides functions like np.nanmean() and np.nansum() to handle arrays containing NaNs.

In [39]:
print(np.nanmean(arr_with_nan))  # Mean ignoring NaNs


2.3333333333333335
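
Besides the np.nan* functions, missing values can be located with np.isnan and then dropped or filled. The following is a sketch of both options; filling with the mean is just one possible choice.

```python
import numpy as np

arr_with_nan = np.array([1, 2, np.nan, 4])

nan_mask = np.isnan(arr_with_nan)      # True where the value is NaN
n_missing = int(nan_mask.sum())        # Count of missing values

cleaned = arr_with_nan[~nan_mask]      # Drop the NaNs
filled = np.where(nan_mask, np.nanmean(arr_with_nan), arr_with_nan)  # Fill with the mean

# A plain mean propagates NaN, which is why the nan-aware versions exist
plain_mean = np.mean(arr_with_nan)

print(n_missing, cleaned, filled, plain_mean)
```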


## 13. Memory Management and Efficiency
### 13.1. Memory Footprint of Arrays
- Checking Memory Usage: NumPy arrays are more memory-efficient than native Python lists, especially for large datasets.

In [40]:
print(arr.nbytes)  # Total bytes consumed by the array

36
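
The byte count is simply the number of elements times the size of one element, so it depends on the dtype. As a sketch, the array below is created explicitly as int32 to match the 36 bytes shown above, then converted to a smaller type.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)

# nbytes = number of elements * bytes per element
total_bytes = arr.size * arr.itemsize

# A smaller dtype shrinks the footprint, as long as the values fit in it
small = arr.astype(np.int8)

print(arr.nbytes, total_bytes, small.nbytes)
```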


### 13.2. Saving and Loading Arrays
- Saving Arrays to Disk:

In [41]:
np.save('array.npy', arr)  # Save to a .npy file


- Loading Arrays from Disk: NumPy allows you to save and load large arrays efficiently using .npy format.

In [42]:
loaded_arr = np.load('array.npy')
print(loaded_arr)


[[1 2 3]
 [4 5 6]
 [7 8 9]]


## 14. Structured Arrays and Record Arrays
### 14.1. Structured Arrays for Heterogeneous Data
- Creating Structured Arrays:

In [43]:
structured_arr = np.array([(1, 'Apple', 2.5), (2, 'Banana', 3.0)], dtype=[('ID', 'i4'), ('Fruit', 'U10'), ('Price', 'f4')])
print(structured_arr)


[(1, 'Apple', 2.5) (2, 'Banana', 3. )]


- Accessing Structured Array Data: Structured arrays allow you to work with heterogeneous data, like rows containing mixed data types.

In [44]:
print(structured_arr['Fruit'])  # Accessing 'Fruit' column


['Apple' 'Banana']
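
Fields can also drive filtering and sorting. This sketch extends the example with a third, hypothetical row ('Cherry') so that the ordering is visible.

```python
import numpy as np

fruit_dtype = [('ID', 'i4'), ('Fruit', 'U10'), ('Price', 'f4')]
structured_arr = np.array(
    [(1, 'Apple', 2.5), (2, 'Banana', 3.0), (3, 'Cherry', 1.5)],  # 'Cherry' is a made-up row
    dtype=fruit_dtype,
)

# Boolean mask built from one field, applied to whole records
expensive = structured_arr[structured_arr['Price'] > 2.0]

# Sort whole records by a named field
by_price = np.sort(structured_arr, order='Price')

print(expensive['Fruit'])
print(by_price['Fruit'])
```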


## 15. NumPy for Mathematical and Statistical Functions
### 15.1. Mathematical Functions
- Common Mathematical Operations:

In [47]:
arr = np.array([1, 2, 3, 4])
print(np.sqrt(arr))   # Square root
print(np.exp(arr))    # Exponential


[1.         1.41421356 1.73205081 2.        ]
[ 2.71828183  7.3890561  20.08553692 54.59815003]


### 15.2. Statistical Functions
- Variance, Standard Deviation:

In [48]:
print(np.var(arr))    # Variance (population, ddof=0 by default)
print(np.std(arr))    # Standard deviation (population, ddof=0 by default)


1.25
1.118033988749895


 NumPy offers a variety of built-in mathematical and statistical functions that operate efficiently on large arrays.
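
By default, np.var and np.std divide by N (population statistics). For sample statistics, which divide by N - 1, pass ddof=1. A short sketch on the same four values:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

population_var = np.var(arr)          # Divides by N
sample_var = np.var(arr, ddof=1)      # Divides by N - 1
sample_std = np.std(arr, ddof=1)

print(population_var, sample_var, sample_std)
```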



## 16. Advanced NumPy Operations with Broadcasting and Vectorization
### 16.1. Vectorization
- Vectorized Operations for Speed:

In [50]:
large_arr = np.arange(100000)
result = large_arr * 2  # Element-wise multiplication
print(result)

[     0      2      4 ... 199994 199996 199998]


Vectorization allows operations to be performed without explicit loops, making computations faster.

### 16.2. Broadcasting with Advanced Functions
- Using np.meshgrid() and Broadcasting:

In [52]:
x = np.arange(0, 5)
y = np.arange(0, 5)
xx, yy = np.meshgrid(x, y)
z = xx ** 2 + yy ** 2  # Element-wise on the two 5x5 coordinate grids

print(z)


[[ 0  1  4  9 16]
 [ 1  2  5 10 17]
 [ 4  5  8 13 20]
 [ 9 10 13 18 25]
 [16 17 20 25 32]]


Grid-based expressions like this evaluate over every (x, y) pair at once, without the need for explicit iteration.
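
np.meshgrid() materializes two full 5x5 grids before the arithmetic runs. As a sketch of an alternative, the same result can be obtained by giving x and y compatible shapes with np.newaxis and letting broadcasting expand them.

```python
import numpy as np

x = np.arange(0, 5)
y = np.arange(0, 5)

# Approach from the cell above: explicit coordinate grids
xx, yy = np.meshgrid(x, y)
z_grid = xx ** 2 + yy ** 2

# Broadcasting approach: x as a row (1, 5), y as a column (5, 1)
z_broadcast = x[np.newaxis, :] ** 2 + y[:, np.newaxis] ** 2

print(np.array_equal(z_grid, z_broadcast))
```

The broadcasting version avoids allocating the intermediate grids, which can matter for large inputs.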



### Practice Exercises
1. Basics of NumPy Arrays
- Create a 1D NumPy array of the 20 even integers from 10 to 48.
- Convert the 1D array to a 2D array with 4 rows and 5 columns.
- Select the first 3 rows and the last 2 columns of the 2D array you just created.
2. Array Operations
- Create two arrays arr1 = [1, 2, 3, 4, 5] and arr2 = [10, 20, 30, 40, 50]. Perform element-wise addition, subtraction, multiplication, and division.
- Use broadcasting to add a 1D array [1, 2, 3] to a 2D array [[10, 20, 30], [40, 50, 60]].
3. Array Indexing and Slicing
- Create an array of shape (4, 5) filled with random numbers between 0 and 1. Extract the sub-array consisting of the last two rows and the last two columns.
- Given an array arr = np.array([1, 2, 3, 4, 5, 6]), reverse the order of the elements.
4. Reshaping and Resizing
- Create an array of 16 random integers between 1 and 100. Reshape it into a 4x4 matrix.
- Flatten the matrix back into a 1D array using the ravel() function.
5. Advanced Array Manipulation
- Create a 3x3 identity matrix.
- Create an array of 100 equally spaced numbers between 0 and 10. Reshape it into a 10x10 matrix and find the sum of each row.
- Stack two 1D arrays arr1 = [1, 2, 3] and arr2 = [4, 5, 6] vertically and horizontally.
6. Statistical Operations
- Create an array of 20 random integers between 0 and 100. Compute the mean, median, standard deviation, and variance of the array.
- Create a 5x5 matrix of random integers between 1 and 100. Find the maximum and minimum values of each row.

7. Mathematical Functions
- Create an array of angles in radians and compute their sine, cosine, and tangent values.
- Create a random array and compute the exponential, logarithm, and square root of each element.


## 17. NumPy for Stock Market Analysis
### 17.1. Fetching Stock Data
- Using yfinance to Download Historical Data:

In [58]:
import yfinance as yf

# Download historical data for a stock (here, Reliance Industries on the NSE)
stock_data = yf.download('Reliance.NS', start='2020-01-01', end='2024-10-07')
print(stock_data.head())


[*********************100%%**********************]  1 of 1 completed

                   Open         High          Low        Close    Adj Close  \
Date                                                                          
2020-01-01  1387.957031  1396.277466  1376.527954  1380.276611  1356.042725   
2020-01-02  1382.471069  1408.941040  1382.471069  1403.775024  1379.128662   
2020-01-03  1401.671997  1409.581055  1392.528687  1405.466553  1380.790527   
2020-01-06  1389.785767  1397.008911  1369.670410  1372.870605  1348.766846   
2020-01-07  1388.871338  1403.043579  1383.842529  1393.991699  1369.516968   

              Volume  
Date                  
2020-01-01   7002234  
2020-01-02   8855158  
2020-01-03  10492349  
2020-01-06  12259588  
2020-01-07   8341811  





 This example shows how to use the yfinance library to download historical stock market data. The data will be loaded into a Pandas DataFrame, but can easily be converted to a NumPy array for further analysis.

### 17.2. Calculating Daily Stock Returns
- Daily Returns Using NumPy: Daily returns are calculated by finding the percentage change between consecutive closing prices. NumPy makes this operation efficient even for large datasets.

In [60]:
# Convert 'Close' prices to a NumPy array
close_prices = stock_data['Close'].values

# Calculate daily returns
daily_returns = (close_prices[1:] - close_prices[:-1]) / close_prices[:-1]
print(daily_returns[:5])  # Print the first 5 daily returns
print(daily_returns[-5:])  # Print the last 5 daily returns

[ 0.01702442  0.00120499 -0.02319226  0.01538462 -0.00751017]
[ 0.01884248 -0.03249961 -0.0079576  -0.03949276 -0.01453469]
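
The same calculation can be written with np.diff, and log returns are a common alternative because they add up over time. This is a sketch on a short, made-up price series rather than the downloaded data.

```python
import numpy as np

prices = np.array([100.0, 102.0, 101.0, 105.0])  # Hypothetical closing prices

# Simple returns: (p[t] - p[t-1]) / p[t-1]
simple_returns = np.diff(prices) / prices[:-1]

# Log returns: log(p[t] / p[t-1])
log_returns = np.diff(np.log(prices))

# Simple returns compound by multiplication; log returns by addition
total_simple = np.prod(1 + simple_returns) - 1
total_log = np.sum(log_returns)

print(simple_returns, log_returns, total_simple, total_log)
```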


### 17.3. Calculating Moving Averages
- Simple Moving Average (SMA):
A simple moving average smooths out price data to identify trends. The function below uses NumPy’s np.convolve() function with equal weights; mode='valid' keeps only the positions where the window fully overlaps the data, so the result has len(data) - window_size + 1 values.

In [61]:
def moving_average(data, window_size):
    weights = np.ones(window_size) / window_size
    sma = np.convolve(data, weights, mode='valid')
    return sma

sma_10 = moving_average(close_prices, 10)
print(sma_10[:5])  # First 5 values of 10-day SMA
print(sma_10[-5:])  # Last 5 values of 10-day SMA

[1398.01931152 1399.32224121 1399.55996094 1403.56931152 1406.39002686]
[2972.70500488 2973.75       2972.25498047 2960.95998535 2944.32998047]
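
To see what the convolution is doing, it helps to compare it with an explicit windowed mean on a tiny input. The data below is a made-up sequence used only for the check.

```python
import numpy as np

data = np.arange(1.0, 8.0)   # Hypothetical series: 1.0, 2.0, ..., 7.0
window_size = 3

# Convolution with equal weights, as in the function above
weights = np.ones(window_size) / window_size
sma = np.convolve(data, weights, mode='valid')

# Explicit loop over each full window, for comparison
manual = np.array([
    data[i:i + window_size].mean()
    for i in range(len(data) - window_size + 1)
])

print(sma)
print(np.allclose(sma, manual))
```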


### 17.4. Exponential Moving Average (EMA)
- EMA Calculation: Unlike the simple moving average, the exponential moving average places more weight on recent prices, making it more responsive to price changes.

In [63]:
def exponential_moving_average(data, span):
    alpha = 2 / (span + 1)
    ema = np.zeros_like(data, dtype=float)  # Float output even if the input is integer-typed
    ema[0] = data[0]  # First value is the same as the data point
    for t in range(1, len(data)):
        ema[t] = alpha * data[t] + (1 - alpha) * ema[t - 1]
    return ema

ema_10 = exponential_moving_average(close_prices, 10)
print(ema_10[:5])  # First 5 values of 10-day EMA
print(ema_10[-5:])  # Last 5 values of 10-day EMA

[1380.27661133 1384.54905007 1388.35223237 1385.53739112 1387.07453805]
[2986.18704744 2980.18029378 2970.99294989 2942.43967739 2911.64156311]


### 17.5. Volatility Calculation
- Volatility as Standard Deviation of Returns: Volatility represents the risk associated with a stock and is calculated using the standard deviation of the stock’s returns. The example below annualizes the daily volatility by multiplying it by the square root of 252 (the approximate number of trading days in a year).

In [64]:
volatility = np.std(daily_returns) * np.sqrt(252)  # Annualized volatility (252 trading days in a year)
print(f"Annualized Volatility: {volatility}")


Annualized Volatility: 0.30120735261588094


## 18. Portfolio Analysis with NumPy
### 18.1. Portfolio Returns
- Calculating Portfolio Returns Using NumPy:

In [65]:
# Sample weights and returns for a 3-stock portfolio
weights = np.array([0.3, 0.4, 0.3])  # Portfolio weights
stock_returns = np.array([[0.02, 0.03, 0.01],  # Daily returns for 3 stocks
                          [0.01, 0.02, 0.00],
                          [-0.01, 0.01, 0.02]])

portfolio_returns = np.dot(stock_returns, weights)
print(portfolio_returns)


[0.021 0.011 0.007]


 In this example, we compute the returns of a portfolio consisting of three stocks. Portfolio returns are calculated by taking the dot product of the individual stock returns and their respective portfolio weights.

### 18.2. Portfolio Risk (Variance and Standard Deviation)
- Calculating Portfolio Variance and Standard Deviation: Portfolio risk is measured by the variance and standard deviation of the portfolio's returns. The covariance matrix captures the relationship between the returns of the stocks in the portfolio, and the overall portfolio risk is calculated using the portfolio weights.



In [66]:
# Covariance matrix of stock returns
cov_matrix = np.cov(stock_returns.T)

# Portfolio variance
portfolio_variance = np.dot(weights.T, np.dot(cov_matrix, weights))
portfolio_std_dev = np.sqrt(portfolio_variance)

print(f"Portfolio Standard Deviation: {portfolio_std_dev}")


Portfolio Standard Deviation: 0.007211102550927978
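
As a cross-check, the quadratic form w^T Σ w should match the sample variance of the portfolio's own return series, because np.cov uses N - 1 in the denominator by default. A sketch using the same sample weights and returns:

```python
import numpy as np

weights = np.array([0.3, 0.4, 0.3])
stock_returns = np.array([[0.02, 0.03, 0.01],
                          [0.01, 0.02, 0.00],
                          [-0.01, 0.01, 0.02]])

# Route 1: covariance matrix and the quadratic form w^T Σ w
cov_matrix = np.cov(stock_returns.T)
variance_from_cov = weights @ cov_matrix @ weights

# Route 2: variance of the portfolio return series itself (ddof=1 to match np.cov)
portfolio_returns = stock_returns @ weights
variance_direct = np.var(portfolio_returns, ddof=1)

print(variance_from_cov, variance_direct)
```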


## 19. Risk-Adjusted Returns with NumPy
### 19.1. Sharpe Ratio Calculation
- Calculating the Sharpe Ratio:

In [67]:
risk_free_rate = 0.01  # Assume a 1% risk-free rate per period (same frequency as the returns)
excess_returns = portfolio_returns - risk_free_rate

sharpe_ratio = np.mean(excess_returns) / portfolio_std_dev
print(f"Sharpe Ratio: {sharpe_ratio}")


Sharpe Ratio: 0.41602514716892175


The Sharpe ratio is a measure of risk-adjusted returns. It calculates how much excess return a portfolio generates per unit of risk (as measured by the portfolio's standard deviation).

## Conclusion

This notebook has taken you from the basics of NumPy to advanced applications, especially in stock market analysis. We started with fundamental array operations, explored broadcasting, and learned how to handle data efficiently using NumPy's functions.

In the stock market sections, we applied NumPy to calculate returns, moving averages, volatility, portfolio risk, and the Sharpe ratio. These real-world examples highlight the practical use of NumPy in financial analysis.

NumPy's speed, flexibility, and scalability make it essential for data analysis. Keep practicing and building on these concepts to harness its full potential in your projects!