### What is NumPy?

- **Fundamental Package for Scientific Computing:**
  - NumPy is the essential package for scientific computing in Python.
  - Provides support for large, multi-dimensional arrays and matrices.

- **Library Components:**
  - Offers a multidimensional array object.
  - Includes derived objects such as masked arrays and matrices.
  - Provides a collection of routines for fast array operations.

- **Array Operations:**
  - **Mathematical:** Supports a wide range of mathematical operations.
  - **Logical:** Facilitates logical operations on arrays.
  - **Shape Manipulation:** Allows for efficient reshaping and resizing of arrays.
  - **Sorting and Selecting:** Includes functions for sorting and selecting elements.
  - **I/O Operations:** Supports input and output operations.
  - **Discrete Fourier Transforms:** Offers tools for performing discrete Fourier transforms.
  - **Basic Linear Algebra:** Provides basic linear algebra operations.
  - **Statistical Operations:** Includes basic statistical functions.
  - **Random Simulations:** Facilitates random number generation and simulations.

- **Core Object - ndarray:**
  - The central feature of NumPy is the `ndarray` object.
  - Encapsulates n-dimensional arrays of homogeneous data types.

### NumPy Arrays vs. Python Sequences

- **Fixed Size:**
  - NumPy arrays have a fixed size at creation.
  - Unlike Python lists, which can grow dynamically.
  - Changing the size of an ndarray creates a new array and deletes the original.

- **Homogeneous Data Type:**
  - All elements in a NumPy array must be of the same data type.
  - Elements are the same size in memory.

- **Efficiency in Operations:**
  - NumPy arrays support advanced mathematical and other operations on large data sets.
  - Such operations are typically more efficient and require less code than using Python’s built-in sequences.

- **Wide Adoption in Scientific and Mathematical Packages:**
  - Many scientific and mathematical Python-based packages use NumPy arrays.
  - These packages often support Python-sequence input but convert such input to NumPy arrays for processing.
  - They frequently output results as NumPy arrays.
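
The differences listed above can be seen in a few lines. This is a minimal sketch; the variable names (`py_list`, `arr`, `bigger`) are illustrative and not part of the original notes.

```python
import numpy as np

# A Python list can hold mixed types and grows in place.
py_list = [1, 2.5, "three"]
py_list.append(4)

# A NumPy array has a single dtype; mixed int/float input is upcast to float.
arr = np.array([1, 2, 3.5])
print(arr.dtype)      # float64
print(arr.itemsize)   # bytes used by each element

# "Growing" an array returns a new array; the original keeps its size.
bigger = np.append(arr, 4.0)
print(arr.size, bigger.size)

# Element-wise arithmetic needs no explicit loop.
doubled_list = [x * 2 for x in [1, 2, 3]]   # list comprehension
doubled_arr = np.array([1, 2, 3]) * 2       # vectorized
print(doubled_list, doubled_arr)
```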

### Use Cases of NumPy in Algo Trading

NumPy is an essential library for algorithmic trading due to its efficiency in numerical computations and array manipulations. Here are some key use cases of NumPy in algo trading:

1. **Data Preparation and Cleaning**
   - **Data Transformation:** Convert raw trading data into structured formats for analysis.
     ```python
     import numpy as np
     # Each record must be a tuple so that NumPy maps it onto the fields of the structured dtype
     raw_data = [(100, '2023-06-01'), (102, '2023-06-02'), (104, '2023-06-03')]
     structured_data = np.array(raw_data, dtype=[('price', 'f4'), ('date', 'U10')])
     print(structured_data)
     ```
   - **Handling Missing Values:** Fill or interpolate missing data points in price series.
     ```python
     prices = np.array([100, 102, np.nan, 105])
     cleaned_prices = np.where(np.isnan(prices), np.nanmean(prices), prices)
     print(cleaned_prices)
     ```

2. **Technical Analysis**
   - **Calculating Indicators:** Compute moving averages, Bollinger Bands, RSI, etc.
     ```python
     prices = np.array([100, 102, 104, 103, 105, 107])
     window = 3
     moving_avg = np.convolve(prices, np.ones(window)/window, mode='valid')
     print(moving_avg)
     ```
   - **Trend Analysis:** Detect trends and patterns in price movements.
     ```python
     prices = np.array([100, 101, 102, 103, 104, 105])
     gradient = np.gradient(prices)
     print(gradient)
     ```

3. **Portfolio Management**
   - **Risk Assessment:** Calculate covariance matrices and perform risk analysis.
     ```python
     returns = np.array([[0.01, 0.02, -0.01], [0.03, 0.01, 0.02], [0.01, -0.02, 0.03]])
     cov_matrix = np.cov(returns, rowvar=False)
     print(cov_matrix)
     ```
   - **Optimization:** Use linear algebra to find the optimal portfolio weights.
     ```python
     import numpy.linalg as la
     returns = np.array([[0.01, 0.02], [0.03, 0.01], [0.01, -0.02]])
     mean_returns = np.mean(returns, axis=0)
     cov_matrix = np.cov(returns, rowvar=False)
     weights = la.solve(cov_matrix, mean_returns)
     weights /= np.sum(weights)
     print(weights)
     ```

4. **Statistical Analysis**
   - **Simulating Price Paths:** Generate synthetic price data for backtesting.
     ```python
     np.random.seed(42)
     n_steps = 1000
     steps = np.random.choice([-1, 1], size=n_steps)
     random_walk = np.cumsum(steps)
     print(random_walk)
     ```
   - **Hypothesis Testing:** Perform statistical tests to validate trading strategies.
     ```python
     from scipy import stats
     returns = np.array([0.01, 0.02, -0.01, 0.03, -0.02])
     t_stat, p_value = stats.ttest_1samp(returns, 0)
     print(t_stat, p_value)
     ```

5. **Performance Measurement**
   - **Sharpe Ratio Calculation:** Measure risk-adjusted returns.
     ```python
     returns = np.array([0.01, 0.02, -0.01, 0.03, -0.02])
     risk_free_rate = 0.01
     sharpe_ratio = (np.mean(returns) - risk_free_rate) / np.std(returns)
     print(sharpe_ratio)
     ```
   - **Drawdown Analysis:** Identify and analyze maximum drawdowns in the portfolio.
     ```python
     prices = np.array([100, 110, 105, 115, 108])
     peak = np.maximum.accumulate(prices)
     drawdown = (prices - peak) / peak
     max_drawdown = np.min(drawdown)
     print(max_drawdown)
     ```
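
The missing-value example in item 1 fills gaps with the series mean. Interpolating between neighbouring prices is the other option named there. Here is a minimal sketch using `np.interp`, assuming the gaps are marked as `NaN` and the observations are evenly spaced; the price values are illustrative.

```python
import numpy as np

# Price series with gaps marked as NaN (illustrative values).
prices = np.array([100.0, 102.0, np.nan, 105.0, np.nan, 107.0])

idx = np.arange(prices.size)   # position of each observation
missing = np.isnan(prices)     # True where a price is missing

# np.interp(x, xp, fp): evaluate at the missing positions, using the
# known positions and known prices as interpolation nodes.
filled = prices.copy()
filled[missing] = np.interp(idx[missing], idx[~missing], prices[~missing])
print(filled)
```

Linear interpolation uses the value after each gap, so in a backtest it can leak future information; forward-filling avoids that at the cost of flat segments.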






### Creating NumPy Arrays

In [1]:
!pip install numpy



In [12]:
# np.array
import numpy as np
np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [2]:
# 2D and 3D
np.array([[1,2,3],[1,2,3]])

array([[1, 2, 3],
       [1, 2, 3]])

In [13]:
np.array([[[1,2,3],[1,2,3]],[[1,2,3],[1,2,3]]])

array([[[1, 2, 3],
        [1, 2, 3]],

       [[1, 2, 3],
        [1, 2, 3]]])

In [16]:
# dtype
np.array([1,2,3], dtype=np.float32)

array([1., 2., 3.], dtype=float32)

In [19]:
# np.arange

np.arange(2,12,4)

array([ 2,  6, 10])

In [26]:
# with reshape
np.arange(0,12).reshape(2,2,3)

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])

In [32]:
# np.ones and np.zeros
np.ones((3,3,3))

array([[[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]]])

In [33]:
np.zeros((4,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [49]:
# np.random
np.random.random((2,2))

array([[0.32994856, 0.09030289],
       [0.8766488 , 0.69467847]])

In [55]:
# np.linspace
np.linspace(1,100,36)

array([  1.        ,   3.82857143,   6.65714286,   9.48571429,
        12.31428571,  15.14285714,  17.97142857,  20.8       ,
        23.62857143,  26.45714286,  29.28571429,  32.11428571,
        34.94285714,  37.77142857,  40.6       ,  43.42857143,
        46.25714286,  49.08571429,  51.91428571,  54.74285714,
        57.57142857,  60.4       ,  63.22857143,  66.05714286,
        68.88571429,  71.71428571,  74.54285714,  77.37142857,
        80.2       ,  83.02857143,  85.85714286,  88.68571429,
        91.51428571,  94.34285714,  97.17142857, 100.        ])

In [57]:
# np.identity
np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

### Array Attributes

In [58]:
a1 = np.array([1,2,3],dtype=str)
a2 = np.array([[1,2,3],[1,2,3]])
a3 = np.array([[[1,2,3],[1,2,3]],[[1,2,3],[1,2,3]]])


In [60]:
# ndim
a1.ndim

1

In [61]:
# shape
a3.shape

(2, 2, 3)

In [65]:
# size
a1.size

3

In [64]:
# itemsize
a3.itemsize

4

In [68]:
# dtype (a3.dtype reports the current type; astype returns a copy converted to the requested type)
a3.astype(str)



array([[['1', '2', '3'],
        ['1', '2', '3']],

       [['1', '2', '3'],
        ['1', '2', '3']]], dtype='<U11')

### Changing Datatype

In [58]:
# astype
a3.astype(np.int64)

array([[[1, 2, 3],
        [1, 2, 3]],

       [[1, 2, 3],
        [1, 2, 3]]])

### Array Operations

In [80]:
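# scalar operations

# arithmetic
# note: multiplying two arrays, as here, is element-wise (a vector operation);
# a scalar operation would be, for example, ohlc_data_day1 * 2
# (both arrays are defined in the code block further down)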
ohlc_data_day1 * ohlc_data_Day2

array([[20000.  , 20452.25, 29800.25, 20301.  ],
       [20301.  , 20604.  , 20150.25, 20543.24],
       [20543.24, 20909.  , 20301.  , 20756.25]])

In [None]:
# relational
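
The relational cell is empty in the notes. A minimal sketch of comparison operators on arrays might look like the following; it assumes the columns are Open, High, Low, Close as in the OHLC comment elsewhere in these notes, and the names `above_101` and `closed_up` are illustrative.

```python
import numpy as np

ohlc_data_day1 = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])

# Comparing an array with a scalar yields a boolean array of the same shape.
above_101 = ohlc_data_day1 > 101.0
print(above_101)

# Comparing two columns: rows where Close (column 3) is above Open (column 0).
closed_up = ohlc_data_day1[:, 3] > ohlc_data_day1[:, 0]
print(closed_up)

# Boolean arrays can be counted or used as masks.
print(above_101.sum())             # number of values above 101
print(ohlc_data_day1[above_101])   # the values themselves, as a 1-D array
```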


In [None]:
# vector operations
# arithmetic
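
This cell is also empty. As a hedged sketch, the difference between scalar and vector arithmetic can be illustrated with the two OHLC arrays used throughout these notes; the result names (`scaled`, `shifted`, `diff`, `ratio`) are illustrative.

```python
import numpy as np

ohlc_data_day1 = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])
ohlc_data_Day2 = np.array([
    [200.0, 201.5, 299.5, 201.0],
    [201.0, 202.0, 200.5, 201.8],
    [201.8, 203.0, 201.0, 202.5]
])

# Scalar operation: the scalar is applied to every element.
scaled = ohlc_data_day1 * 2
shifted = ohlc_data_day1 + 0.5

# Vector operation: two arrays of the same shape combine element by element.
diff = ohlc_data_Day2 - ohlc_data_day1
ratio = ohlc_data_Day2 / ohlc_data_day1

print(scaled[0], shifted[0], diff[0], ratio[0], sep="\n")
```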



```python
import numpy as np

# Creating a NumPy array for stock prices
stock_prices = np.array([100.5, 101.3, 102.8, 101.9, 100.7])
print("Stock Prices Array:", stock_prices)

# Creating a 2D NumPy array for OHLC (Open, High, Low, Close) data
ohlc_data_day1 = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])

ohlc_data_Day2 = np.array([
    [200.0, 201.5, 299.5, 201.0],
    [201.0, 202.0, 200.5, 201.8],
    [201.8, 203.0, 201.0, 202.5]
])


print("OHLC Data Array:\n", ohlc_data_day1)


```


### Array Functions

In [69]:
import numpy as np

# Creating a NumPy array for stock prices
stock_prices = np.array([100.5, 101.3, 102.8, 101.9, 100.7])
print("Stock Prices Array:", stock_prices)

# Creating a 2D NumPy array for OHLC (Open, High, Low, Close) data
ohlc_data_day1 = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])

ohlc_data_Day2 = np.array([
    [200.0, 201.5, 299.5, 201.0],
    [201.0, 202.0, 200.5, 201.8],
    [201.8, 203.0, 201.0, 202.5]
])


print("OHLC Data Array:\n", ohlc_data_day1)


Stock Prices Array: [100.5 101.3 102.8 101.9 100.7]
OHLC Data Array:
 [[100.  101.5  99.5 101. ]
 [101.  102.  100.5 101.8]
 [101.8 103.  101.  102.5]]


In [73]:
# max/min/sum/prod
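# axis=0 reduces down the rows (one result per column); axis=1 reduces across the columns (one result per row)
# with no axis argument, as here, the reduction runs over every element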
np.prod(ohlc_data_day1)


1.167014155587062e+24

In [82]:
# mean/median/std/var (a notebook cell displays only its last expression, here the standard deviation)
np.mean(ohlc_data_day1)
np.median(ohlc_data_day1)
np.std(ohlc_data_day1)

0.9660917830792956

In [83]:
# trigonometric functions (the input is interpreted as radians)
np.sin(ohlc_data_day1)


array([[-0.50636564,  0.82433986, -0.85779535,  0.45202579],
       [ 0.45202579,  0.99482679, -0.03095997,  0.95481453],
       [ 0.95481453,  0.62298863,  0.45202579,  0.92174542]])

In [None]:
# dot product
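
The dot-product cell is empty. A minimal sketch follows, assuming hypothetical position sizes (`quantities`) and an equal-weight vector (`weights`), neither of which appears in the original notes.

```python
import numpy as np

# Hypothetical position sizes and prices for three instruments.
quantities = np.array([10, 5, 20])
prices = np.array([100.5, 101.3, 102.8])

# 1-D dot product: the sum of element-wise products (total position value).
total_value = np.dot(quantities, prices)
print(total_value)

# 2-D case: a (3, 4) matrix times a length-4 vector gives a length-3 vector.
ohlc = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])
weights = np.array([0.25, 0.25, 0.25, 0.25])
avg_price = ohlc @ weights   # the @ operator performs matrix multiplication
print(avg_price)
```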


In [None]:
# log and exponents
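
The log and exponent cell is empty. One common use in trading is log returns; the sketch below reuses the `stock_prices` array from these notes, while `log_returns` and `rebuilt` are illustrative names.

```python
import numpy as np

stock_prices = np.array([100.5, 101.3, 102.8, 101.9, 100.7])

# Log returns: the difference between consecutive log prices.
log_returns = np.diff(np.log(stock_prices))
print(log_returns)

# np.exp inverts np.log: compounding the log returns recovers the price path.
rebuilt = stock_prices[0] * np.exp(np.cumsum(log_returns))
print(rebuilt)

# Element-wise power.
print(np.power(stock_prices, 2))
```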


In [88]:
# round/floor/ceil (only the last expression, np.ceil, is displayed)

np.round(ohlc_data_day1)
np.ceil(ohlc_data_day1)

array([[100., 102., 100., 101.],
       [101., 102., 101., 102.],
       [102., 103., 101., 103.]])

### Indexing and Slicing

In [None]:

import numpy as np

a5 = np.array([100.5, 101.3, 102.8, 101.9, 100.7])
print("Stock Prices Array:", a5)

ohlc_data_day1 = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])

ohlc_data_Day2 = np.array([
    [200.0, 201.5, 299.5, 201.0],
    [201.0, 202.0, 200.5, 201.8],
    [201.8, 203.0, 201.0, 202.5]
])





In [135]:
for i in np.ravel(ohlc_data_day1):
    print(i)

100.0
101.5
99.5
101.0
101.0
102.0
100.5
101.8
101.8
103.0
101.0
102.5


In [127]:
test = np.arange(0,24).reshape(2,3,4)

In [133]:
# block 1, row 1 (kept as a length-1 slice), columns 2 onward
test[1,1:2 , 2:]

array([[18, 19]])

In [110]:
# negative indices count from the end; this slicing works the same on Python lists and 1-D arrays
a1 = [1,2,3,4,5,6,7,8,9,10]

In [114]:
a1[-5:-1]

[6, 7, 8, 9]
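
The same ideas apply to the 2-D OHLC array defined at the start of this section. This is a minimal sketch, assuming the columns are Open, High, Low, Close; the name `ohlc` is shorthand introduced here.

```python
import numpy as np

ohlc = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])

print(ohlc[0, 3])      # single element: row 0, Close column
print(ohlc[1])         # the entire second row
print(ohlc[:, 3])      # every row, Close column only
print(ohlc[:2, 1:3])   # first two rows, High and Low columns
print(ohlc[-1, ::2])   # last row, every second column (Open and Low)
```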

### Iterating
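
This section has no example in the notes. Besides looping over `np.ravel(...)`, an array can be iterated row by row, or element by element with `np.nditer` and `np.ndenumerate`. The sketch below assumes the columns are Open, High, Low, Close; `ohlc` and `flat_values` are illustrative names.

```python
import numpy as np

ohlc = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])

# Iterating over a 2-D array yields one row (a 1-D array) at a time.
for row in ohlc:
    print(row)

# np.nditer visits every element, whatever the number of dimensions.
flat_values = [float(x) for x in np.nditer(ohlc)]
print(flat_values)

# np.ndenumerate also yields the index of each element.
for (i, j), value in np.ndenumerate(ohlc):
    if j == 3:   # column 3 is Close
        print(i, value)
```

Explicit loops are far slower than vectorized operations on large arrays, so they are best kept for cases that cannot be expressed as array operations.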

### Reshaping

In [149]:
# reshape
np.arange(0,32).reshape(2,4,4)

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]]])

In [None]:
# Transpose
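
The transpose cell is empty. A minimal sketch, assuming the columns are Open, High, Low, Close, so that each row of the transposed array holds one price field:

```python
import numpy as np

ohlc = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])

# Transposing swaps the axes: shape (3, 4) becomes (4, 3).
transposed = np.transpose(ohlc)   # equivalent to ohlc.T
print(transposed.shape)

# Row 3 of the transposed array now holds every Close value.
print(transposed[3])
```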


In [150]:
# ravel
ohlc_data_day1.ravel()

array([100. , 101.5,  99.5, 101. , 101. , 102. , 100.5, 101.8, 101.8,
       103. , 101. , 102.5])

### Stacking

In [153]:
# horizontal stacking
np.hstack((ohlc_data_day1,ohlc_data_Day2))

array([[100. , 101.5,  99.5, 101. , 200. , 201.5, 299.5, 201. ],
       [101. , 102. , 100.5, 101.8, 201. , 202. , 200.5, 201.8],
       [101.8, 103. , 101. , 102.5, 201.8, 203. , 201. , 202.5]])

In [154]:
# Vertical stacking
np.vstack((ohlc_data_day1,ohlc_data_Day2))

array([[100. , 101.5,  99.5, 101. ],
       [101. , 102. , 100.5, 101.8],
       [101.8, 103. , 101. , 102.5],
       [200. , 201.5, 299.5, 201. ],
       [201. , 202. , 200.5, 201.8],
       [201.8, 203. , 201. , 202.5]])

### Splitting

In [None]:
# horizontal splitting


In [None]:
# vertical splitting
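
Both splitting cells are empty. Splitting is the inverse of stacking, so one way to sketch it is to stack the two OHLC arrays and split them apart again. The intermediate names (`stacked_h`, `left`, `top`, `open_col`, and so on) are illustrative.

```python
import numpy as np

ohlc_data_day1 = np.array([
    [100.0, 101.5, 99.5, 101.0],
    [101.0, 102.0, 100.5, 101.8],
    [101.8, 103.0, 101.0, 102.5]
])
ohlc_data_Day2 = np.array([
    [200.0, 201.5, 299.5, 201.0],
    [201.0, 202.0, 200.5, 201.8],
    [201.8, 203.0, 201.0, 202.5]
])

# Horizontal splitting: cut a (3, 8) array into two (3, 4) halves by columns.
stacked_h = np.hstack((ohlc_data_day1, ohlc_data_Day2))
left, right = np.hsplit(stacked_h, 2)
print(left.shape, right.shape)

# Vertical splitting: cut a (6, 4) array into two (3, 4) halves by rows.
stacked_v = np.vstack((ohlc_data_day1, ohlc_data_Day2))
top, bottom = np.vsplit(stacked_v, 2)
print(top.shape, bottom.shape)

# A list of indices splits at those positions instead of into equal parts.
open_col, rest = np.hsplit(ohlc_data_day1, [1])
print(open_col.shape, rest.shape)
```

Passing an integer requires the axis length to be divisible by it; otherwise `np.hsplit` and `np.vsplit` raise an error.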