# Introduction to NumPy 

NumPy (Numerical Python) is a powerful library for numerical computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a vast collection of mathematical functions that operate on these arrays efficiently.

NumPy is an essential tool to help you perform fast numerical computations.

# Why NumPy?

**Choosing NumPy over ordinary Python lists offers several key advantages:**

- **Homogeneous Data Storage:**  
  NumPy arrays store elements of the same data type in contiguous memory blocks. This uniformity allows for efficient memory usage and faster access compared to Python lists, which store pointers to heterogeneous objects.

- **Vectorized Operations:**  
  NumPy leverages vectorized operations written in C. Instead of looping over elements in Python (which is slower due to interpreter overhead), operations are applied to entire arrays at once, significantly boosting performance.

- **Optimized Low-Level Implementation:**  
  Under the hood, NumPy's linear algebra routines (such as matrix multiplication) rely on optimized libraries such as BLAS and LAPACK, while element-wise operations run in compiled C loops. This means that mathematical and linear algebra operations run at speeds close to those of compiled languages.

- **Memory Efficiency:**  
  Since NumPy arrays use a fixed, compact data type for all elements, they are more memory efficient than lists, which store full Python objects with additional overhead.

- **Convenient and Expressive:**  
  NumPy's syntax is concise and allows for complex operations (like slicing, reshaping, and broadcasting) to be performed with minimal code, making it both powerful and easy to use.

Overall, the combination of these features makes NumPy a go-to tool for numerical and scientific computing in Python, particularly when working with large datasets where performance and memory efficiency are critical.
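
To make the memory claim concrete, the following sketch compares the footprint of a Python list of integers with a NumPy array holding the same values. The names `py_list` and `np_arr` are illustrative, and the exact list size depends on the Python version and platform, so treat the printed numbers as indicative only.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list holds one pointer per element, and each element is a separate int object.
# This sum is an approximation of the total footprint.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# A NumPy array stores the raw values back to back: 8 bytes per int64 element.
array_bytes = np_arr.nbytes

print("List (container + int objects):", list_bytes, "bytes")
print("NumPy array data:", array_bytes, "bytes")
```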

In [1]:
import random
import time
import numpy as np

def create_random_matrix_list(m, n):
    """Creates an m x n matrix as a list of lists with random float values."""
    return [[random.random() for _ in range(n)] for _ in range(m)]

def matrix_mult_list(A, B):
    """
    Multiplies two matrices A and B using nested loops.
    A should be of shape (m, n) and B of shape (n, p).
    Returns the result as a list of lists of shape (m, p).
    """
    m = len(A)
    n = len(A[0])
    p = len(B[0])

    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            s = 0
            for k in range(n):
                s += A[i][k] * B[k][j]
            C[i][j] = s
    return C

m, n, p = 512, 512, 512

# Time the pure-Python implementation (perf_counter is a high-resolution timer meant for benchmarking)
A_list = create_random_matrix_list(m, n)
B_list = create_random_matrix_list(n, p)
start_time_list = time.perf_counter()
C_list = matrix_mult_list(A_list, B_list)
end_time_list = time.perf_counter()
print("List matrix multiplication took {:.3f} seconds".format(end_time_list - start_time_list))

# Time the NumPy implementation on matrices of the same shape
A_np = np.random.rand(m, n)
B_np = np.random.rand(n, p)
start_time_np = time.perf_counter()
C_np = np.dot(A_np, B_np)
end_time_np = time.perf_counter()
print("NumPy matrix multiplication took {:.6f} seconds".format(end_time_np - start_time_np))

List matrix multiplication took 10.901 seconds
NumPy matrix multiplication took 0.003024 seconds


# Creating a Virtual Environment

A **virtual environment** is an isolated workspace for your Python project. It allows you to install and manage packages independently from the global Python installation. This isolation prevents dependency conflicts between projects and helps maintain a consistent development environment.

---

## Steps to Create a Virtual Environment

### Windows

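1. **Open Command Prompt or PowerShell.**
2. **Navigate to your project directory:**
   ```bash
   cd path\to\your\project
   ```
3. **Create the virtual environment using the built-in venv module:**
   ```bash
   python -m venv env
   ```
4. **Activate the virtual environment, and deactivate it when you are done:**
   ```bash
   env\Scripts\activate

   deactivate
   ```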

### Linux and macOS

1. **Open Terminal**
2. **Navigate to your project directory**
   ```bash
      cd /path/to/your/project
   ```
3. **Create the virtual environment using the built-in venv module**
   ```bash
      python3 -m venv env
   ```
4. **Activate the virtual environment**
   ```bash
      source env/bin/activate
      
      ...

      deactivate
   ```
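
Once the environment is activated, a quick check from Python can confirm which interpreter is in use. This is a minimal sketch that assumes an environment created with the built-in `venv` module; the helper name `in_virtual_environment` is hypothetical.

```python
import sys

def in_virtual_environment():
    """Return True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```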

## Installing NumPy

If you don't already have NumPy installed, you can install it using pip. Run the following command in a notebook cell; in a terminal, drop the leading `!`:

In [None]:
!pip install numpy  

## Importing NumPy

After installation, import NumPy. By convention, we import it as `np` for brevity.

In [2]:
import numpy as np 
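
A simple way to confirm that the import picked up the expected installation is to print the version string. This is only a sanity check, not a required step.

```python
import numpy as np

# __version__ is a plain string such as "1.26.4"; the value depends on your installation.
print("NumPy version:", np.__version__)
```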

## Creating Arrays

NumPy arrays are similar to Python lists, but they allow for efficient computation. Here we create a one-dimensional array from a Python list.

In [3]:
# Creating a 1D array from a Python list
a = np.array([1, 2, 3, 4, 5])
print("1D array:", a)

1D array: [1 2 3 4 5]


In [8]:
zeros_arr = np.zeros((2, 3))
ones_arr = np.ones((2, 3))
# np.empty allocates memory without initializing it, so the contents are
# arbitrary leftover values, not random samples (use np.random.rand for those)
uninit_arr = np.empty((2, 3))

print("Zeros array:\n", zeros_arr)
print("Ones array:\n", ones_arr)
print("Uninitialized array:\n", uninit_arr)

Zeros array:
 [[0. 0. 0.]
 [0. 0. 0.]]
Ones array:
 [[1. 1. 1.]
 [1. 1. 1.]]
Uninitialized array:
 [[3.53627e-319 0.00000e+000 2.47033e-322]
 [2.96439e-322 3.95253e-322 4.44659e-322]]
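
Beyond `np.zeros` and `np.ones`, a few other constructors follow the same pattern. The sketch below is illustrative and its variable names are hypothetical. Note that `np.random.rand` is what draws genuinely random values, whereas an array allocated without initialization simply exposes whatever bytes were already in memory.

```python
import numpy as np

filled = np.full((2, 3), 7.0)        # every element set to 7.0
identity = np.eye(3)                 # 3x3 identity matrix
uniform = np.random.rand(2, 3)       # uniform random samples from [0, 1)
like_filled = np.zeros_like(filled)  # zeros with the same shape and dtype as `filled`

print("Filled:\n", filled)
print("Identity:\n", identity)
print("Uniform random:\n", uniform)
```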


## Data Types

Every NumPy array has a data type (dtype) that specifies the type of its elements. Some common data types include:

* Integers: np.int32, np.int64
* Floats: np.float32, np.float64
* Complex Numbers: np.complex64, np.complex128
* Booleans: np.bool_
* Strings: np.str_ 
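
The number in most dtype names is the element width in bits. As a rough illustration, the sketch below prints how many bytes a single element of each numeric dtype occupies; the `sizes` dictionary is just a convenience for this example.

```python
import numpy as np

dtypes = [np.int32, np.int64, np.float32, np.float64,
          np.complex64, np.complex128, np.bool_]

# Map each dtype name to the number of bytes one element occupies.
sizes = {np.dtype(dt).name: np.dtype(dt).itemsize for dt in dtypes}

for name, size in sizes.items():
    print(f"{name:>10}: {size} byte(s) per element")
```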

In [9]:
# Creating arrays with specified data types
arr_int = np.array([1, 2, 3, 4], dtype=np.int32)
arr_float = np.array([1, 2, 3, 4], dtype=np.float64)
print("Integer array:", arr_int, "with dtype:", arr_int.dtype)
print("Float array:", arr_float, "with dtype:", arr_float.dtype)

Integer array: [1 2 3 4] with dtype: int32
Float array: [1. 2. 3. 4.] with dtype: float64


In [10]:
# Converting an integer array to a float array
arr_float_converted = arr_int.astype(np.float64)
print("Converted to float:", arr_float_converted, "with dtype:", arr_float_converted.dtype)

Converted to float: [1. 2. 3. 4.] with dtype: float64
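
Converting in the other direction, from float to integer, discards information. The sketch below (with made-up sample values) shows that `astype` drops the fractional part rather than rounding, and one way to round first if that is what you want.

```python
import numpy as np

values = np.array([1.7, -1.7, 2.5])

# astype drops the fractional part (truncation toward zero) and returns a new array.
truncated = values.astype(np.int64)

# np.rint rounds to the nearest integer (ties go to the even neighbor) before converting.
rounded = np.rint(values).astype(np.int64)

print("Truncated:", truncated)
print("Rounded:  ", rounded)
```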


## Creating Arrays with `arange` and `linspace`

NumPy provides several functions to create arrays with evenly spaced values. Use `np.arange` for step-based sequences and `np.linspace` for a specified number of points between two values.

In [4]:
# Using arange to create an array of values from 0 to 9
b = np.arange(10)
print("Array using arange:", b)

# Using linspace to create an array of 5 values between 0 and 1
c = np.linspace(0, 1, 5)
print("Array using linspace:", c)

Array using arange: [0 1 2 3 4 5 6 7 8 9]
Array using linspace: [0.   0.25 0.5  0.75 1.  ]
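
Both functions accept further arguments that are easy to overlook. The following sketch shows a fractional step with `np.arange`, and the `endpoint` and `retstep` options of `np.linspace`; the variable names are illustrative.

```python
import numpy as np

# arange with start, stop, and step: the stop value 1 is excluded.
stepped = np.arange(0, 1, 0.25)

# linspace normally includes the end value; endpoint=False leaves it out.
half_open = np.linspace(0, 1, 5, endpoint=False)

# retstep=True also returns the spacing between consecutive points.
points, step = np.linspace(0, 1, 5, retstep=True)

print("arange with step 0.25:", stepped)
print("linspace without endpoint:", half_open)
print("spacing used by linspace:", step)
```

For non-integer steps, `np.linspace` is usually the safer choice, because floating-point rounding can make the length of an `np.arange` result hard to predict.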


## Array Attributes

Each NumPy array has attributes that describe its structure and properties such as shape, size, and data type.

In [5]:
print("Shape of a:", a.shape)  # Dimensions of the array
print("Size of a:", a.size)    # Total number of elements
print("Data type of a:", a.dtype)  # Data type of array elements

Shape of a: (5,)
Size of a: 5
Data type of a: int64
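
A few more attributes are useful for multi-dimensional arrays. The sketch below uses a hypothetical 3x4 array named `grid`. Note also that the default integer dtype shown above can differ between platforms and NumPy versions.

```python
import numpy as np

grid = np.zeros((3, 4), dtype=np.float64)

print("ndim:", grid.ndim)          # number of dimensions
print("shape:", grid.shape)        # size along each dimension
print("size:", grid.size)          # total number of elements
print("itemsize:", grid.itemsize)  # bytes per element
print("nbytes:", grid.nbytes)      # size * itemsize
```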


## Indexing and Slicing

Access individual elements or slices of an array using indexing and slicing, much like with Python lists.

In [6]:
# Accessing elements
print("First element of a:", a[0])

# Slicing: get elements from index 1 to 3 (4 is excluded)
print("Slice of a [1:4]:", a[1:4])

First element of a: 1
Slice of a [1:4]: [2 3 4]


In [None]:
arr2d = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

# Access the element at row 1, column 2 (remember indexing starts at 0)
print("Element at row 1, column 2:", arr2d[1, 2]) 

# Get the first two rows and all columns
print("First two rows:\n", arr2d[0:2, :])

# Get all rows and the last two columns
print("Last two columns:\n", arr2d[:, 1:])

# Get a subarray (bottom-right 2x2 subarray)
print("Bottom-right 2x2 subarray:\n", arr2d[1:3, 1:3])

Element at row 1, column 2: 60
First two rows:
 [[10 20 30]
 [40 50 60]]
Last two columns:
 [[20 30]
 [50 60]
 [80 90]]
Bottom-right 2x2 subarray:
 [[50 60]
 [80 90]]
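
One difference from Python lists is worth showing: a basic slice of a NumPy array is a view onto the same memory, not a copy. The sketch below uses illustrative names (`base`, `view`, `independent`) to demonstrate this, along with negative and stepped indexing.

```python
import numpy as np

base = np.arange(5)             # [0 1 2 3 4]

view = base[1:4]                # basic slice: shares memory with `base`
view[0] = 99                    # this also changes base[1]

independent = base[1:4].copy()  # explicit copy: owns its own memory
independent[1] = -1             # does not touch `base`

print("base after edits:", base)
print("last element:", base[-1])
print("every second element:", base[::2])
```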


## Mathematical Operations
### Element-wise
Perform element-wise arithmetic operations on arrays. NumPy also provides many universal functions (ufuncs) that operate on arrays.

In [13]:
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])
c = np.array([1, 2, 3, 0])

In [20]:
# Addition: adds corresponding elements
print("Addition:", a + b)        

# Subtraction: subtracts corresponding elements
print("\nSubtraction:", a - b)     

# Multiplication: multiplies corresponding elements
print("\nMultiplication:", a * b)   

# Division: divides corresponding elements
print("\nDivision:", b / a)         

# Division by zero: c contains a 0, so the last result is inf and NumPy emits a RuntimeWarning
print("\nDivision:", a / c)

Addition: [11 22 33 44]

Subtraction: [ -9 -18 -27 -36]

Multiplication: [ 10  40  90 160]

Division: [10. 10. 10. 10.]

Division: [ 1.  1.  1. inf]
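
The `inf` in the last result comes from dividing by zero. Two common ways to handle this are sketched below: suppressing the warning locally with `np.errstate`, or dividing only where the denominator is non-zero. The fill value `0.0` for the skipped positions is an arbitrary choice for this example.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
c = np.array([1, 2, 3, 0])

# Option 1: silence the divide-by-zero warning inside this block and keep the inf result.
with np.errstate(divide="ignore"):
    with_inf = a / c

# Option 2: divide only where c is non-zero; other positions keep the fill value 0.0.
safe = np.divide(a, c, out=np.zeros(a.shape, dtype=float), where=(c != 0))

print("With inf:", with_inf)
print("Safe division:", safe)
```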



In [19]:
# Advanced Element-wise operations

print("Exponentiation:", np.power(a, 2))  

print("\nSquare Root:", np.sqrt(a))

print("\nNatural Logarithm:", np.log(a))
print("\nBase-10 Logarithm:", np.log10(a))

print("\nSine:", np.sin(a))
print("\nCosine:", np.cos(a))

Exponentiation: [ 1  4  9 16]

Square Root: [1.         1.41421356 1.73205081 2.        ]

Natural Logarithm: [0.         0.69314718 1.09861229 1.38629436]

Base-10 Logarithm: [0.         0.30103    0.47712125 0.60205999]

Sine: [ 0.84147098  0.90929743  0.14112001 -0.7568025 ]

Cosine: [ 0.54030231 -0.41614684 -0.9899925  -0.65364362]


In [21]:
# Comparison operation returns a boolean array
print("Elements greater than 2:", a > 2)

Elements greater than 2: [False False  True  True]
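
A boolean array like this is most useful as a mask. The sketch below shows three typical uses: selecting, counting, and conditionally replacing elements.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
mask = a > 2

selected = a[mask]               # keep only the elements where the mask is True
count = np.count_nonzero(mask)   # how many elements satisfy the condition
replaced = np.where(mask, a, 0)  # keep matching elements, replace the rest with 0

print("Selected:", selected)
print("Count:", count)
print("Replaced:", replaced)
```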


# Broadcasting in NumPy

Broadcasting is a mechanism that allows NumPy to perform arithmetic operations on arrays of different shapes. Instead of requiring arrays to have identical shapes, NumPy "stretches" or "replicates" the smaller array along the dimensions where its size is 1 (or missing) to match the shape of the larger array. This enables element-wise operations without the need for manual reshaping or copying data.

---

## How Broadcasting Works

1. **Shape Alignment:**  
   When performing operations on two arrays, NumPy compares their shapes starting from the trailing (rightmost) dimensions. If the arrays have a different number of dimensions, the shape of the array with fewer dimensions is virtually "expanded" by prepending dimensions of size 1 until both shapes have the same length.

2. **Compatibility Rules:**  
   For two dimensions to be compatible:
   - They are equal, **or**
   - One of them is 1.
   
   If these conditions are met, NumPy proceeds with the operation by virtually replicating the elements along the dimension with size 1.

3. **Result Shape:**  
   The resulting array will have a shape where each dimension is the maximum size along that dimension from the two input arrays. Importantly, broadcasting does not physically duplicate data; it only simulates the expansion of the smaller array.
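
The three steps above can be sketched as a small pure-Python function that computes the result shape for two input shapes. The helper `broadcast_shape` is hypothetical and written only to mirror the rules; it is not how NumPy implements broadcasting internally.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape for two shapes, or raise ValueError."""
    ndim = max(len(shape_a), len(shape_b))

    # Step 1: left-pad the shorter shape with 1s so both have the same length.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        # Step 2: dimensions must be equal, or one of them must be 1.
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"incompatible dimensions: {dim_a} and {dim_b}")
        # Step 3: the size-1 dimension stretches to match the other one.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))
print(broadcast_shape((4, 1), (1, 3)))
```

The result can be cross-checked against NumPy itself, for example with `np.broadcast(np.empty((4, 3)), np.empty((3,))).shape`.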

In [23]:
A = np.array([
    [0, 0, 0],
    [10, 10, 10],
    [20, 20, 20],
    [30, 30, 30]
])                      # shape: (4, 3)

B = np.array([1, 2, 3]) # shape: (3,)

# Broadcasting: B is treated as shape (1, 3) and then broadcast to (4, 3)
result = A + B

print("2D Array A:\n", A)
print("\n1D Array B:", B)
print("\nResult of A + B:\n", result)

2D Array A:
 [[ 0  0  0]
 [10 10 10]
 [20 20 20]
 [30 30 30]]

1D Array B: [1 2 3]

Result of A + B:
 [[ 1  2  3]
 [11 12 13]
 [21 22 23]
 [31 32 33]]
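
Broadcasting can also stretch both operands at once, and it fails when the trailing dimensions are incompatible. The following sketch, with illustrative names `col` and `row`, shows one case of each.

```python
import numpy as np

col = np.array([0, 10, 20, 30]).reshape(4, 1)  # shape (4, 1)
row = np.array([1, 2, 3])                      # shape (3,)

# (4, 1) and (3,) are both stretched to (4, 3).
grid = col + row
print("Shape of col + row:", grid.shape)
print(grid)

try:
    # Trailing dimensions 3 and 4 differ and neither is 1, so this fails.
    np.ones((4, 3)) + np.ones(4)
except ValueError as e:
    print("Error:", e)
```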


## Reshaping Arrays

Reshape an array without changing its data. This is useful when you need to convert a 1D array into a 2D array (or vice versa) for computations.

In [None]:
d = np.arange(6)

# Reshape the array to a 2x3 matrix
d_reshaped = d.reshape(2, 3)
print("Reshaped array (2x3):\n", d_reshaped)

# Flatten the array back to 1D
d_flat = d_reshaped.flatten()
print("Flattened array:", d_flat)

Reshaped array (2x3):
 [[0 1 2]
 [3 4 5]]
Flattened array: [0 1 2 3 4 5]


In [None]:
d2 = np.arange(7)

try:
    d2_reshaped = d2.reshape(2, 3)
    print("Reshaped array (2x3):\n", d2_reshaped)
except ValueError as e:
    print("Error:", e)

# Rule of thumb: to reshape an array to (M, N), M * N must equal the number of elements in the array

Error: cannot reshape array of size 7 into shape (2,3)
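
Two related details are sketched below: passing `-1` lets NumPy infer one dimension from the element count, and `ravel` differs from `flatten` in that it returns a view when it can, while `flatten` always copies.

```python
import numpy as np

d = np.arange(6)

# -1 asks NumPy to infer this dimension: 6 elements / 2 rows = 3 columns.
inferred = d.reshape(2, -1)

raveled = inferred.ravel()      # a view onto the same data when possible
flattened = inferred.flatten()  # always a fresh copy

print("Inferred shape:", inferred.shape)
print("ravel shares memory with d:", np.shares_memory(d, raveled))
print("flatten shares memory with d:", np.shares_memory(d, flattened))
```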


## Stacking and Splitting Arrays

Combine arrays using stacking functions or split arrays into smaller ones.

In [27]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Stack the arrays vertically
vertical_stack = np.vstack((arr1, arr2))
print("Vertical Stack:\n", vertical_stack)

# Stack the arrays horizontally
horizontal_stack = np.hstack((arr1, arr2))
print("Horizontal Stack:", horizontal_stack)

# Splitting an array into two equal parts
split_arr = np.split(np.arange(10), 2)
print("Split arrays:", split_arr)

Vertical Stack:
 [[1 2 3]
 [4 5 6]]
Horizontal Stack: [1 2 3 4 5 6]
Split arrays: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
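
A few related functions cover cases the ones above do not. The sketch below shows `np.concatenate`, `np.column_stack`, and `np.array_split`, which tolerates a split that does not divide evenly (where `np.split` would raise an error).

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

joined = np.concatenate((arr1, arr2))    # same result as hstack for 1D inputs
columns = np.column_stack((arr1, arr2))  # pairs up elements into a (3, 2) array

# 10 elements cannot be split into 3 equal parts; array_split allows unequal pieces.
parts = np.array_split(np.arange(10), 3)

print("Concatenated:", joined)
print("Column stack:\n", columns)
print("Part lengths:", [len(p) for p in parts])
```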


## Random Numbers

Generate random numbers and arrays using NumPy's random module. You can also set a seed for reproducibility.

In [None]:
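# Generate a random array of 5 numbers in the interval [0, 1)
random_array = np.random.random(5)
print("Random array:", random_array)

# Generate an array of 5 random integers from 0 to 9 (the upper bound 10 is excluded)
random_int_array = np.random.randint(0, 10, size=5)
print("Random integer array:", random_int_array)

# Fix the seed (42 reproduces the recorded output) so every run yields the same values
np.random.seed(42)
random_array_seeded = np.random.random(5)
print("Random array with fixed seed:", random_array_seeded)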

Random array: [0.19400809 0.61943459 0.52008566 0.2260407  0.31405049]
Random integer array: [1 9 2 5 3]
Random array with fixed seed: [0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]
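
NumPy also offers a newer `Generator` interface, created with `np.random.default_rng`, which keeps the random state in an object instead of a global. The sketch below is a minimal illustration; the seed value 42 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

floats = rng.random(5)                # uniform samples from [0, 1)
ints = rng.integers(0, 10, size=5)    # integers from 0 to 9 (10 is excluded)

# A second generator built from the same seed reproduces the same stream.
rng_again = np.random.default_rng(seed=42)
same = np.array_equal(floats, rng_again.random(5))

print("Floats:", floats)
print("Integers:", ints)
print("Reproducible:", same)
```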


## Matrix Multiplication

Perform matrix multiplication using `np.dot` or the `@` operator. This is essential for many linear algebra applications.

In [29]:
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])

# Matrix multiplication using np.dot
product = np.dot(mat1, mat2)
print("Matrix product using np.dot:\n", product)

# Alternatively, using the @ operator (Python 3.5+)
product_operator = mat1 @ mat2
print("Matrix product using @ operator:\n", product_operator)

Matrix product using np.dot:
 [[19 22]
 [43 50]]
Matrix product using @ operator:
 [[19 22]
 [43 50]]
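
A common source of confusion is that `*` on arrays is element-wise, not a matrix product. The sketch below contrasts the two on the same matrices and adds a matrix-vector product; `vec` is an illustrative name.

```python
import numpy as np

mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
vec = np.array([1, 1])

elementwise = mat1 * mat2  # multiplies matching entries, no summation
matrix_vec = mat1 @ vec    # matrix-vector product, result has shape (2,)

print("Element-wise product:\n", elementwise)
print("Matrix-vector product:", matrix_vec)
```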


# Array Aggregations in NumPy

NumPy provides a rich set of aggregation functions that allow you to compute summary statistics and reduce the dimensions of an array. These functions help you quickly obtain useful information about your data.

---

## Common Aggregation Functions

- **`np.sum()`**: Computes the sum of all elements in an array or along a specified axis.
- **`np.mean()`**: Calculates the arithmetic mean of the array elements.
- **`np.max()` / `np.min()`**: Finds the maximum and minimum values in the array.
- **`np.std()` / `np.var()`**: Compute the standard deviation and variance, respectively.
- **`np.argmax()` / `np.argmin()`**: Return the indices of the maximum and minimum values in the array.

---

## How They Work

- **Global Aggregation:**  
  When you don't specify an axis, these functions aggregate over all the elements in the array and return a single scalar value.
  
- **Axis-Specific Aggregation:**  
  By providing an `axis` argument, you can perform the aggregation along a specific dimension:
  - `axis=0`: Aggregates down the rows, collapsing axis 0 and producing one result per column.
  - `axis=1`: Aggregates across the columns, collapsing axis 1 and producing one result per row.

In [30]:
arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# Global Aggregations
total_sum = np.sum(arr)
mean_value = np.mean(arr)
max_value = np.max(arr)
min_value = np.min(arr)

print("Total Sum:", total_sum)
print("Mean Value:", mean_value)
print("Max Value:", max_value)
print("Min Value:", min_value)

# Aggregations along specific axes
sum_columns = np.sum(arr, axis=0)  # Sum over rows for each column
sum_rows = np.sum(arr, axis=1)     # Sum over columns for each row

print("Sum along columns:", sum_columns)
print("Sum along rows:", sum_rows)

# Finding indices of maximum values
argmax_flat = np.argmax(arr)       # Index in the flattened array
argmax_columns = np.argmax(arr, axis=0)  # Indices of max values along columns

print("Index of maximum (flattened):", argmax_flat)
print("Indices of maximum along columns:", argmax_columns)

Total Sum: 45
Mean Value: 5.0
Max Value: 9
Min Value: 1
Sum along columns: [12 15 18]
Sum along rows: [ 6 15 24]
Index of maximum (flattened): 8
Indices of maximum along columns: [2 2 2]
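
Two follow-ups often come up with axis-wise aggregation. The sketch below shows `keepdims=True`, which keeps the reduced axis with size 1 so the result broadcasts back against the original array, and `np.unravel_index`, which converts a flattened index into row and column coordinates.

```python
import numpy as np

arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# keepdims=True gives shape (3, 1) instead of (3,), so it lines up with each row.
row_means = arr.mean(axis=1, keepdims=True)
centered = arr - row_means  # subtracts each row's mean from that row

# Convert the flattened argmax into (row, column) coordinates.
flat_index = np.argmax(arr)
row, col = np.unravel_index(flat_index, arr.shape)

print("Row means shape:", row_means.shape)
print("Centered rows:\n", centered)
print("Maximum is at row", row, "column", col)
```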


## More Info.

Beyond these basics, NumPy offers many advanced features such as advanced indexing, masked arrays, and interoperability with other libraries like pandas and SciPy. For further learning, refer to the [official NumPy documentation](https://numpy.org/doc/stable/reference/index.html#reference).