## 📦 What is NumPy?

[NumPy](https://numpy.org/doc/stable/user/whatisnumpy.html) is a fundamental package for scientific computing in Python. It provides:
- Support for large, multi-dimensional arrays and matrices
- A collection of mathematical functions to operate on these arrays
- Internals written in C, making it highly efficient for numerical operations

NumPy is used mainly for numerical work on large, multi-dimensional arrays, and is conventionally imported with `import numpy as np`. Its fundamental object is the N-dimensional array (`ndarray`). Unlike Python lists, NumPy arrays hold elements of a single data type and are designed for efficient numerical operations.
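
To make the point about consistent data types concrete, here is a minimal sketch. The variable names (`ints`, `mixed`, `as_list`) are illustrative only, and the exact integer dtype depends on the platform.

```python
import numpy as np

# A list keeps each element as its own Python object with its own type
as_list = [1, 2.5, 3]
print([type(v).__name__ for v in as_list])  # ['int', 'float', 'int']

# An array has exactly one dtype shared by all elements
ints = np.array([1, 2, 3])
mixed = np.array(as_list)   # the ints are converted so everything is float64

print(ints.dtype)    # an integer dtype (commonly int64; platform-dependent)
print(mixed.dtype)   # float64
print(mixed)
```

Mixing integers and floats does not fail; NumPy picks one dtype that can represent every value.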

In [None]:
import numpy as np

arr = np.array([1, 2, 3])
print(arr)        # [1 2 3]
print(arr.shape)  # (3,)

In [None]:
b = np.array([[1, 2, 3], [4, 5, 6]])
b.shape  # (2, 3): 2 rows, 3 columns

NumPy arrays are "N-dimensional," meaning they can have any number of dimensions: 1D, 2D, 3D, and so on. Multi-dimensional arrays must be regular, however: each row in a 2D array must have the same number of columns, and each sub-array in higher dimensions must be uniform in size. You cannot build a numeric array from uneven (jagged) rows. In NumPy 1.24 and later, attempting to do so raises a `ValueError`; older versions emitted a deprecation warning and fell back to an array of Python objects.
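
As a quick illustration of "any number of dimensions", the sketch below builds 1D, 2D, and 3D arrays and prints `ndim` and `shape` for each. The names `one_d`, `two_d`, and `three_d` are made up for this example.

```python
import numpy as np

one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
three_d = np.zeros((2, 3, 4))   # 2 blocks, each with 3 rows and 4 columns

for name, arr in [("1D", one_d), ("2D", two_d), ("3D", three_d)]:
    # ndim is the number of axes; shape gives the length along each axis
    print(name, "ndim =", arr.ndim, "shape =", arr.shape)
```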

In [None]:
# Valid NumPy 2D array: all rows have the same number of columns
valid_array = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print(valid_array)

In [None]:
# ❌ Invalid ragged array (rows have different lengths)
# NumPy 1.24+ raises "ValueError: setting an array element with a sequence.
# The requested array has an inhomogeneous shape after 1 dimensions..."
try:
    ragged_array = np.array([
        [1, 2, 3],
        [4, 5],
        [6]
    ])
except ValueError as err:
    print("ValueError:", err)

## Why is NumPy fast?
NumPy is fast for three main reasons, all of which come down to avoiding the Python interpreter's per-element overhead and pushing the work into efficient, low-level code.

#### 1. Contiguous, Typed Memory (Homogeneous Arrays)

Python lists:
- Store references to objects
- Can contain mixed types (e.g., `[3, "cat", True]`)
- Require dynamic type resolution at runtime

NumPy arrays:
- Store data in contiguous blocks of memory
- Use fixed, uniform types (like `int32`, `float64`)
- Allow for cache-friendly access and low-level optimization
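
The memory layout described above can be inspected directly. The following is a rough sketch, not a precise memory profile: `values`, `arr`, and `list_bytes` are illustrative names, and the list total counts the list object plus each integer object it points to.

```python
import sys

import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# Every element has the same fixed-size type, stored back to back
print(arr.dtype)                   # int64
print(arr.itemsize)                # 8 (bytes per element)
print(arr.nbytes)                  # 8000 (bytes in the data buffer)
print(arr.flags['C_CONTIGUOUS'])   # True

# The list holds pointers; each int is a separate Python object elsewhere
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
print(list_bytes)                  # noticeably larger than arr.nbytes
```
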
#### 2. Precompiled C/Fortran Backend

NumPy's core is written in C, and for heavy numerical work it delegates to optimized, compiled libraries.
Examples:
- Matrix operations → BLAS
- Linear algebra (solvers, decompositions) → LAPACK
- Fast Fourier transforms → pocketfft, which ships with NumPy

BLAS and LAPACK implementations have been tuned over decades for high-performance computing, and Python simply calls them through a clean interface.
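
As a small sketch of what "Python simply calls them" looks like in practice, the example below solves a 2×2 linear system. `np.linalg.solve` hands the work to a LAPACK routine, and `@` on float arrays typically goes through BLAS. The matrix `A` and vector `b` are made-up values, and which BLAS/LAPACK implementation is used depends on how your NumPy was built.

```python
import numpy as np

# Solve the system:  3x + 1y = 9
#                    1x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)      # one Python call, compiled code does the work
print(x)                       # approximately [2. 3.]

# Check the answer with a matrix-vector product
print(np.allclose(A @ x, b))   # True
```

Calling `np.show_config()` prints which BLAS/LAPACK libraries a given installation is linked against.
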
#### 3. Vectorization (No Python Loops)

Vectorization is a programming technique in which you apply an operation to a whole array at once rather than writing an explicit loop over its elements. Instead of running Python's slow, interpreted `for` loops, NumPy performs the operation using:
- C or Fortran-level loops under the hood
- Single function calls that process data in bulk

This avoids Python's loop overhead and type checking on every iteration.
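
Before timing anything, it helps to see that the loop and the vectorized form compute the same thing. This is a minimal sketch with made-up data (`prices`, `quantities`).

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

# Explicit Python loop: one multiply-and-add per iteration
loop_total = 0.0
for p, q in zip(prices, quantities):
    loop_total += p * q

# Vectorized: one element-wise multiply and one sum, both run in compiled code
vec_total = np.sum(prices * quantities)

print(loop_total, vec_total)   # both are 140.0
```
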

In [None]:
#Non-vectorized Code (Using an Explicit Loop)
import time
import random

# Create two lists of 1 million random numbers.
size = 10**6
a = [random.random() for _ in range(size)]
b = [random.random() for _ in range(size)]

# Time the explicit loop multiplication.
# time.perf_counter() is a high-resolution clock intended for measuring intervals.
start_time = time.perf_counter()
c = []
for i in range(size):
    c.append(a[i] * b[i])
end_time = time.perf_counter()

print("Time with explicit loop:", end_time - start_time, "seconds")


In [None]:
# Vectorized Code (Using NumPy)
import time
import numpy as np

# Create two NumPy arrays of 1 million random numbers.
size = 10**6  # same size as the loop version, repeated so this cell runs on its own
a = np.random.random(size)
b = np.random.random(size)

# Time the vectorized multiplication
start_time = time.perf_counter()
c = a * b
end_time = time.perf_counter()

print("Time with vectorized operation:", end_time - start_time, "seconds")
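
A single start/stop measurement can be noisy. The sketch below is one possible way to make the comparison more repeatable with the standard-library `timeit` module, and it also checks that both approaches give the same result. The function names `loop_multiply` and `vectorized_multiply` are hypothetical, the loop is written as a list comprehension rather than `append`, and the array size is reduced so the repeated runs stay quick. Actual timings depend on your machine.

```python
import random
import timeit

import numpy as np

size = 10**5  # smaller than the cells above so repeated runs stay quick
a_list = [random.random() for _ in range(size)]
b_list = [random.random() for _ in range(size)]
a_arr = np.array(a_list)
b_arr = np.array(b_list)

def loop_multiply():
    # Pure-Python element-by-element multiplication
    return [x * y for x, y in zip(a_list, b_list)]

def vectorized_multiply():
    # Single NumPy call; the loop runs in compiled code
    return a_arr * b_arr

# Both approaches should produce the same numbers
print(np.allclose(loop_multiply(), vectorized_multiply()))

# Take the best of 5 repeats, each running the function 10 times
loop_time = min(timeit.repeat(loop_multiply, number=10, repeat=5)) / 10
vec_time = min(timeit.repeat(vectorized_multiply, number=10, repeat=5)) / 10

print(f"loop:       {loop_time:.6f} s per call")
print(f"vectorized: {vec_time:.6f} s per call")
```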