## NumPy-1: Introduction to NumPy

### Motivation

- Must-have for **AI/ML coding rounds** in product-based companies  
- Essential for working on **machine learning**, **data science**, and **deep learning** projects  
- Required when writing **custom loss functions** or building ML models **from scratch** (without libraries like Scikit-learn or TensorFlow)


### Introduction to NumPy

- NumPy (Numerical Python) is the core Python library for scientific computing.
- It provides a collection of tools and techniques for solving mathematical models of problems in science and engineering.
- Its central tool is a high-performance multidimensional array object (`ndarray`), a data structure for efficient computation on arrays and matrices.
- To work with these arrays, NumPy provides a large set of high-level mathematical functions that operate on them.
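
A minimal sketch of the two ideas above: the multidimensional array object and the mathematical functions that operate on it. The sample values and variable names are illustrative only.

```python
import numpy as np

# A 2D array (matrix) built from a nested Python list; values are illustrative.
matrix = np.array([[1.0, 4.0, 9.0],
                   [16.0, 25.0, 36.0]])

# High-level mathematical functions apply to every element at once.
roots = np.sqrt(matrix)           # element-wise square root
col_means = matrix.mean(axis=0)   # mean of each column
total = matrix.sum()              # sum of all six elements

print(matrix.shape)   # (2, 3)
print(roots)
print(col_means)
print(total)          # 91.0
```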



### Why Learn NumPy?

#### 1. Model Inputs Need NumPy Arrays
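- ML libraries like **Scikit-learn**, **TensorFlow**, and **PyTorch** accept NumPy arrays as input or convert to and from them easily (TensorFlow and PyTorch use their own tensor types internally).
- Pandas is built on top of NumPy, and OpenCV's Python bindings represent images as NumPy arrays.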
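
As a rough sketch of what "data in array form" looks like, the snippet below converts plain Python lists into a feature matrix and a label vector. The data and the names `raw_features`, `raw_labels`, `X`, and `y` are hypothetical; the `(n_samples, n_features)` layout is the convention Scikit-learn estimators use.

```python
import numpy as np

# Hypothetical raw data: each inner list is one sample with two features.
raw_features = [[5.1, 3.5], [4.9, 3.0], [6.2, 3.4]]
raw_labels = [0, 0, 1]

# Convert to arrays with explicit dtypes.
X = np.asarray(raw_features, dtype=np.float64)   # shape (n_samples, n_features)
y = np.asarray(raw_labels, dtype=np.int64)       # shape (n_samples,)

print(X.shape, X.dtype)   # (3, 2) float64
print(y.shape, y.dtype)   # (3,) int64
```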

#### 2. NumPy is Much Faster for Large Numerical Computation
- NumPy is written in **optimized C code** under the hood.
- It supports:
  - Matrix multiplication
  - Dot products
  - Broadcasting (auto-expanding arrays in operations)
- Vectorized NumPy operations are typically much faster than equivalent native Python loops over lists.
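
The three operations listed above can be sketched as follows, with a pure-Python loop version of the dot product for comparison. The arrays are small illustrative values; the speed advantage only becomes visible on much larger inputs.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Dot product of two vectors: 1*4 + 2*5 + 3*6 = 32
dot = np.dot(a, b)

# Matrix multiplication with the @ operator.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])
product = A @ B          # values: [[19, 22], [43, 50]]

# Broadcasting: the 1D row is stretched across both rows of A.
row = np.array([10.0, 20.0])
shifted = A + row        # values: [[11, 22], [13, 24]]

# The same dot product as an explicit Python loop, element by element.
loop_dot = sum(x * y for x, y in zip(a.tolist(), b.tolist()))

print(dot, loop_dot)
print(product)
print(shifted)
```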

#### 3. NumPy Handles N-Dimensional Arrays
- NumPy can create and operate on arrays of any dimension (1D, 2D, 3D, ..., nD).
- Useful for:
  - Scientific computing
  - Deep learning tensors
  - Image processing
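
A short sketch of arrays with different numbers of dimensions. The shapes are assumptions chosen for illustration (for example, a 32x32 RGB image), not values from any particular dataset.

```python
import numpy as np

vector = np.zeros(4)                # 1D: e.g. a feature vector
matrix = np.zeros((3, 4))           # 2D: e.g. 3 samples x 4 features
image = np.zeros((32, 32, 3))       # 3D: height x width x colour channels
batch = np.zeros((8, 32, 32, 3))    # 4D: a batch of 8 such images

for name, arr in [("vector", vector), ("matrix", matrix),
                  ("image", image), ("batch", batch)]:
    print(name, arr.ndim, arr.shape)

# The same operation works on any dimensionality:
# averaging over the last axis collapses the colour channels.
gray = image.mean(axis=-1)
print(gray.shape)   # (32, 32)
```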

#### 4. You Need NumPy to Build Custom Algorithms
- For hands-on ML or from-scratch models, NumPy is essential.
- You'll use it for:
  - Linear regression
  - Gradient descent
  - Backpropagation
  - Activation functions
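
To make the list above concrete, here is a minimal sketch of linear regression trained with gradient descent using only NumPy. The synthetic data (`y = 3x + 2`), the learning rate, and the iteration count are assumptions picked for illustration, not recommended settings.

```python
import numpy as np

# Synthetic, noise-free data (an assumption for illustration): y = 3x + 2.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x + 2.0

w, b = 0.0, 0.0          # parameters to learn
learning_rate = 0.1      # illustrative value

for _ in range(2000):
    y_pred = w * x + b                   # forward pass, vectorized over all samples
    error = y_pred - y
    loss = np.mean(error ** 2)           # mean squared error
    grad_w = 2.0 * np.mean(error * x)    # d(loss)/dw
    grad_b = 2.0 * np.mean(error)        # d(loss)/db
    w -= learning_rate * grad_w          # gradient descent update
    b -= learning_rate * grad_b

# The learned parameters should end up close to 3 and 2.
print(w, b, loss)
```

Each iteration processes all 100 samples without an explicit Python loop over the data, which is the pattern from-scratch ML code relies on.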



### NumPy vs Pandas
| Feature     | NumPy                         | Pandas                          |
|-------------|-------------------------------|----------------------------------|
| Focus       | Numerical computing           | Tabular data analysis            |
| Structure   | N-dimensional arrays          | Labeled 1D/2D tables (Series/DataFrame) |
| Performance | Very fast, low-level          | Built on top of NumPy            |
| Use case    | Math behind ML                | EDA, data wrangling, preprocessing |

- Use **Pandas** to prepare data, and **NumPy** to power the math behind ML.



### Why NumPy is More Memory Efficient than Python Lists

**Python List Memory Model**

- Stores **references (pointers)** to separate Python `int` objects.
- Each `int` object has metadata:
  - Type information
  - Reference count
  - Value
- ❌ High overhead, especially for large arrays.
- ❌ Not cache-friendly — elements are scattered in memory.



**NumPy Array Memory Model**

- Stores data in a **contiguous C-style memory block**.
- Only stores **raw values** (e.g., `int32`, `float64`) — no per-element metadata.
- ✅ Very compact: no need to store Python object info for each element.
- ✅ Cache-friendly and CPU-efficient.
- ✅ Enables **vectorized operations**, **SIMD**, and **broadcasting**.
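
The layout described above can be inspected directly through standard `ndarray` attributes. This is a small sketch with an illustrative 2x3 `int32` array.

```python
import numpy as np

arr = np.arange(6, dtype=np.int32).reshape(2, 3)

print(arr.dtype)                   # int32
print(arr.itemsize)                # 4 bytes per element
print(arr.nbytes)                  # 6 elements * 4 bytes = 24
print(arr.flags["C_CONTIGUOUS"])   # True: a single row-major block
print(arr.strides)                 # (12, 4): bytes to step to the next row / column
```

The strides show that moving one column over means stepping 4 bytes and moving one row down means stepping 12 bytes, so the six values sit back to back in memory.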



**Result**

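- NumPy arrays use significantly **less memory** than Python lists of numbers: a `float64` array stores 8 bytes per element, while a list stores an 8-byte pointer plus a separate Python object per element (on 64-bit CPython).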
- Ideal for **large-scale numerical data**, especially in **machine learning** and **scientific computing**.
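
A rough way to check the difference yourself, assuming CPython: `sys.getsizeof` reports the size of the list's pointer table and of each element object, while `nbytes` reports the array's data buffer (the small fixed array header is not included). Exact figures vary by Python version and platform.

```python
import sys
import numpy as np

n = 100_000
py_list = [float(i) for i in range(n)]
np_array = np.arange(n, dtype=np.float64)

# List cost: the list's pointer table plus one Python float object per element.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# Array cost: the raw data buffer only (8 bytes per float64 value).
array_bytes = np_array.nbytes

print(f"list : {list_bytes:,} bytes")
print(f"array: {array_bytes:,} bytes")
print(f"ratio: {list_bytes / array_bytes:.1f}x")
```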
  
