
### **1. What is NumPy and why is it used in Python?**

**Answer:**
NumPy (Numerical Python) is a fundamental library for scientific computing in Python. It provides:

* **N-dimensional arrays (`ndarray`)** for efficient storage and computation.
* **Vectorized operations** that run much faster than equivalent Python loops over lists.
* Mathematical functions for linear algebra, statistics, and more.

**Use in AI/ML:** NumPy arrays are the backbone of ML data pipelines. **Pandas** and **Scikit-learn** are built on NumPy arrays, while **TensorFlow** and **PyTorch** use their own tensor types that convert to and from NumPy arrays.

---

### **2. Difference between a Python list and a NumPy array**

**Answer:**

| Feature     | Python List                    | NumPy Array                    |
| ----------- | ------------------------------ | ------------------------------ |
| Element types | Can store mixed types        | Stores a single dtype          |
| Speed       | Slower for numerical ops       | Much faster due to C backend   |
| Memory      | Higher overhead                | More memory-efficient          |
| Operations  | Element-wise ops require loops | Supports vectorized operations |

**Scenario question:** Why is NumPy preferred for ML?

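* Large datasets, matrix multiplications, and element-wise operations are slow with lists because every element is a separate Python object processed in an interpreted loop. NumPy stores values in a contiguous, typed buffer and runs these operations in compiled code.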
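
The rows of the table above can be checked directly. This is a small illustrative sketch; the variable names are made up for the example.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# `+` concatenates lists but adds arrays element-wise
list_sum = py_list + py_list     # [1, 2, 3, 1, 2, 3]
arr_sum = np_arr + np_arr        # array([2, 4, 6])

# Mixed numeric input is coerced to one common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)               # float64
```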

---

### **3. How do you create a NumPy array?**

**Answer:**

```python
import numpy as np

# From Python list
arr = np.array([1, 2, 3, 4])

# Using built-in functions
zeros = np.zeros((2,3))      # 2x3 array of zeros
ones = np.ones((3,2))        # 3x2 array of ones
arange_arr = np.arange(0, 10, 2)  # 0,2,4,6,8
linspace_arr = np.linspace(0, 1, 5)  # 0,0.25,0.5,0.75,1
```

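**Interview twist:** Explain `arange` vs `linspace` → `arange` takes a step size and excludes the stop value; `linspace` takes the number of points and includes the stop value by default. Prefer `linspace` for non-integer steps, since floating-point rounding can make the length of an `arange` result hard to predict.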
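
A short sketch of the endpoint behaviour, using values chosen only for illustration:

```python
import numpy as np

a = np.arange(0, 1, 0.25)                  # stop excluded → 0, 0.25, 0.5, 0.75
b = np.linspace(0, 1, 5)                   # stop included → 0, 0.25, 0.5, 0.75, 1
c = np.linspace(0, 1, 4, endpoint=False)   # stop excluded → 0, 0.25, 0.5, 0.75

print(len(a), len(b), len(c))              # 4 5 4
```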

---

### **4. What is the difference between shape, size, ndim, and dtype in NumPy arrays?**

**Answer:**

```python
arr = np.array([[1,2,3],[4,5,6]])

arr.shape   # (2,3) → rows, columns
arr.size    # 6 → total elements
arr.ndim    # 2 → number of dimensions
arr.dtype   # dtype('int64') on most platforms → data type
```

**Scenario:** Checking `shape` and `ndim` before reshaping arrays for ML models → feature matrices are usually expected to be 2D.
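
As a rough illustration of that scenario, the sketch below inspects a single 1D feature and turns it into a one-column matrix. The array contents are arbitrary.

```python
import numpy as np

feature = np.array([1.5, 2.0, 3.5, 4.0])
print(feature.shape, feature.ndim, feature.dtype)   # (4,) 1 float64

# -1 asks NumPy to infer the number of rows
column = feature.reshape(-1, 1)
print(column.shape, column.ndim, column.size)       # (4, 1) 2 4
```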

---

### **5. How does broadcasting work in NumPy?**

**Answer:**
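Broadcasting lets NumPy operate on arrays of **different shapes** without explicitly replicating data. Shapes are compared starting from the trailing (rightmost) dimension. Rules:

1. If the arrays have different numbers of dimensions, prepend 1s to the shape of the array with fewer dimensions.
2. Two dimensions are compatible if they are equal or one of them is 1; a dimension of size 1 is stretched to match the other.
3. If any pair of dimensions is incompatible, NumPy raises a `ValueError`.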

```python
arr1 = np.array([[1,2,3],[4,5,6]])
arr2 = np.array([1,2,3])
result = arr1 + arr2
# [[2,4,6],[5,7,9]]
```

**Scenario:** Useful in normalizing data, adding bias terms in ML models.
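
A minimal sketch of both use cases, plus one shape pair that fails to broadcast. The data and the `bias` values are invented for the example.

```python
import numpy as np

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])          # shape (3, 2)

# Centering: (3, 2) - (2,) → the mean row is applied to every sample
col_mean = X.mean(axis=0)            # [ 2. 20.]
centered = X - col_mean

# Bias term: one value per feature, added to every row
bias = np.array([0.5, -0.5])
shifted = X + bias

# Incompatible shapes: trailing dimensions 2 and 3 differ and neither is 1
try:
    X + np.array([1.0, 2.0, 3.0])
    failed = False
except ValueError:
    failed = True
print(failed)                        # True
```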

---

### **6. Explain some commonly used NumPy functions for ML**

**Answer:**

* `np.mean(arr)`, `np.std(arr)` → for normalization/standardization
* `np.sum(arr, axis=0)` → column-wise sum
* `np.dot(a,b)` → matrix multiplication for 2D arrays (linear regression, neural nets)
* `np.argmax(arr, axis=1)` → classification outputs
* `np.random.seed()` → reproducibility
* `np.random.randn()` → initializing weights
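
A few of these functions in use, as an illustrative sketch with made-up scores and an arbitrary seed value:

```python
import numpy as np

# Same seed → same "random" initial weights
np.random.seed(42)
w1 = np.random.randn(3, 2)
np.random.seed(42)
w2 = np.random.randn(3, 2)
print(np.array_equal(w1, w2))          # True

# Each row holds class scores for one sample
scores = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1]])
pred = np.argmax(scores, axis=1)       # index of the best class per row → [1 0]
col_totals = np.sum(scores, axis=0)    # one sum per column
```

Newer code often uses `np.random.default_rng(seed)` instead of the global seed, but the idea is the same.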

---

### **7. How do you select, slice, and index arrays in NumPy?**

**Answer:**

```python
arr = np.array([1,2,3,4,5])
arr[0]      # 1st element
arr[-1]     # last element
arr[1:4]    # slice 2nd to 4th element
arr[::2]    # every 2nd element
```

**2D array example:**

```python
arr2d = np.array([[1,2,3],[4,5,6]])
arr2d[0,1]  # 2
arr2d[:,1]  # [2,5] → entire 2nd column
arr2d[1,:]  # [4,5,6] → entire 2nd row
```

**Scenario:** Extract features, batch processing for ML pipelines.
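
The scenario might look like the sketch below, where `X` is a stand-in dataset with 5 samples and 4 features:

```python
import numpy as np

X = np.arange(20).reshape(5, 4)    # 5 samples, 4 features

first_two_features = X[:, :2]      # all rows, first 2 columns → shape (5, 2)
batch = X[0:2]                     # first 2 samples → shape (2, 4)
picked_rows = X[[0, 3]]            # integer-array indexing: rows 0 and 3
picked_cols = X[:, [0, 3]]         # columns 0 and 3 → shape (5, 2)

print(picked_rows)
# [[ 0  1  2  3]
#  [12 13 14 15]]
```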

---

### **8. Difference between `copy()` and view (`arr.view()`)**

**Answer:**

* `view` → a new array object that shares the same data buffer; changes made through one are visible in the other.
* `copy` → a new array with its own data buffer; changes do not propagate.

```python
a = np.array([1,2,3])
b = a.view()   # shares memory with a
c = a.copy()   # independent memory
b[0] = 100     # modifies a as well
print(a)  # [100   2   3]
c[0] = 500     # a is unaffected
print(a)  # [100   2   3]
```

**Scenario:** Avoid side-effects when preprocessing ML datasets.
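
Basic slicing also returns a view, which is a common source of such side effects. A minimal sketch with arbitrary data:

```python
import numpy as np

data = np.arange(6)                      # [0 1 2 3 4 5]
window = data[1:4]                       # basic slice → view
safe = data[1:4].copy()                  # explicit copy → independent

window[0] = -1                           # also changes data[1]
print(data)                              # [ 0 -1  2  3  4  5]
print(safe)                              # [1 2 3]

print(np.shares_memory(data, window))    # True
print(np.shares_memory(data, safe))      # False
```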

---

### **9. Explain vectorization and why it is important**

**Answer:**

* Vectorization: performing operations on **entire arrays** instead of element-wise loops.
* Significantly faster because operations run in **compiled C code** internally.

```python
# Without vectorization
sum_list = [x+5 for x in range(1000000)]

# With vectorization
arr = np.arange(1000000)
arr + 5
```

**Scenario:** Training ML models with large datasets → speed is critical.
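
One way to see the difference is to time both versions. This is only a rough sketch: the array size is arbitrary and the measured times depend on the machine, so no specific speedup is claimed.

```python
import time
import numpy as np

n = 200_000
values = list(range(n))
arr = np.arange(n)

# Python-level loop over a list
start = time.perf_counter()
loop_result = [x + 5 for x in values]
loop_time = time.perf_counter() - start

# Single vectorized operation on the whole array
start = time.perf_counter()
vec_result = arr + 5
vec_time = time.perf_counter() - start

print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
```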

---



### **1. Find all even numbers in an array**

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
even_nums = arr[arr % 2 == 0]
print(even_nums)   # [ 2  4  6  8 10]
```

✅ **Explanation:**

* Boolean masking (`arr % 2 == 0`) creates a True/False array.
* NumPy returns only elements where condition is `True`.
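
Masks can also be combined, and `np.where` covers two related tasks. A short sketch on the same array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Combine conditions with & / | and parentheses (not `and` / `or`)
mask = (arr % 2 == 0) & (arr > 4)
selected = arr[mask]                        # [ 6  8 10]

# Indices of the even elements
positions = np.where(arr % 2 == 0)[0]       # [1 3 5 7 9]

# Replace even elements with 0, keep the rest
replaced = np.where(arr % 2 == 0, 0, arr)   # [1 0 3 0 5 0 7 0 9 0]
```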

---

### **2. Compute column-wise mean and standard deviation**

```python
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

col_mean = np.mean(arr, axis=0)
col_std  = np.std(arr, axis=0)

print("Column-wise mean:", col_mean)  # [4. 5. 6.]
print("Column-wise std:", col_std)    # [2.44948974 2.44948974 2.44948974]
```

✅ **Explanation:**

* `axis=0` → aggregates down the rows, producing one result per column.
* `np.mean` & `np.std` are frequently used for **data normalization/standardization**. By default `np.std` computes the population standard deviation (`ddof=0`).
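
Putting the two statistics together gives column-wise standardization (z-scores). A minimal sketch on the same matrix, assuming no column has zero standard deviation:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype=float)

col_mean = arr.mean(axis=0)
col_std = arr.std(axis=0)                    # population std (ddof=0)

# Broadcasting applies the per-column statistics to every row
standardized = (arr - col_mean) / col_std

# Sample standard deviation divides by n - 1 instead of n
sample_std = arr.std(axis=0, ddof=1)         # [3. 3. 3.]
```

After this step each column has mean 0 and standard deviation 1.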

---

### **3. Normalize an array to 0–1 range**

```python
arr = np.array([10, 20, 30, 40, 50])

normalized = (arr - arr.min()) / (arr.max() - arr.min())
print(normalized)   # [0.   0.25 0.5  0.75 1.  ]
```

✅ **Explanation:**

* Formula:

$$
x_{norm} = \frac{x - \min(x)}{\max(x) - \min(x)}
$$

* Useful in ML pipelines → puts features on a **comparable scale**, so no feature dominates just because of its units.
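
For a 2D feature matrix the same formula is usually applied per column. The sketch below also guards against a constant column, where `max - min` is 0; mapping such a column to 0 is one possible convention, not the only one.

```python
import numpy as np

X = np.array([[10.0, 5.0],
              [20.0, 5.0],
              [30.0, 5.0]])      # second column is constant

col_min = X.min(axis=0)
col_range = X.max(axis=0) - col_min

# Avoid division by zero for constant columns
safe_range = np.where(col_range == 0, 1.0, col_range)

normalized = (X - col_min) / safe_range
print(normalized)
# [[0.  0. ]
#  [0.5 0. ]
#  [1.  0. ]]
```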

---

### **4. Compute matrix multiplication without using loops**

```python
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Method 1: dot product
C = np.dot(A, B)

# Method 2: @ operator
C2 = A @ B

print(C)   # [[19 22]
           #  [43 50]]
```

✅ **Explanation:**

* `np.dot()` or `@` → optimized matrix multiplication.
* Avoids explicit Python loops → **faster**, because the work is delegated to optimized compiled routines.
* Used in ML for **linear regression, neural networks, attention mechanisms**.
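
As an illustration of the linear regression case, predictions for all samples can be computed in one expression. The weights `w` and bias `b` below are arbitrary example values, not fitted parameters.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])       # 3 samples, 2 features
w = np.array([0.5, -1.0])        # one weight per feature
b = 2.0

y_pred = X @ w + b               # shape (3,) → [ 0.5 -0.5 -1.5]

# Note: * is element-wise, not matrix multiplication
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
elementwise = A * B              # [[ 5 12]
                                 #  [21 32]]
```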

---

### **5. Reshape a 1D array into 2D (n, m) for ML input**

```python
arr = np.arange(12)   # 1D array → [0,1,...,11]

reshaped = arr.reshape(3, 4)   # 3 rows, 4 cols
print(reshaped)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
```

✅ **Explanation:**

* `.reshape(n, m)` changes dimensions and returns a view of the same data when possible; it copies only when the memory layout requires it.
* **Common ML use-case:** Reshaping 1D feature vectors into `(num_samples, num_features)` format before feeding into models.
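
Two details worth knowing here are `-1` for an inferred dimension and the difference between a reshaped view and a reshaped copy. A small sketch:

```python
import numpy as np

arr = np.arange(12)

# -1 lets NumPy infer the number of rows from the total size
X = arr.reshape(-1, 4)
print(X.shape)                           # (3, 4)

# Reshaping a contiguous array returns a view
print(np.shares_memory(arr, X))          # True

# Flattening the transpose cannot reuse the layout, so NumPy copies
flat_t = X.T.reshape(-1)
print(np.shares_memory(X, flat_t))       # False
print(flat_t[:3])                        # [0 4 8]
```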

