***PHASE 3: Mathematical & Statistical Functions***

**Why Aggregation Functions Matter**

**Real-world data analysis is rarely about individual values; it is about:**
- Totals
- Averages
- Extremes
- Trends

**NumPy provides aggregation functions implemented in optimized C code, so these can be computed quickly even on large arrays.**

**The Most Important Concept: axis**

*Understand the Data Shape*

**What Does axis REALLY Mean?**

*`axis` tells NumPy which dimension to collapse.*

In [None]:
import numpy as np

In [None]:
# axis = 0 → Collapse rows → Work column-wise
data = np.array([[10, 20, 30],
                 [40, 50, 60]])
np.sum(data, axis=0)  # array([50, 70, 90])


In [None]:
# axis = 1 → Collapse columns → Work row-wise
data = np.array([[10, 20, 30],
                 [40, 50, 60]])
np.sum(data, axis=1)  # array([ 60, 150])
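
One way to check which axis was collapsed is to compare shapes before and after the aggregation. The sketch below reuses the `data` array from the cells above; the names `col_totals` and `row_totals` are introduced only for illustration.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60]])      # shape (2, 3): 2 rows, 3 columns

col_totals = np.sum(data, axis=0)    # axis 0 (length 2) is collapsed
row_totals = np.sum(data, axis=1)    # axis 1 (length 3) is collapsed

print(data.shape)        # (2, 3)
print(col_totals.shape)  # (3,)
print(row_totals.shape)  # (2,)
```

The axis you name is the one that disappears from the shape; the remaining axes tell you how many results you get.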



**Aggregation Functions**

| Function    | Purpose  |
| ----------- | -------- |
| `np.sum()`  | Total    |
| `np.mean()` | Average  |
| `np.min()`  | Smallest |
| `np.max()`  | Largest  |


Example:
- `np.mean(data)`: mean of all elements
- `np.mean(data, axis=0)`: column-wise mean
- `np.mean(data, axis=1)`: row-wise mean
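
As a runnable version of the calls listed above, the following sketch applies them to the same `data` array used earlier. The variable names are illustrative, and `np.min` / `np.max` are included to show that they take the same `axis` argument.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60]])

overall_mean = np.mean(data)           # mean of all six elements
column_means = np.mean(data, axis=0)   # one mean per column
row_means = np.mean(data, axis=1)      # one mean per row

column_min = np.min(data, axis=0)      # smallest value in each column
row_max = np.max(data, axis=1)         # largest value in each row

print(overall_mean)   # 35.0
print(column_means)   # [25. 35. 45.]
print(row_means)      # [20. 50.]
print(column_min)     # [10 20 30]
print(row_max)        # [30 60]
```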


**Interview**

*Q: What happens if axis is omitted?*

*With the default `axis=None`, NumPy computes over all elements, as if the array were flattened, and returns a single value.*
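
A quick way to convince yourself of this answer is to compare the default call with an explicit flatten. This is only a sketch; `total_default` and `total_flat` are names made up for the comparison.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60]])

total_default = np.sum(data)          # no axis given: aggregate over everything
total_flat = np.sum(data.ravel())     # flatten to 1-D first, then sum

print(total_default, total_flat)      # 210 210
```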

**Statistical Functions**

In [None]:
# Standard deviation & variance
# A notebook cell only displays its last expression, so print both values
print(np.std(data))
print(np.var(data))
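
Two details are easy to miss here: variance is the square of the standard deviation, and NumPy defaults to the population formulas (dividing by N). The sketch below shows both, assuming you may also want the sample version, which NumPy exposes through the `ddof` parameter. The variable names are illustrative.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60]])

pop_std = np.std(data)               # population standard deviation (divides by N)
pop_var = np.var(data)               # population variance (divides by N)
sample_var = np.var(data, ddof=1)    # sample variance (divides by N - 1)

print(np.isclose(pop_std ** 2, pop_var))   # True
print(sample_var)                          # 350.0
```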


In [None]:
# argmax & argmin
arr = np.array([20,30,40,1,50])
print(np.argmax(arr))
print(np.argmin(arr))

Use when:

- Finding best/worst performer

- Ranking

- Decision logic
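
`argmax` and `argmin` only give the single best and worst positions. For a full ranking, one option is `np.argsort`, which returns the indices that would sort the array. The sketch below uses the same `arr` as the cell above; `ranking` is a name chosen for illustration.

```python
import numpy as np

arr = np.array([20, 30, 40, 1, 50])

best = np.argmax(arr)             # index of the largest value
worst = np.argmin(arr)            # index of the smallest value
ranking = np.argsort(arr)[::-1]   # indices ordered from largest to smallest value

print(best)      # 4
print(worst)     # 3
print(ranking)   # [4 2 1 0 3]
```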

In [None]:
# With axis: arr is 1-D, so axis=0 is its only axis and the result matches np.argmin(arr)
np.argmin(arr, axis=0)
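
The `axis` argument becomes more interesting on a 2-D array. The sketch below uses a small hypothetical `sales` array (not from the original notebook) to show per-column and per-row positions, and how to turn the flattened index from a plain `np.argmax` back into a (row, column) pair with `np.unravel_index`.

```python
import numpy as np

# Hypothetical data: 2 rows, 3 columns
sales = np.array([[5, 9, 2],
                  [7, 3, 8]])

best_row_per_column = np.argmax(sales, axis=0)   # for each column, which row is largest
best_column_per_row = np.argmax(sales, axis=1)   # for each row, which column is largest

flat_index = np.argmax(sales)                    # index into the flattened array
row, col = np.unravel_index(flat_index, sales.shape)

print(best_row_per_column)   # [1 0 1]
print(best_column_per_row)   # [1 2]
print(flat_index)            # 1
print(int(row), int(col))    # 0 1
```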


**Cumulative Operations**

In [None]:
# cumsum  (Running total)
arr = np.array([10,20,30])
np.cumsum(arr)

In [None]:
# cumprod (Running multiplication)
arr = np.array([2,3,4,2,1,3,2])
np.cumprod(arr)
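
Cumulative functions accept `axis` as well, and the last running total always equals the plain sum. The sketch below illustrates both points; the 2-D `data` array is the one used earlier in this phase, and the other names are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30])
running = np.cumsum(arr)

print(running)                       # [10 30 60]
print(running[-1] == np.sum(arr))    # True

data = np.array([[10, 20, 30],
                 [40, 50, 60]])

down_columns = np.cumsum(data, axis=0)   # running total down each column
along_rows = np.cumsum(data, axis=1)     # running total along each row

print(down_columns)   # rows: [10 20 30] and [50 70 90]
print(along_rows)     # rows: [10 30 60] and [40 90 150]
```

Unlike `np.sum`, a cumulative operation along an axis keeps the array's shape, because it produces one value per input element.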

**NumPy Method vs Function**

In [None]:
# np.mean() → functional style
# arr.mean() → object-oriented style

print(np.mean(arr))
print(arr.mean())
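
The two styles give the same result on an array, but they differ in what they accept. As a small sketch: the function form converts array-like inputs such as a plain list, while the method form exists only on `ndarray` objects. The list `raw_scores` is a made-up example.

```python
import numpy as np

raw_scores = [10, 20, 30]              # a plain Python list, not an ndarray

mean_from_function = np.mean(raw_scores)           # works: the list is converted internally
list_has_method = hasattr(raw_scores, "mean")      # lists have no .mean() method
mean_from_method = np.array(raw_scores).mean()     # convert first, then use the method

print(mean_from_function)   # 20.0
print(list_has_method)      # False
print(mean_from_method)     # 20.0
```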

In [None]:
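# Rows are students, columns are subjects
scores = np.array([[20, 30, 40],
                   [30, 40, 70],
                   [50, 30, 80]])
avg_student = np.mean(scores, axis=1)     # one average per student (row)
avg_subject = np.mean(scores, axis=0)     # one average per subject (column)
topper_student = np.argmax(avg_student)   # row index of the highest average
print(avg_student)
print(avg_subject)
print(topper_student)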

**Interview Questions**

Q: What does axis=0 mean?

 - The operation runs down the rows and collapses them, so you get one result per column.

Q: Difference between sum() and cumsum()?

 - `sum()` returns a single total; `cumsum()` returns the running total at each position, so its output has as many elements as the input.

Q: Why avoid loops for aggregation?

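 - NumPy aggregations are vectorized: the loop runs in compiled C code rather than in the Python interpreter, so they are typically much faster than an explicit Python loop.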
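
To see the difference for yourself, a rough comparison like the sketch below can help. The array size and the helper `loop_sum` are arbitrary choices for illustration, and the measured times will vary by machine, so no specific numbers are claimed here.

```python
import time
import numpy as np

values = np.arange(200_000, dtype=np.int64)

def loop_sum(array):
    """Sum the elements one at a time in a plain Python loop."""
    total = 0
    for value in array:
        total += int(value)
    return total

start = time.perf_counter()
loop_total = loop_sum(values)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
numpy_total = int(np.sum(values))
numpy_seconds = time.perf_counter() - start

print(loop_total == numpy_total)   # True: both approaches give the same total
print(f"loop:  {loop_seconds:.6f} s")
print(f"numpy: {numpy_seconds:.6f} s")
```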