# **2. Statistical Functions in NumPy** 📊

NumPy provides built-in functions for common **descriptive statistics**:

| Function          | Description                                          |
|-------------------|------------------------------------------------------|
| `np.mean()`       | Mean (average) of values                             |
| `np.median()`     | Median of values                                     |
| `np.std()`        | Standard deviation (population formula by default)   |
| `np.var()`        | Variance (population formula by default)             |
| `np.min()`        | Minimum value                                        |
| `np.max()`        | Maximum value                                        |
| `np.percentile()` | q-th percentile of the array elements                |


 


In [1]:
import numpy as np

In [2]:
arr = np.array([4, 5, 9, 7, 8, 9, 2, 5, 15, 69])

print("Mean:", np.mean(arr))

Mean: 13.3


In [3]:
print("Median:", np.median(arr))

Median: 7.5


In [4]:
print("Standard Deviation:", np.std(arr))

Standard Deviation: 18.87352643254567


In [5]:
print("Variance:", np.var(arr))

Variance: 356.21000000000004


In [6]:
print("Min:", np.min(arr))

Min: 2


In [7]:
print("Max:", np.max(arr))

Max: 69


In [9]:
print("90th percentile:", np.percentile(arr, 90))

90th percentile: 20.39999999999998


# **Sorting and Searching** 📉

### ➤ Sorting an Array

In [10]:
arr = np.array([5, 9, 6, 7, 2, 6])

In [11]:
print(np.sort(arr))

[2 5 6 6 7 9]
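
Note that `np.sort()` returns a sorted copy and leaves the original array untouched. The sketch below illustrates this, together with two related idioms: reversing the result to get descending order, and `np.argsort()` to obtain the indices that would sort the array. The variable names are only illustrative.

```python
import numpy as np

arr = np.array([5, 9, 6, 7, 2, 6])

sorted_copy = np.sort(arr)               # new array; arr itself is unchanged
descending = np.sort(arr)[::-1]          # reverse the sorted copy for descending order
order = np.argsort(arr, kind="stable")   # indices that would sort arr

print(sorted_copy)   # [2 5 6 6 7 9]
print(descending)    # [9 7 6 6 5 2]
print(arr)           # [5 9 6 7 2 6]
print(arr[order])    # [2 5 6 6 7 9]
```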


### ➤ Searching for Elements

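* `np.where(condition)` → Returns a tuple of index arrays for the elements that satisfy the condition.

* `np.argmax()` → Returns the index of the maximum value (the first one, if it occurs more than once).

* `np.argmin()` → Returns the index of the minimum value (the first one, if it occurs more than once).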

In [16]:
print(arr)

[5 9 6 7 2 6]


In [14]:
print(np.where(arr > 15))
print(np.where(arr > 5))

(array([], dtype=int64),)
(array([1, 2, 3, 5], dtype=int64),)


In [15]:
print(np.argmax(arr))

1


In [17]:
print(np.argmin(arr))

4
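
The functions above return positions, not values. A minimal sketch of how those positions can be turned back into the matching elements follows; for a 1-D array, the tuple returned by `np.where()` holds a single index array, and a boolean mask gives the same values more directly.

```python
import numpy as np

arr = np.array([5, 9, 6, 7, 2, 6])

indices = np.where(arr > 5)[0]   # the only index array in the tuple for a 1-D input
values = arr[indices]            # elements found at those positions
masked = arr[arr > 5]            # boolean-mask shortcut, same values

print(indices)   # [1 2 3 5]
print(values)    # [9 6 7 6]
print(masked)    # [9 6 7 6]
print(arr[np.argmax(arr)], arr[np.argmin(arr)])   # 9 2
```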


### ✅ Practice Tasks
1. Generate a 3x3 random integer matrix with values between 1 and 100.

    * Find its mean, median, and standard deviation.

2. Create an array of 10 random numbers between 1 and 50 and sort them.

3. Create an array of 20 random numbers and select only the values greater than 25.

4. Generate a 4x4 matrix of random values from a normal distribution.
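
As a possible starting point for these tasks, the sketch below shows only the random-generation step, using NumPy's `Generator` interface; the statistics, sorting, and filtering are left as the exercise. The seed value and the variable names are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # arbitrary seed, only for reproducibility

matrix = rng.integers(1, 101, size=(3, 3))             # integers in [1, 100]; upper bound is exclusive
values = rng.integers(1, 51, size=10)                  # 10 integers in [1, 50]
normal = rng.normal(loc=0.0, scale=1.0, size=(4, 4))   # samples from a standard normal distribution

print(matrix.shape, values.shape, normal.shape)   # (3, 3) (10,) (4, 4)
```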

# Statistical Concepts in NumPy

## 1. Mean (Average)
### Formula:
μ = (Σ x) / n

Where:
- μ is the mean
- Σ x represents the sum of all values
- n is the total number of values

**Meaning**: The arithmetic average, a measure of the central tendency of a dataset. It is calculated by adding all values and dividing by the number of values, so a single extreme value can shift it noticeably.
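
To connect the formula to the function, here is a small sketch that computes Σ x / n by hand for the array used earlier in this section and compares it with `np.mean()`. The name `manual_mean` is just illustrative.

```python
import numpy as np

arr = np.array([4, 5, 9, 7, 8, 9, 2, 5, 15, 69])

manual_mean = arr.sum() / arr.size   # Σ x divided by n

print(manual_mean)                              # 13.3
print(np.isclose(manual_mean, np.mean(arr)))    # True
```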

## 2. Median
### Formula:
- For an odd number of values: the middle value after sorting
- For an even number of values: the average of the two middle values after sorting

**Meaning**: The middle value in a sorted dataset. It divides the data into two equal halves, making it less sensitive to extreme outliers compared to the mean.
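
The two cases can be sketched as a small helper and checked against `np.median()`. The function `manual_median` is a hypothetical name written for this illustration, not part of NumPy. The last line also hints at the outlier point: dropping the value 69 barely moves the median but changes the mean considerably.

```python
import numpy as np

def manual_median(values):
    """Median via sorting: middle value (odd n) or mean of the two middle values (even n)."""
    s = np.sort(values)
    n = s.size
    mid = n // 2
    if n % 2 == 1:
        return float(s[mid])
    return float((s[mid - 1] + s[mid]) / 2)

arr = np.array([4, 5, 9, 7, 8, 9, 2, 5, 15, 69])
without_outlier = arr[:-1]   # same data with the last value (69) dropped, so n is odd

print(manual_median(arr), np.median(arr))        # 7.5 7.5
print(manual_median(without_outlier))            # 7.0
print(np.mean(arr), np.mean(without_outlier))    # the mean shifts far more than the median
```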

## 3. Standard Deviation (σ)
### Formula:
σ = √[Σ(x - μ)² / n]

Where:
- σ is standard deviation
- x represents each value
- μ is the mean
- n is the total number of values

**Meaning**: A measure of data dispersion that indicates how spread out the values are from the mean. A low standard deviation suggests values are close to the mean, while a high value indicates more variability.
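
The formula above divides by n, which is what `np.std()` does by default (`ddof=0`). The sketch below reproduces it by hand and, for comparison, shows the sample version obtained with `ddof=1`, which divides by n - 1 instead. Variable names are illustrative.

```python
import numpy as np

arr = np.array([4, 5, 9, 7, 8, 9, 2, 5, 15, 69])

mu = arr.mean()
manual_std = np.sqrt(np.sum((arr - mu) ** 2) / arr.size)   # divide by n (population)
sample_std = np.std(arr, ddof=1)                           # divide by n - 1 (sample)

print(np.isclose(manual_std, np.std(arr)))   # True
print(sample_std > manual_std)               # True
```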

## 4. Variance
### Formula:
σ² = Σ(x - μ)² / n

Where:
- σ² is variance
- x represents each value
- μ is the mean
- n is the total number of values

**Meaning**: The average of the squared differences from the mean. It is exactly the square of the standard deviation, so it is expressed in squared units of the data.
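
A brief check of both statements, assuming the same example array as before: the variance equals the mean of the squared deviations, and it equals the standard deviation squared (up to floating-point rounding).

```python
import numpy as np

arr = np.array([4, 5, 9, 7, 8, 9, 2, 5, 15, 69])

manual_var = np.mean((arr - arr.mean()) ** 2)   # mean of squared deviations from the mean

print(np.isclose(manual_var, np.var(arr)))          # True
print(np.isclose(np.var(arr), np.std(arr) ** 2))    # True
```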

## 5. Min and Max
### Formulas:
- Min: Lowest value in the dataset
- Max: Highest value in the dataset

**Meaning**: Together they give the boundaries of the dataset. Their difference (max - min) is the range, a simple measure of spread.
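
As a quick illustration, the distance between the two boundaries can be computed directly. The name `data_range` is an illustrative choice.

```python
import numpy as np

arr = np.array([4, 5, 9, 7, 8, 9, 2, 5, 15, 69])

data_range = np.max(arr) - np.min(arr)   # distance between the largest and smallest value

print(np.min(arr), np.max(arr))   # 2 69
print(data_range)                 # 67
```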

## 6. Percentile
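### Formula:
P = (k/100) * (n + 1)

Where:
- P is the position (1-based rank) of the percentile in the sorted data
- k is the desired percentile (e.g., 90)
- n is the total number of values

This is one common textbook convention. `np.percentile()` uses a different default (linear interpolation): it takes the zero-based position (k/100) * (n - 1) in the sorted array and interpolates between the two neighboring values, so its results can differ from the formula above.

**Meaning**: The value below which a given percentage of observations fall. The 90th percentile is the value that roughly 90% of the data points lie at or below.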
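
Percentile conventions differ in how they map k to a position in the sorted data. As far as NumPy's documentation describes it, the default `linear` method uses the zero-based position (k/100) * (n - 1) and interpolates linearly between the two neighboring values. The helper below is a hand-written sketch of that rule (`manual_percentile` is a hypothetical name, not a NumPy function), compared against `np.percentile()`.

```python
import numpy as np

def manual_percentile(values, k):
    """k-th percentile using linear interpolation at zero-based position (k/100) * (n - 1)."""
    s = np.sort(values).astype(float)
    pos = (k / 100) * (s.size - 1)       # fractional position in the sorted data
    lower = int(np.floor(pos))           # index of the neighbor at or below the position
    upper = min(lower + 1, s.size - 1)   # index of the neighbor above (clamped at the end)
    frac = pos - lower                   # how far the position lies between the two neighbors
    return s[lower] + frac * (s[upper] - s[lower])

arr = np.array([4, 5, 9, 7, 8, 9, 2, 5, 15, 69])

print(np.isclose(manual_percentile(arr, 90), np.percentile(arr, 90)))   # True
print(np.isclose(manual_percentile(arr, 50), np.median(arr)))           # True
```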

### Practical Example
For the array `[1, 2, 3, 4, 5, 6]`:
- Mean (3.5): Central tendency
- Median (3.5): Middle point
- Standard Deviation (≈1.708): Spread of data
- Variance (≈2.917): Squared spread
- Min (1) and Max (6): Data range
- 90th Percentile (5.5): Value at the 90% position of the sorted data, as computed by `np.percentile()` with its default linear interpolation
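
These figures can be reproduced with the functions from the table at the top of this section. In the sketch below, results are rounded to three decimals for display, and the variable name `data` is illustrative.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])

print(np.mean(data))                               # 3.5
print(np.median(data))                             # 3.5
print(round(float(np.std(data)), 3))               # 1.708
print(round(float(np.var(data)), 3))               # 2.917
print(np.min(data), np.max(data))                  # 1 6
print(round(float(np.percentile(data, 90)), 3))    # 5.5
```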