## `scipy.stats` 📊✨

Imagine you’re a scientist or data enthusiast, sitting with piles of data, and you're thinking, “Friend, how can I **understand the hidden patterns** or **test hypotheses** with all this data?” Well, that’s exactly where **`scipy.stats`** comes to the rescue! This SciPy module gives you the tools to summarize, model, and test your data in **simple yet powerful ways**. (Plotting is not part of it; for charts you would pair it with a plotting library.)

In **`scipy.stats`**, you get:
1. **Tools to Summarize Data** – Need the average or median? Done! Want to see how much variation or spread your data has? Sorted!
2. **Probability Distributions Galore** – You get access to a large collection of continuous and discrete distributions: normal, binomial, Poisson, and many more.
3. **Hypothesis Testing** – Curious to see if two groups are different? Or if a variable truly affects your outcome? `scipy.stats` is your best friend here.

So, in short, whether it’s about **basic statistics** like mean, median, and mode or **advanced tests and distributions**, `scipy.stats` has everything under one roof, making it the go-to for statisticians, data scientists, and even curious learners!
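
To make the three points above concrete, here is a minimal sketch that touches each one. The two groups of scores are made-up numbers, used only for illustration, and the variable names are assumptions of this example.

```python
from scipy import stats

# Hypothetical data, invented for this example only.
group_a = [88, 92, 79, 93, 85, 91]
group_b = [72, 85, 78, 80, 69, 77]

# 1. Summarize data: count, mean, and sample variance in one call.
summary = stats.describe(group_a)
print("n:", summary.nobs, "mean:", summary.mean, "variance:", summary.variance)

# 2. Probability distributions: each distribution is an object with
#    methods such as cdf() (continuous) and pmf() (discrete).
p_below_zero = stats.norm.cdf(0)          # P(Z <= 0) for a standard normal
p_two_heads = stats.binom.pmf(2, 4, 0.5)  # P(exactly 2 heads in 4 fair flips)
print("P(Z <= 0):", p_below_zero)
print("P(2 heads in 4 flips):", p_two_heads)

# 3. Hypothesis testing: independent two-sample t-test.
result = stats.ttest_ind(group_a, group_b)
t_stat, p_value = result.statistic, result.pvalue
print("t statistic:", t_stat, "p-value:", p_value)
```

A small p-value here would suggest the two groups have different means, but with only six made-up values per group this is a demonstration of the call, not a real analysis.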

**Ready? Come on, let’s explore `scipy.stats` together, step by step!** 🌟

## **Descriptive Statistics** - The "Trailer" of Your Data

**Descriptive statistics** are like the first impression of our data. Before going deep, we check things like **average**, **spread**, and **percentiles** to understand the data's basic behavior. Let’s start with a few functions to calculate these:

### 1.1 **Mean and Median**

- **Mean (average)**: Sum of all values divided by the count. Great for a quick look at central tendency.
- **Median**: The middle value when data is sorted. Super helpful when there are extreme values (outliers).

```python
from scipy import stats

# Imagine we have the following exam scores:
scores = [88, 92, 79, 93, 85, 91]

# Calculating mean (tmean with no limits is the ordinary mean)
mean_score = stats.tmean(scores)
# Calculating median using 50th percentile
median_score = stats.scoreatpercentile(scores, 50)

print("Mean score:", mean_score)
print("Median score:", median_score)
```

```
Mean score: 88.0
Median score: 89.5
```


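`tmean()` is a trimmed mean; called without limits, it returns the ordinary average. `scoreatpercentile()` gives us the median when we ask for the 50th percentile. Because we have an even number of scores, it interpolates between the two middle values (88 and 91) to get 89.5. One caveat: the SciPy documentation recommends `numpy.percentile` over `scoreatpercentile()` for new code.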
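
To see why the median is handy when outliers show up, here is a small sketch that reuses the same scores and adds one extreme value. The outlier score of 5 and the trimming limits of 50 and 100 are assumptions made up for this example.

```python
import numpy as np
from scipy import stats

scores = [88, 92, 79, 93, 85, 91]

# With no limits, tmean() matches the ordinary arithmetic mean.
plain_mean = stats.tmean(scores)
# NumPy's median averages the two middle values for an even count.
median_score = np.median(scores)
print("Mean:", plain_mean, "Median:", median_score)

# Hypothetical outlier: one student scores 5.
with_outlier = scores + [5]
mean_out = np.mean(with_outlier)
median_out = np.median(with_outlier)
print("Mean with outlier:", mean_out)
print("Median with outlier:", median_out)

# Passing limits tells tmean() to ignore values outside [50, 100].
trimmed = stats.tmean(with_outlier, limits=(50, 100))
print("Trimmed mean:", trimmed)
```

The single low score drags the mean down by more than ten points, while the median barely moves. Trimming with `limits` is one way to get a mean that ignores such values, provided you have a sensible reason for the cut-offs.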

### 1.2 **Variance and Standard Deviation**

- **Variance**: The average squared distance of the values from the mean. A higher variance means the data points are more spread out around the mean.
- **Standard Deviation**: The square root of the variance. It is in the same units as the data and tells us how much values typically vary from the average.

```python
# Calculating variance and standard deviation
# (reuses `scores` and `stats` from the previous cell)
variance = stats.tvar(scores)
std_dev = stats.tstd(scores)

print("Variance:", variance)
print("Standard Deviation:", std_dev)
```

```
Variance: 28.0
Standard Deviation: 5.291502622129181
```


Here, `tvar()` calculates the sample variance (it divides by n - 1 rather than n), and `tstd()` gives us its square root, the sample standard deviation. The lower the standard deviation, the closer your data points are to the mean, indicating consistency in scores!
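
If you want to check where the 28.0 comes from, the calculation can be done by hand. This is a minimal sketch; the variable names are just for this example.

```python
import math

import numpy as np
from scipy import stats

scores = [88, 92, 79, 93, 85, 91]
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean.
squared_devs = sum((x - mean) ** 2 for x in scores)

# Sample variance divides by n - 1; this is what tvar() reports.
sample_var = squared_devs / (n - 1)
# Population variance divides by n; this is np.var()'s default (ddof=0).
population_var = squared_devs / n

print("Sample variance by hand:", sample_var)
print("stats.tvar:", stats.tvar(scores))
print("Population variance by hand:", population_var)
print("np.var (default):", np.var(scores))
print("Sample std dev by hand:", math.sqrt(sample_var))
```

The two variances differ because `tvar()` and `np.var()` use different default divisors, so it is worth checking which one a function uses before comparing numbers from different libraries.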