# **Population vs Sample Data**

In statistics, a data set describes either a **population** or a **sample** drawn from that population.

---

## **Population**
- The **population** is the **entire group** of individuals or observations that we are interested in studying.  
- It includes **all possible data points** in the group.  
- Its size (the number of individuals or observations) is denoted by **N** (capital letter).  
- Example:  
  - All 100,000 people in a city.  
  - The **weights of all the people** in a region.  

---

## **Sample**
- A **sample** is a **subset** of the population that is selected for analysis.  
- Its size is denoted by **n** (small letter), with n ≤ N.  
- It is used when studying the entire population is impractical, too costly, or too time-consuming.  
- Example:  
  - Taking data from **10,000 people (sample)** out of **100,000 (population)** to estimate average weight.  
  - **Exit polls:** Conducted on a sample of voters to predict election results.
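
The exit-poll idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100,000 voters is synthetic, the 52% support level for a hypothetical candidate A is made up, and the sample is a simple random sample drawn without replacement using the standard library.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 voters, True = voted for candidate A.
# The 52% support level is an assumption made up for illustration.
N = 100_000
population = [random.random() < 0.52 for _ in range(N)]

# Exit-poll style sample: 10,000 voters drawn without replacement.
n = 10_000
sample = random.sample(population, n)

# Share of votes for A in the full population and in the sample
population_share = sum(population) / N
sample_share = sum(sample) / n

print(f"Population share for A: {population_share:.3f}")
print(f"Sample share for A:     {sample_share:.3f}")
```

The sample share is only an estimate of the population share. A real exit poll also has to deal with who agrees to respond, which this sketch ignores.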

---

## **Summary Table**
| Concept | Description | Size Symbol |
|----------|--------------|--------|
| **Population** | Entire group of interest (e.g., 100K people) | N |
| **Sample** | Subset of the population (e.g., 10K people) | n |

---

## **Conclusion**
- **Population data** represents the full set of observations.  
- **Sample data** represents a smaller set taken from the population for analysis.  
- We use the **sample** to make inferences about the **population.**


```python
# Example: Population vs Sample Average Calculation

import numpy as np

# Population data (weights in kg)
population = np.random.normal(70, 10, 100000)  # mean=70kg, std=10kg, 100K people

# Sample data (10K people randomly chosen)
sample = np.random.choice(population, 10000, replace=False)

# Calculate averages
population_mean = np.mean(population)
sample_mean = np.mean(sample)

print(f"Population Mean: {population_mean:.2f} kg")
print(f"Sample Mean: {sample_mean:.2f} kg")
print(f"Difference between Population and Sample Mean: {abs(population_mean - sample_mean):.2f} kg")
```


Example output (exact values vary between runs because no random seed is set):

```text
Population Mean: 70.05 kg
Sample Mean: 70.07 kg
Difference between Population and Sample Mean: 0.02 kg
```
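
As a follow-up to the idea that a sample is used to make inferences about the population, the sketch below compares sample means for several sample sizes n against the population mean. It is a rough illustration, not a formal analysis: the population is synthetic (weights with mean 70 kg and standard deviation 10 kg, as in the example above), and the sample sizes and the number of repeated draws (20) are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic population of weights in kg (mean 70, std 10)
population = [random.gauss(70, 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

errors = {}
for n in (10, 100, 1_000, 10_000):
    # Average the absolute error over several independent samples of size n
    trial_errors = [
        abs(statistics.fmean(random.sample(population, n)) - population_mean)
        for _ in range(20)
    ]
    errors[n] = statistics.fmean(trial_errors)
    print(f"n = {n:>6}: mean absolute error = {errors[n]:.3f} kg")
```

On average, larger samples tend to give a sample mean closer to the population mean, which is why a well-chosen sample can stand in for a population that is too large to measure in full.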
