# What is a Unit of Observation?

Understanding the context behind numbers

### Python Libraries

In [1]:
import pandas as pd

## Situation 1 - Income

Imagine:

1. You want to study **“income.”**
2. You collect numbers, but it is unclear what each row describes: is Row 1 an individual? Row 2 a family? Row 3 a household?

You can calculate averages, sums, percentages… but then a question arises:  

> What do these numbers really represent?  
> Who or what is each number describing?  

🔺 At this point, you realize something is missing: the piece of information that makes the numbers meaningful.

**Final question:**  

If numbers exist, but we don’t know **who or what they are measured for**, **what gives them comparative meaning?**

### Problem Demonstration

We start with a small dataset **without a clearly defined unit**:

In [2]:
data_income = {
    "record": ["A", "B", "C", "D", "E"],
    "income": [50000, 60000, 55000, 45000, 70000]
}

df_income = pd.DataFrame(data_income)
df_income

|   | record | income |
|---|--------|--------|
| 0 | A      | 50000  |
| 1 | B      | 60000  |
| 2 | C      | 55000  |
| 3 | D      | 45000  |
| 4 | E      | 70000  |


🔺 We have numbers, but we don’t know if they represent individuals, households, or families. Any summary (mean, sum) is ambiguous.
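To make the ambiguity concrete, here is a minimal sketch that reads the same five numbers in two ways. The household sizes are hypothetical, invented only for illustration; they are not part of the dataset above.

```python
import pandas as pd

# Same income values as in the dataset above.
incomes = pd.Series([50000, 60000, 55000, 45000, 70000], index=list("ABCDE"))

# Reading 1: each record is one individual.
mean_if_individual = incomes.mean()

# Reading 2: each record is a household.
# These sizes are hypothetical and exist only to illustrate the point.
hypothetical_household_size = pd.Series([1, 2, 4, 3, 2], index=list("ABCDE"))
per_person_if_household = incomes.sum() / hypothetical_household_size.sum()

print("Income per person if rows are individuals:", mean_if_individual)
print("Income per person if rows are households:", round(per_person_if_household, 2))
```

Under these assumed sizes, the two readings give very different per-person figures from identical numbers, which is why the unit has to be settled before any summary is reported.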

### Solution: Define the Unit

Let’s say we define “unit = individual”. Now every row represents a single person’s income.

In [3]:
# Assign unit
df_income["unit"] = "individual"

cols_income = ["record", "unit", "income"]
df_income = df_income[cols_income]

df_income

|   | record | unit       | income |
|---|--------|------------|--------|
| 0 | A      | individual | 50000  |
| 1 | B      | individual | 60000  |
| 2 | C      | individual | 55000  |
| 3 | D      | individual | 45000  |
| 4 | E      | individual | 70000  |


### Demonstrating Analysis

With the unit defined, you can compute meaningful summaries, for example:

In [4]:
# Average income
average_income = df_income["income"].mean()
print("Average income per individual:", average_income)

# Total income
total_income = df_income["income"].sum()
print("Total income for all individuals:", total_income)

Average income per individual: 56000.0
Total income for all individuals: 280000


## Conclusions

### Units define context

Numbers are meaningless unless we know who or what they describe.

### Units give meaning to summaries

Averages, sums, and percentages are interpretable only per defined unit (e.g., per student, per individual).
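As a rough illustration, the sketch below summarizes a small table that mixes two units. The records and values are hypothetical, made up for this example; the point is only that `groupby` on the unit column keeps each summary tied to one unit.

```python
import pandas as pd

# Hypothetical records mixing two units; the values are invented for illustration.
mixed = pd.DataFrame({
    "record": ["A", "B", "C", "D"],
    "unit": ["individual", "individual", "household", "household"],
    "income": [50000, 60000, 90000, 110000],
})

# Summarize separately for each unit instead of pooling everything together.
per_unit = mixed.groupby("unit")["income"].agg(["mean", "sum"])
print(per_unit)

# A pooled mean mixes individuals and households and has no clear interpretation.
pooled_mean = mixed["income"].mean()
print("Pooled mean (hard to interpret):", pooled_mean)
```

The per-unit rows can each be read as "mean income per individual" or "mean income per household", while the pooled figure describes neither.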

### Assign units before analysis

This ensures all calculations, visualizations, and comparisons are consistent and meaningful.
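One possible way to enforce this habit is a small guard that refuses to summarize data whose unit is missing or mixed. The helper name `summarize_income` and its error messages are assumptions made for this sketch, not part of pandas or of the notebook above.

```python
import pandas as pd


def summarize_income(df: pd.DataFrame) -> dict:
    """Return the unit, mean, and total income; refuse missing or mixed units.

    Hypothetical helper for illustration only.
    """
    if "unit" not in df.columns:
        raise ValueError("No 'unit' column: define the unit of observation first.")
    units = df["unit"].dropna().unique()
    if df["unit"].isna().any() or len(units) != 1:
        raise ValueError(f"Expected exactly one unit, found: {list(units)}")
    return {
        "unit": units[0],
        "mean": float(df["income"].mean()),
        "total": int(df["income"].sum()),
    }


# Same data as the notebook's df_income after the unit was assigned.
df_checked = pd.DataFrame({
    "record": ["A", "B", "C", "D", "E"],
    "unit": ["individual"] * 5,
    "income": [50000, 60000, 55000, 45000, 70000],
})

summary = summarize_income(df_checked)
print(summary)
```

Calling the helper on a table without a `unit` column, or with more than one unit, raises a `ValueError` instead of silently producing an ambiguous number.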

## 🔹 Key takeaway:

Always ask: “Who or what does this number describe?”