# 📌 Why We Need Pandas – Introduction to the Library

In data science, we work with a lot of structured data:
CSV files, Excel sheets, database tables, API responses, and so on.

Using only core Python (like lists, loops, and dictionaries) 
for data manipulation becomes:

- ❌ Slow
- ❌ Complex
- ❌ Error-prone

Let's look at an example.

In [None]:
# Trying to find the average age using plain Python lists

data = [
    ["Alice", 24],
    ["Bob", 27],
    ["Charlie", 22],
    ["Diana", 29]
]

total = 0
for person in data:
    total += person[1]

average_age = total / len(data)
print("Average Age:", average_age)
# Output: Average Age: 25.5

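This works, but:

- We access the age with `person[1]`, a positional index that says nothing about what it holds.
- There are no column names, so the reader has to remember which position means what.
- For larger datasets with more columns, this gets hard to manage.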
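To make the positional-index problem concrete, here is a small sketch. It assumes a made-up `City` column is added in front of the age (the city values and the `AGE_INDEX` name are purely illustrative). The age then moves from index 1 to index 2, and every piece of code that used `person[1]` has to be found and updated by hand.

```python
# Hypothetical variation of the data above: a "City" column is inserted,
# so the age moves from position 1 to position 2.
data = [
    ["Alice", "Paris", 24],
    ["Bob", "Berlin", 27],
    ["Charlie", "Madrid", 22],
    ["Diana", "Rome", 29],
]

# Illustrative constant: nothing in the data itself says "this is the age".
AGE_INDEX = 2

total = 0
for person in data:
    total += person[AGE_INDEX]  # person[1] would now be the city, a string

average_age = total / len(data)
print("Average Age:", average_age)
```

If the index were left at 1, the loop would try to add a string to a number and fail with a `TypeError`.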

📉 That's where **Pandas** helps!


🐼 **Pandas** is an open-source Python library that makes data handling easy and powerful.

It introduces two main structures:

- `Series`: 1D labeled data (a single column of values with an index)
- `DataFrame`: 2D labeled table with rows and columns (like an Excel sheet)
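As a minimal sketch of the two structures listed above, the snippet below builds one of each from the same names and ages used earlier. The variable names `ages` and `people` are chosen here only for illustration.

```python
import pandas as pd

# A Series: one-dimensional values, each with a label (the index).
ages = pd.Series(
    [24, 27, 22, 29],
    index=["Alice", "Bob", "Charlie", "Diana"],
    name="Age",
)
print(ages["Bob"])  # look up a value by its label instead of its position

# A DataFrame: a two-dimensional table with named columns.
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Age": [24, 27, 22, 29],
})
print(people.shape)         # (number of rows, number of columns)
print(type(people["Age"]))  # selecting one column gives back a Series
```

Selecting a single column of a `DataFrame` returns a `Series`, so the two structures are closely related.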

Let’s see how we can do the same task using Pandas.


In [None]:
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Age": [24, 27, 22, 29]
})

# Calculate average age
average_age = df["Age"].mean()
print("Average Age:", average_age)
# Output: Average Age: 25.5

✅ Cleaner code  
✅ Column names make it readable  
✅ Built-in functions like `.mean()` save time

Pandas makes it easier to clean, filter, and analyze data, often in fewer lines of code.
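To illustrate the clean, filter, and analyze steps, here is a short sketch on the same small table. The age threshold of 23 and the missing value in `raw` are assumptions introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Age": [24, 27, 22, 29],
})

# Filter: keep only the rows where Age is above 23 (arbitrary threshold).
older = df[df["Age"] > 23]
print(older["Name"].tolist())

# Analyze: count, mean, min, max and quartiles in a single call.
print(df["Age"].describe())

# Clean: a hypothetical copy of the data where Bob's age is missing.
raw = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Age": [24, None, 22, 29],
})
cleaned = raw.dropna(subset=["Age"])  # drop rows that have no Age
print(len(cleaned))
```

Each step is a single expression that refers to columns by name, which is the main readability gain over the loop-based version.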


🔍 **Summary**

- Core Python can handle structured data, but the code gets verbose, harder to read, and slower as datasets grow.
- Pandas provides fast, readable, and flexible tools to handle real-world datasets.
- That’s why Pandas is a key library in most data science projects.
