# Module 2.2: Intro to Pandas - Series and DataFrames

If NumPy is the foundation for numerical computing, **Pandas** is the standard Python library for analyzing and manipulating real-world, tabular data. It introduces two data structures, the **Series** and the **DataFrame**, which let us work with labeled and relational data in an intuitive way. 🐼

Think of Pandas as bringing spreadsheets or SQL tables directly into Python, with the added flexibility of a full programming language.

**Goal of this Notebook:**
This notebook introduces the two core data structures of Pandas and shows how to work with them:

1.  **Series:** A 1D labeled array (like a single column in a spreadsheet).
2.  **DataFrame:** A 2D labeled data structure with columns of potentially different types (like a whole spreadsheet).
3.  How to create and inspect these objects.

In [None]:
import numpy as np
import pandas as pd

## 1. The Pandas Series

A Series is a one-dimensional array-like object that can hold any data type. What makes it special is its **index**, a set of labels with one label per value. If you don't supply an index, Pandas uses the default integer positions 0, 1, 2, and so on.

In [None]:
# Creating a Series from a Python list
labels = ['a', 'b', 'c']
data = [10, 20, 30]

my_series = pd.Series(data=data, index=labels)

print(my_series)

In [None]:
# You can access data using its label
print(f"Value at label 'b' is: {my_series['b']}")
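
As a small complement to the cell above, here is a sketch of two other common ways to work with a Series: building one from a dictionary (the keys become the index labels) and looking values up explicitly by label with `.loc` or by position with `.iloc`. The `prices` data is made up for illustration.

```python
import pandas as pd

# Hypothetical example data: the dict keys become the index labels
prices = pd.Series({'apple': 1.5, 'banana': 0.5, 'cherry': 3.0})

by_label = prices.loc['banana']   # label-based lookup
by_position = prices.iloc[0]      # position-based lookup (first element)

print(prices.index.tolist())
print(f"By label: {by_label}, by position: {by_position}")
```

Using `.loc` and `.iloc` makes it explicit whether a lookup is by label or by position, which avoids ambiguity when the labels are themselves integers.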

## 2. The Pandas DataFrame

The DataFrame is the real workhorse of Pandas. It's a 2D table of data where each column can be a different data type. It has both a row index and a column index.

You can think of a DataFrame as a collection of Series that share the same index.
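
To make the "collection of Series" idea concrete, the following sketch builds a DataFrame from two Series that share the same index, then pulls one column back out. The variable names (`ages`, `cities`, `people`) are illustrative only.

```python
import pandas as pd

# Two Series built on the same index labels
index = ['a', 'b', 'c']
ages = pd.Series([25, 30, 28], index=index)
cities = pd.Series(['New York', 'Paris', 'London'], index=index)

# The dict keys become column names; the shared index becomes the row index
people = pd.DataFrame({'Age': ages, 'City': cities})

# Each column comes back as a Series that carries the shared row index
age_column = people['Age']
print(type(age_column).__name__)
print(people['Age'].index.equals(people['City'].index))
```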

### Creating a DataFrame

In [None]:
# Let's create a DataFrame from a dictionary of lists
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 28],
    'City': ['New York', 'Paris', 'London']
}

df = pd.DataFrame(data)

# Displaying the DataFrame
df

### Basic DataFrame Operations

In [None]:
# Select a single column (this returns a Series!)
df['Age']
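
A related detail worth seeing side by side: single brackets with one column name return a Series, while double brackets (a list of column names) return a DataFrame. The sketch below rebuilds the same small table under the hypothetical name `df_demo` so it runs on its own.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 28],
    'City': ['New York', 'Paris', 'London'],
})

one_column = df_demo['Age']              # a single name returns a Series
two_columns = df_demo[['Name', 'Age']]   # a list of names returns a DataFrame

print(type(one_column).__name__)
print(type(two_columns).__name__)
```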

In [None]:
# Create a new 'Salary' column by assigning a list
# with exactly one value per row
df['Salary'] = [70000, 85000, 80000]

df
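
New columns don't have to come from a hand-written list. They can also be computed from existing columns, with the operation applied to every row at once. A minimal sketch, assuming a made-up 10% bonus rule and a hypothetical `staff` table:

```python
import pandas as pd

staff = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Salary': [70000, 85000, 80000],
})

# Hypothetical rule: bonus is 10% of salary, computed for all rows in one step
staff['Bonus'] = staff['Salary'] // 10

print(staff)
```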

In [None]:
# Get the first few rows of the DataFrame
df.head(2)  # Shows the first 2 rows (with no argument, head() shows 5)

In [None]:
# Get a quick summary of the DataFrame
# It lists each column's data type and non-null count,
# which makes missing values easy to spot
df.info()
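
The example table has no gaps, so `info()` reports every column as fully populated. To see what a missing value looks like, here is a sketch with a hypothetical `survey` table that has one missing age. `isna().sum()` is a common companion that counts missing values per column.

```python
import pandas as pd

# Hypothetical data with one missing age (None is stored as NaN)
survey = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, None, 28],
})

# The 'Age' row of the summary reports fewer non-null values than total entries
survey.info()

# Count missing values in each column
missing_per_column = survey.isna().sum()
print(missing_per_column)
```

Note that the missing value also changes the `Age` column's data type to a floating-point type, because `NaN` is a float.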

In [None]:
# Get descriptive statistics (count, mean, std, min, quartiles, max)
# for the numerical columns
df.describe()
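
By default `describe()` skips non-numeric columns such as `Name`. The sketch below, using a hypothetical `people` table, shows which columns and statistics come back by default and how `include='all'` brings the text columns into the summary as well.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 28],
    'Salary': [70000, 85000, 80000],
})

numeric_stats = people.describe()            # numeric columns only
all_stats = people.describe(include='all')   # also summarizes 'Name'

print(numeric_stats.columns.tolist())   # columns that were summarized
print(numeric_stats.index.tolist())     # names of the statistics
print(numeric_stats.loc['mean', 'Age'])
```

Because the result of `describe()` is itself a DataFrame, individual statistics can be read with `.loc[row_label, column_label]`, as in the last line.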

## ✅ What's Next?

You've now been introduced to the fundamental building blocks of Pandas. With this knowledge, you are ready to start working with real datasets.

In the next notebook, we'll cover one of the most common and important tasks in a data science project: **Data Cleaning**. We'll learn how to handle missing values, duplicates, and incorrect data types using a real dataset.