# Title: Python Series – Day 47: Introduction to Pandas (Data Analysis in Python)

## 1. Introduction
**Pandas** is one of the most widely used Python libraries for data manipulation and analysis.

**Why Pandas?**
- Handles large in-memory datasets efficiently through vectorized, NumPy-backed operations.
- Great for data cleaning and transformation.
- Integrates with other libraries like NumPy, Matplotlib, and Scikit-Learn.

**Key Structures:**
- **Series:** 1D labeled array (like a column).
- **DataFrame:** 2D labeled data structure (like a table).

## 2. Installing Pandas
Uncomment and run the command below if pandas is not installed yet.

In [None]:
# !pip install pandas

## 3. Import Pandas
Standard alias is `pd`.

In [None]:
import pandas as pd
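
As a quick sanity check, a sketch like the following confirms that the import worked and shows which version is installed. The printed value depends on your environment.

```python
import pandas as pd

# Print the installed pandas version (the value varies by environment)
print("pandas version:", pd.__version__)
```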

## 4. Creating Series
A Series is like a list, but every value is paired with an index label. By default the labels are the integers 0, 1, 2, and so on.

In [None]:
s = pd.Series([10, 20, 30, 40])
print("Series:")
print(s)
print("\nAccess element at index 1:", s[1])
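
The index does not have to be integers. The sketch below uses made-up string labels (`"a"` to `"d"`, chosen only for illustration) to show the difference between looking a value up by label and by position.

```python
import pandas as pd

# Hypothetical labels, used only to illustrate a non-default index
prices = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

by_label = prices.loc["b"]     # look up by index label
by_position = prices.iloc[1]   # look up by integer position
doubled = prices * 2           # arithmetic is element-wise and keeps the index

print("By label:", by_label)
print("By position:", by_position)
print(doubled)
```

Using `.loc` and `.iloc` explicitly avoids ambiguity when the index itself contains integers.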

## 5. Creating DataFrames
A DataFrame is a table with rows and columns. A common way to create one is from a dictionary that maps column names to lists of values.

In [None]:
data = {
    "Name": ["Ali", "Sara", "Omar", "Zara"],
    "Age": [20, 22, 21, 23],
    "Marks": [88, 92, 75, 85]
}

df = pd.DataFrame(data)
print("DataFrame:")
display(df) # In Jupyter, display() shows nice HTML table
print("\nShape:", df.shape)
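
A dictionary of columns is not the only option. As a minimal sketch, the same kind of table can be built from a list of row dictionaries; the two records below are illustrative and mirror the columns used above.

```python
import pandas as pd

# Illustrative records: one dictionary per row
records = [
    {"Name": "Ali", "Age": 20, "Marks": 88},
    {"Name": "Sara", "Age": 22, "Marks": 92},
]
df_records = pd.DataFrame(records)

print(df_records.columns.tolist())  # column names come from the dictionary keys
print(df_records.dtypes)            # pandas infers a type for each column
print(df_records.shape)             # (rows, columns)
```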

## 6. Reading Data from Files
Pandas can read CSV, Excel, JSON, and more.

*(First, let's create a dummy CSV file locally to read)*

In [None]:
# Creating a sample CSV file
df.to_csv("students.csv", index=False)
print("Created students.csv")

# Reading it back
df_load = pd.read_csv("students.csv")
df_load.head()
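
The readers also accept file-like objects, which is handy for experimenting without touching the disk. The sketch below feeds small, made-up CSV and JSON strings through `io.StringIO`; `usecols` limits which columns are loaded.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a file on disk
csv_text = "Name,Age,Marks\nAli,20,88\nSara,22,92\n"
subset = pd.read_csv(io.StringIO(csv_text), usecols=["Name", "Marks"])

# Made-up JSON text in "records" layout (a list of row objects)
json_text = '[{"Name": "Ali", "Marks": 88}, {"Name": "Sara", "Marks": 92}]'
from_json = pd.read_json(io.StringIO(json_text))

print(subset)
print(from_json)
```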

## 7. Displaying Data
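Quick ways to inspect your data: `head()` shows the first rows, `info()` lists column types and non-null counts, and `describe()` summarizes the numeric columns.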

In [None]:
print("First 2 rows:")
display(df.head(2))

print("\nInfo:")
df.info()

print("\nStatistics:")
display(df.describe())
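
A few other inspection helpers are worth knowing. This sketch rebuilds the same student table under the hypothetical name `sample` so it runs on its own, and uses `print` rather than `display` so it also works outside Jupyter.

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Ali", "Sara", "Omar", "Zara"],
    "Age": [20, 22, 21, 23],
    "Marks": [88, 92, 75, 85],
})

print(sample.tail(2))            # last 2 rows
print(sample.dtypes)             # data type of each column
print(sample.columns.tolist())   # column names as a plain list

mean_marks = sample["Marks"].mean()  # a single statistic for one column
print("Mean marks:", mean_marks)
```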

## 8. Selecting Data
- Column Selection: `df['col']`
- Row Selection by Label: `df.loc[]`
- Row Selection by Position: `df.iloc[]`

In [None]:
# Select Column
ages = df["Age"]
print("Ages:\n", ages)

# Select the row whose index label is 0
first_row = df.loc[0]
print("\nFirst Row:\n", first_row)
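
With the default integer index, `loc` and `iloc` look the same. The difference shows up once the row labels are not positions. In this sketch the student names serve as the index, which is an illustrative choice rather than something the table above does.

```python
import pandas as pd

students = pd.DataFrame(
    {"Age": [20, 22, 21], "Marks": [88, 92, 75]},
    index=["Ali", "Sara", "Omar"],  # names as row labels (illustrative)
)

sara_by_label = students.loc["Sara"]   # row whose label is "Sara"
sara_by_position = students.iloc[1]    # second row, whatever its label
cell = students.loc["Omar", "Marks"]   # one value: row label, column label
first_two = students.iloc[:2]          # positional slice, end excluded

print(sara_by_label)
print("Omar's marks:", cell)
print(first_two)
```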

## 9. Filtering Data
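Conditional selection: a comparison such as `df["Marks"] > 80` produces a Series of True/False values, and indexing the DataFrame with it keeps only the rows marked True.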

In [None]:
high_marks = df[df["Marks"] > 80]
print("Students with Marks > 80:")
display(high_marks)
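
Conditions can be combined with `&` (and) and `|` (or), with each condition wrapped in parentheses, and `isin()` checks membership in a list. A minimal sketch, rebuilding the table under the hypothetical name `sample`:

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Ali", "Sara", "Omar", "Zara"],
    "Age": [20, 22, 21, 23],
    "Marks": [88, 92, 75, 85],
})

# Both conditions must hold
both = sample[(sample["Marks"] > 80) & (sample["Age"] < 23)]

# At least one condition must hold
either = sample[(sample["Marks"] > 90) | (sample["Age"] == 21)]

# Keep rows whose Name appears in the given list
named = sample[sample["Name"].isin(["Omar", "Zara"])]

print(both)
print(either)
print(named)
```

The parentheses matter because `&` and `|` bind more tightly than comparison operators in Python.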

## 10. Adding & Removing Columns

In [None]:
df["Grade"] = ["A", "A+", "B", "A"] # Add
display(df)

# Remove a column (drop() returns a new DataFrame, so reassign the result)
# df = df.drop(columns="Age") # Uncommenting will delete Age
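
New columns can also be computed from existing ones, and `drop()` leaves the original untouched unless the result is reassigned. The sketch below uses a hypothetical `Bonus_Marks` column (marks plus 5) purely for illustration.

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Ali", "Sara", "Omar", "Zara"],
    "Age": [20, 22, 21, 23],
    "Marks": [88, 92, 75, 85],
})

# Hypothetical derived column: every student gets 5 extra marks
sample["Bonus_Marks"] = sample["Marks"] + 5

# drop() returns a new DataFrame; `sample` itself still has the Age column
without_age = sample.drop(columns=["Age"])

print(sample.columns.tolist())
print(without_age.columns.tolist())
```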

## 11. Sorting Data

In [None]:
sorted_df = df.sort_values("Marks", ascending=False)
display(sorted_df)
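
Sorting can use more than one column, with a separate direction for each. As a sketch, the example below sorts by grade (alphabetically, so "A" comes before "A+" and "B") and then by marks in descending order within each grade; `reset_index(drop=True)` renumbers the rows afterwards.

```python
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Ali", "Sara", "Omar", "Zara"],
    "Grade": ["A", "A+", "B", "A"],
    "Marks": [88, 92, 75, 85],
})

ranked = (
    sample
    .sort_values(["Grade", "Marks"], ascending=[True, False])
    .reset_index(drop=True)  # replace the shuffled index with 0, 1, 2, ...
)

print(ranked)
```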

## 12. Handling Missing Values
Real data is messy. `fillna()` and `dropna()` help.

In [None]:
import numpy as np
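# Add a messy row: one value per column (Name, Age, Marks, Grade),
# so this assumes the Grade column from section 10 exists
df.loc[4] = ["John", np.nan, np.nan, None]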

print("Null Check:")
print(df.isnull().sum())

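# Fill missing numbers with 0 and missing grades with a placeholder
# (use df["Marks"].mean() instead of 0 to fill with the column mean)
df_clean = df.fillna({"Age": 0, "Marks": 0, "Grade": "N/A"})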
display(df_clean)
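
The two alternatives mentioned above, dropping incomplete rows and filling with the column mean, can be sketched as follows. The small table here is made up so the example runs on its own.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "Name": ["Ali", "Sara", "John"],
    "Marks": [88.0, 92.0, np.nan],
})

# Option 1: drop rows where Marks is missing
dropped = sample.dropna(subset=["Marks"])

# Option 2: fill missing Marks with the mean of the known values
filled = sample.fillna({"Marks": sample["Marks"].mean()})

print(dropped)
print(filled)
```

Which option is appropriate depends on the data. Filling with 0 or the mean changes later statistics, while dropping rows loses information.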

## 13. Exporting Data
Save the transformed dataset to a file so it can be shared or reloaded later.

In [None]:
df_clean.to_csv("output.csv", index=False)
print("Saved cleaned data to output.csv")
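
CSV is not the only export format. When no path is given, `to_csv()` and `to_json()` return the result as a string, which makes it easy to preview what would be written. A minimal sketch with a made-up two-row table:

```python
import json
import pandas as pd

sample = pd.DataFrame({"Name": ["Ali", "Sara"], "Marks": [88, 92]})

csv_text = sample.to_csv(index=False)          # CSV as a string
json_text = sample.to_json(orient="records")   # JSON list of row objects

# Parse the JSON back with the standard library to confirm its structure
records = json.loads(json_text)

print(csv_text)
print(records)
```

Passing a filename instead, for example `sample.to_json("output.json", orient="records")`, writes the same content to disk.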

## 14. Practice Exercises
1. Create a DataFrame for 5 products (Name, Price, Quantity).
2. Filter products with Price > 50.
3. Add a "Total Value" column (Price * Quantity).
4. Sort by Total Value in descending order.
5. Save the result to a JSON file.

## 15. Mini Project – Student Result Analyzer
Analyze a dataset of student marks.

In [None]:
# Setup data for project
project_data = {
    "Student": ["A", "B", "C", "D", "E", "F"],
    "Score": [45, 90, 85, 30, 95, 60]
}
df_res = pd.DataFrame(project_data)

print("--- Student Result Analyzer ---")
df_res["Status"] = df_res["Score"].apply(lambda x: "Pass" if x >= 50 else "Fail")

passed = df_res[df_res["Status"] == "Pass"]
failed = df_res[df_res["Status"] == "Fail"]

print(f"Total Students: {len(df_res)}")
print(f"Passed: {len(passed)}")
print(f"Failed: {len(failed)}")

print("\nTop 3 Scorers:")
display(passed.sort_values("Score", ascending=False).head(3))

# Export
passed.to_csv("passed_students.csv", index=False)
print("Saved passed_students.csv")
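
The analyzer could be extended with a few aggregate statistics. This sketch is one possible variation, not part of the project above: it rebuilds the same data under the hypothetical name `results` and swaps `apply` for `np.where`, a vectorized alternative that gives the same Pass/Fail labels.

```python
import numpy as np
import pandas as pd

results = pd.DataFrame({
    "Student": ["A", "B", "C", "D", "E", "F"],
    "Score": [45, 90, 85, 30, 95, 60],
})

# Vectorized Pass/Fail labelling (same 50-mark threshold as above)
results["Status"] = np.where(results["Score"] >= 50, "Pass", "Fail")

counts = results["Status"].value_counts()                    # students per status
pass_rate = (results["Status"] == "Pass").mean()             # fraction who passed
mean_by_status = results.groupby("Status")["Score"].mean()   # average score per group

print(counts)
print(f"Pass rate: {pass_rate:.0%}")
print(mean_by_status)
```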

## 16. Day 47 Summary
- **Pandas** = Powerful Data Analysis.
- **DataFrame** = Core table structure.
- **Workflow**: Read -> Inspect -> Clean -> Analyze -> Export.

**Next topic: Day 48 – Pandas Data Cleaning & Transformation (Advanced)**