# Pandas — Python Data Analysis Library

## What is Pandas?

- Pandas is a powerful **data analysis and manipulation** library for Python.
- Used for:
  - Tabular data (like spreadsheets, CSVs)
  - Time series
  - Data cleaning & analysis

📦 Install: `pip install pandas`

---




## 📌 Importing Pandas

```python
import pandas as pd
```

## Creating DataFrames
A pandas DataFrame is a two-dimensional data structure, like a 2D array or a table with labeled rows and columns.



```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Chennai']
}

df = pd.DataFrame(data)
print(df)
```

```
      Name  Age     City
0    Alice   25    Delhi
1      Bob   30   Mumbai
2  Charlie   22  Chennai
```


## Reading CSV/Excel

```python
df = pd.read_csv('data.csv')      # Read a CSV file
df = pd.read_excel('data.xlsx')   # Read an Excel file (.xlsx needs the openpyxl package)
```
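
As a minimal sketch of `pd.read_csv` that runs without a file on disk, the example below reads CSV text from an in-memory buffer. The sample rows and the variable name `csv_text` are made up for illustration; with a real file you would pass its path instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'data.csv'
csv_text = """Name,Age,City
Alice,25,Delhi
Bob,30,Mumbai
Charlie,22,Chennai
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)    # (rows, columns)
print(df.dtypes)   # column types inferred from the data
```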

## Viewing Data

```python
df.head()         # First 5 rows
df.tail()         # Last 5 rows
df.info()         # Column data types and non-null counts
df.describe()     # Summary statistics for numeric columns
```
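
A short sketch of the viewing methods on the sample data from earlier. Both `head` and `tail` take an optional row count; `df.shape` is not listed above but is a standard attribute that is handy alongside them.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Chennai']
})

first_two = df.head(2)   # head(n) returns the first n rows
last_one = df.tail(1)    # tail(n) returns the last n rows

print(first_two)
print(last_one)
print(df.shape)          # (number of rows, number of columns)
```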

## Selecting Columns and Rows



```python
df['Name']             # Single column (returns a Series)
df[['Name', 'City']]   # Multiple columns (returns a DataFrame)

# iloc selects rows by integer position; loc selects rows by index label
df.iloc[0]             # First row (by position)
df.loc[0]              # Row whose index label is 0
```
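
With the default index (0, 1, 2, ...), `iloc[0]` and `loc[0]` return the same row, which hides the difference. The sketch below uses string labels (chosen here purely for illustration) so that position and label no longer coincide.

```python
import pandas as pd

df = pd.DataFrame(
    {'Age': [25, 30, 22], 'City': ['Delhi', 'Mumbai', 'Chennai']},
    index=['alice', 'bob', 'charlie']   # illustrative string labels
)

by_position = df.iloc[0]            # first row, whatever its label is
by_label = df.loc['bob']            # row whose index label is 'bob'
cell = df.loc['charlie', 'City']    # single value: row label + column name
subset = df.iloc[0:2]               # position slice; the end (2) is excluded

print(by_position['City'])
print(by_label['Age'])
print(cell)
print(len(subset))
```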

## Filtering Data (Conditions)

```python
# People older than 24
df[df['Age'] > 24]

# People from Mumbai
df[df['City'] == 'Mumbai']
```
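
Conditions can also be combined. A minimal sketch on the same sample data: use `&` for "and", `|` for "or", and wrap each condition in parentheses, because Python's `and`/`or` keywords do not work element-wise on a Series. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Chennai']
})

# Both conditions must hold
older_in_mumbai = df[(df['Age'] > 24) & (df['City'] == 'Mumbai')]

# Either condition may hold
young_or_delhi = df[(df['Age'] < 23) | (df['City'] == 'Delhi')]

# Match against a list of allowed values
in_cities = df[df['City'].isin(['Delhi', 'Chennai'])]

print(older_in_mumbai)
print(young_or_delhi)
print(in_cities)
```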

## Adding & Modifying Columns


```python
df['Senior'] = df['Age'] > 28   # New boolean column: True where Age is above 28
```
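
The same assignment syntax also modifies an existing column. A small sketch, with transformations picked only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Chennai']
})

df['Age'] = df['Age'] + 1              # overwrite an existing column
df['City'] = df['City'].str.upper()    # string methods are available via .str
df['Senior'] = df['Age'] > 28          # add a new column from a condition

print(df)
```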

## Deleting Columns or Rows

```python
df.drop('Senior', axis=1, inplace=True)   # Drop the column in place (modifies df)
df.drop(0, axis=0)                        # Returns a copy without the row labeled 0; df is unchanged
```
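
Without `inplace=True`, `drop` leaves the original DataFrame alone and returns a new one, so the result has to be assigned to keep it. A sketch using the `columns=` and `index=` keywords, which are an alternative to `axis`:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Chennai']
})

without_city = df.drop(columns=['City'])   # new DataFrame, 'City' removed
without_first = df.drop(index=[0])         # new DataFrame, row labeled 0 removed

print(list(without_city.columns))
print(list(without_first.index))
print(df.shape)                            # the original is unchanged
```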

## Sorting Data

```python
df.sort_values(by='Age', ascending=False)   # Oldest first; returns a sorted copy
```
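
`sort_values` also accepts a list of columns, with one sort direction per column. In this sketch the sample rows (including the extra name `Dev`) are made up so that ages tie and the second sort key matters.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dev'],
    'Age': [25, 30, 25, 30]
})

# Age descending, then Name ascending to break ties
sorted_df = df.sort_values(by=['Age', 'Name'], ascending=[False, True])

# reset_index(drop=True) renumbers the rows 0..n-1 after sorting
sorted_df = sorted_df.reset_index(drop=True)

print(sorted_df)
```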

## Grouping Data



```python
df.groupby('City')['Age'].mean()   # Average age per city
```
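
Grouping is easier to see when a group has more than one row. A minimal sketch with made-up data where each city appears twice; `agg` computes several statistics in one call.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
    'Age': [25, 30, 35, 40]
})

# One value per group
mean_age = df.groupby('City')['Age'].mean()

# Several statistics per group, one column each
summary = df.groupby('City')['Age'].agg(['mean', 'min', 'max', 'count'])

print(mean_age)
print(summary)
```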


## Handling Missing Values

```python
df.isnull().sum()            # Count missing values per column
df.fillna(0)                 # Returns a copy with missing values replaced by 0
df.dropna()                  # Returns a copy without rows that contain missing values
```
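
A sketch of these three calls on data that actually contains gaps. The missing entries are introduced here for illustration, and filling `Age` with the column mean is just one possible choice.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, np.nan, 22],
    'City': ['Delhi', 'Mumbai', None]
})

missing_counts = df.isnull().sum()   # missing values per column

# A dict fills each column with its own replacement value
filled = df.fillna({'Age': df['Age'].mean(), 'City': 'Unknown'})

complete_rows = df.dropna()          # keep only rows with no missing values

print(missing_counts)
print(filled)
print(complete_rows)
```

Both `fillna` and `dropna` return new DataFrames here, so `df` still contains its missing values afterwards.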

## Merging & Joining

```python
# df1 and df2 are two DataFrames that share an 'ID' column
pd.merge(df1, df2, on='ID')  # Merge on the common column
```
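
To make the merge concrete, here is a sketch with two small made-up tables for `df1` and `df2`. By default `pd.merge` performs an inner join; the `how` parameter selects other join types.

```python
import pandas as pd

# Hypothetical tables sharing an 'ID' column
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [2, 3, 4], 'City': ['Mumbai', 'Chennai', 'Pune']})

# Inner join (default): only IDs present in both tables
inner = pd.merge(df1, df2, on='ID')

# Left join: every row of df1; City is missing where df2 has no match
left = pd.merge(df1, df2, on='ID', how='left')

print(inner)
print(left)
```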

## Saving to File

```python
df.to_csv('output.csv', index=False)   # index=False leaves the row index out of the file
```
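
The effect of `index=False` can be checked without writing a file: when no path is given, `to_csv` returns the CSV text as a string. A small sketch comparing the two header lines:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30]
})

without_index = df.to_csv(index=False)   # no path given, so a string is returned
with_index = df.to_csv()                 # the row index is written as a first, unnamed column

print(without_index.splitlines()[0])
print(with_index.splitlines()[0])
```

When a file saved with the index is read back by `pd.read_csv`, that unnamed first column shows up as `Unnamed: 0`, which is why `index=False` is usually the better choice.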

## Practice Mini Task



```python
# Task: build a table of student marks and analyze it
marks = {
    'Student': ['A', 'B', 'C'],
    'Math': [80, 90, 85],
    'Science': [75, 88, 82]
}

df = pd.DataFrame(marks)
df['Total'] = df['Math'] + df['Science']               # Sum of both subjects
df['Average'] = df[['Math', 'Science']].mean(axis=1)   # Row-wise mean
df.sort_values(by='Average', ascending=False)          # Highest average first
```

```
  Student  Math  Science  Total  Average
1       B    90       88    178     89.0
2       C    85       82    167     83.5
0       A    80       75    155     77.5
```


## Summary Table

| Task            | Code                    |
| --------------- | ----------------------- |
| Read CSV        | `pd.read_csv()`         |
| View top rows   | `df.head()`             |
| Select column   | `df['col']`             |
| Filter rows     | `df[df['col'] > value]` |
| Add new column  | `df['new'] = ...`       |
| Drop column/row | `df.drop()`             |
| Group by        | `df.groupby('col')`     |
| Export          | `df.to_csv()`           |


✅ Pandas makes working with structured data easy, efficient, and intuitive — it’s a must-know for data science and analysis!