NumPy Questions

An overview of the **Pandas** library in Python:

---

## 🐼 **Pandas – Python Library for Data Analysis**

### 🔹 What is Pandas?

**Pandas** is an open-source Python library providing fast, flexible, and expressive data structures designed to work with structured (tabular), semi-structured, and time-series data.

It is especially useful for:

* Cleaning and preparing data
* Analyzing large datasets that fit in memory
* Converting data formats
* Performing data wrangling

---

### 🔹 Key Data Structures in Pandas:

| Structure   | Description                                                                 |
| ----------- | --------------------------------------------------------------------------- |
| `Series`    | One-dimensional labeled array (like a column in Excel)                      |
| `DataFrame` | Two-dimensional labeled data structure (like a table with rows and columns) |
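
To make the two structures concrete, here is a minimal sketch that builds one of each. The names, labels, and values (`ages`, `people`, `alice`, and so on) are made up for illustration only.

```python
import pandas as pd

# A Series: one-dimensional values, each attached to an index label.
ages = pd.Series([31, 24, 45], index=["alice", "bob", "carol"], name="age")

# A DataFrame: a table whose columns are Series sharing the same index.
people = pd.DataFrame(
    {"age": [31, 24, 45], "department": ["sales", "ops", "sales"]},
    index=["alice", "bob", "carol"],
)

print(ages["bob"])          # label-based lookup on a Series
print(people["age"])        # selecting one column returns a Series
print(people.loc["carol"])  # selecting one row by its label
```

Selecting a single column from a `DataFrame` gives back a `Series`, which is why most column-level operations look the same on both structures.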

---

### 🔹 Why Use Pandas?

* Easy handling of **missing data**
* Powerful **grouping** and **aggregation**
* High-performance **merging** and **joining**
* Built-in support for **time-series** data
* Easy **reading and writing** to CSV, Excel, SQL, JSON, etc.
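
Several of the points above can be sketched in one short pipeline. The data below is invented for illustration (the `sales` and `stores` tables, their column names, and the choice to fill gaps with the column mean are all assumptions, not a recommended recipe).

```python
import numpy as np
import pandas as pd

# Hypothetical daily sales with one missing revenue value.
sales = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-08", "2024-01-09"]
    ),
    "store_id": [1, 2, 1, 2],
    "revenue": [100.0, np.nan, 150.0, 200.0],
})
# Hypothetical lookup table mapping stores to cities.
stores = pd.DataFrame({"store_id": [1, 2], "city": ["Oslo", "Bergen"]})

# Missing data: replace NaN with the mean of the known revenues.
sales["revenue"] = sales["revenue"].fillna(sales["revenue"].mean())

# Merging: attach the city to each sale via the shared store_id key.
merged = sales.merge(stores, on="store_id", how="left")

# Grouping and aggregation: total revenue per city.
by_city = merged.groupby("city")["revenue"].sum()

# Time series: weekly revenue totals, using the dates as the index.
weekly = merged.set_index("date")["revenue"].resample("W").sum()

print(by_city)
print(weekly)
```

Filling with the mean is only one option; `dropna()` or a domain-specific default may suit a real dataset better.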

---

### 🔹 Common Pandas Functions:

| Function                             | Purpose                             |
| ------------------------------------ | ----------------------------------- |
| `read_csv()`                         | Load data from a CSV file           |
| `head()` / `tail()`                  | View top/bottom rows of a DataFrame |
| `info()` / `describe()`              | Get data summary and statistics     |
| `isnull()` / `dropna()` / `fillna()` | Handle missing data                 |
| `groupby()`                          | Group and aggregate data            |
| `merge()` / `join()` / `concat()`    | Combine DataFrames                  |
| `to_csv()`                           | Export data to a CSV file           |
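
The functions in the table can be tried without any file on disk. This sketch reads CSV text from an in-memory buffer instead of a real path; the sample rows and the fill values chosen for `fillna()` are assumptions for demonstration.

```python
import io

import pandas as pd

# Made-up CSV content with two missing cells.
csv_text = """name,age,salary
Ann,28,50000
Ben,,42000
Cy,35,
"""

# read_csv() accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))     # first two rows
print(df.describe())  # summary statistics for numeric columns

# isnull() marks missing cells; summing counts them per column.
missing_per_column = df.isnull().sum()

# dropna() removes rows with any missing value.
complete_rows = df.dropna()

# fillna() with a dict fills each column with its own replacement.
filled = df.fillna({"age": df["age"].median(), "salary": 0})

# concat() stacks DataFrames that share the same columns.
extra = pd.DataFrame({"name": ["Di"], "age": [41.0], "salary": [61000.0]})
combined = pd.concat([filled, extra], ignore_index=True)

# to_csv() returns a string when no path is given.
csv_out = combined.to_csv(index=False)
print(csv_out)
```

Passing a filename to `to_csv()` writes the same text to disk instead of returning it.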

---

### 🔹 Example Code:

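```python
import pandas as pd

# Load a CSV file into a DataFrame
# (assumes data.csv has 'age', 'department', and 'salary' columns)
df = pd.read_csv('data.csv')

# Show the first 5 rows
print(df.head())

# Print column types and non-null counts
# (info() prints directly and returns None, so it needs no print())
df.info()

# Keep only rows where age is greater than 25
filtered = df[df['age'] > 25]

# Mean salary per department
grouped = df.groupby('department')['salary'].mean()
```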

---

### 🔹 Real-World Use Cases:

* Analyzing sales data
* Preprocessing datasets for machine learning
* Financial data analysis
* Automating reports and dashboards

---

