# Important Pandas DataFrame Methods

## Common Pandas Data Operations

These are frequently used Pandas functions for data cleaning, exploration, and transformation:

---

### 🔢 `value_counts()`

- Returns a Series containing counts of unique values.
```python
df['column'].value_counts()
```
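
A minimal sketch of the most common options, assuming a small made-up Series `s` with one missing value. `normalize` and `dropna` are standard `value_counts()` parameters.

```python
import pandas as pd

# Hypothetical data: one missing value to show how it is handled.
s = pd.Series(['a', 'b', 'a', None, 'a', 'b'])

counts = s.value_counts()                 # missing values excluded, sorted by count (descending)
shares = s.value_counts(normalize=True)   # relative frequencies over non-missing values
with_nan = s.value_counts(dropna=False)   # missing values counted as their own category
```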

---

### 📊 `sort_values()`

- Sorts a Series or DataFrame by values.
```python
df.sort_values(by='column', ascending=False)
```
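
Sorting by several columns is a common follow-up. The sketch below assumes a made-up frame with `team` and `score` columns and sorts each column in a different direction.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    'team': ['B', 'A', 'B', 'A'],
    'score': [10, 30, 20, 30],
})

# Sort by team ascending, then by score descending within each team.
ordered = df.sort_values(by=['team', 'score'], ascending=[True, False])

# sort_values returns a new object; reset_index gives a clean 0..n-1 index.
ordered = ordered.reset_index(drop=True)
```

The original `df` is left unchanged, so assign the result if you want to keep it.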

---

### 🏅 `rank()`

- Assigns ranks to entries in a Series or DataFrame.
```python
df['column'].rank()
```
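
Ties are where `rank()` tends to surprise people. As a small sketch on made-up scores, the default averages tied ranks, while `method='dense'` leaves no gaps.

```python
import pandas as pd

# Hypothetical scores with one tie.
scores = pd.Series([50, 80, 80, 95])

default_rank = scores.rank()                               # 1.0, 2.5, 2.5, 4.0
dense_rank = scores.rank(method='dense', ascending=False)  # 3.0, 2.0, 2.0, 1.0
```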

---

### 🔠 `sort_index()`

- Sorts by row or column index.
```python
df.sort_index()
```
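
The row-versus-column distinction can be sketched as follows, assuming a made-up frame whose row and column labels are both out of order.

```python
import pandas as pd

# Hypothetical frame with unsorted row labels and column names.
df = pd.DataFrame({'b': [1, 2, 3], 'a': [4, 5, 6]}, index=[2, 0, 1])

by_rows = df.sort_index()               # row labels become 0, 1, 2
by_cols = df.sort_index(axis=1)         # column labels become a, b
desc = df.sort_index(ascending=False)   # row labels become 2, 1, 0
```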

---

### 📝 `rename()`

- Renames columns or index labels.
```python
df.rename(columns={'old': 'new'}, index={0: 'row1'})
```
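
Besides a mapping, `rename()` also accepts a function. A small sketch on a made-up frame:

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({'old': [1, 2], 'other': [3, 4]})

# Mapping form: only the listed labels change.
renamed = df.rename(columns={'old': 'new'}, index={0: 'row1'})

# Function form: the function is applied to every column label.
upper = df.rename(columns=str.upper)
```

Both calls return a new DataFrame and leave `df` as it was.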

---

### 🆔 `unique()` and `nunique()`

- `unique()` returns the unique values.
- `nunique()` returns the count of unique values.
```python
df['column'].unique()
df['column'].nunique()
```
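
The two methods treat missing values differently, which the following sketch on a made-up Series tries to show: `unique()` keeps the missing value, while `nunique()` skips it unless `dropna=False` is passed.

```python
import pandas as pd

# Hypothetical data with one missing value.
s = pd.Series(['x', 'y', 'x', None])

values = s.unique()                    # distinct values in order of appearance, missing included
n = s.nunique()                        # 2 (missing values are not counted)
n_with_nan = s.nunique(dropna=False)   # 3
```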

---

### ❓ `isnull()` / `notnull()` / `hasnans`

- Detects missing values.
```python
df.isnull()
df.notnull()
df['column'].hasnans  # Series (or Index) attribute, returns True/False; DataFrame has no hasnans
```
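
In practice these are usually combined with `sum()`, `any()`, or boolean indexing. A minimal sketch, assuming a made-up two-column frame:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value per column.
df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': ['x', 'y', None]})

missing_per_column = df.isnull().sum()         # count of missing values in each column
any_missing = df.isnull().any().any()          # True if anything is missing anywhere
a_has_nans = df['a'].hasnans                   # Series-level check
complete_rows = df[df.notnull().all(axis=1)]   # keep rows with no missing values
```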

---

### 🧹 `dropna()`

- Removes rows/columns with missing values.
```python
df.dropna()
```
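
The `how`, `subset`, and `axis` parameters control what gets dropped. A sketch on a made-up frame:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column c is complete, a and b have gaps.
df = pd.DataFrame({
    'a': [1.0, np.nan, 3.0],
    'b': [np.nan, np.nan, 6.0],
    'c': [7.0, 8.0, 9.0],
})

any_missing = df.dropna()               # drop rows with at least one missing value
all_missing = df.dropna(how='all')      # drop rows only if every value is missing
subset_a = df.dropna(subset=['a'])      # only look at column a when deciding
complete_cols = df.dropna(axis=1)       # drop columns instead of rows
```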

---

### 🧴 `fillna()`

- Replaces missing values with a specified value.
```python
df.fillna(0)
```
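
A single fill value rarely suits every column. The sketch below, on made-up data, passes a dict so each column gets its own replacement, and shows forward-filling with `ffill()` as an alternative.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a numeric and a text column.
df = pd.DataFrame({'qty': [1.0, np.nan, 3.0], 'city': ['Oslo', None, 'Rome']})

# Per-column fill values: column mean for qty, a placeholder for city.
filled = df.fillna({'qty': df['qty'].mean(), 'city': 'unknown'})

# Forward fill: carry the last seen value into the gap.
qty_ffill = df['qty'].ffill()
```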

---

### 🧼 `drop_duplicates()`

- Removes duplicate rows.
```python
df.drop_duplicates()
```
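
By default a row counts as a duplicate only if every column matches. A sketch of the `subset` and `keep` parameters, assuming made-up `user` and `plan` columns:

```python
import pandas as pd

# Hypothetical data: row 2 repeats row 0 exactly.
df = pd.DataFrame({
    'user': ['ann', 'bob', 'ann', 'ann'],
    'plan': ['free', 'pro', 'free', 'pro'],
})

exact = df.drop_duplicates()                              # drops the exact repeat (row 2)
latest = df.drop_duplicates(subset='user', keep='last')   # one row per user, the last one seen
```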

---

### 🗑️ `drop()`

- Drops specified rows or columns.
```python
df.drop(columns=['col1'], index=[0])
```
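
A runnable version of the call above on a made-up frame, plus `errors='ignore'` for labels that may not exist:

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})

trimmed = df.drop(columns=['col1'], index=[0])         # remove one column and one row
safe = df.drop(columns=['missing'], errors='ignore')   # no KeyError if the label is absent
```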

---

### ⚙️ `apply()`

- Applies a function to each element of a Series, or to each row or column of a DataFrame.
```python
df['column'].apply(lambda x: x * 2)
```
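
The element, row, and column cases look like this in a small sketch (the `price` and `qty` columns are made up):

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({'price': [10.0, 20.0], 'qty': [3, 5]})

doubled = df['price'].apply(lambda x: x * 2)                        # element-wise on a Series
totals = df.apply(lambda row: row['price'] * row['qty'], axis=1)    # once per row
col_range = df.apply(lambda col: col.max() - col.min())             # once per column (axis=0)

# For simple arithmetic, plain vectorized operations give the same result.
vectorized = df['price'] * 2
```

Vectorized expressions are generally faster than `apply()` with a Python lambda, so `apply()` is best kept for logic that has no built-in equivalent.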

---

These functions are essential tools in any data analysis or data cleaning workflow using **Pandas**.


## 🧱 GroupBy Object in Pandas

### 🔹 GroupBy Foundation

In **pandas**, the `GroupBy` object is essential for data **aggregation**, **transformation**, and **analysis**.

- Created using the `.groupby()` method.
- Allows operations to be performed on **subsets** of the DataFrame based on grouping criteria.
- Useful for summarizing and analyzing patterns in the data.

```python
grouped = df.groupby('column_name')
```

You can group by:
- A single column
- Multiple columns
- Index levels
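
The three grouping options listed above can be sketched as follows, assuming made-up `region`, `product`, and `sales` columns:

```python
import pandas as pd

# Hypothetical sales data.
df = pd.DataFrame({
    'region': ['N', 'N', 'S', 'S'],
    'product': ['x', 'y', 'x', 'x'],
    'sales': [1, 2, 3, 4],
})

by_one = df.groupby('region')['sales'].sum()                # single column
by_two = df.groupby(['region', 'product'])['sales'].sum()   # multiple columns (MultiIndex result)

# Index level: move region into the index, then group on that level.
indexed = df.set_index('region')
by_level = indexed.groupby(level='region')['sales'].sum()
```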

---

### 🔧 Common Aggregation Functions

| Function   | Description                        |
|------------|------------------------------------|
| `.sum()`   | Total of values in each group      |
| `.mean()`  | Mean value of each group           |
| `.count()` | Number of non-null values          |
| `.max()`   | Maximum value per group            |
| `.min()`   | Minimum value per group            |
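
As a small sketch of these aggregations, the made-up data below includes one missing salary to show that `count()` skips missing values while `size()` counts every row.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing salary in HR.
df = pd.DataFrame({
    'Department': ['HR', 'HR', 'IT', 'IT'],
    'Salary': [40000.0, np.nan, 60000.0, 61000.0],
})

salary = df.groupby('Department')['Salary']

totals = salary.sum()     # missing values are skipped
means = salary.mean()     # mean over non-missing values
counts = salary.count()   # non-null values per group
sizes = salary.size()     # rows per group, missing included
```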

---

### 🧰 GroupBy Attributes & Methods

| Method / Attribute | Description |
|--------------------|-------------|
| `len(grouped)`     | Total number of groups |
| `grouped.size()`   | Number of rows in each group (missing values included) |
| `grouped.first()`  | First non-null value of each column, per group |
| `grouped.nth(n)`   | nth row from each group |
| `grouped.last()`   | Last non-null value of each column, per group |
| `grouped.get_group('key')` | Access a specific group |
| `grouped.groups`   | Dictionary of groups and their indices |
| `grouped.describe()` | Summary statistics per group |
| `grouped.sample(n=1)` | Sample n rows from each group |
| `grouped.nunique()` | Number of unique values per column in each group |
| `grouped.agg()`     | Apply one or more aggregations |
| `grouped.apply()`   | Apply a custom or built-in function to each group |
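
A few of the entries in the table, sketched on a small made-up Department/Salary frame:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    'Department': ['HR', 'HR', 'IT', 'IT', 'Finance', 'Finance'],
    'Salary': [40000, 42000, 60000, 61000, 50000, 52000],
})
grouped = df.groupby('Department')

n_groups = len(grouped)                 # number of distinct departments
sizes = grouped.size()                  # rows per department
it_rows = grouped.get_group('IT')       # DataFrame holding only the IT rows
keys = sorted(grouped.groups)           # group keys from the groups dictionary

# Several aggregations at once on one column.
summary = grouped['Salary'].agg(['min', 'max', 'mean'])
```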

---

### ✅ Example

```python
import pandas as pd

df = pd.DataFrame({
    'Department': ['HR', 'HR', 'IT', 'IT', 'Finance', 'Finance'],
    'Salary': [40000, 42000, 60000, 61000, 50000, 52000]
})

# Group rows by department, then average the remaining numeric column (Salary)
grouped = df.groupby('Department')
print(grouped.mean())
```

---

### 📌 Summary

The **GroupBy object** in Pandas is powerful for:
- Summarizing large datasets
- Performing group-level transformations
- Extracting insights from categories or segments

Use `.groupby()` along with aggregation or transformation functions to unlock the full potential of your data!
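
For the group-level transformation case, a minimal sketch using `transform()`, which returns a result aligned to the original rows rather than one row per group. The data and the `diff_from_dept_mean` column name are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    'Department': ['HR', 'HR', 'IT', 'IT'],
    'Salary': [40000, 42000, 60000, 61000],
})

# One value per original row: the mean salary of that row's department.
dept_mean = df.groupby('Department')['Salary'].transform('mean')

# Because the result lines up with df, it can be used directly in arithmetic.
df['diff_from_dept_mean'] = df['Salary'] - dept_mean
```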
