# 📊 Sorting, Aggregation & Grouping in Pandas

This notebook demonstrates:
- Sorting by one or more columns
- Aggregation functions (mean, sum, min, max, etc.)
- Grouping data with aggregation

We'll use pandas to handle tabular data efficiently.


In [2]:
# 🔃 Importing pandas
import pandas as pd

## 📋 Sample DataFrame

This is the sample data we'll use for all our operations.


In [3]:
# Creating sample data
data = {
    "Name": ['Ram', 'Shyam', 'Ghanshyam', 'Dhanshyam', 'Aditi', 'Jagdish', 'Raj', 'Simran'],
    "Age": [28, 32, 47, 57, 17, 27, 77, 25],
    "Salary": [5000, 6000, 45000, 5200, 4900, 7000, 9000, 17000],
    "Performance Score": [43, 71, 26, 59, 84, 38, 67, 22]
}
df = pd.DataFrame(data)
print(df)

        Name  Age  Salary  Performance Score
0        Ram   28    5000                 43
1      Shyam   32    6000                 71
2  Ghanshyam   47   45000                 26
3  Dhanshyam   57    5200                 59
4      Aditi   17    4900                 84
5    Jagdish   27    7000                 38
6        Raj   77    9000                 67
7     Simran   25   17000                 22


## 🔢 Sorting - Single Column

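Use `df.sort_values(by="column_name", ascending=True, inplace=True)`. With `inplace=True`, `df` itself is reordered and the call returns `None`. The `ascending` argument takes `True` or `False`: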

- `ascending=True` → Ascending order
- `ascending=False` → Descending order


In [4]:
# Sort by Name (ascending)
df.sort_values(by="Name", ascending=True, inplace=True)
print("📌 After Sorting by Name (ascending):")
print(df)

📌 After Sorting by Name (ascending):
        Name  Age  Salary  Performance Score
4      Aditi   17    4900                 84
3  Dhanshyam   57    5200                 59
2  Ghanshyam   47   45000                 26
5    Jagdish   27    7000                 38
6        Raj   77    9000                 67
0        Ram   28    5000                 43
1      Shyam   32    6000                 71
7     Simran   25   17000                 22
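
As a variation on the cell above, here is a minimal sketch of sorting without `inplace=True`. In that case `sort_values` returns a new, sorted DataFrame and leaves the original untouched. The names `demo` and `by_salary` and the three-row subset are for illustration only and are not part of the notebook's cells.

```python
import pandas as pd

# Illustrative three-row subset of the sample data
demo = pd.DataFrame({
    "Name": ["Ram", "Shyam", "Ghanshyam"],
    "Salary": [5000, 6000, 45000],
})

# Without inplace=True, sort_values returns a new DataFrame
by_salary = demo.sort_values(by="Salary", ascending=False)

# reset_index(drop=True) renumbers the rows 0..n-1 after the sort
by_salary = by_salary.reset_index(drop=True)

print(by_salary)
print(demo["Name"].tolist())  # the original row order is unchanged
```

Assigning the result to a new name keeps the unsorted data available, which avoids recreating the DataFrame before the next sort.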


## 🔢 Sorting - Multiple Columns

Syntax:
```python
df.sort_values(by=["col1", "col2"], ascending=[True, False], inplace=True)
```


In [6]:
# Recreating DataFrame for fresh sort
df = pd.DataFrame(data)

# Sort by Age first, then by Salary to break ties (both ascending)
df.sort_values(by=["Age", "Salary"], ascending=[True, True], inplace=True)
print("📌 After Sorting by Age and Salary:")
print(df)

📌 After Sorting by Age and Salary:
        Name  Age  Salary  Performance Score
4      Aditi   17    4900                 84
7     Simran   25   17000                 22
5    Jagdish   27    7000                 38
0        Ram   28    5000                 43
1      Shyam   32    6000                 71
2  Ghanshyam   47   45000                 26
3  Dhanshyam   57    5200                 59
6        Raj   77    9000                 67
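
Every Age in the sample data is unique, so the second sort key never comes into play in the output above. The sketch below uses a small made-up DataFrame (`ties`, which is hypothetical and not part of the notebook's data) with repeated ages, sorting Age ascending and Salary descending to show how ties are broken.

```python
import pandas as pd

# Hypothetical data with repeated ages, for illustration only
ties = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Age": [30, 25, 30, 25],
    "Salary": [4000, 7000, 9000, 3000],
})

# Age ascending; within the same Age, Salary descending
sorted_ties = ties.sort_values(by=["Age", "Salary"], ascending=[True, False])
print(sorted_ties)
```

Within each Age, the row with the higher Salary comes first, because the second entry of `ascending` is `False`.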


## 📈 Aggregation Functions

We can use the following functions on a column:
- `df["Column"].mean()` → Average
- `df["Column"].sum()` → Total
- `df["Column"].min()` → Minimum
- `df["Column"].max()` → Maximum
- `df["Column"].std()` → Sample standard deviation (`ddof=1` by default)
- `df["Column"].count()` → Count of non-null values


In [None]:
# Aggregation Examples on Age column
print("📊 Aggregation on Age column:")
print("Mean:", df["Age"].mean())
print("Sum:", df["Age"].sum())
print("Min:", df["Age"].min())
print("Max:", df["Age"].max())
print("Std Dev:", df["Age"].std())
print("Count:", df["Age"].count())

📊 Aggregation on Age column:
Mean: 38.75
Sum: 310
Min: 17
Max: 77
Std Dev: 20.090865017287264
Count: 8
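
The six separate calls above can also be collected into one result with `Series.agg`, which accepts a list of function names. This is a minimal sketch; `ages` and `summary` are illustrative names, and the values are the Age column from the sample data.

```python
import pandas as pd

# Age values copied from the sample DataFrame
ages = pd.Series([28, 32, 47, 57, 17, 27, 77, 25], name="Age")

# agg with a list returns a Series indexed by the function names
summary = ages.agg(["mean", "sum", "min", "max", "std", "count"])
print(summary)
```

Individual results can then be read by label, for example `summary["mean"]`.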


## 👥 Grouping and Aggregation

### ➤ Group by Single Column
Syntax:
```python
df.groupby("column")["target_column"].aggregation_function()
```


In [8]:
# Group by Age and calculate total Salary
group1 = df.groupby("Age")["Salary"].sum()
print("📌 Grouped by Age (Total Salary):")
print(group1)

📌 Grouped by Age (Total Salary):
Age
17     4900
25    17000
27     7000
28     5000
32     6000
47    45000
57     5200
77     9000
Name: Salary, dtype: int64
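
Each Age occurs only once in the sample data, so every group above holds a single row and the sum is just that row's Salary. The sketch below shows grouping where rows actually combine. The `Department` column and its values are invented for illustration (they are not in the notebook's data), as are the names `staff`, `total_by_dept`, and `stats_by_dept`.

```python
import pandas as pd

# "Department" is a hypothetical column added only for this example
staff = pd.DataFrame({
    "Name": ["Ram", "Shyam", "Aditi", "Raj"],
    "Department": ["Sales", "IT", "Sales", "IT"],
    "Salary": [5000, 6000, 4900, 9000],
})

# One aggregation per group
total_by_dept = staff.groupby("Department")["Salary"].sum()
print(total_by_dept)

# Several aggregations per group: one column per function name
stats_by_dept = staff.groupby("Department")["Salary"].agg(["sum", "mean", "max"])
print(stats_by_dept)
```

Passing a list to `agg` returns a DataFrame with one row per group and one column per aggregation.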
