# 🐼 Pandas - Class 5: Sorting & Basic Statistics
Welcome to **Class 5** of our Pandas series. Today we'll learn how to sort data and calculate basic statistics.

## 1. Sorting Data
- Use `sort_values(by='col')` to sort by column values.
- Use `ascending=False` for descending order.
- Sort by multiple columns by passing a list.
- `sort_index()` sorts by row or column index.
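The cells in this section only exercise `sort_values`, so here is a minimal sketch of `sort_index()` as well. The small `df_demo` frame and its shuffled index are made up for illustration and are not part of the class dataset.

```python
import pandas as pd

# Hypothetical frame with a shuffled row index and unordered column names
df_demo = pd.DataFrame(
    {"Price": [300, 25, 1200], "Item": ["Monitor", "Mouse", "Laptop"]},
    index=[2, 0, 1],
)

by_row_index = df_demo.sort_index()                 # rows ordered by index label
by_row_desc = df_demo.sort_index(ascending=False)   # rows in reverse label order
by_col_index = df_demo.sort_index(axis=1)           # columns ordered by name

print(by_row_index)
print(by_col_index)
```

`axis=1` switches from sorting row labels to sorting column labels; the values themselves are never compared.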

In [16]:
import pandas as pd

data = {
    'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Speaker', 'Headphones', 'Printer', 'External SSD', 'Router'],
    'Price': [1200, 25, 75, 300, 50, 150, 100, 250, 80, 60],
    'Stock': [10, 50, 30, 8, 20, 15, 40, 5, 25, 35],
    'Rating': [4.5, 3.8, 4.2, 4.0, 3.5, 4.1, 4.3, 3.9, 4.5, 4.0]
}

df = pd.DataFrame(data)
display(df)

,Product,Price,Stock,Rating
0,Laptop,1200,10,4.5
1,Mouse,25,50,3.8
2,Keyboard,75,30,4.2
3,Monitor,300,8,4.0
4,Webcam,50,20,3.5
5,Speaker,150,15,4.1
6,Headphones,100,40,4.3
7,Printer,250,5,3.9
8,External SSD,80,25,4.5
9,Router,60,35,4.0


In [17]:
df.sort_values(by='Price', ascending=False)  # ascending also accepts 1/0 in place of True/False

,Product,Price,Stock,Rating
0,Laptop,1200,10,4.5
3,Monitor,300,8,4.0
7,Printer,250,5,3.9
5,Speaker,150,15,4.1
6,Headphones,100,40,4.3
8,External SSD,80,25,4.5
2,Keyboard,75,30,4.2
9,Router,60,35,4.0
4,Webcam,50,20,3.5
1,Mouse,25,50,3.8


In [18]:
df.sort_values(by=['Stock', 'Price'], ascending=[True, False])  # The first column listed has priority; later columns only break ties

,Product,Price,Stock,Rating
7,Printer,250,5,3.9
3,Monitor,300,8,4.0
0,Laptop,1200,10,4.5
5,Speaker,150,15,4.1
4,Webcam,50,20,3.5
8,External SSD,80,25,4.5
2,Keyboard,75,30,4.2
9,Router,60,35,4.0
6,Headphones,100,40,4.3
1,Mouse,25,50,3.8
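Every `Stock` value in the dataset is unique, so the second sort key never comes into play in the cell above. As a rough sketch of tie-breaking, the example below sorts a four-row subset of the same products by `Rating` (descending) and then by `Price` (ascending). The subset and the `df_demo` name are chosen here for illustration only.

```python
import pandas as pd

# Four rows from the class dataset, picked because their ratings tie in pairs
df_demo = pd.DataFrame({
    "Product": ["Laptop", "Monitor", "External SSD", "Router"],
    "Price": [1200, 300, 80, 60],
    "Rating": [4.5, 4.0, 4.5, 4.0],
})

# Highest rating first; within the same rating, cheapest first
sorted_demo = df_demo.sort_values(by=["Rating", "Price"], ascending=[False, True])
print(sorted_demo)
```

Within each rating group the rows are ordered by price, which shows the second key acting only where the first key ties.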


## 2. Descriptive Statistics
- `mean()`, `median()`, `mode()` for central tendency; `mode()` returns every value tied for most frequent.
- `std()` for standard deviation.
- `describe()` for a summary of stats (count, mean, std, min, quartiles, max).

In [19]:
df.mean(numeric_only=True)

,0
Price,229.0
Stock,23.8
Rating,4.08


In [20]:
df.median(numeric_only=True)

,0
Price,90.0
Stock,22.5
Rating,4.05


In [21]:
df.mode(numeric_only=True)

,Price,Stock,Rating
0,25,5,4.0
1,50,8,4.5
2,60,10,
3,75,15,
4,80,20,
5,100,25,
6,150,30,
7,250,35,
8,300,40,
9,1200,50,
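The output above has ten rows for `Price` and `Stock` but only two for `Rating`, because `mode()` returns all values tied for the highest frequency and pads shorter columns with missing values. A small sketch using the same numbers as the class dataset, with the Series names `ratings` and `prices` chosen here for illustration:

```python
import pandas as pd

ratings = pd.Series([4.5, 3.8, 4.2, 4.0, 3.5, 4.1, 4.3, 3.9, 4.5, 4.0])
prices = pd.Series([1200, 25, 75, 300, 50, 150, 100, 250, 80, 60])

rating_modes = ratings.mode()  # 4.0 and 4.5 each occur twice, so both are modes
price_modes = prices.mode()    # every price occurs once, so all ten values tie

print(rating_modes.tolist())
print(len(price_modes))
```

When every value is unique, as with `Price`, the mode is not an informative statistic.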


In [22]:
df.std(numeric_only=True)

,0
Price,352.662886
Stock,14.905629
Rating,0.311983


In [23]:
df.describe()

,Price,Stock,Rating
count,10.0,10.0,10.0
mean,229.0,23.8,4.08
std,352.662886,14.905629,0.311983
min,25.0,5.0,3.5
25%,63.75,11.25,3.925
50%,90.0,22.5,4.05
75%,225.0,33.75,4.275
max,1200.0,50.0,4.5
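The `25%`, `50%`, and `75%` rows of `describe()` are quantiles, and they can be requested directly with `quantile()`. The sketch below recomputes them for the `Price` column. The interquartile range at the end is an extra illustration, not something `describe()` reports.

```python
import pandas as pd

prices = pd.Series([1200, 25, 75, 300, 50, 150, 100, 250, 80, 60], name="Price")

# Same quartiles that describe() reports (default linear interpolation)
quartiles = prices.quantile([0.25, 0.5, 0.75])
print(quartiles)

# Interquartile range: spread of the middle 50% of the prices
iqr = quartiles.loc[0.75] - quartiles.loc[0.25]
print(iqr)
```

Here the 50% quantile equals the value returned by `median()`.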


## 3. Counting Values
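- `value_counts()` shows the frequency of unique values in a Series. On a DataFrame it counts unique rows, and `subset` restricts the count to the given column(s).
- Use `normalize=True` to see proportions (fractions of the total, summing to 1) instead of counts.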
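As a minimal sketch of the Series form, the example below calls `value_counts()` on the rating values directly and converts the normalized proportions to percentages. The variable names are illustrative only.

```python
import pandas as pd

ratings = pd.Series([4.5, 3.8, 4.2, 4.0, 3.5, 4.1, 4.3, 3.9, 4.5, 4.0], name="Rating")

counts = ratings.value_counts()                # occurrences of each rating
shares = ratings.value_counts(normalize=True)  # fractions of the total
percentages = shares * 100                     # multiply by 100 for percent

print(counts)
print(percentages)
```

With the class DataFrame, `df['Rating'].value_counts()` gives the same counts as `df.value_counts(subset='Rating')`.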

In [24]:
df.value_counts(subset='Rating')

Rating,count
4.0,2
4.5,2
3.8,1
3.5,1
3.9,1
4.1,1
4.2,1
4.3,1


In [26]:
df.value_counts(subset='Rating', normalize=True)

Rating,proportion
4.0,0.2
4.5,0.2
3.8,0.1
3.5,0.1
3.9,0.1
4.1,0.1
4.2,0.1
4.3,0.1


## 4. Correlation & Covariance
- `corr()` computes correlation between numeric columns.
- `cov()` computes covariance.
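- Correlation values range from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship); values near 0 indicate little or no linear relationship.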
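Correlation and covariance are closely related: the default (Pearson) correlation is the covariance divided by the product of the two standard deviations. A short sketch that checks this for `Price` and `Stock`, using the values from the class dataset:

```python
import pandas as pd

price = pd.Series([1200, 25, 75, 300, 50, 150, 100, 250, 80, 60])
stock = pd.Series([10, 50, 30, 8, 20, 15, 40, 5, 25, 35])

covariance = price.cov(stock)                            # sample covariance
manual_corr = covariance / (price.std() * stock.std())   # rescale by both std devs
builtin_corr = price.corr(stock)                         # Pearson by default

print(manual_corr, builtin_corr)
```

Because of this rescaling, correlation has no units and is comparable across column pairs, while covariance depends on the scale of each column.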

In [27]:
df.corr(numeric_only=True)

,Price,Stock,Rating
Price,1.0,-0.507335,0.46636
Stock,-0.507335,1.0,-0.060689
Rating,0.46636,-0.060689,1.0


In [29]:
df['Price'].corr(df['Stock'])

np.float64(-0.5073350363123069)

In [28]:
df.cov(numeric_only=True)

,Price,Stock,Rating
Price,124371.111111,-2666.888889,51.311111
Stock,-2666.888889,222.177778,-0.282222
Rating,51.311111,-0.282222,0.097333


## Mini Practice
1. Build a DataFrame with columns: Name, Age, Score, Height.
2. Sort the data by Score (descending).
3. Get mean, median, mode, and std of numeric columns.
4. Use value_counts on Name or any categorical column.
5. Compute corr() and cov() for all numeric columns.

In [25]:
# Write your mini practice solution here

# 1. Build a DataFrame with columns: Name, Age, Score, Height

# 2. Sort the data by Score (descending)
# Write your code here

# 3. Get mean, median, mode, and std of numeric columns
# Write your code here

# 4. Use value_counts on Name or any categorical column
# Write your code here

# 5. Compute corr() and cov() for all numeric columns
# Write your code here


---
## Summary
- Learned to sort by columns with `sort_values` and by index with `sort_index`.
- Explored descriptive statistics: mean, median, mode, std, describe.
- Counted values with `value_counts`.
- Calculated correlation and covariance.