# 📓 Lesson 7: Applying Functions and Sorting Data
📘 What you will learn:
1. Apply custom or built-in functions to columns using apply()
2. Use map() for simple value transformations
3. Use lambda functions for inline operations
4. Sort data with sort_values() and sort_index()
5. Create new calculated columns

## Step 1: Load the Dataset
We’ll use Sales_January_2019.csv.

In [None]:
import pandas as pd

df = pd.read_csv('../data/Sales_January_2019.csv')

# Convert numeric columns
df['Quantity Ordered'] = pd.to_numeric(df['Quantity Ordered'], errors='coerce')
df['Price Each'] = pd.to_numeric(df['Price Each'], errors='coerce')

# Remove invalid rows
df = df.dropna(subset=['Quantity Ordered', 'Price Each'])
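
To see what the cleanup above does, here is a minimal sketch on a tiny hand-made DataFrame. The `toy` data is hypothetical and only assumes that some rows may hold text where a number is expected; the real file may look different.

```python
import pandas as pd

# Hypothetical toy data: the second row holds text instead of numbers
toy = pd.DataFrame({
    'Quantity Ordered': ['2', 'Quantity Ordered', '1'],
    'Price Each': ['11.95', 'Price Each', '99.99'],
})

# errors='coerce' turns anything that cannot be parsed as a number into NaN
toy['Quantity Ordered'] = pd.to_numeric(toy['Quantity Ordered'], errors='coerce')
toy['Price Each'] = pd.to_numeric(toy['Price Each'], errors='coerce')

# Rows with NaN in either column are dropped; the original index labels are kept
clean = toy.dropna(subset=['Quantity Ordered', 'Price Each'])
print(clean)
```

Only the two numeric rows remain, and both columns now have a numeric dtype, so they can be multiplied safely.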


## Step 2: Create a New Column
Let’s calculate Total Price = Quantity Ordered × Price Each

In [None]:
df['Total Price'] = df['Quantity Ordered'] * df['Price Each']
print(df[['Product', 'Quantity Ordered', 'Price Each', 'Total Price']].head())


## Step 3: Use apply() to Modify Data
You can apply a function to every value in a column, or to every row of the DataFrame:

In [None]:
# Example: Title-case product names (capitalize the first letter of each word)
df['Product'] = df['Product'].apply(lambda x: str(x).title())

💡 On a single column, apply() calls the function you pass (a lambda, a built-in, or your own) once for each value.
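
apply() is not limited to lambdas. The sketch below uses a small hypothetical Series (not the sales file) to show built-in functions passed directly, plus the `.str` accessor as one possible alternative for title-casing.

```python
import pandas as pd

# Hypothetical product names, just for illustration
products = pd.Series(['usb-c charging cable', 'wired headphones'])

# A built-in function: len returns the number of characters in each name
name_lengths = products.apply(len)

# A built-in string method passed directly, no lambda needed
upper_names = products.apply(str.upper)

# The vectorized .str accessor title-cases every value in one call
title_names = products.str.title()

print(name_lengths.tolist())
print(upper_names.tolist())
print(title_names.tolist())
```

For plain string operations, the `.str` accessor is usually the shorter option; apply() is handy when the logic does not fit a built-in method.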

You can also define your own function:

In [None]:
def label_price(row):
    if row['Total Price'] > 500:
        return 'High'
    elif row['Total Price'] > 100:
        return 'Medium'
    else:
        return 'Low'

df['Price Label'] = df.apply(label_price, axis=1)
print(df[['Total Price', 'Price Label']].head())

💡 axis=1 passes each row to the function. The default, axis=0, passes each column instead.
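
Here is a small sketch of the difference between the two axis settings. The `toy` DataFrame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({'Quantity Ordered': [1, 3], 'Price Each': [10.0, 5.0]})

# axis=0 (the default): the function receives each column as a Series
column_max = toy.apply(lambda col: col.max())

# axis=1: the function receives each row as a Series, so columns are looked up by name
row_total = toy.apply(lambda row: row['Quantity Ordered'] * row['Price Each'], axis=1)

print(column_max)
print(row_total)
```

With axis=0 you get one result per column. With axis=1 you get one result per row, which is what label_price needs.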

## Step 4: Use map() for Simple Replacements
If you only want to replace values using a lookup table, pass a dictionary to map():

In [None]:
# Shorten product names
df['Short Name'] = df['Product'].map({
    'Usb-C Charging Cable': 'USB Cable',
    'Bose Soundsport Headphones': 'Headphones',
    'Lightning Charging Cable': 'Charging Cable',
    'Wired Headphones': 'Headphones',
    'Macbook Pro Laptop': 'Macbook'
})
print(df[['Product', 'Short Name', 'Price Each']].head())

💡 map() is great for replacing specific known values. Any value that is not a key in the dictionary becomes NaN.
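
The sketch below shows what happens to products that are missing from the lookup dictionary, along with one possible fallback. The product 'Some Other Product' is hypothetical and stands in for any name without a short form.

```python
import pandas as pd

# 'Some Other Product' is a hypothetical name that is not in the dictionary
products = pd.Series(['Wired Headphones', 'Macbook Pro Laptop', 'Some Other Product'])
short_names = {'Wired Headphones': 'Headphones', 'Macbook Pro Laptop': 'Macbook'}

# Values missing from the dictionary come back as NaN
mapped = products.map(short_names)

# One possible fallback: keep the original name where no short name exists
mapped_with_fallback = products.map(short_names).fillna(products)

print(mapped.tolist())
print(mapped_with_fallback.tolist())
```

Whether to keep NaN or fall back to the original name depends on what you plan to do with the column next.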

## Step 5: Sort the Data
You can sort by column values:

In [None]:
# Sort by total price (high to low)
df_sorted = df.sort_values(by='Total Price', ascending=False)
print(df_sorted[['Product', 'Total Price']].head())
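
sort_values() also accepts a list of columns, which helps when many rows share the same value. This is a minimal sketch on hypothetical toy data, not the sales file.

```python
import pandas as pd

# Hypothetical toy data with a tie on Total Price
toy = pd.DataFrame({
    'Product': ['Wired Headphones', 'Macbook Pro Laptop', 'Aaa Batteries'],
    'Total Price': [11.99, 1700.0, 11.99],
})

# Highest Total Price first; ties are ordered alphabetically by Product
toy_sorted = toy.sort_values(by=['Total Price', 'Product'], ascending=[False, True])

print(toy_sorted)
```

The `ascending` list lines up with the `by` list, so each column gets its own sort direction.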

You can also sort by index:

In [None]:
# Sort by index (low to high)
df_sorted = df.sort_index()

print(df_sorted[['Product', 'Total Price']].head())
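
Sorting by value keeps each row's original index label, so sort_index() can put the rows back in their original order. Here is a short sketch on hypothetical toy data, with reset_index() shown as an alternative when you want fresh row numbers instead.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({'Total Price': [11.99, 1700.0, 150.0]})

# After sorting by value, the index labels travel with their rows
by_price = toy.sort_values(by='Total Price', ascending=False)

# sort_index() restores the original row order
restored = by_price.sort_index()

# reset_index(drop=True) keeps the sorted order but renumbers rows from 0
renumbered = by_price.reset_index(drop=True)

print(by_price.index.tolist())
print(restored.index.tolist())
print(renumbered)
```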

## Practice Exercises
1. Create a new column Total Price
2. Add a column Price Label with values High, Medium, or Low based on Total Price
3. Map short names for 3 product types
4. Sort the dataset by Total Price in descending order

In [None]:
# 1
df['Total Price'] = df['Quantity Ordered'] * df['Price Each']

# 2
def label(row):
    if row['Total Price'] > 500:
        return 'High'
    elif row['Total Price'] > 100:
        return 'Medium'
    return 'Low'

df['Price Label'] = df.apply(label, axis=1)

# 3
df['Short Name'] = df['Product'].map({
    'Apple Airpods Headphones': 'Airpods',
    'Macbook Pro Laptop': 'Macbook',
    'Google Phone': 'Pixel'
})

# 4
df = df.sort_values('Total Price', ascending=False)
print(df[['Product', 'Short Name', 'Price Each', 'Total Price']].head())

## Summary
In this lesson, you learned:
- How to use apply() to create calculated or labeled columns
- How to use map() to replace values
- How to sort data with sort_values() and sort_index()
- How to use lambda and custom functions for flexibility

👉 In the next lesson, you will learn how to merge and combine multiple datasets using merge(), concat(), and join().