# Arithmetic Operations and Functions in Pandas

So far, we have explored how to inspect and select data in Pandas. Once you have access to the right rows and columns, the next step is to perform calculations and apply functions. Pandas makes this straightforward: you can apply arithmetic directly to a DataFrame or Series, and methods such as apply(), map(), and applymap() offer more flexibility.

## Arithmetic Operations on Columns

You can directly apply mathematical operations to Pandas Series or DataFrame columns. Operations are vectorized, meaning they are applied element-wise across the column.

In the example below, notice how each operation is applied to every row automatically.

In [None]:
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [24, 30, 28],
    'Salary': [50000, 60000, 55000]
}
df = pd.DataFrame(data)
print(df)

# Increase all salaries by 10%
df['Salary'] = df['Salary'] * 1.10

# Add 5 years to everyone’s age
df['Age'] = df['Age'] + 5
print(df)
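
To make "element-wise" concrete, the following sketch compares a vectorized multiplication with an explicit loop over the same values. The variable names are chosen here for illustration only.

```python
import pandas as pd

# Illustrative data mirroring the Salary column above
salaries = pd.Series([50000, 60000, 55000], name="Salary")

# Vectorized: one expression multiplies every element
raised = salaries * 1.10

# Explicit loop, shown only for comparison
raised_loop = pd.Series([s * 1.10 for s in salaries], name="Salary")

# Check whether both approaches produce the same values
print(raised.equals(raised_loop))
```

The vectorized form is shorter and is usually much faster on large columns, which is why it is the preferred style in Pandas.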

## Arithmetic Between Columns
You can also perform arithmetic between two or more columns to create new features.

In [None]:
# Create a new column 'Income_per_Age'
df['Income_per_Age'] = df['Salary'] / df['Age']
print(df)
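
As a further sketch of deriving a feature from several columns, the example below computes a savings rate. The Income and Expenses columns and their values are hypothetical; assign() is used here so that the original DataFrame is not modified.

```python
import pandas as pd

# Hypothetical monthly figures, for illustration only
budget = pd.DataFrame({
    "Income": [4000, 5000, 4500],
    "Expenses": [3000, 3500, 4050],
})

# assign() returns a new DataFrame with the extra column
with_rate = budget.assign(
    Savings_rate=(budget["Income"] - budget["Expenses"]) / budget["Income"]
)
print(with_rate)
```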

## Applying Built-in Pandas/NumPy Functions

A Pandas Series provides built-in statistical methods such as mean() and std(), and NumPy functions such as np.sqrt() can be applied directly to a Series.


In [None]:
import numpy as np

# Calculate average salary
print(df['Salary'].mean())

# Standard deviation of Age
print(df['Age'].std())

# Apply numpy square root
print(np.sqrt(df['Age']))
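
One detail worth knowing when mixing the two libraries: Series.std() uses the sample standard deviation (ddof=1) by default, while np.std() on a NumPy array uses the population form (ddof=0). The sketch below, using made-up values, shows the difference and one way to make the two agree.

```python
import numpy as np
import pandas as pd

values = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])  # made-up sample

sample_std = values.std()                   # pandas default: ddof=1
population_std = np.std(values.to_numpy())  # NumPy default: ddof=0

print(sample_std, population_std)

# Passing ddof explicitly makes pandas match NumPy
print(np.isclose(values.std(ddof=0), population_std))
```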

## Applying Functions with **apply()**

Sometimes you need custom transformations. On a Series, apply() calls a function on each value; on a DataFrame, it calls the function on each column (axis=0, the default) or on each row (axis=1).


In [None]:
# Apply to a Series
df['Age_squared'] = df['Age'].apply(lambda x: x**2)

# Apply to DataFrame across rows
df['Total'] = df[['Age','Salary']].apply(lambda row: row['Age'] + row['Salary'], axis=1)
print(df)
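
A named function is often easier to read than a lambda when the logic has branches. The sketch below applies a hypothetical age_group() helper to a Series; the helper and its threshold are invented for illustration.

```python
import pandas as pd

def age_group(age):
    """Return a coarse label for an age (the threshold is arbitrary)."""
    if age < 30:
        return "under 30"
    return "30 or over"

people = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [29, 35, 33]})

# apply() calls age_group once per value in the column
people["Age_group"] = people["Age"].apply(age_group)
print(people)
```

For plain arithmetic such as squaring a column or adding two columns, the vectorized form (for example df['Age'] ** 2) is usually faster than apply().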

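Note that you can also apply a function element-wise to an entire DataFrame with applymap() and to a single column with map(), but these are not covered in this course. In pandas 2.1 and later, applymap() is deprecated in favor of DataFrame.map().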

# Filtering Data in Pandas

Once you know how to select columns and rows, the next step is learning how to filter data. Filtering helps you focus on only the relevant part of your dataset, whether that means removing unnecessary columns, isolating rows that meet certain conditions, or preparing features for modeling.

## Filtering Columns

Column filtering is about selecting only the columns you need or dropping the ones you don’t. This reduces memory usage and keeps your DataFrame manageable.

In [None]:
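# Add a 'City' column so the following examples have text data to work with
# (illustrative values, not part of the original dataset)
df['City'] = ['New York', 'Los Angeles', 'Chicago']

# Select a single column
df['Age']

# Select multiple columns
df[['Name', 'City']]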



## Dropping Unused Columns

In [None]:
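# Drop the 'City' column; drop() returns a new DataFrame, so df keeps
# 'City' for the examples that follow
df_without_city = df.drop(columns=['City'])
print(df_without_city)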

This is especially useful when preparing data for machine learning, where only selected features are required.
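
A minimal sketch of how drop() behaves, using a small made-up DataFrame: it returns a new DataFrame rather than changing the original, and errors='ignore' skips labels that do not exist.

```python
import pandas as pd

cities = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [29, 35],
    "City": ["New York", "Chicago"],
})

trimmed = cities.drop(columns=["City"])
print(list(trimmed.columns))   # columns of the new DataFrame
print(list(cities.columns))    # columns of the original

# errors='ignore' avoids a KeyError when the column is already missing
same = trimmed.drop(columns=["City"], errors="ignore")
```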

## Filtering Rows (using Boolean Indexing)

Row filtering is usually done with Boolean indexing, where you apply a condition and return only the rows where that condition is true.

In [None]:
# Filter rows where Age > 30
df[df['Age'] > 30]
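
It can help to look at the intermediate Boolean Series, often called a mask, before using it to index. The data below is made up for illustration.

```python
import pandas as pd

people = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [29, 35, 33]})

# The comparison yields one True/False value per row
mask = people["Age"] > 30
print(mask)

# Indexing with the mask keeps only the rows marked True
older = people[mask]
print(older)
```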

## Combining Multiple Conditions

You can combine conditions using & (and) or | (or).

In [None]:
# Filter rows where Age > 30 AND Salary > 60000
df[(df['Age'] > 30) & (df['Salary'] > 60000)]


> Remember to wrap each condition in parentheses, because & and | bind more tightly than comparison operators such as >.
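
The same pattern works with | (or) and with ~ (not), which inverts a condition. A short sketch with made-up data:

```python
import pandas as pd

people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [29, 35, 33],
    "Salary": [55000, 66000, 60500],
})

# OR: a row is kept if either condition holds
young_or_high = people[(people["Age"] < 30) | (people["Salary"] > 65000)]

# NOT: ~ inverts the Boolean mask
not_over_30 = people[~(people["Age"] > 30)]

print(young_or_high)
print(not_over_30)
```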

## Filtering Strings

You can filter rows where a text column contains a specific substring.

In [None]:
# Filter rows where City contains "York"
df[df['City'].str.contains("York")]
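
Two optional parameters of str.contains() are often useful in practice: case=False for case-insensitive matching, and na=False so that missing values count as non-matches instead of producing missing entries in the mask. The sample data below is invented.

```python
import pandas as pd

places = pd.DataFrame({"City": ["New York", "york", None, "Chicago"]})

# case=False ignores letter case; na=False treats missing values as no match
matches = places[places["City"].str.contains("york", case=False, na=False)]
print(matches)
```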

## Unique Values and Counting

Sometimes you want to check how many unique values a column has, or count how often each appears.

In [None]:
# Unique cities
print(df['City'].unique())

# Count frequency of each city
print(df['City'].value_counts())
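
Two related calls can round this out: nunique() returns just the number of distinct values, and value_counts(normalize=True) returns proportions instead of raw counts. A small sketch with made-up data:

```python
import pandas as pd

cities = pd.Series(["New York", "Chicago", "New York", "Boston"], name="City")

# Number of distinct values
print(cities.nunique())

# Share of rows for each value instead of raw counts
print(cities.value_counts(normalize=True))
```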
